E5 All Nli Triplet Matryoshka
This is a sentence-transformers model fine-tuned from intfloat/multilingual-e5-small. It maps sentences and paragraphs into a 384-dimensional dense vector space and supports tasks such as semantic textual similarity and semantic search.
Downloads 14
Release Time: 7/15/2024
Model Overview
This model produces sentence- and paragraph-level embedding vectors that can be used for similarity scoring, semantic search, classification, and clustering.
Model Features
Multilingual support
Based on the multilingual-e5-small model, it supports text processing in multiple languages.
Efficient semantic representation
Converts text into 384-dimensional dense vectors, capturing deep semantic information.
MatryoshkaLoss training
Trained with MatryoshkaLoss and MultipleNegativesRankingLoss, so that the embeddings remain useful when truncated to smaller dimensions.
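Embeddings from a Matryoshka-trained model are typically shortened by keeping a prefix of the vector and rescaling it to unit length. The following is a minimal sketch of that step in plain Python. The helper name truncate_embedding, the toy 8-dimensional vector, and the target size of 4 are illustrative assumptions; the smaller dimensions this model was actually trained for are not listed here.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale the result to unit length.

    Hypothetical helper for illustration; not part of any library API.
    """
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be normalized; return it unchanged.
        return head
    return [x / norm for x in head]


# Toy 8-dimensional vector standing in for a 384-dimensional model output.
full = [0.5, -0.5, 0.25, 0.25, 0.1, -0.1, 0.05, 0.05]
short = truncate_embedding(full, 4)
```

Shorter vectors reduce storage and speed up comparison, usually at some cost in accuracy, so the trade-off is worth measuring on your own data.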
High performance
Reaches a Spearman correlation of 0.7972 for cosine similarity on the sts-test-384 evaluation; retrieval scores are listed under Use Cases.
Model Capabilities
Sentence similarity calculation
Semantic search
Text feature extraction
Text classification
Text clustering
Paraphrase mining
Use Cases
Information retrieval
Document retrieval
Quickly retrieve relevant documents based on query semantics
Achieved a score of 33.441 on the MTEB MIRACLRetrievalHardNegatives (ar) dataset
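Retrieval with embeddings usually comes down to ranking documents by cosine similarity to the query vector. The sketch below shows that ranking step in plain Python, assuming the vectors have already been produced by the model. The function names and the 3-dimensional toy vectors are made up for illustration and do not come from the model or its evaluation data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, doc_vecs, top_k=3):
    """Return (document index, score) pairs, best match first."""
    scored = [(i, cosine_similarity(query_vec, d)) for i, d in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for real document and query embeddings.
docs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
query = [0.9, 0.1, 0.0]
hits = search(query, docs, top_k=2)
```

For large collections, an exhaustive scan like this is normally replaced by an approximate nearest-neighbour index.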
Question answering system
Match user questions with answers in the knowledge base
Achieved a score of 64.488 on the MTEB MLQARetrieval (ara-ara) dataset
Text analysis
Semantic similarity calculation
Compare the semantic similarity between two sentences or paragraphs
Spearman correlation of cosine similarity on the sts-test-384 dataset: 0.7972
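A Spearman figure of this kind is normally the rank correlation between the cosine similarities the model assigns to sentence pairs and the human-annotated similarity scores for the same pairs. The sketch below computes it in plain Python, assuming that interpretation. The helper names and the four sample values are invented for illustration and are not taken from the sts-test-384 data.

```python
import math


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up cosine similarities and gold scores for four sentence pairs.
predicted = [0.91, 0.35, 0.62, 0.10]
gold = [4.8, 2.0, 3.1, 0.5]
rho = spearman(predicted, gold)
```

Because only the ordering matters, the gold scores and the cosine similarities do not need to be on the same scale.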
Text clustering
Automatically group semantically similar texts