Mini Gte
Developed by prdev
A lightweight sentence embedding model based on DistilBERT, suitable for various text processing tasks.
Downloads: 1,240
Release Time: 1/29/2025
Model Overview
mini-gte is a lightweight sentence embedding model based on the DistilBERT architecture, primarily designed for natural language processing tasks such as text classification, information retrieval, and clustering. The model demonstrates excellent performance across multiple MTEB benchmarks, making it particularly suitable for scenarios requiring efficient text representation.
Model Features
Lightweight and Efficient
Built on the DistilBERT architecture, it significantly reduces model size while maintaining performance
Multi-task Support
Performs well in various tasks including text classification, information retrieval, and clustering
Excellent Benchmark Performance
Achieves competitive results across multiple MTEB benchmarks
Model Capabilities
Text Classification
Information Retrieval
Text Clustering
Sentence Embedding Generation
Use Cases
E-commerce
Product Review Sentiment Analysis
Analyze sentiment tendencies in Amazon product reviews
Achieved 92.94% accuracy in Amazon polarity classification task
Counterfactual Review Detection
Identify counterfactual reviews on Amazon platform
Achieved 74.90% accuracy in Amazon counterfactual classification task
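A common recipe for classification tasks like these is to encode each review with the embedding model and train a lightweight classifier on top of the vectors. The sketch below uses toy 2-D vectors in place of real `model.encode(...)` outputs (hypothetical data, not the card's evaluation setup):

```python
# Sketch: sentiment classification on top of sentence embeddings.
# The 2-D vectors below are hypothetical stand-ins for model.encode(reviews).
import numpy as np
from sklearn.linear_model import LogisticRegression

X_train = np.array([[0.9, 0.1], [0.8, 0.2],   # "positive-leaning" embeddings
                    [0.1, 0.9], [0.2, 0.8]])  # "negative-leaning" embeddings
y_train = [1, 1, 0, 0]  # 1 = positive review, 0 = negative review

clf = LogisticRegression().fit(X_train, y_train)
print(clf.predict([[0.85, 0.15]]))  # a new embedding near the positive cluster
```

With real data, `X_train` would be the embeddings of labeled reviews; the classifier stays cheap because all semantic modeling happens in the frozen encoder.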
Academic Research
Paper Clustering
Topic clustering for arXiv papers
Achieved V-measure of 47.25 in arXiv paper clustering task
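Topic clustering of this kind typically runs k-means (or a similar algorithm) over the sentence embeddings and scores the result with V-measure, the metric reported above. The sketch uses toy vectors as hypothetical stand-ins for `model.encode(titles)` outputs:

```python
# Sketch: clustering sentence embeddings and scoring with V-measure.
# The vectors below are hypothetical stand-ins for model.encode(paper_titles).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import v_measure_score

emb = np.array([[1.0, 0.0], [0.9, 0.1],   # two papers on one topic
                [0.0, 1.0], [0.1, 0.9]])  # two papers on another topic
labels_true = [0, 0, 1, 1]

labels_pred = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(emb)
print(v_measure_score(labels_true, labels_pred))  # 1.0 for a perfect split
```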
Information Retrieval
Argument Retrieval
Retrieve relevant arguments in debate datasets
Achieved NDCG@10 of 56.61 in ArguAna retrieval task
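Retrieval with an embedding model boils down to encoding the query and the documents, then ranking documents by cosine similarity. A minimal pure-NumPy sketch, with toy vectors standing in for `model.encode(...)` outputs (hypothetical data):

```python
# Sketch: top-k retrieval by cosine similarity over sentence embeddings.
import numpy as np

def cosine_top_k(query_vec, doc_vecs, k=2):
    # Normalize, then rank documents by dot product with the query.
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q
    order = np.argsort(-scores)[:k]
    return order.tolist(), scores[order].tolist()

# Hypothetical stand-ins for model.encode(documents) / model.encode(query).
docs = np.array([[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]])
query = np.array([0.9, 0.1])
idx, _ = cosine_top_k(query, docs, k=2)
print(idx)  # → [0, 1]
```

Metrics like NDCG@10 are then computed over these ranked lists against relevance judgments.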
🚀 prdev/mini-gte
This is a model based on distilbert/distilbert-base-uncased, built with the sentence-transformers library. It has been evaluated on multiple datasets in the MTEB benchmark, covering NLP tasks such as classification, retrieval, clustering, and more.
📚 Documentation
Model Information
| Property | Details |
|----------|---------|
| Base Model | distilbert/distilbert-base-uncased |
| Library Name | sentence-transformers |
| Model Name | prdev/mini-gte |
Evaluation Results
1. MTEB AmazonCounterfactualClassification (en)
- Task Type: Classification

| Metric | Value |
|--------|-------|
| Accuracy | 74.8955 |
| F1 | 68.8421 |
| F1 Weighted | 77.1819 |
| AP | 37.7315 |
| AP Weighted | 37.7315 |
| Main Score | 74.8955 |
2. MTEB AmazonPolarityClassification (default)
- Task Type: Classification

| Metric | Value |
|--------|-------|
| Accuracy | 92.9424 |
| F1 | 92.9268 |
| F1 Weighted | 92.9268 |
| AP | 89.2255 |
| AP Weighted | 89.2255 |
| Main Score | 92.9424 |
3. MTEB AmazonReviewsClassification (en)
- Task Type: Classification

| Metric | Value |
|--------|-------|
| Accuracy | 53.0920 |
| F1 | 52.7353 |
| F1 Weighted | 52.7353 |
| Main Score | 53.0920 |
4. MTEB ArguAna (default)
- Task Type: Retrieval

| Metric | @1 | @3 | @5 | @10 | @20 | @100 | @1000 |
|--------|-----|-----|-----|------|------|-------|--------|
| NDCG | 31.792 | 47.206 | 51.843 | 56.614 | 59.212 | 60.149 | 60.231 |
| MAP | 31.792 | 43.350 | 45.928 | 47.929 | 48.674 | 48.825 | 48.828 |
| Recall | 31.792 | 58.393 | 69.630 | 84.211 | 94.239 | 99.004 | 99.644 |
| Precision | 31.792 | 19.464 | 13.926 | 8.421 | 4.712 | 0.990 | 0.100 |
| MRR | 32.4324 | 43.6463 | 46.1569 | 48.1582 | 48.9033 | 49.0537 | 49.0569 |
| NAUC NDCG Max | -4.8705 | -3.9160 | -2.5090 | -1.4653 | -2.4534 | -2.8207 | -3.0108 |
| NAUC NDCG Std | -9.1757 | -10.4240 | -10.1328 | -9.3154 | -9.0213 | -9.0492 | -9.2507 |
| NAUC NDCG Diff1 | 17.7430 | 12.3928 | 13.3086 | 13.7827 | 13.7644 | 14.3422 | 14.2345 |
| NAUC MAP Max | -4.8705 | -4.2874 | -3.5856 | -3.2553 | -3.5541 | -3.5812 | -3.5881 |
| NAUC MAP Std | -9.1757 | -10.1539 | -9.9657 | -9.6771 | -9.6286 | -9.6278 | -9.6335 |
| NAUC MAP Diff1 | 17.7430 | 13.6101 | 14.1354 | 14.4029 | 14.3927 | 14.4922 | 14.4884 |
| NAUC Recall Max | -4.8705 | -2.7195 | 1.7492 | 10.7433 | 14.8025 | 40.8859 | 24.2175 |
| NAUC Recall Std | -9.1757 | -11.2342 | -10.6963 | -6.3396 | 3.9196 | 57.9655 | 70.9234 |
| NAUC Recall Diff1 | 17.7430 | 8.7116 | 10.5690 | 10.6275 | 6.0286 | 30.7703 | 5.9272 |
| NAUC Precision Max | -4.8705 | -2.7195 | 1.7492 | 10.7433 | 14.8025 | 40.8859 | 24.2175 |
| NAUC Precision Std | -9.1757 | -11.2342 | -10.6963 | -6.3396 | 3.9196 | 57.9655 | 70.9234 |
| NAUC Precision Diff1 | 17.7430 | 8.7116 | 10.5690 | 10.6275 | 6.0286 | 30.7703 | 5.9272 |
| NAUC MRR Max | -5.1491 | -5.0832 | -4.5304 | -4.2387 | -4.5254 | -4.5576 | -4.5646 |
| NAUC MRR Std | -8.8127 | -9.8967 | -9.9006 | -9.6123 | -9.5502 | -9.5491 | -9.5548 |
| NAUC MRR Diff1 | 15.8571 | 11.9042 | 12.2957 | 12.4769 | 12.4674 | 12.5569 | 12.5529 |

Main Score: 56.614 (NDCG@10)
5. MTEB ArxivClusteringP2P (default)
- Task Type: Clustering

| Metric | Value |
|--------|-------|
| V-Measure | 47.2524 |
| V-Measure Std | 13.7772 |
| Main Score | 47.2524 |
6. MTEB ArxivClusteringS2S (default)
- Task Type: Clustering

| Metric | Value |
|--------|-------|
| V-Measure | 40.7262 |
| V-Measure Std | 14.1255 |
| Main Score | 40.7262 |
7. MTEB AskUbuntuDupQuestions (default)
- Task Type: Reranking

| Metric | Value |
|--------|-------|
| MAP | 61.5732 |
| MRR | 74.6714 |
| nAUC MAP Max | 21.8916 |
| nAUC MAP Std | 17.9941 |
| nAUC MAP Diff1 | 1.5548 |
| nAUC MRR Max | 34.1394 |
| nAUC MRR Std | 18.1335 |
| nAUC MRR Diff1 | 13.3597 |
| Main Score | 61.5732 |
8. MTEB BIOSSES (default)
- Task Type: STS

| Metric | Value |
|--------|-------|
| Pearson | 86.7849 |
| Spearman | 84.7302 |
| Cosine Pearson | 86.7849 |
| Cosine Spearman | 84.7302 |
| Manhattan Pearson | 84.4818 |
| Manhattan Spearman | 84.0507 |
| Euclidean Pearson | 84.8613 |
| Euclidean Spearman | 84.6266 |
| Main Score | 84.7302 |
9. MTEB Banking77Classification (default)
- Task Type: Classification

| Metric | Value |
|--------|-------|
| Accuracy | 85.7175 |
| F1 | 85.6781 |
| F1 Weighted | 85.6781 |
| Main Score | 85.7175 |
10. MTEB BiorxivClusteringP2P (default)
- Task Type: Clustering

| Metric | Value |
|--------|-------|
| V-Measure | 40.0588 |
| V-Measure Std | 0.8872 |
| Main Score | 40.0588 |
11. MTEB BiorxivClusteringS2S (default)
- Task Type: Clustering

| Metric | Value |
|--------|-------|
| V-Measure | 36.3828 |
| V-Measure Std | 1.1670 |
| Main Score | 36.3828 |
12. MTEB CQADupstackAndroidRetrieval (default)
- Task Type: Retrieval

| Metric | @1 | @3 | @5 | @10 | @20 | @100 | @1000 |
|--------|-----|-----|-----|------|------|-------|--------|
| NDCG | 37.196 | 42.778 | 45.014 | 47.973 | 50.141 | 53.314 | 55.520 |
| MAP | 30.598 | 38.173 | 40.093 | 41.686 | 42.522 | 43.191 | 43.328 |
| Recall | 30.598 | 45.020 | 51.357 | 60.260 | 67.933 | 82.070 | 96.345 |
| Precision | 37.196 | 20.553 | 14.707 | 9.213 | 5.522 | 1.495 | 0.198 |
| MRR | 37.196 | 44.4683 | 45.9776 | 47.1884 | 47.6763 | 47.9570 | 48.0103 |
| NAUC NDCG Max | 38.1056 | 35.8655 | 36.3806 | 36.6053 | 37.2333 | 38.1684 | 37.9110 |
| NAUC NDCG Std | -1.5731 | 0.2057 | 1.5420 | 2.7934 | 3.3346 | 4.6180 | 4.2068 |
| NAUC NDCG Diff1 | 52.3965 | 46.2996 | 45.3674 | 45.3474 | 45.6105 | 45.7041 | 46.0349 |
| NAUC MAP Max | 33.6794 | 35.2163 | 35.5840 | 35.8370 | 36.1877 | 36.4520 | 36.4422 |
| NAUC MAP Std | -0.7946 | -0.3286 | 0.4626 | 1.1463 | 1.5263 | 1.9580 | 1.9560 |
| NAUC MAP Diff1 | 55.7997 | 49.5727 | 48.6219 | 48.3025 | 48.2105 | 48.1781 | 48.1664 |
| NAUC Recall Max | 33.6794 | 33.5910 | 34.1456 | 34.2228 | 35.9338 | 43.0721 | 61.3455 |
| NAUC Recall Std | -0.7946 | 0.7802 | 3.8030 | 7.3944 | 9.6754 | 21.4935 | 66.2789 |
| NAUC Recall Diff1 | 55.7997 | 42.7281 | 39.3889 | 37.6609 | 36.6270 | 34.8090 | 43.5024 |
| NAUC Precision Max | 38.1056 | 31.2978 | 28.2209 | 21.8709 | 16.3885 | 4.6120 | -10.5997 |
| NAUC Precision Std | -1.5731 | 0.0904 | 3.6561 | 7.3919 | 9.8527 | 6.9627 | -4.5693 |
| NAUC Precision Diff1 | 52.3965 | 25.9668 | 16.3544 | 4.4909 | -3.9433 | -14.0135 | -21.0926 |
| NAUC MRR Max | 38.1056 | 37.4199 | 38.1046 | 38.1046 | 38.1046 | 38.1046 | 38.1046 |
| NAUC MRR Std | -1.5731 | -0.5046 | -0.2840 | -0.2840 | -0.2840 | -0.2840 | -0.2840 |
| NAUC MRR Diff1 | 52.3965 | 46.5936 | 46.7950 | 46.7950 | 46.7950 | 46.7950 | 46.7950 |

Main Score: 47.973 (NDCG@10)
Featured Recommended AI Models

Jina Embeddings V3
Jina Embeddings V3 is a multilingual sentence embedding model supporting over 100 languages, specializing in sentence similarity and feature extraction tasks.
Text Embedding · Transformers · Supports Multiple Languages
jinaai · 3.7M · 911
Ms Marco MiniLM L6 V2
Apache-2.0
A cross-encoder model trained on the MS Marco passage ranking task for query-passage relevance scoring in information retrieval
Text Embedding · English
cross-encoder · 2.5M · 86
Opensearch Neural Sparse Encoding Doc V2 Distill
Apache-2.0
A sparse retrieval model based on distillation technology, optimized for OpenSearch, supporting inference-free document encoding with improved search relevance and efficiency over V1
Text Embedding · Transformers · English
opensearch-project · 1.8M · 7
Sapbert From PubMedBERT Fulltext
Apache-2.0
A biomedical entity representation model based on PubMedBERT, optimized for semantic relation capture through self-aligned pre-training
Text Embedding · English
cambridgeltl · 1.7M · 49
Gte Large
MIT
GTE-Large is a powerful sentence transformer model focused on sentence similarity and text embedding tasks, excelling in multiple benchmark tests.
Text Embedding · English
thenlper · 1.5M · 278
Gte Base En V1.5
Apache-2.0
GTE-base-en-v1.5 is an English sentence transformer model focused on sentence similarity tasks, excelling in multiple text embedding benchmarks.
Text Embedding · Transformers · Supports Multiple Languages
Alibaba-NLP · 1.5M · 63
Gte Multilingual Base
Apache-2.0
GTE Multilingual Base is a multilingual sentence embedding model supporting over 50 languages, suitable for tasks like sentence similarity calculation.
Text Embedding · Transformers · Supports Multiple Languages
Alibaba-NLP · 1.2M · 246
Polybert
polyBERT is a chemical language model designed to achieve fully machine-driven ultrafast polymer informatics. It maps PSMILES strings into 600-dimensional dense fingerprints to numerically represent polymer chemical structures.
Text Embedding · Transformers
kuelumbus · 1.0M · 5
Bert Base Turkish Cased Mean Nli Stsb Tr
Apache-2.0
A sentence embedding model based on Turkish BERT, optimized for semantic similarity tasks
Text Embedding · Transformers · Other
emrecan · 1.0M · 40
GIST Small Embedding V0
MIT
A text embedding model fine-tuned based on BAAI/bge-small-en-v1.5, trained with the MEDI dataset and MTEB classification task datasets, optimized for query encoding in retrieval tasks.
Text Embedding · Safetensors · English
avsolatorio · 945.68k · 29