🚀 gte-multilingual-base (dense)
This is a multilingual dense embedding model that supports a wide range of languages. It has been evaluated on tasks such as Clustering, STS, Classification, PairClassification, Reranking, Retrieval, and BitextMining, with per-dataset results listed under Model Performance below.
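A dense embedding model maps each input text to a single fixed-size vector, and downstream scores (retrieval, STS, clustering) are computed from similarities between those vectors. The sketch below illustrates the common mean-pooling step that turns per-token embeddings into one sentence vector; the 768-dimensional size, the dummy data, and the pooling scheme are illustrative assumptions, not details taken from this card:

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings, ignoring padded positions."""
    mask = attention_mask[:, None].astype(float)      # (seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=0)    # (dim,)
    counts = mask.sum()                               # number of real tokens
    return summed / np.maximum(counts, 1e-9)

rng = np.random.default_rng(0)
tokens = rng.normal(size=(6, 768))    # 6 tokens, hypothetical 768-dim model
mask = np.array([1, 1, 1, 1, 0, 0])   # last two positions are padding
sentence_vec = mean_pool(tokens, mask)
print(sentence_vec.shape)             # (768,)
```

With real text you would obtain `token_embeddings` and `attention_mask` from the model's tokenizer and encoder; the pooled vector is then compared to other sentence vectors via cosine similarity.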
📄 License
The model is licensed under the Apache-2.0 license.
📚 Documentation
Supported Languages
The model supports the following languages:
- af, ar, az, be, bg, bn, ca, ceb, cs, cy, da, de, el, en, es, et, eu, fa, fi, fr, gl, gu, he, hi, hr, ht, hu, hy, id, is, it, ja, jv, ka, kk, km, kn, ko, ky, lo, lt, lv, mk, ml, mn, mr, ms, my, ne, nl, no, pa, pl, pt, qu, ro, ru, si, sk, sl, so, sq, sr, sv, sw, ta, te, th, tl, tr, uk, ur, vi, yo, zh
Model Performance
The following table shows the performance of the gte-multilingual-base (dense)
model on different tasks and datasets:
| Task Type | Dataset Name | Metric Type | Metric Value |
| --- | --- | --- | --- |
| Clustering | PL-MTEB/8tags-clustering | v_measure | 33.67 |
| STS | C-MTEB/AFQMC | cos_sim_spearman | 43.55 |
| STS | C-MTEB/ATEC | cos_sim_spearman | 48.91 |
| Classification | PL-MTEB/allegro-reviews | accuracy | 41.69 |
| Clustering | lyon-nlp/alloprof (AlloProfClusteringP2P) | v_measure | 54.20 |
| Clustering | lyon-nlp/alloprof (AlloProfClusteringS2S) | v_measure | 44.34 |
| Reranking | lyon-nlp/mteb-fr-reranking-alloprof-s2p | map | 64.91 |
| Retrieval | lyon-nlp/alloprof | ndcg_at_10 | 53.64 |
| Classification | mteb/amazon_counterfactual | accuracy | 75.96 |
| Classification | mteb/amazon_polarity | accuracy | 80.72 |
| Classification | mteb/amazon_reviews_multi (en) | accuracy | 43.64 |
| Classification | mteb/amazon_reviews_multi (de) | accuracy | 40.11 |
| Classification | mteb/amazon_reviews_multi (es) | accuracy | 40.17 |
| Classification | mteb/amazon_reviews_multi (fr) | accuracy | 39.57 |
| Classification | mteb/amazon_reviews_multi (ja) | accuracy | 35.75 |
| Classification | mteb/amazon_reviews_multi (zh) | accuracy | 33.34 |
| Retrieval | mteb/arguana | ndcg_at_10 | 58.23 |
| Retrieval | clarin-knext/arguana-pl | ndcg_at_10 | 53.17 |
| Clustering | mteb/arxiv-clustering-p2p | v_measure | 46.02 |
| Clustering | mteb/arxiv-clustering-s2s | v_measure | 41.07 |
| Reranking | mteb/askubuntudupquestions-reranking | map | 61.88 |
| STS | mteb/biosses-sts | cos_sim_spearman | 81.21 |
| STS | C-MTEB/BQ | cos_sim_spearman | 51.72 |
| Retrieval | maastrichtlawtech/bsard | ndcg_at_10 | 26.12 |
| BitextMining | mteb/bucc-bitext-mining (de-en) | f1 | 98.62 |
| BitextMining | mteb/bucc-bitext-mining (fr-en) | f1 | 97.90 |
| BitextMining | mteb/bucc-bitext-mining (ru-en) | f1 | 97.12 |
| BitextMining | mteb/bucc-bitext-mining (zh-en) | f1 | 98.16 |
| Classification | mteb/banking77 | accuracy | 85.36 |
| Clustering | mteb/biorxiv-clustering-p2p | v_measure | 37.59 |
| Clustering | mteb/biorxiv-clustering-s2s | v_measure | 34.21 |
| Classification | PL-MTEB/cbd | accuracy | 62.52 |
| PairClassification | PL-MTEB/cdsce-pairclassification | cos_sim_ap | 74.90 |
| STS | PL-MTEB/cdscr-sts | cos_sim_spearman | 90.31 |
| Clustering | C-MTEB/CLSClusteringP2P | v_measure | 37.95 |
| Clustering | C-MTEB/CLSClusteringS2S | v_measure | 38.12 |
| Reranking | C-MTEB/CMedQAv1-reranking | map | 86.11 |
| Reranking | C-MTEB/CMedQAv2-reranking | map | 87.28 |
| Retrieval | mteb/cqadupstack-android | ndcg_at_10 | 47.10 |
| Retrieval | mteb/cqadupstack-english | ndcg_at_10 | 45.97 |
| Retrieval | mteb/cqadupstack-gaming | ndcg_at_10 | 55.61 |
| Retrieval | mteb/cqadupstack-gis | ndcg_at_10 | 36.64 |
| Retrieval | mteb/cqadupstack-mathematica | ndcg_at_10 | 30.71 |
| Retrieval | mteb/cqadupstack-physics | ndcg_at_10 | 44.52 |
| Retrieval | mteb/cqadupstack-programmers | ndcg_at_10 | 37.94 |
| Retrieval | mteb/cqadupstack | ... | ... |
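The metric names above follow MTEB conventions: `cos_sim_spearman` is the Spearman rank correlation between cosine similarities of embedding pairs and human similarity scores, and `ndcg_at_10` is normalized discounted cumulative gain over the top-10 retrieved documents. A rough, illustrative sketch of both on toy data (not MTEB's exact implementation; for instance, MTEB's Spearman handles ties, which this version does not):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def spearman(x, y):
    """Spearman rank correlation (no tie handling, for illustration)."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=vals.__getitem__)
        r = [0] * len(vals)
        for rank, idx in enumerate(order):
            r[idx] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

def ndcg_at_k(relevances, k=10):
    """NDCG@k given graded relevances of the ranked result list."""
    dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))
    ideal = sorted(relevances, reverse=True)
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0

# A perfect ranking (relevances already in descending order) scores 1.0;
# any inversion lowers the score.
print(ndcg_at_k([3, 2, 1, 0]))              # 1.0
print(round(ndcg_at_k([3, 0, 2], k=10), 3))
```

In the reported numbers, these per-example values are averaged over the dataset and multiplied by 100.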