# 🚀 xiaobu-embedding

`xiaobu-embedding` is an embedding model evaluated on multiple tasks of the MTEB benchmark, demonstrating its performance across natural language processing tasks such as semantic textual similarity, classification, clustering, reranking, retrieval, and pair classification.
## 📚 Documentation
### Model Information

| Property | Details |
|---|---|
| Model Name | xiaobu-embedding |
| Tags | mteb |
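For context on how the scores below are obtained, here is a minimal usage sketch with `sentence-transformers`. The Hub repo id `lier007/xiaobu-embedding` is an assumption, not confirmed by this card; substitute the id the model is actually published under.

```python
# Minimal usage sketch -- the repo id below is an assumption.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("lier007/xiaobu-embedding")  # assumed repo id
sentences = ["样例数据-1", "样例数据-2"]

# L2-normalized embeddings, so cosine similarity reduces to a dot product.
embeddings = model.encode(sentences, normalize_embeddings=True)
print(embeddings.shape)
```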
### Performance Metrics

#### 1. STS (Semantic Textual Similarity) Tasks
- C-MTEB/AFQMC (Validation Split)

| Metric Type | Value |
|---|---|
| cos_sim_pearson | 49.37874132528482 |
| cos_sim_spearman | 54.84722470052176 |
| euclidean_pearson | 53.0495882931575 |
| euclidean_spearman | 54.847727301700665 |
| manhattan_pearson | 53.0632140838278 |
| manhattan_spearman | 54.8744258024692 |

- C-MTEB/ATEC (Test Split)

| Metric Type | Value |
|---|---|
| cos_sim_pearson | 48.15992903013723 |
| cos_sim_spearman | 55.13198035464577 |
| euclidean_pearson | 55.435876753245715 |
| euclidean_spearman | 55.13215936702871 |
| manhattan_pearson | 55.41429518223402 |
| manhattan_spearman | 55.13363087679285 |

- C-MTEB/BQ (Test Split)

| Metric Type | Value |
|---|---|
| cos_sim_pearson | 63.517830355554224 |
| cos_sim_spearman | 65.57007801018649 |
| euclidean_pearson | 64.05153340906585 |
| euclidean_spearman | 65.5696865661119 |
| manhattan_pearson | 63.95710619755406 |
| manhattan_spearman | 65.48565785379489 |

- C-MTEB/LCQMC (Test Split)

| Metric Type | Value |
|---|---|
| cos_sim_pearson | 69.96711977441642 |
| cos_sim_spearman | 75.54747609685569 |
| euclidean_pearson | 74.62663478056035 |
| euclidean_spearman | 75.54761576699639 |
| manhattan_pearson | 74.60983904582241 |
| manhattan_spearman | 75.52758938061503 |

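To make the STS metrics above concrete, the sketch below embeds both sides of each sentence pair, scores them with cosine similarity, and correlates the scores against gold ratings. The pairs and ratings are toy placeholders, not C-MTEB data, and the repo id is assumed as before.

```python
# STS evaluation sketch with toy data (not C-MTEB); repo id is an assumption.
from scipy.stats import pearsonr, spearmanr
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("lier007/xiaobu-embedding")  # assumed repo id
left = ["今天天气很好", "我喜欢读书", "他在跑步"]
right = ["今天天气不错", "他在踢足球", "他正在跑步"]
gold = [4.5, 0.5, 4.8]  # toy human similarity ratings

a = model.encode(left, normalize_embeddings=True)
b = model.encode(right, normalize_embeddings=True)
cos = (a * b).sum(axis=1)  # cosine similarity of normalized vectors

print("cos_sim_pearson: ", pearsonr(cos, gold)[0])
print("cos_sim_spearman:", spearmanr(cos, gold)[0])
```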
#### 2. Classification Tasks
- mteb/amazon_reviews_multi (Test Split, zh Config)

| Metric Type | Value |
|---|---|
| accuracy | 46.722 |
| f1 | 45.039340641893205 |

- C-MTEB/IFlyTek-classification (Validation Split)

| Metric Type | Value |
|---|---|
| accuracy | 49.74220854174683 |
| f1 | 38.01399980618159 |

- C-MTEB/JDReview-classification (Test Split)

| Metric Type | Value |
|---|---|
| accuracy | 86.73545966228893 |
| ap | 55.72394235169542 |
| f1 | 81.58550390953492 |

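Classification scores of this kind come from training a lightweight classifier on frozen embeddings. Below is a sketch of that setup, assuming a simple logistic-regression probe (MTEB uses a linear classifier of this kind; the exact hyperparameters here are an assumption) and toy review data.

```python
# Classification-over-embeddings sketch; texts, labels, and repo id are assumptions.
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("lier007/xiaobu-embedding")  # assumed repo id
train_texts, train_labels = ["很好用", "质量太差", "物流很快", "非常失望"], [1, 0, 1, 0]
test_texts, test_labels = ["十分满意", "不推荐购买"], [1, 0]

# Embeddings stay frozen; only the linear probe is trained.
clf = LogisticRegression(max_iter=1000)
clf.fit(model.encode(train_texts), train_labels)
pred = clf.predict(model.encode(test_texts))

print("accuracy:", accuracy_score(test_labels, pred))
print("f1:      ", f1_score(test_labels, pred, average="macro"))
```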
#### 3. Clustering Tasks
- C-MTEB/CLSClusteringP2P (Test Split)

| Metric Type | Value |
|---|---|
| v_measure | 43.24046498507819 |

- C-MTEB/CLSClusteringS2S (Test Split)

| Metric Type | Value |
|---|---|
| v_measure | 41.22618199372116 |

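The `v_measure` numbers come from clustering the embeddings and comparing the assignment against gold topic labels. A minimal sketch with k-means and toy data (the specific clustering algorithm and data are assumptions for illustration):

```python
# Clustering sketch: v-measure of k-means clusters vs. toy gold labels.
from sklearn.cluster import KMeans
from sklearn.metrics import v_measure_score
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("lier007/xiaobu-embedding")  # assumed repo id
texts = ["股市大涨", "央行宣布降息", "球队夺得冠军", "比赛因雨延期"]
labels = [0, 0, 1, 1]  # toy gold topic labels: finance vs. sports

emb = model.encode(texts, normalize_embeddings=True)
pred = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(emb)
print("v_measure:", v_measure_score(labels, pred))
```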
#### 4. Reranking Tasks
- C-MTEB/CMedQAv1-reranking (Test Split)

| Metric Type | Value |
|---|---|
| map | 87.12213224673621 |
| mrr | 89.57150793650794 |

- C-MTEB/CMedQAv2-reranking (Test Split)

| Metric Type | Value |
|---|---|
| map | 87.57290061886421 |
| mrr | 90.19202380952382 |

- C-MTEB/Mmarco-reranking (Dev Split)

| Metric Type | Value |
|---|---|
| map | 28.076927649720986 |
| mrr | 26.98015873015873 |

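Reranking scores candidate passages against a query and sorts them; `map` and `mrr` then measure where the relevant candidates land in that order. A cosine-similarity reranking sketch with toy Q&A strings (placeholders, not CMedQA/Mmarco data):

```python
# Reranking sketch: sort candidates by cosine similarity to the query.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("lier007/xiaobu-embedding")  # assumed repo id
query = "头疼应该吃什么药？"
candidates = ["可以服用布洛芬缓解头痛", "今天的天气很适合出游", "多喝水、注意休息也有帮助"]

q = model.encode([query], normalize_embeddings=True)
c = model.encode(candidates, normalize_embeddings=True)
scores = (q @ c.T).ravel()  # cosine similarities

for i in np.argsort(-scores):  # best candidate first
    print(f"{scores[i]:.4f}  {candidates[i]}")
```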
#### 5. Retrieval Tasks
- C-MTEB/CmedqaRetrieval (Dev Split)

| Metric Type | Value |
|---|---|
| map_at_1 | 25.22 |
| map_at_10 | 37.604 |
| map_at_100 | 39.501 |
| map_at_1000 | 39.614 |
| map_at_3 | 33.378 |
| map_at_5 | 35.774 |
| mrr_at_1 | 38.385000000000005 |
| mrr_at_10 | 46.487 |
| mrr_at_100 | 47.504999999999995 |
| mrr_at_1000 | 47.548 |
| mrr_at_3 | 43.885999999999996 |
| mrr_at_5 | 45.373000000000005 |
| ndcg_at_1 | 38.385000000000005 |
| ndcg_at_10 | 44.224999999999994 |
| ndcg_at_100 | 51.637 |
| ndcg_at_1000 | 53.55799999999999 |
| ndcg_at_3 | 38.845 |
| ndcg_at_5 | 41.163 |
| precision_at_1 | 38.385000000000005 |
| precision_at_10 | 9.812 |
| precision_at_100 | 1.58 |
| precision_at_1000 | 0.183 |
| precision_at_3 | 21.88 |
| precision_at_5 | 15.974 |
| recall_at_1 | 25.22 |
| recall_at_10 | 54.897 |
| recall_at_100 | 85.469 |
| recall_at_1000 | 98.18599999999999 |
| recall_at_3 | 38.815 |
| recall_at_5 | 45.885 |

- C-MTEB/CovidRetrieval (Dev Split)

| Metric Type | Value |
|---|---|
| map_at_1 | 76.87 |
| map_at_10 | 84.502 |
| map_at_100 | 84.615 |
| map_at_1000 | 84.617 |
| map_at_3 | 83.127 |
| map_at_5 | 83.99600000000001 |
| mrr_at_1 | 77.02799999999999 |
| mrr_at_10 | 84.487 |
| mrr_at_100 | 84.59299999999999 |
| mrr_at_1000 | 84.59400000000001 |
| mrr_at_3 | 83.193 |
| mrr_at_5 | 83.994 |
| ndcg_at_1 | 77.134 |
| ndcg_at_10 | 87.68599999999999 |
| ndcg_at_100 | 88.17099999999999 |
| ndcg_at_1000 | 88.21 |
| ndcg_at_3 | 84.993 |
| ndcg_at_5 | 86.519 |
| precision_at_1 | 77.134 |
| precision_at_10 | 9.841999999999999 |
| precision_at_100 | 1.006 |
| precision_at_1000 | 0.101 |
| precision_at_3 | 30.313000000000002 |
| precision_at_5 | 18.945999999999998 |
| recall_at_1 | 76.87 |
| recall_at_10 | 97.418 |
| recall_at_100 | 99.579 |
| recall_at_1000 | 99.895 |
| recall_at_3 | 90.227 |
| recall_at_5 | 93.888 |

- C-MTEB/DuRetrieval (Dev Split)

| Metric Type | Value |
|---|---|
| map_at_1 | 25.941 |
| map_at_10 | 78.793 |
| map_at_100 | 81.57799999999999 |
| map_at_1000 | 81.626 |
| map_at_3 | 54.749 |
| map_at_5 | 69.16 |
| mrr_at_1 | 90.45 |
| mrr_at_10 | 93.406 |
| mrr_at_100 | 93.453 |
| mrr_at_1000 | 93.45700000000001 |
| mrr_at_3 | 93.10000000000001 |
| mrr_at_5 | 93.27499999999999 |
| ndcg_at_1 | 90.45 |
| ndcg_at_10 | 86.44500000000001 |
| ndcg_at_100 | 89.28399999999999 |
| ndcg_at_1000 | 89.739 |
| ndcg_at_3 | 85.62100000000001 |
| ndcg_at_5 | 84.441 |
| precision_at_1 | 90.45 |
| precision_at_10 | 41.19 |
| precision_at_100 | 4.761 |
| precision_at_1000 | 0.48700000000000004 |
| precision_at_3 | 76.583 |
| precision_at_5 | 64.68 |
| recall_at_1 | 25.941 |
| recall_at_10 | 87.443 |
| recall_at_100 | 96.54 |
| recall_at_1000 | 98.906 |
| recall_at_3 | 56.947 |
| recall_at_5 | 73.714 |

- C-MTEB/EcomRetrieval (Dev Split)

| Metric Type | Value |
|---|---|
| map_at_1 | 52.900000000000006 |
| map_at_10 | 63.144 |
| map_at_100 | 63.634 |
| map_at_1000 | 63.644999999999996 |
| map_at_3 | 60.817 |
| map_at_5 | 62.202 |
| mrr_at_1 | 52.900000000000006 |
| mrr_at_10 | 63.144 |
| mrr_at_100 | 63.634 |
| mrr_at_1000 | 63.644999999999996 |
| mrr_at_3 | 60.817 |
| mrr_at_5 | 62.202 |
| ndcg_at_1 | 52.900000000000006 |
| ndcg_at_10 | 68.042 |
| ndcg_at_100 | 70.417 |
| ndcg_at_1000 | 70.722 |
| ndcg_at_3 | 63.287000000000006 |
| ndcg_at_5 | 65.77 |
| precision_at_1 | 52.900000000000006 |
| precision_at_10 | 8.34 |
| precision_at_100 | 0.9450000000000001 |
| precision_at_1000 | 0.097 |
| precision_at_3 | 23.467 |
| precision_at_5 | 15.28 |
| recall_at_1 | 52.900000000000006 |
| recall_at_10 | 83.39999999999999 |
| recall_at_100 | 94.5 |
| recall_at_1000 | 96.89999999999999 |
| recall_at_3 | 70.39999999999999 |
| recall_at_5 | 76.4 |

- C-MTEB/MMarcoRetrieval (Dev Split)

| Metric Type | Value |
|---|---|
| map_at_1 | 65.58 |
| map_at_10 | 74.763 |
| map_at_100 | 75.077 |
| map_at_1000 | 75.091 |
| map_at_3 | 72.982 |
| map_at_5 | 74.155 |
| mrr_at_1 | 67.822 |
| mrr_at_10 | 75.437 |
| mrr_at_100 | 75.702 |
| mrr_at_1000 | 75.715 |
| mrr_at_3 | 73.91 |

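All of the retrieval metrics above (`map`/`ndcg`/`precision`/`recall@k`) derive from ranking a corpus by similarity to each query. A top-k retrieval sketch with a toy corpus (placeholders, not the C-MTEB corpora; repo id assumed as before):

```python
# Retrieval sketch: embed the corpus once, then take top-k by cosine similarity.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("lier007/xiaobu-embedding")  # assumed repo id
corpus = ["新冠疫苗接种注意事项", "高血压患者饮食建议", "流感的常见症状与治疗方法"]
query = "感冒了有哪些症状"

doc_emb = model.encode(corpus, normalize_embeddings=True)
q_emb = model.encode([query], normalize_embeddings=True)
scores = (q_emb @ doc_emb.T).ravel()

k = 2
topk = np.argsort(-scores)[:k]  # indices of the k most similar documents
print(f"top-{k}:", [corpus[i] for i in topk])
```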
#### 6. PairClassification Task
- C-MTEB/CMNLI (Validation Split)

| Metric Type | Value |
|---|---|
| cos_sim_accuracy | 83.22309079975948 |
| cos_sim_ap | 89.94833400328307 |
| cos_sim_f1 | 84.39319055464031 |
| cos_sim_precision | 79.5774647887324 |
| cos_sim_recall | 89.82931961655366 |
| dot_accuracy | 83.22309079975948 |
| dot_ap | 89.95618559578415 |
| dot_f1 | 84.41173239591345 |
| dot_precision | 79.61044343141317 |
| dot_recall | 89.82931961655366 |
| euclidean_accuracy | 83.23511725796753 |
| euclidean_ap | 89.94836342787318 |
| euclidean_f1 | 84.40550133096718 |
| euclidean_precision | 80.29120067524794 |
| euclidean_recall | 88.9642272620996 |
| manhattan_accuracy | 83.23511725796753 |
| manhattan_ap | 89.9450103956978 |
| manhattan_f1 | 84.44444444444444 |
| manhattan_precision | 80.09647651006712 |
| manhattan_recall | 89.29155950432546 |
| max_accuracy | 83.23511725796753 |
| max_ap | 89.95618559578415 |
| max_f1 | 84.44444444444444 |

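Pair classification thresholds a similarity score to decide whether two sentences form a positive pair; the `max_*` rows report the best result across the similarity functions (cosine, dot, Euclidean, Manhattan). A cosine-similarity sketch with toy pairs follows; in the real protocol the threshold is tuned, but it is fixed here for brevity, and the data and repo id are assumptions.

```python
# Pair-classification sketch; pairs, labels, threshold, and repo id are assumptions.
from sklearn.metrics import accuracy_score, average_precision_score, f1_score
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("lier007/xiaobu-embedding")  # assumed repo id
pairs = [("他在看书", "他正在阅读"), ("他在看书", "她在做饭")]
labels = [1, 0]

a = model.encode([p[0] for p in pairs], normalize_embeddings=True)
b = model.encode([p[1] for p in pairs], normalize_embeddings=True)
cos = (a * b).sum(axis=1)

threshold = 0.75  # tuned on dev data in the real protocol; fixed here
pred = (cos >= threshold).astype(int)
print("cos_sim_accuracy:", accuracy_score(labels, pred))
print("cos_sim_f1:      ", f1_score(labels, pred))
print("cos_sim_ap:      ", average_precision_score(labels, cos))
```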