Yinka
Developed by Classical
The model is evaluated on the Chinese Massive Text Embedding Benchmark (C-MTEB) across multiple tasks, including semantic textual similarity, classification, clustering, and retrieval.
Downloads: 388
Released: May 30, 2024
Model Overview
This model is evaluated on the Chinese Massive Text Embedding Benchmark (C-MTEB) and supports a range of natural language processing tasks, such as semantic similarity computation, text classification, clustering, and information retrieval.

Model Highlights
- Multi-task evaluation: comprehensively evaluated across C-MTEB tasks, including STS, classification, clustering, and retrieval.
- Optimized for Chinese: tuned specifically for Chinese text processing and performs well on many Chinese datasets.
- Diverse metrics: reported with a variety of evaluation metrics, including Pearson correlation, Spearman correlation, accuracy, and F1 score.

Model Capabilities
- Text similarity computation
- Text classification
- Text clustering
- Information retrieval
- Semantic matching
- Question-answer reranking

Use Cases
E-commerce
- Product review classification: sentiment classification of product reviews on e-commerce platforms; reaches 88.48% accuracy on the JDReview dataset.
- Product retrieval: product search and recommendation on e-commerce platforms; reaches MAP@10 of 63.11 on the EcomRetrieval dataset.

Healthcare
- Medical QA retrieval: question retrieval and matching in the medical domain; reaches MAP of 89.26 and 90.05 on the CMedQAv1 and CMedQAv2 datasets, respectively.
- Medical literature retrieval: retrieval and ranking of medical literature; reaches NDCG@10 of 65.20 on the MedicalRetrieval dataset.

General Semantic Understanding
- Semantic similarity computation: computes the semantic similarity of two texts; reaches a cosine-similarity Pearson correlation of 73.68 on the LCQMC dataset.
- Text classification: multi-class text classification; reaches 51.77% accuracy on the IFlyTek dataset.
🚀 Yinka Embedding Model
Yinka is an embedding model that supports variable embedding dimensions. It is further trained from the open-source model stella-v3.5-mrl, using the multi-task hybrid loss training method described in piccolo2.

🚀 Quick Start
Yinka is used the same way as stella-v3.5-mrl, with no prompt prefix required. Example:
from sentence_transformers import SentenceTransformer
from sklearn.preprocessing import normalize

model = SentenceTransformer("Classical/Yinka")
# Important: do NOT normalize yet. Slice the first n dims first, then normalize.
vectors = model.encode(["text1", "text2"], normalize_embeddings=False)
print(vectors.shape)  # (2, 1792)
n_dims = 768
cut_vecs = normalize(vectors[:, :n_dims])
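The truncate-then-normalize step above can be illustrated without downloading the model. Below is a minimal NumPy sketch in which random vectors stand in for the raw 1792-dimensional output of `model.encode(..., normalize_embeddings=False)`; the helper name `truncate_and_normalize` is illustrative, not part of any library.

```python
import numpy as np
from numpy.linalg import norm

def truncate_and_normalize(vectors: np.ndarray, n_dims: int) -> np.ndarray:
    """Keep the first n_dims components of each row, then L2-normalize.

    Mirrors the model-card recipe: encode with normalize_embeddings=False,
    slice the leading dimensions, and only then normalize.
    """
    cut = vectors[:, :n_dims]
    return cut / norm(cut, axis=1, keepdims=True)

# Stand-in for raw embeddings from model.encode(...)
rng = np.random.default_rng(0)
raw = rng.normal(size=(2, 1792))

vecs = truncate_and_normalize(raw, 768)
print(vecs.shape)  # (2, 768)

# After normalization, cosine similarity is just a dot product.
sim = float(vecs[0] @ vecs[1])
```

Normalizing before slicing would give rows whose truncated prefixes no longer have unit length, which is why the order of operations matters here.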
✨ Key Features
- Continued training: further trained from the open-source stella-v3.5-mrl model to improve performance.
- Multi-task hybrid loss: trained with the multi-task hybrid loss method described in piccolo2.
- Variable embedding dimension: embeddings can be truncated to a smaller dimension to fit different deployment needs.
📚 Documentation
Results Comparison

Model | Size (GB) | Dim | Seq. Length | Classification (9) | Clustering (4) | Pair Classification (2) | Reranking (4) | Retrieval (8) | STS (8) | Average (35) |
---|---|---|---|---|---|---|---|---|---|---|
Yinka | 1.21 | 1792 | 512 | 74.30 | 61.99 | 89.87 | 69.77 | 74.40 | 63.30 | 70.79 |
stella-v3.5-mrl | 1.21 | 1792 | 512 | 71.56 | 54.39 | 88.09 | 68.45 | 73.51 | 62.48 | 68.56 |
piccolo-large-zh-v2 | 1.21 | 1792 | 512 | 74.59 | 62.17 | 90.24 | 70.00 | 74.36 | 63.50 | 70.95 |
More Evaluation Metrics

Task Type | Dataset | Metric | Value |
---|---|---|---|
STS | C-MTEB/AFQMC | cos_sim_pearson | 56.306314279047875 |
STS | C-MTEB/AFQMC | cos_sim_spearman | 61.020227685004016 |
STS | C-MTEB/AFQMC | euclidean_pearson | 58.61821670933433 |
STS | C-MTEB/AFQMC | euclidean_spearman | 60.131457106640674 |
STS | C-MTEB/AFQMC | manhattan_pearson | 58.6189460369694 |
STS | C-MTEB/AFQMC | manhattan_spearman | 60.126350618526224 |
STS | C-MTEB/ATEC | cos_sim_pearson | 55.8612958476143 |
STS | C-MTEB/ATEC | cos_sim_spearman | 59.01977664864512 |
STS | C-MTEB/ATEC | euclidean_pearson | 62.028094897243655 |
STS | C-MTEB/ATEC | euclidean_spearman | 58.6046814257705 |
STS | C-MTEB/ATEC | manhattan_pearson | 62.02580042431887 |
STS | C-MTEB/ATEC | manhattan_spearman | 58.60626890004892 |
Classification | mteb/amazon_reviews_multi | accuracy | 49.496 |
Classification | mteb/amazon_reviews_multi | f1 | 46.673963383873065 |
STS | C-MTEB/BQ | cos_sim_pearson | 70.73971622592535 |
STS | C-MTEB/BQ | cos_sim_spearman | 72.76102992060764 |
STS | C-MTEB/BQ | euclidean_pearson | 71.04525865868672 |
STS | C-MTEB/BQ | euclidean_spearman | 72.4032852155075 |
STS | C-MTEB/BQ | manhattan_pearson | 71.03693009336658 |
STS | C-MTEB/BQ | manhattan_spearman | 72.39635701224252 |
Clustering | C-MTEB/CLSClusteringP2P | v_measure | 56.34751074520767 |
Clustering | C-MTEB/CLSClusteringS2S | v_measure | 48.4856662121073 |
Reranking | C-MTEB/CMedQAv1-reranking | map | 89.26384109024997 |
Reranking | C-MTEB/CMedQAv1-reranking | mrr | 91.27261904761905 |
Reranking | C-MTEB/CMedQAv2-reranking | map | 90.0464058154547 |
Reranking | C-MTEB/CMedQAv2-reranking | mrr | 92.06480158730159 |
Retrieval | C-MTEB/CmedqaRetrieval | map_at_1 | 27.236 |
Retrieval | C-MTEB/CmedqaRetrieval | map_at_10 | 40.778 |
Retrieval | C-MTEB/CmedqaRetrieval | map_at_100 | 42.692 |
Retrieval | C-MTEB/CmedqaRetrieval | map_at_1000 | 42.787 |
Retrieval | C-MTEB/CmedqaRetrieval | map_at_3 | 36.362 |
Retrieval | C-MTEB/CmedqaRetrieval | map_at_5 | 38.839 |
Retrieval | C-MTEB/CmedqaRetrieval | mrr_at_1 | 41.335 |
Retrieval | C-MTEB/CmedqaRetrieval | mrr_at_10 | 49.867 |
Retrieval | C-MTEB/CmedqaRetrieval | mrr_at_100 | 50.812999999999995 |
Retrieval | C-MTEB/CmedqaRetrieval | mrr_at_1000 | 50.848000000000006 |
Retrieval | C-MTEB/CmedqaRetrieval | mrr_at_3 | 47.354 |
Retrieval | C-MTEB/CmedqaRetrieval | mrr_at_5 | 48.718 |
Retrieval | C-MTEB/CmedqaRetrieval | ndcg_at_1 | 41.335 |
Retrieval | C-MTEB/CmedqaRetrieval | ndcg_at_10 | 47.642 |
Retrieval | C-MTEB/CmedqaRetrieval | ndcg_at_100 | 54.855 |
Retrieval | C-MTEB/CmedqaRetrieval | ndcg_at_1000 | 56.449000000000005 |
Retrieval | C-MTEB/CmedqaRetrieval | ndcg_at_3 | 42.203 |
Retrieval | C-MTEB/CmedqaRetrieval | ndcg_at_5 | 44.416 |
Retrieval | C-MTEB/CmedqaRetrieval | precision_at_1 | 41.335 |
Retrieval | C-MTEB/CmedqaRetrieval | precision_at_10 | 10.568 |
Retrieval | C-MTEB/CmedqaRetrieval | precision_at_100 | 1.6400000000000001 |
Retrieval | C-MTEB/CmedqaRetrieval | precision_at_1000 | 0.184 |
Retrieval | C-MTEB/CmedqaRetrieval | precision_at_3 | 23.998 |
Retrieval | C-MTEB/CmedqaRetrieval | precision_at_5 | 17.389 |
Retrieval | C-MTEB/CmedqaRetrieval | recall_at_1 | 27.236 |
Retrieval | C-MTEB/CmedqaRetrieval | recall_at_10 | 58.80800000000001 |
Retrieval | C-MTEB/CmedqaRetrieval | recall_at_100 | 88.411 |
Retrieval | C-MTEB/CmedqaRetrieval | recall_at_1000 | 99.032 |
Retrieval | C-MTEB/CmedqaRetrieval | recall_at_3 | 42.253 |
Retrieval | C-MTEB/CmedqaRetrieval | recall_at_5 | 49.118 |
PairClassification | C-MTEB/CMNLI | cos_sim_accuracy | 86.03728202044498 |
PairClassification | C-MTEB/CMNLI | cos_sim_ap | 92.49469583272597 |
PairClassification | C-MTEB/CMNLI | cos_sim_f1 | 86.74095974528088 |
PairClassification | C-MTEB/CMNLI | cos_sim_precision | 84.43657294664601 |
PairClassification | C-MTEB/CMNLI | cos_sim_recall | 89.17465513210195 |
PairClassification | C-MTEB/CMNLI | dot_accuracy | 72.21888153938664 |
PairClassification | C-MTEB/CMNLI | dot_ap | 80.59377163340332 |
PairClassification | C-MTEB/CMNLI | dot_f1 | 74.96686040583258 |
PairClassification | C-MTEB/CMNLI | dot_precision | 66.4737793851718 |
PairClassification | C-MTEB/CMNLI | dot_recall | 85.94809445873275 |
PairClassification | C-MTEB/CMNLI | euclidean_accuracy | 85.47203848466627 |
PairClassification | C-MTEB/CMNLI | euclidean_ap | 91.89152584749868 |
PairClassification | C-MTEB/CMNLI | euclidean_f1 | 86.38105975197294 |
PairClassification | C-MTEB/CMNLI | euclidean_precision | 83.40953625081646 |
PairClassification | C-MTEB/CMNLI | euclidean_recall | 89.5721299976619 |
PairClassification | C-MTEB/CMNLI | manhattan_accuracy | 85.3758268190018 |
PairClassification | C-MTEB/CMNLI | manhattan_ap | 91.88989707722311 |
PairClassification | C-MTEB/CMNLI | manhattan_f1 | 86.39767519839052 |
PairClassification | C-MTEB/CMNLI | manhattan_precision | 82.76231263383298 |
PairClassification | C-MTEB/CMNLI | manhattan_recall | 90.36707972878185 |
PairClassification | C-MTEB/CMNLI | max_accuracy | 86.03728202044498 |
PairClassification | C-MTEB/CMNLI | max_ap | 92.49469583272597 |
PairClassification | C-MTEB/CMNLI | max_f1 | 86.74095974528088 |
Retrieval | C-MTEB/CovidRetrieval | map_at_1 | 74.34100000000001 |
Retrieval | C-MTEB/CovidRetrieval | map_at_10 | 82.49499999999999 |
Retrieval | C-MTEB/CovidRetrieval | map_at_100 | 82.64200000000001 |
Retrieval | C-MTEB/CovidRetrieval | map_at_1000 | 82.643 |
Retrieval | C-MTEB/CovidRetrieval | map_at_3 | 81.142 |
Retrieval | C-MTEB/CovidRetrieval | map_at_5 | 81.95400000000001 |
Retrieval | C-MTEB/CovidRetrieval | mrr_at_1 | 74.71 |
Retrieval | C-MTEB/CovidRetrieval | mrr_at_10 | 82.553 |
Retrieval | C-MTEB/CovidRetrieval | mrr_at_100 | 82.699 |
Retrieval | C-MTEB/CovidRetrieval | mrr_at_1000 | 82.70100000000001 |
Retrieval | C-MTEB/CovidRetrieval | mrr_at_3 | 81.279 |
Retrieval | C-MTEB/CovidRetrieval | mrr_at_5 | 82.069 |
Retrieval | C-MTEB/CovidRetrieval | ndcg_at_1 | 74.605 |
Retrieval | C-MTEB/CovidRetrieval | ndcg_at_10 | 85.946 |
Retrieval | C-MTEB/CovidRetrieval | ndcg_at_100 | 86.607 |
Retrieval | C-MTEB/CovidRetrieval | ndcg_at_1000 | 86.669 |
Retrieval | C-MTEB/CovidRetrieval | ndcg_at_3 | 83.263 |
Retrieval | C-MTEB/CovidRetrieval | ndcg_at_5 | 84.71600000000001 |
Retrieval | C-MTEB/CovidRetrieval | precision_at_1 | 74.605 |
Retrieval | C-MTEB/CovidRetrieval | precision_at_10 | 9.758 |
Retrieval | C-MTEB/CovidRetrieval | precision_at_100 | 1.005 |
Retrieval | C-MTEB/CovidRetrieval | precision_at_1000 | 0.101 |
Retrieval | C-MTEB/CovidRetrieval | precision_at_3 | 29.996000000000002 |
Retrieval | C-MTEB/CovidRetrieval | precision_at_5 | 18.736 |
Retrieval | C-MTEB/CovidRetrieval | recall_at_1 | 74.34100000000001 |
Retrieval | C-MTEB/CovidRetrieval | recall_at_10 | 96.523 |
Retrieval | C-MTEB/CovidRetrieval | recall_at_100 | 99.473 |
Retrieval | C-MTEB/CovidRetrieval | recall_at_1000 | 100.0 |
Retrieval | C-MTEB/CovidRetrieval | recall_at_3 | 89.278 |
Retrieval | C-MTEB/CovidRetrieval | recall_at_5 | 92.83500000000001 |
Retrieval | C-MTEB/DuRetrieval | map_at_1 | 26.950000000000003 |
Retrieval | C-MTEB/DuRetrieval | map_at_10 | 82.408 |
Retrieval | C-MTEB/DuRetrieval | map_at_100 | 85.057 |
Retrieval | C-MTEB/DuRetrieval | map_at_1000 | 85.09100000000001 |
Retrieval | C-MTEB/DuRetrieval | map_at_3 | 57.635999999999996 |
Retrieval | C-MTEB/DuRetrieval | map_at_5 | 72.48 |
Retrieval | C-MTEB/DuRetrieval | mrr_at_1 | 92.15 |
Retrieval | C-MTEB/DuRetrieval | mrr_at_10 | 94.554 |
Retrieval | C-MTEB/DuRetrieval | mrr_at_100 | 94.608 |
Retrieval | C-MTEB/DuRetrieval | mrr_at_1000 | 94.61 |
Retrieval | C-MTEB/DuRetrieval | mrr_at_3 | 94.292 |
Retrieval | C-MTEB/DuRetrieval | mrr_at_5 | 94.459 |
Retrieval | C-MTEB/DuRetrieval | ndcg_at_1 | 92.15 |
Retrieval | C-MTEB/DuRetrieval | ndcg_at_10 | 89.108 |
Retrieval | C-MTEB/DuRetrieval | ndcg_at_100 | 91.525 |
Retrieval | C-MTEB/DuRetrieval | ndcg_at_1000 | 91.82900000000001 |
Retrieval | C-MTEB/DuRetrieval | ndcg_at_3 | 88.44 |
Retrieval | C-MTEB/DuRetrieval | ndcg_at_5 | 87.271 |
Retrieval | C-MTEB/DuRetrieval | precision_at_1 | 92.15 |
Retrieval | C-MTEB/DuRetrieval | precision_at_10 | 42.29 |
Retrieval | C-MTEB/DuRetrieval | precision_at_100 | 4.812 |
Retrieval | C-MTEB/DuRetrieval | precision_at_1000 | 0.48900000000000005 |
Retrieval | C-MTEB/DuRetrieval | precision_at_3 | 79.14999999999999 |
Retrieval | C-MTEB/DuRetrieval | precision_at_5 | 66.64 |
Retrieval | C-MTEB/DuRetrieval | recall_at_1 | 26.950000000000003 |
Retrieval | C-MTEB/DuRetrieval | recall_at_10 | 89.832 |
Retrieval | C-MTEB/DuRetrieval | recall_at_100 | 97.921 |
Retrieval | C-MTEB/DuRetrieval | recall_at_1000 | 99.471 |
Retrieval | C-MTEB/DuRetrieval | recall_at_3 | 59.562000000000005 |
Retrieval | C-MTEB/DuRetrieval | recall_at_5 | 76.533 |
Retrieval | C-MTEB/EcomRetrieval | map_at_1 | 53.5 |
Retrieval | C-MTEB/EcomRetrieval | map_at_10 | 63.105999999999995 |
Retrieval | C-MTEB/EcomRetrieval | map_at_100 | 63.63100000000001 |
Retrieval | C-MTEB/EcomRetrieval | map_at_1000 | 63.641999999999996 |
Retrieval | C-MTEB/EcomRetrieval | map_at_3 | 60.617 |
Retrieval | C-MTEB/EcomRetrieval | map_at_5 | 62.132 |
Retrieval | C-MTEB/EcomRetrieval | mrr_at_1 | 53.5 |
Retrieval | C-MTEB/EcomRetrieval | mrr_at_10 | 63.105999999999995 |
Retrieval | C-MTEB/EcomRetrieval | mrr_at_100 | 63.63100000000001 |
Retrieval | C-MTEB/EcomRetrieval | mrr_at_1000 | 63.641999999999996 |
Retrieval | C-MTEB/EcomRetrieval | mrr_at_3 | 60.617 |
Retrieval | C-MTEB/EcomRetrieval | mrr_at_5 | 62.132 |
Retrieval | C-MTEB/EcomRetrieval | ndcg_at_1 | 53.5 |
Retrieval | C-MTEB/EcomRetrieval | ndcg_at_10 | 67.92200000000001 |
Retrieval | C-MTEB/EcomRetrieval | ndcg_at_100 | 70.486 |
Retrieval | C-MTEB/EcomRetrieval | ndcg_at_1000 | 70.777 |
Retrieval | C-MTEB/EcomRetrieval | ndcg_at_3 | 62.853 |
Retrieval | C-MTEB/EcomRetrieval | ndcg_at_5 | 65.59899999999999 |
Retrieval | C-MTEB/EcomRetrieval | precision_at_1 | 53.5 |
Retrieval | C-MTEB/EcomRetrieval | precision_at_10 | 8.309999999999999 |
Retrieval | C-MTEB/EcomRetrieval | precision_at_100 | 0.951 |
Retrieval | C-MTEB/EcomRetrieval | precision_at_1000 | 0.097 |
Retrieval | C-MTEB/EcomRetrieval | precision_at_3 | 23.1 |
Retrieval | C-MTEB/EcomRetrieval | precision_at_5 | 15.2 |
Retrieval | C-MTEB/EcomRetrieval | recall_at_1 | 53.5 |
Retrieval | C-MTEB/EcomRetrieval | recall_at_10 | 83.1 |
Retrieval | C-MTEB/EcomRetrieval | recall_at_100 | 95.1 |
Retrieval | C-MTEB/EcomRetrieval | recall_at_1000 | 97.39999999999999 |
Retrieval | C-MTEB/EcomRetrieval | recall_at_3 | 69.3 |
Retrieval | C-MTEB/EcomRetrieval | recall_at_5 | 76.0 |
Classification | C-MTEB/IFlyTek-classification | accuracy | 51.773759138130046 |
Classification | C-MTEB/IFlyTek-classification | f1 | 40.38600802756481 |
Classification | C-MTEB/JDReview-classification | accuracy | 88.48030018761726 |
Classification | C-MTEB/JDReview-classification | ap | 59.2732541555627 |
Classification | C-MTEB/JDReview-classification | f1 | 83.58836007358619 |
STS | C-MTEB/LCQMC | cos_sim_pearson | 73.67511194245922 |
STS | C-MTEB/LCQMC | cos_sim_spearman | 79.43347759067298 |
STS | C-MTEB/LCQMC | euclidean_pearson | 79.04491504318766 |
STS | C-MTEB/LCQMC | euclidean_spearman | 79.14478545356785 |
STS | C-MTEB/LCQMC | manhattan_pearson | 79.03365022867428 |
STS | C-MTEB/LCQMC | manhattan_spearman | 79.13172717619908 |
Retrieval | C-MTEB/MMarcoRetrieval | map_at_1 | 67.184 |
Retrieval | C-MTEB/MMarcoRetrieval | map_at_10 | 76.24600000000001 |
Retrieval | C-MTEB/MMarcoRetrieval | map_at_100 | 76.563 |
Retrieval | C-MTEB/MMarcoRetrieval | map_at_1000 | 76.575 |
Retrieval | C-MTEB/MMarcoRetrieval | map_at_3 | 74.522 |
Retrieval | C-MTEB/MMarcoRetrieval | map_at_5 | 75.598 |
Retrieval | C-MTEB/MMarcoRetrieval | mrr_at_1 | 69.47 |
Retrieval | C-MTEB/MMarcoRetrieval | mrr_at_10 | 76.8 |
Retrieval | C-MTEB/MMarcoRetrieval | mrr_at_100 | 77.082 |
Retrieval | C-MTEB/MMarcoRetrieval | mrr_at_1000 | 77.093 |
Retrieval | C-MTEB/MMarcoRetrieval | mrr_at_3 | 75.29400000000001 |
Retrieval | C-MTEB/MMarcoRetrieval | mrr_at_5 | 76.24 |
Retrieval | C-MTEB/MMarcoRetrieval | ndcg_at_1 | 69.47 |
Retrieval | C-MTEB/MMarcoRetrieval | ndcg_at_10 | 79.81099999999999 |
Retrieval | C-MTEB/MMarcoRetrieval | ndcg_at_100 | 81.187 |
Retrieval | C-MTEB/MMarcoRetrieval | ndcg_at_1000 | 81.492 |
Retrieval | C-MTEB/MMarcoRetrieval | ndcg_at_3 | 76.536 |
Retrieval | C-MTEB/MMarcoRetrieval | ndcg_at_5 | 78.367 |
Retrieval | C-MTEB/MMarcoRetrieval | precision_at_1 | 69.47 |
Retrieval | C-MTEB/MMarcoRetrieval | precision_at_10 | 9.599 |
Retrieval | C-MTEB/MMarcoRetrieval | precision_at_100 | 1.026 |
Retrieval | C-MTEB/MMarcoRetrieval | precision_at_1000 | 0.105 |
Retrieval | C-MTEB/MMarcoRetrieval | precision_at_3 | 28.777 |
Retrieval | C-MTEB/MMarcoRetrieval | precision_at_5 | 18.232 |
Retrieval | C-MTEB/MMarcoRetrieval | recall_at_1 | 67.184 |
Retrieval | C-MTEB/MMarcoRetrieval | recall_at_10 | 90.211 |
Retrieval | C-MTEB/MMarcoRetrieval | recall_at_100 | 96.322 |
Retrieval | C-MTEB/MMarcoRetrieval | recall_at_1000 | 98.699 |
Retrieval | C-MTEB/MMarcoRetrieval | recall_at_3 | 81.556 |
Retrieval | C-MTEB/MMarcoRetrieval | recall_at_5 | 85.931 |
Classification | mteb/amazon_massive_intent | accuracy | 76.96032279757901 |
Classification | mteb/amazon_massive_intent | f1 | 73.48052314033545 |
Classification | mteb/amazon_massive_scenario | accuracy | 84.64357767316744 |
Classification | mteb/amazon_massive_scenario | f1 | 83.58250539497922 |
Retrieval | C-MTEB/MedicalRetrieval | map_at_1 | 56.00000000000001 |
Retrieval | C-MTEB/MedicalRetrieval | map_at_10 | 62.066 |
Retrieval | C-MTEB/MedicalRetrieval | map_at_100 | 62.553000000000004 |
Retrieval | C-MTEB/MedicalRetrieval | map_at_1000 | 62.598 |
Retrieval | C-MTEB/MedicalRetrieval | map_at_3 | 60.4 |
Retrieval | C-MTEB/MedicalRetrieval | map_at_5 | 61.370000000000005 |
Retrieval | C-MTEB/MedicalRetrieval | mrr_at_1 | 56.2 |
Retrieval | C-MTEB/MedicalRetrieval | mrr_at_10 | 62.166 |
Retrieval | C-MTEB/MedicalRetrieval | mrr_at_100 | 62.653000000000006 |
Retrieval | C-MTEB/MedicalRetrieval | mrr_at_1000 | 62.699000000000005 |
Retrieval | C-MTEB/MedicalRetrieval | mrr_at_3 | 60.5 |
Retrieval | C-MTEB/MedicalRetrieval | mrr_at_5 | 61.47 |
Retrieval | C-MTEB/MedicalRetrieval | ndcg_at_1 | 56.00000000000001 |
Retrieval | C-MTEB/MedicalRetrieval | ndcg_at_10 | 65.199 |
Retrieval | C-MTEB/MedicalRetrieval | ndcg_at_100 | 67.79899999999999 |
Retrieval | C-MTEB/MedicalRetrieval | ndcg_at_1000 | 69.056 |
Retrieval | C-MTEB/MedicalRetrieval | ndcg_at_3 | 61.814 |
Retrieval | C-MTEB/MedicalRetrieval | ndcg_at_5 | 63.553000000000004 |
Retrieval | C-MTEB/MedicalRetrieval | precision_at_1 | 56.00000000000001 |
Retrieval | C-MTEB/MedicalRetrieval | precision_at_10 | 7.51 |
Retrieval | C-MTEB/MedicalRetrieval | precision_at_100 | 0.878 |
Retrieval | C-MTEB/MedicalRetrieval | precision_at_1000 | 0.098 |
Retrieval | C-MTEB/MedicalRetrieval | precision_at_3 | 21.967 |
Retrieval | C-MTEB/MedicalRetrieval | precision_at_5 | 14.02 |
Retrieval | C-MTEB/MedicalRetrieval | recall_at_1 | 56.00000000000001 |
Retrieval | C-MTEB/MedicalRetrieval | recall_at_10 | 75.1 |
Retrieval | C-MTEB/MedicalRetrieval | recall_at_100 | 87.8 |
Retrieval | C-MTEB/MedicalRetrieval | recall_at_1000 | 97.7 |
Retrieval | C-MTEB/MedicalRetrieval | recall_at_3 | 65.9 |
Retrieval | C-MTEB/MedicalRetrieval | recall_at_5 | 70.1 |
Reranking | C-MTEB/Mmarco-reranking | map | 32.74158258279793 |
Reranking | C-MTEB/Mmarco-reranking | mrr | 31.56071428571428 |
Classification | C-MTEB/MultilingualSentiment-classification | accuracy | 78.96666666666667 |
Classification | C-MTEB/MultilingualSentiment-classification | f1 | 78.82528563818045 |
PairClassification | C-MTEB/OCNLI | cos_sim_accuracy | 83.54087709799674 |
PairClassification | C-MTEB/OCNLI | cos_sim_ap | 87.26170197077586 |
PairClassification | C-MTEB/OCNLI | cos_sim_f1 | 84.7609561752988 |
PairClassification | C-MTEB/OCNLI | cos_sim_precision | 80.20735155513667 |
PairClassification | C-MTEB/OCNLI | cos_sim_recall | 89.86272439281943 |
PairClassification | C-MTEB/OCNLI | dot_accuracy | 72.22523010286952 |
PairClassification | C-MTEB/OCNLI | dot_ap | 79.51975358187732 |
PairClassification | C-MTEB/OCNLI | dot_f1 | 76.32183908045977 |
PairClassification | C-MTEB/OCNLI | dot_precision | 67.58957654723126 |
PairClassification | C-MTEB/OCNLI | dot_recall | 87.64519535374869 |
PairClassification | C-MTEB/OCNLI | euclidean_accuracy | 82.0249052517596 |
PairClassification | C-MTEB/OCNLI | euclidean_ap | 85.32829948726406 |
PairClassification | C-MTEB/OCNLI | euclidean_f1 | 83.24924318869829 |
PairClassification | C-MTEB/OCNLI | euclidean_precision | 79.71014492753623 |
PairClassification | C-MTEB/OCNLI | euclidean_recall | 87.11721224920802 |
PairClassification | C-MTEB/OCNLI | manhattan_accuracy | 82.13318895506227 |
PairClassification | C-MTEB/OCNLI | manhattan_ap | 85.28856869288006 |
PairClassification | C-MTEB/OCNLI | manhattan_f1 | 83.34946757018393 |
PairClassification | C-MTEB/OCNLI | manhattan_precision | 76.94369973190348 |
PairClassification | C-MTEB/OCNLI | manhattan_recall | 90.91869060190075 |
PairClassification | C-MTEB/OCNLI | max_accuracy | 83.54087709799674 |
PairClassification | C-MTEB/OCNLI | max_ap | 87.26170197077586 |
PairClassification | C-MTEB/OCNLI | max_f1 | 84.7609561752988 |
Classification | C-MTEB/OnlineShopping-classification | accuracy | 94.56 |
Classification | C-MTEB/OnlineShopping-classification | ap | 92.80848436710805 |
Classification | C-MTEB/OnlineShopping-classification | f1 | 94.54951966576111 |
STS | C-MTEB/PAWSX | cos_sim_pearson | 39.0866558287863 |
STS | C-MTEB/PAWSX | cos_sim_spearman | 45.9211126233312 |
STS | C-MTEB/PAWSX | euclidean_pearson | 44.86568743222145 |
STS | C-MTEB/PAWSX | euclidean_spearman | 45.63882757207507 |
STS | C-MTEB/PAWSX | manhattan_pearson | 44.89480036909126 |
STS | C-MTEB/PAWSX | manhattan_spearman | 45.65929449046206 |
STS | C-MTEB/QBQTC | cos_sim_pearson | 43.04701793979569 |
STS | C-MTEB/QBQTC | cos_sim_spearman | 44.87491033760315 |
STS | C-MTEB/QBQTC | euclidean_pearson | 36.2004061032567 |
STS | C-MTEB/QBQTC | euclidean_spearman | 41.44823909683865 |
STS | C-MTEB/QBQTC | manhattan_pearson | 36.136113427955095 |
STS | C-MTEB/QBQTC | manhattan_spearman | 41.39225495993949 |
STS | mteb/sts22-crosslingual-sts | cos_sim_pearson | 61.65611315777857 |
STS | mteb/sts22-crosslingual-sts | cos_sim_spearman | 64.4067673105648 |
STS | mteb/sts22-crosslingual-sts | euclidean_pearson | 61.814977248797184 |
STS | mteb/sts22-crosslingual-sts | euclidean_spearman | 63.99473350700169 |
STS | mteb/sts22-crosslingual-sts | manhattan_pearson | 61.684304629588624 |
STS | mteb/sts22-crosslingual-sts | manhattan_spearman | 63.97831213239316 |
STS | C-MTEB/STSB | cos_sim_pearson | 76.57324933064379 |
STS | C-MTEB/STSB | cos_sim_spearman | 79.23602286949782 |
STS | C-MTEB/STSB | euclidean_pearson | 80.28226284310948 |
STS | C-MTEB/STSB | euclidean_spearman | 80.32210477608423 |
STS | C-MTEB/STSB | manhattan_pearson | 80.27262188617811 |
STS | C-MTEB/STSB | manhattan_spearman | 80.31619185039723 |
Reranking | C-MTEB/T2Reranking | map | 67.05266891356277 |
Reranking | C-MTEB/T2Reranking | mrr | 77.1906333623497 |
Retrieval | C-MTEB/T2Retrieval | map_at_1 | 28.212 |
Retrieval | C-MTEB/T2Retrieval | map_at_10 | 78.932 |
Retrieval | C-MTEB/T2Retrieval | map_at_100 | 82.51899999999999 |
Retrieval | C-MTEB/T2Retrieval | map_at_1000 | 82.575 |
Retrieval | C-MTEB/T2Retrieval | map_at_3 | 55.614 |
Retrieval | C-MTEB/T2Retrieval | map_at_5 | 68.304 |
Retrieval | C-MTEB/T2Retrieval | mrr_at_1 | 91.211 |
Retrieval | C-MTEB/T2Retrieval | mrr_at_10 | 93.589 |
Retrieval | C-MTEB/T2Retrieval | mrr_at_100 | 93.659 |
Retrieval | C-MTEB/T2Retrieval | mrr_at_1000 | 93.662 |
Retrieval | C-MTEB/T2Retrieval | mrr_at_3 | 93.218 |
Retrieval | C-MTEB/T2Retrieval | mrr_at_5 | 93.453 |
Retrieval | C-MTEB/T2Retrieval | ndcg_at_1 | 91.211 |
Retrieval | C-MTEB/T2Retrieval | ndcg_at_10 | 86.24000000000001 |
Retrieval | C-MTEB/T2Retrieval | ndcg_at_100 | 89.614 |
Retrieval | C-MTEB/T2Retrieval | ndcg_at_1000 | 90.14 |
Retrieval | C-MTEB/T2Retrieval | ndcg_at_3 | 87.589 |
Retrieval | C-MTEB/T2Retrieval | ndcg_at_5 | 86.265 |
Retrieval | C-MTEB/T2Retrieval | precision_at_1 | 91.211 |
Retrieval | C-MTEB/T2Retrieval | precision_at_10 | 42.626 |
Retrieval | C-MTEB/T2Retrieval | precision_at_100 | 5.043 |
Retrieval | C-MTEB/T2Retrieval | precision_at_1000 | 0.517 |
Retrieval | C-MTEB/T2Retrieval | precision_at_3 | 76.42 |
Retrieval | C-MTEB/T2Retrieval | precision_at_5 | 64.045 |
Retrieval | C-MTEB/T2Retrieval | recall_at_1 | 28.212 |
Retrieval | C-MTEB/T2Retrieval | recall_at_10 | 85.223 |
Retrieval | C-MTEB/T2Retrieval | recall_at_100 | 96.229 |
Retrieval | C-MTEB/T2Retrieval | recall_at_1000 | 98.849 |
Retrieval | C-MTEB/T2Retrieval | recall_at_3 | 57.30800000000001 |
Retrieval | C-MTEB/T2Retrieval | recall_at_5 | 71.661 |
Classification | C-MTEB/TNews-classification | accuracy | 54.385000000000005 |
Classification | C-MTEB/TNews-classification | f1 | 52.38762400903556 |
Clustering | C-MTEB/ThuNewsClusteringP2P | v_measure | 74.55283855942916 |
Clustering | C-MTEB/ThuNewsClusteringS2S | v_measure | 68.55115316700493 |
Retrieval | C-MTEB/VideoRetrieval | map_at_1 | 58.8 |
Retrieval | C-MTEB/VideoRetrieval | map_at_10 | 69.035 |
Retrieval | C-MTEB/VideoRetrieval | map_at_100 | 69.52000000000001 |
Retrieval | C-MTEB/VideoRetrieval | map_at_1000 | 69.529 |
Retrieval | C-MTEB/VideoRetrieval | map_at_3 | 67.417 |
Retrieval | C-MTEB/VideoRetrieval | map_at_5 | 68.407 |
Retrieval | C-MTEB/VideoRetrieval | mrr_at_1 | 58.8 |
Retrieval | C-MTEB/VideoRetrieval | mrr_at_10 | 69.035 |
Retrieval | C-MTEB/VideoRetrieval | mrr_at_100 | 69.52000000000001 |
Retrieval | C-MTEB/VideoRetrieval | mrr_at_1000 | 69.529 |
Retrieval | C-MTEB/VideoRetrieval | mrr_at_3 | 67.417 |
Retrieval | C-MTEB/VideoRetrieval | mrr_at_5 | 68.407 |
Retrieval | C-MTEB/VideoRetrieval | ndcg_at_1 | 58.8 |
Retrieval | C-MTEB/VideoRetrieval | ndcg_at_10 | 73.395 |
Retrieval | C-MTEB/VideoRetrieval | ndcg_at_100 | 75.62 |
Retrieval | C-MTEB/VideoRetrieval | ndcg_at_1000 | 75.90299999999999 |
Retrieval | C-MTEB/VideoRetrieval | ndcg_at_3 | 70.11800000000001 |
Retrieval | C-MTEB/VideoRetrieval | ndcg_at_5 | 71.87400000000001 |
Retrieval | C-MTEB/VideoRetrieval | precision_at_1 | 58.8 |
Retrieval | C-MTEB/VideoRetrieval | precision_at_10 | 8.68 |
Retrieval | C-MTEB/VideoRetrieval | precision_at_100 | 0.9690000000000001 |
Retrieval | C-MTEB/VideoRetrieval | precision_at_1000 | 0.099 |
Retrieval | C-MTEB/VideoRetrieval | precision_at_3 | 25.967000000000002 |
Retrieval | C-MTEB/VideoRetrieval | precision_at_5 | 16.42 |
Retrieval | C-MTEB/VideoRetrieval | recall_at_1 | 58.8 |
Retrieval | C-MTEB/VideoRetrieval | recall_at_10 | 86.8 |
Retrieval | C-MTEB/VideoRetrieval | recall_at_100 | 96.89999999999999 |
Retrieval | C-MTEB/VideoRetrieval | recall_at_1000 | 99.2 |
Retrieval | C-MTEB/VideoRetrieval | recall_at_3 | 77.9 |
Retrieval | C-MTEB/VideoRetrieval | recall_at_5 | 82.1 |
Classification | C-MTEB/waimai-classification | accuracy | 89.42 |
Classification | C-MTEB/waimai-classification | ap | 75.35978503182068 |
Classification | C-MTEB/waimai-classification | f1 | 88.01006394348263 |
📄 License
This model is released under the MIT license.