Yinka
Developed by Classical
Downloads: 388
Released: 5/30/2024
Model Overview
A model evaluated on the Chinese Massive Text Embedding Benchmark (C-MTEB), supporting a range of natural language processing tasks such as semantic similarity computation, text classification, clustering, and information retrieval.
Model Highlights
Multi-task evaluation
Comprehensively evaluated on multiple tasks of the C-MTEB benchmark, including STS, classification, clustering, and retrieval.
Optimized for Chinese
Specifically optimized for Chinese text processing, with strong results across many Chinese datasets.
Diverse metrics
Reports a variety of evaluation metrics, including Pearson correlation, Spearman correlation, accuracy, and F1 score.
Capabilities
Text similarity computation
Text classification
Text clustering
Information retrieval
Semantic matching
Question-answer reranking
Use Cases
E-commerce
Product review classification
Sentiment classification of product reviews on e-commerce platforms
Reaches 88.48% accuracy on the JDReview dataset
Product retrieval
Product search and recommendation for e-commerce platforms
Reaches MAP@10 of 63.11 on the EcomRetrieval dataset
Healthcare
Medical QA retrieval
Question retrieval and matching in the medical domain
Reaches MAP of 89.26 and 90.05 on CMedQAv1 and CMedQAv2, respectively
Medical literature retrieval
Retrieval and ranking of medical literature
Reaches NDCG@10 of 65.20 on the MedicalRetrieval dataset
General semantic understanding
Semantic similarity computation
Computes the semantic similarity between two pieces of text
Reaches a cosine-similarity Pearson correlation of 73.68 on the LCQMC dataset
Text classification
Multi-class text classification
Reaches 51.77% accuracy on the IFlyTek dataset
🚀 Yinka Embedding Model
The Yinka embedding model supports variable embedding dimensions. It was obtained by continued training of the open-source stella-v3.5-mrl model, using the multi-task hybrid loss described in piccolo2.
🚀 Quick Start
The Yinka model is used in exactly the same way as stella-v3.5-mrl, with no prompt prefixes required. Example:
from sentence_transformers import SentenceTransformer
from sklearn.preprocessing import normalize
model = SentenceTransformer("Classical/Yinka")
# Note: do NOT normalize yet! Take the first n dimensions first, then normalize
vectors = model.encode(["text1", "text2"], normalize_embeddings=False)
print(vectors.shape) # shape is [2, 1792]
n_dims = 768
cut_vecs = normalize(vectors[:, :n_dims])
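The comment above warns against normalizing before truncation. A minimal numpy sketch (random vectors as hypothetical stand-ins for real model output) shows why the order matters: truncating an already-normalized vector leaves rows that are no longer unit-length, so dot products would not be true cosine similarities.

```python
import numpy as np

# Simulated embeddings in place of model.encode(...) output (hypothetical data);
# the shape matches the [2, 1792] vectors produced above.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(2, 1792))
n_dims = 768

# Correct order: truncate first, then L2-normalize each row
# (this is what sklearn.preprocessing.normalize does row-wise)
right = vectors[:, :n_dims]
right = right / np.linalg.norm(right, axis=1, keepdims=True)

# Wrong order: normalize the full vector, then truncate —
# the truncated rows have norm < 1 and are not valid unit vectors
wrong = (vectors / np.linalg.norm(vectors, axis=1, keepdims=True))[:, :n_dims]

print(np.linalg.norm(right, axis=1))  # each row has unit norm
print(np.linalg.norm(wrong, axis=1))  # each row falls short of 1.0
```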
✨ Key Features
- Continued training: further trained on top of the open-source stella-v3.5-mrl model to improve performance.
- Multi-task hybrid loss: trained with the multi-task hybrid loss method described in piccolo2.
- Variable embedding dimensions: supports truncating embeddings to different dimensions to fit different scenarios.
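The variable-dimension property means a single encoding pass can serve several dimension budgets. A sketch of comparing the same pair of texts at several dimensions, using simulated vectors (numpy only; real embeddings would come from the model above):

```python
import numpy as np

# Simulated 1792-dim embeddings standing in for real output of
# SentenceTransformer("Classical/Yinka") (hypothetical data).
rng = np.random.default_rng(42)
vectors = rng.normal(size=(2, 1792))

def cut_and_normalize(vecs, n_dims):
    """Truncate each row to its first n_dims values, then L2-normalize."""
    cut = vecs[:, :n_dims]
    return cut / np.linalg.norm(cut, axis=1, keepdims=True)

# After normalization, cosine similarity is just a dot product,
# and the same pair can be scored at several dimension budgets
for n_dims in (256, 768, 1792):
    a, b = cut_and_normalize(vectors, n_dims)
    print(n_dims, float(a @ b))
```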
📚 Documentation
Results Comparison
Model | Size (GB) | Dimension | Seq. Length | Classification (9) | Clustering (4) | Pair Classification (2) | Reranking (4) | Retrieval (8) | STS (8) | Avg. (35) |
---|---|---|---|---|---|---|---|---|---|---|
Yinka | 1.21 | 1792 | 512 | 74.30 | 61.99 | 89.87 | 69.77 | 74.40 | 63.30 | 70.79 |
stella-v3.5-mrl | 1.21 | 1792 | 512 | 71.56 | 54.39 | 88.09 | 68.45 | 73.51 | 62.48 | 68.56 |
piccolo-large-zh-v2 | 1.21 | 1792 | 512 | 74.59 | 62.17 | 90.24 | 70 | 74.36 | 63.5 | 70.95 |
More Evaluation Metrics
Task Type | Dataset | Metric | Value |
---|---|---|---|
STS | C-MTEB/AFQMC | cos_sim_pearson | 56.306314279047875 |
STS | C-MTEB/AFQMC | cos_sim_spearman | 61.020227685004016 |
STS | C-MTEB/AFQMC | euclidean_pearson | 58.61821670933433 |
STS | C-MTEB/AFQMC | euclidean_spearman | 60.131457106640674 |
STS | C-MTEB/AFQMC | manhattan_pearson | 58.6189460369694 |
STS | C-MTEB/AFQMC | manhattan_spearman | 60.126350618526224 |
STS | C-MTEB/ATEC | cos_sim_pearson | 55.8612958476143 |
STS | C-MTEB/ATEC | cos_sim_spearman | 59.01977664864512 |
STS | C-MTEB/ATEC | euclidean_pearson | 62.028094897243655 |
STS | C-MTEB/ATEC | euclidean_spearman | 58.6046814257705 |
STS | C-MTEB/ATEC | manhattan_pearson | 62.02580042431887 |
STS | C-MTEB/ATEC | manhattan_spearman | 58.60626890004892 |
Classification | mteb/amazon_reviews_multi | accuracy | 49.496 |
Classification | mteb/amazon_reviews_multi | f1 | 46.673963383873065 |
STS | C-MTEB/BQ | cos_sim_pearson | 70.73971622592535 |
STS | C-MTEB/BQ | cos_sim_spearman | 72.76102992060764 |
STS | C-MTEB/BQ | euclidean_pearson | 71.04525865868672 |
STS | C-MTEB/BQ | euclidean_spearman | 72.4032852155075 |
STS | C-MTEB/BQ | manhattan_pearson | 71.03693009336658 |
STS | C-MTEB/BQ | manhattan_spearman | 72.39635701224252 |
Clustering | C-MTEB/CLSClusteringP2P | v_measure | 56.34751074520767 |
Clustering | C-MTEB/CLSClusteringS2S | v_measure | 48.4856662121073 |
Reranking | C-MTEB/CMedQAv1-reranking | map | 89.26384109024997 |
Reranking | C-MTEB/CMedQAv1-reranking | mrr | 91.27261904761905 |
Reranking | C-MTEB/CMedQAv2-reranking | map | 90.0464058154547 |
Reranking | C-MTEB/CMedQAv2-reranking | mrr | 92.06480158730159 |
Retrieval | C-MTEB/CmedqaRetrieval | map_at_1 | 27.236 |
Retrieval | C-MTEB/CmedqaRetrieval | map_at_10 | 40.778 |
Retrieval | C-MTEB/CmedqaRetrieval | map_at_100 | 42.692 |
Retrieval | C-MTEB/CmedqaRetrieval | map_at_1000 | 42.787 |
Retrieval | C-MTEB/CmedqaRetrieval | map_at_3 | 36.362 |
Retrieval | C-MTEB/CmedqaRetrieval | map_at_5 | 38.839 |
Retrieval | C-MTEB/CmedqaRetrieval | mrr_at_1 | 41.335 |
Retrieval | C-MTEB/CmedqaRetrieval | mrr_at_10 | 49.867 |
Retrieval | C-MTEB/CmedqaRetrieval | mrr_at_100 | 50.812999999999995 |
Retrieval | C-MTEB/CmedqaRetrieval | mrr_at_1000 | 50.848000000000006 |
Retrieval | C-MTEB/CmedqaRetrieval | mrr_at_3 | 47.354 |
Retrieval | C-MTEB/CmedqaRetrieval | mrr_at_5 | 48.718 |
Retrieval | C-MTEB/CmedqaRetrieval | ndcg_at_1 | 41.335 |
Retrieval | C-MTEB/CmedqaRetrieval | ndcg_at_10 | 47.642 |
Retrieval | C-MTEB/CmedqaRetrieval | ndcg_at_100 | 54.855 |
Retrieval | C-MTEB/CmedqaRetrieval | ndcg_at_1000 | 56.449000000000005 |
Retrieval | C-MTEB/CmedqaRetrieval | ndcg_at_3 | 42.203 |
Retrieval | C-MTEB/CmedqaRetrieval | ndcg_at_5 | 44.416 |
Retrieval | C-MTEB/CmedqaRetrieval | precision_at_1 | 41.335 |
Retrieval | C-MTEB/CmedqaRetrieval | precision_at_10 | 10.568 |
Retrieval | C-MTEB/CmedqaRetrieval | precision_at_100 | 1.6400000000000001 |
Retrieval | C-MTEB/CmedqaRetrieval | precision_at_1000 | 0.184 |
Retrieval | C-MTEB/CmedqaRetrieval | precision_at_3 | 23.998 |
Retrieval | C-MTEB/CmedqaRetrieval | precision_at_5 | 17.389 |
Retrieval | C-MTEB/CmedqaRetrieval | recall_at_1 | 27.236 |
Retrieval | C-MTEB/CmedqaRetrieval | recall_at_10 | 58.80800000000001 |
Retrieval | C-MTEB/CmedqaRetrieval | recall_at_100 | 88.411 |
Retrieval | C-MTEB/CmedqaRetrieval | recall_at_1000 | 99.032 |
Retrieval | C-MTEB/CmedqaRetrieval | recall_at_3 | 42.253 |
Retrieval | C-MTEB/CmedqaRetrieval | recall_at_5 | 49.118 |
PairClassification | C-MTEB/CMNLI | cos_sim_accuracy | 86.03728202044498 |
PairClassification | C-MTEB/CMNLI | cos_sim_ap | 92.49469583272597 |
PairClassification | C-MTEB/CMNLI | cos_sim_f1 | 86.74095974528088 |
PairClassification | C-MTEB/CMNLI | cos_sim_precision | 84.43657294664601 |
PairClassification | C-MTEB/CMNLI | cos_sim_recall | 89.17465513210195 |
PairClassification | C-MTEB/CMNLI | dot_accuracy | 72.21888153938664 |
PairClassification | C-MTEB/CMNLI | dot_ap | 80.59377163340332 |
PairClassification | C-MTEB/CMNLI | dot_f1 | 74.96686040583258 |
PairClassification | C-MTEB/CMNLI | dot_precision | 66.4737793851718 |
PairClassification | C-MTEB/CMNLI | dot_recall | 85.94809445873275 |
PairClassification | C-MTEB/CMNLI | euclidean_accuracy | 85.47203848466627 |
PairClassification | C-MTEB/CMNLI | euclidean_ap | 91.89152584749868 |
PairClassification | C-MTEB/CMNLI | euclidean_f1 | 86.38105975197294 |
PairClassification | C-MTEB/CMNLI | euclidean_precision | 83.40953625081646 |
PairClassification | C-MTEB/CMNLI | euclidean_recall | 89.5721299976619 |
PairClassification | C-MTEB/CMNLI | manhattan_accuracy | 85.3758268190018 |
PairClassification | C-MTEB/CMNLI | manhattan_ap | 91.88989707722311 |
PairClassification | C-MTEB/CMNLI | manhattan_f1 | 86.39767519839052 |
PairClassification | C-MTEB/CMNLI | manhattan_precision | 82.76231263383298 |
PairClassification | C-MTEB/CMNLI | manhattan_recall | 90.36707972878185 |
PairClassification | C-MTEB/CMNLI | max_accuracy | 86.03728202044498 |
PairClassification | C-MTEB/CMNLI | max_ap | 92.49469583272597 |
PairClassification | C-MTEB/CMNLI | max_f1 | 86.74095974528088 |
Retrieval | C-MTEB/CovidRetrieval | map_at_1 | 74.34100000000001 |
Retrieval | C-MTEB/CovidRetrieval | map_at_10 | 82.49499999999999 |
Retrieval | C-MTEB/CovidRetrieval | map_at_100 | 82.64200000000001 |
Retrieval | C-MTEB/CovidRetrieval | map_at_1000 | 82.643 |
Retrieval | C-MTEB/CovidRetrieval | map_at_3 | 81.142 |
Retrieval | C-MTEB/CovidRetrieval | map_at_5 | 81.95400000000001 |
Retrieval | C-MTEB/CovidRetrieval | mrr_at_1 | 74.71 |
Retrieval | C-MTEB/CovidRetrieval | mrr_at_10 | 82.553 |
Retrieval | C-MTEB/CovidRetrieval | mrr_at_100 | 82.699 |
Retrieval | C-MTEB/CovidRetrieval | mrr_at_1000 | 82.70100000000001 |
Retrieval | C-MTEB/CovidRetrieval | mrr_at_3 | 81.279 |
Retrieval | C-MTEB/CovidRetrieval | mrr_at_5 | 82.069 |
Retrieval | C-MTEB/CovidRetrieval | ndcg_at_1 | 74.605 |
Retrieval | C-MTEB/CovidRetrieval | ndcg_at_10 | 85.946 |
Retrieval | C-MTEB/CovidRetrieval | ndcg_at_100 | 86.607 |
Retrieval | C-MTEB/CovidRetrieval | ndcg_at_1000 | 86.669 |
Retrieval | C-MTEB/CovidRetrieval | ndcg_at_3 | 83.263 |
Retrieval | C-MTEB/CovidRetrieval | ndcg_at_5 | 84.71600000000001 |
Retrieval | C-MTEB/CovidRetrieval | precision_at_1 | 74.605 |
Retrieval | C-MTEB/CovidRetrieval | precision_at_10 | 9.758 |
Retrieval | C-MTEB/CovidRetrieval | precision_at_100 | 1.005 |
Retrieval | C-MTEB/CovidRetrieval | precision_at_1000 | 0.101 |
Retrieval | C-MTEB/CovidRetrieval | precision_at_3 | 29.996000000000002 |
Retrieval | C-MTEB/CovidRetrieval | precision_at_5 | 18.736 |
Retrieval | C-MTEB/CovidRetrieval | recall_at_1 | 74.34100000000001 |
Retrieval | C-MTEB/CovidRetrieval | recall_at_10 | 96.523 |
Retrieval | C-MTEB/CovidRetrieval | recall_at_100 | 99.473 |
Retrieval | C-MTEB/CovidRetrieval | recall_at_1000 | 100.0 |
Retrieval | C-MTEB/CovidRetrieval | recall_at_3 | 89.278 |
Retrieval | C-MTEB/CovidRetrieval | recall_at_5 | 92.83500000000001 |
Retrieval | C-MTEB/DuRetrieval | map_at_1 | 26.950000000000003 |
Retrieval | C-MTEB/DuRetrieval | map_at_10 | 82.408 |
Retrieval | C-MTEB/DuRetrieval | map_at_100 | 85.057 |
Retrieval | C-MTEB/DuRetrieval | map_at_1000 | 85.09100000000001 |
Retrieval | C-MTEB/DuRetrieval | map_at_3 | 57.635999999999996 |
Retrieval | C-MTEB/DuRetrieval | map_at_5 | 72.48 |
Retrieval | C-MTEB/DuRetrieval | mrr_at_1 | 92.15 |
Retrieval | C-MTEB/DuRetrieval | mrr_at_10 | 94.554 |
Retrieval | C-MTEB/DuRetrieval | mrr_at_100 | 94.608 |
Retrieval | C-MTEB/DuRetrieval | mrr_at_1000 | 94.61 |
Retrieval | C-MTEB/DuRetrieval | mrr_at_3 | 94.292 |
Retrieval | C-MTEB/DuRetrieval | mrr_at_5 | 94.459 |
Retrieval | C-MTEB/DuRetrieval | ndcg_at_1 | 92.15 |
Retrieval | C-MTEB/DuRetrieval | ndcg_at_10 | 89.108 |
Retrieval | C-MTEB/DuRetrieval | ndcg_at_100 | 91.525 |
Retrieval | C-MTEB/DuRetrieval | ndcg_at_1000 | 91.82900000000001 |
Retrieval | C-MTEB/DuRetrieval | ndcg_at_3 | 88.44 |
Retrieval | C-MTEB/DuRetrieval | ndcg_at_5 | 87.271 |
Retrieval | C-MTEB/DuRetrieval | precision_at_1 | 92.15 |
Retrieval | C-MTEB/DuRetrieval | precision_at_10 | 42.29 |
Retrieval | C-MTEB/DuRetrieval | precision_at_100 | 4.812 |
Retrieval | C-MTEB/DuRetrieval | precision_at_1000 | 0.48900000000000005 |
Retrieval | C-MTEB/DuRetrieval | precision_at_3 | 79.14999999999999 |
Retrieval | C-MTEB/DuRetrieval | precision_at_5 | 66.64 |
Retrieval | C-MTEB/DuRetrieval | recall_at_1 | 26.950000000000003 |
Retrieval | C-MTEB/DuRetrieval | recall_at_10 | 89.832 |
Retrieval | C-MTEB/DuRetrieval | recall_at_100 | 97.921 |
Retrieval | C-MTEB/DuRetrieval | recall_at_1000 | 99.471 |
Retrieval | C-MTEB/DuRetrieval | recall_at_3 | 59.562000000000005 |
Retrieval | C-MTEB/DuRetrieval | recall_at_5 | 76.533 |
Retrieval | C-MTEB/EcomRetrieval | map_at_1 | 53.5 |
Retrieval | C-MTEB/EcomRetrieval | map_at_10 | 63.105999999999995 |
Retrieval | C-MTEB/EcomRetrieval | map_at_100 | 63.63100000000001 |
Retrieval | C-MTEB/EcomRetrieval | map_at_1000 | 63.641999999999996 |
Retrieval | C-MTEB/EcomRetrieval | map_at_3 | 60.617 |
Retrieval | C-MTEB/EcomRetrieval | map_at_5 | 62.132 |
Retrieval | C-MTEB/EcomRetrieval | mrr_at_1 | 53.5 |
Retrieval | C-MTEB/EcomRetrieval | mrr_at_10 | 63.105999999999995 |
Retrieval | C-MTEB/EcomRetrieval | mrr_at_100 | 63.63100000000001 |
Retrieval | C-MTEB/EcomRetrieval | mrr_at_1000 | 63.641999999999996 |
Retrieval | C-MTEB/EcomRetrieval | mrr_at_3 | 60.617 |
Retrieval | C-MTEB/EcomRetrieval | mrr_at_5 | 62.132 |
Retrieval | C-MTEB/EcomRetrieval | ndcg_at_1 | 53.5 |
Retrieval | C-MTEB/EcomRetrieval | ndcg_at_10 | 67.92200000000001 |
Retrieval | C-MTEB/EcomRetrieval | ndcg_at_100 | 70.486 |
Retrieval | C-MTEB/EcomRetrieval | ndcg_at_1000 | 70.777 |
Retrieval | C-MTEB/EcomRetrieval | ndcg_at_3 | 62.853 |
Retrieval | C-MTEB/EcomRetrieval | ndcg_at_5 | 65.59899999999999 |
Retrieval | C-MTEB/EcomRetrieval | precision_at_1 | 53.5 |
Retrieval | C-MTEB/EcomRetrieval | precision_at_10 | 8.309999999999999 |
Retrieval | C-MTEB/EcomRetrieval | precision_at_100 | 0.951 |
Retrieval | C-MTEB/EcomRetrieval | precision_at_1000 | 0.097 |
Retrieval | C-MTEB/EcomRetrieval | precision_at_3 | 23.1 |
Retrieval | C-MTEB/EcomRetrieval | precision_at_5 | 15.2 |
Retrieval | C-MTEB/EcomRetrieval | recall_at_1 | 53.5 |
Retrieval | C-MTEB/EcomRetrieval | recall_at_10 | 83.1 |
Retrieval | C-MTEB/EcomRetrieval | recall_at_100 | 95.1 |
Retrieval | C-MTEB/EcomRetrieval | recall_at_1000 | 97.39999999999999 |
Retrieval | C-MTEB/EcomRetrieval | recall_at_3 | 69.3 |
Retrieval | C-MTEB/EcomRetrieval | recall_at_5 | 76.0 |
Classification | C-MTEB/IFlyTek-classification | accuracy | 51.773759138130046 |
Classification | C-MTEB/IFlyTek-classification | f1 | 40.38600802756481 |
Classification | C-MTEB/JDReview-classification | accuracy | 88.48030018761726 |
Classification | C-MTEB/JDReview-classification | ap | 59.2732541555627 |
Classification | C-MTEB/JDReview-classification | f1 | 83.58836007358619 |
STS | C-MTEB/LCQMC | cos_sim_pearson | 73.67511194245922 |
STS | C-MTEB/LCQMC | cos_sim_spearman | 79.43347759067298 |
STS | C-MTEB/LCQMC | euclidean_pearson | 79.04491504318766 |
STS | C-MTEB/LCQMC | euclidean_spearman | 79.14478545356785 |
STS | C-MTEB/LCQMC | manhattan_pearson | 79.03365022867428 |
STS | C-MTEB/LCQMC | manhattan_spearman | 79.13172717619908 |
Retrieval | C-MTEB/MMarcoRetrieval | map_at_1 | 67.184 |
Retrieval | C-MTEB/MMarcoRetrieval | map_at_10 | 76.24600000000001 |
Retrieval | C-MTEB/MMarcoRetrieval | map_at_100 | 76.563 |
Retrieval | C-MTEB/MMarcoRetrieval | map_at_1000 | 76.575 |
Retrieval | C-MTEB/MMarcoRetrieval | map_at_3 | 74.522 |
Retrieval | C-MTEB/MMarcoRetrieval | map_at_5 | 75.598 |
Retrieval | C-MTEB/MMarcoRetrieval | mrr_at_1 | 69.47 |
Retrieval | C-MTEB/MMarcoRetrieval | mrr_at_10 | 76.8 |
Retrieval | C-MTEB/MMarcoRetrieval | mrr_at_100 | 77.082 |
Retrieval | C-MTEB/MMarcoRetrieval | mrr_at_1000 | 77.093 |
Retrieval | C-MTEB/MMarcoRetrieval | mrr_at_3 | 75.29400000000001 |
Retrieval | C-MTEB/MMarcoRetrieval | mrr_at_5 | 76.24 |
Retrieval | C-MTEB/MMarcoRetrieval | ndcg_at_1 | 69.47 |
Retrieval | C-MTEB/MMarcoRetrieval | ndcg_at_10 | 79.81099999999999 |
Retrieval | C-MTEB/MMarcoRetrieval | ndcg_at_100 | 81.187 |
Retrieval | C-MTEB/MMarcoRetrieval | ndcg_at_1000 | 81.492 |
Retrieval | C-MTEB/MMarcoRetrieval | ndcg_at_3 | 76.536 |
Retrieval | C-MTEB/MMarcoRetrieval | ndcg_at_5 | 78.367 |
Retrieval | C-MTEB/MMarcoRetrieval | precision_at_1 | 69.47 |
Retrieval | C-MTEB/MMarcoRetrieval | precision_at_10 | 9.599 |
Retrieval | C-MTEB/MMarcoRetrieval | precision_at_100 | 1.026 |
Retrieval | C-MTEB/MMarcoRetrieval | precision_at_1000 | 0.105 |
Retrieval | C-MTEB/MMarcoRetrieval | precision_at_3 | 28.777 |
Retrieval | C-MTEB/MMarcoRetrieval | precision_at_5 | 18.232 |
Retrieval | C-MTEB/MMarcoRetrieval | recall_at_1 | 67.184 |
Retrieval | C-MTEB/MMarcoRetrieval | recall_at_10 | 90.211 |
Retrieval | C-MTEB/MMarcoRetrieval | recall_at_100 | 96.322 |
Retrieval | C-MTEB/MMarcoRetrieval | recall_at_1000 | 98.699 |
Retrieval | C-MTEB/MMarcoRetrieval | recall_at_3 | 81.556 |
Retrieval | C-MTEB/MMarcoRetrieval | recall_at_5 | 85.931 |
Classification | mteb/amazon_massive_intent | accuracy | 76.96032279757901 |
Classification | mteb/amazon_massive_intent | f1 | 73.48052314033545 |
Classification | mteb/amazon_massive_scenario | accuracy | 84.64357767316744 |
Classification | mteb/amazon_massive_scenario | f1 | 83.58250539497922 |
Retrieval | C-MTEB/MedicalRetrieval | map_at_1 | 56.00000000000001 |
Retrieval | C-MTEB/MedicalRetrieval | map_at_10 | 62.066 |
Retrieval | C-MTEB/MedicalRetrieval | map_at_100 | 62.553000000000004 |
Retrieval | C-MTEB/MedicalRetrieval | map_at_1000 | 62.598 |
Retrieval | C-MTEB/MedicalRetrieval | map_at_3 | 60.4 |
Retrieval | C-MTEB/MedicalRetrieval | map_at_5 | 61.370000000000005 |
Retrieval | C-MTEB/MedicalRetrieval | mrr_at_1 | 56.2 |
Retrieval | C-MTEB/MedicalRetrieval | mrr_at_10 | 62.166 |
Retrieval | C-MTEB/MedicalRetrieval | mrr_at_100 | 62.653000000000006 |
Retrieval | C-MTEB/MedicalRetrieval | mrr_at_1000 | 62.699000000000005 |
Retrieval | C-MTEB/MedicalRetrieval | mrr_at_3 | 60.5 |
Retrieval | C-MTEB/MedicalRetrieval | mrr_at_5 | 61.47 |
Retrieval | C-MTEB/MedicalRetrieval | ndcg_at_1 | 56.00000000000001 |
Retrieval | C-MTEB/MedicalRetrieval | ndcg_at_10 | 65.199 |
Retrieval | C-MTEB/MedicalRetrieval | ndcg_at_100 | 67.79899999999999 |
Retrieval | C-MTEB/MedicalRetrieval | ndcg_at_1000 | 69.056 |
Retrieval | C-MTEB/MedicalRetrieval | ndcg_at_3 | 61.814 |
Retrieval | C-MTEB/MedicalRetrieval | ndcg_at_5 | 63.553000000000004 |
Retrieval | C-MTEB/MedicalRetrieval | precision_at_1 | 56.00000000000001 |
Retrieval | C-MTEB/MedicalRetrieval | precision_at_10 | 7.51 |
Retrieval | C-MTEB/MedicalRetrieval | precision_at_100 | 0.878 |
Retrieval | C-MTEB/MedicalRetrieval | precision_at_1000 | 0.098 |
Retrieval | C-MTEB/MedicalRetrieval | precision_at_3 | 21.967 |
Retrieval | C-MTEB/MedicalRetrieval | precision_at_5 | 14.02 |
Retrieval | C-MTEB/MedicalRetrieval | recall_at_1 | 56.00000000000001 |
Retrieval | C-MTEB/MedicalRetrieval | recall_at_10 | 75.1 |
Retrieval | C-MTEB/MedicalRetrieval | recall_at_100 | 87.8 |
Retrieval | C-MTEB/MedicalRetrieval | recall_at_1000 | 97.7 |
Retrieval | C-MTEB/MedicalRetrieval | recall_at_3 | 65.9 |
Retrieval | C-MTEB/MedicalRetrieval | recall_at_5 | 70.1 |
Reranking | C-MTEB/Mmarco-reranking | map | 32.74158258279793 |
Reranking | C-MTEB/Mmarco-reranking | mrr | 31.56071428571428 |
Classification | C-MTEB/MultilingualSentiment-classification | accuracy | 78.96666666666667 |
Classification | C-MTEB/MultilingualSentiment-classification | f1 | 78.82528563818045 |
PairClassification | C-MTEB/OCNLI | cos_sim_accuracy | 83.54087709799674 |
PairClassification | C-MTEB/OCNLI | cos_sim_ap | 87.26170197077586 |
PairClassification | C-MTEB/OCNLI | cos_sim_f1 | 84.7609561752988 |
PairClassification | C-MTEB/OCNLI | cos_sim_precision | 80.20735155513667 |
PairClassification | C-MTEB/OCNLI | cos_sim_recall | 89.86272439281943 |
PairClassification | C-MTEB/OCNLI | dot_accuracy | 72.22523010286952 |
PairClassification | C-MTEB/OCNLI | dot_ap | 79.51975358187732 |
PairClassification | C-MTEB/OCNLI | dot_f1 | 76.32183908045977 |
PairClassification | C-MTEB/OCNLI | dot_precision | 67.58957654723126 |
PairClassification | C-MTEB/OCNLI | dot_recall | 87.64519535374869 |
PairClassification | C-MTEB/OCNLI | euclidean_accuracy | 82.0249052517596 |
PairClassification | C-MTEB/OCNLI | euclidean_ap | 85.32829948726406 |
PairClassification | C-MTEB/OCNLI | euclidean_f1 | 83.24924318869829 |
PairClassification | C-MTEB/OCNLI | euclidean_precision | 79.71014492753623 |
PairClassification | C-MTEB/OCNLI | euclidean_recall | 87.11721224920802 |
PairClassification | C-MTEB/OCNLI | manhattan_accuracy | 82.13318895506227 |
PairClassification | C-MTEB/OCNLI | manhattan_ap | 85.28856869288006 |
PairClassification | C-MTEB/OCNLI | manhattan_f1 | 83.34946757018393 |
PairClassification | C-MTEB/OCNLI | manhattan_precision | 76.94369973190348 |
PairClassification | C-MTEB/OCNLI | manhattan_recall | 90.91869060190075 |
PairClassification | C-MTEB/OCNLI | max_accuracy | 83.54087709799674 |
PairClassification | C-MTEB/OCNLI | max_ap | 87.26170197077586 |
PairClassification | C-MTEB/OCNLI | max_f1 | 84.7609561752988 |
Classification | C-MTEB/OnlineShopping-classification | accuracy | 94.56 |
Classification | C-MTEB/OnlineShopping-classification | ap | 92.80848436710805 |
Classification | C-MTEB/OnlineShopping-classification | f1 | 94.54951966576111 |
STS | C-MTEB/PAWSX | cos_sim_pearson | 39.0866558287863 |
STS | C-MTEB/PAWSX | cos_sim_spearman | 45.9211126233312 |
STS | C-MTEB/PAWSX | euclidean_pearson | 44.86568743222145 |
STS | C-MTEB/PAWSX | euclidean_spearman | 45.63882757207507 |
STS | C-MTEB/PAWSX | manhattan_pearson | 44.89480036909126 |
STS | C-MTEB/PAWSX | manhattan_spearman | 45.65929449046206 |
STS | C-MTEB/QBQTC | cos_sim_pearson | 43.04701793979569 |
STS | C-MTEB/QBQTC | cos_sim_spearman | 44.87491033760315 |
STS | C-MTEB/QBQTC | euclidean_pearson | 36.2004061032567 |
STS | C-MTEB/QBQTC | euclidean_spearman | 41.44823909683865 |
STS | C-MTEB/QBQTC | manhattan_pearson | 36.136113427955095 |
STS | C-MTEB/QBQTC | manhattan_spearman | 41.39225495993949 |
STS | mteb/sts22-crosslingual-sts | cos_sim_pearson | 61.65611315777857 |
STS | mteb/sts22-crosslingual-sts | cos_sim_spearman | 64.4067673105648 |
STS | mteb/sts22-crosslingual-sts | euclidean_pearson | 61.814977248797184 |
STS | mteb/sts22-crosslingual-sts | euclidean_spearman | 63.99473350700169 |
STS | mteb/sts22-crosslingual-sts | manhattan_pearson | 61.684304629588624 |
STS | mteb/sts22-crosslingual-sts | manhattan_spearman | 63.97831213239316 |
STS | C-MTEB/STSB | cos_sim_pearson | 76.57324933064379 |
STS | C-MTEB/STSB | cos_sim_spearman | 79.23602286949782 |
STS | C-MTEB/STSB | euclidean_pearson | 80.28226284310948 |
STS | C-MTEB/STSB | euclidean_spearman | 80.32210477608423 |
STS | C-MTEB/STSB | manhattan_pearson | 80.27262188617811 |
STS | C-MTEB/STSB | manhattan_spearman | 80.31619185039723 |
Reranking | C-MTEB/T2Reranking | map | 67.05266891356277 |
Reranking | C-MTEB/T2Reranking | mrr | 77.1906333623497 |
Retrieval | C-MTEB/T2Retrieval | map_at_1 | 28.212 |
Retrieval | C-MTEB/T2Retrieval | map_at_10 | 78.932 |
Retrieval | C-MTEB/T2Retrieval | map_at_100 | 82.51899999999999 |
Retrieval | C-MTEB/T2Retrieval | map_at_1000 | 82.575 |
Retrieval | C-MTEB/T2Retrieval | map_at_3 | 55.614 |
Retrieval | C-MTEB/T2Retrieval | map_at_5 | 68.304 |
Retrieval | C-MTEB/T2Retrieval | mrr_at_1 | 91.211 |
Retrieval | C-MTEB/T2Retrieval | mrr_at_10 | 93.589 |
Retrieval | C-MTEB/T2Retrieval | mrr_at_100 | 93.659 |
Retrieval | C-MTEB/T2Retrieval | mrr_at_1000 | 93.662 |
Retrieval | C-MTEB/T2Retrieval | mrr_at_3 | 93.218 |
Retrieval | C-MTEB/T2Retrieval | mrr_at_5 | 93.453 |
Retrieval | C-MTEB/T2Retrieval | ndcg_at_1 | 91.211 |
Retrieval | C-MTEB/T2Retrieval | ndcg_at_10 | 86.24000000000001 |
Retrieval | C-MTEB/T2Retrieval | ndcg_at_100 | 89.614 |
Retrieval | C-MTEB/T2Retrieval | ndcg_at_1000 | 90.14 |
Retrieval | C-MTEB/T2Retrieval | ndcg_at_3 | 87.589 |
Retrieval | C-MTEB/T2Retrieval | ndcg_at_5 | 86.265 |
Retrieval | C-MTEB/T2Retrieval | precision_at_1 | 91.211 |
Retrieval | C-MTEB/T2Retrieval | precision_at_10 | 42.626 |
Retrieval | C-MTEB/T2Retrieval | precision_at_100 | 5.043 |
Retrieval | C-MTEB/T2Retrieval | precision_at_1000 | 0.517 |
Retrieval | C-MTEB/T2Retrieval | precision_at_3 | 76.42 |
Retrieval | C-MTEB/T2Retrieval | precision_at_5 | 64.045 |
Retrieval | C-MTEB/T2Retrieval | recall_at_1 | 28.212 |
Retrieval | C-MTEB/T2Retrieval | recall_at_10 | 85.223 |
Retrieval | C-MTEB/T2Retrieval | recall_at_100 | 96.229 |
Retrieval | C-MTEB/T2Retrieval | recall_at_1000 | 98.849 |
Retrieval | C-MTEB/T2Retrieval | recall_at_3 | 57.30800000000001 |
Retrieval | C-MTEB/T2Retrieval | recall_at_5 | 71.661 |
Classification | C-MTEB/TNews-classification | accuracy | 54.385000000000005 |
Classification | C-MTEB/TNews-classification | f1 | 52.38762400903556 |
Clustering | C-MTEB/ThuNewsClusteringP2P | v_measure | 74.55283855942916 |
Clustering | C-MTEB/ThuNewsClusteringS2S | v_measure | 68.55115316700493 |
Retrieval | C-MTEB/VideoRetrieval | map_at_1 | 58.8 |
Retrieval | C-MTEB/VideoRetrieval | map_at_10 | 69.035 |
Retrieval | C-MTEB/VideoRetrieval | map_at_100 | 69.52000000000001 |
Retrieval | C-MTEB/VideoRetrieval | map_at_1000 | 69.529 |
Retrieval | C-MTEB/VideoRetrieval | map_at_3 | 67.417 |
Retrieval | C-MTEB/VideoRetrieval | map_at_5 | 68.407 |
Retrieval | C-MTEB/VideoRetrieval | mrr_at_1 | 58.8 |
Retrieval | C-MTEB/VideoRetrieval | mrr_at_10 | 69.035 |
Retrieval | C-MTEB/VideoRetrieval | mrr_at_100 | 69.52000000000001 |
Retrieval | C-MTEB/VideoRetrieval | mrr_at_1000 | 69.529 |
Retrieval | C-MTEB/VideoRetrieval | mrr_at_3 | 67.417 |
Retrieval | C-MTEB/VideoRetrieval | mrr_at_5 | 68.407 |
Retrieval | C-MTEB/VideoRetrieval | ndcg_at_1 | 58.8 |
Retrieval | C-MTEB/VideoRetrieval | ndcg_at_10 | 73.395 |
Retrieval | C-MTEB/VideoRetrieval | ndcg_at_100 | 75.62 |
Retrieval | C-MTEB/VideoRetrieval | ndcg_at_1000 | 75.90299999999999 |
Retrieval | C-MTEB/VideoRetrieval | ndcg_at_3 | 70.11800000000001 |
Retrieval | C-MTEB/VideoRetrieval | ndcg_at_5 | 71.87400000000001 |
Retrieval | C-MTEB/VideoRetrieval | precision_at_1 | 58.8 |
Retrieval | C-MTEB/VideoRetrieval | precision_at_10 | 8.68 |
Retrieval | C-MTEB/VideoRetrieval | precision_at_100 | 0.9690000000000001 |
Retrieval | C-MTEB/VideoRetrieval | precision_at_1000 | 0.099 |
Retrieval | C-MTEB/VideoRetrieval | precision_at_3 | 25.967000000000002 |
Retrieval | C-MTEB/VideoRetrieval | precision_at_5 | 16.42 |
Retrieval | C-MTEB/VideoRetrieval | recall_at_1 | 58.8 |
Retrieval | C-MTEB/VideoRetrieval | recall_at_10 | 86.8 |
Retrieval | C-MTEB/VideoRetrieval | recall_at_100 | 96.89999999999999 |
Retrieval | C-MTEB/VideoRetrieval | recall_at_1000 | 99.2 |
Retrieval | C-MTEB/VideoRetrieval | recall_at_3 | 77.9 |
Retrieval | C-MTEB/VideoRetrieval | recall_at_5 | 82.1 |
Classification | C-MTEB/waimai-classification | accuracy | 89.42 |
Classification | C-MTEB/waimai-classification | ap | 75.35978503182068 |
Classification | C-MTEB/waimai-classification | f1 | 88.01006394348263 |
📄 License
This model is released under the MIT license.