🚀 mmlw-roberta-large
This is a sentence-transformers model for sentence similarity and feature extraction. It has been evaluated on multiple Polish datasets from the MTEB benchmark, covering clustering, classification, pair classification, semantic textual similarity (STS), and retrieval tasks.
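Feature extraction here means encoding each sentence into a dense vector; similarity between sentences is then a vector comparison, most commonly cosine similarity. A minimal, library-free sketch of that final scoring step (the toy vectors below are illustrative stand-ins, not real model output):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional "embeddings" standing in for real model output.
emb_query = [0.1, 0.3, 0.5, 0.1]
emb_doc = [0.2, 0.25, 0.45, 0.15]
print(cosine_similarity(emb_query, emb_doc))
```

In practice the embeddings would come from encoding sentences with the model through the sentence-transformers library; the scoring step stays the same.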
📚 Documentation
Model Information
| Property | Details |
|----------|---------|
| Model Type | mmlw-roberta-large |
| Pipeline Tag | sentence-similarity |
| Tags | sentence-transformers, feature-extraction, sentence-similarity, transformers, mteb |
Performance Metrics
1. Clustering Task
- Dataset: PL-MTEB/8tags-clustering (MTEB 8TagsClustering, test split)
| Metric | Value |
|--------|-------|
| v_measure | 31.165 |
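v_measure is the harmonic mean of homogeneity and completeness, both defined from conditional entropies between the true categories and the predicted clusters (Rosenberg and Hirschberg's V-measure, as implemented by scikit-learn's `v_measure_score`). A small self-contained sketch of the definition:

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def conditional_entropy(labels, given):
    # H(labels | given): entropy of labels within each group of `given`.
    n = len(labels)
    return sum(
        (len(sub) / n) * entropy(sub)
        for g in set(given)
        for sub in [[l for l, gv in zip(labels, given) if gv == g]]
    )

def v_measure(true, pred):
    h_c, h_k = entropy(true), entropy(pred)
    homogeneity = 1.0 if h_c == 0 else 1 - conditional_entropy(true, pred) / h_c
    completeness = 1.0 if h_k == 0 else 1 - conditional_entropy(pred, true) / h_k
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)
```

A perfect clustering scores 1.0 even if the cluster ids are permuted, since only the grouping matters.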
2. Classification Tasks
AllegroReviews
- Dataset: PL-MTEB/allegro-reviews (MTEB AllegroReviews, test split)
| Metric | Value |
|--------|-------|
| accuracy | 47.485 |
| f1 | 42.333 |
CBD
- Dataset: PL-MTEB/cbd (MTEB CBD, test split)
| Metric | Value |
|--------|-------|
| accuracy | 69.33 |
| ap | 22.972 |
| f1 | 58.911 |
MassiveIntentClassification (pl)
- Dataset: mteb/amazon_massive_intent (MTEB MassiveIntentClassification (pl), test split)
| Metric | Value |
|--------|-------|
| accuracy | 74.812 |
| f1 | 72.026 |
MassiveScenarioClassification (pl)
- Dataset: mteb/amazon_massive_scenario (MTEB MassiveScenarioClassification (pl), test split)
| Metric | Value |
|--------|-------|
| accuracy | 77.845 |
| f1 | 77.734 |
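The accuracy and f1 figures above come from standard multi-class evaluation. The averaging scheme behind each task's f1 is not stated here, so the sketch below assumes macro averaging (the unweighted mean of per-class F1 scores):

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (macro averaging assumed)."""
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)
```

Macro averaging weights every class equally, which is why f1 can diverge from accuracy on imbalanced datasets such as AllegroReviews.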
3. Pair Classification Task
- Dataset: PL-MTEB/cdsce-pairclassification (MTEB CDSC-E, test split)
| Metric | Value |
|--------|-------|
| cos_sim_accuracy | 89.8 |
| cos_sim_ap | 79.870 |
| cos_sim_f1 | 68.539 |
| cos_sim_precision | 73.494 |
| cos_sim_recall | 64.211 |
| dot_accuracy | 86.1 |
| dot_ap | 63.685 |
| dot_f1 | 63.617 |
| dot_precision | 52.577 |
| dot_recall | 80.526 |
| euclidean_accuracy | 89.8 |
| euclidean_ap | 79.753 |
| euclidean_f1 | 68.464 |
| euclidean_precision | 70.166 |
| euclidean_recall | 66.842 |
| manhattan_accuracy | 89.7 |
| manhattan_ap | 79.646 |
| manhattan_f1 | 68.493 |
| manhattan_precision | 71.429 |
| manhattan_recall | 65.789 |
| max_accuracy | 89.8 |
| max_ap | 79.870 |
| max_f1 | 68.539 |
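The table scores the same sentence pairs four ways: each score family (cosine similarity, dot product, Euclidean distance, Manhattan distance) is computed on the pair's embeddings and thresholded into a binary decision, and the max_* rows report the best-performing family. A sketch of the four comparisons:

```python
import math

def dot_score(a, b):
    return sum(x * y for x, y in zip(a, b))

def cos_sim(a, b):
    return dot_score(a, b) / (math.sqrt(dot_score(a, a)) * math.sqrt(dot_score(b, b)))

def euclidean_dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_dist(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))
```

Cosine is length-invariant while the dot product is not, which is one reason the cos_sim and dot rows can differ so sharply for the same embeddings.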
4. STS Task
- Dataset: PL-MTEB/cdscr-sts (MTEB CDSC-R, test split)
| Metric | Value |
|--------|-------|
| cos_sim_pearson | 92.109 |
| cos_sim_spearman | 92.541 |
| euclidean_pearson | 91.990 |
| euclidean_spearman | 92.552 |
| manhattan_pearson | 91.922 |
| manhattan_spearman | 92.478 |
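The STS scores are Pearson and Spearman correlations between the model's similarity scores and human relatedness judgments; Spearman is simply Pearson computed on ranks. A self-contained sketch (using average ranks for ties):

```python
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(x):
    # 1-based ranks; tied values share the average of their positions.
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    return pearson(ranks(x), ranks(y))
```

Spearman only cares about monotonic agreement, so it is the more robust of the two when the score scale is nonlinear.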
5. Retrieval Tasks
ArguAna-PL
- Dataset: arguana-pl (MTEB ArguAna-PL, test split)
| Metric | Value |
|--------|-------|
| map_at_1 | 38.834 |
| map_at_10 | 55.229 |
| map_at_100 | 55.792 |
| map_at_1000 | 55.794 |
| map_at_3 | 51.233 |
| map_at_5 | 53.772 |
| mrr_at_1 | 39.687 |
| mrr_at_10 | 55.596 |
| mrr_at_100 | 56.157 |
| mrr_at_1000 | 56.158 |
| mrr_at_3 | 51.66 |
| mrr_at_5 | 54.135 |
| ndcg_at_1 | 38.834 |
| ndcg_at_10 | 63.402 |
| ndcg_at_100 | 65.78 |
| ndcg_at_1000 | 65.816 |
| ndcg_at_3 | 55.349 |
| ndcg_at_5 | 59.892 |
| precision_at_1 | 38.834 |
| precision_at_10 | 8.905 |
| precision_at_100 | 0.994 |
| precision_at_1000 | 0.1 |
| precision_at_3 | 22.428 |
| precision_at_5 | 15.647 |
| recall_at_1 | 38.834 |
| recall_at_10 | 89.047 |
| recall_at_100 | 99.36 |
| recall_at_1000 | 99.644 |
| recall_at_3 | 67.283 |
| recall_at_5 | 78.236 |
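The retrieval metrics (map/mrr/ndcg/precision/recall at cutoff k) all derive from a ranked result list and the set of relevant documents for a query. A binary-relevance sketch of the simpler ones (nDCG is shown in its binary-gain form, which is typical for these datasets, though the exact variant isn't stated here):

```python
import math

def recall_at_k(retrieved, relevant, k):
    return len(set(retrieved[:k]) & relevant) / len(relevant)

def precision_at_k(retrieved, relevant, k):
    return len(set(retrieved[:k]) & relevant) / k

def mrr_at_k(retrieved, relevant, k):
    # Reciprocal rank of the first relevant hit within the cutoff.
    for rank, doc in enumerate(retrieved[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(retrieved, relevant, k):
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, doc in enumerate(retrieved[:k], start=1)
              if doc in relevant)
    ideal = sum(1.0 / math.log2(rank + 1)
                for rank in range(1, min(len(relevant), k) + 1))
    return dcg / ideal
```

The reported figures average these per-query values over the whole test split; map additionally averages precision over every relevant hit's rank.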
DBPedia-PL
- Dataset: dbpedia-pl (MTEB DBPedia-PL, test split)
| Metric | Value |
|--------|-------|
| map_at_1 | 8.683 |
| map_at_10 | 18.9 |
| map_at_100 | 26.933 |
| map_at_1000 | 28.558 |
| map_at_3 | 13.638 |
| map_at_5 | 15.9 |
| mrr_at_1 | 63.75 |
| mrr_at_10 | 73.566 |
| mrr_at_100 | 73.817 |
| mrr_at_1000 | 73.824 |
| mrr_at_3 | 71.875 |
| mrr_at_5 | 73.2 |
| ndcg_at_1 | 53.125 |
| ndcg_at_10 | 40.271 |
| ndcg_at_100 | 45.51 |
| ndcg_at_1000 | 52.968 |
| ndcg_at_3 | 45.122 |
| ndcg_at_5 | 42.306 |
| precision_at_1 | 63.75 |
| precision_at_10 | 31.55 |
| precision_at_100 | 10.44 |
| precision_at_1000 | 2.01 |
| precision_at_3 | 48.333 |
| precision_at_5 | 40.5 |
| recall_at_1 | 8.683 |
| recall_at_10 | 24.63 |
| recall_at_100 | 51.762 |
| recall_at_1000 | 75.65 |
| recall_at_3 | 15.136 |
| recall_at_5 | 18.678 |
FiQA-PL
- Dataset: fiqa-pl (MTEB FiQA-PL, test split)
| Metric | Value |
|--------|-------|
| map_at_1 | 19.873 |
| map_at_10 | 32.923 |
| map_at_100 | 34.819 |
| map_at_1000 | 34.99 |
| map_at_3 | 28.501 |
| map_at_5 | 31.088 |
| mrr_at_1 | 40.432 |
| mrr_at_10 | 49.243 |
| mrr_at_100 | 50.014 |
| mrr_at_1000 | 50.055 |
| mrr_at_3 | 47.145 |
| mrr_at_5 | 48.171 |
| ndcg_at_1 | 40.586 |
| ndcg_at_10 | 40.887 |
| ndcg_at_100 | 47.701 |
| ndcg_at_1000 | 50.624 |
| ndcg_at_3 | 37.143 |
| ndcg_at_5 | 38.329 |
| precision_at_1 | 40.586 |
| precision_at_10 | 11.497 |
| precision_at_100 | 1.838 |
| precision_at_1000 | 0.237 |
| precision_at_3 | 25.0 |
| precision_at_5 | 18.549 |
| recall_at_1 | 19.873 |
| recall_at_10 | 48.073 |
| recall_at_100 | 73.473 |
| recall_at_1000 | 90.94 |
| recall_at_3 | 33.645 |
| recall_at_5 | 39.711 |
HotpotQA-PL
- Dataset: hotpotqa-pl (MTEB HotpotQA-PL, test split)
| Metric | Value |
|--------|-------|
| map_at_1 | 39.399 |
| map_at_10 | 62.604 |
| map_at_100 | 63.475 |
| map_at_1000 | 63.534 |
| map_at_3 | 58.871 |
| map_at_5 | 61.217 |
| mrr_at_1 | 78.758 |
| mrr_at_10 | 84.584 |
| mrr_at_100 | 84.753 |
| mrr_at_1000 | 84.759 |
| mrr_at_3 | 83.657 |
| mrr_at_5 | 84.283 |
| ndcg_at_1 | 78.798 |
| ndcg_at_10 | 71.04 |
| ndcg_at_100 | 74.048 |
| ndcg_at_1000 | 75.163 |
| ndcg_at_3 | 65.862 |
| ndcg_at_5 | 68.776 |
| precision_at_1 | 78.798 |
| precision_at_10 | 14.949 |
| precision_at_100 | 1.731 |
| precision_at_1000 | 0.188 |
| precision_at_3 | 42.237 |
| precision_at_5 | 27.635 |
| recall_at_1 | 39.399 |
| recall_at_10 | 74.747 |
| recall_at_100 | 86.529 |
| recall_at_1000 | 93.849 |
| recall_at_3 | 63.356 |
| recall_at_5 | 69.088 |
MSMARCO-PL
- Dataset: msmarco-pl (MTEB MSMARCO-PL, validation split)
| Metric | Value |
|--------|-------|
| map_at_1 | 19.598 |
| map_at_10 | 30.454 |
| map_at_100 | 31.601 |
| map_at_1000 | 31.66 |
| map_at_3 | 27.118 |
| map_at_5 | 28.943 |
| mrr_at_1 | 20.1 |
| mrr_at_10 | 30.978 |
| mrr_at_100 | 32.057 |
| mrr_at_1000 | 32.112 |
| mrr_at_3 | 27.679 |
| mrr_at_5 | 29.493 |
| ndcg_at_1 | 20.158 |
| ndcg_at_10 | 36.63 |
| ndcg_at_100 | 42.291 |
| ndcg_at_1000 | 43.828 |
| ndcg_at_3 | 29.745 |
| ndcg_at_5 | 33.024 |
| precision_at_1 | 20.158 |
| precision_at_10 | 5.812 |
| precision_at_100 | 0.868 |
| precision_at_1000 | 0.1 |
| precision_at_3 | 12.689 |
| precision_at_5 | 9.295 |
| recall_at_1 | 19.598 |
| recall_at_10 | 55.597 |
| recall_at_100 | 82.143 |
| recall_at_1000 | 94.015 |
| recall_at_3 | 36.72 |
| recall_at_5 | 44.606 |
NFCorpus-PL
- Dataset: nfcorpus-pl (MTEB NFCorpus-PL, test split)
| Metric | Value |
|--------|-------|
| map_at_1 | 5.604 |
| map_at_10 | 12.684 |
| map_at_100 | 16.274 |
| map_at_1000 | 17.669 |
| map_at_3 | 9.347 |
| map_at_5 | 10.752 |
| mrr_at_1 | 43.963 |
| mrr_at_10 | 52.94 |
| mrr_at_100 | 53.571 |
| mrr_at_1000 | 53.613 |
| mrr_at_3 | 51.032 |
| mrr_at_5 | 52.193 |
| ndcg_at_1 | 41.486 |
| ndcg_at_10 | 33.937 |
| ndcg_at_100 | 31.726 |
| ndcg_at_1000 | 40.331 |
| ndcg_at_3 | 39.217 |
| ndcg_at_5 | 36.521 |
| precision_at_1 | 43.034 |
| precision_at_10 | 25.325 |
| precision_at_100 | 8.022 |
| precision_at_1000 | 2.063 |
| precision_at_3 | 36.945 |
| precision_at_5 | 31.517 |
| recall_at_1 | 5.604 |
| recall_at_10 | 16.554 |
| recall_at_100 | 33.113 |
| recall_at_1000 | 62.832 |
| recall_at_3 | ... |