Dunzhang Stella En 400M V5
Developed by Marqo
Stella 400M is a medium-scale English text embedding model focused on classification and information retrieval tasks.
Downloads 17.20k
Release Time: 9/25/2024
Model Overview
This model is primarily used for text classification and information retrieval tasks, demonstrating excellent performance on multiple standard datasets.
Model Features
High-performance classification
Achieved 97.19% accuracy on the Amazon product review polarity classification task
Multi-task capability
Supports various text processing tasks including classification and information retrieval
Medium-scale
A 400M-parameter design that balances performance and efficiency
Model Capabilities
Text classification
Sentiment analysis
Information retrieval
Text similarity calculation
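All of these capabilities rest on dense sentence embeddings: the model maps text to vectors, and a task such as similarity calculation reduces to comparing those vectors. A minimal sketch of the comparison step, using NumPy with made-up toy vectors in place of real model output (in practice the embeddings would come from the model, e.g. via the sentence-transformers library):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional "embeddings" standing in for model output.
emb_query = np.array([0.1, 0.3, -0.2, 0.7])
emb_doc = np.array([0.2, 0.25, -0.1, 0.6])

score = cosine_similarity(emb_query, emb_doc)
print(f"similarity: {score:.4f}")
```

Scores close to 1 indicate near-identical meaning; scores near 0 indicate unrelated texts.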
Use Cases
E-commerce
Product review classification
Automatically classify the sentiment polarity of Amazon product reviews
Achieved 97.19% accuracy on Amazon polarity classification task
Multi-class review classification
Classify Amazon reviews into multiple star-rating classes
Achieved 59.53% accuracy on Amazon multi-class review task
Information retrieval
Argument retrieval
Perform argument matching retrieval on ArguAna dataset
Achieved a main score of 64.24
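Retrieval with an embedding model of this kind typically works by encoding the query and all candidate passages, then ranking passages by similarity. A hedged sketch with NumPy and toy vectors (a real system would encode text with the model and, at scale, use an approximate-nearest-neighbor index):

```python
import numpy as np

def rank_passages(query_emb: np.ndarray, passage_embs: np.ndarray, k: int = 3):
    """Return indices of the top-k passages by cosine similarity."""
    # Normalize so that a dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb)
    p = passage_embs / np.linalg.norm(passage_embs, axis=1, keepdims=True)
    scores = p @ q
    return np.argsort(-scores)[:k].tolist()

# Toy embeddings: passage 2 points the same way as the query.
query = np.array([1.0, 0.0, 1.0])
passages = np.array([
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [2.0, 0.0, 2.0],
])
print(rank_passages(query, passages, k=2))  # passage 2 ranks first
```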
🚀 stella_en_400M_v5
This document presents the performance results of the stella_en_400M_v5 model on multiple datasets and tasks.
📚 Documentation
Model Information
| Property | Details |
|----------|---------|
| Model Name | stella_en_400M_v5 |
Performance Results
1. MTEB AmazonCounterfactualClassification (en)
- Task Type: Classification
- Dataset Revision: e8379541af4e31359cca9fbcf4b00f2671dba205

| Metric | Value |
|--------|-------|
| accuracy | 92.35820895522387 |
| ap | 70.81322736988783 |
| ap_weighted | 70.81322736988783 |
| f1 | 88.9505466159595 |
| f1_weighted | 92.68630932872613 |
| main_score | 92.35820895522387 |
2. MTEB AmazonPolarityClassification
- Task Type: Classification
- Dataset Revision: e2d317d38cd51312af73b3d32a06d1a08b442046

| Metric | Value |
|--------|-------|
| accuracy | 97.1945 |
| ap | 96.08192192244094 |
| ap_weighted | 96.08192192244094 |
| f1 | 97.1936887167346 |
| f1_weighted | 97.1936887167346 |
| main_score | 97.1945 |
3. MTEB AmazonReviewsClassification (en)
- Task Type: Classification
- Dataset Revision: 1399c76144fd37290681b995c656ef9b2e06e26d

| Metric | Value |
|--------|-------|
| accuracy | 59.528000000000006 |
| f1 | 59.21016819840188 |
| f1_weighted | 59.21016819840188 |
| main_score | 59.528000000000006 |
4. MTEB ArguAna
- Task Type: Retrieval
- Dataset Revision: c22ab2a51041ffd869aaddef7af8d8215647e41a

| Metric | Value |
|--------|-------|
| main_score | 64.24 |
| map_at_1 | 40.398 |
| map_at_10 | 56.215 |
| map_at_100 | 56.833999999999996 |
| map_at_1000 | 56.835 |
| map_at_20 | 56.747 |
| map_at_3 | 52.181 |
| map_at_5 | 54.628 |
| mrr_at_1 | 41.25177809388336 |
| mrr_at_10 | 56.570762491815216 |
| mrr_at_100 | 57.17548614361504 |
| mrr_at_1000 | 57.176650626377466 |
| mrr_at_20 | 57.08916253512566 |
| mrr_at_3 | 52.47747747747754 |
| mrr_at_5 | 54.94547178757718 |
| nauc_map_at_1000_diff1 | 22.408086887100158 |
| nauc_map_at_1000_max | -8.730419096847543 |
| nauc_map_at_1000_std | -17.789262741255737 |
| nauc_map_at_100_diff1 | 22.407371684274025 |
| nauc_map_at_100_max | -8.732263549026266 |
| nauc_map_at_100_std | -17.79550515579994 |
| nauc_map_at_10_diff1 | 21.925005073301246 |
| nauc_map_at_10_max | -8.990323944492134 |
| nauc_map_at_10_std | -18.199246301671458 |
| nauc_map_at_1_diff1 | 26.23276644969203 |
| nauc_map_at_1_max | -12.376511389571245 |
| nauc_map_at_1_std | -18.11411715207284 |
| nauc_map_at_20_diff1 | 22.32455790850922 |
| nauc_map_at_20_max | -8.664671547236034 |
| nauc_map_at_20_std | -17.8290016125137 |
| nauc_map_at_3_diff1 | 22.395462147465064 |
| nauc_map_at_3_max | -8.206580750918844 |
| nauc_map_at_3_std | -17.604490446911484 |
| nauc_map_at_5_diff1 | 21.95307379904799 |
| nauc_map_at_5_max | -8.03958102978443 |
| nauc_map_at_5_std | -17.36578866595004 |
| nauc_mrr_at_1000_diff1 | 20.124236798365587 |
| nauc_mrr_at_1000_max | -9.587376069575898 |
| nauc_mrr_at_1000_std | -17.79191612151833 |
| nauc_mrr_at_100_diff1 | 20.123612603474033 |
| nauc_mrr_at_100_max | -9.589187218607831 |
| nauc_mrr_at_100_std | -17.7981617777748 |
| nauc_mrr_at_10_diff1 | 19.723683875738075 |
| nauc_mrr_at_10_max | -9.774151729178815 |
| nauc_mrr_at_10_std | -18.168668675495162 |
| nauc_mrr_at_1_diff1 | 23.945332059908132 |
| nauc_mrr_at_1_max | -12.260461466152819 |
| nauc_mrr_at_1_std | -18.007194922921148 |
| nauc_mrr_at_20_diff1 | 20.04819461810257 |
| nauc_mrr_at_20_max | -9.518368283588936 |
| nauc_mrr_at_20_std | -17.831608149836136 |
| nauc_mrr_at_3_diff1 | 19.8571785245832 |
| nauc_mrr_at_3_max | -9.464375021240478 |
| nauc_mrr_at_3_std | -17.728533927330453 |
| nauc_mrr_at_5_diff1 | 19.670313652167827 |
| nauc_mrr_at_5_max | -8.966372585728434 |
| nauc_mrr_at_5_std | -17.468955834324817 |
| nauc_ndcg_at_1000_diff1 | 21.863049281767417 |
| nauc_ndcg_at_1000_max | -8.18698520924057 |
| nauc_ndcg_at_1000_std | -17.634483364794804 |
| nauc_ndcg_at_100_diff1 | 21.849924385738586 |
| nauc_ndcg_at_100_max | -8.226437560889345 |
| nauc_ndcg_at_100_std | -17.774648478087002 |
| nauc_ndcg_at_10_diff1 | 19.888395590413573 |
| nauc_ndcg_at_10_max | -8.968706085632382 |
| nauc_ndcg_at_10_std | -19.31386964628115 |
| nauc_ndcg_at_1_diff1 | 26.23276644969203 |
| nauc_ndcg_at_1_max | -12.376511389571245 |
| nauc_ndcg_at_1_std | -18.11411715207284 |
| nauc_ndcg_at_20_diff1 | 21.38413342416933 |
| nauc_ndcg_at_20_max | -7.636238194084164 |
| nauc_ndcg_at_20_std | -17.946390844693028 |
| nauc_ndcg_at_3_diff1 | 21.29169165029195 |
| nauc_ndcg_at_3_max | -6.793840499730093 |
| nauc_ndcg_at_3_std | -17.52359001586737 |
| nauc_ndcg_at_5_diff1 | 20.238297656671364 |
| nauc_ndcg_at_5_max | -6.424992706950072 |
| nauc_ndcg_at_5_std | -17.082391132291356 |
| nauc_precision_at_1000_diff1 | -7.05195108528572 |
| nauc_precision_at_1000_max | 34.439879624882145 |
| nauc_precision_at_1000_std | 68.72436351659353 |
| nauc_precision_at_100_diff1 | -2.769464113932605 |
| nauc_precision_at_100_max | 9.89562961226698 |
| nauc_precision_at_100_std | -0.5880967482224028 |
| nauc_precision_at_10_diff1 | 2.1371544726832323 |
| nauc_precision_at_10_max | -11.93051325147756 |
| nauc_precision_at_10_std | -30.83144187392059 |
| nauc_precision_at_1_diff1 | 26.23276644969203 |
| nauc_precision_at_1_max | -12.376511389571245 |
| nauc_precision_at_1_std | -18.11411715207284 |
| nauc_precision_at_20_diff1 | 3.780146814257504 |
| nauc_precision_at_20_max | 17.06527540214615 |
| nauc_precision_at_20_std | -20.36832563035565 |
| nauc_precision_at_3_diff1 | 17.63894384012077 |
| nauc_precision_at_3_max | -2.0220490624638887 |
| nauc_precision_at_3_std | -17.285601413493918 |
| nauc_precision_at_5_diff1 | 12.557855071944601 |
| nauc_precision_at_5_max | 0.5840236463956658 |
| nauc_precision_at_5_std | -15.827224420217846 |
| nauc_recall_at_1000_diff1 | -7.051951085286463 |
| nauc_recall_at_1000_max | 34.43987962487738 |
| nauc_recall_at_1000_std | 68.724363516591 |
| nauc_recall_at_100_diff1 | -2.769464113930314 |
| nauc_recall_at_100_max | 9.895629612270017 |
| nauc_recall_at_100_std | -0.58809674821745 |
| nauc_recall_at_10_diff1 | 2.1371544726834495 |
| nauc_recall_at_10_max | -11.930513251477253 |
| nauc_recall_at_10_std | -30.83144187392047 |
| nauc_recall_at_1_diff1 | 26.23276644969203 |
| nauc_recall_at_1_max | -12.376511389571245 |
| nauc_recall_at_1_std | -18.11411715207284 |
| nauc_recall_at_20_diff1 | 3.7801468142575922 |
| nauc_recall_at_20_max | 17.0652754021456 |
| nauc_recall_at_20_std | -20.36832563035559 |
| nauc_recall_at_3_diff1 | 17.63894384012074 |
| nauc_recall_at_3_max | -2.02204906246383 |
| nauc_recall_at_3_std | -17.28560141349386 |
| nauc_recall_at_5_diff1 | 12.55785507194463 |
| nauc_recall_at_5_max | 0.5840236463957296 |
| nauc_recall_at_5_std | -15.827224420217856 |
| ndcg_at_1 | 40.398 |
| ndcg_at_10 | 64.24 |
| ndcg_at_100 | 66.631 |
| ndcg_at_1000 | 66.65100000000001 |
| ndcg_at_20 | 66.086 |
| ndcg_at_3 | 55.938 |
| ndcg_at_5 | 60.370000000000005 |
| precision_at_1 | 40.398 |
| precision_at_10 | 8.962 |
| precision_at_100 | 0.9950000000000001 |
| precision_at_1000 | 0.1 |
| precision_at_20 | 4.836 |
| precision_at_3 | 22.262 |
| precision_at_5 | 15.519 |
| recall_at_1 | 40.398 |
| recall_at_10 | 89.616 |
| recall_at_100 | 99.502 |
| recall_at_1000 | 99.644 |
| recall_at_20 | 96.72800000000001 |
| recall_at_3 | 66.78500000000001 |
| recall_at_5 | 77.596 |
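The ndcg_at_k values above follow the standard nDCG definition: gains discounted by log rank, normalized by the ideal ordering. For ArguAna-style retrieval with effectively binary relevance, a minimal reference sketch (not the MTEB implementation itself):

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k ranked results."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG normalized by the ideal (best possible) ordering."""
    ideal = sorted(relevances, reverse=True)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(relevances, k) / idcg if idcg > 0 else 0.0

# Binary relevance of a ranked list: the single relevant hit sits at rank 2.
ranked = [0, 1, 0, 0, 0]
print(round(ndcg_at_k(ranked, 10), 4))
```

With one relevant document per query, nDCG@k is simply 1/log2(rank+1) when the hit lands within the top k, which is why recall_at_10 of 89.6% translates into a high ndcg_at_10.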
5. MTEB ArxivClusteringP2P
- Task Type: Clustering
- Dataset Revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d

| Metric | Value |
|--------|-------|
| main_score | 55.1564333205451 |
| v_measure | 55.1564333205451 |
| v_measure_std | 14.696883012214512 |
6. MTEB ArxivClusteringS2S
- Task Type: Clustering
- Dataset Revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53

| Metric | Value |
|--------|-------|
| main_score | 49.823698316694795 |
| v_measure | 49.823698316694795 |
| v_measure_std | 14.951660654298186 |
7. MTEB AskUbuntuDupQuestions
- Task Type: Reranking
- Dataset Revision: 2000358ca161889fa9c082cb41daa8dcfb161a54

| Metric | Value |
|--------|-------|
| main_score | 66.15294503553424 |
| map | 66.15294503553424 |
| mrr | 78.53438420612935 |
| nAUC_map_diff1 | 12.569697092717997 |
| nAUC_map_max | 21.50670312412572 |
| nAUC_map_std | 16.943786429229064 |
| nAUC_mrr_diff1 | 15.590272897361238 |
| nAUC_mrr_max | 34.96072022474653 |
| nAUC_mrr_std | 21.649217605241045 |
8. MTEB BIOSSES
- Task Type: STS
- Dataset Revision: d3fb88f8f02e40887cd149695127462bbcf29b4a

| Metric | Value |
|--------|-------|
| cosine_pearson | 85.7824546319275 |
| cosine_spearman | 83.29587385660628 |
| euclidean_pearson | 84.58764190565167 |
| euclidean_spearman | 83.30069324352772 |
| main_score | 83.29587385660628 |
| manhattan_pearson | 84.95996839947179 |
| manhattan_spearman | 83.87480271054358 |
| pearson | 85.7824546319275 |
| spearman | 83.29587385660628 |
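The STS scores above are Pearson and Spearman correlations between the model's similarity scores and human judgments; Spearman is just Pearson applied to ranks, so it rewards getting the ordering right even when the relationship is nonlinear. A dependency-free sketch of that computation (in practice scipy.stats.spearmanr is the usual tool):

```python
def rank(values):
    """Average ranks (1-based), with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (vx * vy)

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    return pearson(rank(x), rank(y))

x = [1, 2, 3, 4]
y = [1, 4, 9, 16]  # monotonic but nonlinear in x
print(round(spearman(x, y), 4), round(pearson(x, y), 4))
```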
9. MTEB Banking77Classification
- Task Type: Classification
- Dataset Revision: 0fd18e25b25c072e09e0d92ab615fda904d66300

| Metric | Value |
|--------|-------|
| accuracy | 89.30194805194806 |
| f1 | 89.26182507266391 |
| f1_weighted | 89.26182507266391 |
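The classification tables report accuracy alongside plain and weighted F1. A minimal sketch of how weighted F1 is computed (per-class F1 averaged with class-frequency weights), written without scikit-learn for clarity:

```python
from collections import Counter

def f1_per_class(y_true, y_pred, label):
    """F1 for a single class, treating it as the positive label."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def weighted_f1(y_true, y_pred):
    """Per-class F1 weighted by each class's share of the true labels."""
    counts = Counter(y_true)
    n = len(y_true)
    return sum(counts[label] / n * f1_per_class(y_true, y_pred, label)
               for label in counts)

y_true = ["pos", "pos", "neg", "neg", "neg"]
y_pred = ["pos", "neg", "neg", "neg", "pos"]
print(round(weighted_f1(y_true, y_pred), 4))  # 0.6
```

Weighted F1 matches plain F1 when classes are balanced, which is why the two columns are so close on Banking77.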
Distilbert Base Uncased Finetuned Sst 2 English
Apache-2.0
Text classification model based on DistilBERT-base-uncased, fine-tuned on the SST-2 sentiment analysis dataset, reaching 91.3% accuracy.
Text Classification · English
distilbert · Downloads 5.2M · Likes 746
Xlm Roberta Base Language Detection
MIT
Language detection model based on XLM-RoBERTa, classifying text into 20 languages.
Text Classification · Transformers · Multilingual
papluca · Downloads 2.7M · Likes 333
Roberta Hate Speech Dynabench R4 Target
This model improves online hate detection through dynamic dataset generation, learning from worst-case examples to strengthen detection.
Text Classification · Transformers · English
facebook · Downloads 2.0M · Likes 80
Bert Base Multilingual Uncased Sentiment
MIT
A multilingual sentiment analysis model fine-tuned from bert-base-multilingual-uncased, supporting sentiment analysis of product reviews in six languages.
Text Classification · Multilingual
nlptown · Downloads 1.8M · Likes 371
Emotion English Distilroberta Base
An English text emotion classification model fine-tuned from DistilRoBERTa-base, predicting Ekman's six basic emotions plus a neutral class.
Text Classification · Transformers · English
j-hartmann · Downloads 1.1M · Likes 402
Robertuito Sentiment Analysis
Spanish tweet sentiment analysis model based on RoBERTuito, with three-class POS (positive) / NEG (negative) / NEU (neutral) classification.
Text Classification · Spanish
pysentimiento · Downloads 1.0M · Likes 88
Finbert Tone
FinBERT is a BERT model pre-trained on financial communication text, specializing in financial NLP; finbert-tone is its version fine-tuned for financial sentiment analysis.
Text Classification · Transformers · English
yiyanghkust · Downloads 998.46k · Likes 178
Roberta Base Go Emotions
MIT
A multi-label emotion classification model based on RoBERTa-base, trained on the go_emotions dataset and recognizing 28 emotion labels.
Text Classification · Transformers · English
SamLowe · Downloads 848.12k · Likes 565
Xlm Emo T
XLM-EMO is a multilingual emotion detection model fine-tuned from XLM-T, supporting 19 languages and designed for emotion prediction in social media text.
Text Classification · Transformers · Other
MilaNLProc · Downloads 692.30k · Likes 7
Deberta V3 Base Mnli Fever Anli
MIT
DeBERTa-v3 model trained on the MultiNLI, Fever-NLI, and ANLI datasets, excelling at zero-shot classification and natural language inference.
Text Classification · Transformers · English
MoritzLaurer · Downloads 613.93k · Likes 204