Stella Large Zh V2
Developed by infgrad
stella-large-zh-v2 is a Chinese text embedding model built for sentence similarity. It supports semantic textual similarity tasks as well as text classification.
Downloads 259
Release Time: 10/13/2023
Model Overview
This model is mainly used for sentence similarity calculation, text classification, text clustering, re-ranking, and retrieval. Results are reported on multiple Chinese evaluation benchmarks.
Model Features
Support for Multi-task Evaluation Benchmarks
Evaluated on Chinese multi-task benchmarks (such as C-MTEB, the Chinese counterpart of MTEB), covering semantic textual similarity, text classification, text clustering, and re-ranking.
High-performance Sentence Similarity Calculation
Reports sentence similarity results on datasets such as Ant Financial Q&A (AFQMC), ATEC, and Bank Q&A (BQ), with scores given for multiple distance metrics (cosine similarity, Euclidean distance, Manhattan distance).
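The three metrics named above can be sketched in a few lines of plain Python. This is only an illustration: the vectors `u` and `v` are made-up stand-ins for sentence embeddings, not output from stella-large-zh-v2, and the function names are chosen here for readability.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical toy "embeddings"; real ones would come from the model.
u = [1.0, 2.0, 2.0]
v = [2.0, 4.0, 4.0]

cos = cosine_similarity(u, v)    # 1.0: v points in the same direction as u
l2 = euclidean_distance(u, v)    # 3.0
l1 = manhattan_distance(u, v)    # 5.0
```

Note that cosine similarity ignores vector length while the two distances do not, which is why the same pair can score as identical under one metric and far apart under another.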
Powerful Re-ranking Ability
In the re-ranking tasks of CMedQAv1 and CMedQAv2, both the mean average precision (MAP) and the mean reciprocal rank (MRR) exceed 85%.
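Re-ranking quality of this kind is commonly summarized as mean average precision (MAP) and mean reciprocal rank (MRR). The sketch below shows one usual way to compute both from ranked relevance flags. It assumes every relevant candidate appears somewhere in the ranked list, and the `rankings` data is invented for illustration, not taken from CMedQA.

```python
def average_precision(relevance):
    """Average precision for one query.

    `relevance` is a list of 0/1 flags in ranked order (best first).
    Assumes all relevant items are present in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant hit
    return precision_sum / hits if hits else 0.0

def reciprocal_rank(relevance):
    """1 / rank of the first relevant item, or 0.0 if there is none."""
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            return 1.0 / rank
    return 0.0

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Hypothetical ranked results for two queries (1 = relevant answer).
rankings = [
    [1, 0, 1, 0],  # relevant answers at ranks 1 and 3
    [0, 1, 0, 0],  # relevant answer at rank 2
]

map_score = mean(average_precision(r) for r in rankings)
mrr_score = mean(reciprocal_rank(r) for r in rankings)
```

For the toy data, the first query has average precision (1/1 + 2/3) / 2 and the second has 1/2, so MAP is about 0.667 and MRR is 0.75.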
Model Capabilities
Sentence Similarity Calculation
Text Classification
Text Clustering
Re-ranking
Retrieval
Use Cases
Financial Field
Financial Q&A System
Used in the Q&A system of the financial field to calculate the similarity between user questions and candidate answers.
On the Ant Financial Q&A dataset, cosine similarity scores reach a Pearson correlation of 47.34 and a Spearman correlation of 49.94.
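Pearson and Spearman values like these measure how well the model's similarity scores track human-labeled scores; they appear to be reported on a 0 to 100 scale (so 47.34 would correspond to a coefficient of 0.4734). A minimal sketch of both coefficients follows. The `predicted` and `gold` lists are invented for illustration and are not taken from the dataset.

```python
import math

def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical model similarities and gold scores for four sentence pairs.
predicted = [1.0, 2.0, 3.0, 4.0]
gold = [1.0, 4.0, 9.0, 16.0]

r_pearson = pearson(predicted, gold)    # below 1: relationship is not linear
r_spearman = spearman(predicted, gold)  # 1.0: the ordering matches exactly
```

Spearman only looks at ordering, which is why it can differ from Pearson on the same data, as it does in the reported figures.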
Bank Customer Service Q&A
Used for Q&A matching and similarity calculation in the bank customer service system.
On the Bank Q&A dataset, cosine similarity scores reach a Pearson correlation of 62.83 and a Spearman correlation of 65.53.
Medical Field
Medical Q&A Re-ranking
Used for answer re-ranking in the medical Q&A system to improve the relevance of answers.
In the re-ranking tasks of CMedQAv1 and CMedQAv2, the mean average precision (MAP) is 85.44 and 85.82, respectively.
COVID-19 Information Retrieval
Used for the retrieval and ranking of COVID-19 related information.
In the COVID-19 retrieval task, the mean average precision at rank 1 (MAP@1) is 68.86, and at rank 10 (MAP@10) it is 77.10.
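Embedding-based retrieval of this kind usually ranks documents by their similarity to the query vector and keeps the top k. The sketch below illustrates that step only. The document IDs and two-dimensional vectors are hypothetical placeholders for real embeddings, and `top_k` is a name introduced here, not part of any library.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(query_vec, doc_vecs, k):
    """Return the IDs of the k documents most similar to the query."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical document embeddings keyed by document ID.
docs = {
    "d1": [1.0, 0.0],
    "d2": [0.8, 0.6],
    "d3": [0.0, 1.0],
}
query = [1.0, 0.1]

best_two = top_k(query, docs, k=2)
```

Here `best_two` is `["d1", "d2"]`, since the query vector points almost along the first axis.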
General Text Processing
Text Classification
Used for general text classification tasks, such as Amazon review classification.
In the Amazon review classification (Chinese) task, the accuracy is 40.81, and the F1 score is 39.02.
Text Clustering
Used for text clustering tasks, such as CLS paragraph-to-paragraph (P2P) and sentence-to-sentence (S2S) clustering.
In the CLS P2P clustering task, the V-measure is 39.95; in the S2S clustering task, the V-measure is 38.18.
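The V-measure used for these clustering scores is the harmonic mean of homogeneity (each cluster contains a single class) and completeness (each class falls in a single cluster). The sketch below follows the standard entropy-based definition. The label lists are invented examples, and the reported figures appear to be on a 0 to 100 scale rather than 0 to 1.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def conditional_entropy(target, given):
    """H(target | given) for two parallel label lists."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum((c / n) * math.log(c / given_counts[g]) for (g, _), c in joint.items())

def v_measure(true_labels, cluster_labels):
    """Harmonic mean of homogeneity and completeness, in [0, 1]."""
    h_true = entropy(true_labels)
    h_cluster = entropy(cluster_labels)
    homogeneity = 1.0 if h_true == 0 else (
        1.0 - conditional_entropy(true_labels, cluster_labels) / h_true)
    completeness = 1.0 if h_cluster == 0 else (
        1.0 - conditional_entropy(cluster_labels, true_labels) / h_cluster)
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)

# Hypothetical gold classes and two candidate clusterings.
gold = [0, 0, 1, 1]
perfect = [1, 1, 0, 0]    # same grouping, cluster names swapped
unrelated = [0, 1, 0, 1]  # carries no information about the gold classes

v_perfect = v_measure(gold, perfect)
v_unrelated = v_measure(gold, unrelated)
```

Because the measure depends only on how items are grouped, renaming clusters does not change the score: the first clustering scores 1.0 and the second scores approximately 0.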