🚀 Erlangshen-Roberta-110M-Sentiment
This is a fine-tuned version of the Chinese RoBERTa-wwm-ext-base model on several sentiment analysis datasets, which can effectively handle sentiment analysis tasks in the Chinese language.
🚀 Quick Start
The following code loads the tokenizer and the fine-tuned model, then prints the class probabilities for a sample sentence:
import torch
from transformers import BertForSequenceClassification, BertTokenizer

# Load the tokenizer and the fine-tuned classification model.
tokenizer = BertTokenizer.from_pretrained('IDEA-CCNL/Erlangshen-Roberta-110M-Sentiment')
model = BertForSequenceClassification.from_pretrained('IDEA-CCNL/Erlangshen-Roberta-110M-Sentiment')

text = '今天心情不好'  # "I'm in a bad mood today"

# Encode the sentence as a batch of one and run a forward pass.
output = model(torch.tensor([tokenizer.encode(text)]))
# Convert the logits to class probabilities.
print(torch.nn.functional.softmax(output.logits, dim=-1))
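The snippet above prints a raw probability tensor for one sentence. As a minimal sketch of how several sentences might be scored at once and turned into readable labels, the block below pads a batch with the tokenizer and picks the most probable class per sentence. The helper names `softmax`, `describe`, and `classify` are hypothetical, `max_length=512` is an assumed limit for a BERT-style model, and the label names are read from the checkpoint's `config.id2label` rather than assumed here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def describe(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def classify(texts, model_name="IDEA-CCNL/Erlangshen-Roberta-110M-Sentiment"):
    """Score a list of sentences and return one (label, probability) pair each.

    torch and transformers are imported here so that the two helpers above
    can be used without them.
    """
    import torch
    from transformers import BertForSequenceClassification, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(model_name)
    model = BertForSequenceClassification.from_pretrained(model_name)
    model.eval()  # disable dropout for inference

    # Pad and truncate so sentences of different lengths fit in one tensor.
    # max_length=512 is an assumption, not a value stated in this README.
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=512, return_tensors="pt")
    with torch.no_grad():  # no gradients are needed for inference
        logits = model(**batch).logits

    # Label names come from the checkpoint's own configuration.
    id2label = model.config.id2label
    return [describe(row, id2label) for row in logits.tolist()]
```

Calling `classify(['今天心情不好', '今天心情很好'])` would return one `(label, probability)` pair per sentence; the checkpoint is downloaded on first use. Passing the tokenizer's attention mask along with the padded input IDs (via `**batch`) keeps padding tokens from influencing the result.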
✨ Features
- Fine-tuned on multiple datasets: Based on chinese-roberta-wwm-ext-base, the model is fine-tuned on 8 Chinese sentiment analysis datasets totaling 227,347 samples.
- Good performance: Scores 97.77 on ASAP-SENT, 97.31 on ASAP-ASPECT, and 96.61 on ChnSentiCorp.
📚 Documentation
Model Taxonomy
| Property | Details |
|---|---|
| Demand | General |
| Task | Natural Language Understanding (NLU) |
| Series | Erlangshen |
| Model | Roberta |
| Parameter | 110M |
| Extra | Sentiment Analysis |
Model Information
Based on chinese-roberta-wwm-ext-base, we fine-tuned a sentiment analysis version on 8 Chinese sentiment analysis datasets totaling 227,347 samples.
Performance
| Model | ASAP-SENT | ASAP-ASPECT | ChnSentiCorp |
|---|---|---|---|
| Erlangshen-Roberta-110M-Sentiment | 97.77 | 97.31 | 96.61 |
| Erlangshen-Roberta-330M-Sentiment | 97.9 | 97.51 | 96.66 |
| Erlangshen-MegatronBert-1.3B-Sentiment | 98.1 | 97.8 | 97 |
📄 License
The model uses the Apache 2.0 license.
📖 Citation
If you use this resource in your work, please cite our paper:
@article{fengshenbang,
author = {Jiaxing Zhang and Ruyi Gan and Junjie Wang and Yuxiang Zhang and Lin Zhang and Ping Yang and Xinyu Gao and Ziwei Wu and Xiaoqun Dong and Junqing He and Jianheng Zhuo and Qi Yang and Yongfeng Huang and Xiayu Li and Yanghan Wu and Junyu Lu and Xinyu Zhu and Weifeng Chen and Ting Han and Kunhao Pan and Rui Wang and Hao Wang and Xiaojun Wu and Zhongshen Zeng and Chongpei Chen},
title = {Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence},
journal = {CoRR},
volume = {abs/2209.02970},
year = {2022}
}
You can also cite our website:
@misc{Fengshenbang-LM,
title={Fengshenbang-LM},
author={IDEA-CCNL},
year={2021},
howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
}