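🚀 roberta-large fine-tuned for financial news sentiment classification (with a focus on Canadian news)
This model is a fine-tuned version of roberta-large for sentiment classification of financial news, with particular emphasis on Canadian news. It labels the sentiment expressed in a financial news text, which makes it useful for financial information analysis.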
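🚀 Quick start
Model description
The model was trained on the financial_news_sentiment_mixte_with_phrasebank_75 dataset. This is a customized version of the phrasebank dataset that keeps only sentences validated by at least 75% of the annotators. About 2,000 manually validated articles of Canadian financial news were added on top of it, so the model is trained more specifically for Canadian news. The final results are an overall F1 score of 93.25% and an F1 score of 83.6% on Canadian news.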
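The 75% agreement rule can be illustrated with a minimal sketch. It assumes each sentence comes with a list of per-annotator labels and that "validated" means the majority label reaches the threshold; the function names and the toy annotations are hypothetical and are not taken from the dataset or its build scripts.

```python
from collections import Counter


def majority_agreement(labels):
    """Return the majority label and the share of annotators who chose it."""
    label, votes = Counter(labels).most_common(1)[0]
    return label, votes / len(labels)


def filter_by_agreement(annotated, threshold=0.75):
    """Keep (sentence, label) pairs whose majority label reaches the threshold."""
    kept = []
    for sentence, labels in annotated:
        label, share = majority_agreement(labels)
        if share >= threshold:
            kept.append((sentence, label))
    return kept


# Hypothetical toy annotations, for illustration only.
sample = [
    ("Revenue rose 12% year over year.",
     ["positive", "positive", "positive", "positive"]),   # 100% agreement
    ("The company kept its guidance unchanged.",
     ["neutral", "neutral", "neutral", "positive"]),      # 75% agreement
    ("Margins were mixed across segments.",
     ["neutral", "negative", "positive", "neutral"]),     # 50% agreement
]

filtered = filter_by_agreement(sample)
for sentence, label in filtered:
    print(label, "|", sentence)
```

With these toy inputs the first two sentences are kept and the third is dropped, because only half of its annotators agree.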
📦 Installation
Loading the model with HuggingFace
The following code loads the roberta-large-financial-news-sentiment-en model and its subword tokenizer:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("Jean-Baptiste/roberta-large-financial-news-sentiment-en")
model = AutoModelForSequenceClassification.from_pretrained("Jean-Baptiste/roberta-large-financial-news-sentiment-en")
Processing a text sample
The following code runs a text sample through the loaded model:
from transformers import pipeline
pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)
pipe("Melcor REIT (TSX: MR.UN) today announced results for the third quarter ended September 30, 2022. Revenue was stable in the quarter and year-to-date. Net operating income was down 3% in the quarter at $11.61 million due to the timing of operating expenses and inflated costs including utilities like gas/heat and power")
[{'label': 'negative', 'score': 0.9399105906486511}]
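Downstream code often wants a single number rather than a label and score pair. The sketch below is one possible post-processing step, not part of the model or of the transformers library: `to_signed_score` is a hypothetical helper, and the label names `neutral` and `positive` are assumed to follow the same spelling as the `negative` label shown in the output above.

```python
def to_signed_score(prediction):
    """Map one pipeline prediction dict to a value in [-1, 1].

    Negative predictions become -score, positive ones +score, neutral ones 0.
    The label spellings are an assumption based on the example output.
    """
    sign = {"negative": -1.0, "neutral": 0.0, "positive": 1.0}
    return sign[prediction["label"]] * prediction["score"]


# Prediction copied from the example output above.
result = [{"label": "negative", "score": 0.9399105906486511}]

signed = [to_signed_score(p) for p in result]
print(signed)
```

A signed value like this is convenient for averaging over several headlines about the same company, although it discards the distinction between a confident neutral prediction and an uncertain one.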
📚 Detailed documentation
Training data
The training data is broken down by class as follows:
Model performance
Overall F1 score (macro average)
| Precision | Recall | F1 |
|-----------|--------|--------|
| 0.9355 | 0.9299 | 0.9325 |
By sentiment class
| Class | Precision | Recall | F1 |
|----------|-----------|--------|--------|
| negative | 0.9605 | 0.9240 | 0.9419 |
| neutral | 0.9538 | 0.9459 | 0.9498 |
| positive | 0.8922 | 0.9200 | 0.9059 |
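As a quick consistency check, the macro-averaged figures can be recomputed from the per-class precision and recall reported above. This is a plain arithmetic sketch, assuming the macro average is the unweighted mean over the three classes; it does not reproduce the author's evaluation script.

```python
# Per-class (precision, recall) copied from the per-class results above.
per_class = {
    "negative": (0.9605, 0.9240),
    "neutral": (0.9538, 0.9459),
    "positive": (0.8922, 0.9200),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


class_f1 = {name: f1(p, r) for name, (p, r) in per_class.items()}

# Macro average: unweighted mean over classes.
n = len(per_class)
macro_precision = sum(p for p, _ in per_class.values()) / n
macro_recall = sum(r for _, r in per_class.values()) / n
macro_f1 = sum(class_f1.values()) / n

for name, value in class_f1.items():
    print(f"{name}: F1 = {value:.4f}")
print(f"macro: P = {macro_precision:.4f}, R = {macro_recall:.4f}, F1 = {macro_f1:.4f}")
```

The recomputed per-class F1 values and the macro precision and F1 match the reported numbers to four decimals. The macro recall comes out about 0.0001 higher than the reported 0.9299, which is consistent with the per-class inputs having been rounded.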
📄 License
This project is released under the MIT License.