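🚀 roberta-large fine-tuned for financial news sentiment classification (with a focus on Canadian news)
This model is a fine-tuned version of roberta-large for sentiment classification of financial news, with a particular focus on Canadian news. It labels the sentiment of a financial news passage as negative, neutral, or positive.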
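🚀 Quick start
Model overview
This model was trained on the financial_news_sentiment_mixte_with_phrasebank_75 dataset, a customized version of the phrasebank dataset that keeps only sentences on which at least 75% of annotators agreed. About 2,000 manually verified Canadian financial news articles were added to it, so the model is tuned more specifically to Canadian news. The final model reaches an overall F1 score of 93.25% and an F1 score of 83.6% on Canadian news.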
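The 75% agreement rule can be illustrated with a minimal sketch. The data layout, the sample sentences, and the `majority_label` helper are hypothetical and chosen only for illustration; this is not the script used to build the dataset.

```python
from collections import Counter


def majority_label(votes, min_agreement=0.75):
    """Return the majority label if its share of votes meets the threshold, else None."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) >= min_agreement else None


# Hypothetical annotations: each sentence maps to one vote per annotator.
annotations = {
    "Operating profit rose compared with last year.": ["positive"] * 4,
    "Sales were flat year over year.": ["neutral", "neutral", "neutral", "negative"],
    "The company will review its options.": ["neutral", "positive", "negative", "neutral"],
}

# Keep only sentences whose majority label has at least 75% agreement.
kept = {}
for sentence, votes in annotations.items():
    label = majority_label(votes)
    if label is not None:
        kept[sentence] = label

for sentence, label in kept.items():
    print(f"{label}: {sentence}")
```

In this toy input the third sentence is dropped, because its most common label is backed by only half of the votes.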
📦 Installation
Loading the model with HuggingFace
The following code loads the roberta-large-financial-news-sentiment-en model and its subword tokenizer:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("Jean-Baptiste/roberta-large-financial-news-sentiment-en")
model = AutoModelForSequenceClassification.from_pretrained("Jean-Baptiste/roberta-large-financial-news-sentiment-en")
Processing a text sample
The following code runs the loaded model on a text sample:
from transformers import pipeline
pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)
pipe("Melcor REIT (TSX: MR.UN) today announced results for the third quarter ended September 30, 2022. Revenue was stable in the quarter and year-to-date. Net operating income was down 3% in the quarter at $11.61 million due to the timing of operating expenses and inflated costs including utilities like gas/heat and power")
[{'label': 'negative', 'score': 0.9399105906486511}]
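To classify several passages at once, the pipeline can be called with a list of strings, in which case it returns one result dictionary per input. The helper below is a minimal sketch built on that behavior. The `classify_batch` name and the `min_score` threshold of 0.8 are assumptions made for illustration, not part of the model card.

```python
def classify_batch(pipe, texts, min_score=0.8):
    """Classify texts with a text-classification pipeline.

    `pipe` is expected to behave like the pipeline object created above:
    called with a list of strings, it returns one {'label', 'score'} dict
    per input. `min_score` is an arbitrary illustrative threshold used to
    flag low-confidence predictions for manual review.
    """
    texts = list(texts)
    predictions = pipe(texts)
    return [
        {
            "text": text,
            "label": pred["label"],
            "score": pred["score"],
            "confident": pred["score"] >= min_score,
        }
        for text, pred in zip(texts, predictions)
    ]


# Usage, assuming `pipe` was created as shown above:
# rows = classify_batch(pipe, ["First news passage ...", "Second news passage ..."])
# for row in rows:
#     print(row["label"], round(row["score"], 3), row["confident"])
```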
📚 Documentation
Training data
The training data breaks down by class as follows:
Model performance
Overall scores (macro average)
| Precision | Recall | F1 score |
|-----------|--------|----------|
| 0.9355 | 0.9299 | 0.9325 |
By class
| Class | Precision | Recall | F1 score |
|-------|-----------|--------|----------|
| negative | 0.9605 | 0.9240 | 0.9419 |
| neutral | 0.9538 | 0.9459 | 0.9498 |
| positive | 0.8922 | 0.9200 | 0.9059 |
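The overall figures are macro averages, that is, the unweighted mean of the per-class scores. The short check below recomputes them from the per-class table as a sanity check. It works from the rounded table values, so the results agree with the reported overall scores only to within about 0.0001.

```python
# Per-class scores copied from the table above.
per_class = {
    "negative": {"precision": 0.9605, "recall": 0.9240, "f1": 0.9419},
    "neutral": {"precision": 0.9538, "recall": 0.9459, "f1": 0.9498},
    "positive": {"precision": 0.8922, "recall": 0.9200, "f1": 0.9059},
}


def macro_average(metric):
    """Unweighted mean of one metric across all classes."""
    return sum(scores[metric] for scores in per_class.values()) / len(per_class)


macro = {metric: macro_average(metric) for metric in ("precision", "recall", "f1")}

for metric, value in macro.items():
    print(f"{metric}: {value:.4f}")
```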
📄 License
This project is released under the MIT License.