Trading-Hero-LLM: Open-Source Financial Sentiment Analysis Model, Fine-Tuned for Financial Text Sentiment Classification

Trading Hero LLM

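Developed by fuchenru

A financial sentiment analysis model fine-tuned from FinBERT and optimized for sentiment classification of financial text

Text Classification

Transformers

License: MIT #FinancialSentimentAnalysis #FinBERTFineTuning #FinancialTextClassification

Downloads: 313

Released: 5/25/2024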

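Model Overview

This model is a fine-tuned version of FinBERT built for sentiment analysis in the financial domain. It classifies financial text as neutral, positive, or negative.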

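Model Features

Financial domain optimization

The base model, FinBERT, was pre-trained on a large financial corpus; fine-tuning adapts it further to the characteristics of financial text

High accuracy

Test accuracy of 90.8% and an F1 score of 91.3% on the financial sentiment test set

Three-class sentiment analysis

Distinguishes neutral, positive, and negative sentiment in financial text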

Model Capabilities

Financial text sentiment classification

Financial news sentiment analysis

Market sentiment prediction

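Use Cases

Financial analysis

Financial news sentiment monitoring

Analyze the sentiment of financial news and market commentary

Identify neutral, positive, and negative sentiment

Investment decision support

Give investors a market sentiment reference

Help gauge the overall direction of market sentiment

Risk management

Market risk early warning

Anticipate potential risks by tracking changes in the sentiment of financial text

Detect signals of a shift in market sentiment early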
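
One way to turn per-headline labels into the sentiment-shift signal described above is a net sentiment score per period: the share of positive labels minus the share of negative ones. The sketch below is an illustration only. `net_sentiment`, `detect_shift`, the sample labels, and the 0.5 threshold are all hypothetical and are not part of the model or its documentation.

```python
def net_sentiment(labels):
    """(positive - negative) / total, in [-1, 1]; 0.0 for an empty batch."""
    if not labels:
        return 0.0
    pos = labels.count("positive")
    neg = labels.count("negative")
    return (pos - neg) / len(labels)


def detect_shift(daily_labels, threshold=0.5):
    """Indices of days whose score fell by at least `threshold` versus the previous day."""
    scores = [net_sentiment(day) for day in daily_labels]
    return [i for i in range(1, len(scores)) if scores[i - 1] - scores[i] >= threshold]


# Made-up model outputs for three consecutive days (assumption, not real data).
daily_labels = [
    ["positive", "positive", "neutral", "neutral"],   # net score 0.5
    ["positive", "neutral", "neutral", "negative"],   # net score 0.0
    ["negative", "negative", "neutral", "negative"],  # net score -0.75
]
print(detect_shift(daily_labels))
```

The threshold is arbitrary here; in practice it would need to be tuned against the volume and noise of the text source being monitored.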

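🚀 Trading Hero Financial Sentiment Analysis

This financial sentiment analysis model is fine-tuned from a model pre-trained on financial text. It handles sentiment classification of financial language and supports financial analysis and research.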

🚀 Quick Start

Code Example

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load the tokenizer and the fine-tuned classification model
tokenizer = AutoTokenizer.from_pretrained("fuchenru/Trading-Hero-LLM")
model = AutoModelForSequenceClassification.from_pretrained("fuchenru/Trading-Hero-LLM")
# Ready-made pipeline as an alternative entry point (not used by predict_sentiment below)
nlp = pipeline("text-classification", model=model, tokenizer=tokenizer)

# Preprocess input text to a fixed length (helper; not used by predict_sentiment below)
def preprocess(text, tokenizer, max_length=128):
    inputs = tokenizer(text, truncation=True, padding='max_length', max_length=max_length, return_tensors='pt')
    return inputs

# Predict the sentiment of a single text
def predict_sentiment(input_text):
    # Tokenize the input text
    inputs = tokenizer(input_text, return_tensors="pt", truncation=True, padding=True)

    # Run inference without tracking gradients
    with torch.no_grad():
        outputs = model(**inputs)

    # Take the class with the highest logit
    predicted_label = torch.argmax(outputs.logits, dim=1).item()

    # Map the predicted class index to its sentiment label
    label_map = {0: 'neutral', 1: 'positive', 2: 'negative'}
    predicted_sentiment = label_map[predicted_label]

    return predicted_sentiment

stock_news = [
    "Market analysts predict a stable outlook for the coming weeks.",
    "The market remained relatively flat today, with minimal movement in stock prices.",
    "Investor sentiment improved following news of a potential trade deal.",
    # Remaining items omitted in the source
]


for news in stock_news:
    predicted_sentiment = predict_sentiment(news)
    print("Predicted Sentiment:", predicted_sentiment)

Predicted Sentiment: neutral
Predicted Sentiment: neutral
Predicted Sentiment: positive
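
The quick-start function returns only the top label. When a confidence score is also useful, the logits can be converted to probabilities with a softmax. The sketch below does this in plain Python for one hard-coded logit vector. The helper names `logits_to_probs` and `describe` and the example logits are made up for illustration; the label order follows the label map in the quick-start code.

```python
import math

# Label order taken from the model card: 0 -> neutral, 1 -> positive, 2 -> negative.
LABEL_MAP = {0: "neutral", 1: "positive", 2: "negative"}


def logits_to_probs(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def describe(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = logits_to_probs(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABEL_MAP[best], probs[best]


# Hypothetical logits, standing in for outputs.logits[0].tolist() from the quick-start code.
example_logits = [0.2, 2.5, -1.0]
label, prob = describe(example_logits)
print(f"{label} ({prob:.2f})")
```

A low top probability can be used to route borderline texts to manual review instead of trusting the label outright.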

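✨ Key Features

This model is a fine-tuned version of FinBERT, a BERT model pre-trained on financial text.
Fine-tuning adapts the model to specific financial NLP tasks and improves its performance on domain-specific sentiment analysis.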

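📚 Documentation

Intended Users

Financial analysts, NLP researchers, and developers who work with financial data.

Training Data

Fine-tuning dataset: the model was fine-tuned on a custom dataset of financial communication text, split into training, validation, and test sets as follows:
- Training set: 10,918,272 tokens
- Validation set: 1,213,184 tokens
- Test set: 1,347,968 tokens
Pre-training dataset: FinBERT was pre-trained on a large financial corpus totaling 4.9 billion tokens, comprising:
- Corporate reports (10-K and 10-Q): 2.5 billion tokens
- Earnings call transcripts: 1.3 billion tokens
- Analyst reports: 1.1 billion tokens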
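
The figures above can be checked with a few lines of arithmetic. The sketch below computes each split's share of the fine-tuning data and confirms that the three pre-training sources sum to the stated 4.9 billion tokens. It also notes that every split size is a whole multiple of 128, the `max_length` used by the `preprocess` helper. That may mean the counts are of padded sequences, but the source does not say so, so treat it as an observation only. Variable names are illustrative.

```python
# Token counts as reported in the model card.
fine_tune_tokens = {"train": 10_918_272, "validation": 1_213_184, "test": 1_347_968}
pretrain_tokens = {
    "corporate reports": 2.5e9,
    "earnings call transcripts": 1.3e9,
    "analyst reports": 1.1e9,
}

# Share of each split in the fine-tuning dataset.
total = sum(fine_tune_tokens.values())
shares = {name: count / total for name, count in fine_tune_tokens.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1%}")

# The three pre-training sources should add up to the stated 4.9 billion tokens.
pretrain_total = sum(pretrain_tokens.values())
print(f"pre-training total: {pretrain_total / 1e9:.1f}B tokens")

# Observation (not stated in the source): each split divides evenly by 128.
divisible_by_128 = all(count % 128 == 0 for count in fine_tune_tokens.values())
print("all splits divisible by 128:", divisible_by_128)
```

The split works out to roughly 81% training, 9% validation, and 10% test.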

Evaluation Metrics

Test accuracy = 0.908469
Test precision = 0.927788
Test recall = 0.908469
Test F1 score = 0.913267
Label mapping: 0 -> neutral; 1 -> positive; 2 -> negative
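
Test accuracy and test recall above are identical (0.908469). Support-weighted averaging produces exactly this, because weighted recall always equals accuracy. The model card does not say which averaging was used, so this is an inference rather than a documented fact. The pure-Python sketch below shows the relationship on made-up labels; `weighted_metrics` is a hypothetical helper, not a library function.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / count
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = count / n  # class weight = share of true samples
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels using the card's encoding: 0 neutral, 1 positive, 2 negative.
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 0, 2, 2, 1]
scores = weighted_metrics(y_true, y_pred)
print(scores)
```

Weighted precision and F1 are free to differ from accuracy, which is consistent with the reported 0.927788 and 0.913267.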

📄 License

This project is released under the MIT License.

Citation

@misc{yang2020finbert,
    title={FinBERT: A Pretrained Language Model for Financial Communications},
    author={Yi Yang and Mark Christopher Siy UY and Allen Huang},
    year={2020},
    eprint={2006.08097},
    archivePrefix={arXiv},
}