robust-sentiment-analysis: an open-source sentiment analysis model with 5-class sentiment classification

Robust Sentiment Analysis

Developed by tabularisai

A sentiment analysis model fine-tuned from distilbert/distilbert-base-uncased, trained only on synthetic data, with 5 sentiment classes.

Text Classification

Transformers

Language: English | License: Apache-2.0 | #SyntheticDataTraining #FiveLevelSentimentClassification #SocialMediaAnalysis

Downloads: 2,632

Released: 7/23/2024

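Model Overview

This model is a classifier for sentiment analysis of English text. It assigns each text to one of five sentiment classes: Very Negative, Negative, Neutral, Positive, or Very Positive.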

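Model Features

Synthetic data training

Trained only on synthetic data, which avoids common limitations of real-world datasets

Multi-class sentiment analysis

Fine-grained classification into 5 sentiment classes (Very Negative through Very Positive)

High performance

Reaches a train_acc_off_by_one score of about 0.95 on the validation set

Lightweight

Built on the DistilBERT architecture, which is smaller and faster than a full BERT model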
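
The train_acc_off_by_one figure quoted above is not defined on this page. A common reading of "off-by-one accuracy" for ordinal labels is that a prediction counts as correct when it lands on the true class or on a directly adjacent one. The sketch below assumes that definition; the function name off_by_one_accuracy and the sample values are hypothetical and are not taken from the model's training code.

```python
from typing import Sequence


def off_by_one_accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions within one class index of the true label.

    Assumed definition: |predicted - true| <= 1 counts as a hit.
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one example is required")
    hits = sum(1 for p, y in zip(predictions, labels) if abs(p - y) <= 1)
    return hits / len(labels)


# Hypothetical class indices: 0 = Very Negative ... 4 = Very Positive.
preds = [0, 1, 4, 2, 3]
truth = [1, 1, 2, 2, 4]
print(off_by_one_accuracy(preds, truth))
```

Under this reading, a score near 0.95 says that most predictions are at most one step away from the true class. It does not say how often the exact class is hit, so it is more lenient than plain accuracy.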

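Model Capabilities

Text sentiment classification

Social media sentiment analysis

Product review classification

Customer feedback analysis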

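Use Cases

Business Analytics

Social media monitoring

Analyze public sentiment toward a brand or product on social media

Helps brands understand public mood and adjust marketing strategy in time

Customer feedback analysis

Automatically classify the sentiment of customer feedback

Quickly identify dissatisfied customers and improve customer service quality

Market Research

Product review analysis

Analyze the sentiment of product reviews on e-commerce platforms

Understand product strengths and weaknesses to guide product improvements

Competitive intelligence analysis

Compare user sentiment toward competitors' products

Gain insight into competitive advantages in the market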

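🚀 (Distil)BERT-based Sentiment Classification Model: Unleashing the Power of Synthetic Data

This model builds on the (Distil)BERT architecture and is trained on synthetic data. It classifies English text into five sentiment classes and is aimed at use cases such as social media analysis and customer feedback analysis.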

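🚀 Quick Start

Model Information

Property	Details
Model name	tabularisai/robust-sentiment-analysis
Base model	distilbert/distilbert-base-uncased
Task type	Text classification (sentiment analysis)
Language	English
Number of classes	5 (Very Negative, Negative, Neutral, Positive, Very Positive)
Use cases	Social media analysis, customer feedback analysis, product review classification, brand monitoring, market research, customer service optimization, competitive intelligence analysis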

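✨ Key Features

Trained on synthetic data: the model is trained only on synthetic data, which allows targeted training across varied expressions of sentiment without the limitations of real-world datasets.
Applicable to many scenarios: suited to social media monitoring, customer feedback analysis, sentiment classification of product reviews, brand sentiment tracking, and similar tasks.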

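📦 Installation

The original documentation provides no installation commands. The Python usage example imports transformers and torch, so both packages must be installed first.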

💻 Usage Examples

Basic Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the model and tokenizer
model_name = "tabularisai/robust-sentiment-analysis"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

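# Predict the sentiment label for a single text
def predict_sentiment(text):
    # The base model is uncased, so the input is lowercased before tokenizing
    inputs = tokenizer(text.lower(), return_tensors="pt", truncation=True, padding=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)

    # Convert logits to probabilities and pick the most likely class index
    probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(probabilities, dim=-1).item()

    sentiment_map = {0: "Very Negative", 1: "Negative", 2: "Neutral", 3: "Positive", 4: "Very Positive"}
    return sentiment_map[predicted_class]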

# Example usage
texts = [
    "I absolutely loved this movie! The acting was superb and the plot was engaging.",
    "The service at this restaurant was terrible. I'll never go back.",
    "The product works as expected. Nothing special, but it gets the job done.",
    "I'm somewhat disappointed with my purchase. It's not as good as I hoped.",
    "This book changed my life! I couldn't put it down and learned so much."
]

for text in texts:
    sentiment = predict_sentiment(text)
    print(f"Text: {text}")
    print(f"Sentiment: {sentiment}\n")
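
The post-processing step in the example above (softmax over the logits, argmax, then a lookup in the label map) can also be tried without downloading the model. The following is a minimal sketch under stated assumptions: the helper name label_with_score and the logits are hypothetical, the numbers are not real model output, and the label order follows class indices 0 to 4 as used in the example.

```python
import math
from typing import List, Tuple

# Class indices 0..4 in the same order as the label map in the usage example.
LABELS = ["Very Negative", "Negative", "Neutral", "Positive", "Very Positive"]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_with_score(logits: List[float]) -> Tuple[str, float]:
    """Return the most likely label and its probability for one input."""
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Hypothetical logits for a single text, chosen only for illustration.
example_logits = [-2.1, -0.3, 0.4, 2.7, 1.2]
label, score = label_with_score(example_logits)
print(f"Sentiment: {label}, score: {score:.4f}")
```

Returning the probability next to the label lets downstream code treat low-confidence predictions differently, for example by routing them to manual review.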

Advanced Usage (JavaScript Example)

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Tabularis Sentiment Analysis</title>
</head>
<body>
    <div id="output"></div>

    <script type="module">
        import { AutoTokenizer, AutoModel, env } from 'https://cdn.jsdelivr.net/npm/@xenova/transformers@2.6.0';

        env.allowLocalModels = false;
        env.useCDN = true;

        const MODEL_NAME = 'tabularisai/robust-sentiment-analysis';

        function softmax(arr) {
            const max = Math.max(...arr);
            const exp = arr.map(x => Math.exp(x - max));
            const sum = exp.reduce((acc, val) => acc + val);
            return exp.map(x => x / sum);
        }

        async function analyzeSentiment() {
            try {
                const tokenizer = await AutoTokenizer.from_pretrained(MODEL_NAME);
                const model = await AutoModel.from_pretrained(MODEL_NAME);

                const texts = [
                    "I absolutely loved this movie! The acting was superb and the plot was engaging.",
                    "The service at this restaurant was terrible. I'll never go back.",
                    "The product works as expected. Nothing special, but it gets the job done.",
                    "I'm somewhat disappointed with my purchase. It's not as good as I hoped.",
                    "This book changed my life! I couldn't put it down and learned so much."
                ];

                const output = document.getElementById('output');

                for (const text of texts) {
                    const inputs = await tokenizer(text, { return_tensors: 'pt' });
                    const result = await model(inputs);
                    
                    console.log('Model output:', result);

                    if (result.output && result.output.data) {
                        const logitsArray = Array.from(result.output.data);
                        console.log('Logits array:', logitsArray);

                        const probabilities = softmax(logitsArray);
                        const predicted_class = probabilities.indexOf(Math.max(...probabilities));

                        const sentimentMap = {
                            0: "Very Negative",
                            1: "Negative",
                            2: "Neutral",
                            3: "Positive",
                            4: "Very Positive"
                        };

                        const sentiment = sentimentMap[predicted_class];
                        const score = probabilities[predicted_class];

                        output.innerHTML += `Text: "${text}"<br>`;
                        output.innerHTML += `Sentiment: ${sentiment}, Score: ${score.toFixed(4)}<br><br>`;
                    } else {
                        console.error('Unexpected model output structure:', result);
                        output.innerHTML += `Unable to process: "${text}"<br><br>`;
                    }
                }
            } catch (error) {
                console.error('Error:', error);
                document.getElementById('output').innerHTML = 'An error occurred. Please check the console for details.';
            }
        }

        analyzeSentiment();
    </script>
</body>
</html>