robust-sentiment-analysis: an open-source sentiment analysis model with five sentiment classes


Robust Sentiment Analysis

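Developed by tabularisai

A sentiment analysis model fine-tuned from distilbert/distilbert-base-uncased. It is trained exclusively on synthetic data and supports five sentiment classes.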

Text Classification

Transformers

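Language: English | License: Apache-2.0 | Tags: #synthetic-data-training #five-level-sentiment-classification #social-media-analysis

Downloads: 2,632

Released: 7/23/2024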

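Model Overview

This model is a classifier for sentiment analysis of English text. It assigns each text to one of five sentiment classes: Very Negative, Negative, Neutral, Positive, or Very Positive.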

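Model Features

Synthetic data training

Trained exclusively on synthetic data, avoiding common limitations of real-world datasets

Multi-class sentiment analysis

Supports fine-grained classification into five sentiment classes (Very Negative through Very Positive)

High performance

Achieves a train_acc_off_by_one accuracy of about 0.95 on the validation set

Lightweight

Built on the DistilBERT architecture, which is lighter and more efficient than a full BERT model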
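The page does not define the train_acc_off_by_one metric. The sketch below assumes the usual reading of "off by one" for ordered classes: a prediction counts as correct when its class index is at most one step away from the true index. The function name and the sample labels are hypothetical and are not taken from the model's training code.

```python
def off_by_one_accuracy(y_true, y_pred):
    """Fraction of predictions within one class of the true label.

    Assumes labels are ordered integer class indices
    (0 = Very Negative ... 4 = Very Positive).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")
    hits = sum(abs(t - p) <= 1 for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Hypothetical labels, for illustration only.
y_true = [0, 1, 2, 3, 4, 4]
y_pred = [1, 1, 4, 3, 3, 0]

# Four of the six predictions are within one class of the true label.
print(off_by_one_accuracy(y_true, y_pred))
```

Under this reading the metric is always at least as high as exact-match accuracy, so a value of about 0.95 does not mean that 95% of texts receive exactly the right class.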

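Model Capabilities

Text sentiment classification

Social media sentiment analysis

Product review classification

Customer feedback analysis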

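Use Cases

Business analytics

Social media monitoring

Analyze public sentiment toward a brand or product on social media

Helps brands understand public sentiment and adjust marketing strategy promptly

Customer feedback analysis

Automatically classify the sentiment of customer feedback

Quickly identify dissatisfied customers and improve customer service quality

Market research

Product review analysis

Analyze the sentiment of product reviews on e-commerce platforms

Understand product strengths and weaknesses to guide product improvement

Competitive intelligence analysis

Compare user sentiment toward competitors' products

Gain insight into competitive advantages in the market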
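Several of the use cases above reduce to the same post-processing step: count how often each label occurs and pull out the texts predicted as negative. The following is a minimal sketch of that step. The helper name, the English label strings, and the sample feedback are assumptions for illustration. In practice the (text, label) pairs would come from the model's predictions.

```python
from collections import Counter

# Labels treated as a sign of dissatisfaction (an assumption of this sketch).
NEGATIVE_LABELS = {"Very Negative", "Negative"}


def summarize(predictions):
    """Count labels and collect the texts predicted as negative.

    predictions: iterable of (text, label) pairs.
    Returns (Counter of labels, list of flagged texts).
    """
    predictions = list(predictions)
    counts = Counter(label for _, label in predictions)
    flagged = [text for text, label in predictions if label in NEGATIVE_LABELS]
    return counts, flagged


# Hypothetical (text, label) pairs standing in for real model output.
feedback = [
    ("The app crashes every time I open it.", "Very Negative"),
    ("Delivery was a day late.", "Negative"),
    ("Works as described.", "Neutral"),
    ("Support resolved my issue quickly.", "Positive"),
]

counts, flagged = summarize(feedback)
print(dict(counts))
print(flagged)
```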

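🚀 (distil)BERT-based Sentiment Classification Model: Unleashing the Power of Synthetic Data

This model is built on the (distil)BERT architecture and trained on synthetic data. It classifies the sentiment of text and applies to areas such as social media analysis and customer feedback analysis.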

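🚀 Quick Start

Model Information

Property	Details
Model name	tabularisai/robust-sentiment-analysis
Base model	distilbert/distilbert-base-uncased
Task	Text classification (sentiment analysis)
Language	English
Number of classes	5 (Very Negative, Negative, Neutral, Positive, Very Positive)
Use cases	Social media analysis, customer feedback analysis, product review classification, brand monitoring, market research, customer service optimization, competitive intelligence analysis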

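✨ Key Features

Trained on synthetic data: the model is trained exclusively on synthetic data, which allows targeted training on a wide range of sentiment expressions without the constraints of real-world datasets.
Works across scenarios: suited to social media monitoring, customer feedback analysis, sentiment classification of product reviews, brand sentiment tracking, and similar scenarios.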

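📦 Installation

The original documentation gives no installation commands. The Python example below imports transformers and torch, so both packages must be installed first.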

💻 Usage Examples

Basic usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the model and tokenizer
model_name = "tabularisai/robust-sentiment-analysis"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Predict the sentiment label for a single text
def predict_sentiment(text):
    inputs = tokenizer(text.lower(), return_tensors="pt", truncation=True, padding=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    
    probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(probabilities, dim=-1).item()
    
    sentiment_map = {0: "Very Negative", 1: "Negative", 2: "Neutral", 3: "Positive", 4: "Very Positive"}
    return sentiment_map[predicted_class]

# Example usage
texts = [
    "I absolutely loved this movie! The acting was superb and the plot was engaging.",
    "The service at this restaurant was terrible. I'll never go back.",
    "The product works as expected. Nothing special, but it gets the job done.",
    "I'm somewhat disappointed with my purchase. It's not as good as I hoped.",
    "This book changed my life! I couldn't put it down and learned so much."
]

for text in texts:
    sentiment = predict_sentiment(text)
    print(f"Text: {text}")
    print(f"Sentiment: {sentiment}\n")
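The basic example returns only the label, while the JavaScript example further down also reports the probability of the winning class. The sketch below shows the same softmax-then-argmax step in plain Python, so the score is available as well. The helper names and the logit values are hypothetical. The index order follows the sentiment_map in the basic example.

```python
import math

# Index order follows the sentiment_map in the basic usage example.
LABELS = ["Very Negative", "Negative", "Neutral", "Positive", "Very Positive"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_with_score(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Hypothetical logits for one text. With the model loaded as in the basic
# example, they could be obtained from outputs.logits[0].tolist().
label, score = label_with_score([-2.1, -0.4, 0.3, 2.5, 1.0])
print(f"{label} ({score:.4f})")
```

Returning the probability alongside the label makes it possible to treat low-confidence predictions differently, for example by routing them to manual review.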

Advanced usage (JavaScript example)

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Tabularis Sentiment Analysis</title>
</head>
<body>
    <div id="output"></div>

    <script type="module">
        import { AutoTokenizer, AutoModel, env } from 'https://cdn.jsdelivr.net/npm/@xenova/transformers@2.6.0';

        env.allowLocalModels = false;
        env.useCDN = true;

        const MODEL_NAME = 'tabularisai/robust-sentiment-analysis';

        function softmax(arr) {
            const max = Math.max(...arr);
            const exp = arr.map(x => Math.exp(x - max));
            const sum = exp.reduce((acc, val) => acc + val);
            return exp.map(x => x / sum);
        }

        async function analyzeSentiment() {
            try {
                const tokenizer = await AutoTokenizer.from_pretrained(MODEL_NAME);
                const model = await AutoModel.from_pretrained(MODEL_NAME);

                const texts = [
                    "I absolutely loved this movie! The acting was superb and the plot was engaging.",
                    "The service at this restaurant was terrible. I'll never go back.",
                    "The product works as expected. Nothing special, but it gets the job done.",
                    "I'm somewhat disappointed with my purchase. It's not as good as I hoped.",
                    "This book changed my life! I couldn't put it down and learned so much."
                ];

                const output = document.getElementById('output');

                for (const text of texts) {
                    const inputs = await tokenizer(text, { return_tensors: 'pt' });
                    const result = await model(inputs);
                    
                    console.log('Model output:', result);

                    if (result.output && result.output.data) {
                        const logitsArray = Array.from(result.output.data);
                        console.log('Logits array:', logitsArray);

                        const probabilities = softmax(logitsArray);
                        const predicted_class = probabilities.indexOf(Math.max(...probabilities));

                        const sentimentMap = {
                            0: "Very Negative",
                            1: "Negative",
                            2: "Neutral",
                            3: "Positive",
                            4: "Very Positive"
                        };

                        const sentiment = sentimentMap[predicted_class];
                        const score = probabilities[predicted_class];

                        output.innerHTML += `Text: "${text}"<br>`;
                        output.innerHTML += `Sentiment: ${sentiment}, Score: ${score.toFixed(4)}<br><br>`;
                    } else {
                        console.error('Unexpected model output structure:', result);
                        output.innerHTML += `Unable to process: "${text}"<br><br>`;
                    }
                }
            } catch (error) {
                console.error('Error:', error);
                document.getElementById('output').innerHTML = 'An error occurred. See the console for details.';
            }
        }

        analyzeSentiment();
    </script>
</body>
</html>