Robust-sentiment-analysis Open-source Sentiment Analysis Model - A Practical Tool Supporting 5 Sentiment Classifications

Robust Sentiment Analysis

Developed by tabularisai

A sentiment analysis model fine-tuned from distilbert/distilbert-base-uncased, trained solely on synthetic data, that classifies text into 5 sentiment classes.

Text Classification

Transformers

English

Open Source License: Apache-2.0

#Synthetic Data Training #Five-level Sentiment Classification #Social Media Analysis

Downloads 2,632

Release Time: 7/23/2024

Model Overview

This model is a classifier for English text sentiment analysis, capable of categorizing text into five sentiment categories: very negative, negative, neutral, positive, and very positive.

Model Features

Synthetic Data Training

Trained exclusively on synthetic data, avoiding common limitations of real-world datasets.

Multi-category Sentiment Analysis

Supports fine-grained classification into 5 sentiment categories (from very negative to very positive).

High Performance

Achieved a train_acc_off_by_one of approximately 0.95 on the validation set.

Lightweight

Based on the DistilBERT architecture, more lightweight and efficient than the full BERT model.

Model Capabilities

Text Sentiment Classification

Social Media Sentiment Analysis

Product Review Classification

Customer Feedback Analysis

Use Cases

Business Analysis

Social Media Monitoring

Analyze public sentiment trends about brands or products on social media.

Helps brands understand public sentiment and adjust marketing strategies promptly.

Customer Feedback Analysis

Automatically classify the sentiment tendencies of customer feedback.

Quickly identify dissatisfied customers and improve customer service quality.
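One way to surface dissatisfied customers is to keep only the feedback whose predicted label falls in the two negative classes. The sketch below is a minimal illustration, not part of the model card: `flag_dissatisfied` and `fake_predict` are hypothetical names, and `fake_predict` is a stand-in so the example runs without downloading the model. In practice you would pass a real classifier function that returns one of the five label strings.

```python
from typing import Callable, Iterable, List

# The two negative classes among the model's five labels.
NEGATIVE_LABELS = {"Very Negative", "Negative"}


def flag_dissatisfied(
    feedback: Iterable[str],
    predict: Callable[[str], str],
) -> List[str]:
    """Return the feedback items whose predicted label is negative."""
    return [text for text in feedback if predict(text) in NEGATIVE_LABELS]


# Hypothetical stand-in for the real classifier, used only so this
# sketch runs offline. It is NOT how the model decides.
def fake_predict(text: str) -> str:
    return "Very Negative" if "terrible" in text.lower() else "Neutral"


flagged = flag_dissatisfied(
    ["The service was terrible.", "Delivery arrived on time."],
    fake_predict,
)
print(flagged)
```

Passing the classifier in as a parameter keeps the filtering logic testable on its own and independent of how the model is loaded.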

Market Research

Product Review Analysis

Analyze sentiment in product reviews on e-commerce platforms.

Understand product strengths and weaknesses to guide product improvements.

Competitive Intelligence Analysis

Compare user sentiment feedback on competitors' products.

Gain insights for competitive market advantages.

🚀 (distil)BERT-based Sentiment Classification Model: Unleashing the Power of Synthetic Data

This is a sentiment classification model based on (distil)BERT, leveraging synthetic data to provide high-performance sentiment analysis. It can be applied in various scenarios such as social media analysis and customer feedback analysis.

🚀 Quick Start

Python Example

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "tabularisai/robust-sentiment-analysis"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Function to predict sentiment
def predict_sentiment(text):
    inputs = tokenizer(text.lower(), return_tensors="pt", truncation=True, padding=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    
    probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(probabilities, dim=-1).item()
    
    sentiment_map = {0: "Very Negative", 1: "Negative", 2: "Neutral", 3: "Positive", 4: "Very Positive"}
    return sentiment_map[predicted_class]

# Example usage
texts = [
    "I absolutely loved this movie! The acting was superb and the plot was engaging.",
    "The service at this restaurant was terrible. I'll never go back.",
    "The product works as expected. Nothing special, but it gets the job done.",
    "I'm somewhat disappointed with my purchase. It's not as good as I hoped.",
    "This book changed my life! I couldn't put it down and learned so much."
]

for text in texts:
    sentiment = predict_sentiment(text)
    print(f"Text: {text}")
    print(f"Sentiment: {sentiment}\n")
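The function above returns only the label. If a confidence value is also useful, the same softmax step can be applied to the raw logits and the top probability reported next to the label. The sketch below uses plain Python so it runs without the model; `label_with_confidence` is a hypothetical helper, and the logits are made-up values standing in for something like `outputs.logits[0].tolist()`.

```python
import math

SENTIMENT_LABELS = [
    "Very Negative", "Negative", "Neutral", "Positive", "Very Positive",
]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_with_confidence(logits):
    """Map five raw logits to (label, probability of that label)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return SENTIMENT_LABELS[best], probs[best]


# Made-up logits for illustration only.
label, confidence = label_with_confidence([-2.0, -1.0, 0.5, 3.0, 1.0])
print(label, round(confidence, 3))
```

A low top probability can be used as a signal to route a text to manual review instead of trusting the label.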

JavaScript Example

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Tabularis Sentiment Analysis</title>
</head>
<body>
    <div id="output"></div>

    <script type="module">
        import { AutoTokenizer, AutoModel, env } from 'https://cdn.jsdelivr.net/npm/@xenova/transformers@2.6.0';

        env.allowLocalModels = false;
        env.useCDN = true;

        const MODEL_NAME = 'tabularisai/robust-sentiment-analysis';

        function softmax(arr) {
            const max = Math.max(...arr);
            const exp = arr.map(x => Math.exp(x - max));
            const sum = exp.reduce((acc, val) => acc + val);
            return exp.map(x => x / sum);
        }

        async function analyzeSentiment() {
            try {
                const tokenizer = await AutoTokenizer.from_pretrained(MODEL_NAME);
                const model = await AutoModel.from_pretrained(MODEL_NAME);

                const texts = [
                    "I absolutely loved this movie! The acting was superb and the plot was engaging.",
                    "The service at this restaurant was terrible. I'll never go back.",
                    "The product works as expected. Nothing special, but it gets the job done.",
                    "I'm somewhat disappointed with my purchase. It's not as good as I hoped.",
                    "This book changed my life! I couldn't put it down and learned so much."
                ];

                const output = document.getElementById('output');

                for (const text of texts) {
                    const inputs = await tokenizer(text, { return_tensors: 'pt' });
                    const result = await model(inputs);
                    
                    console.log('Model output:', result);

                    if (result.output && result.output.data) {
                        const logitsArray = Array.from(result.output.data);
                        console.log('Logits array:', logitsArray);

                        const probabilities = softmax(logitsArray);
                        const predicted_class = probabilities.indexOf(Math.max(...probabilities));

                        const sentimentMap = {
                            0: "Very Negative",
                            1: "Negative",
                            2: "Neutral",
                            3: "Positive",
                            4: "Very Positive"
                        };

                        const sentiment = sentimentMap[predicted_class];
                        const score = probabilities[predicted_class];

                        output.innerHTML += `Text: "${text}"<br>`;
                        output.innerHTML += `Sentiment: ${sentiment}, Score: ${score.toFixed(4)}<br><br>`;
                    } else {
                        console.error('Unexpected model output structure:', result);
                        output.innerHTML += `Unable to process: "${text}"<br><br>`;
                    }
                }
            } catch (error) {
                console.error('Error:', error);
                document.getElementById('output').innerHTML = 'An error occurred. Please check the console for details.';
            }
        }

        analyzeSentiment();
    </script>
</body>
</html>

✨ Features

Multi-Class Classification: Capable of classifying sentiment into five classes: Very Negative, Negative, Neutral, Positive, and Very Positive.
Synthetic Data Utilization: Trained on synthetic data to cover a wide range of sentiment expressions.
Multiple Application Scenarios: Suitable for social media analysis, customer feedback analysis, product reviews classification, brand monitoring, market research, customer service optimization, and competitive intelligence.

📦 Installation

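The original model card does not list installation steps. The Python example above imports the transformers and torch packages, so both must be installed first (for example, pip install transformers torch).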

📚 Documentation

Model Details

Property	Details
Model Name	tabularisai/robust-sentiment-analysis
Model Type	(distil)BERT-based Sentiment Classification Model
Base Model	distilbert/distilbert-base-uncased
Task	Text Classification (Sentiment Analysis)
Language	English
Number of Classes	5 (Very Negative, Negative, Neutral, Positive, Very Positive)
Usage	Social media analysis, customer feedback analysis, product reviews classification, brand monitoring, market research, customer service optimization, competitive intelligence

Model Description

This model is a fine-tuned version of distilbert/distilbert-base-uncased for sentiment analysis, trained only on synthetic data.

Training Data

The model was fine-tuned on synthetic data, which allows for targeted training on a diverse range of sentiment expressions without the limitations often found in real-world datasets.

Training Procedure

The model was fine-tuned on synthetic data using the distilbert/distilbert-base-uncased architecture. The training process involved:

Dataset: Synthetic data designed to cover a wide range of sentiment expressions
Training framework: PyTorch Lightning
Number of epochs: 5
Performance metric: Achieved train_acc_off_by_one of approximately 0.95 on the validation dataset
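
The model card does not define train_acc_off_by_one. Assuming it is the usual off-by-one accuracy for ordered classes (a prediction counts as correct when its class index is at most one step from the true index), it could be computed as in the sketch below. The function name and the sample values are illustrative assumptions, not the authors' evaluation code.

```python
def off_by_one_accuracy(y_true, y_pred):
    """Share of predictions whose class index is within 1 of the true index."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("need at least one example")
    hits = sum(abs(t - p) <= 1 for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Class indices 0..4 (Very Negative .. Very Positive); values are made up.
y_true = [0, 1, 2, 3, 4, 4]
y_pred = [1, 1, 2, 4, 2, 4]
score = off_by_one_accuracy(y_true, y_pred)
print(score)
```

Under this reading, the metric is more lenient than exact accuracy: confusing Positive with Very Positive still counts as a hit, so a value near 0.95 does not imply the same exact-match accuracy.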

Intended Use

This model is designed for sentiment analysis tasks, particularly useful for social media monitoring, customer feedback analysis, product review sentiment classification, and brand sentiment tracking.
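
For monitoring or tracking use, individual predictions are usually aggregated over time. The sketch below is one possible approach, not something the model card prescribes: it maps the five labels onto an assumed numeric scale from -2 to 2 and averages the scores per day. `LABEL_SCORE`, `mean_score_by_day`, and the sample records are all illustrative.

```python
from collections import defaultdict

# Assumed numeric scale for the five labels (not part of the model card).
LABEL_SCORE = {
    "Very Negative": -2,
    "Negative": -1,
    "Neutral": 0,
    "Positive": 1,
    "Very Positive": 2,
}


def mean_score_by_day(records):
    """records: iterable of (day, label) pairs -> {day: mean score}."""
    totals = defaultdict(lambda: [0, 0])  # day -> [score sum, count]
    for day, label in records:
        totals[day][0] += LABEL_SCORE[label]
        totals[day][1] += 1
    return {day: s / n for day, (s, n) in totals.items()}


# Made-up predictions for two days of brand mentions.
records = [
    ("2024-07-23", "Positive"),
    ("2024-07-23", "Very Negative"),
    ("2024-07-24", "Very Positive"),
    ("2024-07-24", "Positive"),
]
trend = mean_score_by_day(records)
print(trend)
```

An evenly spaced scale treats the gap between Neutral and Positive the same as the gap between Positive and Very Positive. If that does not hold for your data, tracking the share of each label per day avoids the assumption.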

Ethical Considerations

While efforts have been made to create a balanced and fair model through the use of synthetic data, users should be aware that the model may still exhibit biases. It's crucial to thoroughly test the model in your specific use case and monitor its performance over time.
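
A simple way to test the model on your own use case is to hand-label a small sample and look at recall per class, since an aggregate score can hide one class that is consistently missed. The sketch below assumes you already have gold labels and model predictions as lists of label strings; `per_class_recall` is a hypothetical helper and the data is made up.

```python
from collections import Counter


def per_class_recall(y_true, y_pred, labels):
    """Recall per label: correct predictions / examples with that true label.

    Labels with no examples in y_true map to None.
    """
    support = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {
        label: (correct[label] / support[label]) if support[label] else None
        for label in labels
    }


LABELS = ["Very Negative", "Negative", "Neutral", "Positive", "Very Positive"]

# Made-up gold labels and predictions standing in for a hand-labeled sample.
y_true = ["Negative", "Negative", "Neutral", "Positive", "Very Positive"]
y_pred = ["Negative", "Neutral", "Neutral", "Positive", "Positive"]
recall = per_class_recall(y_true, y_pred, LABELS)
print(recall)
```

Re-running the same check on fresh samples at intervals gives a basic way to monitor performance over time.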

🔧 Technical Details

The model is based on the distilbert/distilbert-base-uncased architecture. During training, it used PyTorch Lightning as the training framework and was fine-tuned for 5 epochs on synthetic data. It achieved a train_acc_off_by_one of approximately 0.95 on the validation dataset.

📄 License

The model is released under the Apache-2.0 license.

📌 NEWS!

2024/12: We uploaded an even better and more robust sentiment model! The error rate is reduced by 10%, and overall accuracy is improved!

📞 Contact

For questions, or for a private and reliable API to our model, please contact info@tabularis.ai

📝 Citation

Will be included
