SaBERT Spanish Sentiment Analysis Open-Source Model - Free Detection of Positive or Negative Sentiments in Text

SaBERT Spanish Sentiment Analysis

Developed by VerificadoProfesional

A BERT-based Spanish sentiment analysis classifier used to detect positive and negative sentiments in text.

Text Classification

Transformers

Language: Spanish

Open Source License: Apache-2.0

#Spanish sentiment analysis #BERT fine-tuned model #Tweet sentiment detection

Downloads 2,553

Release Time: 4/24/2024

Model Overview

This model is a BERT-based text classifier specifically designed for sentiment analysis of Spanish texts and can effectively identify positive and negative sentiments.

Model Features

Based on the BERT architecture

Fine-tuned from the dccuchile/bert-base-spanish-wwm-uncased BERT model for sentiment analysis of Spanish texts.

High accuracy

Achieved 86.47% accuracy on the test set.

Trained with multi-regional data

The training data comprises 11,500 Spanish tweets, both positive and negative, collected from different regions.

Model Capabilities

Sentiment analysis of Spanish texts

Positive/negative sentiment classification

Use Cases

Social media analysis

Tweet sentiment analysis

Analyze the sentiment tendency of Spanish tweets for public opinion monitoring.

Accurately identify positive and negative sentiments.

Customer feedback analysis

Product review sentiment analysis

Analyze the sentiment tendency of Spanish product reviews to help improve products.

Effectively classify positive and negative reviews.

🚀 Spanish Sentiment Analysis Classifier

This BERT-based text classifier is designed to detect sentiments in Spanish, developed as a thesis project for the Computer Engineering degree at Universidad de Buenos Aires (UBA).

🚀 Quick Start

The model was fine-tuned from dccuchile/bert-base-spanish-wwm-uncased using the hyperparameters listed under Model Details. The training set contains 11,500 Spanish tweets from various regions, labeled positive or negative, drawn from a curated combination of TASS datasets.

✨ Features

Sentiment Detection: Specifically designed to detect sentiments in Spanish text.
Fine-Tuned Model: Fine-tuned from the dccuchile/bert-base-spanish-wwm-uncased model with specific hyperparameters.
Trained on Diverse Data: Trained on 11,500 Spanish tweets from various regions, sourced from TASS datasets.

📦 Installation

You can install the required dependencies using pip:

pip install transformers torch

💻 Usage Examples

Basic Usage

from transformers import BertForSequenceClassification, BertTokenizer
model = BertForSequenceClassification.from_pretrained("VerificadoProfesional/SaBERT-Spanish-Sentiment-Analysis")
tokenizer = BertTokenizer.from_pretrained("VerificadoProfesional/SaBERT-Spanish-Sentiment-Analysis")

Advanced Usage

import torch

def predict(model, tokenizer, text, threshold=0.5):
    # Tokenize the text, truncating to the 128-token max_length used in training
    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=128)
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(**inputs)

    logits = outputs.logits
    probabilities = torch.softmax(logits, dim=1).squeeze().tolist()

    # Take the most likely class, but fall back to class 0 when
    # class 1 wins with a probability at or below the threshold
    predicted_class = torch.argmax(logits, dim=1).item()
    if probabilities[predicted_class] <= threshold and predicted_class == 1:
        predicted_class = 0

    return bool(predicted_class), probabilities

text = "Your Spanish text here"
predicted_label, probabilities = predict(model, tokenizer, text)
print(f"Text: {text}")
print(f"Predicted Class: {predicted_label}")
print(f"Probabilities: {probabilities}")
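
The threshold rule inside predict can be followed without loading the model. The sketch below reimplements the softmax and the threshold check in plain Python on hand-picked logits. The helper names softmax and apply_threshold and the example logits are illustrative assumptions, not part of the released code.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def apply_threshold(logits, threshold=0.5):
    """Mirror the decision rule of predict(): pick the argmax class, but
    fall back to class 0 when class 1 wins with probability <= threshold."""
    probabilities = softmax(logits)
    predicted_class = max(range(len(probabilities)), key=probabilities.__getitem__)
    if predicted_class == 1 and probabilities[predicted_class] <= threshold:
        predicted_class = 0
    return bool(predicted_class), probabilities

# Hypothetical logits for a single text: class 1 wins, but only narrowly.
logits = [0.0, 0.4]
label_default, probs = apply_threshold(logits)                 # threshold = 0.5
label_strict, _ = apply_threshold(logits, threshold=0.8)       # stricter threshold
print(label_default, label_strict, [round(p, 3) for p in probs])
```

With only two classes, the winning class already has a probability above 0.5 (ties aside), so the default threshold effectively leaves the argmax decision unchanged. Raising the threshold makes the classifier more conservative about returning class 1.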

📚 Documentation

Model Details

Property	Details
Base Model	dccuchile/bert-base-spanish-wwm-uncased
Hyperparameters	dropout_rate = 0.1, num_classes = 2, max_length = 128, batch_size = 16, num_epochs = 5, learning_rate = 3e-5
Training Data	11,500 Spanish tweets (Positive and Negative)
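
The hyperparameters in the table can be collected in a single configuration object. The sketch below is one possible way to do that. The TrainingConfig class and the total_optimizer_steps helper are illustrative assumptions. The source does not state the train/test split, so the step estimate assumes, hypothetically, that all 11,500 tweets were used for training with no gradient accumulation.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainingConfig:
    # Values copied from the Model Details table
    base_model: str = "dccuchile/bert-base-spanish-wwm-uncased"
    dropout_rate: float = 0.1
    num_classes: int = 2
    max_length: int = 128
    batch_size: int = 16
    num_epochs: int = 5
    learning_rate: float = 3e-5

def total_optimizer_steps(config, num_examples):
    """Estimate optimizer steps: batches per epoch times number of epochs."""
    steps_per_epoch = math.ceil(num_examples / config.batch_size)
    return steps_per_epoch * config.num_epochs

config = TrainingConfig()
# ASSUMPTION: the full 11,500 tweets are treated as the training set here.
steps = total_optimizer_steps(config, num_examples=11_500)
print(config)
print(f"Estimated optimizer steps: {steps}")
```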

Metrics

The model's performance was evaluated using the following metrics:

Accuracy = 86.47%
F1-Score = 86.47%
Precision = 86.46%
Recall = 86.51%
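
For readers unfamiliar with how these four metrics relate, the sketch below computes them from a binary confusion matrix. The counts are made up for illustration and are not the model's actual test results. The source also does not say how the reported scores were averaged across the two classes, so this example covers only the single-class (positive class) definitions.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 for the positive class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts, NOT taken from the model's evaluation.
metrics = binary_metrics(tp=90, fp=10, fn=30, tn=70)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

When scores are averaged over classes (macro or weighted averaging), the averaged F1 is not necessarily the harmonic mean of the averaged precision and recall.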

📄 License

Apache License 2.0
TASS Dataset license

🔗 Acknowledgments

Special thanks to DCC UChile for the base Spanish BERT model and to all contributors to the dataset used for training.
