🚀 Model Card: CLF-SENTIMENTOS-CMTS finetuned XLM-RoBERTa
This model, clf-sentimentos-cmts, applies machine learning to a specific natural language processing (NLP) task: classifying the sentiment of Brazilian Portuguese social media texts, including emoji processing. It is an adaptation of XLM-RoBERTa, a highly effective and robust Transformer architecture pre-trained on a vast multilingual dataset.
✨ Features
- Fine-tuning for Specific Tasks: Unlike standard language model training, fine-tuning tbluhm/clf-sentimentos-cmts adjusts the XLM-RoBERTa parameters on a specific dataset, optimizing it for sentiment classification of Brazilian Portuguese text, including emoji interpretation. The diverse dataset includes comments from the profiles of politicians, artists, and automotive companies, reflecting a wide range of social media contexts and linguistic expressions in Brazil.
- Deep Contextual Analysis: When fed a social media comment, the model deeply analyzes each word and emoji, considering the global context of the text. Using attention mechanisms, it weighs the importance of each element in relation to the overall sentiment expressed in the comment. This approach enables accurate classification, assigning a sentiment label based on the text's contextual and semantic understanding, including emoji interpretation.
- Emoji-Aware Sentiment Classification: The model recognizes positive sentiment when users express satisfaction with smiley emojis and negative sentiment when users express dissatisfaction or criticism with sad emojis. Comments that express no clear emotion, or that are purely informative, are labeled neutral.
- Wide Range of Applications: Beyond classifying sentiments in Brazilian Portuguese social media comments, the model has many potential applications. Companies can use it to monitor public perception of their products and services on social media, identify emerging trends, and find areas for improvement. It can also support automated content moderation by filtering out negative or inappropriate comments.
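For the monitoring use case above, per-comment predictions can be rolled up into simple counts. This is a minimal sketch that assumes the pipeline returns dicts with `label` and `score` keys; `summarize_sentiments` is a hypothetical helper, and the exact label strings depend on the model's configuration and are assumptions here:

```python
from collections import Counter

def summarize_sentiments(results):
    """Count how often each sentiment label appears in a batch of
    pipeline outputs of the form {"label": ..., "score": ...}."""
    return Counter(r["label"] for r in results)

# Mocked pipeline outputs (label names are assumed):
mock_results = [
    {"label": "POSITIVE", "score": 0.97},
    {"label": "NEGATIVE", "score": 0.88},
    {"label": "POSITIVE", "score": 0.91},
]
print(summarize_sentiments(mock_results))
```

In practice, `mock_results` would be replaced by the output of running the pipeline over a batch of collected comments.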
📦 Installation
The original model card does not list installation steps.
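Assuming a standard Python environment, installing the Transformers library and a backend such as PyTorch is likely all that is required (versions are not pinned in the original card):

```shell
pip install transformers torch
```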
💻 Usage Examples
Basic Usage
Here is an example of how to use the model in Python with the Transformers library:
```python
from transformers import pipeline

# Load a text-classification pipeline backed by this model
analise_sentimento = pipeline("text-classification", model="tbluhm/clf-sentimentos-cmts")

texto = "Excelente notícia para todos os brasileiros!"
resultado = analise_sentimento(texto)
print(resultado)
```
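The pipeline returns a list of dicts with `label` and `score` keys. When downstream decisions hinge on a prediction, it can help to fall back to neutral for low-confidence outputs. A minimal sketch, assuming that output shape; the 0.5 threshold, the fallback label, and the label strings are illustrative assumptions, not part of the model card:

```python
def label_with_threshold(result, min_score=0.5, fallback="NEUTRAL"):
    """Return the predicted label, or a fallback label when the
    model's confidence score is below the threshold."""
    return result["label"] if result["score"] >= min_score else fallback

# Mocked pipeline outputs (label strings are assumptions):
print(label_with_threshold({"label": "POSITIVE", "score": 0.93}))  # POSITIVE
print(label_with_threshold({"label": "NEGATIVE", "score": 0.41}))  # NEUTRAL
```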
📚 Documentation
Model Origin
This model is a fine-tuned version of xlm-roberta-base-tweet-sentiment-pt.
Performance Metrics
The model achieves the following results on the evaluation set:
- Loss: 0.7189
- Accuracy: 0.6467
- F1: 0.5588
Model Objective
The objective of this model is to classify the sentiment of short texts into categories such as positive, negative, or neutral. It can be used in various applications, including social media sentiment analysis, product reviews, and customer feedback.
Intended Use
To use the model, provide a short text as input to the text-classification pipeline. The model classifies the text's sentiment as Positive, Negative, or Neutral.
Training Data
The model was fine-tuned on a dataset composed of product reviews, tweets, and other short-text sources in various languages. The training dataset includes over 1 million labeled examples.
Limitations and Ethical Considerations
The model may not capture all aspects of human sentiment and will not be accurate in every situation. It may also reflect biases present in the training data. Use it with caution and keep these limitations in mind.
🔧 Technical Details
Training Procedure
Training Hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2
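As a sketch, the hyperparameters above would map onto the Transformers `TrainingArguments` roughly as follows. The actual training script is not part of this card, so `output_dir` and any settings not listed above are assumed placeholders:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the training configuration from the
# hyperparameters listed above; output_dir is an assumed placeholder.
args = TrainingArguments(
    output_dir="clf-sentimentos-cmts",
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=2,
)
```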
Training Results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|---|
| 0.7039 | 1.0 | 9 | 0.7650 | 0.6413 | 0.5526 |
| 0.6487 | 2.0 | 18 | 0.7189 | 0.6467 | 0.5588 |
Framework Versions
- Transformers 4.38.2
- Pytorch 2.2.1+cpu
- Datasets 2.18.0
- Tokenizers 0.15.2
📄 License
The model is released under the MIT license.
📖 Citation
Author: Thiago D. Faria Bluhm. (2024).
Adapted from: [XLM-RoBERTa](https://huggingface.co/FacebookAI/xlm-roberta-base).
Acknowledgments
Contributors: Wesley Dos Anjos, Pedro Lustosa, Amanda Rangel, Audrey Marx, Gabriel Leal, and Tiago Vettorazi.
Widget Examples
| Example Title | Text |
|---|---|
| Positive | Eu gostei muito daquele ator no filme. |
| Negative | Esse político é uma pessoa sem escrúpulos. |