# Indonesian RoBERTa Base Sentiment Classifier
The Indonesian RoBERTa Base Sentiment Classifier is a sentiment text-classification model for Indonesian. It classifies the sentiment of Indonesian texts, making it an effective tool for analyzing Indonesian comments and reviews.
## Quick Start
The Indonesian RoBERTa Base Sentiment Classifier is a sentiment text-classification model based on RoBERTa. It was initialized from the pre-trained Indonesian RoBERTa Base model and then fine-tuned on the SmSA dataset of indonlu, which consists of Indonesian comments and reviews.
After training, the model achieved an evaluation accuracy of 94.36% and an F1-macro of 92.42%. On the benchmark test set, it achieved an accuracy of 93.2% and an F1-macro of 91.02%.
The `Trainer` class from Hugging Face's Transformers library was used to train the model. PyTorch served as the backend framework during training, but the model remains compatible with other frameworks.
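As a rough illustration of the setup described above, the sketch below shows how a comparable fine-tuning run could be assembled with `Trainer`. The base checkpoint name, the `indonlu`/`smsa` dataset identifiers, and all hyperparameters are assumptions chosen for illustration, not the author's exact configuration.

```python
# Hedged sketch of fine-tuning an Indonesian RoBERTa base checkpoint on SmSA.
# Checkpoint name, dataset identifiers, and hyperparameters are assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base_model = "flax-community/indonesian-roberta-base"  # assumed base checkpoint
dataset = load_dataset("indonlu", "smsa")  # SmSA subset of indonlu (text, label)

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=3)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True)

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="indonesian-roberta-base-sentiment",
    num_train_epochs=5,           # matches the 5 epochs reported below
    eval_strategy="epoch",        # `evaluation_strategy` on older transformers
    save_strategy="epoch",
    load_best_model_at_end=True,  # "the best model was loaded at the end"
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()
```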
## Features
- Based on the RoBERTa model, which has strong text-understanding capabilities.
- Fine-tuned on the Indonesian SmSA dataset, making it suitable for classifying the sentiment of Indonesian texts.
- Achieved high accuracy and F1-macro scores in evaluation.
- Compatible with multiple frameworks despite being trained with PyTorch as the backend.
## Model
| Property | Details |
|----------|---------|
| Model Type | indonesian-roberta-base-sentiment-classifier |
| #params | 124M |
| Architecture | RoBERTa Base |
| Training/Validation data (text) | SmSA |
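If you want to verify the parameter count in the table, a minimal check against the published checkpoint (named as in the usage example below) could look like this:

```python
# Minimal sketch: count parameters of the published checkpoint (~124M expected).
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "w11wo/indonesian-roberta-base-sentiment-classifier"
)
total = sum(p.numel() for p in model.parameters())
print(f"{total / 1e6:.0f}M parameters")
```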
## Evaluation Results
The model was trained for 5 epochs, and the best model was loaded at the end of training.
| Epoch | Training Loss | Validation Loss | Accuracy | F1 | Precision | Recall |
|-------|---------------|-----------------|----------|----|-----------|--------|
| 1 | 0.342600 | 0.213551 | 0.928571 | 0.898539 | 0.909803 | 0.890694 |
| 2 | 0.190700 | 0.213466 | 0.934127 | 0.901135 | 0.925297 | 0.882757 |
| 3 | 0.125500 | 0.219539 | 0.942857 | 0.920901 | 0.927511 | 0.915193 |
| 4 | 0.083600 | 0.235232 | 0.943651 | 0.924227 | 0.926494 | 0.922048 |
| 5 | 0.059200 | 0.262473 | 0.942063 | 0.920583 | 0.924084 | 0.917351 |
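The F1, precision, and recall columns above are consistent with macro averaging (the card reports F1-macro). One conventional way to produce such metrics with scikit-learn is sketched below; whether the author used exactly this helper is an assumption, but a function of this shape can be passed to `Trainer` via `compute_metrics=`.

```python
# Hedged sketch: metrics helper producing accuracy, macro F1, precision, recall.
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="macro"
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }
```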
## Usage Examples
### Basic Usage
```python
from transformers import pipeline

pretrained_name = "w11wo/indonesian-roberta-base-sentiment-classifier"

nlp = pipeline(
    "sentiment-analysis",
    model=pretrained_name,
    tokenizer=pretrained_name
)

nlp("Jangan sampai saya telpon bos saya ya!")
```
## License
This project is licensed under the MIT License.
## Important Note
Consider the biases that come from both the pre-trained RoBERTa model and the SmSA dataset, which may carry over into the results of this model.
## Author
The Indonesian RoBERTa Base Sentiment Classifier was trained and evaluated by Wilson Wongso. All computation and development were done on Google Colaboratory using their free GPU access.
## Citation
If used, please cite the following:
```bibtex
@misc{wilson_wongso_2023,
  author    = { {Wilson Wongso} },
  title     = { indonesian-roberta-base-sentiment-classifier (Revision e402e46) },
  year      = 2023,
  url       = { https://huggingface.co/w11wo/indonesian-roberta-base-sentiment-classifier },
  doi       = { 10.57967/hf/0644 },
  publisher = { Hugging Face }
}
```