# 🚀 Indonesian RoBERTa Base Sentiment Classifier
A sentiment text classification model based on the RoBERTa architecture, fine-tuned for Indonesian text.
## 🚀 Quick Start
The Indonesian RoBERTa Base Sentiment Classifier is a sentiment text classification model based on RoBERTa. It starts from the pre-trained [Indonesian RoBERTa Base](https://hf.co/flax-community/indonesian-roberta-base) model and is then fine-tuned on the `SmSA` dataset from `indonlu`, which consists of Indonesian comments and reviews.
After training, the model reached an evaluation accuracy of 94.36% and an F1-macro of 92.42%. On the benchmark test set, it achieved an accuracy of 93.2% and an F1-macro of 91.02%.
The `Trainer` class from Hugging Face's Transformers library was used to train the model. PyTorch served as the backend framework during training, but the trained model remains compatible with other frameworks.
## ✨ Features

- Based on the powerful RoBERTa architecture.
- Fine-tuned on an Indonesian dataset for better performance on Indonesian text.
- Achieves high evaluation accuracy and F1-macro scores.
- Compatible with multiple frameworks.
## 📦 Installation

Install the Hugging Face Transformers library and a backend such as PyTorch, e.g. `pip install transformers torch`.
## 💻 Usage Examples

### Basic Usage
```python
from transformers import pipeline

pretrained_name = "sahri/sentiment"

nlp = pipeline(
    "sentiment-analysis",
    model=pretrained_name,
    tokenizer=pretrained_name
)

nlp("tidak jelek tapi keren")  # "not bad, actually cool"
```
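The pipeline returns a list of dictionaries with `label` and `score` keys. Below is a minimal post-processing sketch; the output values shown are hypothetical, and the actual label strings depend on the model's configuration (the `SmSA` task uses positive/neutral/negative classes):

```python
# Hypothetical pipeline output, for illustration only; real labels and
# scores come from running the pipeline above.
results = [{"label": "positive", "score": 0.9871}]

def top_prediction(results, threshold=0.5):
    """Return the highest-scoring label, or None if its confidence
    falls below the given threshold."""
    best = max(results, key=lambda r: r["score"])
    return best["label"] if best["score"] >= threshold else None

print(top_prediction(results))  # positive
```

Thresholding like this is one simple way to flag low-confidence predictions for manual review instead of trusting every model output.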
## 📚 Documentation

### Model

| Property | Details |
|---|---|
| Model Type | `indonesian-roberta-base-sentiment-classifier` |
| #params | 124M |
| Architecture | RoBERTa Base |
| Training/Validation data (text) | `SmSA` |
### Evaluation Results

The model was trained for 5 epochs, and the best model was loaded at the end.

| Epoch | Training Loss | Validation Loss | Accuracy | F1 | Precision | Recall |
|---|---|---|---|---|---|---|
| 1 | 0.342600 | 0.213551 | 0.928571 | 0.898539 | 0.909803 | 0.890694 |
| 2 | 0.190700 | 0.213466 | 0.934127 | 0.901135 | 0.925297 | 0.882757 |
| 3 | 0.125500 | 0.219539 | 0.942857 | 0.920901 | 0.927511 | 0.915193 |
| 4 | 0.083600 | 0.235232 | 0.943651 | 0.924227 | 0.926494 | 0.922048 |
| 5 | 0.059200 | 0.262473 | 0.942063 | 0.920583 | 0.924084 | 0.917351 |
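The F1 scores reported above are macro-averaged: F1 is computed per class and then averaged with equal weight, so minority classes count as much as the majority class. A minimal pure-Python sketch of the metric:

```python
def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores.append(2 * precision * recall / (precision + recall)
                      if precision + recall else 0.0)
    return sum(scores) / len(scores)

# Perfect predictions give an F1-macro of 1.0.
print(f1_macro(["pos", "neg", "neu"], ["pos", "neg", "neu"]))  # 1.0
```

In practice a library implementation (e.g. `sklearn.metrics.f1_score` with `average="macro"`) would be used; the sketch above only illustrates what the table's F1 column measures.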
## 🔧 Technical Details

The model was trained using the `Trainer` class from Hugging Face's Transformers library. PyTorch was used as the backend framework during training, but the model is compatible with other frameworks.
## 📄 License

This project is licensed under the MIT license.
## ⚠️ Important Note

Consider the biases that come from both the pre-trained RoBERTa model and the `SmSA` dataset, which may carry over into this model's predictions.
## Author

The Indonesian RoBERTa Base Sentiment Classifier was trained and evaluated by sahri ramadhan. All computation and development were done on Google Colaboratory using their free GPU access.