# Indonesian RoBERTa Base
Indonesian RoBERTa Base is a masked language model based on the RoBERTa architecture, trained from scratch on Indonesian text from the OSCAR dataset.
## 🚀 Quick Start
Indonesian RoBERTa Base was trained from scratch on the OSCAR dataset, specifically the `unshuffled_deduplicated_id` subset. It achieved an evaluation loss of 1.798 and an evaluation accuracy of 62.45%.
This model was trained using HuggingFace's Flax framework as part of the JAX/Flax Community Week organized by HuggingFace. All training was done on a TPUv3-8 VM sponsored by the Google Cloud team.

All scripts used for training can be found in the Files and versions tab, along with the training metrics logged via TensorBoard.
## ✨ Features
- Based on the RoBERTa architecture, a powerful masked language model.
- Trained from scratch on the OSCAR dataset for Indonesian text.
- Achieved an evaluation loss of 1.798 and an evaluation accuracy of 62.45%.
## 📦 Installation
No model-specific installation is required; the model is loaded through the Hugging Face `transformers` library, as shown in the usage examples below.
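A minimal setup for running the examples (an assumption, not part of the original card; the PyTorch example additionally needs `torch`):

```bash
pip install transformers torch
```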
## 💻 Usage Examples
### Basic Usage
#### As Masked Language Model
```python
from transformers import pipeline

pretrained_name = "flax-community/indonesian-roberta-base"

# Load the fill-mask pipeline with the pretrained model and tokenizer
fill_mask = pipeline(
    "fill-mask",
    model=pretrained_name,
    tokenizer=pretrained_name
)

# Predict the masked token ("Budi is <mask> at school.")
fill_mask("Budi sedang <mask> di sekolah.")
```
#### Feature Extraction in PyTorch
```python
from transformers import RobertaModel, RobertaTokenizerFast

pretrained_name = "flax-community/indonesian-roberta-base"

model = RobertaModel.from_pretrained(pretrained_name)
tokenizer = RobertaTokenizerFast.from_pretrained(pretrained_name)

# "Budi is at school."
prompt = "Budi sedang berada di sekolah."
encoded_input = tokenizer(prompt, return_tensors='pt')
output = model(**encoded_input)
```
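The `output` above holds the contextual token embeddings in `last_hidden_state`. A common follow-up, sketched below as an assumption rather than part of the original card, is to mean-pool those embeddings into a single sentence vector:

```python
import torch

# Mean-pool the final hidden states into one sentence embedding.
# last_hidden_state has shape (batch_size, seq_len, hidden_size).
with torch.no_grad():
    output = model(**encoded_input)

sentence_embedding = output.last_hidden_state.mean(dim=1)
print(sentence_embedding.shape)  # torch.Size([1, 768]) for a base-sized model
```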
## 📚 Documentation
### Model
| Property | Details |
|----------|---------|
| Model Type | indonesian-roberta-base |
| #params | 124M |
| Architecture | RoBERTa |
| Training/Validation data (text) | OSCAR `unshuffled_deduplicated_id` Dataset |
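The parameter count can be checked locally; a minimal sketch (not part of the original card), assuming the PyTorch weights load as in the usage example above:

```python
from transformers import RobertaModel

model = RobertaModel.from_pretrained("flax-community/indonesian-roberta-base")
print(f"{model.num_parameters():,} parameters")  # roughly 124M
```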
### Evaluation Results
The model was trained for 8 epochs, and the following are the final results once training ended.
| train loss | valid loss | valid accuracy | total time |
|------------|------------|----------------|------------|
| 1.870 | 1.798 | 0.6245 | 18:25:39 |
## 🔧 Technical Details
The model was trained using HuggingFace's Flax framework on a TPUv3-8 VM sponsored by the Google Cloud team. All training scripts can be found in the Files and versions tab, and training metrics are logged via TensorBoard.
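Because the weights were trained with Flax, they can also be loaded directly through the Flax model classes in `transformers`. A minimal sketch, assuming `jax` and `flax` are installed:

```python
from transformers import FlaxRobertaModel, RobertaTokenizerFast

pretrained_name = "flax-community/indonesian-roberta-base"

tokenizer = RobertaTokenizerFast.from_pretrained(pretrained_name)
model = FlaxRobertaModel.from_pretrained(pretrained_name)

# Flax models expect NumPy arrays rather than PyTorch tensors.
inputs = tokenizer("Budi sedang berada di sekolah.", return_tensors="np")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, seq_len, 768)
```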
## 📄 License
This model is released under the MIT license.
## 👥 Team Members