Roberta Base Indonesian 522M
An Indonesian pretrained language model based on the RoBERTa-base architecture, trained on Indonesian Wikipedia data; it is case insensitive.
Downloads: 454
Release date: 3/2/2022
Model Overview
This is a model based on the RoBERTa-base architecture, pretrained on Indonesian Wikipedia data using the Masked Language Modeling (MLM) objective. The model is case insensitive and suitable for Indonesian text processing tasks.
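A minimal sketch of loading the model and tokenizer for masked language modeling with the Hugging Face Transformers library. The repository id cahya/roberta-base-indonesian-522M is an assumption inferred from the model name; substitute the actual id if it differs.

```python
# Minimal loading sketch; the repository id below is assumed, not confirmed.
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "cahya/roberta-base-indonesian-522M"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)
```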
Model Features
Case insensitive
The model does not distinguish between upper and lower case; for example, 'indonesia' and 'Indonesia' are treated identically.
Based on RoBERTa architecture
Adopts the RoBERTa-base architecture, which refines the original BERT pretraining method.
Indonesian-specific
Pretrained specifically on Indonesian text, making it well suited to Indonesian text processing tasks.
Model Capabilities
Masked language modeling
Text feature extraction
Indonesian text processing
Use Cases
Text processing
Mask prediction
Predict masked words in text
Can accurately predict missing words in Indonesian text (see the sketch below)
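A hedged sketch of mask prediction using the Transformers fill-mask pipeline, assuming the same repository id as above:

```python
# Fill-mask sketch; the model id is an assumption based on the model name.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="cahya/roberta-base-indonesian-522M")

# Use the tokenizer's own mask token ("<mask>" for RoBERTa-style tokenizers).
text = f"ibu kota indonesia adalah {fill_mask.tokenizer.mask_token}."
for pred in fill_mask(text):
    print(pred["token_str"], round(pred["score"], 4))
```

Since the model is case insensitive, lowercase input works as well as mixed case.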
Text feature extraction
Obtain vector representations of text
Can be used as feature input for downstream NLP tasks (see the sketch below)
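A sketch of extracting text features by mean-pooling the encoder's last hidden states, again assuming the repository id above; mean pooling is one common choice here, not a method prescribed by the model card.

```python
# Feature-extraction sketch; model id assumed, mean pooling chosen for illustration.
import torch
from transformers import AutoTokenizer, AutoModel

model_id = "cahya/roberta-base-indonesian-522M"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

inputs = tokenizer("saya suka membaca buku", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Average token embeddings into a single sentence vector.
embedding = outputs.last_hidden_state.mean(dim=1)
print(embedding.shape)  # torch.Size([1, 768]) for a base-size model
```

The resulting vector can feed a downstream classifier or a similarity search index.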