
IndoBERT Large P2

Developed by indobenchmark
IndoBERT is a state-of-the-art language model for Indonesian based on the BERT architecture, trained with the Masked Language Modeling (MLM) and Next Sentence Prediction (NSP) objectives.
Downloads 2,272
Release Time: 3/2/2022

Model Overview

IndoBERT is a pre-trained language model optimized for Indonesian. It is primarily used for natural language understanding tasks, providing contextual representations and language comprehension for Indonesian text.
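As a sketch of how such contextual representations can be extracted, the snippet below loads the model with the Hugging Face transformers library. The model id `indobenchmark/indobert-large-p2` and the sample sentence are assumptions for illustration, not details taken from this card.

```python
# Sketch: extracting contextual representations from IndoBERT.
# Assumes the checkpoint "indobenchmark/indobert-large-p2" on the
# Hugging Face Hub; requires `pip install transformers torch`.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "indobenchmark/indobert-large-p2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Any Indonesian sentence; this one is just an example input.
inputs = tokenizer("Saya suka membaca buku.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state has shape (batch, seq_len, hidden_size);
# a BERT-large model uses hidden_size = 1024.
embeddings = outputs.last_hidden_state
```

Each token in the input receives a 1024-dimensional vector that can feed downstream tasks such as classification or similarity search.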

Model Features

Indonesian Optimization
Trained specifically on Indonesian text, making it well suited to Indonesian natural language processing tasks.
Large-scale Pretraining
Pretrained on the Indo4B dataset (23.43 GB of text), offering robust language understanding capabilities.
Case Insensitive
This second-phase (P2) checkpoint is uncased: it does not distinguish uppercase from lowercase, making it suitable for text inputs with varying capitalization.

Model Capabilities

Indonesian Text Understanding
Contextual Representation Extraction
Masked Language Modeling
Next Sentence Prediction
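The masked language modeling capability above can be exercised directly with the transformers `fill-mask` pipeline. This is a minimal sketch assuming the `indobenchmark/indobert-large-p2` checkpoint; the example sentence is illustrative.

```python
# Sketch: using IndoBERT's MLM head to fill a masked token.
# Assumes the checkpoint "indobenchmark/indobert-large-p2";
# requires `pip install transformers torch`.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="indobenchmark/indobert-large-p2")

# BERT-style models use the literal token [MASK] as the placeholder.
predictions = fill_mask("Ibu kota Indonesia adalah [MASK].")
for p in predictions:
    print(p["token_str"], round(p["score"], 4))
```

By default the pipeline returns the five highest-probability candidates for the masked position, each with a score.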

Use Cases

Natural Language Processing
Text Classification
Used for classification tasks on Indonesian text, such as sentiment analysis and topic classification.
Named Entity Recognition
Identifies named entities in Indonesian text, such as person names, locations, and organization names.
Language Model Fine-tuning
Downstream Task Fine-tuning
Can be fine-tuned to adapt to specific Indonesian NLP tasks.
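A minimal fine-tuning sketch: the pre-trained encoder is loaded with a fresh classification head and one forward/backward pass is run on a toy batch. The model id, the two example sentences, and their labels are illustrative assumptions, not part of this card.

```python
# Sketch: preparing IndoBERT for a downstream classification task.
# Assumes the checkpoint "indobenchmark/indobert-large-p2";
# requires `pip install transformers torch`.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "indobenchmark/indobert-large-p2"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Replaces the MLM head with a randomly initialized 2-class head.
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=2)

# Toy sentiment batch (1 = positive, 0 = negative); real fine-tuning
# would iterate over a labeled dataset with an optimizer.
texts = ["Film ini sangat bagus.", "Pelayanannya mengecewakan."]
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, return_tensors="pt")
outputs = model(**batch, labels=labels)
outputs.loss.backward()  # a gradient step via an optimizer would follow
```

In practice this loop is usually wrapped in the transformers `Trainer` API or a standard PyTorch training loop over the full dataset.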