IndoBERT Base Uncased

Developed by indolem
IndoBERT is a BERT model specifically optimized for Indonesian, excelling in multiple Indonesian NLP tasks.
Downloads 26.35k
Release Time: 3/2/2022

Model Overview

An Indonesian version of the BERT model, evaluated on the IndoLEM benchmark, which covers seven tasks spanning morphosyntax, semantics, and discourse.
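A minimal sketch of loading the model and extracting a sentence embedding with the Hugging Face transformers library. The Hub identifier `indolem/indobert-base-uncased` is an assumption inferred from the developer and model names on this page, and the helper names `load_indobert` and `embed` are hypothetical. Running it requires the `transformers` and `torch` packages plus network access for the first download.

```python
# Assumed Hub identifier, inferred from the developer and model names above.
MODEL_ID = "indolem/indobert-base-uncased"


def load_indobert(model_id: str = MODEL_ID):
    """Download (or load from cache) the tokenizer and encoder weights."""
    # Imported lazily so the module can be imported without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def embed(text: str, tokenizer, model):
    """Return the final-layer [CLS] vector for one sentence."""
    import torch

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # last_hidden_state has shape (batch, sequence_length, hidden_size);
    # index 0 along the sequence axis is the [CLS] token.
    return outputs.last_hidden_state[:, 0]
```

Calling `tokenizer, model = load_indobert()` followed by `embed("Saya suka membaca buku.", tokenizer, model)` would give one vector per input sentence. For the tasks listed on this page, a task-specific head would still have to be fine-tuned on top of the encoder.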

Model Features

Indonesian Optimization
Trained specifically on Indonesian text, using a corpus of over 220 million words.
Excellent Multi-task Performance
Outperforms other models in seven Indonesian NLP tasks, including POS tagging, named entity recognition, and sentiment analysis.
Comparable to English BERT
Achieves a perplexity of 3.97 on the development set, comparable to the English BERT base version.
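To make the perplexity figure concrete, here is a small sketch of the standard definition: perplexity is the exponential of the mean negative log-likelihood per predicted token. The function name `perplexity` and the sample probabilities are illustrative only, and this is not the evaluation script used for IndoBERT.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood), using natural logs."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical case: the model gives probability 0.25 to every correct token,
# so it is as uncertain as a uniform choice among four options.
uniform_log_probs = [math.log(0.25)] * 8
print(perplexity(uniform_log_probs))  # approximately 4
```

Read the other way, a perplexity of 3.97 corresponds to a mean loss of ln(3.97), roughly 1.379 nats per predicted token.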

Model Capabilities

POS Tagging
Named Entity Recognition
Dependency Parsing
Sentiment Analysis
Summarization
Next Tweet Prediction
Tweet Ranking

Use Cases

Natural Language Processing
Indonesian POS Tagging
Tag parts of speech for words in Indonesian text
96.8% accuracy, outperforming Bi-LSTM and mBERT
Indonesian Named Entity Recognition
Identify named entities in Indonesian text
F1 score of 90.1% on UI dataset and 74.9% on UGM dataset
Indonesian Sentiment Analysis
Analyze sentiment tendencies in Indonesian text
F1 score of 84.13%, outperforming the other models in the comparison
Social Media Analysis
Next Tweet Prediction
Predict the next tweet an Indonesian user might send
93.7% accuracy
Tweet Ranking
Rank Indonesian tweets by relevance
Spearman correlation coefficient of 0.59
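The scores above rely on two metrics that are easy to misread, so here is a minimal sketch of each: F1 computed from true positive, false positive, and false negative counts, and the Spearman rank correlation for two orderings without ties. The function names and sample numbers are illustrative. Whether the reported F1 values are micro- or macro-averaged is not stated on this page, so the sketch shows only the single-class formula.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall for a single class."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def spearman_rho(predicted, gold) -> float:
    """Spearman rank correlation, assuming no tied values in either list."""
    n = len(predicted)
    if n != len(gold) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")

    def ranks(values):
        # Position of each item once the values are sorted ascending.
        order = sorted(range(n), key=lambda i: values[i])
        result = [0] * n
        for rank, index in enumerate(order):
            result[index] = rank
        return result

    rank_pred, rank_gold = ranks(predicted), ranks(gold)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_pred, rank_gold))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical example: two of four tweets are swapped in the predicted order.
print(spearman_rho([0, 2, 1, 3], [0, 1, 2, 3]))  # approximately 0.8
print(f1_score(tp=8, fp=2, fn=2))                # approximately 0.8
```

A coefficient of 1 means the predicted order matches the reference exactly, 0 means no monotonic relationship, and -1 means a fully reversed order, which places the reported 0.59 at a moderate positive agreement.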