NusaBERT Base

Developed by LazarusNLP
NusaBERT Base is a multilingual encoder language model based on the BERT architecture. It supports 13 regional languages of Indonesia and was pretrained on multiple open-source corpora.
Downloads 68
Release Date: 2/21/2024

Model Overview

NusaBERT is a multilingual encoder language model based on the BERT architecture. It is optimized for 13 languages spoken in Indonesia and surrounding regions and is suited to a range of natural language processing tasks.

Model Features

Multilingual Support
Supports 13 languages spoken in Indonesia and surrounding regions, covering both widely spoken languages and regional dialects.
Large-Scale Pretraining
Pretrained on a diverse corpus of approximately 16 billion tokens.
Optimized Performance
Achieves an accuracy of 0.6866 and a perplexity of 4.4266 on the held-out test set.
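
The reported perplexity can be related to an average loss. The following is a minimal sketch, assuming the perplexity is the exponential of the mean cross-entropy (in nats) over the evaluated tokens. That is the usual convention, but the source does not state it, and the function names here are illustrative only.

```python
import math


def perplexity_from_loss(mean_cross_entropy: float) -> float:
    """Perplexity under the assumed convention: exp(mean cross-entropy in nats)."""
    return math.exp(mean_cross_entropy)


def loss_from_perplexity(perplexity: float) -> float:
    """Inverse mapping: the mean cross-entropy implied by a given perplexity."""
    if perplexity <= 0:
        raise ValueError("perplexity must be positive")
    return math.log(perplexity)


REPORTED_PERPLEXITY = 4.4266  # value quoted in the model description

# Under the stated assumption, this is the loss the perplexity corresponds to.
implied_loss = loss_from_perplexity(REPORTED_PERPLEXITY)
print(f"implied mean cross-entropy: {implied_loss:.4f} nats")
```

Under this assumption, a perplexity of 4.4266 corresponds to a mean loss of about 1.49 nats per evaluated token. Lower perplexity means the model spreads less probability mass over incorrect tokens.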

Model Capabilities

Text Understanding
Language Modeling
Multilingual Processing
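
The language modeling capability can be tried with a masked-token query. This is a hedged sketch using the Hugging Face `transformers` fill-mask pipeline. The Hub identifier `LazarusNLP/NusaBERT-base` is an assumption not given in the source and should be checked before use. The `{mask}` placeholder and both helper functions are hypothetical names introduced for illustration.

```python
# Assumed Hugging Face Hub identifier; not stated in the source, verify before use.
MODEL_ID = "LazarusNLP/NusaBERT-base"


def build_masked_prompt(template: str, mask_token: str = "[MASK]") -> str:
    """Replace the illustrative '{mask}' placeholder with the model's mask token."""
    if "{mask}" not in template:
        raise ValueError("template must contain a '{mask}' placeholder")
    return template.replace("{mask}", mask_token)


def top_predictions(template: str, k: int = 5):
    """Return the k most likely fillers for the masked position as (token, score) pairs."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    fill = pipeline("fill-mask", model=MODEL_ID)
    # Use the tokenizer's own mask token rather than hard-coding "[MASK]".
    prompt = build_masked_prompt(template, fill.tokenizer.mask_token)
    return [(r["token_str"], r["score"]) for r in fill(prompt, top_k=k)]
```

Calling `top_predictions("Ibu kota Indonesia adalah {mask}.")` downloads the model on first use and returns candidate tokens with their probabilities. The same call shape applies to sentences in any of the supported languages.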

Use Cases

Natural Language Processing
Text Classification
Classify texts in multiple languages from the Indonesian region.
Named Entity Recognition
Identify entities in texts from the Indonesian region.
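
For the two use cases above, the pretrained encoder would typically be loaded with a task-specific head and then fine-tuned on labeled data. This is a minimal sketch, assuming the Hugging Face `transformers` library and the Hub identifier `LazarusNLP/NusaBERT-base` (an assumption to verify before use). The task names and helper functions are hypothetical.

```python
# Assumed Hugging Face Hub identifier; not stated in the source, verify before use.
MODEL_ID = "LazarusNLP/NusaBERT-base"

# Illustrative task names mapped to the transformers class that adds the matching head.
TASK_HEADS = {
    "text-classification": "AutoModelForSequenceClassification",
    "ner": "AutoModelForTokenClassification",
}


def head_class_name(task: str) -> str:
    """Look up the model class name for a task, rejecting unknown tasks early."""
    try:
        return TASK_HEADS[task]
    except KeyError:
        raise ValueError(
            f"unsupported task {task!r}; expected one of {sorted(TASK_HEADS)}"
        ) from None


def load_for_task(task: str, num_labels: int):
    """Load the tokenizer and the encoder with a freshly initialized task head."""
    class_name = head_class_name(task)  # validate before the heavy import
    import transformers  # imported lazily; requires `pip install transformers`

    tokenizer = transformers.AutoTokenizer.from_pretrained(MODEL_ID)
    model_cls = getattr(transformers, class_name)
    # The head weights are randomly initialized and need fine-tuning before use.
    model = model_cls.from_pretrained(MODEL_ID, num_labels=num_labels)
    return tokenizer, model
```

For example, `load_for_task("ner", num_labels=9)` would prepare a token classifier for a tag set with nine labels. The label count is a placeholder that depends on the dataset used for fine-tuning.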