GPT2 Small Indonesian 522M

Developed by cahya
This is a GPT2-small model pretrained on Indonesian Wikipedia data, specializing in Indonesian text generation tasks.
Downloads: 1,900
Release Time: 3/2/2022

Model Overview

The model was pretrained with a causal language modeling (CLM) objective on 522MB of Indonesian Wikipedia text and supports Indonesian text generation. It is case-insensitive and suitable for various downstream NLP tasks.
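Below is a minimal usage sketch. The Hugging Face model id cahya/gpt2-small-indonesian-522M is inferred from the model name and developer above, and the prompt and generation settings are illustrative assumptions.

    # Minimal generation sketch; the model id is assumed from the model
    # name and developer listed on this page.
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="cahya/gpt2-small-indonesian-522M",
    )

    # Indonesian prompt: "The history of Indonesia began since ..."
    outputs = generator("Sejarah Indonesia dimulai sejak", max_length=50)
    print(outputs[0]["generated_text"])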

Model Features

Indonesian language optimization
Pretrained specifically on Indonesian text, so it performs well on Indonesian text generation tasks
Case-insensitive
The model is case-insensitive, treating 'indonesia' and 'Indonesia' as the same
Efficient tokenization
Uses byte-level Byte Pair Encoding (BPE) with a vocabulary size of 52,000, handling arbitrary Unicode text without out-of-vocabulary tokens; see the tokenizer sketch after this list
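The sketch below inspects the tokenizer, again assuming the unconfirmed model id from the previous example; the sample phrase is illustrative.

    # Tokenizer sketch: byte-level BPE with a 52,000-token vocabulary.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("cahya/gpt2-small-indonesian-522M")
    print(tokenizer.vocab_size)  # expected: 52000 per this model card

    # Byte-level BPE splits words into subword pieces (a leading 'Ġ'
    # marks a preceding space in GPT2-style tokenizers).
    print(tokenizer.tokenize("Kerajaan Majapahit"))
    print(tokenizer.encode("Kerajaan Majapahit"))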

Model Capabilities

Indonesian text generation
Language model feature extraction (see the sketch after this list)
Context understanding
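As a sketch of feature extraction, the hidden states of the pretrained network can serve as contextual embeddings for downstream tasks; the model id and the sample sentence are assumptions, as above.

    # Feature-extraction sketch: use the transformer's last hidden state
    # as contextual token embeddings.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("cahya/gpt2-small-indonesian-522M")
    model = AutoModel.from_pretrained("cahya/gpt2-small-indonesian-522M")

    inputs = tokenizer("Saya suka belajar bahasa Indonesia.", return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # Shape: (batch, sequence_length, hidden_size); GPT2-small uses 768.
    print(outputs.last_hidden_state.shape)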

Use Cases

Education/Culture
Historical text generation
Generating coherent texts about Indonesian history
For example, it can generate historical descriptions of the Majapahit Kingdom (see the sketch after this list)
Content creation
Automated Indonesian content generation
Assisting in creating Indonesian articles, stories, and other content
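The following sketch illustrates the historical-text use case with sampling enabled; the prompt, seed, and sampling settings are illustrative assumptions, and the model id is assumed as in the earlier examples.

    # Use-case sketch: sampled generation from a historical prompt,
    # "Kerajaan Majapahit adalah" ("The Majapahit Kingdom was ...").
    from transformers import pipeline, set_seed

    generator = pipeline("text-generation", model="cahya/gpt2-small-indonesian-522M")
    set_seed(42)  # illustrative seed for reproducible sampling

    outputs = generator(
        "Kerajaan Majapahit adalah",
        max_length=60,
        do_sample=True,
        top_k=50,
        top_p=0.95,
    )
    print(outputs[0]["generated_text"])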