Komodo 7b Base

Developed by Yellow-AI-NLP
Komodo-7B is a large language model built on Llama-2-7B through incremental pretraining and vocabulary expansion. It supports Indonesian, English, and 11 regional languages of Indonesia.
Downloads 1,113
Release Time: 2/7/2024

Model Overview

This model is designed for Indonesian and the regional languages of Indonesia, with language coverage broadened through vocabulary expansion. It is a base model and requires further fine-tuning before it is used for downstream tasks.

Model Features

Multilingual support
Supports Indonesian, English, and 11 regional languages of Indonesia, with enhanced language coverage through systematic vocabulary expansion.
Incremental pretraining
Built by incrementally pretraining Llama-2-7B, which retains the original model's strengths while adapting it to the characteristics of Indonesian.
Efficient vocabulary expansion
Adds 3,000 high-frequency words to the vocabulary (2,000 Indonesian and 1,000 from regional languages), which improves tokenization efficiency for those languages.
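
To illustrate why adding whole words to a vocabulary can improve tokenization efficiency, here is a minimal toy sketch. It does not use Komodo's actual tokenizer: the greedy longest-match function, the tiny subword vocabulary, and the four added words are all hypothetical and chosen only to show the effect on token counts.

```python
def tokenize(text, vocab):
    """Toy greedy longest-match tokenizer (not the Komodo or Llama-2 tokenizer).

    Each whitespace-separated word is split into the longest pieces found in
    `vocab`, falling back to single characters when nothing matches.
    """
    tokens = []
    for word in text.split():
        i = 0
        while i < len(word):
            # Try the longest remaining substring first, then shrink it.
            for j in range(len(word), i, -1):
                piece = word[i:j]
                if piece in vocab or j == i + 1:
                    tokens.append(piece)
                    i = j
                    break
    return tokens


# Hypothetical base vocabulary that only knows short subword pieces.
base_vocab = {"ter", "ima", "ka", "sih", "se", "la", "mat", "pa", "gi"}

# Hypothetical expansion: whole high-frequency Indonesian words are added.
expanded_vocab = base_vocab | {"terima", "kasih", "selamat", "pagi"}

text = "selamat pagi terima kasih"
before = tokenize(text, base_vocab)
after = tokenize(text, expanded_vocab)

print(len(before), before)  # 9 tokens with the base vocabulary
print(len(after), after)    # 4 tokens with the expanded vocabulary
```

In this toy setup the same sentence needs 9 tokens before expansion and 4 after, so more text fits into a fixed context window and each word is represented by fewer pieces. The real gain for Komodo-7B depends on its actual tokenizer and on the text being processed.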

Model Capabilities

Indonesian text generation
Multilingual mixed processing
Cross-language understanding

Use Cases

Language services
Indonesian content creation
Generates text content that aligns with local language conventions.
Outputs natural language that fits Indonesian cultural contexts.
Regional language translation
Handles translation tasks between Indonesian regional languages and English/Indonesian.
Achieved a score of 90.5 in English-Indonesian translation benchmarks.
Cultural research
Dialect analysis
Identifies and processes linguistic variants across different regions of Indonesia.
Scored 73.6 in dialect detection tasks.