MyanBERTa
Developed by UCSYNLP
MyanBERTa is a pre-trained Burmese language model based on the BERT architecture, trained on a Burmese dataset of 5,992,299 sentences.
Downloads: 91
Release date: 7/25/2022
Model Overview
This model is a pre-trained language model designed specifically for Burmese. It uses the BERT architecture together with a byte-level BPE tokenizer, and it is suitable for a variety of Burmese natural language processing tasks.
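As a minimal sketch of how such a model would typically be loaded for masked-token prediction, assuming it is published on the Hugging Face Hub under the repository ID UCSYNLP/MyanBERTa (the exact ID is an assumption, not confirmed by this page):

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

model_id = "UCSYNLP/MyanBERTa"  # assumed Hub repository ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Read the mask token from the tokenizer rather than hardcoding it,
# since byte-level BPE models often use <mask> instead of [MASK].
fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)
print(fill_mask(f"မင်္ဂလာ{tokenizer.mask_token}"))  # predict the masked piece
```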
Model Features
Burmese-specific
Designed and optimized specifically for Burmese, allowing it to better handle the linguistic characteristics of the language.
Large-scale Pre-training
Pre-trained on a large-scale Burmese dataset containing 5,992,299 sentences (136 million words).
Efficient Tokenization
Uses a byte-level BPE tokenizer with a learned vocabulary of 30,522 subword units (see the tokenizer sketch below).
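The following sketch shows how the byte-level BPE tokenizer could be inspected; the repository ID UCSYNLP/MyanBERTa is assumed as above, and the printed vocabulary size is expected to match the 30,522 subword units stated here:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("UCSYNLP/MyanBERTa")  # assumed repo ID
print(tokenizer.vocab_size)            # expected to report 30522 subword units
print(tokenizer.tokenize("မင်္ဂလာပါ"))  # "hello"; split into byte-level BPE pieces
```

Because the tokenizer operates on bytes, any Burmese string can be encoded without out-of-vocabulary failures, even for rare spellings.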
Model Capabilities
Burmese Text Understanding
Burmese Text Generation
Burmese Language Feature Extraction
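Since feature extraction is listed as a capability, here is a minimal sketch of pooling the encoder's last hidden states into a sentence embedding; the repository ID remains an assumption, and mean pooling is one common choice rather than a method prescribed by the authors:

```python
import torch
from transformers import AutoTokenizer, AutoModel

model_id = "UCSYNLP/MyanBERTa"  # assumed Hub repository ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

inputs = tokenizer("မင်္ဂလာပါ", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Mean-pool the token embeddings, masking out padding positions,
# to obtain a single fixed-size sentence vector.
mask = inputs["attention_mask"].unsqueeze(-1)
embedding = (outputs.last_hidden_state * mask).sum(1) / mask.sum(1)
print(embedding.shape)  # (1, hidden_size)
```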
Use Cases
Natural Language Processing
Burmese Text Classification
Perform sentiment analysis or topic classification on Burmese text (a fine-tuning sketch follows this list).
Burmese Question Answering System
Build intelligent question-answering applications for Burmese.
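For the classification use case above, a hedged sketch of attaching a classification head to the pre-trained encoder; the repository ID and the three sentiment labels are hypothetical placeholders:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "UCSYNLP/MyanBERTa"  # assumed Hub repository ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    num_labels=3,  # e.g. negative / neutral / positive (hypothetical labels)
)
# From here, fine-tune on a labeled Burmese corpus with transformers' Trainer
# or a custom PyTorch loop; the encoder weights are initialized from MyanBERTa
# and only the classification head starts from random weights.
```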