
MyanBERTa

Developed by UCSYNLP
MyanBERTa is a pre-trained Burmese language model based on the BERT architecture, trained on a Burmese dataset of 5,992,299 sentences.
Downloads: 91
Release date: July 25, 2022

Model Overview

MyanBERTa is a pre-trained language model designed specifically for Burmese. It uses the BERT architecture with a byte-level BPE tokenizer and is suitable for a wide range of Burmese natural language processing tasks.
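As a minimal usage sketch, assuming the checkpoint is published on the Hugging Face Hub under the id UCSYNLP/MyanBERTa (inferred from the developer name above, not confirmed by this page):

```python
# Minimal sketch: load MyanBERTa with Hugging Face transformers.
# The repository id "UCSYNLP/MyanBERTa" is an assumption inferred from the developer name.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("UCSYNLP/MyanBERTa")
model = AutoModel.from_pretrained("UCSYNLP/MyanBERTa")
```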

Model Features

Burmese-specific
Designed and optimized specifically for Burmese, so it better handles the linguistic characteristics of the language.
Large-scale Pre-training
Pre-trained on a large-scale Burmese dataset containing 5,992,299 sentences (136 million words).
Efficient Tokenization
Uses a byte-level BPE tokenizer with a learned vocabulary of 30,522 subword units; a tokenization sketch follows this list.
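A short sketch of inspecting the byte-level BPE tokenizer, under the same assumed Hub id; the Burmese example sentence ("hello") and its exact subword split are illustrative only:

```python
# Sketch: how the byte-level BPE tokenizer splits a Burmese sentence.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("UCSYNLP/MyanBERTa")  # assumed Hub id

text = "မင်္ဂလာပါ"  # "hello" in Burmese
tokens = tokenizer.tokenize(text)
print(tokens)  # subword pieces drawn from the 30,522-unit BPE vocabulary
print(tokenizer.convert_tokens_to_ids(tokens))  # their integer ids
```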

Model Capabilities

Burmese Text Understanding
Burmese Text Generation
Burmese Language Feature Extraction
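For feature extraction, a common recipe (standard for BERT-style models, not specific to this one) is to mean-pool the final hidden states into a sentence vector. The sketch below assumes the same unconfirmed UCSYNLP/MyanBERTa Hub id and standard transformers/PyTorch APIs:

```python
# Sketch: mean-pooled sentence embeddings from the final hidden states.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("UCSYNLP/MyanBERTa")  # assumed Hub id
model = AutoModel.from_pretrained("UCSYNLP/MyanBERTa")
model.eval()

inputs = tokenizer("မြန်မာစာ", return_tensors="pt")  # "Burmese (written language)"
with torch.no_grad():
    outputs = model(**inputs)

token_embeddings = outputs.last_hidden_state       # (batch, seq_len, hidden_size)
sentence_embedding = token_embeddings.mean(dim=1)  # one vector per sentence
print(sentence_embedding.shape)
```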

Use Cases

Natural Language Processing
Burmese Text Classification
Perform sentiment analysis or topic classification on Burmese text (a fine-tuning sketch follows this list)
Burmese Question Answering System
Build intelligent question-answering applications in Burmese
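As a sketch of how the classification use case would typically be set up with transformers (the Hub id, number of labels, and sample text are all placeholders or assumptions):

```python
# Sketch: preparing MyanBERTa for Burmese sentiment/topic classification.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("UCSYNLP/MyanBERTa")  # assumed Hub id
model = AutoModelForSequenceClassification.from_pretrained(
    "UCSYNLP/MyanBERTa",
    num_labels=2,  # placeholder: e.g. positive / negative sentiment
)

# Tokenize a (placeholder) batch of Burmese texts for fine-tuning.
batch = tokenizer(["မင်္ဂလာပါ"], padding=True, truncation=True, return_tensors="pt")
# From here, fine-tune with the Trainer API or a plain PyTorch training loop
# on labeled Burmese classification data.
```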