
Uztext 3Gb BPE Roberta

Developed by rifkat
A masked language modeling and sentence prediction model pretrained for Uzbek (Cyrillic and Latin alphabets)
Downloads: 25
Release Time: 3/2/2022

Model Overview

A RoBERTa-based pretrained model for the Uzbek language, supporting text in both the Cyrillic and Latin alphabets, intended primarily for masked language modeling and sentence prediction tasks.
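The model can be queried through a standard fill-mask workflow. The sketch below is illustrative only: it assumes the checkpoint is published on Hugging Face under the ID rifkat/uztext-3Gb-BPE-Roberta (inferred from the title and author; verify before use), and the Uzbek example sentence is our own rather than one from the card.

```python
from transformers import pipeline

# Model ID assumed from the card's title and author; confirm on Hugging Face.
fill_mask = pipeline("fill-mask", model="rifkat/uztext-3Gb-BPE-Roberta")

# RoBERTa-style tokenizers use "<mask>" as the mask token; read it from the
# tokenizer rather than hard-coding it.
mask = fill_mask.tokenizer.mask_token

# Illustrative Latin-script Uzbek prompt:
# "The capital of Uzbekistan is the city of [mask]."
text = f"O'zbekiston poytaxti {mask} shahridir."

# Print the top candidate tokens for the masked position with their scores.
for prediction in fill_mask(text):
    print(f"{prediction['token_str']:>15}  {prediction['score']:.3f}")
```

Since the card claims dual-alphabet support, the same query written in Cyrillic script should also be accepted; the pipeline returns the top candidate tokens for the masked slot together with their probabilities.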

Model Features

Dual Alphabet Support
Handles Uzbek text written in either the Cyrillic or the Latin alphabet
Large-scale Pretraining
Pretrained on approximately 3GB of Uzbek news data
Masked Prediction Capability
Accurately predicts masked tokens in text

Model Capabilities

Uzbek text understanding
Masked language modeling
Sentence prediction
Cyrillic alphabet processing
Latin alphabet processing

Use Cases

Text Completion
Historical Figure Description Completion
Completes descriptive texts about historical figures
Accurately predicts "poet" in "Alisher Navoi was a great Uzbek and other Turkic peoples' [mask], thinker, and statesman"
News Event Description
Natural Disaster Report Completion
Completes key information in natural disaster reports
Accurately predicts "regions" in "Due to heavy rainfall, severe mudflows were observed in multiple [mask]"
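The use cases above follow the same fill-mask pattern. As a sketch, the snippet below prints unrestricted predictions and then scores one specific candidate word via the pipeline's targets argument; the Uzbek prompt is our approximate rendering of the first use case, not the card's exact text, and the model ID is assumed as before.

```python
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="rifkat/uztext-3Gb-BPE-Roberta")  # ID assumed
mask = fill_mask.tokenizer.mask_token

# Approximate Uzbek rendering of the historical-figure example (illustrative only):
# "Alisher Navoi was a great [mask], thinker, and statesman of the Uzbek and
# other Turkic peoples."
prompt = (
    f"Alisher Navoiy ulug' o'zbek va boshqa turkiy xalqlarning {mask}, "
    "mutafakkiri va davlat arbobi."
)

# Unrestricted top predictions for the masked slot.
for p in fill_mask(prompt):
    print(f"{p['token_str']:>15}  {p['score']:.3f}")

# Score a specific candidate; "shoiri" (~ "poet" with a possessive suffix) is our
# guess for the expected answer. `targets` restricts scoring to the listed words
# where they exist in the tokenizer vocabulary.
print(fill_mask(prompt, targets=["shoiri"])[0]["score"])
```

A similar prompt with a masked location word reproduces the natural-disaster example.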