
TahrirchiBERT-base

Developed by tahrirchi
TahrirchiBERT-base is an encoder-only Transformer text model for Uzbek (Latin script) with 110 million parameters, pre-trained using masked language modeling objectives.
Downloads 88
Release date: 10/26/2023

Model Overview

This model is pre-trained on Uzbek text and is suitable for fine-tuning on tasks that require sentence-level decisions, such as sequence classification, token classification, or question answering.

Model Features

Uzbek Language Specialization
Specifically optimized and trained for Uzbek (Latin script), enabling better understanding of Uzbek text than general multilingual models. As an encoder-only model, it is suited to understanding tasks rather than text generation.
Case Sensitivity
The model is case-sensitive: it distinguishes between, for example, uzbek and Uzbek.
Large-scale Pre-training Data
Pre-trained on approximately 4,000 preprocessed books and 1.2 million curated documents from the web and Telegram blogs, amounting to roughly 5 billion tokens.

Model Capabilities

Fill-mask
Sequence classification
Token classification
Question answering
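The fill-mask capability can be tried directly with the Hugging Face transformers pipeline. A minimal sketch, assuming the model is published under the repository id tahrirchi/tahrirchi-bert-base and uses the standard BERT-style [MASK] token (both assumptions; check the actual model page):

```python
from transformers import pipeline

# Assumed Hugging Face repository id for this model.
fill_mask = pipeline("fill-mask", model="tahrirchi/tahrirchi-bert-base")

# Predict the masked token in an Uzbek (Latin script) sentence.
predictions = fill_mask("Toshkent O'zbekistonning [MASK] shahri.")
for p in predictions:
    print(p["token_str"], round(p["score"], 3))
```

Each prediction is a dict containing the candidate token (`token_str`), its probability (`score`), and the completed sentence (`sequence`).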

Use Cases

Text Processing
Uzbek Text Completion
Used to complete missing parts in Uzbek text, such as masked tokens in sentences.
Uzbek Text Classification
Used for classification tasks on Uzbek text, such as sentiment analysis or topic classification.
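A classification use case like the above typically starts from the pre-trained checkpoint and attaches a fresh classification head. A minimal fine-tuning sketch, assuming the repository id tahrirchi/tahrirchi-bert-base and two hypothetical labelled sentiment examples:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "tahrirchi/tahrirchi-bert-base"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
# Loads the pre-trained encoder and adds a randomly initialized
# classification head on top (num_labels outputs).
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID, num_labels=2)

# Hypothetical labelled examples (1 = positive, 0 = negative).
texts = ["Bu kitob juda yaxshi!", "Bu film menga yoqmadi."]
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
outputs = model(**batch, labels=labels)

# One backward pass; in practice this sits inside a training loop
# (or use the transformers Trainer API).
outputs.loss.backward()
print(outputs.logits.shape)  # one row per example, one column per class
```

Passing `labels` makes the model return a cross-entropy loss alongside the logits, so the same forward call serves both training and evaluation.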