ruRoberta-large
A Russian RoBERTa-large model with 355 million parameters, pre-trained by the SberDevices team on 250GB of Russian text.
Downloads: 21.0k
Release Time: 3/2/2022
Model Overview
A pre-trained Russian Transformer language model, used mainly for masked token filling (fill-mask) and suitable for a wide range of Russian NLP tasks; see the sketch below.
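The fill-mask usage can be sketched as follows. This is a minimal example assuming the `transformers` library and the Hugging Face model id `ai-forever/ruRoberta-large`; neither is stated on this page.

```python
# Minimal fill-mask sketch; the model id "ai-forever/ruRoberta-large"
# is an assumption, not taken from this page.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="ai-forever/ruRoberta-large")

# RoBERTa-style models mark the position to predict with <mask>.
# The example sentence means "Moscow is the <mask> of Russia."
for prediction in fill_mask("Москва — <mask> России."):
    print(prediction["token_str"], round(prediction["score"], 3))
```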
Model Features
Large-scale pre-training
Pre-trained on 250GB of Russian text data
Efficient tokenization
Uses a BBPE tokenizer with a vocabulary size of 50,257 (see the tokenizer sketch after this list)
Optimized architecture
A RoBERTa variant based on the Transformer encoder architecture
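The byte-level BPE tokenization can be inspected directly; a minimal sketch under the same model-id assumption as above:

```python
# Tokenizer inspection sketch; the model id is an assumption as above.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ai-forever/ruRoberta-large")

print(tokenizer.vocab_size)                # expected 50257 per this card
print(tokenizer.tokenize("Привет, мир!"))  # byte-level BPE subword pieces
```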
Model Capabilities
Russian text understanding
Masked language modeling
Contextual feature extraction
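For contextual feature extraction, the encoder's final hidden states can serve as token embeddings; a minimal sketch, again assuming the model id above:

```python
# Contextual feature extraction sketch; the model id is an assumption.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ai-forever/ruRoberta-large")
model = AutoModel.from_pretrained("ai-forever/ruRoberta-large")

# "An example Russian sentence."
inputs = tokenizer("Пример русского предложения.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One 1024-dimensional contextual vector per token (RoBERTa-large hidden size).
print(outputs.last_hidden_state.shape)  # (1, seq_len, 1024)
```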
Use Cases
Natural Language Processing
Text classification
Can be used for Russian text classification tasks (a fine-tuning sketch follows this list)
Named entity recognition
Suitable for Russian NER tasks
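A starting point for the classification use case could look like the sketch below; the label count and example sentences are hypothetical placeholders, and the NER use case would follow the same pattern with AutoModelForTokenClassification.

```python
# Hypothetical classification starting point; labels and texts are
# illustrative placeholders, not from this page.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ai-forever/ruRoberta-large")
model = AutoModelForSequenceClassification.from_pretrained(
    "ai-forever/ruRoberta-large",
    num_labels=2,  # e.g. positive/negative sentiment; adjust per task
)

batch = tokenizer(
    ["Отличный сервис!", "Ужасное обслуживание."],  # "Great service!" / "Terrible service."
    padding=True,
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**batch).logits  # head is randomly initialized: fine-tune before use
print(logits.shape)  # torch.Size([2, 2])
```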