ruBert-large
A Russian language model pre-trained by the SberDevices team, built on the Transformer encoder architecture with 427 million parameters and trained on 30 GB of Russian text
Downloads: 6,125
Release Time: 3/2/2022
Model Overview
A pre-trained Russian Transformer language model, used primarily for masked language modeling; see the quick-start sketch below
Model Features
Large-scale pre-training
Pre-trained on 30 GB of Russian text, giving it strong language understanding capabilities
Efficient architecture
Uses a Transformer encoder architecture with 427 million parameters
Dedicated tokenization
Employs a BPE tokenizer with a vocabulary of 120,138 tokens; see the inspection sketch below
Model Capabilities
Russian text understanding
Masked language prediction
Contextual semantic analysis
Use Cases
Natural Language Processing
Text auto-completion
Predicts missing words while editing Russian text
Improves text-input efficiency, as in the sketch below
Semantic analysis
Captures the deeper meaning of Russian text
Applicable to downstream tasks such as sentiment analysis and intent recognition; a feature-extraction sketch follows