ruGPT-3 Medium (rugpt3medium_based_on_gpt2)
A Russian pretrained language model based on the GPT-2 architecture, developed by the SberDevices team, supporting a sequence length of 1024 and trained on 80 billion tokens.
Model Overview
This is a pretrained Transformer language model for Russian, used primarily for text generation and comprehension tasks.
Model Features
Large-scale Pretraining
The model was pretrained on 80 billion tokens of Russian text, giving it broad coverage of the language.
Long Sequence Support
Pretrained with a sequence length of 1024, then fine-tuned on a 2048-token context window; a quick configuration check is sketched after this section.
Efficient Training
Total training time was around 16 days on 64 GPUs.
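As referenced above, the context window can be confirmed by inspecting the published checkpoint's configuration. This is a minimal sketch; the Hugging Face model id ai-forever/rugpt3medium_based_on_gpt2 is an assumption (the checkpoint was also published under the sberbank-ai organization):

```python
# Minimal sketch: inspect the configured context window of the checkpoint.
# The model id is an assumption, not taken from this page.
from transformers import GPT2Config

config = GPT2Config.from_pretrained("ai-forever/rugpt3medium_based_on_gpt2")

# n_positions is the number of positions covered by the position embeddings,
# i.e. the maximum context length the checkpoint supports.
print(config.n_positions)
```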
Model Capabilities
Russian Text Generation
Russian Text Comprehension
Use Cases
Natural Language Processing
Russian Text Generation
Can be used to generate Russian articles, dialogues, and other text content.
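A minimal generation sketch with the transformers library is shown below; the model id, the Russian prompt, and the sampling parameters are illustrative assumptions rather than settings from the model card:

```python
# Hedged sketch: load the checkpoint and sample a Russian continuation.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model_id = "ai-forever/rugpt3medium_based_on_gpt2"  # assumed model id
tokenizer = GPT2Tokenizer.from_pretrained(model_id)
model = GPT2LMHeadModel.from_pretrained(model_id)

prompt = "Александр Сергеевич Пушкин родился в"  # example prompt
inputs = tokenizer(prompt, return_tensors="pt")

# Sampling parameters are illustrative, not tuned values.
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```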
Russian Language Comprehension
Can be used for tasks such as Russian text classification and sentiment analysis.
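For classification-style tasks, one option is to attach a sequence-classification head to the checkpoint via transformers' GPT2ForSequenceClassification and fine-tune it. The sketch below only wires up the model: the classification head starts randomly initialized, and the label count and example texts are assumptions:

```python
# Hedged sketch: reuse the checkpoint for sentiment classification.
# The head is untrained here; fine-tune before relying on its outputs.
import torch
from transformers import GPT2ForSequenceClassification, GPT2Tokenizer

model_id = "ai-forever/rugpt3medium_based_on_gpt2"  # assumed model id
tokenizer = GPT2Tokenizer.from_pretrained(model_id)
model = GPT2ForSequenceClassification.from_pretrained(model_id, num_labels=2)

# GPT-2 has no pad token by default; reuse EOS so batches can be padded.
tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = tokenizer.pad_token_id

texts = ["Отличный фильм, всем советую!", "Скучно и затянуто."]  # examples
batch = tokenizer(texts, padding=True, truncation=True,
                  max_length=1024, return_tensors="pt")

with torch.no_grad():
    logits = model(**batch).logits
print(logits.argmax(dim=-1))  # predicted class per input
```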