ruSciBERT
A Russian BERT model jointly trained by the Sber AI team and the MLSA Lab at Moscow State University's AI Institute, specializing in scientific text processing
Downloads: 1,044
Release Time: 12/21/2022
Model Overview
A Russian pre-trained language model based on the BERT architecture and optimized for scientific texts, suitable for mask-filling tasks (see the sketch below)
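A minimal mask-filling sketch using the Hugging Face transformers pipeline. The Hub identifier "ai-forever/ruSciBERT" is an assumption (substitute the actual model path), and the example sentence is illustrative only.

```python
from transformers import pipeline

# Load the fill-mask pipeline; "ai-forever/ruSciBERT" is an assumed Hub identifier.
fill_mask = pipeline("fill-mask", model="ai-forever/ruSciBERT")

# Query the tokenizer for its mask token rather than hard-coding it.
mask = fill_mask.tokenizer.mask_token

# Predict the masked word in a Russian scientific-style sentence.
for prediction in fill_mask(f"Мы предлагаем новый {mask} для классификации научных текстов."):
    print(prediction["token_str"], round(prediction["score"], 4))
```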
Model Features
Scientific text optimization
Specifically trained and optimized for Russian scientific texts
Large-scale training data
Trained on 6.5 GB of Russian text data
Efficient tokenization
Uses a BPE tokenizer with a vocabulary of 50,265 tokens
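The tokenizer can be inspected directly; a short sketch, again assuming the hypothetical Hub identifier "ai-forever/ruSciBERT":

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ai-forever/ruSciBERT")  # assumed identifier

print(tokenizer.vocab_size)  # expected to report 50,265 per the description above
print(tokenizer.tokenize("квантовая спектроскопия"))  # BPE subword pieces
```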
Model Capabilities
Russian text understanding
Scientific text processing
Masked word prediction
Use Cases
Natural Language Processing
Scientific text classification
Classification tasks for Russian scientific literature
Vector method applications
Using text vectors produced by the model to solve classification problems, as sketched below
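A sketch of the vector-based approach: mean-pool the encoder's last hidden states into one vector per text, then fit a simple classifier on top. The Hub identifier and the toy labels are assumptions for illustration.

```python
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

model_id = "ai-forever/ruSciBERT"  # assumed identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id).eval()

def embed(texts):
    """Return one mean-pooled sentence vector per text, ignoring padding."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state      # (batch, tokens, hidden)
    mask = batch["attention_mask"].unsqueeze(-1)       # (batch, tokens, 1)
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

# Toy two-class example: 0 = computer science, 1 = chemistry (labels are illustrative).
texts = ["Статья о нейронных сетях.", "Работа по органической химии."]
clf = LogisticRegression().fit(embed(texts), [0, 1])
print(clf.predict(embed(["Обзор методов глубокого обучения."])))
```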