Sage Mt5 Large

Developed by ai-forever
A spelling correction model for Russian and English, based on the mT5-large architecture, that fixes spelling and typographical errors by normalizing words to their standard form.
Downloads 51
Release Time: 3/11/2024

Model Overview

This model corrects spelling and typographical errors in Russian and English text, bringing every word in line with the norms of the language. It is built on the mT5-large architecture and was trained on a broad corpus containing artificially introduced errors.

Model Features

Multilingual support
Supports spell checking and text normalization for Russian and English.
Based on mT5-large architecture
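Built on mT5-large, a multilingual text-to-text Transformer, so correction is handled as a text generation task: the model reads the noisy input and generates the corrected text.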
Synthetic error training
Training data includes artificially introduced spelling and typographical errors to enhance model robustness.
Comprehensive dataset evaluation
Thoroughly evaluated on multiple Russian and English spell checking benchmark datasets.
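
The synthetic error training feature above can be illustrated with a small sketch. The source does not say how the errors were introduced, so the following is a generic character-level noising routine and not the authors' method. The function name `inject_typos`, the three edit operations, and the `error_rate` parameter are all assumptions made for illustration.

```python
import random


def inject_typos(text, error_rate=0.1, seed=None):
    """Return a noisy copy of `text` with random character-level edits.

    Hypothetical helper: each letter is, with probability `error_rate`,
    deleted, duplicated, or swapped with the following letter.
    Non-letters (digits, spaces, punctuation) are never modified.
    """
    rng = random.Random(seed)
    chars = list(text)
    out = []
    i = 0
    while i < len(chars):
        ch = chars[i]
        if ch.isalpha() and rng.random() < error_rate:
            op = rng.choice(("delete", "duplicate", "swap"))
            if op == "delete":
                i += 1
                continue
            if op == "duplicate":
                out.extend((ch, ch))
                i += 1
                continue
            # Swap only when the next character is also a letter;
            # otherwise fall through and keep the character unchanged.
            if i + 1 < len(chars) and chars[i + 1].isalpha():
                out.extend((chars[i + 1], ch))
                i += 2
                continue
        out.append(ch)
        i += 1
    return "".join(out)


def make_training_pair(clean_text, error_rate=0.1, seed=None):
    """Build a (noisy input, clean target) pair for seq2seq training."""
    return inject_typos(clean_text, error_rate, seed), clean_text
```

Because `str.isalpha` is Unicode-aware, the same routine applies to Cyrillic and Latin text. Passing a fixed `seed` makes the generated pairs reproducible.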

Model Capabilities

Russian spell checking
English spell checking
Text normalization
Typographical error correction
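
A minimal sketch of how these capabilities might be called through the Hugging Face `transformers` library follows. The repository ID `ai-forever/sage-mt5-large` is an assumption inferred from the developer and model names on this page, and the `prefix` parameter is hypothetical: some text-to-text checkpoints expect a task prefix, so check the published model card before relying on the default.

```python
def correct_text(text, model, tokenizer, prefix="", max_new_tokens=128):
    """Run one noisy sentence through a seq2seq model and decode the result."""
    # `prefix` is a hypothetical hook for checkpoints that need a task prefix.
    inputs = tokenizer(prefix + text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]


def load_model(model_id="ai-forever/sage-mt5-large"):
    """Load the model and tokenizer; `model_id` is an assumed repository name."""
    # Imported lazily so correct_text can be reused without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    model.eval()  # inference mode: disables dropout
    return model, tokenizer
```

With the library and PyTorch installed, `correct_text("I beleive this sentense has erors.", *load_model())` would download the checkpoint and return the model's correction. Keeping the model and tokenizer as arguments lets one loaded instance serve many sentences.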

Use Cases

Text processing
Social media text correction
Automatically corrects spelling and typographical errors in social media posts.
Achieved an F1 score of 61.4 on the RUSpellRU dataset
Medical text normalization
Corrects spelling errors in professional medical terminology within patient histories.
Achieved an F1 score of 47.0 on the MedSpellchecker dataset
Code comment correction
Corrects spelling errors in text taken from GitHub commits.
Achieved an F1 score of 50.4 on the GitHubTypoCorpusRu dataset
Multi-domain applications
Multi-domain text correction
Processes text errors from various domains including news, social media, and literary works.
Achieved an F1 score of 43.9 on the MultidomainGold dataset
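
For readers unfamiliar with the metric reported above, F1 is the harmonic mean of precision and recall. The sketch below shows the arithmetic only. How each benchmark counts a correct or missed correction is not stated here, so the function name and the example counts are illustrative assumptions, not values from any of these datasets.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Compute precision, recall, and F1 from correction counts.

    true_positives:  errors the model fixed correctly
    false_positives: changes the model made that were wrong or unnecessary
    false_negatives: errors the model left uncorrected
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up counts, for illustration only.
p, r, f1 = precision_recall_f1(true_positives=3, false_positives=1, false_negatives=2)
print(f"precision={p:.2f} recall={r:.2f} F1={100 * f1:.1f}")
```

The scores on this page appear to be on a 0 to 100 scale, which is why the example multiplies F1 by 100 before printing.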