UMT5 XXL
UMT5 is a multilingual text generation model pretrained on the mC4 multilingual corpus; it supports 107 languages and uses the UniMax sampling strategy to balance training data across them
Downloads 4,449
Release Time: 7/2/2023
Model Overview
A multilingual pretrained model based on the T5 architecture and aimed at cross-lingual text generation; it must be fine-tuned before it is usable for downstream applications
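As a rough illustration of how a checkpoint like this is typically loaded, the sketch below uses the Hugging Face `transformers` library. The model ID `google/umt5-xxl`, the helper name `generate_with_umt5`, and the default `max_new_tokens` value are assumptions made for this example and do not come from this page. The XXL checkpoint is very large, so a smaller UMT5 checkpoint may be more practical for experimentation.

```python
def generate_with_umt5(prompt, model_id="google/umt5-xxl", max_new_tokens=32):
    """Load a UMT5 checkpoint and generate text for a single prompt.

    `model_id` is an assumed Hugging Face Hub identifier; swap in whichever
    UMT5 checkpoint you actually have access to.
    """
    # Imported lazily so that defining this helper does not require
    # transformers or torch to be installed.
    from transformers import AutoTokenizer, UMT5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = UMT5ForConditionalGeneration.from_pretrained(model_id)

    # Tokenize the prompt into PyTorch tensors and run generation.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the first (and only) generated sequence back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Because the checkpoint is pretrained only, output from a call such as `generate_with_umt5("...")` should not be expected to follow task instructions. Fine-tuning on the target task comes first.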
Model Features
UniMax Sampling Strategy
Caps the number of times each language's corpus may be repeated during training, which yields a more even language distribution and limits overfitting on tail (low-resource) languages
Large-Scale Multilingual Support
Covers 107 languages, including both major and low-resource languages
Enhanced mC4 Corpus
Pretrained on 29 trillion characters of multilingual data
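The repetition cap behind the UniMax sampling strategy listed above can be sketched in a few lines. This is a minimal illustration of a UniMax-style budget allocation, not the model's actual training code. The function name `unimax_distribution`, its parameters, and the toy character counts are all hypothetical. The idea is to visit languages from smallest to largest corpus, give each an equal share of the remaining character budget, and clip that share at `max_epochs` passes over the language's data.

```python
def unimax_distribution(char_counts, budget, max_epochs):
    """Return per-language sampling probabilities under a repetition cap.

    char_counts: dict mapping language -> characters available (toy values here)
    budget:      total number of characters to consume during training
    max_epochs:  maximum number of passes allowed over any one language
    """
    remaining = float(budget)
    allocation = {}
    # Smallest corpora first, so budget they cannot absorb flows to larger ones.
    langs = sorted(char_counts, key=char_counts.get)
    for i, lang in enumerate(langs):
        # Equal split of what is left among the languages not yet processed.
        share = remaining / (len(langs) - i)
        # A language may not be repeated more than max_epochs times.
        cap = char_counts[lang] * max_epochs
        allocation[lang] = min(share, cap)
        remaining -= allocation[lang]
    # Normalize the character allocations into sampling probabilities.
    total = sum(allocation.values())
    return {lang: chars / total for lang, chars in allocation.items()}


# Toy example: one tiny, one medium, and one large corpus.
probs = unimax_distribution({"a": 10, "b": 100, "c": 1000}, budget=300, max_epochs=2)
```

In this toy run the smallest language hits its cap of 20 characters, and the two larger languages split the rest of the budget evenly.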
Model Capabilities
Multilingual Text Generation
Cross-Lingual Transfer Learning
Text Summarization
Machine Translation
Use Cases
Natural Language Processing
Multilingual Machine Translation
Enables translation tasks for low-resource languages through fine-tuning
Cross-Lingual Text Summarization
Supports summary generation in multiple languages after fine-tuning