Xlrs 53 Finnish
XLSR-Wav2Vec2 is a multilingual speech recognition model that learns shared speech representations through cross-lingual pretraining on speech in 53 languages.
Downloads 32
Release Time: 3/2/2022
Model Overview
Based on the wav2vec 2.0 architecture, this model is pretrained on raw multilingual speech waveforms to learn cross-lingual shared speech representations, suitable for downstream tasks such as automatic speech recognition.
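To make "raw speech waveforms" concrete, the sketch below estimates how many latent frames a wav2vec 2.0 style convolutional encoder produces for a given number of audio samples. The kernel sizes and strides are the values published for wav2vec 2.0 and are assumed, not confirmed by this page, to apply to this model; the function names are illustrative.

```python
# Kernel sizes and strides of the convolutional feature encoder as published
# for wav2vec 2.0; assumed (not confirmed by this page) to apply to this model.
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)


def num_frames(num_samples: int) -> int:
    """Return how many latent frames the encoder emits for a raw waveform."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        if length < kernel:
            # Input too short for this layer: no frame can be produced.
            return 0
        # Output length of an unpadded 1-D convolution.
        length = (length - kernel) // stride + 1
    return length


def total_stride() -> int:
    """Hop between consecutive frames, measured in input samples."""
    hop = 1
    for stride in CONV_STRIDES:
        hop *= stride
    return hop
```

Under these assumptions, one second of 16 kHz audio (16000 samples) maps to 49 frames, with a hop of 320 samples (20 ms) between frames.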
Model Features
Cross-lingual Pretraining
Pretrained on 53 languages to learn cross-lingual shared speech representations.
Based on wav2vec 2.0
Adopts the wav2vec 2.0 architecture, trained through contrastive tasks on masked latent speech representations.
High Performance
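As reported for the XLSR approach, achieves a 72% relative reduction in phoneme error rate on the CommonVoice benchmark compared with the best previously known results, and a 16% relative reduction in word error rate on the BABEL dataset compared with a comparable system.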
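The contrastive task listed among the features above can be illustrated with a minimal sketch: for one masked time step, the model's context vector should be more similar to the true latent target than to a set of distractors. The code assumes cosine similarity and a temperature of 0.1, as described for wav2vec 2.0; the function names and the toy vectors in the usage note are hypothetical, and this is not the model's actual training code.

```python
import math


def cosine(a, b):
    """Cosine similarity of two non-zero vectors given as sequences of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def contrastive_loss(context, target, distractors, temperature=0.1):
    """Negative log softmax probability of the true target among all candidates."""
    candidates = [target] + list(distractors)
    logits = [cosine(context, c) / temperature for c in candidates]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
    # The true target sits at index 0 of the candidate list.
    return log_norm - logits[0]
```

For example, `contrastive_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])` is small because the context matches the target, while swapping the target with a distractor yields a much larger loss.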
Model Capabilities
Multilingual Speech Recognition
Cross-lingual Speech Representation Learning
Use Cases
Speech Recognition
Multilingual Speech Transcription
Convert speech in multiple languages into text.
The XLSR approach reports large relative error-rate reductions on the CommonVoice and BABEL datasets.
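Since transcription quality on these datasets is reported as error rates and relative reductions, the following sketch shows how such figures are typically computed. It is a generic word error rate calculation based on word-level edit distance, not the evaluation script used for the published results; the function names are illustrative and the numbers in the usage note are made up.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which the error rate dropped relative to the baseline."""
    return (baseline - improved) / baseline
```

As an illustration with invented numbers, a baseline word error rate of 25.0 that drops to 21.0 gives `relative_reduction(25.0, 21.0)`, about 0.16, which is what a "16% relative reduction" means.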
Low-resource Language Support
Speech Recognition for Low-resource Languages
Provides speech recognition capabilities for languages with limited resources.
Cross-lingual pretraining improves recognition performance for low-resource languages compared with monolingual pretraining.