
wav2vec2-xls-r-300m-21-to-en

Developed by facebook
Facebook's Wav2Vec2 XLS-R fine-tuned for speech translation from 21 languages to English
Downloads 464
Release Time: 3/2/2022

Model Overview

This is a speech translation model built on SpeechEncoderDecoderModel that translates speech in 21 source languages into English text. The encoder is initialized from facebook/wav2vec2-xls-r-300m and the decoder from facebook/mbart-large-50; the combined model is fine-tuned on the CoVoST 2 dataset.
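As a minimal sketch of how such a checkpoint is typically loaded, the snippet below wraps the Hugging Face `transformers` pipeline. The Hub identifier `facebook/wav2vec2-xls-r-300m-21-to-en` is inferred from the page title and developer name rather than stated on this page, and the helper names `build_translator` and `translate_file` are hypothetical. Decoding a file path through the pipeline also assumes ffmpeg is installed.

```python
# Assumed Hub id, inferred from the page title; verify it before relying on it.
MODEL_ID = "facebook/wav2vec2-xls-r-300m-21-to-en"


def build_translator(model_id: str = MODEL_ID):
    """Return a pipeline that maps speech audio to English text."""
    # Imported lazily so this module loads even where transformers is missing.
    from transformers import pipeline

    # Speech encoder-decoder checkpoints are served through the
    # "automatic-speech-recognition" task, even when they translate.
    return pipeline("automatic-speech-recognition", model=model_id)


def translate_file(path: str, translator=None) -> str:
    """Translate one audio file.

    `translator` is any callable that returns a dict with a "text" key;
    when omitted, the (large) model is downloaded and loaded.
    """
    translator = translator or build_translator()
    return translator(path)["text"]
```

Calling `translate_file("talk.wav")` would download the weights on first use; passing a prebuilt `translator` avoids reloading the model for every file.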

Model Features

Multilingual support
Supports speech translation from 21 languages to English
XLS-R architecture
Uses the Wav2Vec2 XLS-R 300M model as the encoder
End-to-end translation
Directly generates English text output from speech input without intermediate transcription steps
High-quality translation
Fine-tuned and evaluated on the CoVoST 2 dataset, with the strongest results on the more widely spoken source languages
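Because the encoder consumes raw waveforms directly, input audio has to be in the form the encoder was trained on. Wav2Vec2 XLS-R checkpoints expect 16 kHz mono audio, so a small preparation step is usually needed. The sketch below uses only the Python standard library and handles 16-bit PCM WAV files; the helper names are hypothetical, and resampling itself is left to a dedicated audio library.

```python
import struct
import wave

TARGET_RATE = 16_000  # sampling rate expected by Wav2Vec2 XLS-R encoders


def pcm16_to_mono_float(raw: bytes, channels: int) -> list:
    """Convert interleaved little-endian 16-bit PCM to mono floats in [-1, 1)."""
    count = len(raw) // 2
    ints = struct.unpack("<%dh" % count, raw[: count * 2])
    # Average the channels of each complete frame and scale to float.
    return [
        sum(ints[i : i + channels]) / (channels * 32768.0)
        for i in range(0, count - channels + 1, channels)
    ]


def read_wav(path: str):
    """Read a 16-bit PCM WAV file and return (mono_samples, sample_rate)."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV")
        raw = wf.readframes(wf.getnframes())
        return pcm16_to_mono_float(raw, wf.getnchannels()), wf.getframerate()


def needs_resampling(sample_rate: int) -> bool:
    """True when the audio must be resampled before it reaches the model."""
    return sample_rate != TARGET_RATE
```

If `needs_resampling` returns True, the samples should be resampled to 16 kHz before being passed to the feature extractor.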

Model Capabilities

Speech translation
Multilingual processing
Automatic speech recognition
End-to-end speech-to-text

Use Cases

Speech translation services
Real-time speech translation
Translates foreign languages in meetings or conversations into English in real time
High-quality translation output supporting multiple languages
Multimedia content translation
Translates speech in podcasts, videos and other multimedia content
Accurately captures speech content and converts it into English text
Assistive technology
Language learning assistance
Helps language learners understand foreign language content
Provides accurate translation references
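For long recordings such as podcasts or videos, one common approach is to split the waveform into fixed-length windows and translate each window separately. The sketch below shows only the splitting step; the function name and the 20-second window with 2-second overlap are illustrative assumptions, not values documented for this model.

```python
def split_into_chunks(samples, sample_rate=16_000, chunk_s=20.0, overlap_s=2.0):
    """Split a sample sequence into overlapping fixed-length windows.

    chunk_s and overlap_s are illustrative defaults, not tuned values.
    """
    if chunk_s <= 0 or not 0 <= overlap_s < chunk_s:
        raise ValueError("require chunk_s > 0 and 0 <= overlap_s < chunk_s")
    size = int(chunk_s * sample_rate)
    step = size - int(overlap_s * sample_rate)
    chunks = []
    start = 0
    while start < len(samples):
        chunks.append(samples[start : start + size])
        if start + size >= len(samples):
            break  # the last window already reaches the end of the audio
        start += step
    return chunks
```

Each chunk can then be passed to the model independently. Overlapping windows yield overlapping translations, so joining the per-chunk text still needs some care, for example by cutting at silences instead of fixed offsets.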