Wav2Vec2 Base 100k VoxPopuli
A self-supervised speech base model pretrained on 100,000 hours of unlabeled audio from the VoxPopuli corpus
Release date: 3/2/2022
Model Overview
Facebook's Wav2Vec2 base model for multilingual speech tasks. The checkpoint contains only the pretrained encoder; to use it for speech recognition, add a language-specific tokenizer and fine-tune on labeled data.
Model Features
Multilingual support
Pretrained on the multilingual VoxPopuli corpus, which covers speech from multiple European languages
Self-supervised pretraining
Learns from 100,000 hours of unlabeled speech, with no transcriptions required
Fine-tunable architecture
Can be adapted to specific language recognition tasks by adding tokenizers and fine-tuning on labeled data
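The adaptation described above is typically done by attaching a CTC head whose output size matches the tokenizer's vocabulary. Below is a minimal PyTorch sketch of that step; the encoder output is stubbed with random features, and all sizes (vocabulary, sequence lengths) are illustrative assumptions, not values from this model card.

```python
import torch
import torch.nn as nn

# Illustrative sketch: a CTC head on top of frame-level encoder output,
# as done when fine-tuning Wav2Vec2 for a specific language. The encoder
# is stubbed with random features; vocab_size would come from the
# language-specific tokenizer you add (all sizes here are assumptions).
hidden_dim = 768    # Wav2Vec2 base hidden size
vocab_size = 32     # e.g. target-language characters + CTC blank

ctc_head = nn.Linear(hidden_dim, vocab_size)
ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True)

features = torch.randn(2, 100, hidden_dim)          # (batch, frames, hidden)
logits = ctc_head(features)                          # (batch, frames, vocab)
log_probs = logits.log_softmax(-1).transpose(0, 1)   # CTCLoss expects (T, N, C)

targets = torch.randint(1, vocab_size, (2, 20))      # dummy label sequences
input_lengths = torch.full((2,), 100, dtype=torch.long)
target_lengths = torch.full((2,), 20, dtype=torch.long)

loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
```

During real fine-tuning, `features` would be the encoder's hidden states and the loss would be backpropagated through both the head and (usually) the encoder.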
Model Capabilities
Speech feature extraction
Multilingual speech recognition (requires fine-tuning)
Speech representation learning
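For the feature-extraction capability listed above, the pretrained encoder can be loaded directly from the Hugging Face Hub. A minimal sketch assuming the `transformers` and `torch` packages are installed; the random one-second waveform stands in for real 16 kHz mono audio.

```python
import torch
from transformers import Wav2Vec2Model

# Load the pretrained encoder only (no ASR head is included in this
# checkpoint). Real input should be 16 kHz mono audio, normalized.
model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base-100k-voxpopuli")
model.eval()

waveform = torch.randn(1, 16000)  # stand-in for 1 s of 16 kHz audio
with torch.no_grad():
    hidden = model(waveform).last_hidden_state  # (batch, frames, 768)
```

The frame-level `hidden` tensor is what downstream tasks or a fine-tuning head would consume.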
Use Cases
Speech technology
Multilingual speech recognition system
Build language-specific speech-to-text systems by fine-tuning the model
Recognition accuracy depends on the amount and quality of fine-tuning data and on the training configuration
Speech representation learning
Extract speech features for downstream tasks like speaker recognition or emotion analysis
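One common way to use such extracted features for tasks like speaker comparison is to mean-pool the frame vectors into a fixed-size utterance embedding and compare embeddings by cosine similarity. A sketch with random tensors standing in for real encoder output:

```python
import torch
import torch.nn.functional as F

# Sketch: frame-level features pooled into fixed-size utterance embeddings,
# then compared by cosine similarity (e.g. for speaker verification).
# Random tensors stand in for real Wav2Vec2 encoder output.
feats_a = torch.randn(250, 768)   # utterance A: (frames, hidden)
feats_b = torch.randn(180, 768)   # utterance B: a different length is fine

emb_a = feats_a.mean(dim=0)       # (768,) utterance embedding
emb_b = feats_b.mean(dim=0)

score = F.cosine_similarity(emb_a, emb_b, dim=0)  # scalar in [-1, 1]
```

In practice the embeddings would come from the pretrained encoder, and a threshold or a small classifier on top of `score`-like features would make the final decision.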