Wav2vec2 Base 10k Voxpopuli
A foundational speech recognition model pretrained on 10,000 hours of unlabeled data from the VoxPopuli corpus, supporting multilingual speech processing
Downloads 2,504
Release Time : 3/2/2022
Model Overview
Facebook's Wav2Vec2 foundational speech recognition model that extracts speech features from raw audio through self-supervised learning, suitable for multilingual automatic speech recognition tasks
Model Features
Multilingual support
Trained on the multilingual VoxPopuli corpus, supporting speech recognition in multiple languages
Self-supervised pretraining
Utilizes 10,000 hours of unlabeled speech data for self-supervised learning, effectively capturing speech features
Fine-tunable architecture
Provides a foundational model architecture that can be fine-tuned for specific languages or domains
Model Capabilities
Automatic speech recognition
Speech feature extraction
Multilingual speech processing
Use Cases
Speech-to-text
Automated meeting minutes
Automatically convert meeting recordings into text transcripts
Subtitle generation
Automatically generate subtitles for video content
Speech analysis
Speech content analysis
Extract key information from speech data for analysis
Featured Recommended AI Models
Š 2025AIbase