Wav2vec2 Base 100k Voxpopuli

Developed by Facebook
A base speech model pretrained on 100,000 hours of unannotated audio from the multilingual VoxPopuli corpus
Downloads 148
Release Time: 3/2/2022

Model Overview

Facebook's Wav2Vec2 base model for multilingual speech tasks. It ships without a tokenizer, so it must be fine-tuned with a tokenizer and labeled data before it can transcribe speech

Model Features

Multilingual support
Pretrained on the multilingual VoxPopuli corpus, enabling processing of speech in multiple languages
Unsupervised pretraining
Uses 100,000 hours of unlabeled speech data for self-supervised learning
Fine-tunable architecture
Can be adapted to specific language recognition tasks by adding tokenizers and fine-tuning on labeled data
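The unsupervised pretraining above follows wav2vec 2.0's contrastive objective: at a masked time step, the model must identify the true quantized latent among sampled distractors by cosine similarity. A toy numpy sketch of that loss (illustrative vectors and a hypothetical `temperature` value, not the real training code):

```python
import numpy as np

def contrastive_loss(context, true_latent, distractors, temperature=0.1):
    # Score the true latent (index 0) and distractors against the
    # context vector by cosine similarity, then take a softmax
    # cross-entropy with the true latent as the target.
    candidates = np.vstack([true_latent, distractors])
    sims = candidates @ context / (
        np.linalg.norm(candidates, axis=1) * np.linalg.norm(context) + 1e-9
    )
    logits = sims / temperature
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])

# Toy example: the true latent is close to the context, distractors are random.
rng = np.random.default_rng(0)
ctx = rng.standard_normal(8)
loss = contrastive_loss(ctx, ctx + 0.01 * rng.standard_normal(8),
                        rng.standard_normal((5, 8)))
print(float(loss))
```

Minimizing this loss pushes the context representation toward its own quantized latent and away from distractors, which is what lets the model learn from unlabeled audio alone.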

Model Capabilities

Speech feature extraction
Multilingual speech recognition (requires fine-tuning)
Speech representation learning
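For feature extraction, the model's convolutional front end maps raw 16 kHz samples to latent frames roughly every 20 ms. A small sketch of that frame-count arithmetic, using the standard wav2vec 2.0 base kernel/stride configuration from the paper:

```python
# (kernel, stride) pairs of the wav2vec 2.0 base conv feature encoder.
CONV_LAYERS = [(10, 5), (3, 2), (3, 2), (3, 2), (3, 2), (2, 2), (2, 2)]

def num_latent_frames(num_samples: int) -> int:
    # Apply each conv layer's output-length formula in sequence
    # (no padding): out = (in - kernel) // stride + 1.
    n = num_samples
    for kernel, stride in CONV_LAYERS:
        n = (n - kernel) // stride + 1
    return n

print(num_latent_frames(16_000))  # 1 s of 16 kHz audio -> 49 latent frames
```

This is why downstream tasks see roughly 49 feature vectors per second of audio; the total stride across the stack is 320 samples.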

Use Cases

Speech technology
Multilingual speech recognition system
Build language-specific speech-to-text systems by fine-tuning the model
Accuracy depends on fine-tuning data and training configuration
Speech representation learning
Extract speech features for downstream tasks like speaker recognition or emotion analysis
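After fine-tuning for speech-to-text, the model emits per-frame label logits that are typically collapsed into text with greedy CTC decoding. A minimal sketch, assuming blank id 0 and a toy two-character vocabulary (both are illustrative choices, not part of this model card):

```python
def ctc_greedy_decode(frame_ids, blank=0):
    # Collapse consecutive repeats, then drop blanks: the standard
    # greedy CTC decoding rule.
    out, prev = [], None
    for i in frame_ids:
        if i != blank and i != prev:
            out.append(i)
        prev = i
    return out

vocab = {1: "h", 2: "i"}          # hypothetical label mapping
ids = [1, 1, 0, 2, 2, 0, 0]       # per-frame argmax ids from the model
print("".join(vocab[i] for i in ctc_greedy_decode(ids)))  # -> "hi"
```

Transcription accuracy therefore depends not only on the fine-tuning data mentioned above but also on the decoding strategy; beam search with a language model usually outperforms this greedy rule.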