Wav2vec2 Base En Voxpopuli V2
A Wav2Vec2 base model pre-trained on 24.1k hours of unlabeled English speech from the VoxPopuli corpus. It must be fine-tuned on labeled data before it can be used for speech recognition.
Downloads 35
Release Time: 3/2/2022
Model Overview
This model is the base version of Facebook's Wav2Vec2, pre-trained only on English speech audio. It is intended as a starting point for fine-tuning on downstream tasks, primarily Automatic Speech Recognition (ASR).
Model Features
Based on VoxPopuli Corpus
Pre-trained only on English: 24.1k hours of unlabeled audio from the VoxPopuli corpus.
16kHz Sampling Rate
The model is pre-trained on speech audio sampled at 16kHz; ensure input audio has the same sampling rate.
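As a minimal sketch of the sampling-rate check described above, the following standard-library Python reads the rate of a WAV file and resamples a mono signal to 16kHz by linear interpolation. The helper names (wav_sample_rate, resample_linear) are hypothetical and not part of the model or any library. Linear interpolation applies no anti-aliasing filter, so for real audio a dedicated resampler from an audio library is likely the better choice; this only illustrates the idea.

```python
import wave

TARGET_RATE = 16_000  # sampling rate the model was pre-trained on


def wav_sample_rate(path):
    """Return the sampling rate (Hz) stored in a WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def resample_linear(samples, source_rate, target_rate=TARGET_RATE):
    """Resample a mono sequence of floats by linear interpolation.

    Hypothetical helper for illustration only: it does no low-pass
    filtering, so downsampling may introduce aliasing.
    """
    if source_rate <= 0 or target_rate <= 0:
        raise ValueError("sampling rates must be positive")
    if source_rate == target_rate or len(samples) < 2:
        return list(samples)

    n_out = max(1, round(len(samples) * target_rate / source_rate))
    step = source_rate / target_rate  # input samples per output sample
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        left = min(int(pos), last)
        right = min(left + 1, last)
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


# One second of silence at 48kHz becomes one second at 16kHz.
one_second = resample_linear([0.0] * 48_000, source_rate=48_000)
```

Audio that is already at 16kHz is returned unchanged, so the function can be applied unconditionally before passing samples to a feature extractor.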
No Tokenizer
This model is pre-trained solely on audio and does not include a tokenizer. To use it for speech recognition, a tokenizer must be created and the model fine-tuned on labeled (transcribed) speech data.
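One common way to create such a tokenizer is to derive a character-level vocabulary from the transcripts of the labeled data. The sketch below assumes that approach; the transcripts, the function name build_char_vocab, and the special-token names "|", "[UNK]" and "[PAD]" are illustrative choices, not something the model prescribes.

```python
import json


def build_char_vocab(transcripts):
    """Build a character-to-id mapping from labeled transcripts.

    Assumptions for this sketch: text is lower-cased, the space is
    replaced by "|" as a visible word delimiter, and "[UNK]" and
    "[PAD]" are appended as special tokens.
    """
    chars = {ch for text in transcripts for ch in text.lower()}
    chars.discard(" ")
    chars.add("|")  # word delimiter stands in for the space

    vocab = {ch: idx for idx, ch in enumerate(sorted(chars))}
    vocab["[UNK]"] = len(vocab)  # unknown character
    vocab["[PAD]"] = len(vocab)  # padding token
    return vocab


# Hypothetical transcripts standing in for a labeled dataset.
transcripts = ["hello world", "speech to text"]
vocab = build_char_vocab(transcripts)

# Serialized form, e.g. for saving as a vocabulary file.
vocab_json = json.dumps(vocab, ensure_ascii=False)
```

A vocabulary like this can be saved to a JSON file and loaded by a CTC tokenizer class such as Wav2Vec2CTCTokenizer in Hugging Face Transformers, after which the model is fine-tuned with a CTC head sized to the vocabulary.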
Model Capabilities
Speech recognition (after fine-tuning)
English speech processing
Use Cases
Speech Recognition
English Speech-to-Text
Once fine-tuned on transcribed speech, the model can convert English speech to text for applications such as voice assistants and transcription services.