Wav2Vec2 Base 10K VoxPopuli FT En
A Wav2Vec2 base model pre-trained on a 10K unlabeled subset of the VoxPopuli corpus and fine-tuned on English transcription data, suitable for English speech recognition tasks.
Release Time: 3/2/2022
Model Overview
This is Facebook's Wav2Vec2 base model, pre-trained on a 10K unlabeled subset of the VoxPopuli corpus and then fine-tuned on English transcription data; it is intended primarily for English automatic speech recognition (ASR).
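A minimal inference sketch using the Transformers pipeline API, assuming the checkpoint is published on the Hugging Face Hub as facebook/wav2vec2-base-10k-voxpopuli-ft-en and that sample.wav is a 16 kHz English recording you supply yourself (both the model ID and the file path are assumptions here):

```python
from transformers import pipeline

# Load the checkpoint as an automatic-speech-recognition pipeline.
# "sample.wav" is a placeholder path for an English recording.
asr = pipeline(
    "automatic-speech-recognition",
    model="facebook/wav2vec2-base-10k-voxpopuli-ft-en",
)

# The pipeline decodes the audio file, runs the model, and returns the transcript.
print(asr("sample.wav")["text"])
```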
Model Features
VoxPopuli Pre-training
Pre-trained on a 10K unlabeled subset of the large-scale multilingual VoxPopuli speech corpus
English Transcription Fine-tuning
Fine-tuned on English transcription data to optimize English speech recognition performance
End-to-End Speech Recognition
Maps raw audio waveforms directly to text output, with no hand-crafted intermediate features such as MFCCs (see the inference sketch below)
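To illustrate the end-to-end path, here is a sketch of manual inference with Wav2Vec2Processor and Wav2Vec2ForCTC: raw waveform in, greedy CTC-decoded text out. The Hub model ID and the example.wav path are assumptions, and torchaudio is used only to load and resample the local file.

```python
import torch
import torchaudio
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

model_id = "facebook/wav2vec2-base-10k-voxpopuli-ft-en"  # assumed Hub ID
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# Load a local recording (placeholder path), downmix to mono, and resample
# to the 16 kHz rate the model expects.
waveform, sample_rate = torchaudio.load("example.wav")
speech = torchaudio.functional.resample(waveform.mean(dim=0), sample_rate, 16_000)

# The processor only normalizes the raw waveform; no hand-crafted features.
inputs = processor(speech.numpy(), sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits  # (batch, frames, vocab)

# Greedy CTC decoding: take the most likely token per frame, then collapse
# repeats and blanks into the final character sequence.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```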
Model Capabilities
English speech recognition
Audio transcription
Automatic speech-to-text
Use Cases
Speech Transcription
Meeting Minutes
Automatically transcribe English meeting recordings into text records
Podcast Transcription
Convert English podcast episodes into searchable text (a long-form transcription sketch follows this list)
Assistive Technology
Speech-to-Text Tool
Provide real-time speech-to-text services for the hearing impaired
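For long recordings such as the meeting and podcast scenarios above, the pipeline can split the audio into overlapping chunks so memory use stays bounded. A sketch under the same assumptions (Hub model ID; episode.mp3 is a placeholder path):

```python
from transformers import pipeline

# Chunk the audio into 30-second windows with 5-second overlapping strides
# so words at chunk boundaries are not cut off.
asr = pipeline(
    "automatic-speech-recognition",
    model="facebook/wav2vec2-base-10k-voxpopuli-ft-en",
    chunk_length_s=30,
    stride_length_s=(5, 5),
)

print(asr("episode.mp3")["text"])
```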