Wav2vec2 Base En Voxpopuli V2

Developed by facebook
A Wav2Vec2 base model pre-trained on 24.1k hours of unlabeled English data from the VoxPopuli corpus. It is intended for speech recognition after fine-tuning on labeled data.
Downloads 35
Release Time: 3/2/2022

Model Overview

This model is the base version of Facebook's Wav2Vec2, pre-trained on English speech data. It is intended primarily as a starting point for Automatic Speech Recognition (ASR) and must be fine-tuned before it can transcribe speech.

Model Features

Based on VoxPopuli Corpus
Pre-trained on 24.1k hours of unlabeled English data from the VoxPopuli corpus, so it targets English speech.
16kHz Sampling Rate
The model is pre-trained on speech audio sampled at 16kHz. Input audio must use the same sampling rate; resample anything recorded at a different rate before passing it to the model.
No Tokenizer
This model is pre-trained on audio alone and does not include a tokenizer. To use it for speech recognition, a tokenizer must be created and the model fine-tuned on labeled text data.
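
The two practical requirements above (16kHz input and a tokenizer built from labeled text) can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions, not the authors' procedure: the function names `needs_resampling`, `build_char_vocab`, and `save_vocab` are hypothetical, and the vocabulary layout (space written as "|", plus "[UNK]" and "[PAD]" entries) is a common convention for character-level CTC vocabularies that this page does not prescribe.

```python
import json
import wave

TARGET_SAMPLE_RATE = 16_000  # sampling rate stated in the model description


def needs_resampling(wav_path, target_rate=TARGET_SAMPLE_RATE):
    """Return True if the WAV file's sample rate differs from target_rate."""
    with wave.open(wav_path, "rb") as wav_file:
        return wav_file.getframerate() != target_rate


def build_char_vocab(transcripts):
    """Map every character seen in the transcripts to an integer id.

    Assumed layout: lowercase characters in sorted order, the space
    replaced by "|", then "[UNK]" and "[PAD]" appended at the end.
    """
    chars = sorted({ch for text in transcripts for ch in text.lower()})
    vocab = {ch: idx for idx, ch in enumerate(chars)}
    if " " in vocab:
        # Use a visible word delimiter in place of the space character.
        vocab["|"] = vocab.pop(" ")
    vocab["[UNK]"] = len(vocab)
    vocab["[PAD]"] = len(vocab)
    return vocab


def save_vocab(vocab, path):
    """Write the vocabulary to a JSON file for a tokenizer to load later."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(vocab, handle, ensure_ascii=False, indent=2)
```

A file flagged by `needs_resampling` would have to be resampled to 16kHz with an audio library of your choice before use, and the saved JSON file is the kind of vocabulary a character-level CTC tokenizer is typically built from.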

Model Capabilities

Speech recognition
English speech processing

Use Cases

Speech Recognition
English Speech-to-Text
Once fine-tuned, the model converts English speech to text, for applications such as voice assistants and transcription services.
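
To illustrate how a fine-tuned model's output could become text, the sketch below shows greedy CTC decoding, the decoding scheme commonly paired with Wav2Vec2 fine-tuning for speech recognition. It assumes the model has already produced one predicted token id per audio frame; the function name `ctc_greedy_decode`, the example id-to-character table, and the choice of blank id are hypothetical and are not taken from this page.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, id_to_char, blank_id, word_delimiter="|"):
    """Turn per-frame token ids into a string using greedy CTC rules.

    Step 1: merge runs of identical consecutive ids.
    Step 2: drop the blank id.
    Step 3: map ids to characters and turn the word delimiter into spaces.
    """
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    chars = [id_to_char[i] for i in collapsed if i != blank_id]
    return "".join(chars).replace(word_delimiter, " ").strip()


# Hypothetical vocabulary: id 0 acts as the CTC blank.
example_id_to_char = {0: "[PAD]", 1: "h", 2: "i", 3: "|"}
example_frames = [1, 1, 0, 2, 2, 3, 3, 0, 1, 2]
print(ctc_greedy_decode(example_frames, example_id_to_char, blank_id=0))
# prints "hi hi"
```

The blank id is what allows a repeated letter to survive: the frames `[1, 0, 1]` decode to "hh", while `[1, 1]` decodes to a single "h".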