wav2vec2-base-en-voxpopuli-v2 Open-source Speech Recognition Model - Suitable for English Speech Recognition Tasks

Wav2vec2 Base En Voxpopuli V2

Developed by facebook

A Wav2Vec2 base model pre-trained on 24.1k hours of unlabeled English data from the VoxPopuli corpus, intended as a starting point for fine-tuning on speech recognition tasks.

Speech Recognition

Transformers

English #English speech pre-training #Unsupervised learning #16kHz audio processing

Downloads 35

Release Time: 3/2/2022

Model Overview

This model is the base version of Facebook's Wav2Vec2, pre-trained only on unlabeled English speech data. It is primarily meant to be fine-tuned for Automatic Speech Recognition (ASR) tasks.

Model Features

Based on VoxPopuli Corpus

Pre-trained on 24.1k hours of unlabeled English data from the VoxPopuli corpus, focusing on English speech recognition.

16kHz Sampling Rate

The model is pre-trained on speech audio sampled at 16kHz; ensure input audio has the same sampling rate.
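
The sampling-rate requirement above can be illustrated with a minimal sketch. The helper name `resample_linear` is hypothetical and not part of any library. It uses plain linear interpolation with no anti-aliasing filter, so it only shows the idea; a real pipeline would more likely rely on a dedicated audio library for resampling.

```python
def resample_linear(samples, src_rate, dst_rate=16_000):
    """Resample a mono signal to dst_rate by linear interpolation.

    Illustrative only: no low-pass filtering is applied, so downsampling
    with this function can introduce aliasing.
    """
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)

    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # distance between output samples, in input samples
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), last)   # nearest input sample at or before pos
        hi = min(lo + 1, last)     # next input sample, clamped at the end
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


# One second of silence recorded at 48 kHz, converted to the 16 kHz the model expects.
audio_48k = [0.0] * 48_000
audio_16k = resample_linear(audio_48k, src_rate=48_000)
```

Checking the source rate before feeding audio to the model, and converting only when it differs from 16 kHz, avoids silently degrading recognition quality.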

No Tokenizer

This model is pre-trained solely on audio and does not include a tokenizer. To use it for speech recognition, a tokenizer must be created and the model fine-tuned on labeled text data.
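
As a rough sketch of the first step, a character-level vocabulary can be built from the transcripts of a labeled dataset. The function name `build_ctc_vocab`, the sample transcripts, and the special tokens `|`, `[UNK]`, and `[PAD]` are assumptions following a common convention for CTC fine-tuning; they are not specified by this model card.

```python
import json


def build_ctc_vocab(transcripts):
    """Map every character seen in the transcripts to an integer id.

    Assumed conventions: text is lowercased, the space character is
    replaced by "|" as the word delimiter, and "[UNK]" / "[PAD]" are
    appended as special tokens.
    """
    chars = sorted({ch for text in transcripts for ch in text.lower()})
    vocab = {ch: idx for idx, ch in enumerate(chars)}

    # Use a visible word delimiter instead of a raw space.
    if " " in vocab:
        vocab["|"] = vocab.pop(" ")

    vocab["[UNK]"] = len(vocab)
    vocab["[PAD]"] = len(vocab)
    return vocab


# Hypothetical transcripts standing in for a labeled training set.
vocab = build_ctc_vocab(["hello world", "speech to text"])
vocab_json = json.dumps(vocab, indent=2)  # could be saved as vocab.json
```

A vocabulary file of this shape is the kind of input that a CTC tokenizer such as `Wav2Vec2CTCTokenizer` in the Transformers library is typically constructed from; the model itself would still need to be fine-tuned on the labeled data before it can transcribe speech.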

Model Capabilities

Speech recognition

English speech processing

Use Cases

Speech Recognition

English Speech-to-Text

After fine-tuning, the model can convert English speech to text for applications such as voice assistants and transcription services.

Property	Details
Model Type	Wav2Vec2 base model
Training Data	24.1k hours of unlabeled English data from the VoxPopuli corpus
License	cc-by-nc-4.0
Inference	Not available
Tags	audio, automatic-speech-recognition, voxpopuli-v2
