wav2vec2-base-960h-4-gram: Open-Source Speech Recognition Model for Improved English Recognition Accuracy

Wav2vec2 Base 960h 4 Gram

Developed by patrickvonplaten

Based on Facebook's Wav2Vec2-Base-960h model, with an added English 4-gram language model to improve automatic speech recognition (ASR) accuracy.

Speech Recognition

Transformers

English

Open Source License: Apache-2.0

#High-precision speech recognition #English speech transcription #Low word error rate

Downloads 19

Release Date: 4/12/2022

Model Overview

This model is a variant of Wav2Vec2, specifically designed for English automatic speech recognition tasks, with improved recognition accuracy through integration of a 4-gram language model.
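To illustrate why pairing an acoustic model with a language model can improve accuracy, here is a minimal sketch of hypothesis rescoring. Everything in it is an assumption made for illustration: the toy bigram table, the acoustic scores, the `alpha` weight, and the `lm_score` / `rescore` helpers are hypothetical. The real model uses a 4-gram language model, and this page does not describe how its decoder combines the scores.

```python
import math

# Hypothetical log-probabilities for a toy bigram LM
# (illustration only; the actual model uses a 4-gram LM).
TOY_BIGRAM_LOGP = {
    ("<s>", "recognize"): math.log(0.4),
    ("recognize", "speech"): math.log(0.5),
    ("<s>", "wreck"): math.log(0.05),
    ("wreck", "a"): math.log(0.2),
    ("a", "nice"): math.log(0.1),
    ("nice", "beach"): math.log(0.1),
}
UNSEEN_LOGP = math.log(1e-4)  # crude floor for unseen bigrams (assumption)


def lm_score(words):
    """Sum bigram log-probabilities over a list of words."""
    history = ["<s>"] + words[:-1]
    return sum(TOY_BIGRAM_LOGP.get((h, w), UNSEEN_LOGP) for h, w in zip(history, words))


def rescore(hypotheses, alpha=0.5):
    """Return the (text, acoustic_logp) pair maximizing acoustic + alpha * LM score."""
    return max(hypotheses, key=lambda h: h[1] + alpha * lm_score(h[0].split()))


# Made-up acoustic scores: the acoustically stronger candidate is less plausible English.
candidates = [("wreck a nice beach", -4.0), ("recognize speech", -4.5)]
best_text, _ = rescore(candidates)
print(best_text)
```

With `alpha=0.0` the language model is ignored and the acoustically stronger candidate wins; a positive weight lets the more plausible word sequence overtake it.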

Model Features

Integrated 4-gram language model

Uses the official Librispeech ngrams 4-gram.arpa.gz file to improve speech recognition accuracy.

Based on Wav2Vec2 architecture

Utilizes Facebook's Wav2Vec2-Base-960h model as the foundation, featuring robust speech feature extraction capabilities.
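A rough sketch of how a model like this is typically loaded and run with the Transformers library follows. The repository ID is an assumption pieced together from the developer and model names above, and `transcribe` is a hypothetical helper. It also assumes that `torch` and `transformers` are installed (LM-boosted decoding likely needs `pyctcdecode` and `kenlm` as well) and that the input is mono audio sampled at 16 kHz.

```python
# Assumed Hub repository ID (developer name + model name); verify before use.
MODEL_ID = "patrickvonplaten/wav2vec2-base-960h-4-gram"


def transcribe(waveform, sampling_rate=16_000, model_id=MODEL_ID):
    """Transcribe a 1-D float waveform (mono, 16 kHz assumed) to English text."""
    # Imported lazily so this module loads even without the heavy dependencies.
    import torch
    from transformers import AutoModelForCTC, AutoProcessor

    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModelForCTC.from_pretrained(model_id)

    inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    # An LM-enabled processor decodes from the raw logits rather than argmax IDs
    # and returns an object whose `text` field holds the transcriptions.
    return processor.batch_decode(logits.numpy()).text[0]
```

Calling `transcribe(audio_array)` downloads the model on first use, so in practice the processor and model would be loaded once and reused across calls.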

Model Capabilities

English speech recognition

High-accuracy speech-to-text

Use Cases

Speech transcription

Audio content transcription

Automatically converts English speech content into text

Achieves a word error rate (WER) of 2.59% on the LibriSpeech "clean" test set and 6.46% on the "other" test set

Voice assistants

Voice command recognition

Used for command recognition in voice assistant systems

Property	Details
Model Type	Audio model for automatic speech recognition
Training Data	librispeech_asr
License	apache-2.0

Result (WER, %)

"clean"	"other"
2.59	6.46
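For readers unfamiliar with the metric, WER is the word-level edit distance between the reference and the predicted transcription, divided by the number of reference words. The sketch below is a minimal pure-Python version for illustration. `word_error_rate` is a hypothetical helper, not the evaluation script behind the numbers above, and it applies no text normalization.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
score = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"{100 * score:.2f}%")
```

A reported WER of 2.59 therefore corresponds to roughly 2.6 word errors per 100 reference words.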
