English ASR: Open-Source English Automatic Speech Recognition Model

Developed by maher13

This model is a fine-tuned English Automatic Speech Recognition (ASR) model based on facebook/wav2vec2-base, achieving a word error rate of 0.3397 on the evaluation set.

Task: Speech Recognition

Library: Transformers

License: Apache-2.0

Tags: English Speech Recognition, Wav2Vec2 Fine-tuning, Low Word Error Rate

Downloads: 13

Release Date: 3/2/2022

Model Overview

The model performs English speech recognition: it takes English speech audio as input and outputs the corresponding text.
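As a rough illustration of this speech-to-text flow, the sketch below loads the model through the Transformers `pipeline` API. The Hub identifier `maher13/English_ASR` is an assumption pieced together from the developer and model names on this page; the page does not state it. wav2vec2-base checkpoints expect 16 kHz audio. When given a file path, the pipeline decodes and resamples the audio itself, which requires ffmpeg to be installed.

```python
# Assumed Hub identifier (developer name + model name); not confirmed by this page.
MODEL_ID = "maher13/English_ASR"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return the transcript of one English audio file.

    transformers is imported lazily so that this module can be imported
    without the library (or network access) being available.
    """
    from transformers import pipeline

    # Downloads the checkpoint on first use and builds an ASR pipeline.
    asr = pipeline("automatic-speech-recognition", model=model_id)

    # For a file path input, the pipeline decodes the audio with ffmpeg and
    # resamples it to the rate the model's feature extractor expects.
    result = asr(audio_path)
    return result["text"]
```

Calling `transcribe("meeting.wav")`, where `meeting.wav` is a hypothetical recording, would return the transcript as a single string.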

Model Features

Low Word Error Rate

Reaches a word error rate (WER) of 0.3397 on the evaluation set at the last logged checkpoint (epoch 28, step 3500), down from 0.7767 at step 500.

Based on wav2vec2 Architecture

Fine-tuned from facebook/wav2vec2-base, so it builds on that model's pretrained speech representations.

Efficient Training

Trained with mixed precision (native AMP) and a linear learning rate scheduler.
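To make the linear scheduler concrete, here is a minimal pure-Python sketch of a linear schedule with optional warmup: the rate rises linearly during warmup, then decays linearly to zero. The base learning rate, warmup length, and total step count used below are hypothetical values chosen for illustration; this page does not list the actual hyperparameters.

```python
def linear_lr(step: int, base_lr: float, total_steps: int, warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: ramp from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical settings, for illustration only.
BASE_LR = 1e-4
TOTAL_STEPS = 1000
WARMUP_STEPS = 100

for step in (0, 50, 100, 550, 1000):
    print(step, linear_lr(step, BASE_LR, TOTAL_STEPS, WARMUP_STEPS))
```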

Model Capabilities

English Speech Recognition

Speech-to-Text

Use Cases

Speech Transcription

Meeting Minutes

Automatically convert English meeting recordings into written transcripts

Approximately 66.03% word-level accuracy, estimated as 1 - WER (1 - 0.3397) on the evaluation set

Voice Notes

Convert English voice notes into searchable text

Assistive Tools

Subtitle Generation

Automatically generate subtitles for English video content
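The accuracy figure quoted under Meeting Minutes is derived from the word error rate. As a sketch of how that metric is commonly computed, the function below takes the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a hypothesis, divided by the number of reference words. The example sentences are made up for illustration, and this is not necessarily the exact evaluation code used for this model.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (minus the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one deleted word and one substituted word out of six.
reference = "please send the meeting notes today"
hypothesis = "please send meeting note today"
wer = word_error_rate(reference, hypothesis)
print(f"WER = {wer:.4f}, 1 - WER = {1 - wer:.4f}")
```

Because insertions count as errors, WER can exceed 1 (for example, a one-word reference against a three-word hypothesis with no matches), so 1 - WER is best read as an approximation of accuracy rather than a strict percentage of correct words.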

Training Results

Training Loss	Epoch	Step	Validation Loss	WER
3.3432	4.0	500	1.1711	0.7767
0.5691	8.0	1000	0.4613	0.4357
0.2182	12.0	1500	0.4715	0.3853
0.1267	16.0	2000	0.4307	0.3607
0.0846	20.0	2500	0.4971	0.3537
0.0608	24.0	3000	0.4712	0.3419
0.0457	28.0	3500	0.4971	0.3397
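The logged results can be inspected programmatically, for instance to see which checkpoint is best under each metric. The sketch below re-enters the rows from the table by hand (the `EvalRow` type and its field names are introduced here purely for illustration) and picks the row with the lowest WER and the row with the lowest validation loss.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    wer: float


# Values transcribed from the results table.
ROWS = [
    EvalRow(3.3432, 4.0, 500, 1.1711, 0.7767),
    EvalRow(0.5691, 8.0, 1000, 0.4613, 0.4357),
    EvalRow(0.2182, 12.0, 1500, 0.4715, 0.3853),
    EvalRow(0.1267, 16.0, 2000, 0.4307, 0.3607),
    EvalRow(0.0846, 20.0, 2500, 0.4971, 0.3537),
    EvalRow(0.0608, 24.0, 3000, 0.4712, 0.3419),
    EvalRow(0.0457, 28.0, 3500, 0.4971, 0.3397),
]

best_by_wer = min(ROWS, key=lambda row: row.wer)
best_by_loss = min(ROWS, key=lambda row: row.val_loss)

print(f"Lowest WER: {best_by_wer.wer} at step {best_by_wer.step}")
print(f"Lowest validation loss: {best_by_loss.val_loss} at step {best_by_loss.step}")
```

In these logs the two criteria disagree: WER keeps improving through step 3500, while validation loss is lowest at step 2000 and fluctuates afterwards.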
