wav2vec2-final-1-lm-3: Open-Source Speech Recognition Model Fine-Tuned from wav2vec2-base

Wav2vec2 Final 1 Lm 3

Developed by chrisvinsen

A speech recognition model fine-tuned from facebook/wav2vec2-base. It achieves a word error rate (WER) of 0.4499 on the evaluation set, which drops to 0.126 when decoding with a 4-gram language model.

Speech Recognition

Transformers

Open Source License: Apache-2.0

Tags: #Speech Recognition #Low Word Error Rate #4-Gram Optimization

Downloads 16

Release Time: 6/2/2022

Model Overview

This is an automatic speech recognition (ASR) model based on the wav2vec2 architecture. It was fine-tuned on an unspecified dataset and is intended for speech-to-text tasks.
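As a rough illustration of the speech-to-text use described above, the sketch below loads a checkpoint through the Hugging Face Transformers `pipeline` API. The repository id is an assumption pieced together from the developer name and model name on this page, and `transcribe` is a hypothetical helper, not part of any library. It also assumes `transformers`, a PyTorch backend, and `ffmpeg` (for decoding audio files) are installed. This page does not say whether the published checkpoint bundles the 4-gram language model; if it does, decoding with it would additionally need `pyctcdecode` and `kenlm`.

```python
# Assumed repository id, built from "chrisvinsen" and the model name above.
MODEL_ID = "chrisvinsen/wav2vec2-final-1-lm-3"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return the transcript of a single audio file (hypothetical helper)."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    # Downloads the checkpoint on first use and builds an ASR pipeline.
    asr = pipeline("automatic-speech-recognition", model=model_id)

    # For a file path, the pipeline decodes the audio and returns a dict
    # whose "text" entry holds the transcript.
    return asr(audio_path)["text"]
```

A call such as `transcribe("meeting.wav")` would return the transcript as a string, where `meeting.wav` is a placeholder file name. wav2vec2-base was pretrained on 16 kHz audio, so recordings at other sampling rates need resampling, which the pipeline is expected to handle when given a file path.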

Model Features

Low Word Error Rate

Base word error rate of 0.4499 on the evaluation set, reduced to 0.126 when decoding with a 4-gram language model

Based on wav2vec2 Architecture

Uses facebook/wav2vec2-base as the base model for speech feature extraction

Fine-tuning

Trained for 60 epochs; evaluation WER fell from 0.8927 at step 400 to 0.4499 at step 8400 (epoch 57.53)
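Word error rate, the metric quoted in the features above, is conventionally the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch of that standard definition; it is not the evaluation script used for this model, and `word_error_rate` is an illustrative name.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, the ratio can exceed 1 when the hypothesis is much longer than the reference.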

Model Capabilities

Speech Recognition

Audio to Text

Speech Content Analysis

Use Cases

Speech Transcription

Meeting Minutes

Automatically convert meeting recordings into text transcripts

Word accuracy on the evaluation set is approximately 55.01% without a language model (1 - WER, with WER = 0.4499)

Voice Notes

Convert voice memos into searchable text

Word accuracy on the evaluation set reaches approximately 87.4% when decoding with a 4-gram language model (1 - WER, with WER = 0.126)
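The accuracy figures in the use cases above follow from the two reported word error rates by taking 1 - WER. The short calculation below reproduces them and also derives the relative WER reduction from adding the 4-gram language model; the variable names are illustrative.

```python
wer_base = 0.4499  # reported WER without a language model
wer_lm = 0.126     # reported WER with the 4-gram language model

acc_base = 1 - wer_base
acc_lm = 1 - wer_lm
relative_reduction = (wer_base - wer_lm) / wer_base

print(f"word accuracy without LM: {acc_base:.2%}")             # 55.01%
print(f"word accuracy with 4-gram LM: {acc_lm:.2%}")           # 87.40%
print(f"relative WER reduction: {relative_reduction:.1%}")     # 72.0%
```

Treating 1 - WER as accuracy is only a convenient approximation, since WER also counts inserted words.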

Training Results

Training Loss	Epoch	Step	Validation Loss	WER
3.4816	2.74	400	1.0717	0.8927
0.751	5.48	800	0.7155	0.7533
0.517	8.22	1200	0.7039	0.6675
0.3988	10.96	1600	0.5935	0.6149
0.3179	13.7	2000	0.6477	0.5999
0.2755	16.44	2400	0.5549	0.5798
0.2343	19.18	2800	0.6626	0.5798
0.2103	21.92	3200	0.6488	0.5674
0.1877	24.66	3600	0.5874	0.5339
0.1719	27.4	4000	0.6354	0.5389
0.1603	30.14	4400	0.6612	0.5210
0.1401	32.88	4800	0.6676	0.5131
0.1286	35.62	5200	0.6366	0.5075
0.1159	38.36	5600	0.6064	0.4977
0.1084	41.1	6000	0.6530	0.4835
0.0974	43.84	6400	0.6118	0.4853
0.0879	46.58	6800	0.6316	0.4770
0.0815	49.32	7200	0.6125	0.4664
0.0708	52.05	7600	0.6449	0.4683
0.0651	54.79	8000	0.6068	0.4571
0.0555	57.53	8400	0.6305	0.4499
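To make the trend in the training log easier to inspect, the sketch below loads the rows above and finds the checkpoints with the lowest validation loss and the lowest WER. The column names in `COLUMNS` are illustrative labels chosen for this example.

```python
COLUMNS = ("train_loss", "epoch", "step", "val_loss", "wer")

# Rows copied from the training log above.
LOG = """
3.4816 2.74 400 1.0717 0.8927
0.751 5.48 800 0.7155 0.7533
0.517 8.22 1200 0.7039 0.6675
0.3988 10.96 1600 0.5935 0.6149
0.3179 13.7 2000 0.6477 0.5999
0.2755 16.44 2400 0.5549 0.5798
0.2343 19.18 2800 0.6626 0.5798
0.2103 21.92 3200 0.6488 0.5674
0.1877 24.66 3600 0.5874 0.5339
0.1719 27.4 4000 0.6354 0.5389
0.1603 30.14 4400 0.6612 0.5210
0.1401 32.88 4800 0.6676 0.5131
0.1286 35.62 5200 0.6366 0.5075
0.1159 38.36 5600 0.6064 0.4977
0.1084 41.1 6000 0.6530 0.4835
0.0974 43.84 6400 0.6118 0.4853
0.0879 46.58 6800 0.6316 0.4770
0.0815 49.32 7200 0.6125 0.4664
0.0708 52.05 7600 0.6449 0.4683
0.0651 54.79 8000 0.6068 0.4571
0.0555 57.53 8400 0.6305 0.4499
"""

# Parse each non-empty line into a dict keyed by column name.
rows = [
    dict(zip(COLUMNS, (float(value) for value in line.split())))
    for line in LOG.strip().splitlines()
]

best_loss = min(rows, key=lambda r: r["val_loss"])
best_wer = min(rows, key=lambda r: r["wer"])

print(f"lowest validation loss {best_loss['val_loss']} at step {int(best_loss['step'])}")
print(f"lowest WER {best_wer['wer']} at step {int(best_wer['step'])}")
```

On these numbers the validation loss bottoms out at step 2400 (0.5549), while WER keeps improving until the last logged step, 8400 (0.4499), so the two metrics would pick different checkpoints.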
