wav2vec_trained: Open-Source Speech Recognition Model

Wav2vec Trained

Developed by eugenetanjc

A speech recognition model fine-tuned from facebook/wav2vec2-base. It reaches a word error rate (WER) of 0.1042, about 10.4%, on the evaluation set.

Downloads 70

Release Time: 6/25/2022

Model Overview

A speech recognition model based on the wav2vec2 architecture, used to convert speech into text.

Low Word Error Rate

Reaches a WER of 0.1042 on the evaluation set at the final logged checkpoint (step 6500).
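To make the metric concrete, here is a minimal sketch of how WER is commonly computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The source does not say which tool produced the reported figure or how large the evaluation set is. The function name and the 48-word example are illustrative assumptions, chosen only because 5 errors in 48 words rounds to 0.1042.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical data: 48 reference words, 5 of them transcribed incorrectly.
reference = " ".join(f"word{i}" for i in range(48))
hypothesis = " ".join("oops" if i % 10 == 0 else f"word{i}" for i in range(48))

print(round(word_error_rate(reference, hypothesis), 4))  # 0.1042
```

A WER of 1.0, as in the first row of the training log, means the number of word errors equals the number of reference words.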

Efficient Training

Trained with mixed precision (native AMP), which typically reduces memory use and speeds up training on supported GPUs.
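As a rough illustration, "native AMP" is what the Hugging Face Trainer reports when `fp16=True` is set in `TrainingArguments`. The sketch below assumes the model was trained with that Trainer, which the source does not confirm. The `output_dir` value and the helper name `build_training_arguments` are hypothetical.

```python
# Keyword arguments in the style of Hugging Face `TrainingArguments`.
training_kwargs = {
    "output_dir": "wav2vec_trained",  # hypothetical output directory
    "fp16": True,                     # enable mixed-precision (native AMP) training
}


def build_training_arguments(kwargs):
    """Build TrainingArguments from a plain dict (requires `transformers`)."""
    # Imported lazily so the dict above can be inspected without the library.
    from transformers import TrainingArguments

    return TrainingArguments(**kwargs)


# Usage (on a machine with `transformers`, `torch`, and a CUDA GPU):
# args = build_training_arguments(training_kwargs)
```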

Linear Learning Rate Scheduling

Uses a linear learning rate schedule with 1000 warm-up steps: the learning rate rises linearly during warm-up, then decays linearly for the rest of training.
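The shape of such a schedule can be sketched in a few lines. Only the 1000 warm-up steps come from the description. The peak learning rate and total step count below are placeholder assumptions, since the source states neither, and `linear_schedule_lr` is an illustrative name rather than a library function.

```python
def linear_schedule_lr(step, peak_lr, warmup_steps, total_steps):
    """Linear warm-up from 0 to peak_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


WARMUP_STEPS = 1000  # from the model description
PEAK_LR = 1e-4       # assumption: not stated in the source
TOTAL_STEPS = 7000   # assumption: not stated in the source

for step in (0, 500, 1000, 4000, 7000):
    lr = linear_schedule_lr(step, PEAK_LR, WARMUP_STEPS, TOTAL_STEPS)
    print(f"step {step:>4}: lr = {lr:.2e}")
```

With these placeholder values, the learning rate reaches its peak at step 1000 and is back at half the peak by step 4000.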

Speech-to-Text

Automatic Speech Recognition

Speech Transcription

Automatic Meeting Minutes Generation

Automatically convert meeting recordings into written transcripts

Voice Memo Conversion

Convert voice memos into editable text
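For use cases like these, a transcription call might look like the sketch below, which uses the `transformers` automatic-speech-recognition pipeline. The repository ID `eugenetanjc/wav2vec_trained` is an assumption pieced together from the developer and model names on this page, and the audio file name is hypothetical. wav2vec2-base models expect 16 kHz audio; when given a file path, the pipeline decodes and resamples it using ffmpeg.

```python
# Assumption: Hub repository ID inferred from the developer and model names.
MODEL_ID = "eugenetanjc/wav2vec_trained"


def transcribe(audio_path, model_id=MODEL_ID):
    """Return the transcript of one audio file (requires `transformers` and `torch`)."""
    # Imported lazily so this module loads even without the library installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    return asr(audio_path)["text"]


# Usage (hypothetical file name):
# print(transcribe("meeting_recording.wav"))
```

For long recordings such as full meetings, splitting the audio into shorter segments before transcription is usually needed to keep memory use manageable.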

Training Loss	Epoch	Step	Validation Loss	WER
4.3849	2.21	500	2.9148	1.0
1.9118	4.42	1000	0.9627	0.5833
0.7596	6.64	1500	0.8953	0.3542
0.4602	8.85	2000	0.3325	0.2083
0.331	11.06	2500	0.3084	0.2083
0.2474	13.27	3000	0.0960	0.1667
0.1934	15.49	3500	0.1276	0.125
0.156	17.7	4000	0.0605	0.0833
0.1244	19.91	4500	0.0831	0.1458
0.1006	22.12	5000	0.0560	0.125
0.0827	24.34	5500	0.0395	0.0833
0.0723	26.55	6000	0.0573	0.0833
0.0606	28.76	6500	0.0337	0.1042
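The log above can also be inspected programmatically. This short sketch copies the step, validation loss, and WER columns from the table and finds the checkpoints with the lowest values; the variable names are illustrative.

```python
# (step, validation_loss, wer) copied from the training log table.
LOG = [
    (500, 2.9148, 1.0),
    (1000, 0.9627, 0.5833),
    (1500, 0.8953, 0.3542),
    (2000, 0.3325, 0.2083),
    (2500, 0.3084, 0.2083),
    (3000, 0.0960, 0.1667),
    (3500, 0.1276, 0.125),
    (4000, 0.0605, 0.0833),
    (4500, 0.0831, 0.1458),
    (5000, 0.0560, 0.125),
    (5500, 0.0395, 0.0833),
    (6000, 0.0573, 0.0833),
    (6500, 0.0337, 0.1042),
]

best_wer = min(wer for _, _, wer in LOG)
best_wer_steps = [step for step, _, wer in LOG if wer == best_wer]
best_loss_step = min(LOG, key=lambda row: row[1])[0]

print(best_wer, best_wer_steps)  # 0.0833 [4000, 5500, 6000]
print(best_loss_step)            # 6500
```

The final checkpoint (step 6500) has the lowest validation loss, while the lowest logged WER, 0.0833, occurs at steps 4000, 5500, and 6000.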
