wav2vec2-large-xlsr-53-demo-colab: Open-source Speech Recognition Model

Developed by project2you

A speech recognition model based on facebook/wav2vec2-large-xlsr-53 and fine-tuned on the common_voice dataset.

Downloads 21

Release Time: 3/2/2022

Model Overview

This model targets speech recognition. It uses the wav2vec2 architecture and was fine-tuned on the common_voice dataset.

Efficient Fine-tuning

The model was fine-tuned from the pre-trained wav2vec2-large-xlsr-53 checkpoint to adapt it to the target dataset.

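Word Error Rate

The final checkpoint reports a word error rate (WER) of 1.6299 on the evaluation set. WER is the number of word-level errors divided by the number of reference words, so a value above 1.0 means the transcripts contained more errors than the references had words. This figure therefore indicates poor accuracy or a problem in the evaluation setup, not a low error rate.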
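To make the WER figure easier to interpret, here is a minimal sketch of how the metric is usually computed: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the sample sentences are illustrative assumptions, and the source does not say which WER implementation was actually used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper, not the evaluation code used for this model.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat", "the cat sat"))      # identical transcript
    print(word_error_rate("the cat sat", ""))                 # empty output
    print(word_error_rate("hello world", "hell o wor ld"))    # over-segmented output
```

Because insertions count as errors, a hypothesis that splits words into extra tokens can push WER above 1.0, while an empty hypothesis scores exactly 1.0.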

Mixed Precision Training

Training used native AMP (PyTorch's built-in automatic mixed precision), which runs most operations in reduced precision to lower memory use and typically speed up training.
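As a rough illustration of what mixed precision training involves, the sketch below runs a few optimizer steps with `torch.autocast` and a gradient scaler. It assumes "native AMP" refers to PyTorch's own AMP utilities. The tiny linear model, random data, and the function name `train_steps` are hypothetical stand-ins, not the actual training setup for this model.

```python
import torch
from torch import nn


def train_steps(num_steps: int = 3, seed: int = 0) -> list:
    """Run a few mixed precision training steps on a toy model (illustrative)."""
    torch.manual_seed(seed)
    use_cuda = torch.cuda.is_available()
    device = "cuda" if use_cuda else "cpu"
    # float16 on GPU; bfloat16 is the usual reduced-precision dtype on CPU.
    amp_dtype = torch.float16 if use_cuda else torch.bfloat16

    model = nn.Linear(16, 4).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    # Loss scaling guards float16 gradients against underflow; it is
    # disabled (a pass-through) when no GPU is available.
    scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)

    inputs = torch.randn(8, 16, device=device)
    targets = torch.randn(8, 4, device=device)

    losses = []
    for _ in range(num_steps):
        optimizer.zero_grad()
        # The forward pass runs in reduced precision where it is safe to do so.
        with torch.autocast(device_type=device, dtype=amp_dtype):
            predictions = model(inputs)
        # The loss is computed in float32 for numerical stability.
        loss = nn.functional.mse_loss(predictions.float(), targets)
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()
        losses.append(loss.item())
    return losses


if __name__ == "__main__":
    print(train_steps())
```

The model weights stay in float32 throughout; only the forward computation is cast down, which is why AMP usually saves memory and time without changing the training code much.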

Speech Recognition

Automatic Speech-to-Text

Speech Transcription

Speech-to-Text

Convert speech content into text transcripts

Reported word error rate on the evaluation set: 1.6299

Training Loss	Epoch	Step	Validation Loss	WER
8.5034	3.42	400	3.5852	1.0
1.7853	6.83	800	0.7430	1.6774
0.5675	10.26	1200	0.6513	1.6330
0.3761	13.67	1600	0.6208	1.6081
0.2776	17.09	2000	0.6401	1.6081
0.2266	20.51	2400	0.6410	1.6295
0.1949	23.93	2800	0.6910	1.6287
0.1672	27.35	3200	0.6901	1.6299
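The training log above can be read programmatically to see where validation loss bottoms out. The sketch below parses the rows and picks the step with the lowest validation loss. Using validation loss as the selection criterion is an assumption for illustration, since the source does not say how the released checkpoint was chosen, and the names `LogRow`, `parse_log`, and `best_by_validation_loss` are hypothetical.

```python
from typing import List, NamedTuple

# Rows copied from the training log: training loss, epoch, step,
# validation loss, WER.
TRAINING_LOG = """\
8.5034 3.42 400 3.5852 1.0
1.7853 6.83 800 0.7430 1.6774
0.5675 10.26 1200 0.6513 1.6330
0.3761 13.67 1600 0.6208 1.6081
0.2776 17.09 2000 0.6401 1.6081
0.2266 20.51 2400 0.6410 1.6295
0.1949 23.93 2800 0.6910 1.6287
0.1672 27.35 3200 0.6901 1.6299
"""


class LogRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    wer: float


def parse_log(text: str) -> List[LogRow]:
    """Parse whitespace-separated log rows into typed records."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, wer = line.split()
        rows.append(
            LogRow(float(train_loss), float(epoch), int(step),
                   float(val_loss), float(wer))
        )
    return rows


def best_by_validation_loss(rows: List[LogRow]) -> LogRow:
    """Return the logged row with the lowest validation loss."""
    return min(rows, key=lambda row: row.val_loss)


if __name__ == "__main__":
    rows = parse_log(TRAINING_LOG)
    best = best_by_validation_loss(rows)
    print(best.step, best.val_loss, best.wer)
```

On this log, training loss keeps falling through step 3200 while validation loss is lowest at step 1600 and drifts upward afterwards, a pattern that often points to overfitting in the later epochs.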
