wav2vec2-base-demo-colab: Open-source Speech Recognition Model Fine-tuned from facebook/wav2vec2-base

Wav2vec2 Base Demo Colab

Developed by asakawa

A speech recognition model fine-tuned from facebook/wav2vec2-base on an unnamed dataset, reaching a word error rate (WER) of 0.3391 on the evaluation set.

Downloads 24

Release Time: 3/2/2022

Model Overview

This model is a fine-tuned version of facebook/wav2vec2-base for automatic speech recognition: it converts spoken audio into text.
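A minimal usage sketch follows. It assumes the checkpoint is published on the Hugging Face Hub under the repository id asakawa/wav2vec2-base-demo-colab (inferred from the developer and model names on this page, not confirmed by it), and that the transformers library, a PyTorch backend, and ffmpeg are installed. The function name and the file name in the comment are hypothetical.

```python
# Assumed Hub repository id; inferred from this page, not confirmed by it.
MODEL_ID = "asakawa/wav2vec2-base-demo-colab"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return the text transcript for a single audio file."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # chunk_length_s splits long recordings into 30-second windows,
    # which keeps memory use bounded for meeting-length audio.
    result = asr(audio_path, chunk_length_s=30)
    return result["text"]


# Example call (hypothetical file name):
# print(transcribe("voice_note.wav"))
```

The pipeline loads the model on first use, so the first call downloads the weights and is noticeably slower than later calls.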

Low Word Error Rate

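Reaches a word error rate (WER) of 0.3391 on the evaluation set at the last logged checkpoint (epoch 28, step 3500). WER is the number of word-level errors divided by the number of reference words, so lower is better.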
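To make the metric concrete, here is a small sketch of the standard WER definition: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. The function name and the example sentences are illustrative, and this is not necessarily the exact evaluation code used for the figure above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the output contains many extra words, which is how the first row of the training table can report 1.0400.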

Fine-tuned from wav2vec2-base

Starts from the pretrained facebook/wav2vec2-base checkpoint and reuses its learned speech feature representations.

Efficient Training

Uses mixed-precision training, which reduces memory use and speeds up training on supported GPUs, together with a linear learning rate schedule.
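As an illustration of what a linear learning rate schedule looks like, the sketch below implements one common form: a linear warmup from zero to a peak rate, then a linear decay back to zero. The peak rate, warmup length, and total step count are hypothetical values chosen for readability; this page does not list the hyperparameters actually used.

```python
def linear_lr(step: int,
              peak_lr: float = 1e-4,     # hypothetical peak learning rate
              warmup_steps: int = 500,   # hypothetical warmup length
              total_steps: int = 3500) -> float:  # hypothetical run length
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up proportionally to the number of steps taken so far.
        return peak_lr * step / warmup_steps
    # Decay proportionally to the number of steps remaining.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


for step in (0, 250, 500, 2000, 3500):
    print(step, linear_lr(step))
```

The warmup phase avoids large updates while the newly initialized output layer is still random, and the decay phase lets the model settle as training ends.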

Speech Recognition

Speech-to-Text

Speech Transcription

Meeting Minutes

Automatically convert meeting recordings into text transcripts

Evaluation word error rate of 0.3391, so transcripts will likely need proofreading

Voice Notes

Convert voice notes into editable text

Training Loss	Epoch	Step	Validation Loss	WER
3.5329	4.0	500	1.5741	1.0400
0.6432	8.0	1000	0.4571	0.4418
0.2214	12.0	1500	0.4381	0.3823
0.1294	16.0	2000	0.4706	0.3911
0.0868	20.0	2500	0.5252	0.3662
0.0616	24.0	3000	0.4828	0.3458
0.0461	28.0	3500	0.4500	0.3391
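The table can be read programmatically to compare checkpoints. The sketch below copies the rows as logged above and finds the checkpoint with the lowest WER, the checkpoint with the lowest validation loss, and the implied number of optimizer steps per epoch. The type and variable names are illustrative.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    training_loss: float
    epoch: float
    step: int
    validation_loss: float
    wer: float


# Rows copied from the training log table above.
ROWS = [
    EvalRow(3.5329, 4.0, 500, 1.5741, 1.0400),
    EvalRow(0.6432, 8.0, 1000, 0.4571, 0.4418),
    EvalRow(0.2214, 12.0, 1500, 0.4381, 0.3823),
    EvalRow(0.1294, 16.0, 2000, 0.4706, 0.3911),
    EvalRow(0.0868, 20.0, 2500, 0.5252, 0.3662),
    EvalRow(0.0616, 24.0, 3000, 0.4828, 0.3458),
    EvalRow(0.0461, 28.0, 3500, 0.4500, 0.3391),
]

best_by_wer = min(ROWS, key=lambda row: row.wer)
best_by_val_loss = min(ROWS, key=lambda row: row.validation_loss)
# Every row should imply the same number of steps per epoch.
steps_per_epoch = {round(row.step / row.epoch) for row in ROWS}

print("lowest WER:", best_by_wer.step, best_by_wer.wer)
print("lowest validation loss:", best_by_val_loss.step, best_by_val_loss.validation_loss)
print("steps per epoch:", steps_per_epoch)
```

Validation loss is lowest at step 1500 while WER keeps improving through step 3500, so the two criteria select different checkpoints. For transcription quality, WER is the more direct measure.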
