AI Light Dance Singing2 FT Wav2Vec2 Open-Source Speech Recognition Model - Free Deployment for Accurate Speech Content Recognition

Ai Light Dance Singing2 Ft Wav2vec2 Large Xlsr 53 5gram V4 2

Developed by gary109

An automatic speech recognition model fine-tuned from wav2vec2-large-xlsr-53 on the GARY109/AI_LIGHT_DANCE dataset

Downloads 68

Release Date: 6/29/2022

Model Overview

This model is a fine-tuned version for speech recognition tasks, specifically optimized for singing voice

Singing voice recognition optimization

Specifically fine-tuned for singing voice, potentially offering better performance for music-related speech recognition

Based on wav2vec2 architecture

Built on wav2vec2-large-xlsr-53, a large wav2vec 2.0 model pretrained on speech in 53 languages, which is then fine-tuned for recognition

Low word error rate

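Reached a word error rate (WER) of 9.1% on the evaluation set at epoch 7, the epoch with the lowest validation loss (0.2691) in the training results; WER continued to fall to 8.16% by epoch 10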
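
For readers unfamiliar with the metric, word error rate is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch of that calculation, not the evaluation script used for this model; the function name `word_error_rate` and the example sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference prefix seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word out of five reference words.
example_wer = word_error_rate(
    "the lights are dancing tonight",
    "the light are dancing tonight",
)
print(f"WER = {example_wer:.2%}")
```

On this scale, a WER of 9.1% corresponds to roughly one word error for every eleven reference words.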

Speech-to-text

Singing voice recognition

Music applications

Lyrics transcription

Automatically convert singing recordings into lyric text

Word error rate approximately 9.1%

Speech recognition

Speech transcription

Convert speech content into text
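
A transcription call for either use case might look like the sketch below. It assumes the checkpoint is published on the Hugging Face Hub and loads through the standard `transformers` automatic-speech-recognition pipeline; this page does not confirm either point. The helper name `transcribe` is hypothetical, and `model_id` must be supplied by the caller because the exact Hub identifier is not spelled out here. Decoding an audio file by path also requires ffmpeg to be installed.

```python
def transcribe(audio_path: str, model_id: str) -> str:
    """Return the text that the ASR pipeline produces for one audio file.

    model_id is the Hub identifier of the checkpoint (an assumption: this
    sketch presumes the model loads through the standard pipeline API).
    """
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)  # the pipeline returns a dict with a "text" key
    return result["text"]
```

Usage would be along the lines of `transcribe("recording.wav", model_id)`, where both the file name and the identifier are placeholders.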

Training Loss	Epoch	Step	Validation Loss	WER
0.2664	1.0	8969	0.3347	0.1645
0.2032	2.0	17938	0.3170	0.1662
0.1888	3.0	26907	0.3188	0.1317
0.1774	4.0	35876	0.2885	0.1195
0.0696	5.0	44845	0.2703	0.1105
0.254	6.0	53814	0.2817	0.0972
0.0464	7.0	62783	0.2691	0.0910
0.0426	8.0	71752	0.3033	0.0875
0.035	9.0	80721	0.3150	0.0841
0.0274	10.0	89690	0.3073	0.0816
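
The table can be read two ways, depending on whether a checkpoint is chosen by validation loss or by WER. The short sketch below copies the rows from the table and picks out both; the names `EpochResult`, `best_by_loss` and `best_by_wer` are illustrative only.

```python
from collections import namedtuple

EpochResult = namedtuple("EpochResult", "train_loss epoch step val_loss wer")

# Rows copied from the training results table.
RESULTS = [
    EpochResult(0.2664, 1, 8969, 0.3347, 0.1645),
    EpochResult(0.2032, 2, 17938, 0.3170, 0.1662),
    EpochResult(0.1888, 3, 26907, 0.3188, 0.1317),
    EpochResult(0.1774, 4, 35876, 0.2885, 0.1195),
    EpochResult(0.0696, 5, 44845, 0.2703, 0.1105),
    EpochResult(0.254, 6, 53814, 0.2817, 0.0972),
    EpochResult(0.0464, 7, 62783, 0.2691, 0.0910),
    EpochResult(0.0426, 8, 71752, 0.3033, 0.0875),
    EpochResult(0.035, 9, 80721, 0.3150, 0.0841),
    EpochResult(0.0274, 10, 89690, 0.3073, 0.0816),
]

# Epoch with the lowest validation loss, and epoch with the lowest WER.
best_by_loss = min(RESULTS, key=lambda r: r.val_loss)
best_by_wer = min(RESULTS, key=lambda r: r.wer)

# Optimizer steps per epoch, derived from the cumulative Step column.
steps_per_epoch = {r.step // r.epoch for r in RESULTS}

print(f"Lowest validation loss: epoch {best_by_loss.epoch} "
      f"(loss {best_by_loss.val_loss}, WER {best_by_loss.wer:.2%})")
print(f"Lowest WER: epoch {best_by_wer.epoch} "
      f"(loss {best_by_wer.val_loss}, WER {best_by_wer.wer:.2%})")
print(f"Steps per epoch: {sorted(steps_per_epoch)}")
```

The two criteria disagree: validation loss bottoms out at epoch 7, while WER keeps improving through epoch 10.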
