AI-Light-Dance_Singing2_FT Open-source Automatic Speech Recognition Model

AI-Light-Dance_Singing2_FT wav2vec2-large-xlsr-53 5gram v3

Developed by gary109

An automatic speech recognition model fine-tuned from wav2vec2-large-xlsr-53 and specialized for singing voice recognition

Downloads 97

Release Time: 6/28/2022

Model Overview

This model is a version fine-tuned on the GARY109/AI_LIGHT_DANCE - ONSET-SINGING2 dataset and is intended primarily for singing voice recognition tasks.

Singing Voice Recognition Optimization

Fine-tuned specifically on singing voice, so it may perform better than general-purpose speech recognition models on sung audio

5-gram Language Model Enhancement

Integrates a 5-gram language model, which is likely to improve recognition accuracy

Low Word Error Rate

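Achieved a word error rate (WER) of 0.2256 on the evaluation set. In the training results table this matches the epoch 4 checkpoint, which has the lowest validation loss (0.5265); later epochs reach a WER as low as 0.2038 at a higher validation loss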
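For readers unfamiliar with the metric, the sketch below shows how a word error rate like the 0.2256 reported above is typically computed: the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. It is a minimal illustration, not the evaluation script used for this model; the function name word_error_rate and the example lyrics are made up for this page.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] is the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Illustrative lyrics only: one substituted word out of six reference words.
example_wer = word_error_rate("let it be let it be", "let it bee let it be")
```

A WER of 0.2256 therefore means roughly 22 to 23 word-level errors (substitutions, deletions, or insertions) per 100 reference words.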

Singing voice recognition

Automatic speech-to-text

Music Technology

Singing Recording to Lyrics

Automatically convert singing recordings into text lyrics

Word error rate of approximately 22.56% on the evaluation set

Music Education Assistance

Help music learners analyze singing pronunciation accuracy
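The two use cases above could be approached roughly as follows with the Hugging Face transformers pipeline API. This is a sketch under stated assumptions: the Hub identifier in MODEL_ID is inferred from the model name and developer shown on this page and should be verified before use, and the helper names transcribe and normalize_lyrics are hypothetical. Decoding with the 5-gram language model generally also requires the pyctcdecode and kenlm packages, and reading audio from a file path requires ffmpeg.

```python
import re

# Assumption: Hub id inferred from the page title and developer name; verify it.
MODEL_ID = "gary109/ai-light-dance_singing2_ft_wav2vec2-large-xlsr-53-5gram-v3"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe a singing recording to text (downloads the model on first use)."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    return asr(audio_path)["text"]


def normalize_lyrics(text: str) -> str:
    """Lowercase, drop punctuation (keeping apostrophes), and collapse whitespace."""
    text = re.sub(r"[^\w\s']", " ", text.lower())
    return " ".join(text.split())
```

For the music education case, a learner's transcribed singing and the reference lyrics could both be passed through normalize_lyrics before comparison, so that differences in casing or punctuation are not counted as pronunciation errors.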

Training Loss	Epoch	Step	Validation Loss	WER
0.2546	1.0	280	0.6004	0.2796
0.2325	2.0	560	0.6337	0.2729
0.2185	3.0	840	0.5546	0.2299
0.1988	4.0	1120	0.5265	0.2256
0.1755	5.0	1400	0.5577	0.2212
0.1474	6.0	1680	0.6353	0.2241
0.1498	7.0	1960	0.5758	0.2086
0.1252	8.0	2240	0.5738	0.2052
0.1174	9.0	2520	0.5994	0.2048
0.1035	10.0	2800	0.5988	0.2038
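The table can be read two ways when choosing a checkpoint: by lowest validation loss or by lowest WER. The sketch below copies the rows above into Python and picks the best epoch under each criterion; the EpochResult type and variable names are illustrative and are not part of the model's training code.

```python
from typing import NamedTuple


class EpochResult(NamedTuple):
    train_loss: float
    epoch: int
    step: int
    val_loss: float
    wer: float


# Rows copied from the training results table above.
RESULTS = [
    EpochResult(0.2546, 1, 280, 0.6004, 0.2796),
    EpochResult(0.2325, 2, 560, 0.6337, 0.2729),
    EpochResult(0.2185, 3, 840, 0.5546, 0.2299),
    EpochResult(0.1988, 4, 1120, 0.5265, 0.2256),
    EpochResult(0.1755, 5, 1400, 0.5577, 0.2212),
    EpochResult(0.1474, 6, 1680, 0.6353, 0.2241),
    EpochResult(0.1498, 7, 1960, 0.5758, 0.2086),
    EpochResult(0.1252, 8, 2240, 0.5738, 0.2052),
    EpochResult(0.1174, 9, 2520, 0.5994, 0.2048),
    EpochResult(0.1035, 10, 2800, 0.5988, 0.2038),
]

best_by_val_loss = min(RESULTS, key=lambda r: r.val_loss)
best_by_wer = min(RESULTS, key=lambda r: r.wer)

print(f"Lowest validation loss: epoch {best_by_val_loss.epoch} (WER {best_by_val_loss.wer})")
print(f"Lowest WER: epoch {best_by_wer.epoch} (WER {best_by_wer.wer})")
```

Epoch 4 has the lowest validation loss and a WER of 0.2256, while epoch 10 has the lowest WER (0.2038). Training loss keeps falling throughout while validation loss does not, so the two criteria select different checkpoints.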
