Ai Light Dance Singing2 Ft Wav2vec2 Large Xlsr 53 5gram V4 1
This is an automatic speech recognition (ASR) model based on wav2vec2-large-xlsr-53 and fine-tuned on the GARY109/AI_LIGHT_DANCE - ONSET-SINGING2 dataset. It is intended primarily for singing voice recognition.
Downloads 66
Release Time: 6/28/2022
Model Overview
An ASR model optimized for singing voice. It builds on the wav2vec2-large-xlsr-53 architecture and is fine-tuned on the GARY109/AI_LIGHT_DANCE - ONSET-SINGING2 singing dataset to transcribe sung lyrics into text.
Model Features
Singing voice optimization
Fine-tuned on singing content, and intended to perform better on sung audio than general-purpose speech recognition models
High accuracy
Achieves a word error rate (WER) of 12.11% on the evaluation set
Based on wav2vec2 architecture
Uses wav2vec2-large-xlsr-53 as the base model, which provides pretrained cross-lingual speech representations for feature extraction
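To make the WER figure above concrete, here is a minimal sketch of how word error rate is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. The function name `word_error_rate` and the example lyrics are illustrative assumptions, not part of the model card, and the evaluation script behind the 12.11% figure may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical lyrics: one substituted word out of six reference words.
example_wer = word_error_rate("let it be let it be", "let it be let it bee")
```

Under this definition, a WER of 12.11% corresponds to roughly 12 word-level errors per 100 reference words.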
Model Capabilities
Singing voice recognition
Automatic speech-to-text
Music content analysis
Use Cases
Music technology
Singing content transcription
Automatically convert singing recordings into text lyrics
Word error rate 12.11%
Music content analysis
Analyze singing content for music information retrieval
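The transcription use case above could be sketched with the Hugging Face `transformers` ASR pipeline. This is a rough sketch under stated assumptions, not the author's documented usage: the Hub ID in `MODEL_ID` is inferred from the model name and should be verified, the helper `transcribe_singing` is hypothetical, and decoding a file path requires `ffmpeg` to be installed. Because the model name includes "5gram", the repository may ship an n-gram language model, in which case extra packages such as `pyctcdecode` and `kenlm` may be needed for language-model decoding.

```python
# Assumed Hugging Face Hub ID, inferred from the model name; verify before use.
MODEL_ID = "gary109/ai-light-dance_singing2_ft_wav2vec2-large-xlsr-53-5gram-v4-1"


def transcribe_singing(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return the text transcribed from a singing recording (hypothetical helper)."""
    # Imported lazily so this module loads even when transformers is not installed.
    from transformers import pipeline

    # Builds an ASR pipeline; the model weights are downloaded on first use.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)  # a dict containing the key "text"
    return result["text"]
```

Calling `transcribe_singing("song.wav")`, where `song.wav` is a placeholder path, would return the recognized lyrics as a string.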