Whisper Large V2 Cantonese
Developed by Scrya
A Cantonese automatic speech recognition (ASR) model fine-tuned from OpenAI Whisper Large V2 on the Common Voice 11.0 Cantonese dataset. It achieves a character error rate (CER) of 6.21%.
Downloads 210
Release Date: 12/19/2022
Model Overview
This automatic speech recognition model is optimized specifically for Cantonese. Data augmentation during training improves its recognition accuracy, making it suitable for Cantonese speech-to-text scenarios.
Model Features
Cantonese optimization
Fine-tuned specifically for Cantonese speech characteristics, achieving better recognition accuracy than general-purpose models.
Data augmentation
Uses audio augmentation techniques such as pitch shifting and time stretching during training to improve model robustness.
Low error rate
Achieves a character error rate (CER) of 6.21% on the Common Voice Cantonese test set.
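For readers unfamiliar with the metric, the sketch below shows one common way to compute CER: the total character-level edit distance between reference and predicted transcripts, divided by the total number of reference characters. The function names and sample sentences are made up for illustration, and the page does not state whether any text normalization (such as stripping punctuation) was applied before scoring, so none is applied here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level CER: total edits divided by total reference characters."""
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h)
                      for r, h in zip(references, hypotheses))
    return total_edits / total_chars


if __name__ == "__main__":
    # Hypothetical reference and predicted transcripts.
    refs = ["我哋今日開會", "聽日見"]
    hyps = ["我地今日開會", "聽日見"]
    print(f"CER = {cer(refs, hyps):.4f}")
```

In this toy example there is one substituted character out of nine reference characters. On the same definition, a CER of 6.21% corresponds to roughly 6 character errors per 100 reference characters.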
Model Capabilities
Cantonese speech recognition
Speech-to-text
Audio transcription
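A minimal sketch of how such a model might be used for transcription with the Hugging Face `transformers` pipeline. The model identifier `Scrya/whisper-large-v2-cantonese` is an assumption inferred from the developer and model name on this page, and `meeting.wav` is a hypothetical file; verify both before use.

```python
# Assumed Hugging Face model ID; not stated on this page.
MODEL_ID = "Scrya/whisper-large-v2-cantonese"


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe one audio file and return the recognized text."""
    # Imported here so the module loads even if transformers is absent.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # split long recordings into 30-second chunks
    )
    return asr(audio_path)["text"]


if __name__ == "__main__":
    # Hypothetical input file; decoding audio files requires ffmpeg.
    print(transcribe("meeting.wav"))
```

Chunking is used here because Whisper processes audio in 30-second windows, so longer recordings such as meetings need to be split before decoding.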
Use Cases
Speech transcription
Cantonese meeting minutes
Automatically converts Cantonese meeting recordings into text transcripts.
Character-level accuracy of approximately 93.79% (1 − CER, with a CER of 6.21% measured on the Common Voice Cantonese test set).
Media subtitle generation
Automatically generates subtitles for Cantonese video content.
Voice assistant
Cantonese voice command recognition
Recognizes Cantonese voice commands in smart home or voice assistant systems.
© 2025 AIbase