Whisper Large V2 Cantonese

Developed by Scrya
A Cantonese automatic speech recognition (ASR) model fine-tuned from OpenAI Whisper Large V2 on the Common Voice 11.0 Cantonese dataset. It achieves a character error rate (CER) of 6.21%.
Downloads 210
Release Date: 12/19/2022

Model Overview

This automatic speech recognition model is optimized specifically for Cantonese. Data augmentation during training improves its recognition accuracy, making it suitable for Cantonese speech-to-text applications.
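The pitch shifting and time stretching that this page names as its training-time audio techniques can be sketched as follows. This is a minimal illustration, not the author's training pipeline: the use of librosa, the function names `sample_augmentation_params` and `augment`, and the parameter ranges (up to 2 semitones, up to 10% speed change) are all assumptions chosen for the example.

```python
import random


def sample_augmentation_params(rng, max_semitones=2.0, max_rate_change=0.1):
    """Draw a random pitch shift (semitones) and time-stretch rate.

    The default ranges are illustrative assumptions, not values from the model card.
    """
    n_steps = rng.uniform(-max_semitones, max_semitones)
    rate = rng.uniform(1.0 - max_rate_change, 1.0 + max_rate_change)
    return n_steps, rate


def augment(waveform, sample_rate, rng):
    """Return a pitch-shifted, then time-stretched copy of a mono waveform.

    `waveform` is expected to be a 1-D NumPy float array.
    """
    import librosa  # imported lazily so the sampler above works without librosa

    n_steps, rate = sample_augmentation_params(rng)
    shifted = librosa.effects.pitch_shift(waveform, sr=sample_rate, n_steps=n_steps)
    return librosa.effects.time_stretch(shifted, rate=rate)
```

Applying a fresh random draw to each training clip exposes the model to slightly different voices and speaking speeds, which is the usual motivation for this kind of augmentation.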

Model Features

Cantonese optimization
Fine-tuned specifically for the characteristics of Cantonese speech, giving better recognition accuracy than general-purpose models.
Data augmentation
Applies audio augmentation techniques such as pitch shifting and time stretching during training to improve model robustness.
Low error rate
Achieves a character error rate (CER) of 6.21% on the Common Voice Cantonese test set.
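For readers unfamiliar with the metric, CER is the character-level edit distance between the reference transcript and the model output, divided by the number of reference characters. The sketch below computes it in plain Python. It assumes no text normalization (punctuation and spacing are compared as-is); the normalization used to obtain the 6.21% figure is not stated on this page, and the function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One substituted character out of five reference characters.
print(cer("我哋去飲茶", "我地去飲茶"))  # 0.2
```

A CER of 6.21% therefore means roughly 6 character edits per 100 reference characters, averaged over the test set.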

Model Capabilities

Cantonese speech recognition
Speech-to-text
Audio transcription
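A minimal transcription sketch using the Hugging Face `transformers` pipeline is shown below. The Hub identifier `Scrya/whisper-large-v2-cantonese` is an assumption inferred from the developer and model name on this page, and `meeting.wav` is a hypothetical file; verify both before use. Running it also requires `transformers`, a PyTorch install, and ffmpeg for audio decoding.

```python
# Assumed Hub identifier, inferred from the developer and model name on this page.
MODEL_ID = "Scrya/whisper-large-v2-cantonese"


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe an audio file and return the recognized text."""
    from transformers import pipeline  # imported lazily; downloads the model on first use

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # split long recordings into 30-second chunks
    )
    return asr(audio_path)["text"]


if __name__ == "__main__":
    # "meeting.wav" is a placeholder path for illustration.
    print(transcribe("meeting.wav"))
```

Chunking lets the same call handle recordings longer than the 30-second window Whisper processes at once.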

Use Cases

Speech transcription
Cantonese meeting minutes
Automatically converts Cantonese meeting recordings into text transcripts.
Character-level accuracy of approximately 93.79% (100% minus the 6.21% CER measured on the Common Voice Cantonese test set).
Media subtitle generation
Automatically generates subtitles for Cantonese video content.
Voice assistant
Cantonese voice command recognition
Used for smart home or voice assistant systems supporting Cantonese.