
Distil Whisper Small Cantonese

Developed by alvanlii
This is a distilled Cantonese speech recognition model based on Whisper Small, achieving a character error rate (CER) of 9.7% (without punctuation) on Common Voice 16.0.
Downloads: 187
Release date: 4/3/2024

Model Overview

This model is a distilled version of Whisper Small, specifically optimized for Cantonese speech recognition tasks, featuring a smaller model size and faster inference speed.
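A minimal usage sketch for the model with the Hugging Face `transformers` pipeline. The repository id below is an assumption based on the author's naming and should be verified on the Hub; the audio file name is a placeholder.

```python
# Sketch: Cantonese transcription with the Hugging Face ASR pipeline.
# MODEL_ID is an assumed repo id -- confirm the exact name on the Hub.
MODEL_ID = "alvanlii/distil-whisper-small-cantonese"


def build_transcriber():
    """Return an automatic-speech-recognition pipeline for this model."""
    from transformers import pipeline  # lazy import so the constant is usable without transformers

    return pipeline(
        "automatic-speech-recognition",
        model=MODEL_ID,
        chunk_length_s=30,  # Whisper models process audio in 30-second windows
    )


if __name__ == "__main__":
    asr = build_transcriber()
    # "meeting_audio.wav" is a placeholder path for your own recording
    print(asr("meeting_audio.wav")["text"])
```

The pipeline handles resampling to Whisper's expected 16 kHz input and chunks longer recordings automatically.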

Model Features

Efficient Inference
Inference is approximately 50% faster than the original Whisper Small model, requiring only about 2 GB of GPU VRAM.
Cantonese Optimization
Specifically trained and optimized for Cantonese speech recognition tasks.
Lightweight
The model is compressed by reducing the number of decoder layers, cutting the parameter count from 242M to 157M.

Model Capabilities

Cantonese speech recognition
Speech-to-text
Audio transcription

Use Cases

Speech Transcription
Cantonese Meeting Minutes
Automatically transcribe Cantonese meeting recordings into text
Achieved a character error rate (CER) of 9.7% on the Common Voice 16.0 test set
Media Subtitle Generation
Automatically generate subtitles for Cantonese video content
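For reference, the character error rate quoted above is the character-level Levenshtein edit distance between the model output and the reference transcript, divided by the reference length. A self-contained sketch (punctuation is stripped first, matching the "without punctuation" caveat; the example sentences are illustrative, not from the evaluation set):

```python
# Minimal CER computation sketch: edit distance over characters,
# normalized by reference length, with punctuation removed.
import unicodedata


def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance via dynamic programming."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,              # deletion
                curr[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),   # substitution (0 cost if equal)
            ))
        prev = curr
    return prev[-1]


def strip_punct(text: str) -> str:
    """Drop all Unicode punctuation (categories starting with 'P')."""
    return "".join(c for c in text if not unicodedata.category(c).startswith("P"))


def cer(reference: str, hypothesis: str) -> float:
    ref, hyp = strip_punct(reference), strip_punct(hypothesis)
    return edit_distance(ref, hyp) / len(ref)


# One missing character out of six reference characters -> CER of 1/6
print(cer("今日天氣好好", "今日天氣好"))
```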