Malaysian Distil Whisper Large V3
Developed by mesolitica
A Whisper Large V3 speech recognition model distilled and optimized with Malaysian datasets, supporting Malay and other languages
Speech Recognition
Tags: Transformers, Supports Multiple Languages, Malay speech recognition, Multi-source data distillation, Low-resource optimization

Downloads 30
Release Date: 12/30/2023
Model Overview
This model is a distilled version of Whisper Large V3, trained on Malaysian speech data to improve recognition accuracy for Malay and other local languages.
Model Features
Malaysian Localization Optimization
Trained with local Malaysian datasets, achieving better recognition performance for Malay and other local languages
Efficient Distilled Model
Distilled using the standard Hugging Face distillation process, which reduces model size while maintaining recognition performance
Multi-source Training Data
Incorporates various data sources including IMDA official datasets, YouTube pseudo-labeled data, and conversational corpora
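To make the distillation feature listed above more concrete, here is a minimal sketch of a generic knowledge-distillation objective: a weighted sum of a KL-divergence term (student versus teacher distributions) and a cross-entropy term on the target label. This illustrates the general technique only. The weights, temperature, and function names are hypothetical, and nothing here is confirmed to match the recipe actually used for this model.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax with an optional temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: List[float], q: List[float]) -> float:
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(
    student_logits: List[float],
    teacher_logits: List[float],
    target_index: int,
    alpha_kl: float = 0.8,    # hypothetical weight for the KL term
    alpha_ce: float = 1.0,    # hypothetical weight for the cross-entropy term
    temperature: float = 2.0, # hypothetical softening temperature
) -> float:
    """Weighted sum of a softened KL term and a hard-label cross-entropy term."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    # Scaling by T^2 keeps the KL term's gradient magnitude comparable across temperatures.
    kl_term = kl_divergence(teacher_probs, student_probs) * temperature ** 2
    ce_term = -math.log(softmax(student_logits)[target_index])
    return alpha_kl * kl_term + alpha_ce * ce_term
```

When the student reproduces the teacher's logits exactly, the KL term vanishes and only the cross-entropy term remains, which is a quick sanity check for an implementation like this.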
Model Capabilities
Malay speech recognition
Multilingual speech-to-text
Long audio processing
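The capabilities above could be exercised with the Hugging Face Transformers speech recognition pipeline, roughly as sketched below. The Hub identifier `mesolitica/malaysian-distil-whisper-large-v3` is an assumption inferred from the model and developer names on this page, so check it against the developer's model page before use. The helper names `build_pipeline_kwargs` and `transcribe` are hypothetical, as is the choice of 30-second chunks for long audio.

```python
from typing import Any, Dict

# Assumed Hub identifier; verify it against the developer's model page.
MODEL_ID = "mesolitica/malaysian-distil-whisper-large-v3"


def build_pipeline_kwargs(chunk_length_s: int = 30) -> Dict[str, Any]:
    """Collect the arguments passed to transformers.pipeline()."""
    return {
        "task": "automatic-speech-recognition",
        "model": MODEL_ID,
        # Chunking lets the pipeline handle audio longer than one Whisper window.
        "chunk_length_s": chunk_length_s,
    }


def transcribe(audio_path: str, language: str = "ms") -> str:
    """Transcribe an audio file; "ms" is the Whisper language code for Malay."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    asr = pipeline(**build_pipeline_kwargs())
    result = asr(
        audio_path,
        generate_kwargs={"language": language, "task": "transcribe"},
    )
    return result["text"]
```

Calling `transcribe("interview.wav")` would download the model weights on first use, where `interview.wav` stands in for any local audio file.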
Use Cases
Transcription Services
Malaysian Local Media Content Transcription
Provides automatic transcription services for Malaysian YouTube videos, podcasts, and other content
Compared to generic Whisper models, it achieves better recognition of Malay accents and local vocabulary
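A claim like the comparison above is usually checked by computing word error rate (WER) for each model on the same reference transcripts. The sketch below is a minimal WER implementation using word-level edit distance. The function name and the simple lowercase-and-split normalization are illustrative assumptions, and no benchmark figures for this model are implied.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref_words)
```

Running the same set of Malay recordings through this model and through a generic Whisper checkpoint, then averaging `word_error_rate` per model, gives a like-for-like comparison. Lower values mean fewer transcription errors.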
Educational Assistance
Malay Language Learning Applications
Used for developing Malay pronunciation assessment and voice interaction learning tools