Malaysian Distil Whisper Large V3

Developed by mesolitica
A speech recognition model distilled from Whisper Large V3 and trained on Malaysian datasets, supporting Malay and other languages
Downloads 30
Release date: 12/30/2023

Model Overview

This model is a distilled version of Whisper Large V3, trained on Malaysian speech data to improve recognition accuracy for Malay and other local languages.

Model Features

Malaysian Localization Optimization
Trained on local Malaysian datasets to improve recognition of Malay and other local languages
Efficient Distilled Model
Distilled following the standard Hugging Face distillation process, reducing model size while maintaining performance
Multi-source Training Data
Draws on several data sources, including official IMDA datasets, pseudo-labeled YouTube audio, and conversational corpora

Model Capabilities

Malay speech recognition
Multilingual speech-to-text
Long audio processing
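
The capabilities above can be tried through the Hugging Face transformers speech-recognition pipeline. The following is a minimal sketch under stated assumptions, not the developers' documented usage: the repository ID in MODEL_ID, the language code "ms", and the helper names build_pipeline_kwargs and transcribe are illustrative and should be checked against the model's own page. Long audio is handled here by the pipeline's chunk_length_s option, which splits a recording into windows that fit Whisper's 30-second input.

```python
from typing import Any, Dict

# Assumed Hugging Face repository ID; verify it against the model's own page.
MODEL_ID = "mesolitica/malaysian-distil-whisper-large-v3"


def build_pipeline_kwargs(model_id: str = MODEL_ID, chunk_length_s: int = 30) -> Dict[str, Any]:
    """Collect the arguments passed to transformers.pipeline in one place."""
    return {
        "task": "automatic-speech-recognition",
        "model": model_id,
        # Whisper processes 30-second windows; chunking lets the pipeline
        # handle recordings longer than that.
        "chunk_length_s": chunk_length_s,
    }


def transcribe(audio_path: str, language: str = "ms") -> str:
    """Transcribe one audio file. The model is downloaded on first use."""
    # Imported lazily because transformers is a heavy optional dependency.
    from transformers import pipeline

    asr = pipeline(**build_pipeline_kwargs())
    # The language code "ms" (Malay) is an assumption about the intended input.
    result = asr(audio_path, generate_kwargs={"language": language, "task": "transcribe"})
    return result["text"]
```

Calling transcribe("interview.mp3"), where interview.mp3 is a hypothetical local recording, would return the transcript as a single string. Passing a different language code is one way to try the multilingual capability.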

Use Cases

Transcription Services
Malaysian Local Media Content Transcription
Provides automatic transcription for Malaysian YouTube videos, podcasts, and similar content
Compared with generic Whisper models, recognizes Malay accents and local vocabulary more accurately
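
The page gives no figures for this comparison. One common way to check such a claim on your own recordings is word error rate (WER): the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below is an illustration only; the function name word_error_rate and the sample sentences are made up for the example and do not come from the model's evaluation.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref: List[str] = reference.lower().split()
    hyp: List[str] = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Illustrative transcripts (not real model output): one dropped word out of five.
reference = "saya suka makan nasi lemak"
hypothesis = "saya suka makan nasi"
print(word_error_rate(reference, hypothesis))
```

Running the same reference transcripts against this model and a generic Whisper checkpoint, then comparing the two scores, gives a like-for-like measure; lower is better.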
Educational Assistance
Malay Language Learning Applications
Supports the development of Malay pronunciation assessment and voice-interaction learning tools