Whisper Small Tajik
A Tajik automatic speech recognition model fine-tuned from OpenAI Whisper-small on the Google FLEURS dataset, with a reported word error rate (WER) of 24.26%.
Downloads 25
Release Time: 1/20/2025
Model Overview
This is an automatic speech recognition (ASR) model optimized for the Tajik language, suitable for converting Tajik speech to text.
Model Features
Tajik language optimization
Fine-tuned specifically for the Tajik language, offering better recognition of Tajik speech than the original Whisper model.
Efficient training
Trained with a relatively small batch size of 16 and 2 gradient accumulation steps, so gradients from two batches are combined before each optimizer update.
Optimized learning rate scheduling
Uses a cosine learning rate scheduler with a warmup ratio of 0.1, so the first 10% of training steps ramp the learning rate up before it decays.
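The training settings listed above can be made concrete with a small sketch. It computes the effective batch size implied by a batch size of 16 with 2 accumulation steps, and the learning rate at a given step under linear warmup followed by cosine decay. The function names, the single-device default, and the example `base_lr` and `total_steps` values are assumptions for illustration; the model card does not state the peak learning rate, the step count, or the exact scheduler implementation.

```python
import math

BATCH_SIZE = 16        # from the model card
GRAD_ACCUM_STEPS = 2   # from the model card
WARMUP_RATIO = 0.1     # from the model card


def effective_batch_size(batch_size=BATCH_SIZE,
                         accum_steps=GRAD_ACCUM_STEPS,
                         num_devices=1):
    """Samples contributing to one optimizer update.

    num_devices=1 is an assumption; the card does not say how many
    devices were used.
    """
    return batch_size * accum_steps * num_devices


def cosine_lr(step, total_steps, base_lr, warmup_ratio=WARMUP_RATIO):
    """Learning rate at `step`: linear warmup, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the shape of the schedule.
    total_steps, base_lr = 1000, 1e-5
    print("effective batch size:", effective_batch_size())
    for s in (0, 50, 100, 550, 1000):
        print(s, cosine_lr(s, total_steps, base_lr))
```

Gradient accumulation trades wall-clock time for memory: each optimizer update sees twice as many samples as fit in one batch, which is a common way to fine-tune Whisper-sized models on a single GPU.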
Model Capabilities
Tajik speech recognition
Speech-to-text
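Both capabilities come down to one operation: audio in, Tajik text out. The following is a minimal sketch of how that might look with the Hugging Face `transformers` speech recognition pipeline, assuming the checkpoint is published in a `transformers`-compatible format. The model card gives no repository ID, so `MODEL_ID` is a hypothetical placeholder, and `load_transcriber` and `transcribe` are illustrative helper names rather than library functions.

```python
# Hypothetical placeholder: replace with the actual repository ID of the model.
MODEL_ID = "your-namespace/whisper-small-tajik"


def load_transcriber(model_id=MODEL_ID):
    """Build an ASR pipeline for the given checkpoint."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    return pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # split long recordings into 30-second chunks
    )


def transcribe(audio_path, transcriber=None):
    """Return the transcript for one audio file as plain text."""
    asr = transcriber if transcriber is not None else load_transcriber()
    result = asr(audio_path)  # the pipeline returns a dict with a "text" key
    return result["text"].strip()


if __name__ == "__main__":
    # "recording.wav" is a placeholder path to a Tajik speech recording.
    print(transcribe("recording.wav"))
```

Passing a file path to the pipeline relies on `ffmpeg` being available to decode the audio. For many files, build the transcriber once with `load_transcriber()` and pass it to each `transcribe` call so the model is not reloaded.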
Use Cases
Speech transcription
Tajik meeting minutes
Automatically converts Tajik meeting recordings into text transcripts
Word error rate of 24.26% (as reported for the Google FLEURS dataset)
Voice assistant
Speech recognition module for Tajik voice assistant applications
Education
Language learning applications
Helps learners check the accuracy of Tajik pronunciation
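The 24.26% word error rate quoted for the transcription use case is easier to interpret with the metric spelled out. WER is the number of word substitutions, deletions, and insertions needed to turn the model output into the reference transcript, divided by the number of reference words, so 24.26% corresponds to roughly 24 word-level edits per 100 reference words. The sketch below is a plain implementation of that definition; `word_error_rate` is an illustrative name, and whitespace tokenization without any text normalization is an assumption, since the card does not describe how its figure was computed.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("ман ба мактаб меравам", "ман ба мактаб равам"))
```

Because the denominator is the reference length, WER can exceed 100% when the output contains many inserted words. The same function can score a learner's recognized speech against a target sentence in the language-learning use case, though recognition errors and pronunciation errors are not separable from this number alone.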