Whisper-small-tajik Open-source Model - Achieve Accurate Tajik Automatic Speech Recognition for Free

Whisper Small Tajik

Developed by abduaziz

A Tajik automatic speech recognition model fine-tuned from OpenAI Whisper-small on the Google FLEURS dataset, reaching a word error rate (WER) of 24.26% on the evaluation set.

Speech Recognition

Transformers

Open Source License: Apache-2.0

Tags: #Tajik speech recognition #Low word error rate #Multilingual ASR

Downloads 25

Release Time: 1/20/2025

Model Overview

This is an automatic speech recognition (ASR) model fine-tuned for the Tajik language. It converts spoken Tajik audio into text.
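A minimal usage sketch, assuming the model is published on the Hugging Face Hub and loaded through the `transformers` ASR pipeline. The page does not state the repository ID, so `model_id` is left as a required argument; the helper names `whisper_generate_kwargs` and `transcribe` are hypothetical and exist only for this illustration.

```python
def whisper_generate_kwargs(language="tajik", task="transcribe"):
    """Generation options that ask Whisper to transcribe in a given language."""
    return {"language": language, "task": task}


def transcribe(audio_path, model_id):
    """Transcribe one audio file with a Whisper checkpoint.

    model_id is the Hub repository ID of the fine-tuned checkpoint.
    It is not given on this page, so the caller must supply it.
    """
    # Imported lazily so the helper above can be used without transformers.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path, generate_kwargs=whisper_generate_kwargs())
    return result["text"]
```

Calling `transcribe("meeting.wav", model_id)` with a real repository ID downloads the checkpoint on first use. Decoding a file path this way generally requires `ffmpeg` to be installed.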

Model Features

Tajik language optimization

Fine-tuned specifically on Tajik speech to improve recognition of the language over the base Whisper-small model.

Efficient training

Trains with a relatively small batch size (16) combined with gradient accumulation over 2 steps.

Optimized learning rate scheduling

Uses a cosine learning rate scheduler with a warmup ratio of 0.1.
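The schedule described above can be sketched in plain Python. This is an illustration of linear warmup followed by cosine decay, not the author's training code. The total of 790 steps comes from the training results table (10 epochs of 79 steps). The peak learning rate is not stated on this page, so `peak_lr=1e-5` is a placeholder assumption.

```python
import math


def cosine_lr(step, total_steps, warmup_ratio=0.1, peak_lr=1e-5):
    """Learning rate at a given optimizer step.

    Linear warmup from 0 to peak_lr, then cosine decay back to 0.
    peak_lr is a placeholder; the real value is not given in the source.
    """
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 790  # 10 epochs x 79 steps, per the training results table

for step in (0, 40, 79, 395, 790):
    print(step, cosine_lr(step, TOTAL_STEPS))
```

With a warmup ratio of 0.1 and 790 total steps, warmup covers the first 79 steps, which is exactly the first epoch.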

Model Capabilities

Tajik speech recognition

Speech-to-text

Use Cases

Speech transcription

Tajik meeting minutes

Automatically converts Tajik meeting recordings into text transcripts

Word error rate of 24.26% on the evaluation set

Voice assistant

Speech recognition module for Tajik voice assistant applications

Education

Language learning applications

Helps learners check the accuracy of Tajik pronunciation
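To make the word error rate quoted for this model concrete: WER is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch of that definition in plain Python. It is not the evaluation script used for this model, and the example sentences are placeholders rather than Tajik data.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Word-level Levenshtein distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("two" -> "too") and one deletion ("four"): 2 errors / 4 words.
print(word_error_rate("one two three four", "one too three"))  # 0.5
```

On this scale, a WER of 24.26% means roughly one word in four differs from the reference, which is worth keeping in mind for use cases that need verbatim transcripts.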

Property	Details
Library Name	transformers
Language	tg
License	apache-2.0
Base Model	openai/whisper-small
Tags	generated_from_trainer
Datasets	google/fleurs
Metrics	wer

Training Loss	Epoch	Step	Validation Loss	WER (%)
2.7687	1.0	79	0.5778	39.6568
0.7193	2.0	158	0.3890	28.3568
0.3659	3.0	237	0.3611	26.0636
0.2021	4.0	316	0.3629	25.1068
0.1099	5.0	395	0.3740	25.3044
0.0597	6.0	474	0.3887	24.3081
0.0339	7.0	553	0.4005	24.6639
0.0213	8.0	632	0.4082	24.3239
0.0158	9.0	711	0.4131	24.2685
0.014	10.0	790	0.4141	24.2606
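The table shows validation loss and WER moving in different directions after the third epoch. A short sketch that reads the rows above and picks the best epoch under each criterion (the variable names are illustrative only):

```python
# (epoch, step, training_loss, validation_loss, wer) copied from the table above
results = [
    (1, 79, 2.7687, 0.5778, 39.6568),
    (2, 158, 0.7193, 0.3890, 28.3568),
    (3, 237, 0.3659, 0.3611, 26.0636),
    (4, 316, 0.2021, 0.3629, 25.1068),
    (5, 395, 0.1099, 0.3740, 25.3044),
    (6, 474, 0.0597, 0.3887, 24.3081),
    (7, 553, 0.0339, 0.4005, 24.6639),
    (8, 632, 0.0213, 0.4082, 24.3239),
    (9, 711, 0.0158, 0.4131, 24.2685),
    (10, 790, 0.014, 0.4141, 24.2606),
]

best_by_loss = min(results, key=lambda row: row[3])
best_by_wer = min(results, key=lambda row: row[4])

print("lowest validation loss:", best_by_loss[0], best_by_loss[3])  # 3 0.3611
print("lowest WER:", best_by_wer[0], best_by_wer[4])                # 10 24.2606
```

Validation loss is lowest at epoch 3 and rises afterwards while training loss keeps falling, which usually indicates overfitting in the loss sense. WER still improves slightly and reaches its minimum of 24.2606 at epoch 10, matching the 24.26% reported for the model.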

