whisper-tiny-myanmar: A Free, Open-Source Speech-to-Text Model for Burmese

Whisper Tiny Myanmar

Developed by chuuhtetnaing

An automatic speech recognition (ASR) model based on openai/whisper-tiny and fine-tuned on a Burmese speech dataset for Burmese speech-to-text tasks.

Speech Recognition

Transformers

Open Source License: Apache-2.0 #Burmese speech recognition #Low-resource optimization #Whisper fine-tuning

Downloads 84

Release Time: 8/31/2024

Model Overview

A lightweight speech recognition model optimized for Burmese, suitable for Burmese speech transcription scenarios.

Model Features

Burmese Optimization

Fine-tuned specifically on Burmese speech to improve recognition accuracy over the base model.

Lightweight Architecture

A small model based on whisper-tiny, suitable for deployment in resource-constrained environments.

End-to-End Recognition

Supports the complete workflow from speech input to text output.

Model Capabilities

Burmese speech recognition

Audio transcription

Real-time speech-to-text

Use Cases

Speech Transcription

Burmese Meeting Minutes

Automatically convert Burmese meeting recordings into text records.

Word error rate (WER): 61.89% on the test set

Voice Assistant

Provide recognition capabilities for Burmese voice assistants

🚀 whisper-tiny-myanmar

This model is a fine-tuned version of openai/whisper-tiny on the chuuhtetnaing/myanmar-speech-dataset-openslr-80 dataset. It is designed for automatic speech recognition in the Myanmar language, offering a practical solution for transcribing Myanmar speech.

🚀 Quick Start

The fine-tuned model achieves the following results on the evaluation set:

Loss: 0.2353
WER: 61.8878
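
To make the WER figure easier to interpret, here is a minimal sketch of how word error rate is commonly computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, expressed as a percentage. The function name `word_error_rate` and the whitespace tokenization are assumptions for illustration; the model card does not state how the reported score was tokenized or which library computed it.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent, splitting both strings on whitespace (assumed tokenization)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference tokens to reach an empty hypothesis
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substitution and one deletion against four reference words.
print(word_error_rate("a b c d", "a x c"))          # 50.0
# Six inserted words against two reference words: WER can exceed 100%.
print(word_error_rate("a b", "a b a b a b a b"))    # 300.0
```

Because insertions count as errors, a model that emits far more tokens than the reference can score well above 100%, which is consistent with the first-epoch value of 357.6135 in the training results.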

💻 Usage Examples

Basic Usage

from datasets import Audio, load_dataset
from transformers import pipeline

# Load the dataset and resample its audio to 16 kHz, the rate Whisper expects
dataset = load_dataset("chuuhtetnaing/myanmar-speech-dataset-openslr-80")
dataset = dataset.cast_column("audio", Audio(sampling_rate=16000))
test_dataset = dataset['test']
input_speech = test_dataset[42]['audio']

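# Build an ASR pipeline from the fine-tuned checkpoint
pipe = pipeline("automatic-speech-recognition", model='chuuhtetnaing/whisper-tiny-myanmar')

# Set the language and task explicitly instead of relying on auto-detection
output = pipe(input_speech, generate_kwargs={"language": "myanmar", "task": "transcribe"})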
print(output['text']) # ကျွန်မ ပြည်ပ မှာ ပညာ သင် တော့ စာမြီးပွဲ ကို တပတ်တခါ စစ်တယ်

🔧 Technical Details

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 128
eval_batch_size: 128
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 200
num_epochs: 30
mixed_precision_training: Native AMP
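
The scheduler settings above can be illustrated with a small sketch of a linear schedule with warmup: the learning rate rises linearly from 0 to the peak over the warmup steps, then decays linearly to 0 at the final step. The function name `linear_schedule_lr` is hypothetical, and the total of 540 steps is taken from the last row of the training results table rather than from the hyperparameter list itself.

```python
def linear_schedule_lr(step: int,
                       peak_lr: float = 3e-4,      # learning_rate from the list above
                       warmup_steps: int = 200,    # lr_scheduler_warmup_steps
                       total_steps: int = 540) -> float:  # final step in the results table
    """Learning rate at a given optimizer step under linear warmup and linear decay."""
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from peak_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 100, 200, 370, 540):
    print(step, f"{linear_schedule_lr(step):.6f}")
```

Under these settings the warmup spans 200 of the 540 steps, so the peak rate of 0.0003 is only reached after roughly epoch 11 (step 198 is the end of epoch 11 in the results table).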

Training results

Training Loss	Epoch	Step	Validation Loss	WER
No log	1.0	18	1.2679	357.6135
1.483	2.0	36	1.0660	102.5378
1.0703	3.0	54	0.9530	106.3669
1.0703	4.0	72	0.8399	100.5343
0.8951	5.0	90	0.7728	107.6581
0.7857	6.0	108	0.7143	107.5245
0.6614	7.0	126	0.5174	104.4078
0.6614	8.0	144	0.3004	90.3384
0.3519	9.0	162	0.2447	82.4577
0.2165	10.0	180	0.2333	83.8825
0.2165	11.0	198	0.2022	77.0258
0.1532	12.0	216	0.1759	73.0632
0.1039	13.0	234	0.1852	72.0837
0.0675	14.0	252	0.1902	71.2823
0.0675	15.0	270	0.1882	70.5254
0.0517	16.0	288	0.2002	69.7240
0.0522	17.0	306	0.1965	67.7649
0.0522	18.0	324	0.1935	68.2102
0.0404	19.0	342	0.2132	67.9430
0.0308	20.0	360	0.2110	66.6963
0.0236	21.0	378	0.2141	65.9394
0.0236	22.0	396	0.2200	64.4702
0.0116	23.0	414	0.2227	63.4016
0.0055	24.0	432	0.2244	64.1585
0.0025	25.0	450	0.2254	62.4666
0.0025	26.0	468	0.2282	63.1790
0.0006	27.0	486	0.2320	61.7097
0.0002	28.0	504	0.2342	62.0659
0.0002	29.0	522	0.2350	62.0214
0.0001	30.0	540	0.2353	61.8878
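
The table shows validation loss and WER moving in different directions late in training. The short sketch below, using the epoch, validation loss, and WER columns copied from the table, finds the epoch that minimizes each metric. The variable names are illustrative, and the model card does not say which criterion, if any, was used to select the published checkpoint.

```python
# (epoch, validation_loss, wer) copied from the training results table
RESULTS = [
    (1, 1.2679, 357.6135), (2, 1.0660, 102.5378), (3, 0.9530, 106.3669),
    (4, 0.8399, 100.5343), (5, 0.7728, 107.6581), (6, 0.7143, 107.5245),
    (7, 0.5174, 104.4078), (8, 0.3004, 90.3384), (9, 0.2447, 82.4577),
    (10, 0.2333, 83.8825), (11, 0.2022, 77.0258), (12, 0.1759, 73.0632),
    (13, 0.1852, 72.0837), (14, 0.1902, 71.2823), (15, 0.1882, 70.5254),
    (16, 0.2002, 69.7240), (17, 0.1965, 67.7649), (18, 0.1935, 68.2102),
    (19, 0.2132, 67.9430), (20, 0.2110, 66.6963), (21, 0.2141, 65.9394),
    (22, 0.2200, 64.4702), (23, 0.2227, 63.4016), (24, 0.2244, 64.1585),
    (25, 0.2254, 62.4666), (26, 0.2282, 63.1790), (27, 0.2320, 61.7097),
    (28, 0.2342, 62.0659), (29, 0.2350, 62.0214), (30, 0.2353, 61.8878),
]

# Pick the row with the smallest value of each metric.
best_by_loss = min(RESULTS, key=lambda row: row[1])
best_by_wer = min(RESULTS, key=lambda row: row[2])

print("lowest validation loss:", best_by_loss)
print("lowest WER:", best_by_wer)
```

Validation loss bottoms out at epoch 12 (0.1759) and then drifts upward while training loss approaches zero, yet WER keeps improving until epoch 27 (61.7097). The reported evaluation figures (loss 0.2353, WER 61.8878) match the final epoch.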

Framework versions

Transformers 4.35.2
PyTorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.15.1

📄 License

This model is licensed under the Apache 2.0 license.

📋 Model Information

Property	Details
Model Type	Fine-tuned version of openai/whisper-tiny
Training Data	chuuhtetnaing/myanmar-speech-dataset-openslr-80
Metrics	WER
Pipeline Tag	Automatic Speech Recognition
Library Name	Transformers
Language	Myanmar
