# Whisper Small Sinhala v3 - Lingalingeswaran
This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) for Sinhala automatic speech recognition, trained on the Lingalingeswaran/asr-sinhala-dataset_json_v1 dataset.
## Quick Start
This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the Lingalingeswaran/asr-sinhala-dataset_json_v1 dataset. It achieves the following results on the evaluation set:

- Loss: 0.2086
- Wer: 46.4577
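WER (word error rate) is reported here as a percentage. As a minimal sketch of how such a score can be computed with the `evaluate` library (the transcript strings below are purely illustrative, not taken from the dataset):

```python
import evaluate

# Load the word error rate metric; "wer" is the metric id listed on this card.
wer_metric = evaluate.load("wer")

# Illustrative placeholders; a real evaluation compares the model's
# predictions against the dataset's reference transcriptions.
predictions = ["the cat sat on the mat"]
references = ["the cat sat on a mat"]

# `evaluate` returns WER as a fraction; multiply by 100 to get a
# percentage like the 46.4577 reported above.
print(f"WER: {100 * wer_metric.compute(predictions=predictions, references=references):.4f}")
```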
## Features
- Fine-tuned on the Lingalingeswaran/asr-sinhala-dataset_json_v1 dataset for Sinhala speech recognition.
- Reaches a validation loss of 0.2086 and a word error rate (WER) of 46.4577 on the evaluation set.
## Installation

The original model card does not list installation steps. The usage example below requires the `transformers` and `gradio` packages along with a PyTorch backend, which can be installed with `pip install transformers gradio torch`.
## Usage Examples

### Basic Usage
Here is an example of how to use the model for Sinhala speech recognition with Gradio:
```python
import gradio as gr
from transformers import pipeline

# Load the fine-tuned checkpoint as a speech recognition pipeline.
pipe = pipeline(model="Lingalingeswaran/whisper-small-sinhala_v3")

def transcribe(audio):
    # `audio` is a file path supplied by the Gradio Audio component.
    text = pipe(audio)["text"]
    return text

iface = gr.Interface(
    fn=transcribe,
    inputs=gr.Audio(sources=["microphone", "upload"], type="filepath"),
    outputs="text",
    title="Whisper Small Sinhala",
    description="Realtime demo for Sinhala speech recognition using a fine-tuned Whisper small model.",
)

if __name__ == "__main__":
    iface.launch()
```
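To get a transcription without the web UI, the same pipeline can be called directly. A minimal sketch, assuming a local audio file at the hypothetical path `sample.wav`:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint as an automatic speech recognition pipeline.
asr = pipeline("automatic-speech-recognition", model="Lingalingeswaran/whisper-small-sinhala_v3")

# "sample.wav" is a placeholder for any local audio file path.
print(asr("sample.wav")["text"])
```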
## Documentation
### Model description

More information needed

### Intended uses & limitations

More information needed

### Training and evaluation data

More information needed
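The fine-tuning dataset referenced on this card is hosted on the Hugging Face Hub. A minimal sketch of loading it with the `datasets` library; the `test` split name is an assumption inferred from the card's dataset args (`config: si, split: test`) and may differ in the actual repository:

```python
from datasets import load_dataset

# Repo id taken from this card; the split name is an assumption based on
# the card's evaluation args.
ds = load_dataset("Lingalingeswaran/asr-sinhala-dataset_json_v1", split="test")
print(ds)
```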
### Training procedure

#### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 3000
- mixed_precision_training: Native AMP
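A minimal sketch of how these values map onto Hugging Face `Seq2SeqTrainingArguments`; the `output_dir` and any settings not reported above are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-small-sinhala_v3",  # assumed; not reported on the card
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=3000,
    fp16=True,  # "Native AMP" mixed precision
)
```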
### Training results

| Training Loss | Epoch  | Step | Validation Loss | Wer     |
|:-------------:|:------:|:----:|:---------------:|:-------:|
| 0.1852        | 1.7606 | 1000 | 0.1875          | 50.9772 |
| 0.0602        | 3.5211 | 2000 | 0.1886          | 47.5774 |
| 0.0238        | 5.2817 | 3000 | 0.2086          | 46.4577 |
### Framework versions

- Transformers 4.48.1
- PyTorch 2.5.1+cu121
- Datasets 3.2.0
- Tokenizers 0.21.0
## License

This model is licensed under the apache-2.0 license.
## Information Table

| Property     | Details                                        |
|--------------|------------------------------------------------|
| Library Name | transformers                                   |
| Language     | si                                             |
| License      | apache-2.0                                     |
| Base Model   | openai/whisper-small                           |
| Tags         | generated_from_trainer                         |
| Datasets     | Lingalingeswaran/asr-sinhala-dataset_json_v1   |
| Metrics      | wer                                            |
| Model Name   | Whisper Small sinhala v3 - Lingalingeswaran    |
| Task         | Automatic Speech Recognition                   |
| Dataset Name | Lingalingeswaran/asr-sinhala-dataset_json_v1   |
| Dataset Type | Lingalingeswaran/asr-sinhala-dataset_json_v1   |
| Dataset Args | config: si, split: test                        |
| Wer Value    | 46.457654723127035                             |