Whisper-Hindi-Small Open-Source Speech Recognition Model - Free Deployment for Accurate Hindi Speech Recognition

Whisper Hindi Small

Developed by vasista22

A Hindi speech recognition model fine-tuned from OpenAI Whisper-small on multiple public ASR corpora

Speech Recognition | Other | Open Source License: Apache-2.0 | #Hindi speech recognition #Low word error rate #Multi-scenario adaptation

Downloads 477

Release Time : 1/8/2023

Model Overview

This is an automatic speech recognition (ASR) model optimized for Hindi, fine-tuned from OpenAI's Whisper-small architecture. It is mainly used to convert Hindi speech to text.

Model Features

Hindi Optimization

Specially fine-tuned and optimized for Hindi speech recognition

Multi-dataset Training

Trained on multiple public Hindi ASR corpora including GramVaani, ULCA, and Shrutilipi

Efficient Inference Support

Supports accelerated inference using whisper-jax

Model Capabilities

Hindi speech recognition

Long audio processing (supports chunk processing)

Use Cases

Speech Transcription

Hindi Speech Transcription

Convert Hindi speech content to text

🚀 Whisper Hindi Small

This model is a fine-tuned version of openai/whisper-small, trained on Hindi data from multiple public ASR corpora as part of the Whisper fine-tuning sprint.

🚀 Quick Start

This model is a fine-tuned variant of openai/whisper-small, trained on Hindi data from multiple publicly accessible ASR corpora. It was developed as part of the Whisper fine-tuning sprint.

NOTE: The code for training this model can be reused from the whisper-finetune repository.

✨ Features

Fine-tuned on multiple Hindi ASR corpora.
Code for training and evaluation is publicly available for reuse.
Supports faster inference with whisper-jax.

📦 Installation

The installation and evaluation code can be found in the whisper-finetune repository.

💻 Usage Examples

Basic Usage

To transcribe a single audio file with this model, you can use the following code snippet:

>>> import torch
>>> from transformers import pipeline

>>> # path to the audio file to be transcribed
>>> audio = "/path/to/audio.format"
>>> device = "cuda:0" if torch.cuda.is_available() else "cpu"

>>> transcribe = pipeline(task="automatic-speech-recognition", model="vasista22/whisper-hindi-small", chunk_length_s=30, device=device)
>>> transcribe.model.config.forced_decoder_ids = transcribe.tokenizer.get_decoder_prompt_ids(language="hi", task="transcribe")

>>> print('Transcription: ', transcribe(audio)["text"])
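
The `chunk_length_s=30` argument above makes the pipeline split long recordings into 30-second windows before decoding. As a rough illustration of the idea (a simplified sketch, not the pipeline's actual implementation), the hypothetical helper `chunk_spans` below computes overlapping sample ranges. The 16 kHz sampling rate and the 5-second overlap on each side are assumptions chosen for the example.

```python
def chunk_spans(num_samples, sampling_rate=16000, chunk_length_s=30.0, stride_s=5.0):
    """Return (start, end) sample indices of overlapping chunks covering the audio.

    Hypothetical helper for illustration only. Consecutive chunks overlap by
    2 * stride_s seconds so that words cut at a boundary appear whole in at
    least one chunk.
    """
    chunk = int(chunk_length_s * sampling_rate)
    stride = int(stride_s * sampling_rate)
    step = chunk - 2 * stride  # how far the window advances each time
    if step <= 0:
        raise ValueError("stride_s must be less than half of chunk_length_s")

    spans = []
    start = 0
    while True:
        end = min(start + chunk, num_samples)
        spans.append((start, end))
        if end >= num_samples:  # the last chunk reaches the end of the audio
            break
        start += step
    return spans


# A 70-second recording at 16 kHz is covered by three overlapping windows.
for start, end in chunk_spans(70 * 16000):
    print(f"{start / 16000:.0f}s to {end / 16000:.0f}s")
```

Under these assumptions, each window advances by 20 seconds, so a recording shorter than 30 seconds yields a single chunk.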

Advanced Usage

For faster inference with whisper-jax, first follow the installation steps in the whisper-jax repository, then use the following code:

>>> import jax.numpy as jnp
>>> from whisper_jax import FlaxWhisperForConditionalGeneration, FlaxWhisperPipline

>>> # path to the audio file to be transcribed
>>> audio = "/path/to/audio.format"

>>> transcribe = FlaxWhisperPipline("vasista22/whisper-hindi-small", batch_size=16)
>>> transcribe.model.config.forced_decoder_ids = transcribe.tokenizer.get_decoder_prompt_ids(language="hi", task="transcribe")

>>> print('Transcription: ', transcribe(audio)["text"])

📚 Documentation

Training and evaluation data

Training Data:
Evaluation Data:
- GramVaani ASR Corpus Test Set
- Google/Fleurs Test Set
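
ASR models are commonly scored on test sets like these with word error rate (WER). This page does not say which metric or text normalization was used, so the following is only a minimal sketch of plain WER, assuming whitespace tokenization and no normalization. The function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: splits on whitespace and applies no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = cur[j - 1] + 1
            cur[j] = min(substitution, deletion, insertion)
        prev = cur
    return prev[-1] / len(ref)


# One substituted word out of five reference words.
print(word_error_rate("मैं घर जा रहा हूँ", "मैं घर आ रहा हूँ"))
```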

Training hyperparameters

The following hyperparameters were used during training:

Property	Details
learning_rate	1.75e-05
train_batch_size	48
eval_batch_size	32
seed	22
optimizer	adamw_bnb_8bit
lr_scheduler_type	linear
lr_scheduler_warmup_steps	20000
training_steps	19377 (Initially set to 129180 steps)
mixed_precision_training	True
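
Note that training stopped at step 19377, before the 20000 warmup steps had completed. The sketch below shows what that would imply for the learning rate, assuming the common linear-warmup-then-linear-decay shape and assuming the schedule was built for the initially planned 129180 steps. Neither assumption is confirmed by this page, and `linear_schedule_lr` is a hypothetical helper.

```python
def linear_schedule_lr(step, peak_lr=1.75e-5, warmup_steps=20000, total_steps=129180):
    """Learning rate at `step` for linear warmup followed by linear decay.

    Assumed schedule shape; defaults are taken from the hyperparameter table.
    """
    if step < warmup_steps:
        # Warmup: rise linearly from 0 to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay: fall linearly from peak_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


final_lr = linear_schedule_lr(19377)
print(f"{final_lr:.3e}")
```

Under these assumptions the final learning rate is `1.75e-05 * 19377 / 20000`, about 97% of the configured peak, and the decay phase was never reached.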

🔧 Technical Details

This model is a fine-tuned version of openai/whisper-small on Hindi data. The fine-tuning was carried out as part of the Whisper fine-tuning sprint.

📄 License

This model is licensed under the Apache-2.0 license.

Acknowledgement

This work was conducted at Speech Lab, IIT Madras. The compute resources were funded by the "Bhashini: National Language Translation Mission" project of the Ministry of Electronics and Information Technology (MeitY), Government of India.
