indic_whisper_hi_multi_gpu Open Source Speech Recognition Model - Super-accurate Recognition Optimized for Indian Languages

Developed by parthiv11

IndicWhisper is a speech recognition model optimized for Indian languages that performs strongly across Indian-language benchmarks.

Speech Recognition | Other | Open Source License: MIT | #Hindi Speech Recognition #JAX Acceleration #Low Word Error Rate

Downloads 72

Release Time: 2/28/2024

Model Overview

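This JAX-accelerated build of IndicWhisper is a speech recognition model optimized for Indian languages. On the Hindi subset of the Vistaar benchmark it reports an average word error rate (WER) of 13.6, against 18.6 for the next-best model in the comparison, which makes it a strong choice for Hindi speech recognition tasks.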

Model Features

JAX Acceleration

Built-in JAX support speeds up inference on TPUs and GPUs, with a reported speedup of over 70x compared to the original PyTorch implementation.

Outstanding Benchmark Performance

On the Hindi subset of the Vistaar benchmark, the average word error rate (WER) is 13.6, lower than that of every other public model in the comparison (18.6 to 23.9).

Optimized for Indian Languages

Specially optimized for Indian languages, delivering impressive performance across various Indian language benchmarks.

Model Capabilities

Hindi Speech Recognition

Multilingual Support

High-Performance Computing

Use Cases

Speech Recognition

Hindi Speech-to-Text

Convert Hindi speech content into text

Word error rate is significantly lower than that of other public models

🚀 IndicWhisper With JAX (faster)

IndicWhisper is a state-of-the-art speech recognition model fine-tuned on Indian languages, offering pre-trained checkpoints and code for training and evaluation.

🚀 Quick Start

IndicWhisper is a state-of-the-art speech recognition model fine-tuned on Indian languages. This repository contains the code for training and evaluating the model, as well as pre-trained checkpoints for immediate use.

✨ Features

High Performance: IndicWhisper achieves impressive Word Error Rates (WERs) on various benchmarks for Indian languages, outperforming other publicly available models.
JAX Mode: Recently added support for JAX mode significantly enhances performance on both TPUs and GPUs, making it the fastest Whisper implementation available.

📚 Documentation

Overview

IndicWhisper achieves impressive Word Error Rates (WERs) on various benchmarks for Indian languages. It outperforms other publicly available models, making it a valuable asset for speech recognition tasks in Indian languages.
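For readers unfamiliar with the metric, the sketch below shows the standard definition of WER: the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. It is an illustration only, not the Vistaar evaluation script. The function name `word_error_rate` and the example sentences are assumptions made for this sketch, and real evaluations usually normalize text (punctuation, numerals, spelling variants) before scoring, which is not done here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word out of four reference words.
wer = word_error_rate("मेरा नाम राहुल है", "मेरा नाम राहुल हैं")
print(f"WER: {100 * wer:.1f}%")  # prints "WER: 25.0%"
```

Benchmark tables typically report this value as a percentage, so lower numbers mean fewer word-level mistakes.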

Performance on Vistaar Benchmark (Hindi Subset, WER in %, lower is better)

Model	Kathbath	Kathbath-Hard	FLEURS	CommonVoice	IndicTTS	MUCS	Gramvaani	Average
Google STT	14.3	16.7	19.4	20.8	18.3	17.8	59.9	23.9
IndicWav2vec	12.2	16.2	18.3	20.2	15.0	22.9	42.1	21.0
Azure STT	13.6	15.1	24.3	14.6	15.2	15.1	42.3	20.0
Nvidia-medium	14.0	15.6	19.4	20.4	12.3	12.4	41.3	19.4
Nvidia-large	12.7	14.2	15.7	21.2	12.2	11.8	42.6	18.6
IndicWhisper	10.3	12.0	11.4	15.0	7.6	12.0	26.8	13.6
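As a sanity check on the table, the following sketch recomputes the Average column as the unweighted mean of the seven per-dataset scores and picks the model with the lowest mean. The numbers are copied from the table above. The variable and function names are illustrative, and the assumption that Average is a plain unweighted mean is ours rather than something the source states.

```python
# Per-dataset Hindi WER (%) copied from the table above, in column order:
# Kathbath, Kathbath-Hard, FLEURS, CommonVoice, IndicTTS, MUCS, Gramvaani
hindi_wer = {
    "Google STT":    [14.3, 16.7, 19.4, 20.8, 18.3, 17.8, 59.9],
    "IndicWav2vec":  [12.2, 16.2, 18.3, 20.2, 15.0, 22.9, 42.1],
    "Azure STT":     [13.6, 15.1, 24.3, 14.6, 15.2, 15.1, 42.3],
    "Nvidia-medium": [14.0, 15.6, 19.4, 20.4, 12.3, 12.4, 41.3],
    "Nvidia-large":  [12.7, 14.2, 15.7, 21.2, 12.2, 11.8, 42.6],
    "IndicWhisper":  [10.3, 12.0, 11.4, 15.0, 7.6, 12.0, 26.8],
}

# The Average column as reported in the table.
reported_average = {
    "Google STT": 23.9, "IndicWav2vec": 21.0, "Azure STT": 20.0,
    "Nvidia-medium": 19.4, "Nvidia-large": 18.6, "IndicWhisper": 13.6,
}


def mean_wer(scores):
    """Unweighted mean of the per-dataset scores, rounded to one decimal."""
    return round(sum(scores) / len(scores), 1)


recomputed = {model: mean_wer(scores) for model, scores in hindi_wer.items()}
best_model = min(recomputed, key=recomputed.get)

for model, value in recomputed.items():
    print(f"{model:14s} recomputed={value:5.1f} reported={reported_average[model]:5.1f}")
print("Lowest average WER:", best_model)
```

The recomputed means match the reported column except for Nvidia-medium, which comes out at 19.3 instead of 19.4. A likely explanation is that the reported averages were computed from unrounded per-dataset scores.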

💻 Usage Examples

Basic Usage

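import jax.numpy as jnp
from whisper_jax import FlaxWhisperPipline  # the class name is spelled this way in whisper_jax

# Load the checkpoint in bfloat16 to reduce memory use and speed up inference
pipeline = FlaxWhisperPipline('parthiv11/indic_whisper_hi_multi_gpu', dtype=jnp.bfloat16)

# The first call JIT-compiles the model, so later calls are faster
transcript = pipeline('sample.mp3')
print(transcript['text'])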
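If segment-level timing is needed, the pipeline result can be post-processed into readable lines. The sketch below assumes the result is a dict with a `text` key and, when the pipeline is called with `return_timestamps=True`, a `chunks` list whose items carry a `(start, end)` `timestamp` tuple and a `text` string, as described in the whisper-jax README. The helper `format_chunks` and the sample data are hypothetical and written for illustration. The sample dict is a stand-in, not real model output.

```python
from typing import List


def format_chunks(outputs: dict) -> List[str]:
    """Turn a pipeline result with timestamped chunks into printable lines."""
    lines = []
    for chunk in outputs.get("chunks", []):
        start, end = chunk["timestamp"]
        # The end time may be missing for the last segment, so guard against None.
        end_label = f"{end:.2f}" if end is not None else "?"
        lines.append(f"[{start:.2f} - {end_label}] {chunk['text'].strip()}")
    return lines


# Illustrative stand-in for what a call such as
#   pipeline('sample.mp3', return_timestamps=True)
# is assumed to return. The text is made up for this example.
sample_outputs = {
    "text": " नमस्ते आप कैसे हैं",
    "chunks": [
        {"timestamp": (0.0, 1.2), "text": " नमस्ते"},
        {"timestamp": (1.2, 3.0), "text": " आप कैसे हैं"},
    ],
}

for line in format_chunks(sample_outputs):
    print(line)
```

Keeping the formatting in a separate function means it can be checked without loading the model or an accelerator.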

Acknowledgements

We would like to express our gratitude to the following organizations for their support:

EkStep Foundation for their generous grant, which facilitated the establishment of the Centre for AI4Bharat at IIT Madras.
The Ministry of Electronics and Information Technology (NLTM) for its grant to support the creation of datasets and models for Indian languages under the Bhashini project.
The Centre for Development of Advanced Computing, India (C-DAC), for providing access to the Param Siddhi supercomputer for training our models.
Microsoft for its grant to create datasets, tools, and resources for Indian languages.
A guide to the JAX mode is available on GitHub.

📄 License

IndicWhisper and the associated Vistaar benchmark are MIT-licensed. This license applies to all the fine-tuned language models included in this repository.

Contributors

Kaushal Bhogale (AI4Bharat)
Sai Narayan Sundaresan (IITKGP, AI4Bharat)
Abhigyan Raman (AI4Bharat)
Tahir Javed (IITM, AI4Bharat)
Mitesh Khapra (IITM, AI4Bharat, RBCDSAI)
Pratyush Kumar (Microsoft, AI4Bharat)

🤝 Contributing

We welcome contributions from the community to further improve IndicWhisper. If you have any ideas, bug fixes, or enhancements, please feel free to submit a pull request.

Thank you for your interest in IndicWhisper! We hope it proves to be a valuable tool for your speech recognition needs in Indian languages.
