Indic Seamless

Developed by ai4bharat
A speech-to-text translation model for Indian languages, fine-tuned from SeamlessM4T-v2. It supports 13 Indian languages and is reported to outperform both the base model and competing systems.
Downloads 917
Release Time: 3/4/2025

Model Overview

This model specializes in speech-to-text translation (STT) for Indian languages. It was fine-tuned on the BhasaAnuvaad dataset and is reported to achieve state-of-the-art results on the Fleurs dataset.

Model Features

Multilingual Support
Supports 13 Indian languages, covering major Indian language families.
High Performance
Sets new records on the Fleurs dataset and significantly outperforms other systems on the BhasaAnuvaad test set.
Strict Data Filtering
Before training, the dataset was filtered with a threshold of 0.8 on the alignment score and 0.6 on the mining score.
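
The filtering step can be illustrated with a minimal Python sketch. Only the two threshold values (0.8 and 0.6) come from the description above. The record layout, the field names `alignment_score` and `mining_score`, the inclusive comparison, and the requirement that a sample pass both thresholds are assumptions made for illustration, not the authors' actual pipeline.

```python
from dataclasses import dataclass
from typing import Iterable, List

ALIGNMENT_THRESHOLD = 0.8  # value stated in the model description
MINING_THRESHOLD = 0.6     # value stated in the model description


@dataclass
class Pair:
    """Hypothetical record for one speech/text pair (field names are assumed)."""
    audio_id: str
    alignment_score: float
    mining_score: float


def filter_pairs(
    pairs: Iterable[Pair],
    min_alignment: float = ALIGNMENT_THRESHOLD,
    min_mining: float = MINING_THRESHOLD,
) -> List[Pair]:
    """Keep pairs that meet both thresholds (inclusive; an assumption)."""
    return [
        p for p in pairs
        if p.alignment_score >= min_alignment and p.mining_score >= min_mining
    ]


# Made-up scores, purely to exercise the filter.
samples = [
    Pair("utt-1", alignment_score=0.91, mining_score=0.72),  # passes both
    Pair("utt-2", alignment_score=0.85, mining_score=0.41),  # fails mining
    Pair("utt-3", alignment_score=0.62, mining_score=0.90),  # fails alignment
    Pair("utt-4", alignment_score=0.80, mining_score=0.60),  # on the boundary
]

kept = filter_pairs(samples)
print([p.audio_id for p in kept])  # ['utt-1', 'utt-4']
```

Whether boundary values such as `utt-4` should be kept depends on how the original filter was defined; switch `>=` to `>` for a strict cutoff.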

Model Capabilities

Speech-to-text translation
Multilingual speech recognition
Batch audio processing

Use Cases

Speech Transcription
Single Audio Transcription: Transcribe a single audio file into text in a specified Indian language. Higher accuracy than the base model and competing systems.

Batch Processing
Dataset Batch Transcription: Batch transcription processing for speech datasets like Fleurs. Supports batch processing with high efficiency.
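
Both use cases above can be sketched with the Hugging Face `transformers` classes for SeamlessM4T-v2. This is a rough sketch under stated assumptions, not the official usage: the repository name `ai4bharat/indic-seamless`, the target language code `hin`, and the helper names `batched` and `transcribe` are assumptions made for illustration. The heavy imports are deferred into the function so that the batching helper can be used without them.

```python
from typing import Iterator, List, Sequence

MODEL_ID = "ai4bharat/indic-seamless"  # assumed Hub repository name
TARGET_RATE = 16_000  # default sampling rate of the SeamlessM4T feature extractor


def batched(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def transcribe(
    paths: Sequence[str],
    tgt_lang: str = "hin",  # assumed language code (Hindi)
    batch_size: int = 4,
    device: str = "cpu",
) -> List[str]:
    """Return one output text per audio file, processed in batches."""
    # Imported lazily: these packages are only needed when a model is run.
    import torch
    import torchaudio
    from transformers import (
        SeamlessM4TFeatureExtractor,
        SeamlessM4TTokenizer,
        SeamlessM4Tv2ForSpeechToText,
    )

    model = SeamlessM4Tv2ForSpeechToText.from_pretrained(MODEL_ID).to(device)
    extractor = SeamlessM4TFeatureExtractor.from_pretrained(MODEL_ID)
    tokenizer = SeamlessM4TTokenizer.from_pretrained(MODEL_ID)

    texts: List[str] = []
    for chunk in batched(paths, batch_size):
        waves = []
        for path in chunk:
            wave, rate = torchaudio.load(path)  # shape: (channels, samples)
            wave = wave.mean(dim=0)             # downmix to mono
            wave = torchaudio.functional.resample(
                wave, orig_freq=rate, new_freq=TARGET_RATE
            )
            waves.append(wave.numpy())
        # The extractor pads the batch and returns an attention mask.
        inputs = extractor(
            waves, sampling_rate=TARGET_RATE, return_tensors="pt"
        ).to(device)
        with torch.no_grad():
            tokens = model.generate(**inputs, tgt_lang=tgt_lang)
        texts.extend(tokenizer.batch_decode(tokens, skip_special_tokens=True))
    return texts
```

A single file is the one-element case, for example `transcribe(["clip.wav"], tgt_lang="hin")`, where `clip.wav` is a placeholder path. For a dataset such as Fleurs, pass the full list of file paths and tune `batch_size` to the available memory.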