
Wav2Vec2 Pretrained CLSRIL-23 10k

Developed by Harveenchadha
A self-supervised audio pretraining model that learns cross-lingual speech representations from the raw audio of 23 Indian languages
Downloads 32
Release Time: 3/2/2022

Model Overview

CLSRIL-23 (Cross Lingual Speech Representations for Indic Languages) is a speech representation model based on the wav2vec 2.0 architecture. It is pretrained with a contrastive learning objective to learn speech feature representations shared across 23 Indian languages, making it particularly well suited to speech processing in India's multilingual environment.
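A minimal sketch of using the checkpoint as a feature extractor with the Hugging Face transformers library is shown below. The model ID Harveenchadha/wav2vec2-pretrained-clsril-23-10k, the 16 kHz input rate, and the presence of a preprocessor config in the repository are assumptions based on common wav2vec 2.0 conventions; verify them against the actual model card.

```python
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

# Assumed Hub ID, inferred from the page title; verify before use.
MODEL_ID = "Harveenchadha/wav2vec2-pretrained-clsril-23-10k"

feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(MODEL_ID)
model = Wav2Vec2Model.from_pretrained(MODEL_ID)
model.eval()

# Placeholder: one second of silent mono audio at 16 kHz.
# Replace with a real waveform loaded via torchaudio or librosa.
waveform = torch.zeros(16000)

inputs = feature_extractor(
    waveform.numpy(), sampling_rate=16000, return_tensors="pt"
)

with torch.no_grad():
    outputs = model(**inputs)

# Contextual speech representations: (batch, frames, hidden_size).
print(outputs.last_hidden_state.shape)
```

The last_hidden_state tensor can be pooled or passed to a downstream model, which is the speech feature extraction capability listed below.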

Model Features

Multilingual support
Supports speech representation learning for 23 Indian languages, covering the major Indo-Aryan and Dravidian language families
Self-supervised learning
Utilizes self-supervised learning methods to learn effective speech representations without requiring large amounts of labeled data
Shared quantized representation
Jointly learns shared latent quantized representations across all languages, facilitating cross-lingual transfer (see the sketch after this list)
Large-scale training data
Total training data exceeds 9000 hours, with Hindi having the largest volume (4563.7 hours)
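The contrastive objective behind the shared quantized representation feature can be illustrated with the generic Wav2Vec2ForPreTraining class from transformers: the convolutional encoder's latent frames are discretized by a quantizer shared across all 23 languages, and the transformer's contextual outputs are trained to match the quantized targets. The sketch below is a rough illustration under the same assumed model ID, not the authors' training code; if the uploaded checkpoint does not include quantizer weights, those modules would be freshly initialized.

```python
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2ForPreTraining

MODEL_ID = "Harveenchadha/wav2vec2-pretrained-clsril-23-10k"  # assumed Hub ID

feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(MODEL_ID)
model = Wav2Vec2ForPreTraining.from_pretrained(MODEL_ID)
model.eval()

waveform = torch.zeros(16000)  # placeholder: 1 s of 16 kHz audio
inputs = feature_extractor(
    waveform.numpy(), sampling_rate=16000, return_tensors="pt"
)

with torch.no_grad():
    outputs = model(inputs.input_values)

# Contextual outputs and their quantized targets, both projected into
# the shared space used by the contrastive loss during pretraining.
similarity = torch.cosine_similarity(
    outputs.projected_states, outputs.projected_quantized_states, dim=-1
)
print(similarity.shape)  # (batch, frames)
```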

Model Capabilities

Cross-lingual speech representation learning
Speech feature extraction
Multilingual speech processing

Use Cases

Speech recognition
Multilingual automatic speech recognition
Building speech recognition systems for India's multilingual environment (see the fine-tuning sketch below)
Speech technology development
Speech feature extraction
Serving as a pre-trained feature extractor for downstream speech tasks
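For the ASR use case above, a common pattern is to load the pretrained encoder into a CTC model and fine-tune it on labeled speech in the target language. The sketch below is a hypothetical outline: the model ID, vocabulary size, and padding token are assumptions, and a real system needs a language-specific character vocabulary, a matching tokenizer, and a proper training loop.

```python
import torch
from transformers import Wav2Vec2ForCTC

MODEL_ID = "Harveenchadha/wav2vec2-pretrained-clsril-23-10k"  # assumed Hub ID
VOCAB_SIZE = 64  # hypothetical size of the target language's character set

# Attach a randomly initialized CTC head to the pretrained encoder.
model = Wav2Vec2ForCTC.from_pretrained(
    MODEL_ID,
    vocab_size=VOCAB_SIZE,
    ctc_loss_reduction="mean",
    pad_token_id=0,  # must match the tokenizer's padding token
)

# Common practice: keep the convolutional feature encoder frozen.
model.freeze_feature_encoder()

# One dummy training step: raw 16 kHz audio in, character-label IDs out.
input_values = torch.randn(1, 16000)
labels = torch.randint(low=1, high=VOCAB_SIZE, size=(1, 12))

outputs = model(input_values, labels=labels)
outputs.loss.backward()  # a real fine-tuning run wraps this in a training loop
print(float(outputs.loss))
```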