
Wav2vec2 Xls R 300m Bengali

Developed by arijitx
A Bengali automatic speech recognition model, fine-tuned from facebook/wav2vec2-xls-r-300m on the OpenSLR_SLR53 dataset.
Downloads 533
Release Time: 3/2/2022

Model Overview

This is an automatic speech recognition (ASR) model for Bengali, fine-tuned from Facebook's wav2vec2-xls-r-300m checkpoint for Bengali speech-to-text tasks.

Model Features

High-accuracy Bengali recognition
Achieves a Word Error Rate (WER) of 0.153 and Character Error Rate (CER) of 0.034 on the OpenSLR_SLR53 test set
Language model integration support
Can be combined with a 5-gram language model to further improve recognition accuracy
Bengali dataset training
Fine-tuned on the OpenSLR_SLR53 Bengali speech dataset
Masking-based data augmentation
Applies time masking (probability 0.75) and feature masking (probability 0.25) during training
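
The masking augmentation mentioned above can be illustrated with a small, simplified sketch. It zeroes random spans of frames (time masking) and random spans of feature dimensions (feature masking) using the two probabilities quoted in this card. The span length of 10, the function names, and the use of zeros for masked positions are assumptions made for illustration; the actual training code may differ, for example in how masked frames are filled.

```python
import random


def span_mask(length, mask_prob, span_len, rng):
    """Return a list of booleans of size `length`; True marks a masked index.

    Roughly mask_prob * length / span_len spans are placed at random starts.
    Spans may overlap, so the masked fraction can end up below mask_prob.
    """
    mask = [False] * length
    if length < span_len or mask_prob <= 0:
        return mask
    num_spans = int(mask_prob * length / span_len + rng.random())
    for _ in range(num_spans):
        start = rng.randrange(0, length - span_len + 1)
        for i in range(start, start + span_len):
            mask[i] = True
    return mask


def augment(features, time_prob=0.75, feature_prob=0.25, span_len=10, seed=0):
    """Zero out random time spans (rows) and feature spans (columns).

    span_len and the zero fill are illustrative assumptions, not values
    taken from this model card.
    """
    rng = random.Random(seed)
    num_frames = len(features)
    num_dims = len(features[0])
    time_mask = span_mask(num_frames, time_prob, span_len, rng)
    feat_mask = span_mask(num_dims, feature_prob, span_len, rng)
    return [
        [0.0 if time_mask[t] or feat_mask[d] else value
         for d, value in enumerate(frame)]
        for t, frame in enumerate(features)
    ]


# Toy input: 200 frames of 40-dimensional features, all ones.
frames = [[1.0] * 40 for _ in range(200)]
masked = augment(frames)
```

Because spans can overlap, the share of frames actually masked is usually lower than the nominal 0.75.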

Model Capabilities

Bengali speech recognition
Speech-to-text
Language model integration support
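
As a rough usage sketch, the model could be loaded through the Hugging Face transformers speech recognition pipeline. The Hub identifier below is an assumption inferred from the developer and model names on this page, and the helper names load_asr and transcribe are hypothetical. Running it for real requires the transformers package, a backend such as PyTorch, and ffmpeg for decoding audio files.

```python
# Assumed Hub identifier; it is not stated explicitly on this page.
MODEL_ID = "arijitx/wav2vec2-xls-r-300m-bengali"


def load_asr(model_id=MODEL_ID):
    """Build a speech recognition pipeline (downloads weights on first use)."""
    # Imported lazily because transformers is a heavy optional dependency.
    from transformers import pipeline
    return pipeline("automatic-speech-recognition", model=model_id)


def transcribe(audio_path, asr=None):
    """Return the transcript for one audio file.

    `asr` is any callable that maps a file path to a dict with a "text" key;
    when omitted, the pipeline for MODEL_ID is loaded.
    """
    if asr is None:
        asr = load_asr()
    result = asr(audio_path)
    return result["text"].strip()


if __name__ == "__main__":
    import sys
    # Usage: python transcribe.py path/to/recording.wav
    if len(sys.argv) > 1:
        print(transcribe(sys.argv[1]))
```

Passing the pipeline in as an argument lets one loaded model be reused across many files instead of being reloaded for each call.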

Use Cases

Speech transcription
Bengali meeting minutes
Automatically transcribe Bengali meeting recordings into text records
Approximate word-level accuracy of 84.7% (1 - WER, based on the WER of 0.153 on the OpenSLR_SLR53 test set)
Voice assistant
Provide speech recognition capabilities for Bengali voice assistants
Education
Language learning applications
Help learners practice Bengali pronunciation and listening
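
The WER and CER figures quoted on this page are edit-distance ratios: the number of word-level (or character-level) insertions, deletions, and substitutions needed to turn the model output into the reference transcript, divided by the reference length. The sketch below shows the standard calculation in plain Python; it is not the evaluation script used for this model, and the example sentences are made up. Characters are counted as Unicode code points here, which is one of several possible conventions for Bengali text.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


# Made-up example: one of three words is wrong.
reference = "আমি ভাত খাই"
hypothesis = "আমি ভাত খাব"
print(f"WER = {wer(reference, hypothesis):.3f}")
print(f"CER = {cer(reference, hypothesis):.3f}")
```

Since WER can exceed 1 when the output contains many insertions, reading 1 - WER as an accuracy percentage is only an approximation.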