Wav2vec2 Bn 300m

Developed by Tahsin-Mayeesha
A fine-tuned Bengali automatic speech recognition model based on facebook/wav2vec2-xls-r-300m, trained using the OPENSLR_SLR53 dataset
Downloads 25
Release Time: 3/2/2022

Model Overview

This is an automatic speech recognition (ASR) model for Bengali, fine-tuned from the facebook/wav2vec2-xls-r-300m checkpoint on the OPENSLR_SLR53 dataset. It reports a word error rate of 17.78% on the OpenSLR test set.

Model Features

High Accuracy Bengali Recognition
Achieves a word error rate (WER) of 17.78% with the language model and a character error rate (CER) of 4.39% on the OpenSLR test set
Supports Language Model Integration
Can be combined with a 5-gram language model to further improve recognition accuracy
Large-scale Training Data
Trained using 218,703 samples from the OPENSLR_SLR53 dataset
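
The WER and CER figures quoted for this model are edit-distance ratios. As a minimal sketch of how such metrics are commonly computed (not necessarily the evaluation script behind the reported numbers), the Python below uses a Levenshtein distance over words for WER and over characters for CER. The function names are illustrative. Counting Unicode code points, spaces included, for CER is an assumption; evaluation tools may normalize or segment Bengali text differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate over Unicode code points (assumption: spaces count)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, wer("আমি ভাত খাই", "আমি ভাত খাব") compares three reference words with one substitution, so a WER of 17.78% corresponds to roughly 18 word-level errors per 100 reference words.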

Model Capabilities

Bengali Speech Recognition
Speech-to-Text
Supports Language Model Enhancement

Use Cases

Speech Transcription
Bengali Speech Transcription
Convert Bengali speech content into text
Achieves a WER of 0.17776 (17.78%) with the language model on the test set
Voice Assistants
Bengali Voice Interaction
Provides speech recognition capabilities for Bengali voice assistants
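
For either use case, transcription could look roughly like the sketch below, which uses the Hugging Face Transformers ASR pipeline. The model identifier "Tahsin-Mayeesha/wav2vec2-bn-300m" is an assumption inferred from the developer and model name on this page, and the helper name transcribe is hypothetical. Whether the 5-gram language model is applied during decoding depends on how the checkpoint is packaged, which this page does not state.

```python
def transcribe(audio_path, model_id="Tahsin-Mayeesha/wav2vec2-bn-300m"):
    """Transcribe one Bengali audio file to text.

    model_id is an assumed Hugging Face Hub identifier, not confirmed here.
    """
    # Imported lazily so the helper can be defined without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # The pipeline returns a dict; the transcript is under the "text" key.
    return asr(audio_path)["text"]


# Usage (downloads the model on first call; "sample.wav" is a placeholder path):
# print(transcribe("sample.wav"))
```

Passing a file path relies on ffmpeg being available for audio decoding. For a voice assistant, the pipeline object would normally be created once and reused across requests rather than rebuilt on every call.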