Asr Wav2vec2 Commonvoice Rw

Developed by speechbrain
An end-to-end automatic speech recognition model for Kinyarwanda (Rwandan). It is built on the pre-trained wav2vec 2.0 model, combines CTC with an attention mechanism, and is fine-tuned on the CommonVoice dataset.
Downloads: 28
Release Time: 3/2/2022

Model Overview

This model provides automatic speech recognition for Kinyarwanda (Rwandan). It consists of two modules, a tokenizer and an acoustic model, and accepts audio input sampled at 16 kHz.
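
To make the 16 kHz input requirement concrete, here is a minimal pure-Python sketch of bringing multi-channel audio at an arbitrary rate into that format. It is illustrative only and is not SpeechBrain's own implementation: the function names are hypothetical, and the linear interpolation used here omits the anti-aliasing filter that a production resampler would apply.

```python
from typing import List, Sequence

TARGET_RATE = 16_000  # sampling rate the model expects, per the overview


def to_mono(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average the channels of each frame (one frame = one sample per channel)."""
    return [sum(frame) / len(frame) for frame in frames]


def resample_linear(samples: Sequence[float], src_rate: int,
                    dst_rate: int = TARGET_RATE) -> List[float]:
    """Resample by linear interpolation (illustrative; no anti-aliasing)."""
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # distance between output samples, in input samples
    out = []
    for i in range(n_out):
        pos = i * step
        left = min(int(pos), len(samples) - 1)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


def normalize_audio(frames: Sequence[Sequence[float]], src_rate: int) -> List[float]:
    """Mix down to mono, then resample to the 16 kHz target rate."""
    return resample_linear(to_mono(frames), src_rate)
```

For example, one second of 48 kHz stereo audio passed to `normalize_audio` comes back as 16,000 mono samples.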

Model Features

End-to-end speech recognition: provides a complete processing pipeline from audio input to text output.
Pre-trained model fine-tuning: based on the wav2vec2-large-xlsr-53 pre-trained model, fine-tuned on Kinyarwanda data.
Dual decoding mechanism: uses both CTC and attention mechanisms for decoding to improve recognition accuracy.
Automatic audio processing: built-in audio normalization automatically handles resampling and channel conversion.
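
The CTC half of the dual decoding mechanism can be illustrated with the standard CTC collapse rule: merge consecutive repeated tokens, then drop the blank symbol. The sketch below applies that rule to a sequence of per-frame token ids, as in greedy decoding. The blank index and the ids are hypothetical, and the sketch does not show how the released model combines CTC scores with its attention decoder.

```python
from itertools import groupby
from typing import List, Sequence

BLANK = 0  # hypothetical index of the CTC blank symbol


def ctc_greedy_collapse(frame_ids: Sequence[int], blank: int = BLANK) -> List[int]:
    """Apply the CTC collapse rule to per-frame best token ids."""
    # Step 1: merge runs of identical consecutive ids into one id.
    merged = [token for token, _ in groupby(frame_ids)]
    # Step 2: remove blanks, which only serve to separate repeated tokens.
    return [token for token in merged if token != blank]
```

Note that a blank between two identical ids keeps them distinct: `[3, 3, 0, 3]` collapses to `[3, 3]`, whereas `[3, 3, 3]` collapses to `[3]`.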

Model Capabilities

Kinyarwanda speech recognition
Audio transcription
Batch speech processing
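
Batch processing can be sketched as follows. The example assumes the model is published on Hugging Face as `speechbrain/asr-wav2vec2-commonvoice-rw` (inferred from the model name and developer listed here) and that it loads through SpeechBrain's `EncoderDecoderASR` interface; neither is confirmed by this page. The `pad_batch` helper is hypothetical: it zero-pads waveforms to a common length and returns the relative lengths that SpeechBrain's `transcribe_batch` expects alongside the padded tensor.

```python
from typing import List, Sequence, Tuple

# Assumed Hugging Face identifier, inferred from the model name and developer.
MODEL_SOURCE = "speechbrain/asr-wav2vec2-commonvoice-rw"


def pad_batch(
    waveforms: Sequence[Sequence[float]],
) -> Tuple[List[List[float]], List[float]]:
    """Zero-pad waveforms to equal length; also return relative lengths."""
    longest = max(len(w) for w in waveforms)
    padded = [list(w) + [0.0] * (longest - len(w)) for w in waveforms]
    rel_lens = [len(w) / longest for w in waveforms]  # 1.0 = longest item
    return padded, rel_lens


def transcribe_waveforms(waveforms: Sequence[Sequence[float]]) -> List[str]:
    """Transcribe 16 kHz mono waveforms in one batch.

    Requires torch and speechbrain (>= 1.0 for this import path), and
    downloads the model on first use.
    """
    import torch
    from speechbrain.inference.ASR import EncoderDecoderASR

    model = EncoderDecoderASR.from_hparams(
        source=MODEL_SOURCE,
        savedir="pretrained_models/asr-wav2vec2-commonvoice-rw",
    )
    padded, rel_lens = pad_batch(waveforms)
    words, _tokens = model.transcribe_batch(
        torch.tensor(padded), torch.tensor(rel_lens)
    )
    return words
```

For a single recording on disk, `model.transcribe_file(path)` is the simpler entry point and takes care of loading the audio.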

Use Cases

Speech transcription
Speech to text: converts Kinyarwanda speech content into text (word error rate: 18.91%).
Voice assistant
Kinyarwanda voice interaction: provides recognition capability for Kinyarwanda voice assistants.
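
The word error rate quoted above is the usual ASR metric: the number of word substitutions, deletions, and insertions needed to turn the hypothesis into the reference, divided by the number of reference words. The sketch below computes it with a word-level edit distance. The function name is hypothetical, and this page does not state which evaluation split or text normalization produced the 18.91% figure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

A WER of 18.91% therefore corresponds to roughly 19 word-level errors per 100 reference words. Because insertions count as errors, the value can exceed 100% for very poor hypotheses.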