Wav2vec2 Large Xlsr 53 Sw
Swahili automatic speech recognition (ASR) model fine-tuned from the XLSR-53 large model; expects audio input sampled at 16 kHz
Downloads 158
Release Time: 3/2/2022
Model Overview
This automatic speech recognition (ASR) model is Facebook's wav2vec2-large-xlsr-53 model fine-tuned on Swahili datasets. It converts Swahili speech to text.
Model Features
Multi-dataset Fine-tuning
Fine-tuned on three Swahili datasets (ALFFA, Gamayun, and IWSLT) to improve recognition accuracy
16 kHz Sampling Rate
Expects audio input sampled at 16 kHz; audio recorded at other rates should be resampled first
No Language Model Required
Can be used directly without additional language model support
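Running without a language model usually means the transcript is read straight off the acoustic model's per-frame outputs. For wav2vec2 models fine-tuned with CTC, that is typically greedy decoding: take the best-scoring token in each frame, collapse consecutive repeats, then drop the blank token. The sketch below illustrates that rule in plain Python. The vocabulary, the blank index 0, and the "|" word delimiter are assumptions made for illustration and are not taken from this model's tokenizer.

```python
from typing import List, Sequence


def greedy_ctc_decode(
    frame_scores: Sequence[Sequence[float]],
    vocab: List[str],
    blank_id: int = 0,          # assumed blank/pad index
    word_delimiter: str = "|",  # assumed word-boundary token
) -> str:
    """Greedy CTC decoding: argmax per frame, collapse repeats, drop blanks."""
    # Step 1: best token index for every frame.
    best_ids = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]

    # Steps 2 and 3: keep a token only when it differs from the previous
    # frame's token (collapse repeats) and is not the blank token.
    chars = []
    previous = None
    for token_id in best_ids:
        if token_id != previous and token_id != blank_id:
            chars.append(vocab[token_id])
        previous = token_id

    return "".join(chars).replace(word_delimiter, " ").strip()


def one_hot(index: int, size: int) -> List[float]:
    """Fake per-frame scores in which exactly one token wins."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Hypothetical toy vocabulary, not the real tokenizer vocabulary.
VOCAB = ["<pad>", "|", "h", "a", "b", "r", "i"]
# Frame-level winners: h h a <pad> b a a r i i
FRAME_IDS = [2, 2, 3, 0, 4, 3, 3, 5, 6, 6]
SCORES = [one_hot(i, len(VOCAB)) for i in FRAME_IDS]

print(greedy_ctc_decode(SCORES, VOCAB))  # prints "habari"
```

Because each frame is decided independently, nothing corrects acoustically plausible misspellings. That is the trade-off of skipping a language model.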
Model Capabilities
Swahili speech recognition
Speech-to-text
Automatic speech transcription
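Before any of these transcription tasks, the audio has to be at the 16 kHz rate the model expects. The sketch below shows the idea with a dependency-free linear-interpolation resampler. It is an illustration only: the function name is hypothetical, and linear interpolation applies no anti-aliasing filter, so a dedicated resampler from an audio library such as torchaudio or librosa is the better choice for real recordings.

```python
from typing import List

TARGET_RATE = 16_000  # sampling rate the model expects, in Hz


def resample_linear(
    samples: List[float],
    source_rate: int,
    target_rate: int = TARGET_RATE,
) -> List[float]:
    """Resample a mono signal by linear interpolation (illustrative only)."""
    if source_rate <= 0 or target_rate <= 0:
        raise ValueError("sampling rates must be positive")
    if source_rate == target_rate or len(samples) < 2:
        return list(samples)

    n_out = int(len(samples) * target_rate / source_rate)
    step = source_rate / target_rate  # input samples per output sample
    last = len(samples) - 1

    out = []
    for n in range(n_out):
        pos = n * step                # fractional position in the input
        left = min(int(pos), last)
        right = min(left + 1, last)   # clamp at the end of the signal
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


# Upsample a short 8 kHz ramp to 16 kHz: the sample count doubles.
ramp_8k = [0.0, 1.0, 2.0, 3.0]
ramp_16k = resample_linear(ramp_8k, source_rate=8_000)
print(len(ramp_16k), ramp_16k)
```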
Use Cases
Speech Transcription
Swahili Speech Transcription
Convert Swahili speech content into text format
Reported test word error rate (WER): 40%
Voice Assistants
Swahili Voice Interaction
Provide speech recognition capability for Swahili voice assistants
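To put the 40% test WER listed under Speech Transcription in context: word error rate is the number of word substitutions, deletions, and insertions needed to turn the model's transcript into the reference transcript, divided by the number of reference words. The sketch below computes it with a word-level edit distance. The two example sentences are made up for illustration and do not come from the model's test set.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current

    return previous[-1] / len(ref)


# Hypothetical example: one misspelled word and one dropped word
# out of five reference words.
reference = "habari ya asubuhi rafiki yangu"
hypothesis = "habari ya asubui rafiki"
wer = word_error_rate(reference, hypothesis)
print(f"{wer:.0%}")  # prints "40%"
```

At this error rate, roughly two edits are needed for every five reference words, so transcripts generally need review before downstream use.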