Wav2vec2 Large Xlsr 53 Sw

Developed by alokmatta
Swahili automatic speech recognition model fine-tuned from the XLSR-53 large model; expects audio input sampled at 16 kHz
Downloads 158
Release Time: 3/2/2022

Model Overview

This automatic speech recognition (ASR) model is Facebook's wav2vec2-large-xlsr-53 model fine-tuned on Swahili datasets. It converts Swahili speech to text.

Model Features

Multi-dataset Fine-tuning
Fine-tuned on three Swahili datasets (ALFFA, Gamayun, and IWSLT) to improve recognition accuracy
16 kHz Sampling Rate Support
Expects audio input sampled at 16 kHz; audio recorded at other rates should be resampled first
No Language Model Required
Can be used directly without additional language model support
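
The two features above (16 kHz input, no language model) can be illustrated with a minimal sketch. It is not the model's own code: the helper names `resample_linear` and `ctc_greedy_decode` are hypothetical, the linear-interpolation resampler is a stand-in for a proper band-limited resampler, and the toy vocabulary, the blank id of 0, and the `|` word delimiter are assumptions modelled on the usual wav2vec2 CTC tokenizer conventions rather than values taken from this model.

```python
from typing import List, Sequence

TARGET_RATE = 16_000  # sampling rate the model card says the model expects


def resample_linear(samples: Sequence[float], src_rate: int,
                    dst_rate: int = TARGET_RATE) -> List[float]:
    """Resample mono audio by linear interpolation (illustrative only)."""
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sampling rates must be positive")
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # source samples advanced per output sample
    out = []
    for i in range(n_out):
        pos = i * step
        left = min(int(pos), len(samples) - 1)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


def ctc_greedy_decode(frame_ids: Sequence[int], vocab: Sequence[str],
                      blank_id: int = 0, word_delimiter: str = "|") -> str:
    """Greedy CTC decoding: collapse repeats, drop blanks, join characters.

    frame_ids holds the most likely token id for each audio frame.
    No language model is consulted at any point.
    """
    chars = []
    previous = None
    for token_id in frame_ids:
        if token_id != previous and token_id != blank_id:
            chars.append(vocab[token_id])
        previous = token_id
    text = "".join(chars).replace(word_delimiter, " ")
    return " ".join(text.split())  # normalise whitespace


# Toy vocabulary (assumption): id 0 is the CTC blank, "|" separates words.
TOY_VOCAB = ["<pad>", "|", "a", "b", "h", "i", "r"]

if __name__ == "__main__":
    one_second_48k = [0.0] * 48_000
    print(len(resample_linear(one_second_48k, 48_000)))
    print(ctc_greedy_decode([4, 4, 0, 2, 3, 3, 2, 0, 6, 5, 5, 1, 0], TOY_VOCAB))
```

In practice the per-frame ids would come from taking the highest-scoring token of the model's output for each frame. Decoding this way is what "no language model" amounts to: the transcript depends only on the acoustic model's per-frame choices.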

Model Capabilities

Swahili speech recognition
Speech-to-text
Automatic speech transcription

Use Cases

Speech Transcription
Swahili Speech Transcription
Convert Swahili speech content into text format
Reported word error rate (WER) on the test set: 40%
Voice Assistants
Swahili Voice Interaction
Provide speech recognition capability for Swahili voice assistants
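
To make the reported WER figure concrete, the following sketch computes word error rate as word-level edit distance divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions. They are not drawn from the model's test set, and the evaluation that produced the reported figure may have normalised text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one substituted word and one dropped word
    # out of five reference words.
    reference = "habari ya asubuhi rafiki yangu"
    hypothesis = "habari za asubuhi rafiki"
    print(f"{word_error_rate(reference, hypothesis):.0%}")
```

A WER of 40% therefore means roughly two word-level errors for every five reference words, so transcripts will usually need manual review before use.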