Wav2vec2 Large Xlsr 53 Spanish

Developed by LuisG07
A Spanish automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice Spanish dataset. It expects speech input sampled at 16 kHz.
Downloads 50
Release Time: 3/2/2022

Model Overview

This is a Wav2Vec2 model for Spanish automatic speech recognition (ASR), fine-tuned from the XLSR-53 pretrained checkpoint. It converts Spanish speech to text.

Model Features

High Accuracy Recognition
Achieves 8.82% Word Error Rate (WER) and 2.58% Character Error Rate (CER) on the Common Voice Spanish test set.
Language Model Enhancement
When combined with a language model, the WER can be further reduced to 6.27% and CER to 2.06%.
16kHz Sampling Rate Support
Expects speech input sampled at 16 kHz; audio recorded at other rates should be resampled before transcription.
Open Source License
Licensed under Apache-2.0, allowing both commercial and research use.
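
To make the WER and CER figures above concrete, here is a minimal sketch of how the two metrics are commonly computed: the edit distance between reference and hypothesis, divided by the reference length. The function names are illustrative, and the text normalization and scoring setup behind the reported numbers are not stated in this card, so this is an assumption about the general definition rather than the exact evaluation used. Counting spaces as characters in CER is also an assumption; conventions vary.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: character-level edit distance over reference length.

    Assumption: spaces are counted as characters.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    # One substitution (gato -> pato) and one deletion (pescado) over 4 words.
    print(wer("el gato come pescado", "el pato come"))
```

A WER of 8.82% therefore means roughly nine word-level edits per hundred reference words.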

Model Capabilities

Spanish speech recognition
Speech-to-text
Automatic speech transcription

Use Cases

Speech Transcription
Speech Content Transcription
Automatically convert Spanish speech content into text
Highly accurate transcription results
Voice Assistants
Spanish Voice Command Recognition
Used for building Spanish voice assistants or command control systems
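
For the voice command use case, the transcript produced by an ASR model still has to be mapped to an action. The following is a rough sketch of one simple approach: normalize the transcript, then look for known phrases. The command table, action names, and function names are all hypothetical, and the transcript is assumed to arrive as a plain string from whatever inference code is used.

```python
import string
import unicodedata

# Hypothetical command table: Spanish phrase -> action name.
COMMANDS = {
    "enciende la luz": "light_on",
    "apaga la luz": "light_off",
    "sube el volumen": "volume_up",
}


def normalize(text: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace.

    Note: accent stripping also maps "ñ" to "n", which is acceptable for
    loose matching but loses information.
    """
    decomposed = unicodedata.normalize("NFD", text.lower())
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    punctuation = string.punctuation + "¿¡"
    no_punct = "".join(ch for ch in no_accents if ch not in punctuation)
    return " ".join(no_punct.split())


def match_command(transcript: str):
    """Return the action for the first known phrase found in the transcript, else None."""
    cleaned = normalize(transcript)
    for phrase, action in COMMANDS.items():
        if normalize(phrase) in cleaned:
            return action
    return None
```

Substring matching like this is brittle; a production assistant would more likely use an intent classifier, but the normalization step is useful either way because ASR output can vary in casing and accents.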