Wav2vec2 Large XLSR 53 Catalan

Developed by PereLluis13
A Catalan automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice Catalan dataset. It reaches a word error rate (WER) of 8.11% on the Common Voice Catalan test set.
Downloads 11.57k
Release Time: 3/2/2022

Model Overview

This model performs Catalan automatic speech recognition. It is fine-tuned from the XLSR-53 checkpoint and expects audio input sampled at 16 kHz.

Model Features

High-performance recognition: Achieves an 8.11% word error rate on the Common Voice Catalan test set.
No language model required: Can be used directly, without integrating an additional language model.
Optimized training process: Training was tuned by adjusting the batch size and gradient steps, with pitch processing applied to some samples.
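To make "no language model required" concrete: a CTC model such as this one can be decoded greedily by taking the most likely token per audio frame, collapsing consecutive repeats, and dropping the blank token. The sketch below is a minimal pure-Python illustration of that idea, not the model's actual decoding code. The toy vocabulary, the frame sequence, and the function name greedy_ctc_decode are hypothetical.

```python
from itertools import groupby


def greedy_ctc_decode(frame_ids, id_to_char, blank_id=0):
    """Greedy CTC decoding: collapse repeated frame labels, then drop blanks."""
    # Step 1: merge runs of identical consecutive labels into one label.
    collapsed = [token for token, _ in groupby(frame_ids)]
    # Step 2: remove the blank label, which only separates repeated characters.
    return "".join(id_to_char[token] for token in collapsed if token != blank_id)


# Hypothetical toy vocabulary; the real model ships its own character vocabulary.
toy_vocab = {0: "<blank>", 1: "h", 2: "o", 3: "l", 4: "a"}

# Hypothetical per-frame argmax labels for a short utterance.
frames = [1, 1, 0, 2, 2, 3, 0, 4, 4, 0]
print(greedy_ctc_decode(frames, toy_vocab))  # prints "hola"
```

A blank between two identical labels keeps them as separate characters, which is how doubled letters survive the collapse step.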

Model Capabilities

Catalan speech recognition
16kHz audio processing
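A minimal inference sketch for the capabilities listed above, assuming the model is published on the Hugging Face Hub as PereLluis13/wav2vec2-large-xlsr-53-catalan (an identifier inferred from the developer and model names on this page, not confirmed by it) and that torch, torchaudio, and transformers are installed. The function name transcribe and the file name in the usage comment are hypothetical.

```python
# Assumption: Hub identifier inferred from the developer and model name.
MODEL_ID = "PereLluis13/wav2vec2-large-xlsr-53-catalan"
TARGET_SAMPLE_RATE = 16_000  # the model expects 16 kHz audio


def transcribe(path: str) -> str:
    """Transcribe one audio file with greedy decoding (no language model)."""
    # Heavy dependencies are imported lazily so the module loads without them.
    import torch
    import torchaudio
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
    model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)
    model.eval()

    # Load the audio and resample to 16 kHz if the file uses another rate.
    waveform, sample_rate = torchaudio.load(path)
    if sample_rate != TARGET_SAMPLE_RATE:
        waveform = torchaudio.functional.resample(
            waveform, sample_rate, TARGET_SAMPLE_RATE
        )
    mono = waveform.mean(dim=0)  # average channels down to mono

    inputs = processor(
        mono.numpy(), sampling_rate=TARGET_SAMPLE_RATE, return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(**inputs).logits

    # Greedy decoding: most likely token per frame, merged by the tokenizer.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]


# Usage (hypothetical file name):
# print(transcribe("sample_ca.wav"))
```

Loading the processor and model inside the function keeps the sketch short; for repeated calls it would be cheaper to load them once and reuse them.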

Use Cases

Speech-to-text
Catalan transcription: Converts Catalan speech to text, with an 8.11% word error rate.
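For readers unfamiliar with the metric, word error rate is the word-level edit distance between the reference transcript and the model output (substitutions, deletions, and insertions), divided by the number of reference words. The sketch below is a plain illustration of that definition, not the evaluation script behind the 8.11% figure. The example sentences and the function name word_error_rate are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous[j] = edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1      # reference word missing from output
            insertion = current[j - 1] + 1  # extra word in output
            current.append(min(substitution, deletion, insertion))
        previous = current

    return previous[-1] / len(ref_words)


# Hypothetical example: one of four reference words is dropped.
print(word_error_rate("bon dia a tothom", "bon dia tothom"))  # prints 0.25
```

A test-set figure is typically computed over the whole set (total errors divided by total reference words) rather than by averaging per-sentence scores; whether the reported 8.11% was computed that way is not stated here.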