
Wav2vec2 Xls R 300m Ca

Developed by PereLluis13
A Catalan automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m.
Downloads 116
Release Time: 3/2/2022

Model Overview

This is an automatic speech recognition (ASR) model for Catalan, fine-tuned on multiple Catalan datasets to convert speech into text.
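As a rough illustration of converting speech into text with this model, the sketch below uses the Hugging Face transformers `pipeline` API. The Hub ID `PereLluis13/wav2vec2-xls-r-300m-ca` is an assumption inferred from the model title and developer name, and `sample_ca.wav` is a hypothetical file; neither is stated on this page.

```python
# Assumed Hub ID, inferred from the title and developer name on this page.
MODEL_ID = "PereLluis13/wav2vec2-xls-r-300m-ca"


def transcribe(audio_path: str) -> str:
    """Return the transcription of a single audio file."""
    # Imported lazily so this module loads even if transformers is missing.
    from transformers import pipeline

    # Builds an ASR pipeline; the model weights are downloaded on first use.
    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    return asr(audio_path)["text"]


# Example call (hypothetical file name):
# print(transcribe("sample_ca.wav"))
```

Decoding an audio file from a path generally requires ffmpeg to be installed; wav2vec2 models expect 16 kHz audio, which the pipeline resamples to when it loads the file itself.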

Model Features

Multi-dataset Training
Fine-tuned on multiple Catalan datasets including mozilla-foundation/common_voice_8_0, tv3_parla, and parlament_parla.
Digit Conversion Support
Uses special processing to convert digits into textual form, improving digit recognition accuracy.
Optimized Training Process
Uses a dedicated preprocessing workflow and tuned training hyperparameters, including a linear learning rate schedule and automatic mixed-precision (AMP) training.
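For readers unfamiliar with linear learning rate scheduling, here is a minimal sketch of one common form: linear warmup followed by linear decay to zero. The function name, the warmup phase, and all numeric values are illustrative assumptions; this page does not give the actual hyperparameters used for training.

```python
def linear_lr(step: int, total_steps: int, peak_lr: float,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step` for linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to peak_lr over the warmup phase.
        return peak_lr * step / max(1, warmup_steps)
    # Decay linearly from peak_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical settings, chosen only for illustration.
schedule = [linear_lr(s, total_steps=100, peak_lr=1e-3, warmup_steps=10)
            for s in range(101)]
```

With these made-up settings the rate rises during the first 10 steps, peaks at step 10, and falls to zero at step 100.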

Model Capabilities

Catalan speech recognition
Speech-to-text
Digit recognition
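The digit handling listed under Model Features can be pictured as a text normalization step that spells numbers out in Catalan before training or scoring. The sketch below is a toy illustration only: it covers 0 to 99, the function names are hypothetical, and the page does not describe the actual processing used by the model.

```python
import re

# Catalan cardinals 0-19 (central standard forms).
UNITS = ["zero", "un", "dos", "tres", "quatre", "cinc", "sis", "set",
         "vuit", "nou", "deu", "onze", "dotze", "tretze", "catorze",
         "quinze", "setze", "disset", "divuit", "dinou"]
TENS = {2: "vint", 3: "trenta", 4: "quaranta", 5: "cinquanta",
        6: "seixanta", 7: "setanta", 8: "vuitanta", 9: "noranta"}


def number_to_catalan(n: int) -> str:
    """Spell out an integer from 0 to 99 in Catalan."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only covers 0-99")
    if n < 20:
        return UNITS[n]
    tens, unit = divmod(n, 10)
    if unit == 0:
        return TENS[tens]
    # The twenties join with "-i-" (vint-i-un); other tens use "-".
    joiner = "-i-" if tens == 2 else "-"
    return TENS[tens] + joiner + UNITS[unit]


def spell_out_digits(text: str) -> str:
    """Replace digit runs up to 99 with words; leave larger numbers as-is."""
    def replace(match: re.Match) -> str:
        value = int(match.group())
        return number_to_catalan(value) if value <= 99 else match.group()
    return re.sub(r"\d+", replace, text)
```

For example, `spell_out_digits("tinc 3 gats")` gives `"tinc tres gats"`. A real pipeline would also need to handle larger numbers, ordinals, decimals, and gender agreement.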

Use Cases

Media Transcription
TV Program Subtitle Generation
Automatically generates subtitles for Catalan TV programs
Achieved a word error rate (WER) of 23.32% on the tv3_parla dataset
Meeting Minutes
Parliament Meeting Transcription
Automatically transcribes Catalan parliament meeting content
Achieved a WER of 8.05% on the parlament_parla dataset
Voice Assistants
Catalan Voice Input
Provides speech recognition capabilities for Catalan voice assistants
Achieved a WER of 13.17% on the Common Voice dataset
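The WER figures above follow the standard definition: the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of reference words. The sketch below computes it with a word-level edit distance. The function name is hypothetical, and any text normalization applied before scoring the reported results is not described on this page.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # match or substitution
        prev = curr
    return prev[-1] / len(ref)
```

A WER of 8.05% therefore means roughly 8 word errors per 100 reference words. Because insertions count as errors, WER can exceed 100% on very poor transcripts.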