Wav2vec2 Large Xlsr 53 Portuguese

Developed by jonatasgrosman
A fine-tuned XLSR-53 large model for Portuguese speech recognition (speech-to-text), trained on the Common Voice 6.1 dataset.
Downloads 4.9M
Release Time: 3/2/2022

Model Overview

This is a Portuguese automatic speech recognition (ASR) model fine-tuned from the facebook/wav2vec2-large-xlsr-53 checkpoint. It converts Portuguese speech into text.
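
A minimal sketch of how such a model is typically loaded and run with the Hugging Face transformers library. The Hub identifier below is inferred from the model name and developer shown on this page and should be treated as an assumption, as should the use of librosa for audio loading and the placeholder file paths. Decoding here is plain greedy argmax without a language model.

```python
# Assumed Hub ID, inferred from the model title and developer name on this page.
MODEL_ID = "jonatasgrosman/wav2vec2-large-xlsr-53-portuguese"
SAMPLE_RATE = 16_000  # the model expects 16 kHz input


def transcribe(paths):
    """Transcribe a list of audio files (hypothetical paths) to Portuguese text."""
    # Heavy dependencies are imported lazily so the module loads without them.
    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
    model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)
    model.eval()

    # librosa resamples to the requested rate while loading.
    speech = [librosa.load(path, sr=SAMPLE_RATE)[0] for path in paths]
    inputs = processor(
        speech, sampling_rate=SAMPLE_RATE, return_tensors="pt", padding=True
    )

    with torch.no_grad():
        logits = model(
            inputs.input_values, attention_mask=inputs.attention_mask
        ).logits

    # Greedy decoding: pick the most likely token at every frame.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)
```

Calling `transcribe(["memo.wav"])` (a placeholder file name) would download the weights on first use and return a list with one transcript per input file.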

Model Features

High-precision Portuguese recognition
Achieves a word error rate (WER) of 11.31% and a character error rate (CER) of 3.74% on the Common Voice Portuguese test set.
Language model enhancement support
When combined with a language model, the word error rate can be further reduced to 9.01% and the character error rate to 3.21%.
16 kHz sampling rate
Expects speech input sampled at 16 kHz; audio recorded at other rates should be resampled first.
GPU-accelerated training
Fine-tuned on GPU computing resources provided by OVHcloud.
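
The WER and CER figures above are edit-distance ratios: the number of word (or character) substitutions, insertions, and deletions needed to turn the model output into the reference, divided by the reference length. The following is a minimal pure-Python sketch of that calculation, not the evaluation script used for this model. The function names and the example sentences are illustrative, and counting spaces as characters in CER is an assumption.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate (spaces counted as characters, by assumption)."""
    return edit_distance(reference, hypothesis) / len(reference)


def relative_reduction(before, after):
    """Fraction by which an error rate drops, e.g. after adding a language model."""
    return (before - after) / before


# Illustrative sentences only: one wrong word out of four.
print(wer("o gato bebe leite", "o gato bebe leito"))
# Relative WER reduction from the figures quoted above (11.31% -> 9.01%).
print(round(relative_reduction(11.31, 9.01), 3))
```

By this measure, the language model removes roughly a fifth of the word errors reported for the base model.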

Model Capabilities

Portuguese speech recognition
Real-time speech-to-text
Batch audio processing
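
For batch processing of long recordings, one common approach is to cut the 16 kHz waveform into fixed-length chunks and group the chunks into batches before passing them to the model. The sketch below shows only that bookkeeping. The 30-second chunk length, the batch size, and the function names are assumptions chosen for illustration, and cutting at fixed boundaries can split a word in two.

```python
SAMPLE_RATE = 16_000  # samples per second expected by the model


def split_into_chunks(samples, chunk_seconds=30.0, sample_rate=SAMPLE_RATE):
    """Cut a waveform (a sequence of samples) into consecutive fixed-length chunks."""
    chunk_len = int(chunk_seconds * sample_rate)
    if chunk_len <= 0:
        raise ValueError("chunk_seconds must be positive")
    return [samples[i:i + chunk_len] for i in range(0, len(samples), chunk_len)]


def make_batches(items, batch_size=8):
    """Group items into lists of at most batch_size elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


# 70 seconds of silence stands in for a real recording.
waveform = [0.0] * (70 * SAMPLE_RATE)
chunks = split_into_chunks(waveform)
batches = make_batches(chunks, batch_size=2)
print(len(chunks), [len(b) for b in batches])
```

Each batch can then be padded and transcribed in a single forward pass, and the per-chunk transcripts joined in order.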

Use Cases

Speech transcription
Meeting transcription
Automatically converts Portuguese meeting recordings into text transcripts
Word accuracy approximately 91% (WER 9.01% with a language model)
Voice memo conversion
Converts personal voice memos into searchable text
Base word accuracy 88.69% (WER 11.31% without a language model)
Assistive technology
Voice input system
Provides voice input solutions for Portuguese-speaking users