
Wav2vec2 Large 100k Voxpopuli Ft Common Voice Plus TTS Dataset Plus Data Augmentation Portuguese

Developed by Edresson
This is a Portuguese speech recognition model based on Facebook's Wav2vec2 Large 100k Voxpopuli, fine-tuned using the Common Voice 7.0 and TTS Portuguese datasets with data augmentation techniques applied.
Downloads 22
Release Time: 3/2/2022

Model Overview

This model targets Portuguese speech recognition. Data augmentation and additional fine-tuning on a TTS dataset are used to improve recognition accuracy.

Model Features

Data augmentation fine-tuning
Uses TTS-generated data and voice conversion techniques for data augmentation to improve model performance
Multi-dataset training
Combines training with Common Voice 7.0 and specialized TTS Portuguese datasets
High-performance recognition
Achieves a 20.20% word error rate on the Common Voice 7.0 test set
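
Word error rate (WER) is the number of word-level substitutions, deletions, and insertions needed to turn the model's transcript into the reference transcript, divided by the number of reference words. A 20.20% WER therefore means roughly one word error for every five reference words. The sketch below is a minimal illustration of that definition. The function name `word_error_rate` and the example sentences are made up for illustration, and this is not the evaluation script used to produce the reported figure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences (not taken from Common Voice): one substitution in five words.
wer = word_error_rate("o gato dorme no sofá", "o gato dorme na sofá")
print(f"WER: {wer:.2%}")  # WER: 20.00%
```

Published WER figures usually depend on text normalization (casing, punctuation) applied before scoring, which this sketch leaves out.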

Model Capabilities

Portuguese speech recognition
Audio-to-text conversion
Automatic speech recognition

Use Cases

Speech transcription
Portuguese speech-to-text
Converts Portuguese speech content into text
Word error rate: 20.20% on the Common Voice 7.0 test set
Voice assistants
Portuguese voice command recognition
Used for voice command recognition in Portuguese voice assistant systems
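
Both use cases above come down to passing an audio file to the model and reading back text. The sketch below shows one way this might look with the Hugging Face `transformers` automatic-speech-recognition pipeline. It rests on assumptions: the `MODEL_ID` string is inferred from the model title and should be checked against the model hub, the helper name `transcribe` is hypothetical, and decoding audio files through the pipeline typically requires `ffmpeg` to be installed.

```python
from typing import Callable, Optional

# Assumption: repository name inferred from the model title on this page;
# verify it on the model hub before use.
MODEL_ID = (
    "Edresson/wav2vec2-large-100k-voxpopuli-ft-"
    "Common-Voice_plus_TTS-Dataset_plus_Data_Augmentation-portuguese"
)


def transcribe(audio_path: str, asr: Optional[Callable] = None) -> str:
    """Return the transcript of a Portuguese audio file.

    `asr` is any callable that maps an audio path to a dict with a "text"
    key. When omitted, a transformers pipeline is built for MODEL_ID, which
    downloads the model weights on first use.
    """
    if asr is None:
        # Imported lazily so the helper can be defined without transformers installed.
        from transformers import pipeline

        asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    result = asr(audio_path)
    return result["text"].strip()
```

A call such as `transcribe("comando.wav")`, where the file name is a placeholder, would return the recognized Portuguese text. For a voice assistant, that string would then go to whatever command-matching logic the application defines. When transcribing many files, build the pipeline once and pass it in as `asr`.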