Wav2vec2 Large Xlsr 53 Italian

Developed by jonatasgrosman
An Italian automatic speech recognition model, fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice 6.1 dataset
Downloads 1,012
Release Time: 3/2/2022

Model Overview

This is an automatic speech recognition (ASR) model for Italian, fine-tuned from the XLSR-53 pretrained model. It expects speech input sampled at 16 kHz

Model Features

High-performance Italian recognition
Achieves a word error rate (WER) of 9.41% and a character error rate (CER) of 2.29% on the Common Voice Italian test set
Language model enhancement
When combined with a language model, the word error rate can be further reduced to 6.91% and the character error rate to 1.83%
Multi-scenario applicability
Performs well on both the Common Voice test set and the Robust Speech Event development data, which suggests it generalizes beyond a single benchmark
Easy integration
Can be used in two ways, through the HuggingSound library or through a custom inference script, which makes it quick to integrate into applications
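
The WER and CER figures quoted in the features above are edit-distance metrics: the number of word-level (or character-level) substitutions, insertions, and deletions needed to turn the model output into the reference transcript, divided by the reference length. The following is a minimal sketch of that calculation for a single sentence pair. The function names are illustrative, and the sketch does not reproduce the text normalization or corpus-level aggregation that the reported scores presumably rely on.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def char_error_rate(reference, hypothesis):
    """Character-level edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

For example, a reference of "il gatto dorme" against a hypothesis of "il gato dorme" has one wrong word out of three, so the WER is 1/3, while only one of the 14 reference characters is missing, so the CER is 1/14.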

Model Capabilities

Italian speech-to-text
16kHz audio processing
Batch speech recognition
Long audio chunk processing
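
The capabilities above can be exercised through the HuggingSound library mentioned earlier. The sketch below assumes the model is published on the Hugging Face Hub as jonatasgrosman/wav2vec2-large-xlsr-53-italian (inferred from the developer and model name, not stated on this page) and that the huggingsound package is installed. The helper name transcribe_files and its optional model argument are illustrative, not part of the library.

```python
# Assumed Hub identifier, inferred from the developer and model name.
MODEL_ID = "jonatasgrosman/wav2vec2-large-xlsr-53-italian"


def transcribe_files(audio_paths, model=None):
    """Transcribe a batch of audio files and return one string per file.

    If no model is passed in, the HuggingSound model is loaded lazily, so
    importing this module does not require the library or a download.
    """
    if model is None:
        from huggingsound import SpeechRecognitionModel
        model = SpeechRecognitionModel(MODEL_ID)
    # transcribe() takes a list of file paths and returns one dict per file;
    # the recognized text is stored under the "transcription" key.
    results = model.transcribe(audio_paths)
    return [result["transcription"] for result in results]


# Example call (placeholder paths, downloads the model on first use):
#   texts = transcribe_files(["/path/to/file.mp3", "/path/to/another_file.wav"])
```

Passing several paths in one call covers the batch case. Accepting a preloaded model lets an application load the weights once and reuse them across calls.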

Use Cases

Speech transcription
Italian speech content transcription
Convert Italian speech content into text format
Highly accurate transcription results, suitable for content archiving and analysis
Voice assistants
Italian voice command recognition
Used for command recognition in Italian voice assistant systems
Low-latency, high-accuracy command recognition
Accessibility applications
Speech-to-text assistance
Provides real-time speech-to-text services for hearing-impaired individuals
Highly accurate real-time conversion