Wavlm Libri Clean 100h Large
Automatic speech recognition model based on microsoft/wavlm-large, fine-tuned on the LIBRISPEECH_ASR - CLEAN dataset
Downloads 8,171
Release Time: 3/2/2022
Model Overview
This model is a fine-tuned version of microsoft/wavlm-large, trained on the LibriSpeech clean-100h dataset for English speech recognition. It reaches a word error rate (WER) of 0.0491 on the evaluation set.
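WER is conventionally defined as (substitutions + deletions + insertions) divided by the number of words in the reference transcript, so a WER of 0.0491 corresponds to roughly 4.91 word errors per 100 reference words. The sketch below computes that ratio for a single sentence pair using word-level edit distance. It is an illustration only: the function name word_error_rate and the sample sentences are hypothetical, and the evaluation script used for this model may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missing)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substitution ("sat" -> "sit") and one
# deletion ("the") against a six-word reference.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"{example_wer:.4f}")
```

A reported corpus-level WER is normally computed by summing errors and reference words over the whole evaluation set rather than averaging per-sentence values.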
Model Features
High-performance speech recognition
After fine-tuning on the LibriSpeech clean-100h dataset, the model reaches a word error rate (WER) of 0.0491 (4.91%) on the evaluation set
Based on WavLM-Large architecture
Uses Microsoft's pre-trained WavLM-Large model as the foundation for speech feature extraction
Multi-GPU training optimization
Trained on 8 GPUs with distributed training; gradient accumulation is among the techniques used to improve training efficiency
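With gradient accumulation, gradients from several micro-batches are summed before each optimizer update, so the effective batch size is the per-device batch size times the number of devices times the number of accumulation steps. The sketch below works through that arithmetic. Only the GPU count (8) comes from this page; the per-device batch size and the number of accumulation steps are hypothetical values chosen for illustration, as are the function names.

```python
def effective_batch_size(per_device_batch: int, num_devices: int,
                         accumulation_steps: int) -> int:
    """Number of samples that contribute to one optimizer update."""
    return per_device_batch * num_devices * accumulation_steps


def optimizer_step_indices(num_micro_batches: int,
                           accumulation_steps: int) -> list[int]:
    """0-based micro-batch indices after which the optimizer would step."""
    return [
        i for i in range(num_micro_batches)
        if (i + 1) % accumulation_steps == 0
    ]


NUM_GPUS = 8             # stated in the model description
PER_DEVICE_BATCH = 4     # hypothetical, not given in the source
ACCUMULATION_STEPS = 2   # hypothetical, not given in the source

batch = effective_batch_size(PER_DEVICE_BATCH, NUM_GPUS, ACCUMULATION_STEPS)
print(batch)  # 4 * 8 * 2 = 64

# With 2 accumulation steps, updates happen after every second micro-batch.
print(optimizer_step_indices(6, ACCUMULATION_STEPS))  # [1, 3, 5]
```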
Model Capabilities
English speech recognition
High-accuracy speech-to-text
Continuous speech recognition
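A minimal sketch of running speech-to-text with a checkpoint like this one, assuming the Hugging Face transformers library is installed and the checkpoint is published on the Hugging Face Hub. The Hub identifier is not stated on this page, so it is left as a required argument; the function name transcribe and the file name in the usage comment are hypothetical. WavLM models expect 16 kHz mono audio, and decoding from a file path relies on ffmpeg being available.

```python
def transcribe(audio_path: str, model_id: str) -> str:
    """Transcribe one audio file with a fine-tuned ASR checkpoint.

    model_id is the Hub identifier of the checkpoint; it is an input
    here because the source page does not state it.
    """
    # Imported lazily so that defining the function does not require
    # transformers to be installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)
    return result["text"]


# Hypothetical usage (downloads the model on first call):
# text = transcribe("sample.flac", model_id="<hub-id-of-this-checkpoint>")
# print(text)
```

Building the pipeline once and reusing it across files avoids reloading the model for every call; the function above rebuilds it each time only to keep the sketch short.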
Use Cases
Speech transcription
Audiobook transcription
Automatically transcribes English audiobook content into text
Word error rate of 4.91% on the LibriSpeech evaluation set
Voice assistants
Voice command recognition
Used for English voice command recognition in smart devices