
Wavlm Libri Clean 100h Large

Developed by patrickvonplaten
Automatic speech recognition model based on microsoft/wavlm-large and fine-tuned on the LIBRISPEECH_ASR - CLEAN dataset
Downloads 8,171
Release Time: 3/2/2022

Model Overview

This model is a version of WavLM-Large fine-tuned on the LibriSpeech clean-100h dataset for English speech recognition. It reaches a word error rate (WER) of 0.0491 (4.91%) on the evaluation set.
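To make the WER figure concrete, here is a minimal sketch of how word error rate is commonly computed: the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The function name and the example sentences are illustrative assumptions, not taken from the model's evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of an extra hypothesis word
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substitution ("sat" -> "sit") and one deletion ("the").
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Here `example_wer` is 2 errors over 6 reference words. By the same measure, a WER of 0.0491 corresponds to roughly 49 word errors per 1,000 reference words.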

Model Features

High-performance speech recognition
After fine-tuning on the LibriSpeech clean-100h dataset, the model reaches a word error rate (WER) of 0.0491 (4.91%) on the evaluation set
Based on WavLM-Large architecture
Builds on Microsoft's pre-trained WavLM-Large model, which provides the speech feature extraction
Multi-GPU training optimization
Trained across 8 GPUs in a distributed setup, using techniques such as gradient accumulation to improve training efficiency
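As an illustration of the gradient accumulation mentioned above, the following sketch uses a toy one-parameter model in plain Python. It shows that averaging gradients over equally sized micro-batches reproduces the gradient of the full batch, which is what lets a large effective batch be processed in smaller pieces. The model, the data, and the batch sizes are all hypothetical; only the GPU count of 8 comes from the description above, and this is not the training script used for this model.

```python
def mse_grad(w, batch):
    """Gradient of the mean squared error for the toy model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, micro_batch_size):
    """Average the gradients of equally sized micro-batches."""
    if len(data) % micro_batch_size != 0:
        raise ValueError("data must split evenly into micro-batches")
    steps = len(data) // micro_batch_size
    total = 0.0
    for k in range(steps):
        micro = data[k * micro_batch_size:(k + 1) * micro_batch_size]
        # Scale each micro-batch gradient so the running sum is an average.
        total += mse_grad(w, micro) / steps
    return total


def effective_batch_size(per_device_batch, num_gpus, accumulation_steps):
    """Samples that contribute to one optimizer step."""
    return per_device_batch * num_gpus * accumulation_steps


# Hypothetical data and settings, for illustration only.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
full = mse_grad(0.5, data)
accumulated = accumulated_grad(0.5, data, micro_batch_size=2)
batch = effective_batch_size(per_device_batch=4, num_gpus=8, accumulation_steps=2)
```

The values `full` and `accumulated` agree up to floating-point rounding. With the hypothetical settings shown, each optimizer step would see 4 × 8 × 2 = 64 samples.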

Model Capabilities

English speech recognition
High-accuracy speech-to-text
Continuous speech recognition
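The capabilities above could be exercised with the Hugging Face `transformers` library roughly as follows. This is a sketch under assumptions: the Hub identifier is inferred from the developer and model name listed on this page, the `transcribe` helper is hypothetical, and decoding an audio file from a path requires `ffmpeg` to be installed.

```python
# Assumed Hub identifier, inferred from the developer and model name above.
MODEL_ID = "patrickvonplaten/wavlm-libri-clean-100h-large"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one English audio file and return the recognized text."""
    # Imported here so the library and model are only loaded when needed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # For a file path, the pipeline decodes and resamples the audio itself.
    result = asr(audio_path)
    return result["text"]


# Example call (hypothetical file path):
# print(transcribe("sample.flac"))
```

The first call downloads the model weights, so it needs network access and some disk space.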

Use Cases

Speech transcription
Audiobook transcription
Automatically transcribes English audiobook content into text
Word error rate of 4.91% on the LibriSpeech evaluation set
Voice assistants
Voice command recognition
Used for English voice command recognition in smart devices