Wav2vec2 Large Lv60h 100h 2nd Try
A wav2vec2-large-lv60 speech recognition model fine-tuned on the LibriSpeech dataset, supporting English speech-to-text tasks
Downloads 20
Release Time: 3/2/2022
Model Overview
This model belongs to the wav2vec2 family originally released by Facebook Research. It was pre-trained with self-supervised learning on unlabeled audio and then fine-tuned on 100 hours of clean LibriSpeech data for English speech recognition.
Model Features
Efficient Fine-tuning
Achieves performance close to full-data fine-tuning with only 100 hours of labeled data
Dynamic Batch Padding
Pads each training batch only to the length of its longest sample rather than to a fixed maximum, which reduces wasted computation and improves GPU utilization
Mixed Precision Training
Supports fp16 mixed precision training to reduce memory usage and accelerate training
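The dynamic batch padding feature above can be illustrated with a minimal sketch. It is not the training code used for this model: the function names `pad_batch` and `padding_fraction` are hypothetical, and plain Python lists stand in for audio tensors. The idea is only that each batch is padded to its own longest sample, with a mask marking which positions are real.

```python
from typing import List, Tuple


def pad_batch(
    batch: List[List[float]], pad_value: float = 0.0
) -> Tuple[List[List[float]], List[List[int]]]:
    """Pad every sequence to the longest length found in this batch.

    Returns the padded sequences and an attention mask
    (1 = real sample, 0 = padding). Hypothetical helper for illustration.
    """
    max_len = max(len(seq) for seq in batch)
    padded, mask = [], []
    for seq in batch:
        n_pad = max_len - len(seq)
        padded.append(list(seq) + [pad_value] * n_pad)
        mask.append([1] * len(seq) + [0] * n_pad)
    return padded, mask


def padding_fraction(mask: List[List[int]]) -> float:
    """Fraction of positions in the batch that are padding."""
    total = sum(len(row) for row in mask)
    real = sum(sum(row) for row in mask)
    return 1 - real / total


if __name__ == "__main__":
    padded, mask = pad_batch([[0.1, 0.2, 0.3], [0.4]])
    print(padded)
    print(mask)
    print(padding_fraction(mask))
```

Grouping samples of similar length into the same batch lowers the padding fraction further, which is why length-aware batching is often paired with this kind of padding.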
Model Capabilities
English speech recognition
High-accuracy speech-to-text conversion
Long audio processing (supports batches up to 750 seconds)
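Fine-tuned wav2vec2 models of this kind emit per-frame character scores that are turned into text with CTC decoding. The following is a minimal sketch of greedy CTC decoding, not this model's actual pipeline: the toy vocabulary `VOCAB`, the blank index, the `|` word delimiter, and the helper `one_hot` are all assumptions chosen for illustration.

```python
from typing import List, Sequence

# Toy vocabulary (assumption): index 0 is the CTC blank, "|" separates words.
VOCAB = ["<pad>", "|", "a", "c", "t"]
BLANK_ID = 0


def greedy_ctc_decode(
    logits: Sequence[Sequence[float]],
    vocab: Sequence[str] = VOCAB,
    blank_id: int = BLANK_ID,
) -> str:
    """Greedy CTC decoding: argmax per frame, collapse repeats, drop blanks."""
    # 1. Pick the highest-scoring token id for each frame.
    ids = [max(range(len(frame)), key=frame.__getitem__) for frame in logits]
    # 2. Collapse consecutive repeats of the same id.
    collapsed = [i for k, i in enumerate(ids) if k == 0 or i != ids[k - 1]]
    # 3. Remove blanks, map ids to characters, turn "|" into spaces.
    chars = [vocab[i] for i in collapsed if i != blank_id]
    return "".join(chars).replace("|", " ").strip()


def one_hot(index: int, size: int = len(VOCAB)) -> List[float]:
    """Build a fake logit frame that strongly favors one token (test helper)."""
    return [1.0 if i == index else 0.0 for i in range(size)]


if __name__ == "__main__":
    frames = [one_hot(i) for i in [3, 3, 0, 2, 2, 4, 0, 1, 2]]
    print(greedy_ctc_decode(frames))
```

The blank token is what allows a doubled letter to survive: two identical characters separated by a blank frame are kept as two characters, while two adjacent identical frames collapse into one.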
Use Cases
Speech Transcription
Automatic Meeting Minutes Generation
Automatically converts English meeting recordings into text transcripts
Achieves a word error rate (WER) of 4.0 (clean) / 10.3 (other) on the LibriSpeech test sets
Podcast Content Indexing
Creates searchable text indexes for English podcast episodes
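The WER figures quoted under Automatic Meeting Minutes Generation are word error rates: the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below shows one common way to compute it. The function name `word_error_rate` is hypothetical, and text normalization (casing, punctuation) is left out, so it may not reproduce the exact scoring used for the reported numbers.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sat on mat"
    print(f"WER: {100 * word_error_rate(ref, hyp):.1f}%")
```

Because insertions count as errors, WER can exceed 1.0 when the output is much longer than the reference.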