# LibriSpeech optimization
## Asr Wav2vec2 Librispeech
speechbrain · Apache-2.0 · 1,667 downloads · 9 likes
An end-to-end automatic speech recognition system trained on the LibriSpeech dataset. It combines a pre-trained wav2vec 2.0 model with CTC decoding and performs well on English speech recognition tasks.
Tags: Speech Recognition, English
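For reference, a minimal inference sketch using SpeechBrain's `EncoderASR` interface, which fits CTC models like this one. The repo id `speechbrain/asr-wav2vec2-librispeech` is inferred from the listing name above, and the audio path is a placeholder; check the model card for the exact source string.

```python
# Minimal sketch: loading a SpeechBrain wav2vec 2.0 + CTC model for inference.
# The source repo id is inferred from the listing above; on SpeechBrain < 1.0
# the same class is imported from speechbrain.pretrained instead.
from speechbrain.inference.ASR import EncoderASR

asr_model = EncoderASR.from_hparams(
    source="speechbrain/asr-wav2vec2-librispeech",
    savedir="pretrained_models/asr-wav2vec2-librispeech",
)

# Transcribe a local 16kHz mono WAV file (placeholder path).
print(asr_model.transcribe_file("example.wav"))
```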
## Assignment1 Joane
Classroom-workshop · MIT · 22 downloads · 0 likes
A speech-to-text (S2T) model for automatic speech recognition (ASR).
Tags: Speech Recognition, Transformers, English
## Wav2vec2 Conformer Rope Large 960h Ft
facebook · Apache-2.0 · 22.02k downloads · 10 likes
A Wav2Vec2-Conformer model with rotary position embeddings, pre-trained and fine-tuned on 960 hours of 16kHz-sampled LibriSpeech data; suited to English speech recognition tasks.
Tags: Speech Recognition, Transformers, English
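A short sketch of how such a checkpoint is typically used through Hugging Face Transformers, which ships a dedicated `Wav2Vec2ConformerForCTC` class. The audio file name is a placeholder and is assumed to be 16kHz mono.

```python
# Sketch: CTC transcription with a Wav2Vec2-Conformer checkpoint.
import soundfile as sf
import torch
from transformers import Wav2Vec2Processor, Wav2Vec2ConformerForCTC

model_id = "facebook/wav2vec2-conformer-rope-large-960h-ft"
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ConformerForCTC.from_pretrained(model_id)

speech, sample_rate = sf.read("example.wav")  # placeholder; 16kHz mono assumed
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: argmax per frame, then collapse repeats and blanks.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```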
## Wav2vec2 Conformer Rel Pos Large 960h Ft
facebook · Apache-2.0 · 1,038 downloads · 5 likes
A Wav2Vec2-Conformer model with relative position embeddings, pre-trained and fine-tuned on 960 hours of 16kHz-sampled Librispeech data.
Tags: Speech Recognition, Transformers, English
## Wav2vec2 Large 10min Lv60 Self
Splend1dchan · Apache-2.0 · 177 downloads · 0 likes
A large speech recognition model based on the Wav2Vec2 architecture, pre-trained and fine-tuned on 10 minutes of Libri-Light and Librispeech data with a self-training objective; expects 16kHz-sampled speech audio.
Tags: Speech Recognition, Transformers, English
## Wav2vec2 Large 100h Lv60 Self
Splend1dchan · Apache-2.0 · 17 downloads · 0 likes
Wav2Vec2-Large-100h-Lv60 is a large model pre-trained and fine-tuned on 100 hours of Libri-Light and Librispeech speech data with a self-training objective; intended for speech recognition at a 16kHz sampling rate.
Tags: Speech Recognition, Transformers, English
## S2t Small Librispeech Asr
facebook · MIT · 10.92k downloads · 27 likes
A speech-to-text (S2T) model for automatic speech recognition (ASR), based on a sequence-to-sequence Transformer architecture.
Tags: Speech Recognition, Transformers, English
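Unlike the CTC checkpoints above, this seq2seq model produces transcripts by autoregressive generation. A sketch of the standard Transformers usage, with a placeholder audio path (16kHz mono assumed):

```python
# Sketch: seq2seq ASR with generate() instead of CTC decoding.
import soundfile as sf
from transformers import Speech2TextForConditionalGeneration, Speech2TextProcessor

model_id = "facebook/s2t-small-librispeech-asr"
processor = Speech2TextProcessor.from_pretrained(model_id)
model = Speech2TextForConditionalGeneration.from_pretrained(model_id)

speech, _ = sf.read("example.wav")  # placeholder; 16kHz mono assumed
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

generated_ids = model.generate(
    inputs["input_features"], attention_mask=inputs["attention_mask"]
)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```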
## Wavlm Libri Clean 100h Base
patrickvonplaten · 6,515 downloads · 1 like
An automatic speech recognition model based on microsoft/wavlm-base, fine-tuned on the clean 100-hour subset of the LIBRISPEECH_ASR dataset.
Tags: Speech Recognition, Transformers
## Wav2vec2 Base 960h
facebook · Apache-2.0 · 2.1M downloads · 331 likes
The Wav2Vec2 base model developed by Facebook, pre-trained and fine-tuned on 960 hours of LibriSpeech audio for English automatic speech recognition tasks.
Tags: Speech Recognition, Transformers, English
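Given how widely this checkpoint is used, the quickest way to try it is the Transformers ASR pipeline, which handles feature extraction and CTC decoding internally (the file name is a placeholder):

```python
# Sketch: one-line transcription via the Transformers ASR pipeline.
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h")
print(asr("example.wav")["text"])  # accepts a path, URL, or raw waveform array
```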
## Wav2vec2 Base 100h
facebook · Apache-2.0 · 4,380 downloads · 6 likes
A Wav2Vec2 base automatic speech recognition model pre-trained and fine-tuned on 100 hours of 16kHz-sampled LibriSpeech audio.
Tags: Speech Recognition, Transformers, English
## Sew D Tiny 100k
asapp · Apache-2.0 · 1,074 downloads · 2 likes
SEW-D is a compressed, efficient speech pre-training model developed by ASAPP Research, pre-trained on 16kHz-sampled speech audio and suitable for a variety of downstream speech tasks.
Tags: Speech Recognition, Transformers, English
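Because this checkpoint is a pre-trained encoder rather than a fine-tuned recognizer, a plausible use is extracting frame-level representations for downstream tasks. A sketch using the generic Transformers auto classes, with a placeholder audio path (16kHz mono assumed):

```python
# Sketch: extracting speech representations from the pretrained SEW-D encoder
# (no CTC head; fine-tune or probe these features for downstream tasks).
import soundfile as sf
import torch
from transformers import AutoFeatureExtractor, AutoModel

model_id = "asapp/sew-d-tiny-100k"
feature_extractor = AutoFeatureExtractor.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

speech, _ = sf.read("example.wav")  # placeholder; 16kHz mono assumed
inputs = feature_extractor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    hidden_states = model(**inputs).last_hidden_state  # (batch, frames, dim)
print(hidden_states.shape)
```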
## Wav2vec2 2 Bert Large No Adapter Frozen Enc
speech-seq2seq · 25 downloads · 2 likes
A speech recognition model trained on the librispeech_asr dataset; it achieves a word error rate (WER) of 2.0133 on the evaluation set.
Tags: Speech Recognition, Transformers
## Dcunet Libri1Mix Enhsingle 16k
JorisCos · 69 downloads · 5 likes
An audio enhancement model trained with the Asteroid framework, designed for single-channel (mono) speech enhancement tasks; a usage sketch follows the three Asteroid entries below.
Tags: Audio Enhancement
## Dprnntasnet Ks2 Libri1Mix Enhsingle 16k
JorisCos · 4,859 downloads · 1 like
An audio enhancement model trained with the Asteroid framework on the Libri1Mix dataset, designed for single-channel speech enhancement tasks.
Tags: Audio Enhancement
## Dptnet Libri1Mix Enhsingle 16k
JorisCos · 4,446 downloads · 3 likes
An audio enhancement model trained with the Asteroid framework, focused on single-channel speech enhancement tasks.
Tags: Audio Enhancement
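All three Asteroid enhancement checkpoints above can be loaded through Asteroid's generic hub integration. A sketch, assuming the canonical repo id casing (e.g. `JorisCos/DPTNet_Libri1Mix_enhsingle_16k`, reconstructed from the listing name) and a placeholder input file:

```python
# Sketch: single-channel speech enhancement with an Asteroid checkpoint.
from asteroid.models import BaseModel

# Repo id reconstructed from the listing name above; any of the three
# JorisCos enhancement entries should load the same way.
model = BaseModel.from_pretrained("JorisCos/DPTNet_Libri1Mix_enhsingle_16k")

# separate() reads the wav, enhances it, and writes the estimate next to
# the input file (e.g. "noisy_est1.wav").
model.separate("noisy.wav")
```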