Wav2vec2 Conformer Rel Pos Large 100h Ft
Developed by facebook
A large Wav2Vec2-Conformer speech recognition model that uses relative position embeddings, fine-tuned on 100 hours of LibriSpeech data
Downloads 99
Release Time: 4/18/2022
Model Overview
This is an automatic speech recognition (ASR) model based on the Wav2Vec2-Conformer architecture with relative position embeddings. It was fine-tuned on 100 hours of LibriSpeech data and is intended for English speech recognition on audio sampled at 16 kHz.
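As an illustration of the overview above, the following is a minimal sketch of transcribing one clip with the Hugging Face transformers library. It assumes the checkpoint is published on the Hub as facebook/wav2vec2-conformer-rel-pos-large-100h-ft and that torch and transformers are installed; the function name transcribe and the constants are hypothetical names chosen for this example, not part of the model card.

```python
# Assumed Hub identifier for this checkpoint; adjust if it differs.
MODEL_ID = "facebook/wav2vec2-conformer-rel-pos-large-100h-ft"
EXPECTED_SAMPLE_RATE = 16_000  # the model expects 16 kHz audio


def transcribe(waveform, sample_rate):
    """Return a greedy transcription for one mono clip.

    waveform: 1-D sequence of float samples (list or NumPy array).
    sample_rate: sampling rate of the waveform in Hz.
    """
    # Reject audio at the wrong rate before loading anything heavy.
    if sample_rate != EXPECTED_SAMPLE_RATE:
        raise ValueError(
            f"expected {EXPECTED_SAMPLE_RATE} Hz audio, got {sample_rate} Hz"
        )

    # Imported lazily so the module can be loaded without these packages.
    import torch
    from transformers import Wav2Vec2ConformerForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
    model = Wav2Vec2ConformerForCTC.from_pretrained(MODEL_ID)
    model.eval()

    # Normalize the raw samples and wrap them in a batch of size one.
    inputs = processor(waveform, sampling_rate=sample_rate, return_tensors="pt")

    with torch.no_grad():
        logits = model(inputs.input_values).logits

    # Greedy decoding: pick the most likely token at every frame.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Calling transcribe(samples, 16000) downloads the weights on first use. Loading the processor and model once and reusing them would be preferable when transcribing many clips.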
Model Features
Relative Position Embedding
Uses relative position embeddings, which may improve recognition on long speech sequences
Conformer Architecture
Combines Transformer self-attention with CNN convolution layers, capturing both global and local speech features
Efficient Training
Fine-tuned on 100 hours of LibriSpeech data, which requires less labeled data and compute than fine-tuning on the full 960 hours
Model Capabilities
English Speech Recognition
16 kHz Sampling Rate Audio Processing
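Because the model expects 16 kHz input, audio recorded at another rate has to be resampled first. The sketch below shows the idea with plain linear interpolation in standard-library Python; resample_linear and TARGET_RATE are hypothetical names for this example. Linear interpolation applies no anti-aliasing filter, so a dedicated resampler from an audio library is likely a better choice in practice.

```python
TARGET_RATE = 16_000  # sampling rate the model expects, in Hz


def resample_linear(samples, source_rate, target_rate=TARGET_RATE):
    """Resample a mono signal by linear interpolation between neighbors."""
    if source_rate <= 0 or target_rate <= 0:
        raise ValueError("sampling rates must be positive")
    # Nothing to do when the rates match or there is too little data.
    if source_rate == target_rate or len(samples) < 2:
        return [float(s) for s in samples]

    n_out = max(1, round(len(samples) * target_rate / source_rate))
    step = source_rate / target_rate  # input samples per output sample
    last = len(samples) - 1

    out = []
    for i in range(n_out):
        pos = i * step          # fractional position in the input signal
        left = int(pos)
        if left >= last:
            # Past the final interval: hold the last input sample.
            out.append(float(samples[last]))
            continue
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[left + 1] * frac)
    return out
```

For example, resample_linear(samples, 44100) converts a 44.1 kHz recording to the 16 kHz rate before it is passed to the model.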
Use Cases
Speech-to-Text
Meeting Minutes
Automatically convert English meeting recordings into text transcripts
Podcast Transcription
Transcribe English podcast content into text