
Wav2vec2 Conformer Rel Pos Large 100h Ft

Developed by facebook
A large Wav2Vec2-Conformer speech recognition model with relative position embeddings, fine-tuned on 100 hours of Librispeech data
Downloads 99
Release Time: 4/18/2022

Model Overview

This is an automatic speech recognition (ASR) model based on the Wav2Vec2-Conformer architecture with relative position embeddings. It was fine-tuned on 100 hours of Librispeech data and is suited to English speech recognition on audio sampled at 16 kHz.
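
A minimal usage sketch, not an official recipe. It assumes the Hugging Face `transformers` and `torch` packages are installed and that the checkpoint is published under the Hub id `facebook/wav2vec2-conformer-rel-pos-large-100h-ft`; that id is inferred from the model name and developer shown here and is not stated on this page. The helper name `transcribe` is hypothetical. The function checks the sampling rate before loading anything, since the model expects 16 kHz input.

```python
MODEL_ID = "facebook/wav2vec2-conformer-rel-pos-large-100h-ft"  # assumed Hub id
EXPECTED_RATE = 16_000  # the model expects audio sampled at 16 kHz


def transcribe(waveform, sampling_rate, model_id=MODEL_ID):
    """Greedy CTC transcription of one mono waveform (a sequence of floats)."""
    if sampling_rate != EXPECTED_RATE:
        raise ValueError(
            f"expected {EXPECTED_RATE} Hz audio, got {sampling_rate} Hz; resample first"
        )

    # Imported lazily so the sample-rate check runs without the heavy dependencies.
    import torch
    from transformers import Wav2Vec2ConformerForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ConformerForCTC.from_pretrained(model_id)
    model.eval()

    inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    # Greedy decoding: pick the most likely token per frame; the tokenizer
    # then collapses repeated tokens and removes CTC blanks.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Calling `transcribe(samples, 16000)` downloads the checkpoint on first use and returns the decoded text as a string. Loading the processor and model once and reusing them would be the better design when transcribing many files.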

Model Features

Relative Position Embedding
Uses relative position embeddings, which may improve recognition of long speech sequences
Conformer Architecture
Combines Transformer self-attention with convolution (CNN) layers to capture both local and global speech features
Efficient Training
Fine-tuned on 100 hours of Librispeech data, which requires less labeled data and compute than fine-tuning on the full 960 hours

Model Capabilities

English Speech Recognition
16kHz Sampling Rate Audio Processing
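
Audio recorded at another rate has to be converted to 16 kHz before it is passed to the model. The sketch below shows the idea with plain linear interpolation; the function name `resample_linear` is hypothetical, and the approach is an illustration only. It applies no anti-aliasing filter, so for downsampling real recordings a dedicated resampler from an audio library is the safer choice.

```python
def resample_linear(samples, src_rate, dst_rate=16_000):
    """Resample a mono signal by linear interpolation (no anti-aliasing filter)."""
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)

    n_out = int(len(samples) * dst_rate / src_rate)
    step = src_rate / dst_rate  # source samples advanced per output sample
    out = []
    for i in range(n_out):
        pos = i * step
        left = int(pos)
        right = min(left + 1, len(samples) - 1)  # clamp at the last sample
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out
```

For example, one second of 8 kHz audio (8,000 samples) becomes 16,000 samples, with each new sample interpolated between its two nearest source samples.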

Use Cases

Speech-to-Text
Meeting Minutes
Automatically convert English meeting recordings into text transcripts
Podcast Transcription
Transcribe English podcast content into text
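
Meeting recordings and podcasts are usually far longer than a single utterance, so one common approach is to transcribe them in overlapping windows. The sketch below only computes the window boundaries; the function name `split_into_chunks` and the default 20-second window with 2-second overlap are assumptions chosen for illustration, not values recommended for this model.

```python
def split_into_chunks(num_samples, sampling_rate=16_000,
                      chunk_seconds=20.0, overlap_seconds=2.0):
    """Return (start, end) sample-index pairs that cover [0, num_samples)."""
    chunk = int(chunk_seconds * sampling_rate)
    overlap = int(overlap_seconds * sampling_rate)
    if chunk <= 0 or not 0 <= overlap < chunk:
        raise ValueError("need chunk > 0 and 0 <= overlap < chunk")

    stride = chunk - overlap  # how far each new window advances
    spans = []
    start = 0
    while start < num_samples:
        end = min(start + chunk, num_samples)
        spans.append((start, end))
        if end == num_samples:
            break  # the last window reached the end of the recording
        start += stride
    return spans
```

Each `(start, end)` pair can be used to slice the waveform and transcribe the slice separately. Merging the text from overlapping regions is a separate step that this sketch does not cover.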