
Wav2vec2 2 Bart Large Tedlium

Developed by sanchit-gandhi
A sequence-to-sequence automatic speech recognition model trained on the TEDLIUM corpus. It combines the Wav2Vec2 speech encoder with the Bart text decoder.
Downloads 111
Release Time: 6/29/2022

Model Overview

This model performs English automatic speech recognition. It uses a hybrid encoder-decoder architecture, with Wav2Vec2 as the speech encoder and Bart as the text decoder, and reaches a Word Error Rate (WER) of 6.4% on the TEDLIUM test set.

Model Features

Hybrid Architecture
Pairs the Wav2Vec2 speech encoder with the Bart text decoder in a single sequence-to-sequence model
High Performance
Achieves a Word Error Rate (WER) of 6.4% on the TEDLIUM test set
Pretrained Initialization
The encoder is initialized from the pretrained Wav2Vec2 LV-60k weights and the decoder from the pretrained Bart large weights
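
To make the Word Error Rate figure concrete, here is a minimal sketch of how WER is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions, not part of the model's published evaluation code, which may also normalize text before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of six reference words gives a WER of 1/6.
example = word_error_rate("the cat sat on the mat", "the cat sat on mat")
```

Under this definition, a WER of 6.4% corresponds to roughly 6.4 word errors per 100 reference words.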

Model Capabilities

English Speech Recognition
Long Audio Processing
High-quality Transcription

Use Cases

Meeting Minutes
TED Speech Transcription
Automatically convert TED speech audio into a written transcript
Word Error Rate (WER) of 6.4% on the TEDLIUM test set
Education
Lecture Recording Transcription
Convert academic lecture recordings into text for notes or subtitles
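
The transcription use cases above could be approached with the Hugging Face Transformers `pipeline` API, sketched below. The Hub ID in `MODEL_ID` is an assumption inferred from the developer and model name on this page and should be checked before use. The helper name `transcribe` is hypothetical.

```python
# Assumed Hub ID, inferred from the developer and model name on this page.
MODEL_ID = "sanchit-gandhi/wav2vec2-2-bart-large-tedlium"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return the transcript of a single audio file."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    # Builds an automatic-speech-recognition pipeline; the first call
    # downloads the model weights from the Hub.
    asr = pipeline("automatic-speech-recognition", model=model_id)

    # Given a file path, the pipeline decodes the audio itself,
    # which requires ffmpeg to be available on the system.
    result = asr(audio_path)
    return result["text"]
```

A call such as `transcribe("lecture.wav")`, where `lecture.wav` is a placeholder file name, would return the recognized text as a string. The sketch makes no provision for very long recordings, which may need to be split into shorter segments before transcription.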