
Assignment1 Joane

Developed by Classroom-workshop
A speech-to-text (S2T) model for automatic speech recognition (ASR)
Downloads 22
Release Time: 6/2/2022

Model Overview

This model is an end-to-end sequence-to-sequence Transformer trained with the standard autoregressive cross-entropy loss. At inference time it generates the transcription autoregressively, one token at a time.
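To make the two ideas in this overview concrete, here is a minimal pure-Python sketch. It is not the model's actual implementation: the function names, the four-token vocabulary, and `toy_next_token_logits` are all hypothetical stand-ins, and the toy decoder ignores the audio features that a real speech model would also condition on. The first function computes the cross-entropy loss under teacher forcing; the second performs greedy autoregressive decoding.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def teacher_forced_cross_entropy(step_logits, target_ids):
    """Mean negative log-likelihood of the reference tokens.

    step_logits[t] holds the decoder scores for position t, assumed to be
    computed with the reference tokens before t as input (teacher forcing).
    """
    if len(step_logits) != len(target_ids):
        raise ValueError("need one logit vector per target token")
    nll = 0.0
    for logits, target in zip(step_logits, target_ids):
        nll -= math.log(softmax(logits)[target])
    return nll / len(target_ids)


def greedy_decode(next_token_logits, bos_id, eos_id, max_len=20):
    """Generate tokens one at a time, feeding each choice back as input."""
    tokens = [bos_id]
    for _ in range(max_len):
        logits = next_token_logits(tokens)
        best = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(best)
        if best == eos_id:
            break
    return tokens[1:]  # drop the start-of-sequence token


def toy_next_token_logits(prefix):
    """Hypothetical stand-in for a decoder: favors (last token + 1) mod 4."""
    vocab_size = 4
    logits = [0.0] * vocab_size
    logits[(prefix[-1] + 1) % vocab_size] = 5.0
    return logits


decoded = greedy_decode(toy_next_token_logits, bos_id=0, eos_id=3)
uniform_loss = teacher_forced_cross_entropy([[0.0] * 4, [0.0] * 4], [1, 2])
```

With uniform scores over four tokens the loss equals log(4), the value expected from a model that has learned nothing. Greedy search is only one decoding strategy; beam search is a common alternative.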

Model Features

End-to-end model
Generates text directly from speech features without intermediate processing steps
High accuracy
Achieves a word error rate (WER) of 4.3 on the LibriSpeech "clean" test set and 9.0 on the "other" test set
Autoregressive generation
Generates the transcription one token at a time, with each token conditioned on the tokens already produced
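The accuracy figures above are word error rates. As a rough illustration of how that metric is usually defined, the sketch below computes WER for a single utterance as the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. The function name and the example sentences are hypothetical, and the exact text normalization used for the reported scores is not stated in this card.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Benchmark scores are typically computed over a whole test set by summing the errors and the reference word counts across utterances, then multiplying by 100, rather than by averaging per-utterance values.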

Model Capabilities

English speech recognition
End-to-end speech-to-text
Real-time speech transcription

Use Cases

Speech transcription
Meeting minutes
Automatically convert meeting recordings into text transcripts
Highly accurate transcripts
Voice notes
Convert voice memos into searchable text
Easily retrievable and organized text content
Assistive technology
Hearing assistance
Provide real-time captions for the hearing impaired
Improved accessibility