S2T Large LibriSpeech ASR

Developed by Facebook
An end-to-end sequence-to-sequence transformer model for automatic speech recognition (ASR), trained on the LibriSpeech dataset
Downloads 422
Release Time: 3/2/2022

Model Overview

This model is a Speech-to-Text Transformer (S2T) trained with a standard autoregressive cross-entropy loss. It converts speech signals into the corresponding text transcriptions
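
A minimal sketch of how a model like this might be called through the Hugging Face transformers library. Several details here are assumptions not stated on this page: the checkpoint id facebook/s2t-large-librispeech-asr, the 16 kHz input sampling rate, and the helper name transcribe (hypothetical). The sketch needs torch, transformers, and typically torchaudio and sentencepiece installed, and it downloads the checkpoint on first use.

```python
EXPECTED_SAMPLING_RATE = 16_000  # assumption: the checkpoint expects 16 kHz audio
MODEL_ID = "facebook/s2t-large-librispeech-asr"  # assumed Hugging Face Hub id


def transcribe(waveform, sampling_rate, model_id=MODEL_ID):
    """Return the transcription of a mono waveform (1-D sequence of floats)."""
    if sampling_rate != EXPECTED_SAMPLING_RATE:
        raise ValueError(
            f"expected {EXPECTED_SAMPLING_RATE} Hz audio, "
            f"got {sampling_rate} Hz; resample first"
        )

    # Imported lazily so the input check above works without the heavy dependencies.
    from transformers import Speech2TextForConditionalGeneration, Speech2TextProcessor

    processor = Speech2TextProcessor.from_pretrained(model_id)
    model = Speech2TextForConditionalGeneration.from_pretrained(model_id)

    # Turn the raw audio into the filter-bank features the encoder consumes.
    inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")

    # Decode token ids autoregressively, then map them back to text.
    generated_ids = model.generate(
        inputs["input_features"], attention_mask=inputs.get("attention_mask")
    )
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

Loading the processor and model inside the function keeps the sketch short; a real application would load them once and reuse them across calls.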

Model Features

End-to-end model
Directly generates text transcriptions from speech signals without intermediate processing steps
High performance
Achieves a word error rate (WER) of 3.3 on the "clean" and 7.5 on the "other" LibriSpeech test sets
Transformer-based architecture
Uses a transformer architecture for sequence-to-sequence modeling
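
For readers unfamiliar with the WER metric quoted above, here is a minimal sketch of how word error rate is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference and a hypothesis, divided by the number of reference words, expressed as a percentage. The function name word_error_rate is illustrative, and this is not necessarily the exact scoring script behind the numbers on this page, which may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent between a reference and a hypothesis transcript."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return 100.0 * prev[-1] / len(ref)
```

For example, word_error_rate("a b c d", "a x c d") counts one substitution over four reference words.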

Model Capabilities

English speech recognition
Real-time speech-to-text
Long audio processing
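
This page does not say how long audio is handled. One common approach, shown here purely as an assumption, is to split the recording into fixed-size overlapping windows, transcribe each window separately, and merge the results. The sketch below covers only the windowing step; the function name split_into_chunks and its parameters are hypothetical.

```python
def split_into_chunks(num_samples, chunk_size, overlap):
    """Return (start, end) sample index pairs that cover [0, num_samples).

    Consecutive chunks share `overlap` samples so that words cut at a
    boundary appear whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < num_samples:
        end = min(start + chunk_size, num_samples)
        chunks.append((start, end))
        if end == num_samples:
            break  # the final chunk reached the end of the audio
        start += step
    return chunks
```

Assuming 16 kHz audio, 30-second windows with a 1-second overlap would correspond to chunk_size=480000 and overlap=16000. Merging the overlapping transcripts is a separate problem that this sketch does not address.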

Use Cases

Speech transcription
Meeting minutes
Automatically convert meeting recordings into text transcripts
Highly accurate transcription results
Podcast transcription
Convert English podcast content into text
Supports long audio processing
Assistive technology
Hearing assistance
Provide real-time captions for hearing-impaired individuals
Low-latency speech recognition