
S2T Medium LibriSpeech ASR

Developed by facebook
A speech-to-text (S2T) model for automatic speech recognition (ASR), based on a sequence-to-sequence transformer architecture.
Downloads 1,086
Release Time: 3/2/2022

Model Overview

This is an end-to-end sequence-to-sequence transformer that converts speech to text. It is trained with the standard autoregressive cross-entropy loss.
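
To make the training objective concrete, here is a minimal pure-Python sketch of an autoregressive cross-entropy loss. It is an illustration under stated assumptions, not the model's actual training code: the function names, the three-token vocabulary, and the logit values are all hypothetical. Each row of logits stands for the decoder's scores at one step, computed with the reference tokens before that step as input (teacher forcing).

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sequence_cross_entropy(step_logits, target_ids):
    """Mean negative log-likelihood of the reference tokens.

    step_logits[t] holds the (hypothetical) decoder scores over the
    vocabulary at step t; target_ids[t] is the reference token at step t.
    """
    if not target_ids or len(step_logits) != len(target_ids):
        raise ValueError("need one non-empty row of logits per target token")
    losses = []
    for logits, target in zip(step_logits, target_ids):
        probs = softmax(logits)
        losses.append(-math.log(probs[target]))
    return sum(losses) / len(losses)


# Toy example: a 3-token vocabulary and a 2-token reference transcript.
uniform_loss = sequence_cross_entropy([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], [1, 2])
confident_loss = sequence_cross_entropy([[0.0, 5.0, 0.0], [0.0, 0.0, 5.0]], [1, 2])
```

With uniform scores the loss equals log(3), the cost of guessing among three tokens; scores that favor the reference tokens give a lower loss, which is what training pushes toward.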

Model Features

End-to-end speech recognition
Directly generates text from speech features without intermediate processing steps
Autoregressive generation
Generates the transcript one token at a time, with each token conditioned on the tokens generated before it
LibriSpeech training
Trained on the LibriSpeech dataset, suitable for English speech recognition
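
Autoregressive generation can be sketched as a simple greedy loop. This is a toy illustration, not the model's decoder: the token ids, the vocabulary, and the scripted `toy_scores` function are hypothetical stand-ins for a real network, and greedy search is only one possible decoding strategy (beam search is a common alternative).

```python
BOS, EOS = 0, 1  # hypothetical start-of-sequence and end-of-sequence ids


def greedy_decode(next_token_scores, max_len=20):
    """Repeatedly pick the highest-scoring next token until EOS or max_len."""
    tokens = [BOS]
    while len(tokens) <= max_len:
        scores = next_token_scores(tokens)  # scores depend on the prefix so far
        best = max(range(len(scores)), key=scores.__getitem__)
        if best == EOS:
            break
        tokens.append(best)
    return tokens[1:]  # drop the BOS marker


VOCAB = ["<s>", "</s>", "hello", "world"]


def toy_scores(prefix):
    """Stand-in for the decoder: scripted to emit 'hello world', then stop."""
    script = [2, 3, EOS]
    step = len(prefix) - 1
    target = script[min(step, len(script) - 1)]
    return [1.0 if i == target else 0.0 for i in range(len(VOCAB))]


text = " ".join(VOCAB[t] for t in greedy_decode(toy_scores))
```

In a real speech model the scoring function would also attend to the encoded audio features; here it depends only on the prefix to keep the loop itself easy to follow.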

Model Capabilities

Speech recognition
English transcription
End-to-end speech-to-text
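
As a rough sketch of how these capabilities might be used, the function below transcribes a single audio file with the Hugging Face `transformers` Speech2Text classes. Several things here are assumptions rather than facts stated on this page: the Hub identifier `facebook/s2t-medium-librispeech-asr`, the availability of `transformers`, `torch`, `torchaudio`, `sentencepiece`, and `soundfile`, and a mono recording sampled at 16 kHz. The `transcribe` helper and the file path in the usage note are hypothetical.

```python
MODEL_ID = "facebook/s2t-medium-librispeech-asr"  # assumed Hub identifier


def transcribe(wav_path, model_id=MODEL_ID):
    """Return the transcript of one mono audio file (assumed 16 kHz)."""
    # Imported lazily so this module loads even without the optional packages.
    import soundfile as sf
    from transformers import (
        Speech2TextForConditionalGeneration,
        Speech2TextProcessor,
    )

    processor = Speech2TextProcessor.from_pretrained(model_id)
    model = Speech2TextForConditionalGeneration.from_pretrained(model_id)

    # Read the waveform as float32 samples along with its sampling rate.
    audio, sample_rate = sf.read(wav_path, dtype="float32")

    # The processor extracts speech features; it rejects a sampling rate
    # that differs from the one the model expects.
    inputs = processor(audio, sampling_rate=sample_rate, return_tensors="pt")

    generated_ids = model.generate(
        inputs["input_features"], attention_mask=inputs["attention_mask"]
    )
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

A call such as `transcribe("meeting.wav")` would download the weights on first use and return the recognized text as a string.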

Use Cases

Speech transcription
Meeting minutes
Automatically convert meeting recordings into text transcripts
Voice notes
Convert voice memos into searchable text
Assistive technology
Real-time captions
Provide real-time speech-to-text services for people who are deaf or hard of hearing