S2T Small LibriSpeech ASR

Developed by Facebook
A speech-to-text (S2T) model for automatic speech recognition (ASR), based on a sequence-to-sequence transformer architecture.
Downloads 10.92k
Release Time: 3/2/2022

Model Overview

This is an end-to-end speech recognition model trained with a standard autoregressive cross-entropy loss. It converts speech to text, generating the transcript one token at a time.
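A minimal sketch of how such a model might be called through the Hugging Face transformers library. The checkpoint id `facebook/s2t-small-librispeech-asr`, the 16 kHz input rate, and the helper name `transcribe` are assumptions not stated on this page; the Speech2Text classes also need `torchaudio` and `sentencepiece` installed.

```python
def transcribe(waveform, sampling_rate=16000,
               model_id="facebook/s2t-small-librispeech-asr"):
    """Transcribe one mono waveform (a 1-D sequence of float samples).

    `transcribe` is a hypothetical helper; `model_id` is an assumed hub id.
    """
    # Assumption: the checkpoint expects 16 kHz audio (the LibriSpeech rate),
    # so other rates are rejected early instead of producing poor output.
    if sampling_rate != 16000:
        raise ValueError(f"expected 16 kHz audio, got {sampling_rate} Hz")

    # Imported lazily so the check above runs without the heavy dependencies.
    from transformers import (
        Speech2TextForConditionalGeneration,
        Speech2TextProcessor,
    )

    processor = Speech2TextProcessor.from_pretrained(model_id)
    model = Speech2TextForConditionalGeneration.from_pretrained(model_id)

    # The processor turns raw samples into the filter-bank features the encoder consumes.
    inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")

    # Autoregressive decoding: the decoder emits the transcript token by token.
    generated_ids = model.generate(
        inputs["input_features"], attention_mask=inputs["attention_mask"]
    )
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

Loading the processor and model inside the function keeps the sketch short; a real application would load them once and reuse them across calls.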

Model Features

End-to-end speech recognition
Generates text directly from speech input, with no intermediate processing steps.
Transformer-based architecture
Uses a sequence-to-sequence (encoder-decoder) transformer architecture.
High accuracy
Achieves a word error rate (WER) of 4.3 on the LibriSpeech "clean" test set and 9.0 on the "other" test set.
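For readers unfamiliar with the metric, WER is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below is a plain illustration of that definition, not the evaluation script behind the figures above; the function name `word_error_rate` is hypothetical, and real evaluations usually normalize case and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
score = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER = {score:.3f}")
```

Multiplying the result by 100 gives the percentage form in which WER is usually reported.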

Model Capabilities

English speech recognition
End-to-end speech-to-text conversion
Long audio processing
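This page does not say how long recordings are handled. One common approach, shown here purely as an assumption, is to split the audio into fixed-length overlapping windows and transcribe each window separately. The function name `split_into_chunks` and the default window sizes are hypothetical, and merging the overlapping transcripts is not shown.

```python
def split_into_chunks(samples, sampling_rate=16000,
                      chunk_seconds=20.0, overlap_seconds=1.0):
    """Split a 1-D sequence of samples into overlapping fixed-length windows."""
    chunk = int(chunk_seconds * sampling_rate)
    step = chunk - int(overlap_seconds * sampling_rate)
    if chunk <= 0 or step <= 0:
        raise ValueError("chunk must be positive and longer than the overlap")

    chunks = []
    start = 0
    while start < len(samples):
        chunks.append(samples[start:start + chunk])
        # Stop once the current window reaches the end of the recording.
        if start + chunk >= len(samples):
            break
        start += step
    return chunks


# Toy example at 1 sample per second: 4-second windows with 1 second of overlap.
print(split_into_chunks(list(range(10)), sampling_rate=1,
                        chunk_seconds=4, overlap_seconds=1))
```

The overlap gives each window some context on both sides of a cut, so a word split at one boundary still appears whole in the neighboring window.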

Use Cases

Speech transcription
Audio content transcription
Converts English speech content into text.
Highly accurate transcription results
Assistive technology
Real-time caption generation
Generates real-time captions for English videos or live streams.