Assignment1 Jack

Developed by Classroom-workshop
A speech-to-text (S2T) model for automatic speech recognition (ASR), based on a sequence-to-sequence transformer architecture.
Downloads 24
Release Time: 6/2/2022

Model Overview

This is an end-to-end sequence-to-sequence transformer trained with a standard autoregressive cross-entropy loss. It converts speech directly to text.
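
To make the training objective concrete, here is a minimal sketch of an autoregressive cross-entropy loss in plain Python. It assumes the decoder has already produced one score vector per target position under teacher forcing. The function names and the toy vocabulary are hypothetical, chosen for illustration, and are not taken from the model's code.

```python
import math

def token_cross_entropy(logits, target_id):
    """Negative log-probability of target_id under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target_id]

def sequence_cross_entropy(step_logits, target_ids):
    """Mean per-token cross-entropy over one target transcript.

    step_logits[t] holds the decoder's scores for position t, computed
    with the ground-truth tokens before t as input (teacher forcing).
    """
    if not target_ids or len(step_logits) != len(target_ids):
        raise ValueError("need one score vector per target token")
    losses = [token_cross_entropy(l, t) for l, t in zip(step_logits, target_ids)]
    return sum(losses) / len(losses)

# Toy example (assumed values): vocabulary of 4 tokens, transcript of 2 tokens.
uniform = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
loss = sequence_cross_entropy(uniform, [1, 3])  # equals log(4) for uniform scores
```

With uniform scores the loss is the log of the vocabulary size. It falls toward zero as the scores concentrate on the correct tokens.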

Model Features

End-to-end speech recognition
Directly generates text from speech features without intermediate processing steps
Transformer-based architecture
Uses a standard sequence-to-sequence transformer, in which a decoder generates the transcript token by token from the encoded audio
Trained on LibriSpeech dataset
Trained on the widely used LibriSpeech dataset of read English speech
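
As an illustration of how a sequence-to-sequence model can produce text one token at a time, the sketch below shows greedy decoding. The page does not state which decoding strategy the model uses, so greedy decoding is chosen here only because it is the simplest option. The `next_token_scores` callable is a hypothetical stand-in for the decoder, and the token ids are made up.

```python
def greedy_decode(next_token_scores, bos_id, eos_id, max_len=50):
    """Generate token ids one at a time, always taking the top-scoring token.

    next_token_scores(prefix) is a hypothetical callable standing in for the
    decoder: given the tokens so far, it returns one score per vocabulary id.
    """
    tokens = [bos_id]
    for _ in range(max_len):
        scores = next_token_scores(tokens)
        best = max(range(len(scores)), key=scores.__getitem__)
        if best == eos_id:
            break  # the model signalled end of transcript
        tokens.append(best)
    return tokens[1:]  # drop the start token

# Toy stand-in for a decoder: favors 2, then 3, then end-of-sequence (id 1).
def toy_scores(prefix):
    script = {1: 2, 2: 3, 3: 1}  # prefix length -> token to favor
    favored = script[len(prefix)]
    return [1.0 if i == favored else 0.0 for i in range(4)]

decoded = greedy_decode(toy_scores, bos_id=0, eos_id=1)
```

In a real speech model the score function would also depend on the encoded audio. Here that dependence is hidden inside the callable to keep the loop readable.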

Model Capabilities

English speech recognition
End-to-end speech-to-text conversion
16kHz sampling rate audio processing
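
Since the model processes 16 kHz audio, it can help to check a recording before transcription. The following sketch uses only the Python standard library to read a 16-bit PCM WAV file and reject other sampling rates. The function name is hypothetical, and the downmix to mono is an assumption, since the page does not state the expected channel layout.

```python
import struct
import wave

TARGET_RATE = 16_000  # sampling rate named in the capabilities list

def read_wav_16k(path):
    """Read a 16-bit PCM WAV file and return mono samples in [-1.0, 1.0)."""
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != TARGET_RATE:
            raise ValueError(
                f"expected {TARGET_RATE} Hz, got {wav.getframerate()} Hz"
            )
        if wav.getsampwidth() != 2:
            raise ValueError("only 16-bit PCM is handled in this sketch")
        channels = wav.getnchannels()
        raw = wav.readframes(wav.getnframes())
    ints = struct.unpack(f"<{len(raw) // 2}h", raw)
    # Average interleaved channels down to mono (assumed input layout).
    frames = [ints[i:i + channels] for i in range(0, len(ints), channels)]
    return [sum(frame) / (channels * 32768.0) for frame in frames]
```

Recordings at other rates would need resampling first, which this sketch deliberately leaves out.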

Use Cases

Speech transcription
Meeting minutes
Automatically convert meeting recordings into text transcripts
Voice notes
Convert voice memos into searchable text
Assistive technology
Real-time captioning
Provide real-time speech-to-text services for people who are deaf or hard of hearing