ASR Streaming Conformer LibriSpeech

Developed by speechbrain
An end-to-end automatic speech recognition (ASR) system for English, pre-trained on the LibriSpeech dataset, that supports both streaming and non-streaming transcription.
Downloads: 304
Release Time: 2/15/2024

Model Overview

The model uses a Conformer architecture trained with the RNN-T loss. Dynamic chunk training lets the same model transcribe in streaming mode, and in full-context mode it reaches a 2.72% word error rate on the LibriSpeech test-clean set.

Model Features

Streaming & Non-streaming Support: Dynamic chunk training lets the model run with different chunk sizes, trading latency against accuracy.
High-performance Recognition: Achieves a 2.72% word error rate on the LibriSpeech test-clean set.
Dynamic Chunk Convolution: Uses dynamic chunk convolution so that a single model handles both streaming and non-streaming processing.
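
The latency side of the chunk-size trade-off can be made concrete with a little arithmetic. The sketch below is not part of the model or of the SpeechBrain API; the helper names `chunk_samples` and `iter_chunks` are hypothetical, and it assumes 16 kHz mono input (the LibriSpeech sampling rate). It only shows how much audio must be buffered before each chunk can be passed to a streaming recognizer.

```python
from typing import Iterator, Sequence

SAMPLE_RATE_HZ = 16_000  # assumption: 16 kHz input, as in LibriSpeech


def chunk_samples(chunk_ms: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Number of audio samples covered by one chunk of `chunk_ms` milliseconds."""
    return sample_rate_hz * chunk_ms // 1000


def iter_chunks(
    samples: Sequence[float],
    chunk_ms: int,
    sample_rate_hz: int = SAMPLE_RATE_HZ,
) -> Iterator[Sequence[float]]:
    """Yield consecutive fixed-size chunks; the final chunk may be shorter."""
    size = chunk_samples(chunk_ms, sample_rate_hz)
    if size <= 0:
        raise ValueError("chunk_ms is too small for this sample rate")
    for start in range(0, len(samples), size):
        yield samples[start:start + size]


# Three seconds of silence stand in for a live audio feed (illustrative data).
audio = [0.0] * (3 * SAMPLE_RATE_HZ)
chunks = list(iter_chunks(audio, chunk_ms=960))
```

A larger chunk gives the recognizer more context per step but delays the first output by at least one chunk duration, which is why the choice of chunk size affects both latency and accuracy.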

Model Capabilities

English speech recognition
Real-time streaming transcription
Offline audio file transcription

Use Cases

Speech-to-Text
Real-time Meeting Minutes: Real-time transcription of meetings or lectures. Achieves a 3.13% word error rate with a 960 ms chunk size.
Audio File Transcription: Converts pre-recorded English audio files to text. Achieves a 2.72% word error rate in full-context mode.
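
For readers unfamiliar with the metric behind these figures, word error rate is the word-level edit distance (substitutions, deletions, and insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The function below is a minimal illustrative sketch with a hypothetical name, `word_error_rate`; it is not the scoring tool used to produce the numbers on this page, and it assumes whitespace tokenization with no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (minus the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
example = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

On this scale, a 2.72% word error rate corresponds to roughly 27 word errors per 1,000 reference words.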