ASR Streaming Conformer GigaSpeech

Developed by SpeechBrain
An English automatic speech recognition (ASR) model pre-trained on the GigaSpeech dataset. It supports both streaming and non-streaming transcription.
Downloads 66
Release Time: 11/6/2024

Model Overview

This is an end-to-end automatic speech recognition system based on the Conformer architecture and trained with RNN-T loss. It was trained with dynamic chunk training, which lets the same model transcribe in streaming mode.
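
Dynamic chunk training is commonly described as restricting self-attention so that each frame sees only its own chunk plus a limited number of earlier chunks, with the chunk size varied during training. The following is a minimal sketch of such a mask under that assumption. The function name and parameters are illustrative and are not taken from this model's code or from the SpeechBrain API.

```python
def chunked_attention_mask(num_frames, chunk_size, left_context_chunks):
    """Return mask[i][j] == True when frame i may attend to frame j.

    Hypothetical helper: a frame sees its own chunk plus up to
    `left_context_chunks` earlier chunks, and nothing in the future.
    """
    mask = []
    for i in range(num_frames):
        chunk_idx = i // chunk_size
        # Earliest visible frame: start of the oldest allowed left-context chunk.
        start = max(0, (chunk_idx - left_context_chunks) * chunk_size)
        # Latest visible frame: end of the current chunk (no further look-ahead).
        end = min(num_frames, (chunk_idx + 1) * chunk_size)
        mask.append([start <= j < end for j in range(num_frames)])
    return mask


# Toy example: 6 frames, 2 frames per chunk, 1 chunk of left context.
mask = chunked_attention_mask(num_frames=6, chunk_size=2, left_context_chunks=1)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

Each printed row marks with `x` the frames that one frame may attend to. Varying `chunk_size` during training is what would let a single model run at several chunk sizes at inference time.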

Model Features

Streaming support: Trained with dynamic chunk training, so it can perform streaming transcription at different chunk sizes.
High performance: Achieves a word error rate of 11.00% on the GigaSpeech test set (non-streaming mode).
Flexible configuration: Can be adjusted to trade latency against accuracy as requirements dictate.
Suitable for multiple scenarios: Supports both offline transcription and real-time streaming transcription.
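
To make the latency side of that trade-off concrete, here is a rough sketch of how a chunk duration maps to audio samples and to the number of decoding steps. It assumes 16 kHz mono audio, which this page does not state. The 320 ms value is an arbitrary comparison point and `split_into_chunks` is a hypothetical helper; only the 960 ms chunk size appears in the reported results.

```python
SAMPLE_RATE = 16_000  # assumption: 16 kHz mono audio (not stated in the source)


def split_into_chunks(samples, chunk_ms, sample_rate=SAMPLE_RATE):
    """Yield consecutive slices of `samples`, each `chunk_ms` long.

    The final slice may be shorter when the audio length is not a
    multiple of the chunk length.
    """
    chunk_len = sample_rate * chunk_ms // 1000
    for start in range(0, len(samples), chunk_len):
        yield samples[start:start + chunk_len]


# Three seconds of silence as a stand-in for real audio.
audio = [0.0] * (SAMPLE_RATE * 3)

for chunk_ms in (320, 960):  # 320 ms is a hypothetical comparison value
    chunks = list(split_into_chunks(audio, chunk_ms))
    print(f"{chunk_ms} ms -> {len(chunks)} chunks, {len(chunks[0])} samples each")
```

A streaming recognizer cannot emit text for a chunk until the whole chunk has arrived, so the chunk duration sets a lower bound on latency. Smaller chunks return text sooner but give the model less context per step.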

Model Capabilities

English speech recognition
Real-time streaming transcription
Offline batch transcription
Dynamic chunk processing

Use Cases

Speech transcription

Real-time speech-to-text: Used for real-time meeting records or live caption generation. Achieves a word error rate of 11.53% at a chunk size of 960 ms.
Audio file transcription: Batch-processes audio files and converts them into text. Achieves a word error rate of 11.00% in non-streaming mode.
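
For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. The sketch below implements that standard definition on a toy sentence pair. It is not the evaluation script behind the figures above, and the example sentences are invented.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of five reference words.
wer = word_error_rate("the meeting starts at nine", "the meeting start at nine")
print(f"WER: {wer:.2%}")
```

By this measure, the reported streaming result at 960 ms is 0.53 percentage points higher than the non-streaming result (11.53% versus 11.00%).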