
STT En FastConformer CTC XLarge

Developed by NVIDIA
NVIDIA FastConformer-CTC XLarge is an automatic speech recognition (ASR) model with approximately 600 million parameters. It is built on the FastConformer architecture, trained with CTC loss, and designed for English speech transcription.
Downloads 216
Release Time: 6/12/2023

Model Overview

This model transcribes English speech into lowercase text. It performs well on multiple public datasets, including a 1.8% word error rate on the LibriSpeech test set (clean), and is suited to general audio transcription tasks.

Model Features

Optimized FastConformer architecture: Uses 8x downsampling with depthwise separable convolutions, making it more efficient than the standard Conformer model
Trained on multiple datasets: Trained on a composite dataset containing thousands of hours of English speech, covering various domains and accents
High accuracy: Achieves a word error rate of 1.8% (clean) and 3.65% (other) on the LibriSpeech test set
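
The word error rate figures above are conventionally computed as the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch of that calculation in plain Python. The function name `word_error_rate` and the lowercasing plus whitespace tokenization are assumptions made for illustration; this is not the evaluation script behind the reported numbers.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumption: lowercase and split on whitespace; real evaluations may
    # apply additional text normalization before scoring.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)
```

For example, scoring the hypothesis "the cat sat on a mat" against the reference "the cat sat on the mat" counts one substitution over six reference words, a word error rate of about 16.7%.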

Model Capabilities

English speech recognition
Audio transcription
Supports 16 kHz mono audio input
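
Because the model expects 16 kHz mono input, it can help to check a recording before sending it for transcription. The sketch below uses Python's standard `wave` module to inspect an uncompressed WAV file. The helper name `check_wav_format` and its return format are hypothetical, and other formats such as MP3 or FLAC would need a different reader.

```python
import wave


def check_wav_format(path, expected_rate=16000, expected_channels=1):
    """Return a list of mismatches between a WAV file and the expected format.

    An empty list means the file is already at the expected sample rate
    and channel count.
    """
    # The wave module reads uncompressed PCM WAV files only.
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()

    problems = []
    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
    if channels != expected_channels:
        problems.append(f"file has {channels} channel(s), expected {expected_channels}")
    return problems
```

A file that returns a non-empty list would need to be resampled or downmixed with an audio tool of your choice before transcription.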

Use Cases

Speech transcription
Meeting minutes: Automatically transcribe meeting recordings into written records, with highly accurate transcription results
Voice notes: Convert voice notes into searchable text

Assistive technology
Real-time subtitle generation: Generate real-time subtitles for videos or live content