Parakeet CTC 1.1B

Developed by NVIDIA
Parakeet CTC 1.1B is an automatic speech recognition model jointly developed by NVIDIA NeMo and Suno.ai, based on the FastConformer architecture with approximately 1.1 billion parameters, supporting English speech transcription.
Downloads: 14.78k
Release Time: 12/28/2023

Model Overview

This model is an automatic speech recognition (ASR) system that transcribes speech into lowercase English text. It uses an optimized FastConformer architecture and is trained with the CTC loss function.
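As an illustration of how a CTC-trained model turns per-frame predictions into text, the following is a minimal sketch of greedy CTC decoding: consecutive repeated labels are merged, then blank labels are dropped. The toy character vocabulary, the blank index, and the frame IDs are all hypothetical and are not taken from the actual model.

```python
from itertools import groupby

# Hypothetical toy vocabulary; index 0 is assumed to be the CTC blank label.
BLANK = 0
VOCAB = {1: "c", 2: "a", 3: "t", 4: " "}


def ctc_greedy_decode(frame_ids, blank=BLANK, vocab=VOCAB):
    """Collapse a sequence of per-frame label IDs into text.

    Step 1: merge runs of identical consecutive labels.
    Step 2: remove blank labels.
    """
    collapsed = [label for label, _ in groupby(frame_ids)]
    return "".join(vocab[label] for label in collapsed if label != blank)


# Per-frame argmax IDs for a short utterance (made-up values).
print(ctc_greedy_decode([1, 1, 0, 2, 2, 2, 0, 0, 3]))  # cat
```

A blank between two identical labels keeps them separate, which is how CTC can emit doubled letters.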

Model Features

Large-scale training data
Trained on 64K hours of English speech data, including 40K hours of private data and 24K hours of public datasets
Optimized FastConformer architecture
Uses an optimized version of the Conformer architecture with 8x depthwise-separable convolutional downsampling, which improves processing efficiency
Multi-domain adaptability
Performs well across a range of speech datasets, including meeting speech, telephone speech, and public speaking
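To give a rough sense of what 8x downsampling means for sequence length, the sketch below estimates how many encoder frames an utterance produces. The 10 ms feature hop is an assumption (a common choice for spectrogram front ends, not stated on this page), and the exact count in a real model also depends on padding, so treat the result as an approximation.

```python
def encoder_frames(duration_s, hop_ms=10.0, downsample=8):
    """Approximate number of encoder frames for an utterance.

    hop_ms is an assumed feature hop size; downsample is the
    downsampling factor applied before the encoder layers.
    """
    feature_frames = int(duration_s * 1000 / hop_ms)
    return feature_frames // downsample


# A 10-second clip: 1000 feature frames reduced to 125 encoder frames.
print(encoder_frames(10.0))                # 125
# The same clip with 4x downsampling would yield twice as many frames.
print(encoder_frames(10.0, downsample=4))  # 250
```

Fewer frames per second means shorter sequences for the attention layers, which is where the efficiency gain comes from.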

Model Capabilities

English speech recognition
Audio transcription
Speech-to-text

Use Cases

Speech transcription
Meeting minutes
Automatically transcribe business meeting content
Achieves a WER of 15.62 on the AMI meeting test set
Call recording transcription
Convert telephone conversation content into text
Performs well on the Switchboard dataset
Media processing
Podcast transcription
Automatically generate transcripts for podcast episodes
Achieves a WER of 1.83 to 3.54 across the LibriSpeech test sets
Video caption generation
Automatically generate subtitles for video content
Achieves a WER of 6.53 on the VoxPopuli test set
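The WER (word error rate) figures above are conventionally reported as percentages: the number of word substitutions, deletions, and insertions needed to turn the model output into the reference transcript, divided by the number of reference words. The following is a minimal sketch of that calculation using word-level edit distance. The function name and the example sentences are illustrative, and published benchmarks typically apply additional text normalization that is not shown here.

```python
def wer(reference, hypothesis):
    """Word error rate in percent, via word-level Levenshtein distance."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors in 6 words.
print(round(wer("the cat sat on the mat", "the cat sit on mat"), 2))  # 33.33
```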