Parakeet CTC 0.6B

Developed by NVIDIA
Parakeet CTC 0.6B is an automatic speech recognition model jointly developed by NVIDIA NeMo and Suno.ai, based on the FastConformer architecture with approximately 600 million parameters, supporting English speech transcription.
Downloads 6,528
Release Time: 12/28/2023

Model Overview

This model is a high-performance automatic speech recognition system capable of accurately transcribing English speech into text, suitable for various speech recognition scenarios.

Model Features

High-Performance Speech Recognition
Built on the FastConformer architecture, an optimized Conformer variant that uses 8x depthwise-separable convolutional downsampling for efficient speech recognition.
Large-Scale Training Data
Trained on 64K hours of English speech data, including multiple public and private datasets, covering various speech scenarios.
Low Word Error Rate
Achieves low word error rates (WER) on multiple test sets, for example a WER as low as 1.87% on the LibriSpeech test set.
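
For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal illustration of that definition, not the evaluation script behind the reported numbers; the function name `word_error_rate` is hypothetical, and it assumes plain whitespace tokenization with no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are split on whitespace and compared exactly;
    real evaluations usually normalize case and punctuation first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, one wrong word in a four-word reference gives a WER of 25%, so a WER of 1.87% corresponds to roughly two word errors per hundred reference words.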

Model Capabilities

English Speech Recognition
Audio Transcription
Supports 16kHz Mono Audio Input
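
Because the model expects 16 kHz mono audio, it can help to check a file before transcribing it. The following is a minimal sketch using only Python's standard `wave` module; the helper name `is_16k_mono` is hypothetical, and it assumes uncompressed PCM WAV input (other formats would need to be converted first).

```python
import wave

TARGET_RATE_HZ = 16_000  # sample rate stated in the capabilities list
TARGET_CHANNELS = 1      # mono


def is_16k_mono(path: str) -> bool:
    """Return True if the WAV file at `path` is 16 kHz, single-channel."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == TARGET_RATE_HZ
            and wav.getnchannels() == TARGET_CHANNELS
        )
```

Files that fail this check would need resampling or downmixing with an audio tool of your choice before being passed to the model.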

Use Cases

Speech Transcription
Meeting Minutes
Automatically transcribe meeting recordings to improve meeting documentation efficiency.
Achieved a WER of 16.3% on the AMI meeting test set
Speech-to-Text
Convert speech content into editable text format.
Achieved a WER of 1.87% to 3.76% on the LibriSpeech test sets
Speech Analysis
Speech Content Analysis
Analyze speech content to extract key information.
Reported WERs range from 1.87% on LibriSpeech to 16.3% on the AMI meeting test set
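
The transcription use cases above can be sketched with the NVIDIA NeMo toolkit. This is a hedged example rather than an official recipe: it assumes NeMo is installed (for example via `pip install "nemo_toolkit[asr]"`) and that the checkpoint is published under the identifier `nvidia/parakeet-ctc-0.6b`, which is not stated on this page. The wrapper name `transcribe_files` is hypothetical.

```python
DEFAULT_MODEL_NAME = "nvidia/parakeet-ctc-0.6b"  # assumed checkpoint identifier


def transcribe_files(paths, model_name=DEFAULT_MODEL_NAME):
    """Transcribe 16 kHz mono audio files with a pretrained NeMo CTC model.

    NeMo is imported lazily so that this module can be loaded without it.
    The return value is whatever `transcribe` returns, which depends on
    the installed NeMo version (plain strings or hypothesis objects).
    """
    import nemo.collections.asr as nemo_asr  # requires nemo_toolkit[asr]

    # Downloads the checkpoint on first use, then loads it.
    asr_model = nemo_asr.models.EncDecCTCModelBPE.from_pretrained(
        model_name=model_name
    )
    return asr_model.transcribe(list(paths))
```

A call such as `transcribe_files(["meeting.wav"])` (with a hypothetical file name) would return one result per input file. Loading the model once and reusing it is preferable when transcribing many batches.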