Parakeet TDT-CTC 1.1B

Developed by NVIDIA
Parakeet TDT-CTC 1.1B is an automatic speech recognition model capable of transcribing English speech with punctuation and capitalization, jointly developed by NVIDIA NeMo and Suno.ai.
Downloads 35.19k
Release Date: 5/7/2024

Model Overview

This is an automatic speech recognition (ASR) model based on the Hybrid FastConformer TDT-CTC architecture. It can transcribe recordings up to 11 hours long in a single pass.

Model Features

Efficient Long Audio Processing
Transcribes up to 11 hours of audio in a single pass. Transcribing 90 minutes of audio takes less than 16 seconds on an A100 GPU.
High Accuracy Transcription
Achieves low word error rates (WER) on multiple test sets, reaching 1.82% on the LibriSpeech test set.
Punctuation and Capitalization Support
Automatically adds punctuation and capitalization to the transcript.
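
The speed figure above can be restated as an inverse real-time factor, that is, seconds of audio processed per second of compute. The sketch below only redoes that arithmetic from the two numbers quoted (90 minutes, 16 seconds); the function name is illustrative, and because the source says "less than 16 seconds", the result is a lower bound rather than a measured value.

```python
def inverse_real_time_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Seconds of audio transcribed per second of wall-clock processing time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# 90 minutes of audio in (at most) 16 seconds, as quoted for an A100.
audio_seconds = 90 * 60
processing_seconds = 16
lower_bound = inverse_real_time_factor(audio_seconds, processing_seconds)
print(f"At least {lower_bound:.1f}x faster than real time")
```

This figure should not be extrapolated to other hardware or to the 11-hour limit, since the text reports throughput only for the 90-minute case.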

Model Capabilities

English speech recognition
Long audio transcription
Automatic punctuation
Automatic capitalization
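
A minimal sketch of how these capabilities might be exercised through the NVIDIA NeMo toolkit, assuming NeMo is installed (`pip install "nemo_toolkit[asr]"`). The checkpoint ID `nvidia/parakeet-tdt_ctc-1.1b` and the helper name `transcribe_files` are assumptions, not taken from the text above, and the shape of the returned value (plain strings or hypothesis objects) depends on the NeMo version.

```python
# Assumed Hugging Face checkpoint ID; it does not appear in the text above.
MODEL_NAME = "nvidia/parakeet-tdt_ctc-1.1b"


def transcribe_files(audio_paths, model_name=MODEL_NAME):
    """Transcribe audio files with a pretrained NeMo ASR checkpoint.

    Returns whatever `transcribe` returns for the installed NeMo version.
    """
    # Imported lazily so this module can be loaded without NeMo installed.
    import nemo.collections.asr as nemo_asr

    # Downloads the checkpoint on first use, then loads it from the local cache.
    asr_model = nemo_asr.models.ASRModel.from_pretrained(model_name=model_name)
    return asr_model.transcribe(list(audio_paths))
```

A call such as `transcribe_files(["meeting.wav"])`, where `meeting.wav` is a placeholder path, would then return the transcript with punctuation and capitalization already applied by the model.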

Use Cases

Speech Transcription
  Meeting Minutes
    Automatically transcribes business meetings
    Achieves a WER of 15.94% on the AMI meeting test set
  Academic Lecture Transcription
    Transcribes university lectures and academic talks
    Achieves a WER of 3.87% on the TED-LIUM v3 test set
Media Content Processing
  Podcast Transcription
    Automatically converts podcast audio into text
    Achieves a WER of 6.19% on the VoxPopuli test set
  Film and TV Subtitle Generation
    Generates subtitles for film and TV content
    Achieves a WER of 1.82% on the LibriSpeech test set
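
For readers unfamiliar with the metric quoted in these results, word error rate is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the model output into the reference transcript, divided by the number of reference words. The sketch below is a generic implementation of that definition using a word-level edit distance. It is not the evaluation code behind the figures above, and the text does not say what normalization (for example case or punctuation stripping) was applied before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion (reference word missing)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref_words)


# One substituted word out of four reference words.
print(word_error_rate("a b c d", "a x c d"))
```

Because this comparison is case- and punctuation-sensitive, scoring a model that emits punctuation and capitalization against a lowercase reference would inflate the WER unless both sides are normalized first.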