Parakeet TDT 1.1B

Developed by NVIDIA
Parakeet TDT 1.1B is an automatic speech recognition (ASR) model jointly developed by NVIDIA NeMo and Suno.ai, capable of transcribing speech into lowercase English letters.
Downloads 12.27k
Release Time: 1/25/2024

Model Overview

This is an automatic speech recognition model based on the FastConformer-TDT architecture, with approximately 1.1 billion parameters, designed for efficient speech transcription.

Model Features

Efficient Architecture
Uses the FastConformer-TDT architecture, whose encoder downsamples the input 8x with depthwise separable convolutions to reduce the number of frames processed
Fast Inference
The TDT (Token-and-Duration Transducer) decoder predicts each token together with its duration, so it can skip frames during decoding and speed up inference
Large-scale Training
Trained on 64K hours of English speech data, including various public and private datasets
Multi-domain Applicability
Evaluated on test sets from different domains, with WERs ranging from 1.39% (LibriSpeech) to 15.90% (AMI)
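
The frame-skipping idea behind the TDT feature above can be illustrated with a toy greedy decoder. This is only a sketch under stated assumptions, not NeMo's implementation: the `joint` callable, the `BLANK` id, the scripted predictions, and the 10 ms input hop are all hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

BLANK = -1  # hypothetical id for the blank token


def frame_ms(input_hop_ms: float = 10.0, downsampling: int = 8) -> float:
    """Time covered by one encoder frame, assuming a 10 ms feature hop."""
    return input_hop_ms * downsampling


def tdt_greedy_decode(
    num_frames: int,
    joint: Callable[[int, List[int]], Tuple[int, int]],
    max_symbols_per_frame: int = 5,
) -> Tuple[List[int], int]:
    """Greedy decoding where each step yields (token, duration).

    The decoder advances by `duration` frames instead of one frame at a
    time, which is where the speed-up comes from.
    """
    t, tokens, joint_calls, stuck = 0, [], 0, 0
    while t < num_frames:
        token, duration = joint(t, tokens)
        joint_calls += 1
        if token != BLANK:
            tokens.append(token)
        if duration == 0:
            # A zero duration keeps us on the same frame; force progress
            # for blanks or after too many emissions on one frame.
            stuck += 1
            if token == BLANK or stuck >= max_symbols_per_frame:
                duration = 1
        if duration > 0:
            stuck = 0
            t += duration
    return tokens, joint_calls


# Scripted predictions standing in for a trained joint network.
SCRIPT: Dict[int, Tuple[int, int]] = {
    0: (7, 3),      # emit token 7, jump 3 frames
    3: (BLANK, 4),  # silence, jump 4 frames
    7: (8, 2),      # emit token 8, jump 2 frames
    9: (BLANK, 3),  # trailing silence, jump to the end
}


def toy_joint(t: int, history: List[int]) -> Tuple[int, int]:
    return SCRIPT[t]


tokens, calls = tdt_greedy_decode(12, toy_joint)
print(tokens, calls)  # [7, 8] 4
print(frame_ms())     # 80.0
```

In this toy run the decoder covers 12 encoder frames with 4 joint evaluations, whereas a frame-by-frame transducer would need at least 12. Under the assumed 10 ms hop, 8x downsampling means each encoder frame spans 80 ms of audio.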

Model Capabilities

Speech Recognition
Audio Transcription
English Speech Processing
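
A minimal sketch of how transcription might be run through NVIDIA NeMo. It assumes the `nemo_toolkit[asr]` package is installed and that the checkpoint is published under the id `nvidia/parakeet-tdt-1.1b`; neither is stated on this page. The helper names `load_model` and `transcribe_files` are hypothetical, and the return shape of `transcribe` differs between NeMo releases, so the unwrapping below is a defensive guess rather than documented behavior.

```python
from typing import Any, List, Optional, Sequence

# Assumed checkpoint id; not given in this page.
MODEL_NAME = "nvidia/parakeet-tdt-1.1b"


def load_model(model_name: str = MODEL_NAME) -> Any:
    """Download and load the pretrained model (requires NeMo)."""
    # Imported lazily so this module can be loaded without NeMo installed.
    import nemo.collections.asr as nemo_asr

    return nemo_asr.models.ASRModel.from_pretrained(model_name=model_name)


def transcribe_files(paths: Sequence[str], model: Optional[Any] = None) -> List[str]:
    """Return one transcript string per audio file."""
    model = model if model is not None else load_model()
    results = model.transcribe(list(paths))
    # Some releases return (best_hypotheses, all_hypotheses) for transducers.
    if isinstance(results, tuple):
        results = results[0]
    # Some releases return hypothesis objects carrying a .text attribute.
    return [r if isinstance(r, str) else r.text for r in results]
```

Calling `transcribe_files(["meeting.wav"])` (a hypothetical file name) would load the model on first use and return lowercase English text for each file.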

Use Cases

Speech Transcription
Meeting Minutes
Automatically transcribes meeting audio content
Achieves a WER of 15.90% on the AMI test set
Speech-to-Text
Converts speech content into editable text
Achieves a WER as low as 1.39% on the LibriSpeech test set
Speech Analysis
Speech Data Analysis
Processes and analyzes large-scale speech data
Achieves a WER of 9.55% on the GigaSpeech test set
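
The WER figures above count word-level substitutions, deletions, and insertions relative to the number of reference words. The sketch below shows one way to compute it. The `normalize` step (lowercasing and stripping punctuation, since the model outputs lowercase letters only) is an assumption for illustration and is not necessarily the normalization used to produce the reported numbers.

```python
import re
from typing import List


def normalize(text: str) -> List[str]:
    """Lowercase and drop everything except letters, apostrophes, spaces."""
    text = re.sub(r"[^a-z' ]+", " ", text.lower())
    return text.split()


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, via word-level edit distance."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the ref prefix so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return 100.0 * prev[-1] / len(ref)


print(wer("Hello, world!", "hello world"))  # 0.0
print(round(wer("The cat sat on the mat.", "the cat sat on a mat"), 2))  # 16.67
```

Corpus-level WER is normally computed by summing the errors and reference words over all utterances before dividing, not by averaging per-utterance scores.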