
Parakeet TDT 0.6B V2

Developed by NVIDIA
An automatic speech recognition model with 600 million parameters that supports English transcription, punctuation, capitalization, and timestamp prediction.
Downloads 242.71k
Release Date: 4/15/2025

Model Overview

Parakeet TDT 0.6B V2 is an automatic speech recognition (ASR) model for English transcription. In addition to the transcript, it predicts timestamps and adds punctuation and capitalization automatically.
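As a rough illustration of this workflow, the sketch below loads the model through NVIDIA NeMo and requests timestamps along with the transcript. It assumes the `nemo_toolkit[asr]` package is installed and that the checkpoint is published as `nvidia/parakeet-tdt-0.6b-v2`. The helper names are hypothetical, and the exact shape of the returned timestamp dictionaries may differ between NeMo versions.

```python
def transcribe_with_timestamps(audio_path, model_name="nvidia/parakeet-tdt-0.6b-v2"):
    """Transcribe one audio file and return (text, segment timestamps).

    Assumes NeMo's ASR collection is installed; the model name is an
    assumption about where the checkpoint is hosted.
    """
    # Imported lazily so the formatting helper below works without NeMo.
    import nemo.collections.asr as nemo_asr

    asr_model = nemo_asr.models.ASRModel.from_pretrained(model_name=model_name)
    # timestamps=True requests timing information alongside the text.
    output = asr_model.transcribe([audio_path], timestamps=True)
    hypothesis = output[0]
    return hypothesis.text, hypothesis.timestamp["segment"]


def format_segment(stamp):
    """Render one segment dict (assumed keys: start, end, segment) as a line."""
    return f"{stamp['start']:.2f}s - {stamp['end']:.2f}s : {stamp['segment']}"
```

Calling `transcribe_with_timestamps("sample.wav")` (a hypothetical file name) would return the transcript and a list of segment dictionaries, each of which can be printed with `format_segment`.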

Model Features

Precise Timestamp Prediction: Supports word-level, character-level, and segment-level timestamp prediction.
Automatic Punctuation and Capitalization: Adds punctuation and capitalization to the transcribed text.
Long Audio Processing Capability: Processes audio segments of up to 24 minutes in a single pass.
Robust Performance: Transcribes spoken numbers and song lyrics robustly.
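Segment-level timestamps map naturally onto subtitle cues. The following is a minimal sketch of that conversion into the SubRip (SRT) format. It assumes each segment is a dictionary with `start` and `end` times in seconds and the text under a `segment` key; that input shape and the function names are illustrative assumptions, not a documented interface of the model.

```python
def to_srt_time(seconds):
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text from dicts with 'start', 'end' (seconds) and 'segment'."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{to_srt_time(seg['start'])} --> {to_srt_time(seg['end'])}\n"
            f"{seg['segment']}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(cues)
```

Segments that run too long for comfortable reading could be split further using the word-level timestamps, which this sketch does not attempt.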

Model Capabilities

Speech-to-Text
Timestamp Prediction
Punctuation Restoration
Capitalization Restoration

Use Cases

Conversational AI
  Voice Assistants: Build intelligent assistants with voice interaction.
Transcription Services
  Meeting Minutes: Automatically transcribe meeting audio (WER of 11.16 on the AMI test set).
  Subtitle Generation: Automatically generate subtitles for video content.
Speech Analysis
  Speech Data Analysis: Analyze speech data to extract insights.
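For readers unfamiliar with the WER figure quoted for meeting transcription: word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the hypothesis into the reference, divided by the number of reference words, and is usually reported as a percentage. The sketch below computes it with a standard edit-distance recurrence. It is a generic illustration, not the evaluation script behind the reported number, and it skips any text normalization (casing, punctuation) that a real benchmark would likely apply.

```python
def word_error_rate(reference, hypothesis):
    """Return WER in percent between two whitespace-tokenized strings."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (one fewer than the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)
```

For example, a hypothesis that changes one word of a four-word reference has one error out of four reference words, so the function returns 25.0.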