Whisper Large V2

Developed by OpenAI
Whisper is a pre-trained automatic speech recognition (ASR) and speech translation model, trained on 680,000 hours of labeled data, that generalizes well across datasets and domains
Downloads 176.55k
Release Time: 12/5/2022

Model Overview

A Transformer-based encoder-decoder model supporting multilingual speech recognition and translation tasks, adaptable to various datasets without fine-tuning

Model Features

Large-scale weakly supervised training
Trained on 680,000 hours of labeled data covering multiple languages and domains
Zero-shot learning capability
Adaptable to new datasets and domains without fine-tuning
Multi-task support
Simultaneously supports speech recognition and speech translation tasks
Long audio processing
Supports transcription of arbitrary-length audio through chunk processing
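The chunk processing mentioned under "Long audio processing" can be illustrated with a minimal sketch. Whisper works on 30-second windows of 16 kHz audio, so a longer recording is split into windows that overlap slightly, each window is transcribed, and the results are merged. The function name `chunk_spans` and the 5-second overlap are assumptions made for illustration, not values taken from this page.

```python
from typing import List, Tuple


def chunk_spans(
    num_samples: int,
    sample_rate: int = 16_000,   # Whisper expects 16 kHz audio
    chunk_s: float = 30.0,       # Whisper's input window length
    overlap_s: float = 5.0,      # assumed overlap, chosen for illustration
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of overlapping chunks."""
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")
    chunk = int(chunk_s * sample_rate)
    overlap = int(overlap_s * sample_rate)
    if chunk <= 0 or not 0 <= overlap < chunk:
        raise ValueError("need chunk_s > 0 and 0 <= overlap_s < chunk_s")

    step = chunk - overlap  # how far each new chunk advances
    spans: List[Tuple[int, int]] = []
    start = 0
    while start < num_samples:
        end = min(start + chunk, num_samples)
        spans.append((start, end))
        if end == num_samples:  # last chunk reached the end of the audio
            break
        start += step
    return spans


if __name__ == "__main__":
    # A 70-second recording at 16 kHz is covered by three overlapping chunks.
    for start, end in chunk_spans(70 * 16_000):
        print(f"{start / 16_000:.0f}s to {end / 16_000:.0f}s")
```

The overlap gives neighbouring chunks shared context so that words cut at a boundary can be reconciled when the partial transcripts are merged. In the Hugging Face transformers speech recognition pipeline, this kind of chunking is enabled with the `chunk_length_s` argument.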

Model Capabilities

English speech recognition
Multilingual speech recognition
Speech-to-English translation
Long audio transcription
Timestamped transcription
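The capabilities above map onto a small set of options when the model is run through the Hugging Face transformers pipeline. The following is a sketch under stated assumptions: the helper names `build_call_kwargs` and `run_whisper` are hypothetical, the model id `openai/whisper-large-v2` is the usual Hub name for this checkpoint, and running `run_whisper` requires `transformers`, `torch`, and `ffmpeg` to be installed and downloads the model weights.

```python
from typing import Any, Dict, Optional

MODEL_ID = "openai/whisper-large-v2"  # assumed Hugging Face Hub id


def build_call_kwargs(
    task: str = "transcribe",
    language: Optional[str] = None,
    timestamps: bool = False,
) -> Dict[str, Any]:
    """Build call arguments for the ASR pipeline (hypothetical helper).

    task="transcribe" keeps the spoken language; task="translate"
    produces English text. language=None lets the model detect it.
    """
    if task not in ("transcribe", "translate"):
        raise ValueError("task must be 'transcribe' or 'translate'")
    generate_kwargs: Dict[str, Any] = {"task": task}
    if language is not None:
        generate_kwargs["language"] = language
    return {"return_timestamps": timestamps, "generate_kwargs": generate_kwargs}


def run_whisper(audio_path: str, **options: Any) -> Dict[str, Any]:
    """Transcribe or translate one audio file (downloads the model)."""
    # Imported lazily so the helper above works without transformers.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=MODEL_ID,
        chunk_length_s=30,  # enables long-audio chunking
    )
    return asr(audio_path, **build_call_kwargs(**options))
```

For example, `run_whisper("meeting.wav", task="translate", language="french", timestamps=True)` would request an English translation of French speech with timestamps, where `meeting.wav` is a placeholder file name. With timestamps enabled, the pipeline result typically carries the full text plus a list of timestamped chunks.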

Use Cases

Speech transcription
Meeting minutes
Automatically convert meeting recordings into text transcripts
Supports transcription in 98 languages
Podcast subtitle generation
Automatically generate subtitles for podcast content
English transcription WER 3.0% (LibriSpeech test set)
Speech translation
Real-time translation
Translate foreign-language speech into English text in real time
Supports translation from multiple languages, such as French, into English
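The WER figure quoted under speech transcription is a word error rate: the number of word substitutions, deletions, and insertions needed to turn the model output into the reference transcript, divided by the number of reference words. A minimal sketch of that calculation follows. The function name `word_error_rate` is hypothetical, and the only text normalization applied here is lowercasing, which is an assumption; published evaluations usually apply more thorough normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words (standard Levenshtein recurrence).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion of a reference word
                    curr[j - 1] + 1,     # insertion of a hypothesis word
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

A WER of 3.0% therefore corresponds to roughly three word errors per hundred reference words.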