Whisper Small

Published by unsloth (Whisper was originally developed by OpenAI)
Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680,000 hours of labeled audio, it generalizes well to many datasets and domains.
Downloads 50
Release Time: 5/14/2025

Model Overview

Whisper Small is a Transformer-based encoder-decoder model for multilingual speech recognition and speech translation. It adapts to a wide range of datasets and domains without fine-tuning.

Model Features

Large-scale Weakly Supervised Training
Trained on 680,000 hours of diverse speech data covering multiple languages and accents
Zero-shot Transfer Capability
Performs well on new datasets and domains without fine-tuning
Multi-task Unified Architecture
Single model supports both speech recognition and translation tasks
Long Audio Processing
Processes audio in 30-second windows; a chunking algorithm extends transcription to audio of arbitrary length
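
The chunking idea behind long audio processing can be illustrated with a small Python sketch. It is not the algorithm used by any particular library, only a simplified version under stated assumptions: audio is sampled at 16 kHz, windows are 30 seconds long, and neighbouring windows overlap by 5 seconds so that words cut at a boundary appear whole in at least one window. The function name chunk_bounds and the 5-second overlap are illustrative choices, not values taken from this page.

```python
from typing import List, Tuple

SAMPLE_RATE = 16_000  # Whisper operates on 16 kHz audio


def chunk_bounds(
    num_samples: int,
    chunk_s: float = 30.0,
    overlap_s: float = 5.0,  # assumed overlap, chosen for illustration
    sample_rate: int = SAMPLE_RATE,
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of overlapping windows covering the audio."""
    chunk = int(chunk_s * sample_rate)
    overlap = int(overlap_s * sample_rate)
    if chunk <= 0 or not 0 <= overlap < chunk:
        raise ValueError("need chunk_s > 0 and 0 <= overlap_s < chunk_s")

    step = chunk - overlap  # how far each new window advances
    bounds: List[Tuple[int, int]] = []
    start = 0
    while start < num_samples:
        end = min(start + chunk, num_samples)
        bounds.append((start, end))
        if end == num_samples:  # the last window reached the end of the audio
            break
        start += step
    return bounds


if __name__ == "__main__":
    # A 70-second recording is covered by three windows.
    for start, end in chunk_bounds(70 * SAMPLE_RATE):
        print(f"{start / SAMPLE_RATE:.0f}s to {end / SAMPLE_RATE:.0f}s")
```

Each window would then be transcribed separately and the overlapping regions reconciled when the partial transcripts are merged; that merging step is left out of this sketch.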

Model Capabilities

Speech-to-text
Cross-language speech translation
Multilingual recognition
Timestamped transcription
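
As a rough illustration of how these capabilities might be selected in practice, the sketch below uses the Hugging Face transformers speech recognition pipeline. Several things here are assumptions rather than facts from this page: the model identifier openai/whisper-small stands in for this listing's checkpoint (the exact repository name is not given here), and the helper names build_call_kwargs and run are hypothetical. The transformers import is deferred so the helper can be inspected without the library installed.

```python
from typing import Any, Dict, Optional

# Assumption: the original OpenAI checkpoint is used as a stand-in;
# this page does not state the repository id of the listed model.
MODEL_ID = "openai/whisper-small"


def build_call_kwargs(
    task: str = "transcribe",
    language: Optional[str] = None,
    timestamps: bool = False,
) -> Dict[str, Any]:
    """Map a capability from the list above to pipeline call arguments."""
    if task not in ("transcribe", "translate"):
        raise ValueError("task must be 'transcribe' or 'translate'")
    generate_kwargs: Dict[str, Any] = {"task": task}
    if language is not None:
        # Source language of the audio; if omitted, the model detects it.
        generate_kwargs["language"] = language
    return {"return_timestamps": timestamps, "generate_kwargs": generate_kwargs}


def run(audio_path: str, **options: Any) -> Dict[str, Any]:
    """Transcribe or translate one audio file (downloads the model on first use)."""
    from transformers import pipeline  # imported lazily: heavy optional dependency

    asr = pipeline(
        "automatic-speech-recognition",
        model=MODEL_ID,
        chunk_length_s=30,  # enables chunked processing of long recordings
    )
    return asr(audio_path, **build_call_kwargs(**options))
```

For example, run("talk_fr.wav", task="translate", language="french", timestamps=True) would request English text with timestamps from French speech, where talk_fr.wav is a placeholder file name.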

Use Cases

Speech Transcription
Automated Meeting Minutes
Convert meeting recordings into text transcripts in real time
Word error rate (WER) of 3.43% on the English LibriSpeech test-clean set
Podcast Subtitle Generation
Create multilingual subtitles for non-English podcasts
Speech Translation
Real-time Speech Translation
Translate speech in languages such as French into English text in real time
Examples demonstrate smooth cross-language conversion
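
The WER figure quoted above is a word error rate: the number of word substitutions, deletions, and insertions needed to turn the model's transcript into the reference transcript, divided by the number of words in the reference. A 3.43% WER therefore means roughly three to four word errors per 100 reference words. The sketch below computes it with a standard edit-distance recurrence; the function name word_error_rate is illustrative, and real evaluations usually normalize casing and punctuation first, which this minimal version does not do.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words with an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion: reference word missing
                    curr[j - 1] + 1,     # insertion: extra hypothesis word
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```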