# Real-time Speech Processing
Lite Whisper Large V3 Acc
Apache-2.0
Lite-Whisper is a compressed version of OpenAI Whisper that uses LiteASR to reduce model size while maintaining high accuracy.
Speech Recognition
Transformers

efficient-speech
57
3
Ultravox V0 5 Llama 3 2 1b
MIT
Ultravox is a multimodal speech large language model built on Llama3.2-1B and Whisper-large-v3 that accepts both speech and text inputs.
Text-to-Audio
Transformers
Supports Multiple Languages

fixie-ai
167.25k
21
Phil Pyannote Speaker Diarization Endpoint
MIT
A speaker diarization model based on pyannote.audio 2.0 that automatically detects and segments the different speakers in an audio recording.
Speaker Analysis
tawkit
215
7
Metricgan Plus Voicebank
Apache-2.0
A speech enhancement model trained with the MetricGAN+ method to improve speech quality.
Audio Enhancement
English
speechbrain
55.91k
65
Featured Recommended AI Models