# Real-time Speech Processing

Lite Whisper Large V3 Acc

Lite-Whisper is a compressed version of OpenAI's Whisper that uses LiteASR to reduce model size while maintaining high accuracy.

Speech Recognition

efficient-speech

Ultravox v0.5 Llama 3.2 1B

Ultravox is a multimodal speech large language model built on Llama-3.2-1B and Whisper-large-v3 that accepts both speech and text input.

Transformers, Supports Multiple Languages

Phil Pyannote Speaker Diarization Endpoint

A speaker diarization model based on pyannote.audio 2.0 that automatically detects and segments the different speakers in an audio recording.

Speaker Analysis
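
As a rough illustration of what speaker diarization output can look like, the sketch below represents each detected turn as a (speaker, start, end) segment, merges consecutive turns from the same speaker, and totals talk time per speaker. The `Segment` class, the `merge_adjacent` and `talk_time` helpers, the `SPEAKER_00`-style labels, and the 0.5-second gap threshold are all assumptions made for illustration; they are not the model's actual API or output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One diarized speech turn (hypothetical structure, times in seconds)."""
    speaker: str
    start: float
    end: float


def merge_adjacent(segments, max_gap=0.5):
    """Merge consecutive turns by the same speaker separated by at most max_gap seconds."""
    merged = []
    for seg in sorted(segments, key=lambda s: s.start):
        last = merged[-1] if merged else None
        if last and last.speaker == seg.speaker and seg.start - last.end <= max_gap:
            # Extend the previous turn instead of starting a new one.
            merged[-1] = Segment(last.speaker, last.start, max(last.end, seg.end))
        else:
            merged.append(seg)
    return merged


def talk_time(segments):
    """Return total speaking time in seconds for each speaker label."""
    totals = {}
    for seg in segments:
        totals[seg.speaker] = totals.get(seg.speaker, 0.0) + (seg.end - seg.start)
    return totals


if __name__ == "__main__":
    raw = [
        Segment("SPEAKER_00", 0.0, 2.0),
        Segment("SPEAKER_00", 2.3, 4.0),
        Segment("SPEAKER_01", 4.5, 6.0),
    ]
    for seg in merge_adjacent(raw):
        print(f"{seg.speaker}: {seg.start:.1f}s to {seg.end:.1f}s")
    print(talk_time(merge_adjacent(raw)))
```

Post-processing of this kind is typically applied to whatever segment list a diarization model returns, so the same helpers would work once real output is mapped into `Segment` objects.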

Metricgan Plus Voicebank

A speech enhancement model trained with the MetricGAN+ method to improve speech quality.

Audio Enhancement, English


© 2025 AIbase