AIbase

# Speech-Text Multimodal

Ultravox v0.3
MIT
Ultravox is a multimodal speech large language model (LLM) built on Llama3.1-8B-Instruct and Whisper-small that accepts both speech and text inputs.
Text-to-Audio Transformers English
fixie-ai
48.30k
17
Ultravox v0.2
MIT
Ultravox is a multimodal speech large language model (LLM) built on Llama3-8B-Instruct and Whisper-small that accepts both speech and text inputs.
Audio-to-Text Transformers English
fixie-ai
792
51
© 2025 AIbase