AIbase

# Multimodal Enhancement

Llama 3 EZO VLM 1
A Japanese vision-language model based on Llama-3-8B-Instruct, with additional pretraining and instruction tuning to improve its Japanese capabilities.
Image-to-Text, Japanese
AXCXEPT
19
7
PVD 160k Mistral 7b
Apache-2.0
A text-based reasoning model for vector graphics that improves understanding through intermediate textual descriptions of the visual content.
Image-to-Text, Transformers
mikewang
15
4
© 2025 AIbase