# Multimodal Enhancement
Llama 3 EZO VLM 1
A Japanese vision-language model based on Llama-3-8B-Instruct, with additional pretraining and instruction tuning to improve its Japanese capabilities.
Image-to-Text Japanese
AXCXEPT
19
7
PVD 160k Mistral 7b
Apache-2.0
A text-based reasoning model for vector graphics that improves understanding through intermediate textual visual descriptions.
Image-to-Text
Transformers

mikewang
15
4