# Multimodal Enhancement

Llama 3 EZO VLM 1

A Japanese vision-language model based on Llama-3-8B-Instruct, with additional pretraining and instruction tuning to improve its Japanese capabilities.

Image-to-Text, Japanese

PVD 160k Mistral 7b

A text-based reasoning model for vector graphics that improves understanding by working through intermediate textual descriptions of the visual content.

© 2025 AIbase