
llava-v1.5-mlp2x-336px-pretrain-vicuna-7b-v1.5

Developed by liuhaotian
LLaVA is an open-source multimodal chatbot obtained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data.
Downloads 173
Release Time: 10/5/2023

Model Overview

LLaVA is an autoregressive language model based on the Transformer architecture, primarily used for research on large multimodal models and chatbots.
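The "mlp2x" and "336px" parts of the checkpoint name refer to LLaVA-1.5's two-layer MLP vision-language projector and its 336-pixel CLIP ViT-L/14 image encoder. A minimal NumPy sketch of such a projector, using toy dimensions (the real checkpoint maps roughly 1024-dim CLIP patch features into Vicuna-7B's 4096-dim embedding space):

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def mlp2x_projector(patch_features, w1, b1, w2, b2):
    """Project vision-encoder patch features into the LLM embedding space.
    'mlp2x' = two linear layers with a GELU non-linearity in between."""
    h = gelu(patch_features @ w1 + b1)
    return h @ w2 + b2

# Toy dimensions for illustration only; they are not the real model sizes.
rng = np.random.default_rng(0)
vis_dim, llm_dim, n_patches = 8, 16, 4
w1 = rng.standard_normal((vis_dim, llm_dim)); b1 = np.zeros(llm_dim)
w2 = rng.standard_normal((llm_dim, llm_dim)); b2 = np.zeros(llm_dim)
patches = rng.standard_normal((n_patches, vis_dim))

tokens = mlp2x_projector(patches, w1, b1, w2, b2)
print(tokens.shape)  # one projected "image token" per input patch
```

The projected tokens are then interleaved with text-token embeddings, so the autoregressive language model attends over image and text in a single sequence.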

Model Features

Multimodal Capability
Combines visual and language understanding to process both image and text inputs.
Instruction Following
Capable of understanding and executing complex multimodal instructions.
Open-source
The model is fully open-source and available for research and development.

Model Capabilities

Image understanding
Visual question answering
Multimodal dialogue
Instruction following
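Instruction following in LLaVA-1.5 is driven by a Vicuna-style chat template in which an image placeholder token marks where the projected image tokens are spliced into the prompt. A rough sketch of single-turn prompt assembly (the system prompt and separators here are illustrative and may differ from the official repository's templates):

```python
# Illustrative system prompt; the official template may word this differently.
DEFAULT_SYSTEM = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers."
)

def build_llava_prompt(question: str, system: str = DEFAULT_SYSTEM) -> str:
    """Build a single-turn prompt; <image> marks where the projected
    image tokens are inserted before the model generates its answer."""
    return f"{system} USER: <image>\n{question} ASSISTANT:"

prompt = build_llava_prompt("What is shown in this image?")
print(prompt)
```

The model then continues the sequence after "ASSISTANT:", which is how visual question answering and multimodal dialogue are realized as ordinary autoregressive generation.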

Use Cases

Research
Multimodal Model Research
Used for research at the intersection of computer vision and natural language processing.
Application Development
Intelligent Chatbot
Develop intelligent dialogue systems capable of understanding image content.