
LLaVA Pretrain Vicuna 7B v1.3

Developed by liuhaotian
LLaVA is an open-source multimodal chatbot, built by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data.
Downloads: 54
Release date: 8/2/2023

Model Overview

LLaVA is an autoregressive language model based on the Transformer architecture, primarily used for research on large multimodal models and chatbots.

Model Features

Multimodal Capability
Combines visual and language understanding to handle joint image-text tasks.
Instruction Following
Capable of understanding and executing complex multimodal instructions.
Open-source Model
Built upon the open-source LLaMA/Vicuna models.

Model Capabilities

Image-Text Understanding
Multimodal Dialogue
Visual Question Answering
Image Caption Generation
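
For dialogue tasks such as visual question answering, LLaVA-style checkpoints are typically prompted with a Vicuna-style conversation template in which an `<image>` placeholder marks where the image tokens are inserted. The sketch below shows one common form of that template (an assumption based on community usage of LLaVA with Vicuna backbones; the exact template for this pretrain checkpoint may differ):

```python
def build_llava_prompt(question: str) -> str:
    """Wrap a user question in the USER/ASSISTANT turn format,
    with the <image> placeholder marking where image features go.

    This template is an assumption for illustration; verify it
    against the checkpoint's own documentation before use.
    """
    return f"USER: <image>\n{question} ASSISTANT:"

# Example: a visual question answering prompt
prompt = build_llava_prompt("What is shown in this image?")
print(prompt)
```

The model's generated text then continues after `ASSISTANT:`, so downstream code usually strips everything up to and including that marker to recover the answer.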

Use Cases

Research
Multimodal Model Research
Used for studying vision-language joint representation learning.
Chatbot Development
Serves as a foundational model for multimodal chatbots.
Education
Visual-Assisted Learning
Helps students understand image content and answer questions.