
LLaVA Llama 2 13B Chat Lightning Preview

Developed by liuhaotian
LLaVA is an open-source multimodal chatbot based on the Transformer architecture, obtained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data.
Downloads 2,122
Release Time: 7/19/2023

Model Overview

LLaVA is intended primarily for research on large multimodal models and chatbots. It processes both images and text, making it useful for research in fields such as computer vision and natural language processing.
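A minimal inference sketch is shown below. It assumes a checkpoint in Hugging Face transformers format (the original liuhaotian release is designed to be run through the LLaVA codebase itself); the repository id, image URL, and prompt template are illustrative assumptions, not guaranteed to match the official setup.

import requests
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

# Illustrative id; assumes a transformers-format LLaVA checkpoint is available.
model_id = "liuhaotian/llava-llama-2-13b-chat-lightning-preview"

processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# One image plus a text instruction; "<image>" marks where the image features are inserted.
image = Image.open(requests.get("https://example.com/cat.jpg", stream=True).raw)
prompt = "USER: <image>\nWhat is unusual about this picture? ASSISTANT:"

inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=200)
print(processor.decode(output_ids[0], skip_special_tokens=True))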

Model Features

Multimodal capabilities
Fine-tuned on GPT-generated multimodal instruction-following data, enabling it to process both images and text.
Transformer architecture
An autoregressive language model built on the Transformer architecture.
Open-source research support
Released as open source for researchers and enthusiasts in fields such as computer vision and natural language processing.

Model Capabilities

Image understanding
Text generation
Visual reasoning
Multimodal dialogue

Use Cases

Academic research
Multimodal model research
Used to study how models jointly understand and reason over images and text.
Visual reasoning tasks
Evaluated on the ScienceQA dataset; when combined with GPT-4, it achieved a new state-of-the-art accuracy on ScienceQA.
Application development
Intelligent chatbot
Building a chatbot that combines image understanding with conversational dialogue (see the sketch below).
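A minimal sketch of how such a chatbot loop might look, reusing the model and processor loaded in the earlier snippet; the "USER:/ASSISTANT:" turn format and the helper function are illustrative assumptions, not an official template.

def chat(model, processor, image, max_new_tokens=200):
    # Illustrative multi-turn loop around a single image; assumes the
    # model/processor objects from the loading snippet above.
    history = "USER: <image>\n"
    while True:
        user_turn = input("You: ").strip()
        if not user_turn:
            break  # empty input ends the conversation
        history += f"{user_turn} ASSISTANT:"
        inputs = processor(images=image, text=history, return_tensors="pt").to(model.device)
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
        reply = processor.decode(output_ids[0], skip_special_tokens=True)
        answer = reply.split("ASSISTANT:")[-1].strip()  # keep only the newest reply
        print("Bot:", answer)
        history += f" {answer}\nUSER: "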