LLaVA LLaMA 2 13B Chat Lightning Preview
LLaVA is an open-source multimodal chatbot built on the Transformer architecture, obtained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data.
Downloads: 2,122
Release Time: 7/19/2023
Model Overview
LLaVA is intended primarily for research on large multimodal models and chatbots. It processes both images and text, supporting work in fields such as computer vision and natural language processing.
Model Features
Multimodal capabilities
Fine-tuned on GPT-generated multimodal instruction-following data, enabling it to process both images and text.
Transformer architecture
An autoregressive language model built on the Transformer architecture.
Open-source research support
Released as open source for researchers and enthusiasts in fields such as computer vision and natural language processing.
Model Capabilities
Image understanding
Text generation
Visual reasoning
Multimodal dialogue
Use Cases
Academic research
Multimodal model research
Used to study the multimodal interaction capabilities of images and text.
Visual reasoning tasks
Evaluated on the ScienceQA dataset; when combined with GPT-4, LLaVA achieved a new state-of-the-art result on this benchmark.
Application development
Intelligent chatbot
Develop a chatbot with image understanding and dialogue capabilities.
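As a starting point for such a chatbot, a minimal sketch is shown below. The single-turn prompt template (`USER: <image>\n... ASSISTANT:`) and the commented inference calls are assumptions for illustration only; the exact template and loading API should be verified against the model's documentation.

```python
# Hedged sketch of querying a LLaVA-style model. The prompt template and
# the transformers-based inference shown in comments are assumptions;
# check the actual model card before relying on them.

def build_llava_prompt(question: str) -> str:
    """Build a single-turn LLaVA-style chat prompt.

    The <image> placeholder marks where the processor inserts the
    vision encoder's image tokens.
    """
    return f"USER: <image>\n{question} ASSISTANT:"


if __name__ == "__main__":
    prompt = build_llava_prompt("What is unusual about this image?")
    print(prompt)
    # Actual inference (not run here) would pair this prompt with an
    # image via a multimodal processor, e.g. with Hugging Face
    # transformers (exact API support for this checkpoint is an
    # assumption):
    #
    #   from transformers import AutoProcessor, LlavaForConditionalGeneration
    #   processor = AutoProcessor.from_pretrained(model_id)
    #   model = LlavaForConditionalGeneration.from_pretrained(model_id)
    #   inputs = processor(text=prompt, images=image, return_tensors="pt")
    #   output = model.generate(**inputs, max_new_tokens=100)
```

The prompt builder is kept separate from inference so the dialogue format can be unit-tested without downloading the 13B checkpoint.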
© 2025 AIbase