LLaVA Model Card
LLaVA is an open-source chatbot intended for research on large multimodal models and chatbots. It is trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data.
⨠Features
- LLaVA is an auto - regressive language model based on the transformer architecture.
- It is trained by fine - tuning LLaMA/Vicuna on GPT - generated multimodal instruction - following data.
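To illustrate what "auto-regressive" means in the list above, here is a minimal toy sketch of a decoding loop: each new token is chosen from all tokens produced so far, then appended before the next step. Everything in it is an assumption for illustration only: `generate`, `toy_next_token`, `PROMPT`, and `CANNED_REPLY` are hypothetical names, and the canned lookup stands in for a real model's forward pass. It is not LLaVA's implementation.

```python
# Toy illustration of auto-regressive generation (not LLaVA code).
# All names below are hypothetical and exist only for this sketch.

PROMPT = ["<image>", "What", "is", "shown", "?"]
CANNED_REPLY = ["A", "cat", "on", "a", "mat", "<eos>"]


def toy_next_token(tokens):
    """Stand-in for a model: return the next token given all tokens so far."""
    position = len(tokens) - len(PROMPT)
    return CANNED_REPLY[position]


def generate(next_token, prompt, max_new_tokens=10, eos="<eos>"):
    """Append one token at a time, feeding the growing sequence back in."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        token = next_token(tokens)
        if token == eos:
            break
        tokens.append(token)
    return tokens


output = generate(toy_next_token, PROMPT)
reply = output[len(PROMPT):]
print(" ".join(reply))
```

In a real model the lookup would be replaced by a forward pass that scores every vocabulary token conditioned on the image and the text so far.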
Documentation
Model details
| Property | Details |
|---|---|
| Model Type | LLaVA is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data. It is an auto-regressive language model based on the transformer architecture. |
| Model Date | LLaVA-LLaMA-2-7B-Chat-LoRA-Preview was trained in July 2023. |
| Paper or resources for more information | https://llava-vl.github.io/ |
Intended use
| Property | Details |
|---|---|
| Primary intended uses | The primary use of LLaVA is research on large multimodal models and chatbots. |
| Primary intended users | The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence. |
Training dataset
- 558K filtered image-text pairs from LAION/CC/SBU, captioned by BLIP.
- 80K GPT-generated multimodal instruction-following samples.
Evaluation dataset
A preliminary evaluation of model quality uses a set of 90 visual reasoning questions built from 30 unique images randomly sampled from COCO val 2014. Each image is associated with three types of questions: conversational, detailed description, and complex reasoning. We use GPT-4 to judge the model outputs.
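The arithmetic behind the 90 questions (30 sampled images, each paired with 3 question types) can be sketched as follows. This is only an illustration under stated assumptions: `build_eval_set`, the seed, and the placeholder image IDs are hypothetical, and the sketch does not reproduce how the authors sampled images or wrote the questions.

```python
import random

# The three question types named in the model card.
QUESTION_TYPES = ("conversational", "detailed description", "complex reasoning")


def build_eval_set(image_ids, num_images=30, seed=0):
    """Sample unique images and pair each one with every question type."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    sampled = rng.sample(list(image_ids), num_images)  # sampling without replacement
    return [(image_id, qtype) for image_id in sampled for qtype in QUESTION_TYPES]


# Hypothetical stand-in for the COCO val 2014 image IDs.
placeholder_ids = [f"image_{i:05d}" for i in range(1000)]

eval_set = build_eval_set(placeholder_ids)
print(len(eval_set))  # 30 images x 3 question types = 90
```

Sampling without replacement keeps the 30 images unique, so every image contributes exactly one entry per question type.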
We also evaluate our model on the ScienceQA dataset. Combined with GPT-4, our model sets a new state of the art on this dataset.
See https://llava-vl.github.io/ for more details.
License
Llama 2 is licensed under the LLAMA 2 Community License,
Copyright (c) Meta Platforms, Inc. All Rights Reserved.
Where to send questions or comments about the model:
[https://github.com/haotian-liu/LLaVA/issues](https://github.com/haotian-liu/LLaVA/issues)