LLaVA Model Card
LLaVA is an open-source chatbot that combines image and text processing capabilities, making it a useful resource for research on multimodal models and chatbot development.
✨ Features
LLaVA is an auto-regressive language model based on the transformer architecture. It is fine-tuned on multimodal instruction-following data, enabling it to handle image-text-to-text tasks effectively.
💻 Usage Examples
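The snippet below is a minimal inference sketch, not an official example from this card. It assumes the Hugging Face `transformers` LLaVA-NeXT integration (`LlavaNextProcessor` / `LlavaNextForConditionalGeneration`) and the community-converted checkpoint `llava-hf/llava-v1.6-vicuna-13b-hf`; the weights described here can also be run with the original `haotian-liu/LLaVA` codebase, whose API differs. It requires `transformers`, `torch`, `accelerate`, and `pillow`.

```python
import torch
import requests
from PIL import Image
from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration

# Assumption: the HF-converted checkpoint id; adjust if you use different weights.
model_id = "llava-hf/llava-v1.6-vicuna-13b-hf"

processor = LlavaNextProcessor.from_pretrained(model_id)
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Any RGB image works; this URL is only an example.
url = "https://llava-vl.github.io/static/images/view.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Vicuna-based LLaVA checkpoints expect the "USER: <image>\n... ASSISTANT:" template.
prompt = "USER: <image>\nWhat is shown in this image? ASSISTANT:"

inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```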
📚 Documentation
Model details
| Property | Details |
|----------|---------|
| Model Type | LLaVA is an open-source chatbot trained by fine-tuning an LLM on multimodal instruction-following data. It is an auto-regressive language model based on the transformer architecture. Base LLM: [lmsys/vicuna-13b-v1.5](https://huggingface.co/lmsys/vicuna-13b-v1.5) |
| Model Date | LLaVA-v1.6-Vicuna-13B was trained in December 2023. |
| Paper or resources for more information | [https://llava-vl.github.io/](https://llava-vl.github.io/) |
Intended use
Primary intended uses
The primary use of LLaVA is research on large multimodal models and chatbots.
Primary intended users
The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence.
Training dataset
- 558K filtered image-text pairs from LAION/CC/SBU, captioned by BLIP.
- 158K GPT-generated multimodal instruction-following data (an illustrative record layout is sketched after this list).
- 500K academic-task-oriented VQA data mixture.
- 50K GPT-4V data mixture.
- 40K ShareGPT data.
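As a rough illustration of what one multimodal instruction-following record looks like, the sketch below mirrors the schema of the publicly released LLaVA instruction-tuning JSON files (`id` / `image` / `conversations` with alternating `human` and `gpt` turns). The field values are made-up assumptions, and individual mixtures in the list above may use different layouts.

```python
# Illustrative record, assuming the schema of the public LLaVA instruction data;
# the id, image path, and text below are hypothetical examples, not real samples.
example_record = {
    "id": "000000123456",                        # hypothetical sample id
    "image": "coco/train2017/000000123456.jpg",  # hypothetical image path
    "conversations": [
        {"from": "human", "value": "<image>\nWhat is unusual about this image?"},
        {"from": "gpt", "value": "A man is ironing clothes on the back of a moving taxi."},
    ],
}
```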
Evaluation dataset
A collection of 12 benchmarks, including 5 academic VQA benchmarks and 7 recent benchmarks specifically proposed for instruction-following LMMs.
📄 License
Llama 2 is licensed under the LLAMA 2 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved.
Where to send questions or comments about the model: [https://github.com/haotian-liu/LLaVA/issues](https://github.com/haotian-liu/LLaVA/issues)