Open-source multimodal chatbot llava-1.6-mistral-7b-gguf - Free deployment with multiple quantization options

LLaVA 1.6 Mistral 7B GGUF

Developed by cjpais

LLaVA is an open-source multimodal chatbot trained by fine-tuning an LLM on multimodal instruction-following data. This release provides GGUF-quantized builds of the model with multiple quantization options.

Image-Text-to-Text Open Source License: Apache-2.0 #Multimodal Dialogue #Low-resource Deployment #Image-Text Understanding

Downloads 9,652

Release Time: 2/1/2024

Model Overview

A multimodal model based on Mistral-7B-Instruct-v0.2 that takes image and text inputs and generates text outputs. It is primarily used for research on large multimodal models and chatbots.

Model Features

Multimodal Capability

Supports processing both image and text inputs to generate relevant text outputs

Multiple Quantization Options

Offers various quantization versions from 3-bit to 8-bit to meet different hardware requirements

Optimized Projector

Ships the updated quants and projector from PR #5267

Model Capabilities

Image Understanding

Multimodal Dialogue

Visual Question Answering

Instruction Following

Use Cases

Research

Multimodal Model Research

Used for research at the intersection of computer vision and natural language processing

Chatbot Development

Develop intelligent dialogue systems capable of understanding image content

Education

Visual-Assisted Learning

Helps students understand complex concepts through images

🚀 GGUF Quantized LLaVA 1.6 Mistral 7B

This project provides GGUF quantized versions of the LLaVA 1.6 Mistral 7B model, updated with quants and projector from PR #5267.

🚀 Quick Start

The GGUF quantized LLaVA 1.6 Mistral 7B model comes in several quantization levels for different use cases. Choose the quantized file that matches your memory budget and quality requirements.
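
As a rough illustration of how a chosen file might be run, the sketch below builds a command line for llama.cpp's `llava-cli` example program. It assumes a llama.cpp build that includes `llava-cli` (newer builds may ship a differently named binary) and that the projector has been downloaded alongside the model. The projector file name `mmproj-model-f16.gguf`, the image path, and the helper `build_llava_command` are hypothetical and not taken from this page.

```python
import shlex
from typing import List


def build_llava_command(
    model_path: str,
    mmproj_path: str,
    image_path: str,
    prompt: str,
    binary: str = "./llava-cli",  # assumed binary name; depends on the llama.cpp version
) -> List[str]:
    """Assemble an argument list for a llava-cli style invocation."""
    return [
        binary,
        "-m", model_path,          # quantized language model (GGUF)
        "--mmproj", mmproj_path,   # multimodal projector (GGUF)
        "--image", image_path,     # image to describe
        "-p", prompt,              # text prompt
    ]


cmd = build_llava_command(
    model_path="llava-v1.6-mistral-7b.Q4_K_M.gguf",
    mmproj_path="mmproj-model-f16.gguf",  # assumed projector file name
    image_path="example.jpg",             # placeholder image
    prompt="Describe this image.",
)

# Print a shell-safe version of the command for copy and paste.
print(shlex.join(cmd))
```

The argument list can be passed to `subprocess.run(cmd)` once the binary and both GGUF files exist locally.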

📚 Documentation

Provided files

Name	Quant method	Bits	Size	Use case
llava-v1.6-mistral-7b.Q3_K_XS.gguf	Q3_K_XS	3	2.99 GB	very small, high quality loss
llava-v1.6-mistral-7b.Q3_K_M.gguf	Q3_K_M	3	3.52 GB	very small, high quality loss
llava-v1.6-mistral-7b.Q4_K_M.gguf	Q4_K_M	4	4.37 GB	medium, balanced quality - recommended
llava-v1.6-mistral-7b.Q5_K_S.gguf	Q5_K_S	5	5.00 GB	large, low quality loss - recommended
llava-v1.6-mistral-7b.Q5_K_M.gguf	Q5_K_M	5	5.13 GB	large, very low quality loss - recommended
llava-v1.6-mistral-7b.Q6_K.gguf	Q6_K	6	5.94 GB	very large, extremely low quality loss
llava-v1.6-mistral-7b.Q8_0.gguf	Q8_0	8	7.7 GB	very large, extremely low quality loss - not recommended
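
The table above can be turned into a small selection helper. The sketch below picks the largest listed file that fits a given memory budget. The 1 GB default headroom (for the projector, context cache, and runtime overhead) is an assumed placeholder, not a measured figure, and `pick_quant` is a hypothetical helper name.

```python
from typing import NamedTuple, Optional


class QuantFile(NamedTuple):
    name: str
    method: str
    bits: int
    size_gb: float


# Rows copied from the "Provided files" table.
QUANT_FILES = [
    QuantFile("llava-v1.6-mistral-7b.Q3_K_XS.gguf", "Q3_K_XS", 3, 2.99),
    QuantFile("llava-v1.6-mistral-7b.Q3_K_M.gguf", "Q3_K_M", 3, 3.52),
    QuantFile("llava-v1.6-mistral-7b.Q4_K_M.gguf", "Q4_K_M", 4, 4.37),
    QuantFile("llava-v1.6-mistral-7b.Q5_K_S.gguf", "Q5_K_S", 5, 5.00),
    QuantFile("llava-v1.6-mistral-7b.Q5_K_M.gguf", "Q5_K_M", 5, 5.13),
    QuantFile("llava-v1.6-mistral-7b.Q6_K.gguf", "Q6_K", 6, 5.94),
    QuantFile("llava-v1.6-mistral-7b.Q8_0.gguf", "Q8_0", 8, 7.7),
]


def pick_quant(budget_gb: float, headroom_gb: float = 1.0) -> Optional[QuantFile]:
    """Return the largest file whose size plus headroom fits in budget_gb.

    headroom_gb is an assumed allowance, not a value from the model card.
    Returns None when no listed file fits.
    """
    fitting = [q for q in QUANT_FILES if q.size_gb + headroom_gb <= budget_gb]
    return max(fitting, key=lambda q: q.size_gb) if fitting else None


choice = pick_quant(budget_gb=6.5)
print(choice.name if choice else "no listed file fits this budget")
```

Picking the largest file that fits follows the table's own ordering, where larger files are described as having lower quality loss. Adjust `headroom_gb` after measuring actual memory use on your hardware.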

ORIGINAL LLaVA Model Card

Model details

Property	Details
Model Type	LLaVA is an open-source chatbot trained by fine-tuning an LLM on multimodal instruction-following data. It is an auto-regressive language model based on the transformer architecture. Base LLM: mistralai/Mistral-7B-Instruct-v0.2
Model Date	LLaVA-v1.6-Mistral-7B was trained in December 2023.
Paper or resources for more information	https://llava-vl.github.io/

License

mistralai/Mistral-7B-Instruct-v0.2 license.

⚠️ Important Note

For questions or comments about the model, please visit https://github.com/haotian-liu/LLaVA/issues

Intended use

Property	Details
Primary intended uses	The primary use of LLaVA is research on large multimodal models and chatbots.
Primary intended users	The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence.

Training dataset

558K filtered image-text pairs from LAION/CC/SBU, captioned by BLIP.
158K GPT-generated multimodal instruction-following data.
500K academic-task-oriented VQA data mixture.
50K GPT-4V data mixture.
40K ShareGPT data.

Evaluation dataset

A collection of 12 benchmarks, including 5 academic VQA benchmarks and 7 recent benchmarks specifically proposed for instruction-following LMMs.

📄 License

This project is licensed under the Apache-2.0 license.
