MQT LLaVA 7b
Developed by gordonhu
MQT-LLaVA is an open-source multimodal chatbot model based on the Transformer architecture. It is trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction data.
Downloads 349
Release Time: 5/28/2024
Model Overview
MQT-LLaVA is an open-source model for research on multimodal large models and chatbots. It accepts image and text inputs and generates text outputs.
Model Features
Open-source model
Completely open-source, available for research and commercial use (subject to the LLAMA 2 license)
Multimodal processing
Handles image and text inputs together and generates coherent text responses
Large-scale training data
Trained on more than 1 million multimodal samples, including image-text pairs and instruction data
Model Capabilities
Multimodal dialogue
Visual question answering
Image understanding and description
Text generation
Instruction following
Use Cases
Academic research
Multimodal large model research
Used to explore joint vision-language representation learning
Chatbot development
Building dialogue systems that can understand image content
Educational applications
Visual-assisted learning
Helping students understand complex concepts through images
© 2025 AIbase