BakLLaVA-1
Developed by SkunkworksAI
BakLLaVA-1 is a multimodal model that pairs a Mistral 7B base with the LLaVA 1.5 architecture. Its developers report that it outperforms Llama 2 13B on multiple benchmarks.
Downloads 152
Release date: 10/12/2023
Model Overview
BakLLaVA-1 is an open-source multimodal model that combines Mistral 7B's language capabilities with LLaVA 1.5's visual understanding. It is suited to image-text understanding and to generating text about images.
Model Features
Powerful multimodal capabilities
Uses Mistral 7B as the language model within the LLaVA 1.5 visual understanding architecture, so a single model can both interpret images and generate text about them.
Performance surpassing Llama 2 13B
Outperforms the Llama 2 13B model on multiple benchmarks.
Open-source availability
The model is fully open-source under the Apache 2.0 license, which facilitates research and development use.
Model Capabilities
Image-text understanding
Visual question answering
Multimodal instruction following
Image caption generation
Use Cases
Academic research
Visual question answering system
Used to build systems capable of answering questions about image content
Performs well on academic VQA tasks
Content generation
Automatic image captioning
Generates detailed textual descriptions for images
Capable of producing accurate and rich image descriptions
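Example usage
The two use cases above can be sketched with the Hugging Face transformers library. This is a minimal sketch under stated assumptions, not the developers' reference code: the checkpoint name llava-hf/bakLlava-v1-hf and the "USER: <image> ... ASSISTANT:" prompt template are assumptions that this page does not state, and build_vqa_prompt and answer_question are hypothetical helper names introduced only for illustration.

```python
# Assumption: a community transformers port of BakLLaVA-1; this page does not name it.
MODEL_ID = "llava-hf/bakLlava-v1-hf"


def build_vqa_prompt(question: str) -> str:
    """Wrap a question in a LLaVA 1.5 style chat template (assumed format)."""
    question = question.strip()
    if not question:
        raise ValueError("question must not be empty")
    # "<image>" marks the position where the processor inserts image features.
    return f"USER: <image>\n{question}\nASSISTANT:"


def answer_question(image_path: str, question: str, max_new_tokens: int = 64) -> str:
    """Ask the model one question about one image and return its reply."""
    # Heavy dependencies are imported lazily so the prompt helper works without them.
    import torch
    from PIL import Image
    from transformers import AutoProcessor, LlavaForConditionalGeneration

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    processor = AutoProcessor.from_pretrained(MODEL_ID)
    model = LlavaForConditionalGeneration.from_pretrained(MODEL_ID, torch_dtype=dtype)
    model.to(device)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(
        images=image, text=build_vqa_prompt(question), return_tensors="pt"
    ).to(device, dtype)

    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    text = processor.decode(output_ids[0], skip_special_tokens=True)
    # The decoded text echoes the prompt; keep only what follows the last marker.
    return text.split("ASSISTANT:")[-1].strip()
```

For visual question answering, call answer_question("photo.jpg", "How many people are in the picture?"), where photo.jpg is a placeholder path. For image captioning, the same function can be called with an instruction such as "Describe this image in detail." as the question. The first call downloads the model weights, so it needs network access and enough memory for a 7B-parameter model.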