Llava MORE Llama 3 1 8B Finetuning
LLaVA-MORE is an enhanced version based on the LLaVA architecture, integrating LLaMA 3.1 as the language model, focusing on image-to-text tasks.
Downloads 215
Release Time : 7/30/2024
Model Overview
LLaVA-MORE enhances the renowned LLaVA architecture by integrating LLaMA 3.1 as the language model. This model is primarily used for image-to-text tasks and supports visual instruction tuning.
Model Features
Enhanced Visual Instruction Tuning
Improves visual instruction tuning capabilities by integrating LLaMA 3.1 as the language model.
Two-Stage Training
Provides first-stage and second-stage checkpoints for easy use in different scenarios.
Model Capabilities
Image-to-Text Generation
Visual Instruction Understanding
Use Cases
Visual Question Answering
Image Caption Generation
Generates detailed textual descriptions based on input images.
Visual Instruction Response
Generates corresponding textual responses based on visual input and instructions.
Featured Recommended AI Models
Š 2025AIbase