
LLaVA-MORE LLaMA 3.1 8B Finetuning

Developed by aimagelab
LLaVA-MORE is an enhanced version of the LLaVA architecture that integrates LLaMA 3.1 as the language model and focuses on image-to-text tasks.
Downloads 215
Release Time: 7/30/2024

Model Overview

LLaVA-MORE extends the LLaVA architecture by using LLaMA 3.1 as the language model. The model is primarily used for image-to-text tasks and supports visual instruction tuning.

Model Features

Enhanced Visual Instruction Tuning
Improves visual instruction tuning capabilities by integrating LLaMA 3.1 as the language model.
Two-Stage Training
Provides first-stage and second-stage checkpoints for easy use in different scenarios.

Model Capabilities

Image-to-Text Generation
Visual Instruction Understanding

Use Cases

Visual Question Answering
Image Caption Generation
Generates detailed textual descriptions based on input images.
Visual Instruction Response
Generates corresponding textual responses based on visual input and instructions.
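
The use cases above differ mainly in the text instruction that accompanies the image. The following is a minimal sketch of that idea, not the project's actual inference code: the names VisualRequest, build_prompt, caption_request, and vqa_request are hypothetical, and the "<image>" placeholder is assumed here as the marker a LLaVA-style codebase would later replace with visual features.

```python
from dataclasses import dataclass

# Assumption: the image position in the text prompt is marked with a
# placeholder token; "<image>" is used here purely for illustration.
IMAGE_PLACEHOLDER = "<image>"


@dataclass
class VisualRequest:
    """Hypothetical container pairing an image with a text instruction."""
    image_path: str   # path to the input image
    instruction: str  # question or instruction about the image


def build_prompt(request: VisualRequest) -> str:
    """Put the image placeholder on its own line before the instruction."""
    instruction = request.instruction.strip()
    if not instruction:
        raise ValueError("instruction must not be empty")
    return f"{IMAGE_PLACEHOLDER}\n{instruction}"


def caption_request(image_path: str) -> VisualRequest:
    """Image captioning: a fixed instruction asking for a description."""
    return VisualRequest(image_path, "Describe this image in detail.")


def vqa_request(image_path: str, question: str) -> VisualRequest:
    """Visual question answering: the user's question is the instruction."""
    return VisualRequest(image_path, question)


if __name__ == "__main__":
    print(build_prompt(caption_request("photo.jpg")))
    print(build_prompt(vqa_request("photo.jpg", "What color is the car?")))
```

In this sketch, captioning and question answering share one prompt builder; only the instruction text changes. Loading the image and running the model are left out because they depend on the released checkpoints and their accompanying code.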