Llava V1.5 13B GPTQ

Quantized by TheBloke
Llava v1.5 13B is a multimodal model developed by Haotian Liu that combines visual and language capabilities to understand images and text and to generate text in response.
Release Time: 10/15/2023

Model Overview

Llava v1.5 13B is a multimodal model built on the Llama architecture. It processes images and text jointly and is suited to tasks such as visual question answering and image caption generation.

Model Features

Multimodal Capability
Accepts an image together with a text prompt and generates text grounded in both.
Efficient Quantization
Available in several GPTQ quantization versions, so the model can fit different hardware and run at lower inference cost.
High Performance
Builds on the Llama architecture for its reasoning and text generation capabilities.
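
As a rough illustration of why quantization lowers hardware requirements, the sketch below estimates weight storage at a few bit widths. It assumes a nominal 13 billion parameters (taken from the "13B" in the model name) and illustrative widths of 16, 8, and 4 bits. It ignores quantization metadata, the vision components, activations, and the KV cache, so real memory use will be higher. The function name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes).

    Counts only the raw weights: parameters * bits / 8 bytes.
    """
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: nominal parameter count implied by "13B".
N_PARAMS = 13e9

# Assumption: illustrative bit widths; check the actual quantization
# versions offered before relying on any of these figures.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label:>6}: ~{weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

Under these assumptions the weights take about 26 GB at 16 bits and about 6.5 GB at 4 bits, which is the kind of reduction that lets a model of this size fit on a single consumer GPU.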

Model Capabilities

Image Understanding
Text Generation
Visual Question Answering
Image Caption Generation

Use Cases

Education
Visual Question Answering
Answers user questions about the content of an image, aiming for accurate and detailed responses.
Content Generation
Image Caption Generation
Generates detailed, natural-sounding textual descriptions of images.
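
Both use cases above come down to pairing an image with a text prompt. The sketch below shows one way to assemble such a prompt. It assumes a Vicuna-style chat template with an `<image>` placeholder, which is commonly used with Llava v1.5; the exact wording should be checked against the model's own documentation. The names `SYSTEM_PROMPT`, `IMAGE_TOKEN`, `build_vqa_prompt`, and `build_caption_prompt` are hypothetical helpers, and the code only builds strings; it does not load or run the model.

```python
# Assumption: Vicuna-style system prompt; verify against the model's documentation.
SYSTEM_PROMPT = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)

# Assumption: placeholder that the inference stack replaces with image features.
IMAGE_TOKEN = "<image>"


def build_vqa_prompt(question: str) -> str:
    """Build a visual question answering prompt for a single image."""
    question = question.strip()
    if not question:
        raise ValueError("question must not be empty")
    return f"{SYSTEM_PROMPT} USER: {IMAGE_TOKEN}\n{question} ASSISTANT:"


def build_caption_prompt() -> str:
    """Build an image captioning prompt; captioning is VQA with a fixed question."""
    return build_vqa_prompt("Describe this image in detail.")


print(build_vqa_prompt("How many people are in the picture?"))
```

Treating captioning as a fixed question keeps a single code path for both tasks, so only the question text changes between them.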