Q

Qwen2.5 VL 7B Instruct Quantized.w4a16

Developed by RedHatAI
Quantized version of Qwen2.5-VL-7B-Instruct, supporting vision-text input and text output, with weights quantized to INT4 and activations to FP16.
Downloads 605
Release Time : 2/7/2025

Model Overview

This is a quantized model based on Qwen/Qwen2.5-VL-7B-Instruct, optimized for efficient inference and suitable for multimodal tasks.

Model Features

Efficient Quantization
Weights quantized to INT4 and activations to FP16, significantly reducing model size and inference costs
Multimodal Support
Supports vision and text input, capable of understanding and generating text related to images
vLLM Optimization
Optimized for the vLLM inference engine, supporting efficient deployment

Model Capabilities

Visual Question Answering
Image Caption Generation
Multimodal Dialogue
Document Understanding

Use Cases

Visual Question Answering
Image Content Understanding
Answer natural language questions about image content
Achieved 73.90% accuracy on the VQAv2 dataset
Document Processing
Document Question Answering
Extract information from scanned documents or PDFs and answer questions
Achieved 94.13% ANLS score on the DocVQA dataset
Featured Recommended AI Models
AIbase
Empowering the Future, Your AI Solution Knowledge Base
Š 2025AIbase