G

Gemma 3 4b It Quantized.w4a16

Developed by RedHatAI
A quantized version based on google/gemma-3-4b-it, using INT4 weight quantization and FP16 activation quantization to optimize inference efficiency
Downloads 195
Release Time : 6/4/2025

Model Overview

A quantized version of the Gemma 3 4B instruction-tuned model, supporting visual-text input and text output, suitable for multimodal reasoning tasks

Model Features

Efficient quantization
Using INT4 weight quantization and FP16 activation quantization, significantly reducing the computational resource requirements
Multimodal support
Supporting the joint input of images and text to achieve visual-language understanding and generation
High-performance inference
Optimized through the vLLM backend to achieve efficient inference speed
High-precision maintenance
After quantization, the average performance recovery rate reaches 97.42%, and the recovery rate for visual tasks reaches 98.86%

Model Capabilities

Image content understanding
Multimodal dialogue
Visual question answering
Text generation

Use Cases

Visual content analysis
Image description generation
Analyze the input image and generate a natural language description
Achieved an accuracy of 40.11% on the MMMU validation set
Chart understanding
Parse the chart content and answer related questions
Achieved an accuracy of 49.32% on ChartQA
Intelligent dialogue
Multimodal chat assistant
Conduct natural dialogue by combining image and text input
Featured Recommended AI Models
AIbase
Empowering the Future, Your AI Solution Knowledge Base
Š 2025AIbase