Gemma 3 12b It Qat Int4 GGUF

Developed by unsloth
Gemma 3 is Google's family of lightweight open models, built on the same technology as Gemini. This 12B build was produced with Quantization-Aware Training (QAT), accepts multimodal input, and has a 128K-token context window.
Downloads 1,921
Release Date: 4/25/2025

Model Overview

Gemma 3 is a multimodal model capable of processing text and image inputs to generate text outputs, available in both pretrained and instruction-tuned variants. It supports over 140 languages and is suitable for tasks like Q&A, summarization, and reasoning.

Model Features

Quantization-Aware Training (QAT)
Uses QAT so the weights can be quantized to int4, reducing the memory footprint while keeping quality comparable to the bfloat16 model.
Multimodal Processing
Accepts text and image inputs. Images are processed at 896x896 resolution, and each image is encoded as 256 tokens.
Extended Context
The 12B model supports a context window of 128K tokens.
Multilingual Support
Training data covers 140+ languages, providing robust cross-lingual capabilities.
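
The figures in the feature list above lend themselves to a quick back-of-the-envelope check. The following is a minimal sketch, not an official sizing tool. It assumes "12B" means a nominal 12 billion parameters and "128K" means 128 x 1024 tokens, and it ignores quantization scales, file metadata, and the KV cache, so real GGUF file sizes and memory use will differ. The helper names weight_gib and max_images are hypothetical.

```python
# Figures taken from the feature list above.
TOKENS_PER_IMAGE = 256

# Assumption: "128K" is read as 128 * 1024 tokens.
CONTEXT_TOKENS = 128 * 1024

# Assumption: "12B" is read as a nominal 12e9 parameters.
NOMINAL_PARAMS = 12_000_000_000


def weight_gib(params: int, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB.

    Ignores quantization scales, metadata, and runtime buffers.
    """
    return params * bits_per_weight / 8 / 2**30


def max_images(context_tokens: int, text_tokens: int,
               tokens_per_image: int = TOKENS_PER_IMAGE) -> int:
    """Upper bound on images that fit after reserving text tokens."""
    remaining = max(context_tokens - text_tokens, 0)
    return remaining // tokens_per_image


if __name__ == "__main__":
    print(f"bfloat16 weights: ~{weight_gib(NOMINAL_PARAMS, 16):.1f} GiB")
    print(f"int4 weights:     ~{weight_gib(NOMINAL_PARAMS, 4):.1f} GiB")
    print("images with 2,000 text tokens reserved:",
          max_images(CONTEXT_TOKENS, 2000))
```

Under these assumptions, int4 weights take roughly a quarter of the space of bfloat16 weights, and the image count is only an upper bound because generated output also consumes context.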

Model Capabilities

Text Generation
Image Content Analysis
Multilingual Processing
Code Generation
Mathematical Reasoning
Visual Question Answering

Use Cases

Content Generation
Automatic Summarization
Generates concise summaries of long documents
Scored 78.2 on the TriviaQA benchmark (5-shot)
Creative Writing
Generates stories or poems based on prompts
Knowledge Q&A
Open-Domain Q&A
Answers various factual questions
Scored 31.4 on the Natural Questions benchmark (5-shot)
Visual Understanding
Image Captioning
Generates natural language descriptions for images
Scored 111 on the COCO Captions benchmark
Document Analysis
Parses content and structure from document images
Scored 82.3 on the DocVQA validation set