
QwQ 32B GGUF

Developed by Mungert
An ultra-low-bit (1-2 bit) quantization of a large language model using IQ-DynamicGate technology, supporting multilingual text generation
Downloads 5,770
Release Date: 4/4/2025

Model Overview

A quantized version of QwQ-32B (itself based on Qwen2.5-32B) that achieves ultra-low-bit (1-2 bit) quantization through dynamic precision allocation, improving model accuracy while remaining memory-efficient.

Model Features

IQ-DynamicGate Quantization Technology
Dynamic precision allocation with a layered strategy: the first and last 25% of layers use IQ4_XS, while the middle 50% use IQ2_XXS/IQ3_S, significantly reducing error propagation
Key Component Protection
Embedding and output layers use Q5_K quantization, reducing error propagation by 38% compared to standard 1-2 bit quantization
Multi-format Support
Provides BF16, F16, and various quantization formats (Q4_K, Q6_K, Q8_0, etc.) to suit different hardware requirements
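The layered allocation described above can be sketched in a few lines. This is an illustrative reconstruction, not the actual quantization code; the function name and the choice of IQ2_XXS as the default middle-layer type are assumptions for the example.

```python
def assign_quant_types(n_layers: int, mid_type: str = "IQ2_XXS") -> list[str]:
    """Layered precision allocation as described above:
    the first and last 25% of layers get IQ4_XS (higher precision),
    the middle 50% get an ultra-low-bit type (IQ2_XXS or IQ3_S)."""
    edge = n_layers // 4  # 25% of layers at each end
    return [
        "IQ4_XS" if i < edge or i >= n_layers - edge else mid_type
        for i in range(n_layers)
    ]

# For an 8-layer toy model: 2 higher-precision layers at each end,
# 4 ultra-low-bit layers in the middle.
print(assign_quant_types(8))
```

Keeping the first and last layers at higher precision protects the positions where quantization error would otherwise compound through the rest of the network, which is the stated rationale for the reduced error propagation.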

Model Capabilities

Multilingual text generation
Chat dialogue
Low-resource environment inference

Use Cases

Resource-constrained deployment
Edge device text generation
Running chatbots on memory-limited edge devices
The IQ1_M quantized version reduces perplexity by 43.9%
Research applications
Ultra-low-bit quantization research
Exploring the performance limits of 1-2 bit quantization
The IQ2_S quantized version reduces perplexity by 36.9%
© 2025 AIbase