
QwQ-32B-FP8-dynamic

Developed by RedHatAI
An FP8-quantized version of QwQ-32B that reduces storage and memory requirements by roughly 50% through dynamic quantization while retaining 99.75% of the original model's accuracy
Downloads: 3,107
Release date: 3/5/2025

Model Overview

An optimized quantized model based on Qwen/QwQ-32B that applies FP8 dynamic quantization to both weights and activations, making it suitable for efficient inference deployment
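To make "FP8 dynamic quantization" concrete: the scale factor is computed from each tensor at runtime (no calibration dataset), and values are rounded to the reduced precision of the FP8 E4M3 format. The sketch below is a toy NumPy simulation of that idea, not the model's actual kernel code; it ignores E4M3 subnormals and per-channel scaling, and all names are illustrative.

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite value representable in E4M3


def quantize_fp8_dynamic(x):
    """Per-tensor dynamic quantization: the scale is derived from the
    tensor itself at call time, so no calibration pass is needed."""
    scale = np.abs(x).max() / FP8_E4M3_MAX
    scaled = x / scale
    # Simulate E4M3 rounding: keep 4 significant bits
    # (1 implicit leading bit + 3 stored mantissa bits).
    mant, exp = np.frexp(scaled)         # scaled = mant * 2**exp, |mant| in [0.5, 1)
    mant = np.round(mant * 16.0) / 16.0  # round mantissa to 4 fractional bits
    q = np.ldexp(mant, exp)
    return q, scale


def dequantize(q, scale):
    return q * scale


rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)

q, scale = quantize_fp8_dynamic(w)
w_hat = dequantize(q, scale)

# Worst-case relative rounding error for a 4-significant-bit mantissa
# is 2**-5 / 0.5 = 6.25%, so the reconstruction stays close to the original.
rel_err = np.abs(w - w_hat).max() / np.abs(w).max()
print(f"scale = {scale:.6f}, max relative error = {rel_err:.4f}")
```

Storing each element in one FP8 byte instead of two BF16 bytes is where the roughly 50% memory reduction comes from; the accuracy retention reported above reflects how little this rounding perturbs the weights.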

Model Features

FP8 Dynamic Quantization
FP8 quantization for weights and activations, reducing storage and memory requirements by approximately 50%
High Accuracy Retention
Maintains 99.75% of the original model accuracy across multiple benchmarks
vLLM Optimization Support
Optimized for the vLLM inference engine, supporting efficient deployment
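As a deployment sketch, the checkpoint can be served with vLLM's OpenAI-compatible server. This assumes a GPU host with vLLM installed and uses vLLM's defaults (port 8000, `/v1/chat/completions` endpoint); the prompt is illustrative.

```shell
# Requires a GPU host with vLLM installed (pip install vllm).
# Serve the FP8 checkpoint behind an OpenAI-compatible API:
vllm serve RedHatAI/QwQ-32B-FP8-dynamic

# Query it; the "model" field must match the served checkpoint name:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "RedHatAI/QwQ-32B-FP8-dynamic",
       "messages": [{"role": "user", "content": "Ahoy! Who are you?"}]}'
```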

Model Capabilities

Text generation
Dialogue systems
Code generation
Mathematical reasoning

Use Cases

Intelligent dialogue
Role-playing dialogue: supports dialogue generation in specific character styles; the example demonstrates pirate-style responses.
Mathematical reasoning
Mathematical problem solving: solves complex mathematical problems; achieved 97.44% accuracy on the MATH-500 benchmark.