QwQ-32B-FP8
Developed by qingcheng-ai
QwQ-32B-FP8 is the FP8-quantized version of the QwQ-32B model. It maintains nearly the same accuracy as the BF16 version while enabling faster inference.
Downloads: 144
Release Time: 3/21/2025
Model Overview
An FP8-quantized version of the QwQ-32B model, suited to efficient inference tasks, with performance comparable to the original BF16 version.
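The card does not include usage code, but a minimal inference sketch in Python is shown below. It assumes the checkpoint is published under the repo id qingcheng-ai/QwQ-32B-FP8 (not confirmed here) and loaded with an FP8-capable engine such as vLLM; both are assumptions, not part of this card.

from vllm import LLM, SamplingParams

# Repo id is an assumption based on the "Developed by qingcheng-ai" line above.
# vLLM typically reads the FP8 quantization settings from the checkpoint's config.
llm = LLM(model="qingcheng-ai/QwQ-32B-FP8")
params = SamplingParams(temperature=0.6, max_tokens=512)

outputs = llm.generate(["Explain FP8 quantization in one paragraph."], params)
print(outputs[0].outputs[0].text)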
Model Features
Efficient inference
The FP8-quantized version delivers faster inference while maintaining nearly the same accuracy as the BF16 version.
High performance
Strong performance on the MMLU benchmark, matching the score of the original BF16 version.
Lightweight
Reduces model size through FP8 quantization, making it suitable for resource-constrained environments (a rough size estimate follows below).
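As a rough illustration of the lightweight claim, the sketch below estimates weight memory for a 32B-parameter model in BF16 versus FP8. It counts only the weights and ignores activations, KV cache, and any layers kept in higher precision.

params = 32e9  # approximate parameter count of QwQ-32B

bf16_gb = params * 2 / 1e9  # BF16 stores 2 bytes per weight -> ~64 GB
fp8_gb = params * 1 / 1e9   # FP8 stores 1 byte per weight   -> ~32 GB

print(f"BF16 weights: ~{bf16_gb:.0f} GB, FP8 weights: ~{fp8_gb:.0f} GB")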
Model Capabilities
Text generation
Efficient inference
Use Cases
Natural language processing
Question answering system
Can be used to build high-performance question answering systems to handle complex queries.
Achieved a score of 61.2 on the MMLU benchmark, demonstrating excellent performance.
Text generation
Suitable for a range of text generation tasks, such as content creation and summarization.
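For the question answering and text generation use cases above, one possible integration path is an OpenAI-compatible endpoint, for example one exposed by vLLM's server. The base URL, API key, and model name below are illustrative placeholders, not values taken from this card.

from openai import OpenAI

# Endpoint and credentials are placeholders for a locally hosted, OpenAI-compatible server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="qingcheng-ai/QwQ-32B-FP8",  # assumed model/repo name
    messages=[{"role": "user", "content": "Summarize the trade-offs of FP8 quantization."}],
    temperature=0.6,
    max_tokens=512,
)
print(response.choices[0].message.content)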