
QwQ-32B-FP8

Developed by qingcheng-ai
QwQ-32B-FP8 is the FP8-quantized version of the QwQ-32B model; it maintains nearly the same accuracy as the BF16 version while enabling faster inference.
Downloads 144
Release Time: 3/21/2025

Model Overview

The FP8-quantized version of the QwQ-32B model, suited to efficient inference workloads, with performance comparable to the original BF16 version.
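To illustrate what FP8 quantization does to individual weights, here is a minimal, self-contained sketch that rounds values to the nearest representable FP8 E4M3 number with per-tensor scaling. This mirrors the number format commonly used for FP8 checkpoints, but it is only an assumption-laden illustration: the actual quantizer, scaling granularity, and inference kernels used for QwQ-32B-FP8 are not documented here.

```python
# Enumerate every finite value representable in FP8 E4M3 (OCP spec:
# 1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits; the
# all-ones exponent + all-ones mantissa pattern encodes NaN).
E4M3_VALUES = []
for e in range(16):
    for m in range(8):
        if e == 15 and m == 7:
            continue  # NaN encoding, not a finite value
        if e == 0:
            v = (m / 8) * 2.0 ** -6           # subnormal range
        else:
            v = (1 + m / 8) * 2.0 ** (e - 7)  # normal range
        E4M3_VALUES.extend([v, -v])

def quantize_e4m3(x: float) -> float:
    """Round x to the nearest representable E4M3 value."""
    return min(E4M3_VALUES, key=lambda v: abs(v - x))

# Per-tensor scaling: map the largest weight magnitude onto the
# E4M3 maximum (448), quantize, then rescale on the way back.
# The weight values below are made up for illustration.
weights = [0.013, -0.27, 0.5, 1.9, -3.1]
scale = max(abs(w) for w in weights) / 448.0
dequantized = [quantize_e4m3(w / scale) * scale for w in weights]
```

With only a 3-bit mantissa, each weight is stored with at most a few percent of relative error, which is why accuracy stays close to the BF16 original while every weight drops from 16 bits to 8.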

Model Features

Efficient inference
The FP8-quantized version delivers faster inference while maintaining accuracy nearly identical to the BF16 version.
High performance
Strong results on the MMLU benchmark, matching the score of the original BF16 version.
Lightweight
FP8 quantization roughly halves the weight footprint relative to BF16, making the model suitable for resource-constrained environments.
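A back-of-envelope calculation makes the size reduction concrete. The parameter count below is an assumed round figure (~32.5B for QwQ-32B); real checkpoints keep some layers in higher precision and also need memory for activations and the KV cache, so actual footprints will differ.

```python
# Rough weight-memory estimate: bytes per weight times parameter count.
PARAMS = 32.5e9              # assumed approximate QwQ-32B parameter count
bf16_gb = PARAMS * 2 / 1e9   # bfloat16: 2 bytes per weight -> ~65 GB
fp8_gb = PARAMS * 1 / 1e9    # FP8: 1 byte per weight -> ~32.5 GB
print(f"BF16 weights: ~{bf16_gb:.0f} GB, FP8 weights: ~{fp8_gb:.1f} GB")
```

Halving the weight memory is what lets the model fit on fewer or smaller GPUs, and the narrower reads also raise effective memory bandwidth during inference.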

Model Capabilities

Text generation
Efficient inference

Use Cases

Natural language processing
Question answering system
Can be used to build high-performance question answering systems that handle complex queries; the model scored 61.2 on the MMLU benchmark.
Text generation
Suitable for various text generation tasks, such as content creation, summarization, etc.