Q

Qwen3 30B A3B Quantized.w4a16

Developed by RedHatAI
INT4 quantized version of Qwen3-30B-A3B, reducing disk and GPU memory requirements by 75% while maintaining high performance.
Downloads 379
Release Time : 5/6/2025

Model Overview

Quantized model based on Qwen3-30B-A3B, suitable for inference, function calling, multilingual instruction following, and translation tasks.

Model Features

Efficient weight quantization
Adopts INT4 quantization scheme, reducing disk and GPU memory requirements by 75%.
High-performance inference
Maintains performance close to the original model in multiple benchmarks, with a recovery rate of over 98%.
Multilingual support
Supports multilingual instruction following and translation tasks.
Optimized deployment
Supports efficient deployment with vLLM backend and is compatible with OpenAI services.

Model Capabilities

Text generation
Function calling
Multilingual instruction following
Translation

Use Cases

Natural language processing
Multilingual translation
Supports high-quality translation between multiple languages.
Instruction following
Capable of understanding and executing complex multilingual instructions.
Reasoning tasks
Mathematical reasoning
Excels in mathematical reasoning tasks.
Achieved 86.66 points in GSM-8K tasks
Logical reasoning
Maintains high performance in logical reasoning tasks.
Achieved 62.97 points in ARC Challenge tasks
Featured Recommended AI Models
AIbase
Empowering the Future, Your AI Solution Knowledge Base
Š 2025AIbase