Q

Qwen3 8B AWQ

Developed by Qwen
Qwen3-8B-AWQ is the latest generation of large language model with 8.2B parameters in the Tongyi Qianwen series, which uses AWQ 4-bit quantization technology to optimize inference efficiency. It supports the switching between thinking and non-thinking modes and has excellent reasoning, instruction-following, and intelligent agent capabilities.
Downloads 13.99k
Release Time : 5/3/2025

Model Overview

Based on the 4-bit quantized version of Qwen3-8B, it significantly reduces the computational resource requirements while maintaining the model performance. It supports a 32K context length and can be extended to 131K tokens through YaRN.

Model Features

Dual-mode dynamic switching
Supports seamless switching between thinking mode (complex reasoning) and non-thinking mode (efficient dialogue), which can be controlled by the enable_thinking parameter or /think, /no_think instructions.
Enhanced reasoning ability
Surpasses previous-generation models in mathematics, code generation, and logical reasoning. A special decoding strategy is used in thinking mode to improve performance.
Efficient quantization
Uses AWQ 4-bit quantization technology to reduce video memory usage by 75% while maintaining model accuracy.
Ultra-long context
Natively supports 32K tokens and can process long texts up to 131K tokens through YaRN technology.

Model Capabilities

Complex logical reasoning
Multi-round dialogue
Code generation
Multilingual translation
Tool invocation
Creative writing
Mathematical calculation

Use Cases

Intelligent assistant
Personalized dialogue
Achieve in-depth reasoning dialogue through thinking mode or conduct efficient daily communication in non-thinking mode.
A more natural interaction experience with a 40% increase in response speed.
Development assistance
Code completion
Generate high-quality code snippets using enhanced code understanding ability.
Reach the leading level among open-source models in the HumanEval benchmark test.
Data analysis
Long document processing
Analyze ultra-long technical documents or legal texts in combination with YaRN technology.
Support the understanding of 131K tokens context.
Featured Recommended AI Models
AIbase
Empowering the Future, Your AI Solution Knowledge Base
© 2025AIbase