# 4-bit quantization inference

Kimi K2 Instruct 4bit
Other
Kimi-K2-Instruct-4bit is a 4-bit quantized conversion of moonshotai/Kimi-K2-Instruct for use with the MLX framework.
Large Language Model
mlx-community
1,131
4
Qwen3 30B A3B 4bit DWQ 10072025
Apache-2.0
A 4-bit quantized version of Qwen3-30B-A3B, suitable for efficient inference with the MLX framework.
Large Language Model
mlx-community
150
2
Deepseek R1 0528 AWQ
MIT
A 4-bit AWQ-quantized version of the 671B-parameter DeepSeek-R1-0528 model, suitable for use on high-end GPU nodes.
Large Language Model, Transformers
adamo1139
161
2
Gemma 3 12b It 4bit DWQ
A 4-bit quantized version of the Gemma 3 12B model, suitable for the MLX framework and supporting efficient text generation tasks.
Large Language Model
mlx-community
554
2
GLM 4 32B 0414.w4a16 Gptq
MIT
A 4-bit GPTQ-quantized version of GLM-4-32B-0414, suitable for consumer-grade hardware.
Large Language Model, Safetensors
mratsim
785
2
Google Gemma 2 27b It AWQ
A 4-bit version of Gemma 2 27B IT quantized with AutoAWQ, suitable for dialogue and instruction-following tasks.
Large Language Model, Safetensors
mbley
122
2
Qwq 32B Preview AWQ
Apache-2.0
A 4-bit AWQ-quantized version of QwQ-32B-Preview that significantly reduces memory usage and compute requirements, making it suitable for deployment on resource-constrained hardware.
Large Language Model, Transformers, English
KirillR
2,247
26
Mistral 7B Instruct V0.3 GPTQ 4bit
Apache-2.0
A 4-bit version of Mistral-7B-Instruct-v0.3 quantized with GPTQ to improve inference performance while maintaining high accuracy.
Large Language Model, Transformers
RedHatAI
9,897
19
Llama 2 7b MedQuAD
Apache-2.0
A medical Q&A model based on Llama-2-7b-chat, fine-tuned on the MedQuAD dataset.
Large Language Model
EdwardYu
27
2
Falcon 7B Instruct GPTQ
Apache-2.0
A 4-bit version of Falcon-7B-Instruct quantized with AutoGPTQ, suitable for efficient inference in resource-constrained environments.
Large Language Model, Transformers, English
TheBloke
189
67
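
Several of the descriptions above say that 4-bit quantization reduces memory usage. As a rough illustration of the scale involved, the sketch below estimates the memory needed for the weights alone. It assumes a 16-bit baseline and nominal parameter counts read off the model names (7B, 32B, 671B); the helper `weight_memory_gib` is hypothetical and is not part of any listed model's tooling. It ignores quantization scales and zero-points, the KV cache, and activations, so real footprints will be somewhat larger.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GiB (hypothetical helper)."""
    bytes_total = num_params * bits_per_weight / 8
    return bytes_total / 2**30


# Nominal parameter counts taken from the model names in the listing (assumption).
models = [
    ("Mistral-7B-Instruct-v0.3", 7e9),
    ("QwQ-32B-Preview", 32e9),
    ("DeepSeek-R1-0528", 671e9),
]

for name, params in models:
    fp16 = weight_memory_gib(params, 16)  # assumed 16-bit baseline
    int4 = weight_memory_gib(params, 4)   # 4-bit quantized weights
    print(f"{name}: ~{fp16:.0f} GiB at 16-bit, ~{int4:.0f} GiB at 4-bit")
```

Under these assumptions the weights shrink by a factor of four, which is what makes the 7B and 32B entries plausible candidates for consumer hardware while the 671B entry still calls for a multi-GPU node.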
© 2025 AIbase