---
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3-8B/blob/main/LICENSE
pipeline_tag: text-generation
base_model: Qwen/Qwen3-8B
tags:
- GGUF
---
# Qwen/Qwen3-8B - GGUF
This repo contains GGUF format model files for Qwen/Qwen3-8B.
The files were quantized using machines provided by TensorBlock, and they are compatible with llama.cpp as of commit b5214.
## Prompt template

```
<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```
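For a quick local test, the template does not need to be assembled by hand: GGUF conversions of chat models normally embed the chat template, and llama.cpp's `llama-cli` should apply it automatically in conversation mode. The following is a minimal sketch, assuming `llama-cli` was built from a llama.cpp checkout at or after commit b5214, that the Q4_K_M file sits in the current directory, and that in conversation mode the `-p` text is used as the system prompt:

```bash
# Minimal sketch: interactive chat with the Q4_K_M quant via llama.cpp.
# Adjust the binary path and .gguf filename to match your setup.
./llama-cli -m Qwen3-8B-Q4_K_M.gguf -cnv \
  -p "You are a helpful assistant." \
  -c 4096 -ngl 99
```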
## Model file specification

| Filename | Quant type | File Size | Description |
| -------- | ---------- | --------- | ----------- |
| Qwen3-8B-Q2_K.gguf | Q2_K | 3.282 GB | smallest, significant quality loss - not recommended for most purposes |
| Qwen3-8B-Q3_K_S.gguf | Q3_K_S | 3.770 GB | very small, high quality loss |
| Qwen3-8B-Q3_K_M.gguf | Q3_K_M | 4.124 GB | very small, high quality loss |
| Qwen3-8B-Q3_K_L.gguf | Q3_K_L | 4.431 GB | small, substantial quality loss |
| Qwen3-8B-Q4_0.gguf | Q4_0 | 4.775 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
| Qwen3-8B-Q4_K_S.gguf | Q4_K_S | 4.802 GB | small, greater quality loss |
| Qwen3-8B-Q4_K_M.gguf | Q4_K_M | 5.028 GB | medium, balanced quality - recommended |
| Qwen3-8B-Q5_0.gguf | Q5_0 | 5.721 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
| Qwen3-8B-Q5_K_S.gguf | Q5_K_S | 5.721 GB | large, low quality loss - recommended |
| Qwen3-8B-Q5_K_M.gguf | Q5_K_M | 5.851 GB | large, very low quality loss - recommended |
| Qwen3-8B-Q6_K.gguf | Q6_K | 6.726 GB | very large, extremely low quality loss |
| Qwen3-8B-Q8_0.gguf | Q8_0 | 8.710 GB | very large, extremely low quality loss - not recommended |
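To try one of these files end to end, the recommended Q4_K_M quant can also be exposed through llama.cpp's OpenAI-compatible `llama-server`. This is an illustrative sketch only; it assumes the server binary comes from the same llama.cpp build and that the file has already been downloaded to the current directory:

```bash
# Illustrative only: serve the model on localhost:8080.
# -c sets the context length; -ngl offloads layers to the GPU if one is present.
./llama-server -m Qwen3-8B-Q4_K_M.gguf -c 8192 -ngl 99 --port 8080
```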
## Downloading instruction

### Command line

First, install the Hugging Face CLI:

```bash
pip install -U "huggingface_hub[cli]"
```

Then, download an individual model file to a local directory:

```bash
huggingface-cli download tensorblock/Qwen_Qwen3-8B-GGUF --include "Qwen3-8B-Q2_K.gguf" --local-dir MY_LOCAL_DIR
```

If you want to download multiple model files matching a pattern (e.g., `*Q4_K*gguf`), you can try:

```bash
huggingface-cli download tensorblock/Qwen_Qwen3-8B-GGUF --local-dir MY_LOCAL_DIR --local-dir-use-symlinks False --include='*Q4_K*gguf'
```