QVQ-72B-Preview-GGUF Open-source Model - Supports Local Deployment and Inference, Enabling Easy and Convenient Use

QVQ 72B Preview GGUF

Developed by tensorblock

The GGUF quantized version of QVQ-72B-Preview, suitable for local deployment and inference.

Language: English | Open Source License: Other | #72B Large Parameter Model #Multi-precision Quantization Support #Low Quality-loss Recommendation

Downloads 220

Release Time: 12/26/2024

Model Overview

This is a 72B-parameter large language model, quantized and packaged in GGUF format so that it can run efficiently on local hardware.

Model Features

Multiple Quantization Options

Provides multiple quantization levels from Q2_K to Q8_0 to meet the needs of different scenarios

Efficient Local Operation

The quantized GGUF files are much smaller than the unquantized weights, which lowers the memory and storage needed for local inference

Compatibility with llama.cpp

Compatible with llama.cpp as of commit b4391, for easy integration into existing workflows
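
To make the size reduction concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a nominal parameter count of exactly 72e9, a 16-bit unquantized baseline, and that "GB" in the file table means 10^9 bytes; none of these are stated by the source, so treat the results as estimates only. The function names are hypothetical.

```python
# Rough size arithmetic. Assumptions (not from the model card):
#   - the model has exactly 72e9 parameters (nominal "72B")
#   - the unquantized baseline stores 16 bits (2 bytes) per parameter
#   - 1 GB = 1e9 bytes
PARAMS = 72e9
FP16_BYTES_PER_PARAM = 2


def bits_per_weight(file_size_gb: float, params: float = PARAMS) -> float:
    """Average number of bits stored per parameter for a file of this size."""
    return file_size_gb * 1e9 * 8 / params


def fp16_size_gb(params: float = PARAMS) -> float:
    """Approximate size of the same weights stored at 16 bits each."""
    return params * FP16_BYTES_PER_PARAM / 1e9


if __name__ == "__main__":
    # File sizes taken from the model file specification table.
    for name, size in [("Q2_K", 29.812), ("Q4_K_M", 47.416), ("Q8_0", 77.263)]:
        ratio = size / fp16_size_gb()
        print(f"{name}: ~{bits_per_weight(size):.2f} bits/weight, "
              f"{ratio:.0%} of the estimated 16-bit size")
```

The file size is only a lower bound on memory use; the context cache and runtime buffers need additional space.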

Model Capabilities

Text Generation

Dialogue System

Content Creation

Code Generation

Use Cases

Content Creation

Article Writing

Generate high-quality long articles

Dialogue System

Intelligent Assistant

Build a knowledge-rich dialogue AI

🚀 Qwen/QVQ-72B-Preview - GGUF

This repository offers GGUF format model files for Qwen/QVQ-72B-Preview, quantized with the help of TensorBlock, and compatible with llama.cpp.

🚀 Quick Start

This repo contains GGUF format model files for Qwen/QVQ-72B-Preview. The files were quantized using machines provided by TensorBlock, and they are compatible with llama.cpp as of commit b4391.

✨ Features

Our projects

Project	Description
Awesome MCP Servers	A comprehensive collection of Model Context Protocol (MCP) servers.
TensorBlock Studio	A lightweight, open, and extensible multi-LLM interaction studio.

📚 Documentation

Prompt template

<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
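
The template above can be filled in programmatically. The following is a minimal sketch; the function name `build_prompt` and the default system prompt are illustrative assumptions, not part of the model card.

```python
def build_prompt(prompt: str,
                 system_prompt: str = "You are a helpful assistant.") -> str:
    """Fill the chat template shown above.

    The default system prompt is a placeholder assumption; replace it
    with whatever your application needs.
    """
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        # The template ends with an open assistant turn for the model to complete.
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    print(build_prompt("Summarize the GGUF format in one sentence."))
```

The returned string can be passed as the raw prompt to whichever llama.cpp front end you use, provided that front end does not apply its own chat template on top.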

Model file specification

Filename	Quant type	File Size	Description
QVQ-72B-Preview-Q2_K.gguf	Q2_K	29.812 GB	smallest, significant quality loss - not recommended for most purposes
QVQ-72B-Preview-Q3_K_S.gguf	Q3_K_S	34.488 GB	very small, high quality loss
QVQ-72B-Preview-Q3_K_M.gguf	Q3_K_M	37.699 GB	very small, high quality loss
QVQ-72B-Preview-Q3_K_L.gguf	Q3_K_L	39.505 GB	small, substantial quality loss
QVQ-72B-Preview-Q4_0.gguf	Q4_0	41.232 GB	legacy; small, very high quality loss - prefer using Q3_K_M
QVQ-72B-Preview-Q4_K_S.gguf	Q4_K_S	43.889 GB	small, greater quality loss
QVQ-72B-Preview-Q4_K_M.gguf	Q4_K_M	47.416 GB	medium, balanced quality - recommended
QVQ-72B-Preview-Q5_0	Q5_0	50.164 GB	legacy; medium, balanced quality - prefer using Q4_K_M
QVQ-72B-Preview-Q5_K_S	Q5_K_S	51.375 GB	large, low quality loss - recommended
QVQ-72B-Preview-Q5_K_M	Q5_K_M	54.447 GB	large, very low quality loss - recommended
QVQ-72B-Preview-Q6_K	Q6_K	64.348 GB	very large, extremely low quality loss
QVQ-72B-Preview-Q8_0	Q8_0	77.263 GB	very large, extremely low quality loss - not recommended
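
One way to use this table is to pick the largest quantization that fits a given size budget. The sketch below copies the sizes from the table; the helper name `largest_quant_within` is hypothetical, and skipping the two types the table marks as legacy is a design choice made here rather than a rule from the source.

```python
from typing import Optional

# File sizes in GB, copied from the table above.
QUANT_SIZES_GB = {
    "Q2_K": 29.812, "Q3_K_S": 34.488, "Q3_K_M": 37.699, "Q3_K_L": 39.505,
    "Q4_0": 41.232, "Q4_K_S": 43.889, "Q4_K_M": 47.416, "Q5_0": 50.164,
    "Q5_K_S": 51.375, "Q5_K_M": 54.447, "Q6_K": 64.348, "Q8_0": 77.263,
}

# Types the table labels "legacy"; skipped by default (an assumption of this sketch).
LEGACY = ("Q4_0", "Q5_0")


def largest_quant_within(budget_gb: float,
                         sizes: dict = QUANT_SIZES_GB,
                         skip: tuple = LEGACY) -> Optional[str]:
    """Return the largest non-skipped quant type whose file fits the budget."""
    candidates = [
        (size, name) for name, size in sizes.items()
        if size <= budget_gb and name not in skip
    ]
    # max() compares by size first, so the biggest fitting file wins.
    return max(candidates)[1] if candidates else None


if __name__ == "__main__":
    for budget in (24, 48, 64, 80):
        print(budget, "GB ->", largest_quant_within(budget))
```

The budget should leave headroom beyond the file size, since inference also needs memory for the context cache and runtime buffers.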

📦 Installation

Downloading instruction

Command line

First, install the Hugging Face command-line client:

pip install -U "huggingface_hub[cli]"

Then, download an individual model file to a local directory:

huggingface-cli download tensorblock/QVQ-72B-Preview-GGUF --include "QVQ-72B-Preview-Q2_K.gguf" --local-dir MY_LOCAL_DIR

To download multiple model files that match a pattern (e.g., *Q4_K*gguf), run:

huggingface-cli download tensorblock/QVQ-72B-Preview-GGUF --local-dir MY_LOCAL_DIR --local-dir-use-symlinks False --include='*Q4_K*gguf'
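
Before starting a download of this size, it can help to preview which files a pattern would select. The sketch below uses Python's standard `fnmatch` module and assumes the `--include` option applies the same shell-style wildcard rules; the file list holds only the names that the table gives with a `.gguf` extension, and `preview_matches` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# File names copied from the model file specification table
# (only the entries listed there with a .gguf extension).
FILENAMES = [
    "QVQ-72B-Preview-Q2_K.gguf",
    "QVQ-72B-Preview-Q3_K_S.gguf",
    "QVQ-72B-Preview-Q3_K_M.gguf",
    "QVQ-72B-Preview-Q3_K_L.gguf",
    "QVQ-72B-Preview-Q4_0.gguf",
    "QVQ-72B-Preview-Q4_K_S.gguf",
    "QVQ-72B-Preview-Q4_K_M.gguf",
]


def preview_matches(pattern: str, filenames: list = FILENAMES) -> list:
    """Return the names a shell-style wildcard pattern would select.

    Assumes fnmatch-style, case-sensitive matching, which may differ
    in detail from what the download client does.
    """
    return [name for name in filenames if fnmatchcase(name, pattern)]


if __name__ == "__main__":
    for name in preview_matches("*Q4_K*gguf"):
        print(name)
```

Under these assumptions the pattern selects the Q4_K_S and Q4_K_M files but not Q4_0, so the command above would fetch roughly 91 GB in total.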

📄 License

License: other
License Name: qwen
License Link: https://huggingface.co/Qwen/QVQ-72B-Preview/blob/main/LICENSE
