Qwen3 Reranker 0.6B W4A16 G128

Developed by boboliu
A GPTQ-quantized version of Qwen3-Reranker-0.6B that uses less GPU memory (VRAM) at the cost of a small loss in precision
Downloads 151
Release Date: 6/7/2025

Model Overview

This is a GPTQ-quantized model based on Qwen/Qwen3-Reranker-0.6B, intended mainly for text classification tasks such as relevance reranking. Quantization substantially reduces GPU memory (VRAM) usage while keeping precision close to that of the original model.
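
By the usual naming convention, the "W4A16 G128" suffix denotes 4-bit weights, 16-bit activations, and a quantization group size of 128. The sketch below illustrates only the group-wise storage idea, using plain round-to-nearest quantization; it is not GPTQ itself, which additionally adjusts the remaining weights to compensate for rounding error using calibration data. The function names and the random test row are illustrative assumptions, not part of the model's tooling.

```python
import random

def quantize_group(weights, bits=4):
    """Round-to-nearest asymmetric quantization of one group of weights."""
    levels = (1 << bits) - 1               # 15 distinct steps for 4-bit codes
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels or 1.0      # fall back to 1.0 for a constant group
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo

def dequantize_group(codes, scale, lo):
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale + lo for c in codes]

def quantize_row(row, group_size=128, bits=4):
    """Quantize a weight row in independent groups (G128 means group_size=128)."""
    return [
        quantize_group(row[start:start + group_size], bits)
        for start in range(0, len(row), group_size)
    ]

# Illustrative data only: a random row standing in for real model weights.
random.seed(0)
row = [random.gauss(0.0, 0.02) for _ in range(512)]

groups = quantize_row(row)
restored = [w for g in groups for w in dequantize_group(*g)]
max_err = max(abs(a - b) for a, b in zip(row, restored))
```

Each group stores its own scale and offset, so the rounding error of any weight is bounded by half of its group's scale. Smaller groups track local weight ranges more closely but add more per-group metadata.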

Model Features

GPU Memory Optimization
GPU memory usage drops from 3228M to 2124M (without FlashAttention-2, FA2), a reduction of about 34%
Precision Preservation
The expected precision loss is below 5%. In actual testing, the precision loss of the embedding model was only about 0.7%
Efficient Quantization
Quantized with GPTQ, using Ultrachat, T2Ranking, and COIG-CQIA as the calibration set
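
The memory and precision figures above can be checked with a few lines of arithmetic. This is a minimal sketch under two assumptions: that "M" means megabytes, and that precision loss is measured as a relative drop from the unquantized baseline. The helper name and the two metric values passed to it are hypothetical placeholders, not measurements from the model.

```python
# Memory figures quoted in the model card (unit assumed to be MB).
baseline_mb = 3228
quantized_mb = 2124

saved_mb = baseline_mb - quantized_mb
saved_pct = 100 * saved_mb / baseline_mb
print(f"saved {saved_mb} MB ({saved_pct:.1f}%)")  # saved 1104 MB (34.2%)

def relative_loss_pct(baseline_metric, quantized_metric):
    """Relative drop of the quantized metric versus the baseline, in percent."""
    return 100 * (baseline_metric - quantized_metric) / baseline_metric

# Hypothetical placeholder scores, for illustration only.
loss = relative_loss_pct(0.700, 0.695)
within_budget = loss < 5.0  # the card's stated expectation is a loss below 5%
```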

Model Capabilities

Text Classification
Text Reranking

Use Cases

Information Retrieval
Search Result Reranking
Rerank the results returned by a search engine to improve relevance
Text Processing
Document Classification
Automatically classify a large number of documents
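
The two use cases above share one pattern: score each (query, document) pair, then either sort by score or apply a threshold. The sketch below shows that pattern with a pluggable scoring function. The `overlap_score` function is a toy stand-in based on word overlap, used only so the example runs without the model; in practice the score would come from the reranker. All names and the 0.5 threshold are illustrative assumptions.

```python
from typing import Callable, List, Optional, Tuple

Scorer = Callable[[str, str], float]

def overlap_score(query: str, doc: str) -> float:
    """Toy relevance score: fraction of query words that appear in the document.
    A stand-in for a real reranker score, NOT how the model works."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0

def rerank(query: str, docs: List[str], score: Scorer,
           top_k: Optional[int] = None) -> List[Tuple[str, float]]:
    """Return (document, score) pairs sorted from most to least relevant."""
    scored = [(doc, score(query, doc)) for doc in docs]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if top_k is None else scored[:top_k]

def is_relevant(query: str, doc: str, score: Scorer,
                threshold: float = 0.5) -> bool:
    """Binary classification by thresholding the relevance score."""
    return score(query, doc) >= threshold

query = "gptq quantization memory"
docs = [
    "Cooking pasta at home",
    "GPTQ quantization reduces GPU memory",
    "Quantization basics",
]
ranked = rerank(query, docs, overlap_score)
labels = [is_relevant(query, d, overlap_score) for d in docs]
```

Keeping the scorer as a parameter means the same sorting and thresholding code can be reused unchanged once a model-backed scoring function is substituted.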