Qwen3 Reranker 0.6B W4A16 G128
A GPTQ-quantized version of Qwen3-Reranker-0.6B with reduced GPU memory usage and minimal precision loss
Downloads: 151
Release date: 6/7/2025
Model Overview
This is a GPTQ-quantized model based on Qwen/Qwen3-Reranker-0.6B, used mainly for text reranking (relevance classification) tasks. Quantization significantly reduces GPU memory usage while maintaining high accuracy.
Model Features
GPU Memory Optimization
GPU memory usage is reduced from 3228 MB to 2124 MB (without FlashAttention 2), significantly improving resource efficiency
Precision Preservation
The expected precision loss is under 5%; actual tests on the embedding model show a loss of only about 0.7%
Efficient Quantization
Uses GPTQ quantization with UltraChat, T2Ranking, and COIG-CQIA as the calibration datasets
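The memory savings of the W4A16 G128 scheme come from storing weights in 4-bit groups of 128, each group carrying its own dequantization scale (and typically a zero-point) while activations stay 16-bit. A rough back-of-the-envelope sketch of the weight-storage saving; the per-group overhead figures are illustrative assumptions, not measured values:

```python
def w4a16_weight_bytes(n_params: float, group_size: int = 128) -> float:
    """Approximate weight storage for W4A16 GPTQ with per-group scales.

    Assumes 4-bit packed weights plus one fp16 scale and one 4-bit
    zero-point per group of `group_size` weights (illustrative overhead).
    """
    packed = n_params * 4 / 8             # 4 bits per weight
    n_groups = n_params / group_size
    overhead = n_groups * (2 + 0.5)       # fp16 scale + 4-bit zero-point
    return packed + overhead

n = 0.6e9  # ~0.6B parameters
fp16 = n * 2  # 16-bit baseline: 2 bytes per weight
quant = w4a16_weight_bytes(n)
print(f"fp16 weights:       {fp16 / 2**20:.0f} MiB")
print(f"W4A16 G128 weights: {quant / 2**20:.0f} MiB ({fp16 / quant:.1f}x smaller)")
```

Note that the observed runtime drop (3228 MB to 2124 MB) is smaller than this weight-only ratio, since activations, the KV cache, and framework buffers remain at 16-bit precision.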
Model Capabilities
Text Classification
Text Reranking
Use Cases
Information Retrieval
Search Result Reranking
Rerank the results returned by a search engine to improve relevance
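Reranking reduces to scoring each (query, document) pair with the model and sorting by score; the base Qwen3-Reranker models document scoring as the probability of a final "yes" vs. "no" token. The sketch below assumes that pattern and uses a hypothetical toy scorer in place of a forward pass of the quantized checkpoint; the two-way softmax itself is exact:

```python
import math

def relevance_score(logit_yes: float, logit_no: float) -> float:
    """P("yes") from the model's final-token logits for "yes" and "no"."""
    m = max(logit_yes, logit_no)  # subtract max to stabilize the softmax
    e_yes = math.exp(logit_yes - m)
    e_no = math.exp(logit_no - m)
    return e_yes / (e_yes + e_no)

def rerank(query: str, docs: list, score_fn) -> list:
    """Sort docs by descending relevance to query.

    `score_fn(query, doc)` would wrap a forward pass of the quantized
    reranker; here it is pluggable so the logic is model-agnostic.
    """
    return sorted(docs, key=lambda d: score_fn(query, d), reverse=True)

# Hypothetical stand-in for the model: scores by keyword overlap.
def toy_score(query: str, doc: str) -> float:
    overlap = len(set(query.split()) & set(doc.split()))
    return relevance_score(float(overlap), 0.0)

print(rerank("gptq quantization", ["a cooking blog", "gptq quantization guide"], toy_score))
```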
Text Processing
Document Classification
Automatically classify a large number of documents
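One way to press a reranker into document classification is to treat each candidate label's description as the query and assign the label the model scores highest for the document. The labels and scoring function below are illustrative stand-ins, not part of the model's API:

```python
def classify(doc: str, label_descriptions: dict, score_fn) -> str:
    """Return the label whose description the scorer rates most relevant to doc."""
    return max(label_descriptions,
               key=lambda lbl: score_fn(label_descriptions[lbl], doc))

# Hypothetical stand-in for a reranker call: scores by keyword overlap.
def toy_score(query: str, doc: str) -> float:
    return float(len(set(query.lower().split()) & set(doc.lower().split())))

labels = {
    "finance": "a document about banking budgets and finance",
    "sports": "a document about sports games and athletes",
}
print(classify("the athletes won three games", labels, toy_score))
```

With a real reranker behind `score_fn`, the same loop scales to large document batches by scoring each (label description, document) pair.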
Featured Recommended AI Models
Qwen2.5 VL 7B Abliterated Caption It I1 GGUF (Apache-2.0)
Quantized version of Qwen2.5-VL-7B-Abliterated-Caption-it, supporting multilingual image captioning tasks.
Image-to-Text · Transformers · Multilingual
mradermacher · Downloads: 167 · Likes: 1
Nunchaku Flux.1 Dev Colossus (Other license)
The Nunchaku-quantized version of the Colossus Project Flux model, designed to generate high-quality images from text prompts while minimizing performance loss and optimizing inference efficiency.
Image Generation · English
nunchaku-tech · Downloads: 235 · Likes: 3
Qwen2.5 VL 7B Abliterated Caption It GGUF (Apache-2.0)
A static quantized version of the Qwen2.5-VL-7B model, focused on image caption generation and supporting multiple languages.
Image-to-Text · Transformers · Multilingual
mradermacher · Downloads: 133 · Likes: 1
Olmocr 7B 0725 FP8 (Apache-2.0)
olmOCR-7B-0725-FP8 is a document OCR model based on Qwen2.5-VL-7B-Instruct, fine-tuned on the olmOCR-mix-0225 dataset and then quantized to FP8.
Image-to-Text · Transformers · English
allenai · Downloads: 881 · Likes: 3
Lucy 128k GGUF (Apache-2.0)
Lucy-128k is a model built on Qwen3-1.7B, focused on agentic web search and lightweight browsing, and can run efficiently on mobile devices.
Large Language Model · Transformers · English
Mungert · Downloads: 263 · Likes: 2
© 2025 AIbase