Qwen3-Embedding-0.6B-W4A16-G128 Open-Source Model - Optimizes GPU memory usage with minimal performance loss

Qwen3 Embedding 0.6B W4A16 G128

Developed by boboliu

GPTQ-quantized version of Qwen3-Embedding-0.6B, optimized for lower GPU memory (VRAM) usage with minimal performance loss.

Text Embedding | Open Source | License: Apache-2.0 | #GPTQ Quantization #Multilingual Embedding #GPU Memory Optimization

Downloads 131

Release Date: 6/6/2025

Model Overview

A GPTQ-quantized model based on Qwen3-Embedding-0.6B, mainly used for text embedding and similarity calculation. Quantization reduces its GPU memory (VRAM) usage.
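
As a minimal sketch of the similarity calculation mentioned above, assuming the embeddings are already available as lists of floats: the toy vectors and the names `query`, `doc_close`, and `doc_far` are invented for illustration and are not model output.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings (real ones would come from the model).
query = [0.2, 0.9, 0.1]
doc_close = [0.25, 0.85, 0.05]
doc_far = [0.9, -0.1, 0.4]

sim_close = cosine_similarity(query, doc_close)
sim_far = cosine_similarity(query, doc_far)
print(sim_close > sim_far)  # the nearby vector scores higher
```

Scores near 1 indicate vectors pointing in almost the same direction; scores near 0 indicate unrelated directions.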

Model Features

GPU Memory Optimization

GPTQ quantization reduces GPU memory usage from 3228 MB to 2124 MB, a reduction of about 34%.
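
The suffix W4A16-G128 follows the common naming convention of 4-bit weights, 16-bit activations, and a quantization group size of 128. The sketch below illustrates the storage idea only, under that assumption: it uses plain round-to-nearest quantization of one group, not GPTQ itself, which additionally adjusts the remaining weights to compensate for rounding error. The function names and the random weights are hypothetical.

```python
import random

def quantize_group(weights, bits=4):
    """Round-to-nearest asymmetric quantization of one weight group.

    Returns integer codes plus the (scale, minimum) needed to dequantize.
    """
    levels = (1 << bits) - 1              # 15 steps for 4-bit codes
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / levels
    if scale == 0.0:                      # flat group: any scale works
        scale = 1.0
    codes = [round((w - w_min) / scale) for w in weights]
    return codes, scale, w_min

def dequantize_group(codes, scale, w_min):
    """Map integer codes back to approximate float weights."""
    return [c * scale + w_min for c in codes]

random.seed(0)
GROUP_SIZE = 128                          # the "G128" in the model name
group = [random.gauss(0.0, 0.02) for _ in range(GROUP_SIZE)]

codes, scale, w_min = quantize_group(group)
restored = dequantize_group(codes, scale, w_min)
max_error = max(abs(a - b) for a, b in zip(group, restored))

# Memory figures reported on this page, in MB.
before_mb, after_mb = 3228, 2124
saving_pct = (before_mb - after_mb) / before_mb * 100
print(f"max error {max_error:.6f} vs half step {scale / 2:.6f}")
print(f"memory saving {saving_pct:.1f}%")  # memory saving 34.2%
```

Each group of 128 weights is stored as 4-bit codes plus one scale and one offset, and the rounding error per weight stays within half a quantization step.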

Performance Balance

On C-MTEB, the Mean(Task) score drops from 66.33 (Qwen3-Embedding-0.6B) to 65.21, a relative loss of only 1.69%.
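
The 1.69% figure matches the relative drop in Mean(Task) between Qwen3-Embedding-0.6B and this model in the C-MTEB table on this page. That reading is inferred from the numbers rather than stated by the source; the arithmetic can be checked as follows:

```python
baseline_mean_task = 66.33    # Qwen3-Embedding-0.6B, Mean(Task)
quantized_mean_task = 65.21   # this model, Mean(Task)

absolute_drop = baseline_mean_task - quantized_mean_task
relative_drop_pct = absolute_drop / baseline_mean_task * 100
print(f"{absolute_drop:.2f} points, {relative_drop_pct:.2f}% relative")
# 1.12 points, 1.69% relative
```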

Efficient Inference

The quantized model is more efficient at inference, making it suitable for resource-constrained environments.

Model Capabilities

Text Embedding

Similarity Calculation

Feature Extraction

Multilingual Processing

Use Cases

Information Retrieval

Document Retrieval

Used for large-scale document similarity matching and retrieval.

Scored 69.10 on the C-MTEB retrieval task.
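
Document retrieval with embeddings usually amounts to ranking documents by similarity to a query vector. The sketch below assumes precomputed embeddings; the `corpus` dictionary, its document IDs, and its vectors are hypothetical stand-ins for real model output.

```python
import math

def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def top_k(query_vec, doc_vecs, k=2):
    """Return (doc_id, score) pairs for the k most similar documents."""
    q = normalize(query_vec)
    scored = []
    for doc_id, vec in doc_vecs.items():
        d = normalize(vec)
        scored.append((doc_id, sum(a * b for a, b in zip(q, d))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical toy corpus of three already-embedded documents.
corpus = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.1, 0.8, 0.2],
    "doc_c": [0.7, 0.3, 0.1],
}
results = top_k([1.0, 0.2, 0.0], corpus, k=2)
print([doc_id for doc_id, _ in results])  # ['doc_a', 'doc_c']
```

For large corpora, the exhaustive loop would typically be replaced by an approximate nearest-neighbor index, but the ranking principle is the same.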

Text Classification

Semantic Classification

Semantic classification based on text embeddings.

Scored 71.36 on the C-MTEB classification task.
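
One simple way to classify with embeddings is a nearest-centroid rule: average the embeddings of each class and assign a new text to the most similar centroid. This is an illustrative sketch, not the procedure C-MTEB uses to compute its score; the labels and two-dimensional vectors are hypothetical.

```python
import math

def mean_vector(vectors):
    """Component-wise average of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def fit_centroids(labeled):
    """labeled maps each class label to a list of training embeddings."""
    return {label: mean_vector(vecs) for label, vecs in labeled.items()}

def predict(centroids, vec):
    """Pick the label whose centroid is most similar to vec."""
    return max(centroids, key=lambda label: cosine(centroids[label], vec))

# Hypothetical toy training embeddings.
train = {
    "sports": [[0.9, 0.1], [0.8, 0.2]],
    "finance": [[0.1, 0.9], [0.2, 0.7]],
}
centroids = fit_centroids(train)
print(predict(centroids, [0.7, 0.3]))  # sports
```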

Clustering Analysis

Text Clustering

Text clustering analysis based on embedding vectors.

Scored 66.12 on the C-MTEB clustering task.
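
Clustering on embedding vectors can be sketched with plain k-means (Lloyd's algorithm). The example assumes precomputed embeddings and fixed initial centroids so the result is deterministic; the points are hypothetical, and this is not the C-MTEB evaluation procedure.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vector(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def kmeans(points, initial_centroids, iterations=10):
    """Plain Lloyd's algorithm; returns one cluster index per point."""
    centroids = [list(c) for c in initial_centroids]
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(len(centroids)),
                key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        # A cluster with no members keeps its previous centroid.
        for i in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:
                centroids[i] = mean_vector(members)
    return assignments

# Hypothetical toy embeddings forming two obvious groups.
points = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels = kmeans(points, initial_centroids=[points[0], points[2]])
print(labels)  # [0, 0, 1, 1]
```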

C-MTEB	Param.	Mean(Task)	Mean(Type)	Class.	Clust.	Pair Class.	Rerank.	Retr.	STS
multilingual-e5-large-instruct	0.6B	58.08	58.24	69.80	48.23	64.52	57.45	63.65	45.81
bge-multilingual-gemma2	9B	67.64	75.31	59.30	86.67	68.28	73.73	55.19	-
gte-Qwen2-1.5B-instruct	1.5B	67.12	67.79	72.53	54.61	79.5	68.21	71.86	60.05
gte-Qwen2-7B-instruct	7.6B	71.62	72.19	75.77	66.06	81.16	69.24	75.70	65.20
ritrieve_zh_v1	0.3B	72.71	73.85	76.88	66.5	85.98	72.86	76.97	63.92
Qwen3-Embedding-0.6B	0.6B	66.33	67.45	71.40	68.74	76.42	62.58	71.03	54.52
This Model	0.6B-W4A16	65.21	66.30	71.36	66.12	74.96	62.63	69.10	53.65
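
For the two Qwen3 rows, the Mean(Type) column is consistent with an unweighted average of the six task-type columns (Class., Clust., Pair Class., Rerank., Retr., STS). That interpretation is inferred from the numbers, not stated by the source; a quick check:

```python
# Per-type scores copied from the table, in column order:
# Class., Clust., Pair Class., Rerank., Retr., STS
rows = {
    "Qwen3-Embedding-0.6B": ([71.40, 68.74, 76.42, 62.58, 71.03, 54.52], "67.45"),
    "This Model": ([71.36, 66.12, 74.96, 62.63, 69.10, 53.65], "66.30"),
}

computed = {}
for name, (scores, reported) in rows.items():
    computed[name] = f"{sum(scores) / len(scores):.2f}"
    print(name, "computed", computed[name], "reported", reported)
```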
