
Qwen3 Embedding 8B W4A16 G128

Developed by: boboliu
A GPTQ-quantized version of Qwen3-Embedding-8B that significantly reduces VRAM requirements while maintaining high performance.
Downloads: 322
Release date: 6/6/2025

Model Overview

A 4-bit GPTQ-quantized model based on Qwen3-Embedding-8B for text embedding tasks. It significantly reduces VRAM requirements while maintaining high performance.

Model Features

VRAM optimization
VRAM usage is reduced from about 24 GB to 19,624 MB (~19.6 GB), allowing the model to run on 24 GB cards such as the RTX 3090 or 4090.
Performance retention
Only a 0.81% performance drop on the C-MTEB benchmark; the model stays at a high level after quantization.
Efficient quantization
Uses the W4A16 quantization scheme (4-bit weights, 16-bit activations) with group size 128 (G128), as sketched in the loading example below.
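
For reference, here is a minimal loading sketch using vLLM, a common runtime for W4A16 checkpoints. The repo id boboliu/Qwen3-Embedding-8B-W4A16-G128 is an assumption rather than something stated on this page, and the task="embed" / llm.embed API requires a reasonably recent vLLM release.

from vllm import LLM

# Assumed repo id; replace with the actual published checkpoint name.
MODEL_ID = "boboliu/Qwen3-Embedding-8B-W4A16-G128"

# task="embed" runs the model as an embedding (pooling) model. The 4-bit
# weights are detected from the checkpoint's quantization config, so the
# weights stay in 4-bit while activations run in 16-bit (W4A16).
llm = LLM(model=MODEL_ID, task="embed", gpu_memory_utilization=0.90)

outputs = llm.embed(["What is the capital of France?"])
vector = outputs[0].outputs.embedding   # a plain Python list of floats
print(len(vector))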

Model Capabilities

Text vectorization
Semantic similarity calculation (see the sketch after this list)
Information retrieval
Text classification
Text clustering
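
To illustrate the vectorization and similarity capabilities listed above, the following sketch embeds a sentence pair and scores it with cosine similarity. It reuses the assumed vLLM setup and repo id from the loading example; only the cosine computation is specific to this step.

import numpy as np
from vllm import LLM

llm = LLM(model="boboliu/Qwen3-Embedding-8B-W4A16-G128", task="embed")  # assumed repo id

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

pair = ["The cat sits on the mat.", "A cat is resting on a rug."]
outs = llm.embed(pair)
print(f"similarity: {cosine(outs[0].outputs.embedding, outs[1].outputs.embedding):.3f}")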

Use Cases

Information retrieval
Document search: convert queries and documents into vectors for similarity matching (a retrieval sketch follows this list). Scored 77.39 on the retrieval task.
Text classification
Multi-class classification: use the embedding vectors as features for text classification. Scored 76.85 on the classification task.
Semantic analysis
Semantic similarity calculation: compute the semantic similarity between text pairs. Scored 62.80 on the STS task.
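
Below is a hedged sketch of the document-search use case: embed one query and a few documents, then rank the documents by cosine similarity. The "Instruct: ... / Query: ..." prefix mirrors the upstream Qwen3-Embedding usage examples and may not be strictly required; the repo id is again an assumption.

import numpy as np
from vllm import LLM

llm = LLM(model="boboliu/Qwen3-Embedding-8B-W4A16-G128", task="embed")  # assumed repo id

# Queries are prefixed with a task instruction, following the upstream
# Qwen3-Embedding examples; documents are embedded as-is.
task = "Given a web search query, retrieve relevant passages that answer the query"
query = f"Instruct: {task}\nQuery: how does GPTQ quantization reduce VRAM usage?"

docs = [
    "GPTQ stores weights in 4 bits, shrinking the memory footprint of large models.",
    "The Eiffel Tower is located in Paris, France.",
    "W4A16 keeps activations in 16-bit while weights are quantized to 4-bit.",
]

embs = [np.asarray(o.outputs.embedding) for o in llm.embed([query] + docs)]
q, d = embs[0], np.stack(embs[1:])
scores = d @ q / (np.linalg.norm(d, axis=1) * np.linalg.norm(q))

# Rank documents by similarity to the query, highest first.
for i in np.argsort(-scores):
    print(f"{scores[i]:.3f}  {docs[i]}")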