Qwen3 Embedding 0.6B Onnx Uint8

Developed by electroglyph
This is an ONNX export of Qwen/Qwen3-Embedding-0.6B quantized to uint8. Quantization reduces the model size while largely preserving retrieval performance.
Downloads 112
Release Time: 6/8/2025

Model Overview

This is a text embedding model that generates vector representations of text, suitable for tasks such as information retrieval and semantic search.

Model Features

Efficient quantization
Uses uint8 quantization to significantly reduce the model size while maintaining retrieval performance.
High performance
Retrieval performance differs from the full f32 model by only about 1%.
Compatibility
Compatible with Qdrant FastEmbed, which makes it easy to deploy in environments that use that library.
Optimized quantization strategy
484 quantization-sensitive nodes are excluded from quantization, which balances model size against accuracy.
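
To make the idea of uint8 quantization concrete, here is a minimal sketch of affine quantization in plain Python. It is an illustration only and not the procedure used to produce this model: the helper names `quantize_uint8` and `dequantize_uint8` and the sample weights are hypothetical, and real toolchains such as ONNX Runtime also decide which nodes to quantize and which to leave in f32.

```python
def quantize_uint8(values):
    """Map floats to uint8 codes using an affine (scale, zero_point) scheme."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # guard against an all-zero input
    zero_point = round(-lo / scale)
    codes = [min(255, max(0, round(v / scale) + zero_point)) for v in values]
    return codes, scale, zero_point


def dequantize_uint8(codes, scale, zero_point):
    """Recover approximate floats from uint8 codes."""
    return [(c - zero_point) * scale for c in codes]


# Hypothetical f32 weights, chosen only for illustration.
weights = [-0.42, -0.1, 0.0, 0.27, 0.85]
codes, scale, zero_point = quantize_uint8(weights)
restored = dequantize_uint8(codes, scale, zero_point)

# Worst-case rounding error introduced by the round trip.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each quantized value is stored in one byte instead of the four bytes of an f32 value, at the cost of a rounding error on the order of half the scale. Nodes that are especially sensitive to this error can be kept in f32, which is the trade-off the excluded nodes above reflect.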

Model Capabilities

Text vectorization
Semantic search
Information retrieval

Use Cases

Information retrieval
Document search
Convert documents into vector representations to enable semantic document search.
Recommendation system
Content recommendation
Provide personalized recommendations based on the similarity of content vectors.
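
Both use cases above reduce to ranking items by vector similarity. The following is a minimal sketch under stated assumptions: the 3-dimensional vectors are toy stand-ins for real model output, and the names `cosine_similarity`, `rank_by_similarity`, and `mean_vector` are hypothetical helpers, not part of any library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, item_vecs):
    """Return item ids sorted from most to least similar to query_vec."""
    scores = {
        item_id: cosine_similarity(query_vec, vec)
        for item_id, vec in item_vecs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


def mean_vector(vectors):
    """Element-wise average of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Toy embeddings (hypothetical data); a real system would use model output.
docs = {
    "gpu_setup": [0.9, 0.1, 0.0],
    "pasta_recipe": [0.0, 0.2, 0.9],
    "cuda_errors": [0.8, 0.3, 0.1],
}

# Document search: rank all documents against a query embedding.
query = [1.0, 0.2, 0.0]
search_results = rank_by_similarity(query, docs)

# Recommendation: average the liked items into a profile, rank the rest.
liked = ["gpu_setup"]
profile = mean_vector([docs[item_id] for item_id in liked])
candidates = {k: v for k, v in docs.items() if k not in liked}
recommendations = rank_by_similarity(profile, candidates)
```

For large collections, a vector database would typically replace the exhaustive loop in `rank_by_similarity` with an approximate nearest-neighbor index.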