# VectorPath SearchMap: Conversational E-commerce Search Embedding Model
VectorPath SearchMap is a specialized embedding model tailored for e-commerce search. It makes search more conversational and intuitive, enabling users to find relevant products with natural language queries.
⨠Features
- Optimized for conversational e-commerce queries
- Handles complex, natural language search intents
- Supports multi-attribute product search
- Efficient 1024-dimensional embeddings, configurable up to 8192 (see the dimension-selection sketch after this list)
- Specialized for product and hotel search scenarios
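The embedding size can be chosen when the model is loaded. A minimal sketch, assuming the configurable dimensions are exposed through Matryoshka-style truncation via the `truncate_dim` argument available in the pinned sentence-transformers 2.7.0 release:

```python
from sentence_transformers import SentenceTransformer

# Assumption: the configurable dimensions are selected via Matryoshka-style
# truncation using sentence-transformers' truncate_dim argument.
model = SentenceTransformer(
    'vectopath/SearchMap_Preview',
    trust_remote_code=True,
    truncate_dim=512,  # e.g. 512, 768, 1024, 2048, 4096, 6144 or 8192
)

embedding = model.encode("wireless noise-cancelling headphones for travel")
print(embedding.shape)  # -> (512,)
```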
## Quick Start
Try out the model in our interactive Colab Demo!
## Installation

### Using Sentence Transformers
```bash
pip install -U torch==2.5.1 transformers==4.44.2 sentence-transformers==2.7.0 xformers==0.0.28.post3
```

```python
from sentence_transformers import SentenceTransformer

# trust_remote_code=True is needed to load the model's custom code
model = SentenceTransformer('vectopath/SearchMap_Preview', trust_remote_code=True)

# Encode a conversational search query
query = "A treat my dog and I can eat together"
query_embedding = model.encode(query)

# Encode a product description
product_description = "Organic peanut butter dog treats, safe for human consumption..."
product_embedding = model.encode(product_description)
```
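Relevance between the query and a product can then be scored with cosine similarity; a minimal sketch using the `util.cos_sim` helper from Sentence Transformers (variable names reuse the snippet above):

```python
from sentence_transformers import util

# Cosine similarity between the query and the product embedding
# (higher means more relevant)
score = util.cos_sim(query_embedding, product_embedding)
print(f"similarity: {score.item():.4f}")
```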
### Using with FAISS for Vector Search
```python
import numpy as np
import faiss

# A small example catalog; replace with your own product descriptions
product_descriptions = [
    "Organic peanut butter dog treats, safe for human consumption...",
    "Lightweight waterproof hiking backpack with ventilated back panel",
]

# Build a flat L2 index over the product embeddings
embedding_dimension = 1024
index = faiss.IndexFlatL2(embedding_dimension)
product_embeddings = model.encode(product_descriptions, show_progress_bar=True)
index.add(np.array(product_embeddings).astype('float32'))

# Encode the query and retrieve the top matches
query = "A treat my dog and I can eat together"
query_embedding = model.encode([query])
distances, indices = index.search(
    np.array(query_embedding).astype('float32'),
    k=10
)
```
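The returned indices point back into `product_descriptions`, so the top matches can be printed directly. A small follow-up sketch (note that FAISS pads results with -1 when `k` exceeds the number of indexed items):

```python
# Map FAISS result indices back to the catalog entries
for rank, (idx, dist) in enumerate(zip(indices[0], distances[0]), start=1):
    if idx == -1:  # padding returned when k > number of indexed products
        continue
    print(f"{rank}. {product_descriptions[idx]} (L2 distance: {dist:.4f})")
```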
## Usage Examples

### Basic Usage
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('vectopath/SearchMap_Preview', trust_remote_code=True)

# Encode a conversational query
query = "A treat my dog and I can eat together"
query_embedding = model.encode(query)
```
### Advanced Usage
```python
import numpy as np
import faiss

# Reuses `model`, `product_descriptions`, and `query` from the FAISS example above
embedding_dimension = 1024
index = faiss.IndexFlatL2(embedding_dimension)

product_embeddings = model.encode(product_descriptions, show_progress_bar=True)
index.add(np.array(product_embeddings).astype('float32'))

query_embedding = model.encode([query])
distances, indices = index.search(
    np.array(query_embedding).astype('float32'),
    k=10
)
```
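If cosine similarity is preferred over raw L2 distance, a common variation (a sketch, not part of the original example) is to L2-normalize the embeddings and use an inner-product index:

```python
import numpy as np
import faiss

# Cosine similarity via inner product over L2-normalized embeddings
cos_index = faiss.IndexFlatIP(embedding_dimension)

normalized_products = np.array(product_embeddings).astype('float32')
faiss.normalize_L2(normalized_products)  # in-place normalization
cos_index.add(normalized_products)

normalized_query = np.array(query_embedding).astype('float32')
faiss.normalize_L2(normalized_query)
scores, indices = cos_index.search(normalized_query, k=10)  # higher score = more similar
```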
### Example Search Queries
The model excels at understanding natural language queries like:
- "A treat my dog and I can eat together"
- "Lightweight waterproof hiking backpack for summer trails"
- "Eco-friendly kitchen gadgets for a small apartment"
- "Comfortable shoes for standing all day at work"
- "Cereal for my 4 year old son that likes to miss breakfast"
## Documentation

### Model Details
| Property | Details |
|---|---|
| Model Type | SearchMap, a conversational e-commerce search embedding model |
| Base Model | Stella Embed 400M v5 |
| Embedding Dimensions | Configurable (512, 768, 1024, 2048, 4096, 6144, 8192) |
| Training Data | 100,000+ e-commerce products across 32 categories |
| License | MIT |
| Framework | PyTorch / Sentence Transformers |
### Performance and Limitations

#### Evaluation
The model's evaluation metrics are available on the MTEB Leaderboard.
- The model is currently by far the best embedding model under 1B parameters and, thanks to its small memory footprint, is easy to run locally on a small GPU.
- The model is also No. 1 by a wide margin on the SemRel24STS task, scoring 81.12% versus 73.14% for the second-place Google Gemini embedding model (as of 30 March 2025). SemRel24STS evaluates a system's ability to measure the semantic relatedness between two sentences across 14 languages.
- We also noticed that the model performs exceptionally well on the legal and news retrieval and similarity tasks from the MTEB leaderboard.
#### Strengths
- Excellent at understanding conversational and natural language queries.
- Strong performance in e-commerce and hotel search scenarios.
- Handles complex multi-attribute queries.
- Efficient computation with configurable embedding dimensions.
#### Current Limitations
- May not fully prioritize weighted terms in queries.
- Limited handling of slang and colloquial language.
- Regional language variations might need fine-tuning.
### Training Details
The model was trained using:
- Supervised learning with Sentence Transformers.
- 100,000+ product dataset across 32 categories.
- AI-generated conversational search queries.
- Positive and negative product examples for contrastive learning (a training sketch follows below).
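The actual training code is not included here; purely as an illustration of contrastive training with positive and negative examples in Sentence Transformers, a sketch might look like the following (the triplet data, loss choice, and hyperparameters are assumptions, not the released recipe):

```python
from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader

# Hypothetical (query, positive product, negative product) triplets
train_examples = [
    InputExample(texts=[
        "A treat my dog and I can eat together",          # query
        "Organic peanut butter dog treats, human-grade",  # positive
        "Stainless steel dog bowl, dishwasher safe",      # negative
    ]),
]

model = SentenceTransformer('vectopath/SearchMap_Preview', trust_remote_code=True)
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
train_loss = losses.MultipleNegativesRankingLoss(model)  # contrastive objective

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    warmup_steps=100,
)
```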
### Intended Use
This model is designed for:
- E-commerce product search and recommendations.
- Hotel and accommodation search.
- Product catalog vectorization.
- Semantic similarity matching.
- Query understanding and intent detection.
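For example, query understanding and intent detection can be approximated by embedding a handful of intent descriptions and picking the closest one; a small sketch with made-up intent labels:

```python
from sentence_transformers import util

intents = {
    "gift_search": "User is looking for a gift for someone else",
    "price_sensitive": "User wants the cheapest option that meets their needs",
    "replacement_purchase": "User is replacing a product they already own",
}

query = "affordable birthday present for my sister who loves baking"
query_emb = model.encode(query)
intent_embs = model.encode(list(intents.values()))

# Pick the intent whose description is most similar to the query
scores = util.cos_sim(query_emb, intent_embs)[0]
best_intent = list(intents.keys())[int(scores.argmax())]
print(best_intent)
```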
## Technical Details
The model is fine-tuned from the Stella Embed 400M v5 base model using supervised learning with Sentence Transformers on a large e-commerce product dataset. The configurable embedding dimensions allow the compute and storage cost to be tuned for different scenarios.
## License
This model is released under the MIT License. See the LICENSE file for more details.
## Citation
If you use this model in your research, please cite:
```bibtex
@misc{vectorpath2025searchmap,
  title={SearchMap: Conversational E-commerce Search Embedding Model},
  author={VectorPath Research Team},
  year={2025},
  publisher={Hugging Face},
  journal={HuggingFace Model Hub},
}
```
## Contact and Community