
OpenSearch Neural Sparse Encoding v2 Distill

Developed by opensearch-project
The OpenSearch Neural Sparse Encoding Model v2 Distilled is an efficient learned sparse retrieval model designed for OpenSearch, capable of encoding queries and documents into 30,522-dimensional sparse vectors.
Downloads 4,964
Release Time : 7/17/2024

Model Overview

This model is primarily used for retrieval: it converts queries and documents into sparse vectors and supports sparse retrieval over Lucene inverted indexes, making it suitable for a wide range of information retrieval scenarios.
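As a rough sketch of how learned sparse encoders of this family typically produce such vectors (illustrative, not the model's exact code): a masked-LM head emits per-token logits over the vocabulary, which are activated with log(1 + ReLU) and max-pooled across token positions, yielding one sparse vector the size of the BERT WordPiece vocabulary (30,522). The synthetic logits below are stand-ins, not real model output.

```python
import numpy as np

VOCAB_SIZE = 30522  # BERT WordPiece vocabulary size

def sparse_encode(token_logits: np.ndarray) -> np.ndarray:
    """Collapse per-token vocabulary logits (seq_len x vocab) into one
    sparse vector: log(1 + ReLU(logits)), max-pooled over positions.
    Illustrative only -- real logits come from a masked-LM head."""
    weights = np.log1p(np.maximum(token_logits, 0.0))
    return weights.max(axis=0)

# Toy stand-in for masked-LM output on a 3-token input.
rng = np.random.default_rng(0)
logits = rng.normal(loc=-2.0, scale=1.0, size=(3, VOCAB_SIZE))
vec = sparse_encode(logits)
```

Because negative logits map to exactly zero, most vocabulary entries stay empty, which is what makes the vector storable in a Lucene inverted index.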

Model Features

Efficient Sparse Retrieval
Supports sparse retrieval based on Lucene inverted indexes, improving retrieval efficiency.
Distilled Optimization
The parameter count is roughly half that of the base model, while retrieval performance is maintained or improved.
Multi-dataset Training
Training data spans 14 public datasets, including MS MARCO, ELI5 question-answer pairs, and SQuAD question-answer pairs.
Semantic Relevance Matching
Even when a query and a document share no overlapping words, the model can still match them through semantic relevance.
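The matching works because both query and document vectors carry weights on expanded vocabulary terms, and relevance is scored as their dot product. A minimal sketch with hand-picked, hypothetical weights (not model output):

```python
# Hypothetical expansion weights keyed by vocabulary term (not model output).
query_vec = {"film": 1.2, "movie": 0.9, "cinema": 0.4}   # query mentions "films"
doc_vec   = {"movie": 1.5, "cinema": 0.7, "actor": 0.6}  # document mentions "movies"

def sparse_dot(q: dict[str, float], d: dict[str, float]) -> float:
    """Relevance score: dot product over the shared nonzero terms."""
    return sum(w * d[t] for t, w in q.items() if t in d)

score = sparse_dot(query_vec, doc_vec)
# "film" never appears in the document, yet the expansion terms
# "movie" and "cinema" overlap, so the score is nonzero.
```

This is the query/document expansion behavior listed under Model Capabilities: the encoder spreads weight onto semantically related terms, so surface-form overlap is not required for a match.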

Model Capabilities

Text Retrieval
Query Expansion
Document Expansion
Semantic Matching

Use Cases

Information Retrieval
Document Retrieval
Quickly retrieve relevant documents from large document collections.
Achieves an average NDCG@10 of 0.528 on the BEIR benchmark subset
Q&A System
Used for relevant passage retrieval in Q&A systems.
Achieves an NDCG@10 of 0.561 on the NQ (Natural Questions) dataset
Search Engine
OpenSearch Integration
Serves as the core component for neural sparse retrieval functionality in OpenSearch.
Supports efficient retrieval based on Lucene inverted indexes
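A minimal sketch of what that integration looks like on the OpenSearch side, assuming a deployed model and placeholder names (the field name "passage_embedding", the index layout, and the model_id are illustrative, not prescribed by this model card): sparse term weights are stored in a rank_features field, and queries go through a neural_sparse clause.

```python
# Sketch of an OpenSearch neural sparse setup; field names and the
# model_id are placeholders, not values from this model card.
index_body = {
    "mappings": {
        "properties": {
            # Sparse term weights stored as rank_features so Lucene's
            # inverted index can score them at query time.
            "passage_embedding": {"type": "rank_features"},
            "passage_text": {"type": "text"},
        }
    }
}

query_body = {
    "query": {
        "neural_sparse": {
            "passage_embedding": {
                "query_text": "what is neural sparse search",
                "model_id": "<deployed-model-id>",  # placeholder
            }
        }
    }
}
```

At query time, OpenSearch runs the deployed encoder on query_text, producing a sparse vector whose nonzero terms are matched against the indexed rank_features weights.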