
OpenSearch Neural Sparse Encoding v1

Developed by opensearch-project
OpenSearch Neural Sparse Encoding Model v1 encodes queries and documents into 30,522-dimensional sparse vectors for efficient, high-relevance search and retrieval.
Downloads: 10.20k
Release Time: 3/7/2024

Model Overview

This is a learned sparse retrieval model that encodes queries and documents into 30,522-dimensional sparse vectors, delivering strong search relevance together with efficient retrieval. The model is trained on the MS MARCO dataset and supports learned sparse retrieval over the Lucene inverted index.
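As a rough illustration of how a learned sparse encoder of this kind can be used, the sketch below loads the model with the Hugging Face transformers library and pools masked-LM logits into a 30,522-dimensional sparse vector. The model ID and the log(1 + ReLU) max-pooling formulation are assumptions based on common SPLADE-style encoders, not confirmed details of this model's implementation.

```python
# Sketch of SPLADE-style sparse encoding with Hugging Face transformers.
# The model ID and the log(1 + ReLU) max-pooling step are assumptions for
# illustration, not confirmed details of this model's implementation.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL_ID = "opensearch-project/opensearch-neural-sparse-encoding-v1"  # assumed ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForMaskedLM.from_pretrained(MODEL_ID)
model.eval()

def encode(texts):
    """Encode a list of texts into 30,522-dimensional sparse vectors."""
    features = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**features).logits          # (batch, seq_len, vocab_size)
    # Squash logits and max-pool over the sequence, ignoring padding positions.
    relu_log = torch.log1p(torch.relu(logits))
    mask = features["attention_mask"].unsqueeze(-1)
    return torch.max(relu_log * mask, dim=1).values  # (batch, vocab_size)

vectors = encode(["What is learned sparse retrieval?"])
# Inspect the non-zero dimensions as (token, weight) pairs.
nonzero = torch.nonzero(vectors[0]).squeeze(-1)
token_weights = {tokenizer.convert_ids_to_tokens(i.item()): round(vectors[0][i].item(), 3)
                 for i in nonzero}
print(sorted(token_weights.items(), key=lambda kv: -kv[1])[:10])
```

Each non-zero entry of the resulting vector maps back to a vocabulary token, which is what allows the vector to be stored and searched through a standard inverted index.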

Model Features

Efficient sparse encoding
Encodes queries and documents into 30,522-dimensional sparse vectors. Each non-zero dimension corresponds to a token in the vocabulary, and its weight reflects the importance of that token.
Excellent relevance performance
Performs strongly across multiple datasets in the BEIR benchmark, with an average NDCG@10 of 0.524.
OpenSearch integration
Designed specifically for OpenSearch clusters and supports efficient retrieval over the Lucene inverted index (a rough integration sketch follows this list).
Zero-shot performance
Performs well on unseen datasets and can be used without fine-tuning.
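The snippet below is a rough sketch of what the OpenSearch integration can look like with the opensearch-py client: documents carry a rank_features field holding the token-to-weight map, and queries use a neural_sparse clause. The index name, field name, and model_id are placeholders, and the exact query DSL depends on the cluster's neural-search / ML Commons plugin versions and configuration.

```python
# Sketch of indexing and querying sparse vectors in an OpenSearch cluster.
# Index name, field name, and model_id are illustrative placeholders; the
# neural_sparse query requires the neural-search plugin and a deployed model.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

# A rank_features field stores the token -> weight map produced by the encoder,
# so documents can be retrieved through the Lucene inverted index.
client.indices.create(
    index="my-sparse-index",
    body={"mappings": {"properties": {
        "passage_embedding": {"type": "rank_features"},
        "text": {"type": "text"},
    }}},
)

# Query with a neural_sparse clause; OpenSearch encodes query_text with the
# deployed sparse-encoding model identified by model_id (assumed registered).
response = client.search(
    index="my-sparse-index",
    body={"query": {"neural_sparse": {"passage_embedding": {
        "query_text": "learned sparse retrieval",
        "model_id": "<deployed-model-id>",
    }}}},
)
print([hit["_source"].get("text") for hit in response["hits"]["hits"]])
```

Because the sparse dimensions are vocabulary tokens, scoring reduces to standard inverted-index lookups, which is what keeps retrieval efficient at scale.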

Model Capabilities

Text sparse encoding
Information retrieval
Query-document matching
Zero-shot transfer learning

Use Cases

Search engine
Document retrieval
Efficiently retrieve relevant documents from a large document collection.
Achieves an average NDCG@10 of 0.524 in the BEIR benchmark.
Question-answering system
Match user questions with candidate answers.
Achieves an NDCG@10 of 0.553 on the NQ dataset.
Professional domain search
Scientific literature retrieval
Retrieve relevant papers in a scientific literature database.
Achieves an NDCG@10 of 0.723 on the SciFact dataset.
Medical information retrieval
Retrieve medical-related documents and information.
Achieves an NDCG@10 of 0.771 on the TREC-COVID dataset.