Redis semantic caching embedding model based on Alibaba-NLP/gte-modernbert-base
This model is a sentence-transformers model fine-tuned from Alibaba-NLP/gte-modernbert-base on the Quora dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity in semantic caching applications.
🚀 Quick Start
First, install the Sentence Transformers library. Then you can load this model and run inference.
Installation
```bash
pip install -U sentence-transformers
```
Usage
```python
from sentence_transformers import SentenceTransformer

# Download the model from the Hugging Face Hub
model = SentenceTransformer("redis/langcache-embed-v1")

# Run inference
sentences = [
    'Will the value of Indian rupee increase after the ban of 500 and 1000 rupee notes?',
    'What will be the implications of banning 500 and 1000 rupees currency notes on Indian economy?',
    "Are Danish Sait's prank calls fake?",
]
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 768)

# Compute pairwise cosine similarities between the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # (3, 3)
```
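These similarity scores are what drive a semantic cache: a new query is compared against previously answered queries, and the stored response is reused when the score clears a threshold. Below is a minimal sketch of that pattern; the in-memory `cache` dict, the `lookup` helper, and the `0.9` threshold are illustrative assumptions, not part of the model's or Redis LangCache's API.

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("redis/langcache-embed-v1")

# Hypothetical in-memory cache: previously seen query -> stored LLM response.
cache = {
    "What will banning 500 and 1000 rupee notes do to the Indian economy?": "...cached answer...",
}
cached_queries = list(cache.keys())
cached_embeddings = model.encode(cached_queries)

def lookup(query: str, threshold: float = 0.9):
    """Return a cached response for a semantically similar query, else None.

    The 0.9 threshold is an illustrative value, not one recommended by this card.
    """
    query_embedding = model.encode([query])
    scores = model.similarity(query_embedding, cached_embeddings)[0]
    best = int(scores.argmax())
    if float(scores[best]) >= threshold:
        return cache[cached_queries[best]]
    return None

print(lookup("How will the 500/1000 rupee note ban affect India's economy?"))
```

In a production setting the cached embeddings would live in a vector index (e.g., Redis) rather than a Python dict, but the lookup logic is the same.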
✨ Features
- Maps sentences and paragraphs to a 768-dimensional dense vector space.
- Can be used for semantic textual similarity in semantic caching applications.
📚 Documentation
Model Details
Model Description
| Property | Details |
|----------|---------|
| Model Type | Sentence Transformer |
| Base Model | Alibaba-NLP/gte-modernbert-base |
| Maximum Sequence Length | 8192 tokens |
| Output Dimensionality | 768 dimensions |
| Similarity Function | Cosine Similarity |
| Training Dataset | Quora |
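The sequence length and output dimensionality in the table can be read back from the loaded model; `get_max_seq_length` and `get_sentence_embedding_dimension` are standard Sentence Transformers accessors.

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("redis/langcache-embed-v1")
print(model.get_max_seq_length())                # 8192
print(model.get_sentence_embedding_dimension())  # 768
```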
Full Model Architecture
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
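In this configuration only `pooling_mode_cls_token` is enabled, so the sentence embedding is simply the hidden state of the first ([CLS]) token. As a rough sketch (not the official usage path), the same vector can be recovered with the plain `transformers` API; exact numerical parity with the pipeline above is not guaranteed.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumes the repository exposes the underlying ModernBERT weights in the
# standard transformers layout, as Sentence Transformers models normally do.
tokenizer = AutoTokenizer.from_pretrained("redis/langcache-embed-v1")
bert = AutoModel.from_pretrained("redis/langcache-embed-v1")

inputs = tokenizer("Are Danish Sait's prank calls fake?", return_tensors="pt")
with torch.no_grad():
    hidden = bert(**inputs).last_hidden_state  # (batch, seq_len, 768)

# CLS pooling: take the first token's hidden state as the sentence embedding.
embedding = hidden[:, 0]
print(embedding.shape)  # torch.Size([1, 768])
```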
Binary Classification
| Metric | Value |
|--------|-------|
| cosine_accuracy | 0.90 |
| cosine_f1 | 0.87 |
| cosine_precision | 0.84 |
| cosine_recall | 0.90 |
| cosine_ap | 0.92 |
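Metrics of this kind come from scoring labeled question pairs by cosine similarity and then thresholding. A hedged sketch of that computation with scikit-learn follows; the two toy pairs are made up to show the mechanics, while the numbers reported above come from the Quora evaluation split.

```python
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.metrics import average_precision_score, f1_score

model = SentenceTransformer("redis/langcache-embed-v1")

# Toy labeled pairs (1 = duplicate, 0 = not); purely illustrative.
pairs = [
    ("How do I learn Python?", "What is the best way to learn Python?", 1),
    ("How do I learn Python?", "Are Danish Sait's prank calls fake?", 0),
]
a = model.encode([p[0] for p in pairs])
b = model.encode([p[1] for p in pairs])
labels = np.array([p[2] for p in pairs])

scores = model.similarity_pairwise(a, b).numpy()  # cosine score per pair
print(average_precision_score(labels, scores))    # analogue of cosine_ap
print(f1_score(labels, scores >= 0.8))            # F1 at an illustrative 0.8 threshold
```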
Training Dataset
Quora
- Dataset: Quora
- Size: 323,491 training samples
- Columns: `question_1`, `question_2`, and `label`
Evaluation Dataset
Quora
- Dataset: Quora
- Size: 53,486 evaluation samples
- Columns: `question_1`, `question_2`, and `label`
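For experimentation with data in the same three-column layout, the Quora duplicate-question pairs are publicly available. The snippet below assumes the `sentence-transformers/quora-duplicates` dataset on the Hugging Face Hub, whose column names (`sentence1`, `sentence2`, `label`) differ from the `question_1`/`question_2` naming used here; it is not necessarily the exact copy used for training.

```python
from datasets import load_dataset

# Assumed source: the public Quora duplicate-question pairs on the Hugging Face Hub.
dataset = load_dataset("sentence-transformers/quora-duplicates", "pair-class", split="train")
print(dataset.column_names)  # e.g. ['sentence1', 'sentence2', 'label']
print(dataset[0])
```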
📄 License
No license information is provided for this model.
📖 Citation
BibTeX
Redis Langcache-embed Models

```bibtex
@misc{langcache-embed-v1,
    title = "Advancing Semantic Caching for LLMs with Domain-Specific Embeddings and Synthetic Data",
    author = "Gill and Cechmanek and Hutcherson and Rajamohan and Agarwal and Gulzar and Singh and Dion",
    month = "04",
    year = "2025",
    url = "https://arxiv.org/abs/2504.02268",
}
```
Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```