Redis semantic caching embedding model based on Alibaba-NLP/gte-modernbert-base
This model is a sentence-transformers model fine-tuned from Alibaba-NLP/gte-modernbert-base on the Medical dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and supports semantic textual similarity, the core operation behind semantic caching in the medical domain.
🚀 Quick Start
First, install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then, you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download the model from the Hugging Face Hub
model = SentenceTransformer("redis/langcache-embed-medical-v1")

# Run inference
sentences = [
    'Will the value of Indian rupee increase after the ban of 500 and 1000 rupee notes?',
    'What will be the implications of banning 500 and 1000 rupees currency notes on Indian economy?',
    "Are Danish Sait's prank calls fake?",
]
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 768)

# Compute pairwise cosine similarities between the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # (3, 3)
```
✨ Features
- Maps sentences and paragraphs to a 768-dimensional dense vector space.
- Suitable for semantic textual similarity in the medical domain, the core operation behind semantic caching.
📦 Installation
```bash
pip install -U sentence-transformers
```
💻 Usage Examples
Basic Usage
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("redis/langcache-embed-medical-v1")
sentences = [
    'Will the value of Indian rupee increase after the ban of 500 and 1000 rupee notes?',
    'What will be the implications of banning 500 and 1000 rupees currency notes on Indian economy?',
    "Are Danish Sait's prank calls fake?",
]
embeddings = model.encode(sentences)
print(embeddings.shape)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
```
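Semantic Cache Lookup
A minimal sketch of how these embeddings can back a semantic cache: a new query is served from the cache when its cosine similarity to a previously cached query clears a decision threshold. The in-memory cache, the example entries, and the 0.9 threshold are illustrative assumptions, not part of this card; a production deployment would typically use a vector store such as Redis.
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("redis/langcache-embed-medical-v1")

# Hypothetical cache of (query, response) pairs
cache = [
    ("What are the symptoms of type 2 diabetes?", "Common symptoms include increased thirst..."),
    ("How is hypertension diagnosed?", "Hypertension is diagnosed by repeated blood pressure readings..."),
]
cached_embeddings = model.encode([query for query, _ in cache])

def lookup(query: str, threshold: float = 0.9):
    """Return a cached response if a semantically similar query exists, else None."""
    query_embedding = model.encode([query])
    scores = model.similarity(query_embedding, cached_embeddings)[0]
    best = int(scores.argmax())
    if float(scores[best]) >= threshold:  # threshold is an illustrative assumption
        return cache[best][1]
    return None  # cache miss: call the LLM and insert the new pair

print(lookup("What symptoms does type 2 diabetes cause?"))
```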
📚 Documentation
Model Details
Model Description
| Property | Details |
|----------|---------|
| Model Type | Sentence Transformer |
| Base model | Alibaba-NLP/gte-modernbert-base |
| Maximum Sequence Length | 8192 tokens |
| Output Dimensionality | 768 dimensions |
| Similarity Function | Cosine Similarity |
| Training Dataset | Medical |
Model Sources
- Documentation: https://www.sbert.net
- Repository: https://github.com/UKPLab/sentence-transformers
- Hugging Face: https://huggingface.co/models?library=sentence-transformers
Full Model Architecture
```text
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
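The Pooling module uses the CLS token ('pooling_mode_cls_token': True) rather than mean pooling. The properties listed above can be confirmed on a loaded model with standard sentence-transformers accessors:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("redis/langcache-embed-medical-v1")
print(model.get_sentence_embedding_dimension())  # 768
print(model.max_seq_length)                      # 8192
print(model.similarity_fn_name)                  # cosine
```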
Binary Classification
| Metric | Value |
|--------|-------|
| cosine_accuracy | 0.92 |
| cosine_f1 | 0.93 |
| cosine_precision | 0.92 |
| cosine_recall | 0.93 |
| cosine_ap | 0.97 |
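These metrics treat similarity as a binary decision: a question pair counts as a match when its cosine similarity exceeds a decision threshold. The sketch below shows how such numbers are computed with scikit-learn; the scores, labels, and 0.8 threshold are hypothetical placeholders, not the card's evaluation data.
```python
import numpy as np
from sklearn.metrics import (accuracy_score, average_precision_score, f1_score,
                             precision_score, recall_score)

# Hypothetical cosine similarity scores and gold labels for question pairs
scores = np.array([0.95, 0.40, 0.88, 0.30, 0.75])
labels = np.array([1, 0, 1, 0, 1])

threshold = 0.8  # illustrative; the card does not state the threshold used
preds = (scores >= threshold).astype(int)

print(accuracy_score(labels, preds))            # cosine_accuracy
print(f1_score(labels, preds))                  # cosine_f1
print(precision_score(labels, preds))           # cosine_precision
print(recall_score(labels, preds))              # cosine_recall
print(average_precision_score(labels, scores))  # cosine_ap (uses raw scores)
```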
Training Dataset
Medical
- Dataset: Medical dataset
- Size: 2,438 samples
- Columns: `question_1`, `question_2`, and `label`
Evaluation Dataset
Medical
- Dataset: Medical dataset
- Size: 610 samples
- Columns: `question_1`, `question_2`, and `label`
🔧 Technical Details
The model is fine-tuned from Alibaba-NLP/gte-modernbert-base on the Medical dataset. It uses a Sentence Transformer architecture with CLS-token pooling to map text into a 768-dimensional vector space for semantic similarity tasks.
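As a rough illustration of this setup, the sketch below fine-tunes the base model on labeled question pairs with the sentence-transformers v3 trainer. The card does not state the training loss or hyperparameters; ContrastiveLoss and the tiny inline dataset are assumptions for illustration only.
```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import ContrastiveLoss

# Start from the base model named in this card
model = SentenceTransformer("Alibaba-NLP/gte-modernbert-base")

# Hypothetical stand-in for the Medical dataset's question_1/question_2/label columns
train_dataset = Dataset.from_dict({
    "question_1": ["What causes migraines?", "Is aspirin safe during pregnancy?"],
    "question_2": ["Why do migraines happen?", "What is the capital of France?"],
    "label": [1, 0],  # 1 = semantically equivalent, 0 = not
})

# ContrastiveLoss is an assumed choice; the card does not specify the loss used
loss = ContrastiveLoss(model)

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()
```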
📄 License
No license information is provided for this model.
📚 Citation
BibTeX
Redis Langcache-embed Models
```bibtex
@inproceedings{langcache-embed-v1,
    title = "Advancing Semantic Caching for LLMs with Domain-Specific Embeddings and Synthetic Data",
    author = "Gill, Waris and Cechmanek, Justin and Hutcherson, Tyler and Rajamohan, Srijith and Agarwal, Jen and Gulzar, Muhammad Ali and Singh, Manvinder and Dion, Benoit",
    month = "04",
    year = "2025",
    url = "https://arxiv.org/abs/2504.02268",
}
```
Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```