🚀 Redis Semantic-Cache Embedding Model Based on Alibaba-NLP/gte-modernbert-base
This is a sentence-transformers model fine-tuned from Alibaba-NLP/gte-modernbert-base on the Quora dataset. It maps sentences and paragraphs into a 768-dimensional dense vector space and can be used for semantic textual similarity, enabling semantic caching.
🚀 Quick Start
First, install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load the model and run inference:
```python
from sentence_transformers import SentenceTransformer

# Download the model from the Hugging Face Hub
model = SentenceTransformer("redis/langcache-embed-v1")

sentences = [
    'Will the value of Indian rupee increase after the ban of 500 and 1000 rupee notes?',
    'What will be the implications of banning 500 and 1000 rupees currency notes on Indian economy?',
    "Are Danish Sait's prank calls fake?",
]

# Encode the sentences into 768-dimensional embeddings
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 768)

# Compute pairwise similarity scores between the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # (3, 3)
```
✨ Key Features
- Mapping: maps sentences and paragraphs into a 768-dimensional dense vector space.
- Use case: semantic textual similarity computation for semantic caching (see the sketch after this list).
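To make the semantic-caching use case concrete, below is a minimal sketch of a cache lookup built on this model: a new query is answered from the cache when its cosine similarity to a previously cached query exceeds a threshold. The `SemanticCache` class, the 0.9 threshold, and the in-memory store are illustrative assumptions for this sketch, not the Redis LangCache API.

```python
import torch
from sentence_transformers import SentenceTransformer

class SemanticCache:
    """Toy in-memory semantic cache (illustrative sketch, not the LangCache API)."""

    def __init__(self, model_name: str = "redis/langcache-embed-v1", threshold: float = 0.9):
        self.model = SentenceTransformer(model_name)
        self.threshold = threshold  # assumed cutoff; tune it on your own traffic
        self.embeddings = None      # (n, 768) tensor of cached query embeddings
        self.responses = []         # cached responses, aligned with the embeddings

    def put(self, query: str, response: str) -> None:
        # Embed the query and append it to the cache.
        emb = self.model.encode([query], convert_to_tensor=True)
        self.embeddings = emb if self.embeddings is None else torch.cat([self.embeddings, emb])
        self.responses.append(response)

    def get(self, query: str):
        # Return the cached response of the most similar query, or None on a miss.
        if self.embeddings is None:
            return None
        q_emb = self.model.encode([query], convert_to_tensor=True)
        scores = self.model.similarity(q_emb, self.embeddings)[0]  # cosine scores
        best = int(scores.argmax())
        return self.responses[best] if float(scores[best]) >= self.threshold else None

cache = SemanticCache()
cache.put("What is the capital of France?", "Paris is the capital of France.")
print(cache.get("Which city is the capital of France?"))  # likely a cache hit
print(cache.get("How do I bake sourdough bread?"))         # likely None (cache miss)
```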
📦 Installation
Install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```
💻 Usage Examples
Basic Usage
```python
from sentence_transformers import SentenceTransformer

# Download the model from the Hugging Face Hub
model = SentenceTransformer("redis/langcache-embed-v1")

sentences = [
    'Will the value of Indian rupee increase after the ban of 500 and 1000 rupee notes?',
    'What will be the implications of banning 500 and 1000 rupees currency notes on Indian economy?',
    "Are Danish Sait's prank calls fake?",
]

# Encode the sentences into 768-dimensional embeddings
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 768)

# Compute pairwise similarity scores between the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # (3, 3)
```
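To inspect the actual scores rather than just the shape, print the matrix itself (continuing from the snippet above):

```python
# The matrix is symmetric with a diagonal of ~1.0 (each sentence vs. itself).
# The two rupee-ban questions should score much higher against each other
# than either does against the unrelated prank-call question.
print(similarities)
```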
📚 Documentation
Model Details
Full Model Architecture
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
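As the architecture shows, the model uses CLS-token pooling and supports sequences of up to 8,192 tokens. A quick way to confirm these properties from Python (a minimal sketch using the public sentence-transformers attributes):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("redis/langcache-embed-v1")
print(model.max_seq_length)                      # 8192
print(model.get_sentence_embedding_dimension())  # 768
print(model)                                     # prints the module list shown above
```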
Binary Classification

| Metric | Value |
|--------|-------|
| Cosine Accuracy | 0.90 |
| Cosine F1 | 0.87 |
| Cosine Precision | 0.84 |
| Cosine Recall | 0.90 |
| Cosine Average Precision | 0.92 |
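These figures treat duplicate detection as binary classification over cosine similarity at a tuned threshold. As a sketch of how such metrics can be computed with the library's built-in `BinaryClassificationEvaluator` (the example pairs below are placeholders; in practice you would pass the Quora evaluation split):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import BinaryClassificationEvaluator

model = SentenceTransformer("redis/langcache-embed-v1")

# Placeholder pairs; substitute the question_1/question_2/label columns
# of the Quora evaluation split for a real measurement.
sentences1 = ["How do I learn Python quickly?", "What is the capital of France?"]
sentences2 = ["What is the fastest way to learn Python?", "Who won the 2014 World Cup?"]
labels = [1, 0]  # 1 = duplicate pair, 0 = non-duplicate pair

evaluator = BinaryClassificationEvaluator(sentences1, sentences2, labels, name="quora-eval")
print(evaluator(model))  # accuracy, F1, precision, recall, average precision
```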
Training Dataset
Quora
- Dataset: Quora
- Size: 323,491 training samples
- Columns: question_1, question_2, and label
Evaluation Dataset
Quora
- Dataset: Quora
- Size: 53,486 evaluation samples
- Columns: question_1, question_2, and label
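The card only names the dataset as "Quora"; copies of the Quora duplicate-questions pairs are hosted on the Hugging Face Hub. One assumed way to load such a copy (the repository name, subset, and its sentence1/sentence2 column names are assumptions and differ from the question_1/question_2 naming above):

```python
from datasets import load_dataset

# Assumed dataset location; the card itself only says "Quora".
dataset = load_dataset("sentence-transformers/quora-duplicates", "pair-class", split="train")
print(dataset[0])  # e.g. {'sentence1': '...', 'sentence2': '...', 'label': 0 or 1}
```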
🔧 Technical Details
This model builds on the Sentence Transformers framework and fine-tunes Alibaba-NLP/gte-modernbert-base on the Quora dataset. It maps sentences and paragraphs into a 768-dimensional dense vector space and scores semantic textual similarity with cosine similarity, which is what enables semantic caching.
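For reference, the cosine similarity that `model.similarity` computes by default is the dot product of two embeddings normalized by their L2 norms. A minimal NumPy equivalent:

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    # Dot product normalized by both vectors' L2 norms; the result lies in [-1, 1].
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
```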
📄 License
The documentation does not mention any license information.
📖 Citation
BibTeX
Redis Langcache-embed Model
```bibtex
@inproceedings{langcache-embed-v1,
  title = "Advancing Semantic Caching for LLMs with Domain-Specific Embeddings and Synthetic Data",
  author = "Gill and Cechmanek and Hutcherson and Rajamohan and Agarwal and Gulzar and Singh and Dion",
  month = "04",
  year = "2025",
  url = "https://arxiv.org/abs/2504.02268",
}
```
Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
  title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
  author = "Reimers, Nils and Gurevych, Iryna",
  booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
  month = "11",
  year = "2019",
  publisher = "Association for Computational Linguistics",
  url = "https://arxiv.org/abs/1908.10084",
}
```