Open-source LettuceDetect-large-modernbert-en-v1 model: effective hallucination detection for RAG applications, with long-context support

Lettucedect Large Modernbert En V1

Developed by KRLabsOrg

LettuceDetect is a ModernBERT-based hallucination detection model designed for RAG applications, with support for long contexts.

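Large Language Model

Transformers

Language: English

License: MIT

Tags: #long-context hallucination detection, #RAG application optimization, #token-level precision

Downloads: 438

Released: 2/10/2025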

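Model Overview

The model performs hallucination detection on context and answer pairs, identifying tokens that are not supported by the given context. It is intended for Retrieval-Augmented Generation (RAG) applications.

Model Features

Long-context support

Handles contexts of up to 8192 tokens, which suits tasks that involve detailed documents.

Token-level detection

Identifies tokens in the answer text that are not supported by the context, giving fine-grained hallucination detection.

High performance

Performs strongly on the RAGTruth dataset, outperforming prompt-based GPT-4 and fine-tuned LLAMA-2-13B.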

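Model Capabilities

Hallucination detection

Token classification

Long-context processing

Use Cases

Retrieval-Augmented Generation (RAG)

Answer verification

Checks whether a generated answer is grounded in the given context, so that hallucinated content can be caught.

Achieves an F1 score of 79.22% on the RAGTruth dataset.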

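🚀 LettuceDetect: Hallucination Detection Model

LettuceDetect is a transformer-based model for hallucination detection on context and answer pairs, designed for Retrieval-Augmented Generation (RAG) applications. The model is built on ModernBERT, which was chosen and trained specifically for its extended context support (up to 8192 tokens). This long-context capability is critical for tasks where detailed and extensive documents must be processed to determine accurately whether an answer is supported by the given context.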

Model name: lettucedect-large-modernbert-en-v1
Organization: KRLabsOrg
GitHub: https://github.com/KRLabsOrg/LettuceDetect

🚀 Quick Start

Installation

Install the 'lettucedetect' package:

pip install lettucedetect

Using the model

from lettucedetect.models.inference import HallucinationDetector

# For the transformer-based approach.
# Note: this example loads the base checkpoint. The large model described on
# this page is KRLabsOrg/lettucedect-large-modernbert-en-v1.
detector = HallucinationDetector(
    method="transformer", model_path="KRLabsOrg/lettucedect-base-modernbert-en-v1"
)

contexts = ["France is a country in Europe. The capital of France is Paris. The population of France is 67 million."]
question = "What is the capital of France? What is the population of France?"
answer = "The capital of France is Paris. The population of France is 69 million."

# Get span-level predictions indicating which parts of the answer are considered hallucinated.
predictions = detector.predict(context=contexts, question=question, answer=answer, output_format="spans")
print("Predictions:", predictions)

# Predictions: [{'start': 31, 'end': 71, 'confidence': 0.9944414496421814, 'text': ' The population of France is 69 million.'}]
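
The span dictionaries returned above carry character offsets into the answer, so they can be post-processed without the model. The following is a minimal sketch, not part of the lettucedetect API: the helper names `filter_spans` and `highlight`, the `[[ ]]` markers, and the 0.9 threshold are all illustrative assumptions. It reuses the sample prediction shown above as hard-coded input.

```python
from typing import Dict, List


def filter_spans(predictions: List[Dict], min_confidence: float = 0.5) -> List[Dict]:
    """Keep only spans whose confidence is at least min_confidence (hypothetical helper)."""
    return [p for p in predictions if p["confidence"] >= min_confidence]


def highlight(answer: str, predictions: List[Dict],
              open_mark: str = "[[", close_mark: str = "]]") -> str:
    """Wrap each predicted span in markers, using its character offsets (hypothetical helper)."""
    pieces = []
    cursor = 0
    for span in sorted(predictions, key=lambda p: p["start"]):
        start, end = span["start"], span["end"]
        if start < cursor:
            # Skip spans that overlap one already emitted.
            continue
        pieces.append(answer[cursor:start])
        pieces.append(open_mark + answer[start:end] + close_mark)
        cursor = end
    pieces.append(answer[cursor:])
    return "".join(pieces)


# Sample input copied from the quick-start output above.
answer = "The capital of France is Paris. The population of France is 69 million."
predictions = [{
    "start": 31,
    "end": 71,
    "confidence": 0.9944414496421814,
    "text": " The population of France is 69 million.",
}]

marked = highlight(answer, filter_spans(predictions, min_confidence=0.9))
print(marked)
# The capital of France is Paris.[[ The population of France is 69 million.]]
```

A threshold like this trades recall for precision; the right value depends on the application and is not specified by the source.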

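✨ Key Features

Built on the ModernBERT architecture: extended context support (up to 8192 tokens) for detailed and extensive documents.
Accurate hallucination detection: the model is trained to identify tokens in the answer text that are not supported by the given context, and results are reported as spans.
High performance: on the RAGTruth test set, the large model lettucedetect-large-v1 achieves an overall F1 score of 79.22%, outperforming prompt-based methods such as GPT-4 (63.4%) and encoder-based models such as Luna (65.4%).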

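📚 Detailed Documentation

Model details

Property	Details
Architecture	ModernBERT (large) with extended context support (up to 8192 tokens)
Task	Token classification / hallucination detection
Training dataset	RAGTruth
Language	English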

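How it works

The model is trained to identify tokens in the answer text that are not supported by the given context. At inference time, the model returns token-level predictions, which are then aggregated into spans. This lets users see exactly which parts of the answer are considered hallucinated.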
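
The aggregation step described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the function name `tokens_to_spans` is hypothetical, label 1 is assumed to mean "unsupported", tokens are assumed to come with character offsets into the answer, and taking the maximum token probability as the span confidence is one possible choice among several.

```python
from typing import Dict, List, Tuple


def tokens_to_spans(answer: str,
                    offsets: List[Tuple[int, int]],
                    labels: List[int],
                    probs: List[float]) -> List[Dict]:
    """Merge consecutive tokens labeled 1 into character-level spans (hypothetical helper)."""
    spans = []
    current = None
    for (start, end), label, prob in zip(offsets, labels, probs):
        if label == 1:
            if current is None:
                # Open a new span at the first flagged token.
                current = {"start": start, "end": end, "probs": [prob]}
            else:
                # Extend the open span to cover this token.
                current["end"] = end
                current["probs"].append(prob)
        elif current is not None:
            # A supported token closes the open span.
            spans.append(current)
            current = None
    if current is not None:
        spans.append(current)
    return [
        {
            "start": s["start"],
            "end": s["end"],
            # Assumption: span confidence is the highest token probability.
            "confidence": max(s["probs"]),
            "text": answer[s["start"]:s["end"]],
        }
        for s in spans
    ]


# Toy example with hand-written offsets, labels, and probabilities.
answer = "Paris has 69 million people."
offsets = [(0, 5), (5, 9), (9, 12), (12, 20), (20, 27), (27, 28)]
labels = [0, 0, 1, 1, 0, 0]
probs = [0.02, 0.05, 0.97, 0.91, 0.10, 0.03]

spans = tokens_to_spans(answer, offsets, labels, probs)
print(spans)
# [{'start': 9, 'end': 20, 'confidence': 0.97, 'text': ' 69 million'}]
```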

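Performance

Example-level results

We evaluated our models on the test set of the RAGTruth dataset. Our large model, lettucedetect-large-v1, achieves an overall F1 score of 79.22%, outperforming prompt-based methods (e.g., GPT-4 at 63.4%) and encoder-based models (e.g., Luna at 65.4%). It also surpasses the fine-tuned LLAMA-2-13B (78.7%) and is competitive with the state-of-the-art fine-tuned LLAMA-3-8B (83.9%). Overall, lettucedetect-large-v1 and lettucedetect-base-v1 are strong performers that are also efficient at inference time.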

[Figure: Example-level results]
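
For readers unfamiliar with example-level scoring, the sketch below shows how precision, recall, and F1 can be computed when each answer is reduced to a single binary decision. It assumes an answer counts as hallucinated if the detector returns at least one span; the helper names and the toy data are made up for illustration and do not reproduce the RAGTruth evaluation.

```python
from typing import Dict, List, Sequence


def has_hallucination(spans: List[Dict]) -> bool:
    """Assumption: an answer is flagged if the detector returned any span."""
    return len(spans) > 0


def example_level_scores(gold: Sequence[bool], pred: Sequence[bool]) -> Dict[str, float]:
    """Binary precision, recall, and F1 with 'hallucinated' as the positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy data: gold labels for five answers and hypothetical detector outputs.
gold = [True, True, False, False, True]
span_outputs = [
    [{"start": 31, "end": 71}],  # flagged, truly hallucinated
    [],                          # missed hallucination
    [],                          # correctly left unflagged
    [{"start": 0, "end": 12}],   # false alarm
    [{"start": 5, "end": 20}],   # flagged, truly hallucinated
]
pred = [has_hallucination(s) for s in span_outputs]

scores = example_level_scores(gold, pred)
print({name: round(value, 4) for name, value in scores.items()})
```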

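Span-level results

At the span level, our model achieves the best scores across all data types, significantly outperforming previous models. Note that we do not compare against models such as RAG-HAT here, since they do not report span-level evaluation.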

[Figure: Span-level results]
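
One common way to score span-level predictions is by character overlap between predicted and gold spans. The sketch below illustrates that idea only; whether it matches the evaluation procedure behind the reported span-level results is an assumption, and the function names and toy spans are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def covered_chars(spans: Iterable[Span]) -> Set[int]:
    """Return the set of character positions covered by the spans."""
    chars: Set[int] = set()
    for start, end in spans:
        chars.update(range(start, end))
    return chars


def span_level_scores(gold_spans: Iterable[Span], pred_spans: Iterable[Span]) -> Dict[str, float]:
    """Character-overlap precision, recall, and F1 (assumed metric, for illustration)."""
    gold = covered_chars(gold_spans)
    pred = covered_chars(pred_spans)
    overlap = len(gold & pred)
    precision = overlap / len(pred) if pred else 0.0
    recall = overlap / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy example: the prediction misses the first character of the gold span.
gold_spans = [(31, 71)]
pred_spans = [(32, 71)]

span_scores = span_level_scores(gold_spans, pred_spans)
print(span_scores)
```

Here every predicted character lies inside the gold span, so precision is 1.0, while recall is 39/40 because one gold character is not covered.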

📄 License

This project is released under the MIT License.

🔖 Citation

If you use the model or the tool, please cite the following paper:

@misc{Kovacs:2025,
      title={LettuceDetect: A Hallucination Detection Framework for RAG Applications}, 
      author={Ádám Kovács and Gábor Recski},
      year={2025},
      eprint={2502.17125},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.17125}, 
}