🚀 LettuceDetect: Hallucination Detection Model
LettuceDetect is a transformer-based model designed for hallucination detection in Retrieval-Augmented Generation (RAG) applications, leveraging the extended context support of ModernBERT.
Model Name: lettucedect-large-modernbert-en-v1
Organization: KRLabsOrg
GitHub: https://github.com/KRLabsOrg/LettuceDetect
🚀 Quick Start
LettuceDetect detects hallucinations in context-answer pairs. It is built on ModernBERT, which supports inputs of up to 8192 tokens, making it suitable for processing long and detailed documents.
✨ Features
- Extended Context Support: Built on ModernBERT, the model handles inputs of up to 8192 tokens, crucial for tasks requiring in-depth document processing.
- Accurate Detection: Trained to identify hallucinated tokens in answers, providing span-level predictions.
- High Performance: Outperforms many existing models in both example-level and span-level evaluations.
📦 Installation
Install the `lettucedetect` package:

```bash
pip install lettucedetect
```
💻 Usage Examples
Basic Usage
```python
from lettucedetect.models.inference import HallucinationDetector

detector = HallucinationDetector(
    method="transformer", model_path="KRLabsOrg/lettucedect-base-modernbert-en-v1"
)

contexts = ["France is a country in Europe. The capital of France is Paris. The population of France is 67 million."]
question = "What is the capital of France? What is the population of France?"
answer = "The capital of France is Paris. The population of France is 69 million."

predictions = detector.predict(context=contexts, question=question, answer=answer, output_format="spans")
print("Predictions:", predictions)
```
📚 Documentation
Model Details
| Property | Details |
|----------|---------|
| Model Type | ModernBERT (Large) with extended context support (up to 8192 tokens) |
| Task | Token Classification / Hallucination Detection |
| Training Data | RAGTruth |
| Language | English |
How It Works
The model is trained to identify tokens in the answer text that are not supported by the given context. During inference, it returns token-level predictions, which are then aggregated into spans, allowing users to see exactly which parts of the answer are considered hallucinated.
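To illustrate the aggregation step, here is a minimal sketch of how token-level labels could be merged into character spans. The function name, the 0/1 label convention, and the offset format are illustrative assumptions, not LettuceDetect's internal API:

```python
def merge_token_labels_into_spans(labels, offsets, text):
    """Merge consecutive hallucinated tokens (label == 1) into character spans.

    labels  -- one 0/1 label per token (1 = hallucinated); assumed convention
    offsets -- (start_char, end_char) of each token within `text`
    """
    spans = []
    current = None
    for label, (start, end) in zip(labels, offsets):
        if label == 1:
            if current is None:
                current = [start, end]  # open a new span
            else:
                current[1] = end        # extend the running span
        elif current is not None:
            spans.append({"start": current[0], "end": current[1],
                          "text": text[current[0]:current[1]]})
            current = None
    if current is not None:             # flush a span that runs to the end
        spans.append({"start": current[0], "end": current[1],
                      "text": text[current[0]:current[1]]})
    return spans
```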
Performance
Example-level results
We evaluate our model on the test set of the RAGTruth dataset. Our large model, lettucedetect-large-v1, achieves an overall F1 score of 79.22%, outperforming prompt-based methods like GPT-4 (63.4%) and encoder-based models like Luna (65.4%). It also surpasses fine-tuned LLAMA-2-13B (78.7%, presented in RAGTruth) and is competitive with the SOTA fine-tuned LLAMA-3-8B (83.9%, presented in the RAG-HAT paper).
Span-level results
At the span level, our model achieves the best scores across all data types, significantly outperforming previous models. Note that we do not compare against models such as RAG-HAT here, since they report no span-level evaluation.
📄 License
This project is licensed under the MIT license.
📖 Citing
If you use the model or the tool, please cite the following paper:
```bibtex
@misc{Kovacs:2025,
    title={LettuceDetect: A Hallucination Detection Framework for RAG Applications},
    author={Ádám Kovács and Gábor Recski},
    year={2025},
    eprint={2502.17125},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2502.17125},
}
```