rank1-3b: Open-Source Reranking Model for Information Retrieval - Generates Reasoning Chains to Judge Relevance Precisely


Rank1 3b

Developed by jhu-clsp

rank1-3b is an information retrieval reranking model trained from Qwen2.5-3B. It judges relevance by generating a reasoning chain.

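Category: Large language model

Library: Transformers

Language: English | License: MIT | Tags: #reasoning-chain-reranking #retrieval-augmentation #test-time-compute

Downloads: 103

Released: 3/11/2025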

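Model Overview

The model uses test-time compute: it generates an explicit reasoning chain before judging a document's relevance, which improves performance on complex retrieval tasks.

Model Features

Test-time compute reasoning

Generates an explicit reasoning chain before judging relevance, breaking a complex decision into logical steps.

Binary relevance judgment

Judges relevance with a true/false token, then converts the token probabilities into a confidence score.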
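
The conversion from the true/false tokens to a confidence score can be sketched as a two-way normalization of their log-probabilities. This is a minimal illustration rather than the official implementation; the function name `relevance_from_logprobs` is hypothetical.

```python
import math


def relevance_from_logprobs(true_logprob: float, false_logprob: float) -> float:
    """Normalize the log-probabilities of the two tokens into P(true) in [0, 1].

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    # Subtract the larger value before exponentiating for numerical stability.
    largest = max(true_logprob, false_logprob)
    true_weight = math.exp(true_logprob - largest)
    false_weight = math.exp(false_logprob - largest)
    return true_weight / (true_weight + false_weight)


# A model that puts 90% of its mass on "true" and 10% on "false"
# yields a score close to 0.9.
print(relevance_from_logprobs(math.log(0.9), math.log(0.1)))
```

Normalizing over only the two tokens keeps the score in [0, 1] even when some probability mass falls on other tokens.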

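Multiple size variants

Offers models at different scales, from 0.5 billion to 32 billion parameters.

Quantization support

Offers several quantized versions to reduce resource requirements.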

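Model Capabilities

Information retrieval reranking

Relevance reasoning

Document ranking

Query-document matching

Use Cases

Information retrieval

Search engine result reranking

Re-sorts first-stage retrieval results at a finer granularity.

Improves the relevance of search results.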
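
As a rough sketch of this use case, reranking means scoring each first-stage candidate against the query and sorting by that score. The `rerank` helper and the `overlap_score` stand-in scorer below are hypothetical; in practice the scorer would be a call to rank1 that returns its relevance score.

```python
from typing import Callable, List, Optional, Tuple


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Score every candidate with score_fn and return them best-first."""
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if top_k is None else scored[:top_k]


def overlap_score(query: str, document: str) -> float:
    """Stand-in scorer (NOT rank1): fraction of query words found in the document."""
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    return len(query_words & doc_words) / len(query_words) if query_words else 0.0


docs = [
    "Bananas are yellow.",
    "Climate change raises sea levels.",
    "The effects of climate change include heat waves.",
]
for doc, score in rerank("effects of climate change", docs, overlap_score):
    print(f"{score:.2f}  {doc}")
```

Keeping the scorer as a parameter lets the same sorting logic work with any reranker that maps a query-document pair to a number.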

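Document filtering for question answering systems

Selects the most relevant answer sources from a set of candidate documents.

Improves the accuracy of the question answering system.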

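🚀 rank1-3b: A Test-Time Compute Model for Information Retrieval Reranking

rank1 is a reasoning reranking model that "thinks" before it makes a relevance judgment. This 3-billion-parameter model is trained from the Qwen2.5-3B base model and uses test-time compute to generate a reasoning chain before deciding whether a document is relevant to a query.

🚀 Quick Start

rank1 generates an explicit reasoning chain before it makes a relevance judgment. This breaks a complex relevance decision into logical steps and improves performance across a range of retrieval tasks.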

📄 Paper | 🚀 GitHub repository

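✨ Key Features

Reasoning before judging: rank1 generates an explicit reasoning chain before it makes a relevance judgment. Unlike traditional rerankers that output a score directly, it breaks a complex relevance decision into logical steps, which helps performance across a range of retrieval tasks.
Multiple model variants: variants range from 0.5B to 32B parameters and are trained from different base models (Qwen2.5, Mistral, Llama 3.1). Quantized versions are also available.
Related resources: training data, precomputed run files, and the official GitHub repository are available for use and further development.
MTEB compatibility: the model works with the MTEB benchmark framework, which simplifies evaluation.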

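📦 Installation

See the GitHub repository for detailed installation instructions.

💻 Usage Examples

Basic Usage

Note: the official usage, which handles various edge cases, is in the GitHub repository. For simple use cases, the minimal example below is sufficient.

Minimal example using vLLM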

from vllm import LLM, SamplingParams
import math

# Initialize the model with vLLM
model = LLM(
    model="jhu-clsp/rank1-3b",
    tensor_parallel_size=1,  # Number of GPUs
    trust_remote_code=True,
    max_model_len=16000,     # Context length
    gpu_memory_utilization=0.9,
    dtype="float16",
)

# Set up sampling parameters
sampling_params = SamplingParams(
    temperature=0,
    max_tokens=8192,
    logprobs=20,
    stop=["</think> true", "</think> false"],
    skip_special_tokens=False
)

# Prepare the prompt
def create_prompt(query, document):
    return (
        "Determine if the following passage is relevant to the query. "
        "Answer only with 'true' or 'false'.\n"
        f"Query: {query}\n"
        f"Passage: {document}\n"
        "<think>"
    )

# Example usage
query = "What are the effects of climate change?"
document = "Climate change leads to rising sea levels, extreme weather events, and disruptions to ecosystems. These effects are caused by increasing greenhouse gas concentrations in the atmosphere due to human activities."

# Generate prediction
prompt = create_prompt(query, document)
outputs = model.generate([prompt], sampling_params)

# Extract the reasoning text and the log-probabilities of the final generated token
output = outputs[0].outputs[0]
text = output.text
final_logprobs = output.logprobs[-1]  # dict: token ID -> Logprob

# Get token IDs for the " true" and " false" tokens
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("jhu-clsp/rank1-3b")
true_token = tokenizer(" true", add_special_tokens=False).input_ids[0]
false_token = tokenizer(" false", add_special_tokens=False).input_ids[0]

# Calculate relevance score (probability of "true").
# A token missing from the top-20 logprobs is treated as probability 0
# instead of raising a KeyError.
true_entry = final_logprobs.get(true_token)
false_entry = final_logprobs.get(false_token)
true_score = math.exp(true_entry.logprob) if true_entry else 0.0
false_score = math.exp(false_entry.logprob) if false_entry else 0.0
total = true_score + false_score
relevance_score = true_score / total if total > 0 else 0.0

print(f"Reasoning chain: {text}")
print(f"Relevance score: {relevance_score}")

Advanced Usage

from mteb import MTEB
from rank1 import rank1  # From the official repo

# Initialize the model
model = rank1(
    model_name_or_path="jhu-clsp/rank1-3b",
    num_gpus=1,
    device="cuda"
)

# Run evaluation on specific tasks
evaluation = MTEB(tasks=["NevIR"])
results = evaluation.run(model)

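📚 Documentation

Model Description

rank1 introduces a novel approach to information retrieval: it generates an explicit reasoning chain before making a relevance judgment. Unlike traditional rerankers that output a score directly, rank1 works as follows:

1. Receives a query-document pair.
2. Generates a reasoning chain inside a <think>...</think> section.
3. Makes a binary relevance judgment (true or false).
4. Returns a confidence score based on the log-probabilities of the true/false tokens.

This approach helps the model break a complex relevance decision into logical steps, improving performance across a range of retrieval tasks.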
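
To make steps 2 and 3 concrete, the following sketch splits a complete generation into its reasoning chain and its final judgment. It assumes you have the full text, including the closing `</think>` tag and the verdict. The vLLM example earlier stops generation at the verdict, so its returned text may not include it. `ParsedOutput` and `parse_completion` are hypothetical names, not part of the official repository.

```python
from typing import NamedTuple, Optional


class ParsedOutput(NamedTuple):
    reasoning: str
    judgment: Optional[bool]  # None when no true/false verdict is found


def parse_completion(text: str) -> ParsedOutput:
    """Split '<think> ... </think> true|false' into reasoning and verdict."""
    # Drop anything up to and including an opening <think> tag, if present.
    body = text.split("<think>", 1)[-1]
    reasoning, closing_tag, tail = body.partition("</think>")
    if not closing_tag:
        # The reasoning was never closed, so there is no verdict to read.
        return ParsedOutput(body.strip(), None)
    label = tail.strip().lower()
    if label.startswith("true"):
        judgment: Optional[bool] = True
    elif label.startswith("false"):
        judgment = False
    else:
        judgment = None
    return ParsedOutput(reasoning.strip(), judgment)


print(parse_completion("<think> The passage lists effects. </think> true"))
```

Returning `None` for an unparseable verdict lets the caller decide how to handle truncated generations instead of silently treating them as irrelevant.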

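Model Family

Model	Base model	Description
rank1-0.5b	Qwen2.5-0.5B	Smallest variant (0.5B parameters)
rank1-1.5b	Qwen2.5-1.5B	Smaller variant (1.5B parameters)
rank1-3b	Qwen2.5-3B	Current model (3B parameters)
rank1-7b	Qwen2.5-7B	Larger variant (7B parameters)
rank1-14b	Qwen2.5-14B	Larger variant (14B parameters)
rank1-32b	Qwen2.5-32B	Largest variant (32B parameters)
rank1-mistral-2501-24b	Mistral-Small 2501 24B	Trained from the Mistral base model
rank1-llama3-8b	Llama 3.1 8B	Trained from the Llama 3.1 base model
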
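Quantized Variants

Model	Description
rank1-7b-awq	Quantized version of rank1-7b
rank1-14b-awq	Quantized version of rank1-14b
rank1-32b-awq	Quantized version of rank1-32b
rank1-mistral-2501-24b-awq	Quantized version of rank1-mistral-2501-24b
rank1-llama3-8b-awq	Quantized version of rank1-llama3-8b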
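
For a rough sense of what quantization saves, the sketch below estimates the memory taken by model weights alone. It rests on stated assumptions: nominal parameter counts read off the model names, 16 bits per weight for float16, and 4 bits per weight for the AWQ variants (a common AWQ setting, not confirmed by this page). It ignores the KV cache, activations, and quantization metadata, so real usage will be higher.

```python
def weight_memory_gib(num_params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB: parameters x bits / 8, weights only."""
    total_bytes = num_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / (1024 ** 3)


# Nominal sizes taken from the model names (an assumption, not measured values).
for name, billions in [("rank1-7b", 7), ("rank1-14b", 14), ("rank1-32b", 32)]:
    fp16 = weight_memory_gib(billions, 16)
    awq = weight_memory_gib(billions, 4)  # assumes 4-bit AWQ
    print(f"{name}: ~{fp16:.1f} GiB in float16, ~{awq:.1f} GiB at 4 bits")
```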

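Related Data and Resources

Resource	Description
rank1-r1-msmarco	All R1 output examples from MS MARCO
rank1-training-data	Training data used for the rank1 models
rank1-run-files	Precomputed run files for reranking the top 100 documents
GitHub repository	Official rank1 repository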
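
If you want to store your own reranked results in a run file, one common convention is the TREC run format (`qid Q0 docid rank score tag`). The sketch below assumes that format; this page does not state which format the published rank1-run-files use, and the `to_trec_run` helper and its default run tag are hypothetical.

```python
from typing import Dict, List


def to_trec_run(
    scores: Dict[str, Dict[str, float]],
    run_tag: str = "rank1-3b",
    depth: int = 100,
) -> List[str]:
    """Turn {query_id: {doc_id: score}} into TREC-style run lines, best-first."""
    lines = []
    for qid in sorted(scores):
        ranked = sorted(scores[qid].items(), key=lambda item: item[1], reverse=True)
        # Keep only the top `depth` documents per query.
        for rank, (docid, score) in enumerate(ranked[:depth], start=1):
            lines.append(f"{qid} Q0 {docid} {rank} {score:.6f} {run_tag}")
    return lines


for line in to_trec_run({"q1": {"d1": 0.2, "d2": 0.9}}):
    print(line)
```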

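🔧 Technical Details

rank1-3b performs strongly on retrieval benchmarks, especially on tasks that require complex reasoning. Its ability to "think through" a relevance decision makes it particularly effective on nuanced topics. For specific benchmark results and comparisons with other models, see the paper and the official GitHub repository.

📄 License

This project is released under the MIT License.

Citation

If you use rank1 in your research, please cite our work: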

@misc{weller2025rank1testtimecomputereranking,
      title={Rank1: Test-Time Compute for Reranking in Information Retrieval}, 
      author={Orion Weller and Kathryn Ricci and Eugene Yang and Andrew Yates and Dawn Lawrie and Benjamin Van Durme},
      year={2025},
      eprint={2502.18418},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2502.18418}, 
}