rank1-14b: an open-source reasoning reranker that improves performance on information retrieval tasks

Rank1 14b

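Developed by jhu-clsp

rank1 is a 14-billion-parameter reasoning reranker. It generates an explicit reasoning chain before judging relevance, which improves performance on information retrieval tasks.

Large language model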

Transformers

Language: English | License: MIT | Tags: #reasoning-chain-reranking #test-time-compute #information-retrieval-optimization

Downloads: 23

Released: 2/18/2025

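Model overview

The model is trained from the Qwen2.5-14B base model and is designed for reranking in information retrieval. Unlike traditional approaches, it generates a reasoning chain before making a relevance judgment, which improves accuracy on complex retrieval tasks.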

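Model features

Test-time compute

Generates a reasoning chain before judging document relevance, making the decision process more transparent and interpretable

Multiple model sizes

Available in sizes from 0.5 billion to 32 billion parameters, to suit different compute budgets

Quantization support

AWQ-quantized versions are available to reduce deployment resource requirements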

Model capabilities

Information retrieval

Document reranking

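Use cases

Search engine optimization

Search result reranking

Reranks the top 100 results returned by a search engine so that the most relevant results move up

Compared with traditional methods, it can identify subtle relevance more accurately

Question answering systems

Answer candidate ranking

Ranks candidate answers by relevance in a question answering system

Reasoning-chain analysis helps push incorrect answers lower in the ranking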

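🚀 rank1-14b: Test-Time Compute for Reranking in Information Retrieval

rank1 is a reasoning reranker that "thinks" before making a relevance judgment. This 14-billion-parameter model is trained from the Qwen2.5-14B base model and uses test-time compute to generate a reasoning chain before deciding whether a document is relevant to a query.

📄 Paper | 🚀 GitHub repository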

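🚀 Quick start

rank1 takes a query and a document, "thinks" through a reasoning chain at test time, and then decides whether the document is relevant. The usage examples further down show how to run the Qwen2.5-14B-based model.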

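✨ Key features

A novel approach to information retrieval: rank1 generates an explicit reasoning chain before judging relevance. Unlike traditional rerankers that output a score directly, rank1 takes a query-document pair, generates a reasoning chain inside a <think>...</think> section, makes a binary relevance judgment (true or false), and returns a confidence score based on the logits of the true and false tokens. This helps the model break complex relevance decisions into logical steps and improves performance across a range of retrieval tasks.
Multiple model variants: several parameter sizes are available, including rank1-0.5b, rank1-1.5b, rank1-3b, rank1-7b, rank1-14b, rank1-32b, rank1-mistral-2501-24b, and rank1-llama3-8b, along with quantized versions of some of them.
Associated data and resources: all R1 output examples from MS MARCO, the training data, precomputed run files, and the official GitHub repository.
Strong performance: rank1-14b performs well on retrieval benchmarks, especially on tasks that require complex reasoning. Its ability to "think through" relevance decisions makes it particularly effective on nuanced topics.
MTEB integration: rank1 is compatible with the MTEB benchmark framework.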
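
The judgment-and-score step described in the first feature can be illustrated with a small sketch. It assumes a raw completion of the form `<think>...</think> true` and two log-probabilities for the verdict tokens; the helper names `parse_response` and `relevance_score` are hypothetical and are not part of the official rank1 code.

```python
import math
import re

# Hypothetical helpers for illustration only; not the official rank1 API.
_RESPONSE = re.compile(
    r"^(?P<reasoning>.*?)</think>\s*(?P<verdict>true|false)\b", re.DOTALL
)

def parse_response(raw):
    """Split a raw completion into (reasoning, verdict).

    verdict is True/False, or None when no closing tag and verdict are found.
    """
    match = _RESPONSE.search(raw)
    if match is None:
        return raw.strip(), None
    reasoning = match.group("reasoning").replace("<think>", "").strip()
    return reasoning, match.group("verdict") == "true"

def relevance_score(true_logprob, false_logprob):
    """Normalize the two verdict-token log-probabilities into P(true)."""
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(true_logprob, false_logprob)
    t = math.exp(true_logprob - m)
    f = math.exp(false_logprob - m)
    return t / (t + f)

reasoning, verdict = parse_response(
    "<think>The passage lists effects of climate change.</think> true"
)
score = relevance_score(math.log(0.9), math.log(0.1))
print(reasoning, verdict, round(score, 3))
```

Normalizing over only the two verdict tokens keeps the score in [0, 1] even when other tokens receive some probability mass.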

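📦 Installation

See the GitHub repository for detailed installation instructions.

💻 Usage examples

Basic usage

Note that the official usage, which handles edge cases, is available on GitHub. For simple use cases, the minimal example below is sufficient.

Minimal example using vLLM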

from vllm import LLM, SamplingParams
import math

# Initialize the model with vLLM
model = LLM(
    model="jhu-clsp/rank1-14b",
    tensor_parallel_size=1,  # Number of GPUs
    trust_remote_code=True,
    max_model_len=16000,     # Context length
    gpu_memory_utilization=0.9,
    dtype="float16",
)

# Set up sampling parameters
sampling_params = SamplingParams(
    temperature=0,
    max_tokens=8192,
    logprobs=20,
    stop=["</think> true", "</think> false"],
    skip_special_tokens=False
)

# Prepare the prompt
def create_prompt(query, document):
    return (
        "Determine if the following passage is relevant to the query. "
        "Answer only with 'true' or 'false'.\n"
        f"Query: {query}\n"
        f"Passage: {document}\n"
        "<think>"
    )

# Example usage
query = "What are the effects of climate change?"
document = "Climate change leads to rising sea levels, extreme weather events, and disruptions to ecosystems. These effects are caused by increasing greenhouse gas concentrations in the atmosphere due to human activities."

# Generate prediction
prompt = create_prompt(query, document)
outputs = model.generate([prompt], sampling_params)

# Extract score
output = outputs[0].outputs[0]
text = output.text
final_logits = output.logprobs[-1]

# Get token IDs for "true" and "false" tokens
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("jhu-clsp/rank1-14b")
true_token = tokenizer(" true", add_special_tokens=False).input_ids[0]
false_token = tokenizer(" false", add_special_tokens=False).input_ids[0]

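# Calculate relevance score (probability of "true").
# A token missing from the returned top-20 logprobs is treated as probability 0,
# which avoids a KeyError on the lookup.
true_entry = final_logits.get(true_token)
false_entry = final_logits.get(false_token)
true_score = math.exp(true_entry.logprob) if true_entry else 0.0
false_score = math.exp(false_entry.logprob) if false_entry else 0.0
total = true_score + false_score
relevance_score = true_score / total if total > 0 else 0.0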

print(f"Reasoning chain: {text}")
print(f"Relevance score: {relevance_score}")
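
The example above scores a single query-document pair. A minimal sketch of turning such scores into a reranked list follows. The `rerank` function and the toy `overlap_score` scorer are hypothetical stand-ins so the snippet runs without a GPU; in practice the scorer would wrap the vLLM scoring steps shown above.

```python
from typing import Callable, List, Optional, Tuple

def rerank(
    query: str,
    documents: List[str],
    score_fn: Callable[[str, str], float],
    top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Score every document against the query and sort by descending score."""
    scored = [(doc, score_fn(query, doc)) for doc in documents]
    # list.sort is stable, so ties keep their original retrieval order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if top_k is None else scored[:top_k]

def overlap_score(query: str, document: str) -> float:
    """Toy stand-in scorer (assumption): fraction of query words in the document."""
    q = set(query.lower().split())
    d = set(document.lower().split())
    return len(q & d) / len(q) if q else 0.0

docs = [
    "Sea levels rise as the climate warms.",
    "A recipe for sourdough bread.",
    "Effects of climate change include extreme weather.",
]
ranked = rerank("effects of climate change", docs, overlap_score, top_k=2)
for doc, score in ranked:
    print(f"{score:.2f}  {doc}")
```

Passing the scorer as a function keeps the sorting logic independent of how relevance is computed, so the same loop works with any of the rank1 variants.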

Advanced usage

from mteb import MTEB
from rank1 import rank1  # From the official repo

# Initialize the model
model = rank1(
    model_name_or_path="jhu-clsp/rank1-14b",
    num_gpus=1,
    device="cuda"
)

# Run evaluation on specific tasks
evaluation = MTEB(tasks=["NevIR"])
results = evaluation.run(model)

📚 Detailed documentation

Model family

Model	Base model	Description
[rank1-0.5b](https://huggingface.co/jhu-clsp/rank1-0.5b)	Qwen2.5-0.5B	Smallest variant (0.5B parameters)
[rank1-1.5b](https://huggingface.co/jhu-clsp/rank1-1.5b)	Qwen2.5-1.5B	Smaller variant (1.5B parameters)
[rank1-3b](https://huggingface.co/jhu-clsp/rank1-3b)	Qwen2.5-3B	Smaller variant (3B parameters)
[rank1-7b](https://huggingface.co/jhu-clsp/rank1-7b)	Qwen2.5-7B	Smaller variant (7B parameters)
[rank1-14b](https://huggingface.co/jhu-clsp/rank1-14b)	Qwen2.5-14B	Current model (14B parameters)
[rank1-32b](https://huggingface.co/jhu-clsp/rank1-32b)	Qwen2.5-32B	Largest variant (32B parameters)
[rank1-mistral-2501-24b](https://huggingface.co/jhu-clsp/rank1-mistral-2501-24b)	Mistral-Small 2501 24B	Trained from a Mistral base model
[rank1-llama3-8b](https://huggingface.co/jhu-clsp/rank1-llama3-8b)	Llama 3.1 8B	Trained from a Llama 3.1 base model

Quantized variants

Model	Description
[rank1-7b-awq](https://huggingface.co/jhu-clsp/rank1-7b-awq)	Quantized version of rank1-7b
[rank1-14b-awq](https://huggingface.co/jhu-clsp/rank1-14b-awq)	Quantized version of rank1-14b
[rank1-32b-awq](https://huggingface.co/jhu-clsp/rank1-32b-awq)	Quantized version of rank1-32b
[rank1-mistral-2501-24b-awq](https://huggingface.co/jhu-clsp/rank1-mistral-2501-24b-awq)	Quantized version of rank1-mistral-2501-24b
[rank1-llama3-8b-awq](https://huggingface.co/jhu-clsp/rank1-llama3-8b-awq)	Quantized version of rank1-llama3-8b

Associated data and resources

Resource	Description
[rank1-r1-msmarco](https://huggingface.co/datasets/jhu-clsp/rank1-r1-msmarco)	All R1 output examples from MS MARCO
[rank1-training-data](https://huggingface.co/datasets/jhu-clsp/rank1-training-data)	Training data used for the rank1 models
[rank1-run-files](https://huggingface.co/datasets/jhu-clsp/rank1-run-files)	Precomputed run files for top-100 document reranking
GitHub repository	Official rank1 repository

📄 License

MIT License

🔖 Citation

If you use rank1 in your research, please cite our work:

@misc{weller2025rank1testtimecomputereranking,
      title={Rank1: Test-Time Compute for Reranking in Information Retrieval}, 
      author={Orion Weller and Kathryn Ricci and Eugene Yang and Andrew Yates and Dawn Lawrie and Benjamin Van Durme},
      year={2025},
      eprint={2502.18418},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2502.18418}, 
}