🚀 SlimPLM
📝 SlimPLM is a natural language processing model for tasks such as retrieval-necessity judgment and query rewriting. It parses user input into structured formats and plays a supporting role in large language model (LLM) pipelines.
🚀 Quick Start
The following example shows how to run inference with SlimPLM:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

question = "Who voices Darth Vader in Star Wars Episodes III-VI, IX Rogue One, and Rebels?"
heuristic_answer = "The voice of Darth Vader in Star Wars is provided by British actor James Earl Jones. He first voiced the character in the 1977 film \"Star Wars: Episode IV - A New Hope\", and his performance has been used in all subsequent Star Wars films, including the prequels and sequels."

# The question and heuristic answer are interpolated directly into the prompt.
prompt = (f"<s>[INST] <<SYS>>\nYou are a helpful assistant. Your task is to parse user input into"
          f" structured formats according to the coarse answer. Current datetime is 2023-12-20 9:47:28"
          f" <</SYS>>\n Coarse answer: (({heuristic_answer}))\nQuestion: (({question})) [/INST]")

# Generation parameters. Note: `generate` accepts no `seed` argument; with
# do_sample=False decoding is greedy and deterministic anyway.
params_query_rewrite = {"repetition_penalty": 1.05, "temperature": 0.01, "top_k": 1, "top_p": 0.85,
                        "max_new_tokens": 512, "do_sample": False}

model = AutoModelForCausalLM.from_pretrained("zstanjj/SlimPLM-Retrieval-Necessity-Judgment").eval()
tokenizer = AutoTokenizer.from_pretrained("zstanjj/SlimPLM-Retrieval-Necessity-Judgment")
if torch.cuda.is_available():
    model.cuda()

# The prompt string is already fully formatted, so it is encoded as-is.
input_ids = tokenizer.encode(prompt, return_tensors="pt")
len_input_ids = len(input_ids[0])
if torch.cuda.is_available():
    input_ids = input_ids.cuda()

outputs = model.generate(input_ids, **params_query_rewrite)
# Decode only the newly generated tokens, skipping the echoed prompt.
res = tokenizer.decode(outputs[0][len_input_ids:], skip_special_tokens=True)
print(res)
```
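When processing several questions, the prompt assembly above can be factored into a small helper. The function below is a hypothetical convenience wrapper (not part of the SlimPLM release) that reproduces the template from the snippet; the default `datetime_str` simply mirrors the value used there:

```python
def build_slimplm_prompt(question: str, heuristic_answer: str,
                         datetime_str: str = "2023-12-20 9:47:28") -> str:
    """Assemble the Llama-2-style instruction prompt used in the quick-start snippet."""
    return (f"<s>[INST] <<SYS>>\nYou are a helpful assistant. Your task is to parse user input into"
            f" structured formats according to the coarse answer. Current datetime is {datetime_str}"
            f" <</SYS>>\n Coarse answer: (({heuristic_answer}))\nQuestion: (({question})) [/INST]")

# Example: build prompts for a small batch of question/heuristic-answer pairs.
questions = ["Who wrote Hamlet?", "When was the Eiffel Tower built?"]
answers = ["Hamlet was written by William Shakespeare.",
           "The Eiffel Tower was completed in 1889."]
prompts = [build_slimplm_prompt(q, a) for q, a in zip(questions, answers)]
print(len(prompts))  # 2
```

Each resulting string can then be tokenized and passed to `model.generate` exactly as in the single-question example.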
✨ Key Features
✏️ Citation
If you use this model, please cite the following paper:
```bibtex
@inproceedings{Tan2024SmallMB,
  title={Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs},
  author={Jiejun Tan and Zhicheng Dou and Yutao Zhu and Peidong Guo and Kun Fang and Ji-Rong Wen},
  year={2024},
  url={https://arxiv.org/abs/2402.12052}
}
```
📄 License
This project is released under the llama2 license.
🌹 If you use this model, please give our GitHub repository a star. Your support means a lot to us!
📝 Related Links: