SlimPLM-Query-Rewriting: an open-source query rewriting model

Developed by zstanjj

A lightweight language model for query rewriting. It parses user input into a structured format to improve retrieval quality.

Large language model

Transformers

#QueryRewriting #RetrievalAugmentedGeneration #LightweightInference

Downloads: 53

Release date: 2/19/2024

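Model overview

This model is used mainly for query rewriting. Guided by a coarse answer, it parses user input into a structured format to improve the precision and efficiency of information retrieval.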

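Model features

Lightweight design

The model has a small parameter count, which makes it suitable as a proxy model that decides when an LLM should retrieve and what it should retrieve.

Structured parsing

It parses the user input and the coarse answer into a structured format to improve the quality of the subsequent retrieval step.

Efficient inference

Inference is fast, which suits real-time query rewriting.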

Model capabilities

Text structuring

Query rewriting

Information retrieval optimization

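Use cases

Information retrieval

Complex query rewriting

Rewrites a user's complex natural-language query into a structured format suited to retrieval.

Improves the precision and recall of the retrieval system.

Retrieval necessity judgment

Determines whether a user query requires retrieval from an external knowledge base.

Reduces unnecessary retrieval cost.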
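The two use cases above can be combined into a single routing step. The following is a minimal sketch of that control flow, not the authors' implementation: every callable (`proxy_answer`, `needs_retrieval`, `rewrite_queries`, `search`, `llm_answer`) is a hypothetical stand-in supplied by the caller, and it is an assumption of this sketch that a question judged not to need retrieval is answered by the LLM with no documents.

```python
from typing import Callable, List


def answer_with_optional_retrieval(
    question: str,
    proxy_answer: Callable[[str], str],
    needs_retrieval: Callable[[str, str], bool],
    rewrite_queries: Callable[[str, str], List[str]],
    search: Callable[[str], List[str]],
    llm_answer: Callable[[str, List[str]], str],
) -> str:
    """Route a question through retrieval only when the judge asks for it.

    All callables are hypothetical placeholders, not part of any library.
    """
    # 1. A small proxy model drafts a coarse (heuristic) answer.
    coarse = proxy_answer(question)
    # 2. Skip retrieval entirely when the judge says it is unnecessary.
    if not needs_retrieval(question, coarse):
        return llm_answer(question, [])
    # 3. Otherwise rewrite the question into search queries and retrieve.
    documents: List[str] = []
    for query in rewrite_queries(question, coarse):
        documents.extend(search(query))
    # 4. The large model answers with the retrieved documents as context.
    return llm_answer(question, documents)


if __name__ == "__main__":
    result = answer_with_optional_retrieval(
        "Who voices Darth Vader?",
        proxy_answer=lambda q: "James Earl Jones",
        needs_retrieval=lambda q, a: True,
        rewrite_queries=lambda q, a: ["Darth Vader voice actor"],
        search=lambda query: [f"doc for {query}"],
        llm_answer=lambda q, docs: f"{len(docs)} document(s) used",
    )
    print(result)  # prints "1 document(s) used"
```

Passing the stages in as callables keeps the sketch independent of any particular retriever or model, so each stage can be replaced or stubbed out in tests.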

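🚀 SlimPLM

SlimPLM is a model for specific LLM-related tasks. It specializes in tasks such as query rewriting and judging whether retrieval is needed, and the related research has been presented at an academic conference.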

🚀 Quick start

If you use this model, please consider supporting us by starring our **GitHub repository**. Your star means a lot!

📝 Paper • 🤗 Hugging Face • 🧩 GitHub

💻 Usage example

Basic usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# construct prompt
question = "Who voices Darth Vader in Star Wars Episodes III-VI, IX Rogue One, and Rebels?"
heuristic_answer = "The voice of Darth Vader in Star Wars is provided by British actor James Earl Jones. He first voiced the character in the 1977 film \"Star Wars: Episode IV - A New Hope\", and his performance has been used in all subsequent Star Wars films, including the prequels and sequels."
prompt = (f"<s>[INST] <<SYS>>\nYou are a helpful assistant. Your task is to parse user input into"
          f" structured formats according to the coarse answer. Current datatime is 2023-12-20 9:47:28"
          f" <</SYS>>\n Course answer: (({heuristic_answer}))\nQuestion: (({question})) [/INST]")

# alternatively you can input question only
# prompt = (f"<s>[INST] <<SYS>>\nYou are a helpful assistant. Your task is to parse user input into"
#           f" structured formats. Current datatime is 2023-12-20 9:47:28"
#           f" <</SYS>>\n{question} [/INST]")

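# Decoding settings. With do_sample=False decoding is greedy, so the sampling-only
# values from the original example (temperature=0.01, top_k=1, top_p=0.85) have no effect.
params_query_rewrite = {"repetition_penalty": 1.05, "max_new_tokens": 512, "do_sample": False}
# "seed" is not a generate() argument, so the random state is fixed separately.
torch.manual_seed(2023)

# load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained("zstanjj/SlimPLM-Query-Rewriting").eval()
if torch.cuda.is_available():
    model.cuda()
tokenizer = AutoTokenizer.from_pretrained("zstanjj/SlimPLM-Query-Rewriting")

# run inference
# The prompt is an f-string and is already fully interpolated, so no .format() call is needed.
input_ids = tokenizer.encode(prompt, return_tensors="pt")
len_input_ids = len(input_ids[0])
if torch.cuda.is_available():
    input_ids = input_ids.cuda()
# pass the decoding settings to generate(); without them the defaults would be used
outputs = model.generate(input_ids, **params_query_rewrite)
# decode only the newly generated tokens, skipping the prompt
res = tokenizer.decode(outputs[0][len_input_ids:], skip_special_tokens=True)
print(res)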
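
The example above builds the prompt inline in two variants, with and without a coarse answer. A small helper can make that choice explicit. This is a sketch only: `build_prompt` and its `timestamp` parameter are hypothetical names that are not part of the model's repository. The wording of the template, including the spellings "datatime" and "Course answer", is copied verbatim from the example above on the assumption that the model expects exactly that text.

```python
from typing import Optional


def build_prompt(
    question: str,
    heuristic_answer: Optional[str] = None,
    timestamp: str = "2023-12-20 9:47:28",
) -> str:
    """Build the query-rewriting prompt (hypothetical helper).

    The template text is copied from the usage example, spelling included.
    """
    if heuristic_answer is not None:
        # Variant 1: question plus a coarse answer from a proxy model.
        return (
            "<s>[INST] <<SYS>>\nYou are a helpful assistant. Your task is to parse user input into"
            " structured formats according to the coarse answer."
            f" Current datatime is {timestamp}"
            f" <</SYS>>\n Course answer: (({heuristic_answer}))\nQuestion: (({question})) [/INST]"
        )
    # Variant 2: question only.
    return (
        "<s>[INST] <<SYS>>\nYou are a helpful assistant. Your task is to parse user input into"
        f" structured formats. Current datatime is {timestamp}"
        f" <</SYS>>\n{question} [/INST]"
    )
```

Calling `build_prompt(question, heuristic_answer)` reproduces the first prompt in the example, and `build_prompt(question)` reproduces the commented-out, question-only variant.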

📚 Documentation

Citation

@inproceedings{Tan2024SmallMB,
  title={Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs},
  author={Jiejun Tan and Zhicheng Dou and Yutao Zhu and Peidong Guo and Kun Fang and Ji-Rong Wen},
  year={2024},
  url={https://arxiv.org/abs/2402.12052}
}