SlimPLM: an open-source retrieval necessity judgment model that helps large models decide when and what to retrieve

SlimPLM Retrieval Necessity Judgment

Developed by zstanjj

SlimPLM is a lightweight proxy model that decides when a large language model (LLM) should retrieve and what it should retrieve.

Large language model

Transformers

#RetrievalNecessityJudgment #QueryRewriting #LightweightProxyModel

Downloads: 26

Release date: January 25, 2024

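Model overview

This model is used mainly for retrieval necessity judgment: it helps determine when a large language model needs to retrieve external information.

Model features

Lightweight design

As a lightweight proxy model, it has low compute requirements.

Retrieval decision-making

It judges when an LLM needs to retrieve information.

Chinese optimization

It is optimized in particular for Chinese-language scenarios.

Model capabilities

Retrieval necessity judgment

Query analysis

Structured parsing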

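Use cases

Information retrieval systems

Retrieval decision support

In a Q&A system, decide whether external knowledge needs to be retrieved.

This improves system efficiency and cuts unnecessary retrieval costs.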
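The gating idea in this use case can be sketched in a few lines: the retriever is called only when a judge says external knowledge is needed. This is a minimal illustration, not the authors' pipeline. The function `answer_with_optional_retrieval` and the toy judge, retriever, and generator are hypothetical stand-ins; in practice the judge would wrap the SlimPLM judgment model, whose output format is not described here.

```python
from typing import Callable, List


def answer_with_optional_retrieval(
    question: str,
    needs_retrieval: Callable[[str], bool],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
) -> str:
    """Call the retriever only when the judge says external knowledge is needed."""
    passages: List[str] = []
    if needs_retrieval(question):
        passages = retrieve(question)
    return generate(question, passages)


# Stand-in components (all hypothetical) so the sketch runs without a model
# or a search backend.
retrieval_calls: List[str] = []


def toy_judge(question: str) -> bool:
    # Made-up rule: pretend only questions mentioning "latest" need retrieval.
    return "latest" in question.lower()


def toy_retrieve(query: str) -> List[str]:
    retrieval_calls.append(query)  # record each call so skipped retrievals are visible
    return [f"passage about: {query}"]


def toy_generate(question: str, passages: List[str]) -> str:
    return f"answer to '{question}' using {len(passages)} passage(s)"


print(answer_with_optional_retrieval("What is 2 + 2?", toy_judge, toy_retrieve, toy_generate))
print(answer_with_optional_retrieval("What is the latest ACL venue?", toy_judge, toy_retrieve, toy_generate))
print(f"retriever was called {len(retrieval_calls)} time(s)")
```

Passing the judge, retriever, and generator as callables keeps the gating logic independent of any particular model or search backend.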

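🚀 SlimPLM

SlimPLM uses a slim proxy model to decide when and what to retrieve for LLMs. It is available to researchers and developers through the paper, Hugging Face, and GitHub.

🚀 Quick start

To use the model, follow the example below.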

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Build the prompt. The template wording is kept verbatim (including its
# spelling), since the model may expect exactly this text.
question = "Who voices Darth Vader in Star Wars Episodes III-VI, IX Rogue One, and Rebels?"
heuristic_answer = "The voice of Darth Vader in Star Wars is provided by British actor James Earl Jones. He first voiced the character in the 1977 film \"Star Wars: Episode IV - A New Hope\", and his performance has been used in all subsequent Star Wars films, including the prequels and sequels."
prompt = (f"<s>[INST] <<SYS>>\nYou are a helpful assistant. Your task is to parse user input into"
          f" structured formats according to the coarse answer. Current datatime is 2023-12-20 9:47:28"
          f" <</SYS>>\n Course answer: (({heuristic_answer}))\nQuestion: (({question})) [/INST]")

# Generation settings. do_sample=False selects greedy decoding, so the sampling
# settings (temperature, top_k, top_p) and the seed are omitted: they would have
# no effect, and "seed" is not a generate() argument.
params_query_rewrite = {"repetition_penalty": 1.05, "max_new_tokens": 512, "do_sample": False}

# Load the model and tokenizer
model_name = "zstanjj/SlimPLM-Retrieval-Necessity-Judgment"
model = AutoModelForCausalLM.from_pretrained(model_name).eval()
if torch.cuda.is_available():
    model.cuda()
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Run inference. The prompt is already fully formatted by the f-string above.
input_ids = tokenizer.encode(prompt, return_tensors="pt")
len_input_ids = len(input_ids[0])
if torch.cuda.is_available():
    input_ids = input_ids.cuda()
with torch.no_grad():
    outputs = model.generate(input_ids, **params_query_rewrite)
# Decode only the newly generated tokens, skipping the prompt
res = tokenizer.decode(outputs[0][len_input_ids:], skip_special_tokens=True)
print(res)
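When judging many questions, it can help to wrap the prompt template in a small function. The sketch below reuses the template text from the quick-start code verbatim (including its spelling, which the model may depend on). The helper name `build_prompt` and its `current_time` parameter are assumptions made for illustration, and the example question and answer are made up.

```python
def build_prompt(question: str, heuristic_answer: str, current_time: str) -> str:
    """Fill the judgment prompt template with a question and a coarse answer."""
    return (
        "<s>[INST] <<SYS>>\nYou are a helpful assistant. Your task is to parse user input into"
        " structured formats according to the coarse answer."
        f" Current datatime is {current_time}"
        f" <</SYS>>\n Course answer: (({heuristic_answer}))\nQuestion: (({question})) [/INST]"
    )


# Hypothetical inputs; the timestamp string matches the one in the quick start.
example = build_prompt(
    question="Who wrote Hamlet?",
    heuristic_answer="Hamlet was written by William Shakespeare.",
    current_time="2023-12-20 9:47:28",
)
print(example)
```

The returned string can be passed to `tokenizer.encode` in place of the hard-coded `prompt` from the quick start.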

✨ Key features

Latest news:
- [January 25, 2024]: The Retrieval Necessity Judgment Model was released on Hugging Face.
- [February 20, 2024]: The Query Rewriting Model was released on Hugging Face.
- [May 19, 2024]: The paper "Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs" was accepted to the ACL 2024 main conference.

📚 Documentation

📝 Paper
🤗 Hugging Face
🧩 GitHub

⚠️ Important note

If you use this model, please star the **GitHub repository** to support it. Your star means a lot!

✏️ Citation

@inproceedings{Tan2024SmallMB,
  title={Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs},
  author={Jiejun Tan and Zhicheng Dou and Yutao Zhu and Peidong Guo and Kun Fang and Ji-Rong Wen},
  year={2024},
  url={https://arxiv.org/abs/2402.12052}
}