open-instruct-stanford-alpaca-7b: an open-source model fine-tuned on the Alpaca dataset for instruction tuning

Open Instruct Stanford Alpaca 7b

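Developed by allenai

A 7B-parameter LLaMa model fine-tuned on the Stanford Alpaca dataset, released as part of a study of instruction tuning on open resources

Large language model

Transformers

English #instruction-tuning #open-resource-LLM #multi-task-evaluation

Downloads: 220

Released: 6/7/2023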

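Model overview

This is a large language model fine-tuned from the LLaMa architecture and optimized for instruction following: it is trained to interpret and carry out instructions written in natural language.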

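Model features

Open-resource instruction tuning

Fine-tuned on the Stanford Alpaca dataset, with a focus on instruction tuning using openly available resources

Compact parameter count

At 7B parameters, the model is cheaper to run at inference time than larger LLaMa variants; benchmark results are listed in the Performance section

Structured input format

Expects a specific input format built around the <|user|> and <|assistant|> markers; inputs should follow it for best results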

Model capabilities

Natural language understanding

Instruction following

Text generation

Question answering

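Use cases

Education

Intelligent teaching assistant

Serves as an educational aid that answers students' questions

Research

Language model research

Supports research on instruction tuning with open resources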

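🚀 Open Instruct Stanford Alpaca 7B

This model is a 7B-parameter LLaMa model fine-tuned on the Stanford Alpaca dataset. Note that it is distributed as a model diff; see the usage instructions below.

The model was trained as part of the paper "How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources". The codebase used to train and evaluate it is available at https://github.com/allenai/open-instruct.

This model is licensed under the AI model license given in LICENSE.txt, together with the original LLaMa license (llama_license.txt).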

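🚀 Quick start

📦 Installation

These steps assume you already have access to a LLaMa model in HF format. Details on obtaining access and converting the model are linked from the original model page.

Clone the https://github.com/allenai/open-instruct repository and install the required dependencies. Alternatively, copy only scripts/weight_diff.py and install the minimal dependencies listed in weight-diff-requirements.txt. Then download or clone this model diff onto the same machine.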

💻 Usage example

Basic usage

Run the following command to recover the model:

python scripts/weight_diff.py recover --path_raw ${hf_llama_path} --path_tuned ${output_path} --path_diff ${diff_location}

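After the command finishes, the recovered model is written to ${output_path}. Note that recovery uses a considerable amount of memory, especially for larger models.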
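To make the recovery step more concrete, here is a minimal sketch of the idea behind a weight diff. It assumes the diff stores the element-wise difference between the tuned and raw weights, so that adding it back to the raw weights restores the tuned model. The names `make_diff` and `recover` are hypothetical, and plain Python lists stand in for tensors; this is not the code in scripts/weight_diff.py, which may differ in its details.

```python
# Illustrative only: plain lists stand in for weight tensors.
# Assumption: diff = tuned - raw, so tuned = raw + diff.

def make_diff(raw, tuned):
    """Return per-parameter differences (tuned - raw)."""
    return {
        name: [t - r for r, t in zip(raw[name], tuned[name])]
        for name in raw
    }

def recover(raw, diff):
    """Rebuild the tuned weights by adding the diff onto the raw weights."""
    if raw.keys() != diff.keys():
        raise KeyError("raw and diff must contain the same parameter names")
    return {
        name: [r + d for r, d in zip(raw[name], diff[name])]
        for name in raw
    }

# Toy example with a single three-element "parameter".
raw = {"layer.weight": [1.0, 2.0, 3.0]}
tuned = {"layer.weight": [1.5, 1.0, 3.0]}

diff = make_diff(raw, tuned)
recovered = recover(raw, diff)
```

Under this assumption, both the raw weights and the diff must be held in memory at the same time during recovery, which is consistent with the memory warning above.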

📚 Documentation

Input format

The model was trained with the following format (note the newlines):

<|user|>
Your message here!
<|assistant|>

For best results, format all inputs this way.
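The format above can be applied with a small helper. The sketch below is an illustration, not code from the open-instruct repository: `format_prompt` and `generate_reply` are hypothetical names, the trailing newline after `<|assistant|>` is an assumption, and `generate_reply` assumes the recovered model directory can be loaded with the Hugging Face `transformers` library (with PyTorch installed).

```python
def format_prompt(message: str) -> str:
    """Wrap a user message in the <|user|> / <|assistant|> format.

    Assumption: a newline follows <|assistant|>, so the model's reply
    starts on its own line.
    """
    return f"<|user|>\n{message}\n<|assistant|>\n"


def generate_reply(model_dir: str, message: str, max_new_tokens: int = 128) -> str:
    """Hypothetical helper: load the recovered model and generate a reply.

    transformers is imported lazily so that format_prompt can be used
    without it being installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(model_dir)

    inputs = tokenizer(format_prompt(message), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # The decoded text contains the prompt followed by the model's reply.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


prompt = format_prompt("Your message here!")
```

Here `model_dir` would be the directory passed as ${output_path} to the recovery command.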

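Performance

The table below shows this model's results on the benchmarks explored in our paper "How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources":

MMLU 0-shot	MMLU 5-shot	GSM Direct	GSM CoT	BBH Direct	BBH CoT	TydiQA Gold-Passage	TydiQA Closed-book	Codex-Eval Pass@1	Codex-Eval Pass@10	AlpacaFarm vs Davinci-003	Average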
41.5	40.3	7.0	10.0	32.6	31.8	31.2	7.2	13.2	22.0	21.1	23.3

📄 Citation

If you use this model, please cite our paper, the LLaMa paper, and the original dataset:

@misc{wang2023far,
      title={How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources}, 
      author={Yizhong Wang and Hamish Ivison and Pradeep Dasigi and Jack Hessel and Tushar Khot and Khyathi Raghavi Chandu and David Wadden and Kelsey MacMillan and Noah A. Smith and Iz Beltagy and Hannaneh Hajishirzi},
      year={2023},
      eprint={2306.04751},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@misc{touvron2023llama,
      title={LLaMA: Open and Efficient Foundation Language Models}, 
      author={Hugo Touvron and Thibaut Lavril and Gautier Izacard and Xavier Martinet and Marie-Anne Lachaux and Timothée Lacroix and Baptiste Rozière and Naman Goyal and Eric Hambro and Faisal Azhar and Aurelien Rodriguez and Armand Joulin and Edouard Grave and Guillaume Lample},
      year={2023},
      eprint={2302.13971},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto },
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}