Open-source Tulu 65B model: fine-tuned on multiple instruction datasets, with strong overall performance

Tulu 65b

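Developed by allenai

Tulu 65B is a 65B-parameter LLaMa model fine-tuned on a mixture of instruction datasets. It is a product of research on instruction tuning with open resources and delivers strong overall performance.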

Large Language Model

Transformers

English #multi-instruction fine-tuning #65B parameters #best overall performance

Release date: 6/7/2023

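Model Overview

The model is fine-tuned on multiple instruction datasets, including FLAN V2, CoT, and Dolly. It is suited to a range of natural language processing tasks, with a particular emphasis on instruction following.

Model Features

Fine-tuned on multiple instruction datasets

Trained on a mixture of 7 high-quality instruction datasets, including FLAN V2, CoT, and Dolly

Strict input format

Uses a specific dialogue format (<|user|>/<|assistant|> markers) for best generation quality

Strong overall performance

Performs well across benchmarks including MMLU, GSM, and BBH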

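Model Capabilities

Instruction understanding and execution

Multi-turn dialogue generation

Complex question answering

Code generation and explanation

Knowledge reasoning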

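Use Cases

Intelligent assistants

Task-oriented dialogue systems

Handles complex multi-turn instruction dialogues

Achieves a 62.7% win rate against Davinci-003 in the AlpacaFarm evaluation

Education and research

Open-domain question answering

Answers a wide range of knowledge questions

Reaches 61.1% 5-shot accuracy on the MMLU benchmark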

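🚀 Tulu 65B

Tulu 65B is a 65B-parameter LLaMa model fine-tuned on a mixture of instruction datasets (FLAN V2, CoT, Dolly, Open Assistant 1, GPT4-Alpaca, Code-Alpaca, and ShareGPT). Please note that this is a model diff; see below for usage instructions.

This model was trained as part of the paper How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources. The codebase used to train and evaluate this model can be found at [https://github.com/allenai/open-instruct](https://github.com/allenai/open-instruct).

This is the strongest overall model trained as part of this project!

This model is licensed under the AI model license given in LICENSE.txt along with the original Llama license (llama_license.txt). Both licenses can be found in [our codebase](https://github.com/allenai/open-instruct/tree/main/model_licenses): see tulu_license.txt for the model license and llama_license.txt for the Llama license.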

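🚀 Quick Start

Accessing the Model

To access these models, please fill out this form. We will review it and let you know whether your use case is approved. The information you provide below will be used solely to assess eligibility to access these models.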

Datasets

Attribute	Details
Datasets	databricks/databricks-dolly-15k, OpenAssistant/oasst1, sahil2801/CodeAlpaca-20k
Language	English

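Extra Fields

Field	Type
First name	Text
Last name	Text
Institution	Text
Country where you are located	Text
Intended use	Text
Previous related publications	Text
I agree to abide by the license terms associated with this artifact, including domain and use restrictions	Checkbox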

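📦 Installation Guide

We assume you already have access to a LLaMa model in HF format. Details on getting access and converting the model are available at https://huggingface.co/docs/transformers/main/model_doc/llama.

Clone [https://github.com/allenai/open-instruct](https://github.com/allenai/open-instruct) and install the required dependencies, or just copy scripts/weight_diff.py and install the minimal requirements listed in weight-diff-requirements.txt. Then download or clone this model diff to the same machine.

Then run: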

python scripts/weight_diff.py recover --path_raw ${hf_llama_path} --path_tuned ${output_path} --path_diff ${diff_location}

You will then have a recovered model! Note that this takes up a considerable amount of RAM, especially for the larger models.
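Conceptually, a model diff stores the difference between the fine-tuned weights and the base weights, and recovery adds it back to the base model. The sketch below illustrates that idea on plain Python lists. It assumes an additive, element-wise diff and is not the implementation in scripts/weight_diff.py; the names `recover_weights`, `raw`, and `diff` are hypothetical.

```python
from typing import Dict, List

# A toy stand-in for a checkpoint: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def recover_weights(raw: Weights, diff: Weights) -> Weights:
    """Return tuned weights, ASSUMING tuned = raw + diff for every parameter."""
    if raw.keys() != diff.keys():
        raise ValueError("raw and diff must contain the same parameter names")
    tuned: Weights = {}
    for name, raw_values in raw.items():
        diff_values = diff[name]
        if len(raw_values) != len(diff_values):
            raise ValueError(f"size mismatch for parameter {name!r}")
        # Element-wise addition restores the fine-tuned value.
        tuned[name] = [r + d for r, d in zip(raw_values, diff_values)]
    return tuned


# Toy example with a single three-element parameter.
raw = {"layer.weight": [1.0, 2.0, 3.0]}
diff = {"layer.weight": [0.5, -1.0, 0.0]}
print(recover_weights(raw, diff))
```

Under this reading, both the base weights and the diff have to be held at once, which is consistent with the memory note above.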

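💻 Usage Example

Basic Usage

The model is trained to use the following format (note the newlines):

<|user|>
Your message here!
<|assistant|>

For best results, format all inputs in this manner. Make sure to include a newline after <|assistant|>, as this can affect generation quality quite a bit.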
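The format above can be produced with a small helper. This is a minimal sketch for a single-turn prompt; the function name `format_prompt` is hypothetical, and stripping surrounding whitespace from the message is an assumption made here so that the newline placement stays exact.

```python
USER_TAG = "<|user|>"
ASSISTANT_TAG = "<|assistant|>"


def format_prompt(message: str) -> str:
    """Wrap one user message in the <|user|>/<|assistant|> format.

    The returned string ends with a newline after <|assistant|>,
    as the format description requires.
    """
    # Assumption: strip stray whitespace so each tag sits on its own line.
    return f"{USER_TAG}\n{message.strip()}\n{ASSISTANT_TAG}\n"


prompt = format_prompt("Your message here!")
print(repr(prompt))
# prints '<|user|>\nYour message here!\n<|assistant|>\n'
```

The resulting string is what would be passed to the tokenizer of the recovered model.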

📚 Detailed Documentation

Performance

Here is the performance of this model across the benchmarks explored in the paper How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources:

MMLU 0-shot	MMLU 5-shot	GSM Direct	GSM CoT	BBH Direct	BBH CoT	TydiQA Gold-Passage	TydiQA Closed-book	Codex-Eval Pass@1	Codex-Eval Pass@10	AlpacaFarm vs Davinci-003	Average
59.2	61.1	9.0	60.0	48.1	53.5	51.8	13.3	28.9	45.9	62.7	46.3
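The source does not say how the Average column is derived. As a sanity check, one reading that matches the reported 46.3 to within rounding is to average the two settings of each benchmark first and then average the six per-benchmark means, with AlpacaFarm counted once. This grouping is an assumption, and the variable names are illustrative.

```python
from statistics import mean

# Scores copied from the table above, grouped by benchmark (grouping is an assumption).
scores = {
    "MMLU": [59.2, 61.1],       # 0-shot, 5-shot
    "GSM": [9.0, 60.0],         # Direct, CoT
    "BBH": [48.1, 53.5],        # Direct, CoT
    "TydiQA": [51.8, 13.3],     # Gold-Passage, Closed-book
    "Codex-Eval": [28.9, 45.9], # Pass@1, Pass@10
    "AlpacaFarm": [62.7],       # vs Davinci-003
}

# Mean of per-benchmark means (each benchmark weighted equally).
per_benchmark = {name: mean(values) for name, values in scores.items()}
macro_average = mean(per_benchmark.values())

# Plain mean over all 11 columns, for comparison.
flat_average = mean(v for values in scores.values() for v in values)

print(f"mean of per-benchmark means: {macro_average:.2f}")  # about 46.35
print(f"mean of all 11 columns:      {flat_average:.2f}")   # about 44.86
```

The plain mean over all 11 columns comes out noticeably lower than 46.3, so the per-benchmark grouping is the more plausible reading of the table.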

Citation

If you use this model, please cite our paper, the Llama paper, and the original datasets:

@misc{wang2023far,
      title={How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources}, 
      author={Yizhong Wang and Hamish Ivison and Pradeep Dasigi and Jack Hessel and Tushar Khot and Khyathi Raghavi Chandu and David Wadden and Kelsey MacMillan and Noah A. Smith and Iz Beltagy and Hannaneh Hajishirzi},
      year={2023},
      eprint={2306.04751},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@misc{touvron2023llama,
      title={LLaMA: Open and Efficient Foundation Language Models}, 
      author={Hugo Touvron and Thibaut Lavril and Gautier Izacard and Xavier Martinet and Marie-Anne Lachaux and Timothée Lacroix and Baptiste Rozière and Naman Goyal and Eric Hambro and Faisal Azhar and Aurelien Rodriguez and Armand Joulin and Edouard Grave and Guillaume Lample},
      year={2023},
      eprint={2302.13971},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@misc{dolly,
  author = {Databricks},
  title = {Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {Blog post},
  url = {https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm}
}

@article{longpre2023flan,
  title={The Flan Collection: Designing Data and Methods for Effective Instruction Tuning},
  author={Longpre, Shayne and Hou, Le and Vu, Tu and Webson, Albert and Chung, Hyung Won and Tay, Yi and Zhou, Denny and Le, Quoc V and Zoph, Barret and Wei, Jason and others},
  journal={arXiv preprint arXiv:2301.13688},
  year={2023}
}

@misc{köpf2023openassistant,
      title={OpenAssistant Conversations -- Democratizing Large Language Model Alignment}, 
      author={Andreas Köpf and Yannic Kilcher and Dimitri von Rütte and Sotiris Anagnostidis and Zhi-Rui Tam and Keith Stevens and Abdullah Barhoum and Nguyen Minh Duc and Oliver Stanley and Richárd Nagyfi and Shahul ES and Sameer Suri and David Glushkov and Arnav Dantuluri and Andrew Maguire and Christoph Schuhmann and Huu Nguyen and Alexander Mattick},
      year={2023},
      eprint={2304.07327},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@article{peng2023instruction,
  title={Instruction Tuning with GPT-4},
  author={Peng, Baolin and Li, Chunyuan and He, Pengcheng and Galley, Michel and Gao, Jianfeng},
  journal={arXiv preprint arXiv:2304.03277},
  year={2023}
}

@misc{codealpaca,
  author = {Sahil Chaudhary},
  title = {Code Alpaca: An Instruction-following LLaMA model for code generation},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/sahil280114/codealpaca}},
}