🚀 Tulu 65B
Tulu 65B is a 65B-parameter LLaMa model fine-tuned on a mixture of instruction datasets (FLAN V2, CoT, Dolly, Open Assistant 1, GPT4-Alpaca, Code-Alpaca, and ShareGPT). Please note that this is a model diff; see the instructions below for how to use it.
This model was trained as part of the paper How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources. The codebase used to train and evaluate this model can be found at [https://github.com/allenai/open-instruct](https://github.com/allenai/open-instruct).
This is the strongest overall model trained in this project!
This model is licensed under the AI model license given in `LICENSE.txt`, along with the original Llama license (`llama_license.txt`). Both licenses are available in [our codebase](https://github.com/allenai/open-instruct/tree/main/model_licenses); the model license is in `tulu_license.txt` and the Llama license is in `llama_license.txt`.
🚀 Quick Start
Accessing the model
To access these models, please fill out this form; we will review it and let you know whether your use case is approved. The information you provide below is used solely to assess eligibility to access these models.
Datasets
| Property | Details |
|---|---|
| Datasets | databricks/databricks-dolly-15k, OpenAssistant/oasst1, sahil2801/CodeAlpaca-20k |
| Language | English |
Additional fields
| Field | Type |
|---|---|
| First name | text |
| Last name | text |
| Institution | text |
| Country | text |
| Intended use | text |
| Previous related publications | text |
| I agree to abide by the terms of the license associated with this artifact, including domain and use restrictions | checkbox |
📦 Installation
We assume you already have access to a LLaMa model in HF format. Details on obtaining access and converting the model can be found at https://huggingface.co/docs/transformers/main/model_doc/llama.
Clone [https://github.com/allenai/open-instruct](https://github.com/allenai/open-instruct) and install the required dependencies, or just copy `scripts/weight_diff.py` and install the minimal requirements listed in `weight-diff-requirements.txt`. Then download or clone this model diff to the same machine.
Then, run:
```bash
python scripts/weight_diff.py recover --path_raw ${hf_llama_path} --path_tuned ${output_path} --path_diff ${diff_location}
```
And you will have a recovered model! Note that this takes up quite a bit of memory, especially for larger models.
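Once recovery finishes, `${output_path}` holds a standard Hugging Face checkpoint. Below is a minimal loading sketch (not from the original repository); the path `./tulu-65b-recovered` is a hypothetical example of your own output path:

```python
# Minimal loading sketch; assumes the diff has already been recovered to
# ./tulu-65b-recovered (a hypothetical path) via scripts/weight_diff.py.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "./tulu-65b-recovered"  # replace with your own ${output_path}

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # 65B weights in fp16 still require substantial GPU memory
    device_map="auto",          # requires the `accelerate` package to shard across devices
)
```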
💻 Usage Examples
Basic usage
The model was trained using the following format (note the newlines):
```
<|user|>
Your message here!
<|assistant|>
```
For best results, format all inputs in this way. Make sure to include a newline after `<|assistant|>`, as it can affect generation quality quite a bit.
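As a rough illustration of this format, here is a hedged sketch (not from the original repository) that wraps a user message in the template above and generates a reply with the model and tokenizer loaded earlier; the generation settings are illustrative only:

```python
# Sketch of the <|user|> / <|assistant|> prompt format described above.
# The trailing newline after <|assistant|> is intentional and matters for quality.
def build_prompt(user_message: str) -> str:
    return f"<|user|>\n{user_message}\n<|assistant|>\n"

prompt = build_prompt("Write a haiku about instruction tuning.")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Illustrative generation settings; tune them for your own use case.
output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
reply = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(reply)
```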
📚 Documentation
Performance
Here is the performance of this model across the benchmarks explored in the paper How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources:
| MMLU 0-shot | MMLU 5-shot | GSM Direct | GSM CoT | BBH Direct | BBH CoT | TydiQA Gold-Passage | TydiQA Closed-book | Codex-Eval Pass@1 | Codex-Eval Pass@10 | AlpacaFarm vs Davinci-003 | Average |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 59.2 | 61.1 | 9.0 | 60.0 | 48.1 | 53.5 | 51.8 | 13.3 | 28.9 | 45.9 | 62.7 | 46.3 |
Citation
If you use this model, please cite our work, the Llama paper, and the original datasets:
@misc{wang2023far,
title={How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources},
author={Yizhong Wang and Hamish Ivison and Pradeep Dasigi and Jack Hessel and Tushar Khot and Khyathi Raghavi Chandu and David Wadden and Kelsey MacMillan and Noah A. Smith and Iz Beltagy and Hannaneh Hajishirzi},
year={2023},
eprint={2306.04751},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
@misc{touvron2023llama,
title={LLaMA: Open and Efficient Foundation Language Models},
author={Hugo Touvron and Thibaut Lavril and Gautier Izacard and Xavier Martinet and Marie-Anne Lachaux and Timothée Lacroix and Baptiste Rozière and Naman Goyal and Eric Hambro and Faisal Azhar and Aurelien Rodriguez and Armand Joulin and Edouard Grave and Guillaume Lample},
year={2023},
eprint={2302.13971},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
@misc{dolly,
author = {Databricks},
title = {Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {Blog post},
url = {https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm}
}
@article{longpre2023flan,
title={The Flan Collection: Designing Data and Methods for Effective Instruction Tuning},
author={Longpre, Shayne and Hou, Le and Vu, Tu and Webson, Albert and Chung, Hyung Won and Tay, Yi and Zhou, Denny and Le, Quoc V and Zoph, Barret and Wei, Jason and others},
journal={arXiv preprint arXiv:2301.13688},
year={2023}
}
@misc{köpf2023openassistant,
title={OpenAssistant Conversations -- Democratizing Large Language Model Alignment},
author={Andreas Köpf and Yannic Kilcher and Dimitri von Rütte and Sotiris Anagnostidis and Zhi-Rui Tam and Keith Stevens and Abdullah Barhoum and Nguyen Minh Duc and Oliver Stanley and Richárd Nagyfi and Shahul ES and Sameer Suri and David Glushkov and Arnav Dantuluri and Andrew Maguire and Christoph Schuhmann and Huu Nguyen and Alexander Mattick},
year={2023},
eprint={2304.07327},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
@article{peng2023instruction,
title={Instruction Tuning with GPT-4},
author={Peng, Baolin and Li, Chunyuan and He, Pengcheng and Galley, Michel and Gao, Jianfeng},
journal={arXiv preprint arXiv:2304.03277},
year={2023}
}
@misc{codealpaca,
author = {Sahil Chaudhary},
title = {Code Alpaca: An Instruction-following LLaMA model for code generation},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/sahil280114/codealpaca}},
}
📄 License
This model is licensed under the AI model license given in `LICENSE.txt`, along with the original Llama license (`llama_license.txt`). Both licenses are available in [our codebase](https://github.com/allenai/open-instruct/tree/main/model_licenses); the model license is in `tulu_license.txt` and the Llama license is in `llama_license.txt`.



