open-instruct-stanford-alpaca-7b open-source model: fine-tuned on the Alpaca dataset for instruction tuning

Open Instruct Stanford Alpaca 7b

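Developed by allenai

A 7B-parameter LLaMa model fine-tuned on the Stanford Alpaca dataset, focused on instruction tuning with open resources

Large language model

Transformers

English #instruction-tuning #open-resource-LLM #multi-task-evaluation

Downloads: 220

Released: 6/7/2023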

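Model Overview

This is a large language model fine-tuned from the LLaMa architecture and optimized for instruction following: it is trained to understand and carry out instructions given in natural language.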

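Model Features

Open-resource instruction tuning

Fine-tuned on the Stanford Alpaca dataset, with a focus on instruction tuning using openly available resources

Efficient parameter scale

A 7B-parameter model keeps inference costs lower than larger LLaMa variants while remaining usable for instruction-following tasks

Structured input format

Expects a specific structured input format (the <|user|> and <|assistant|> markers) for best results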

Model Capabilities

Natural language understanding

Instruction following

Text generation

Question answering

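Use Cases

Education

Intelligent teaching assistant

Serves as an educational aid that answers students' questions

Research

Language model research

Supports research on instruction tuning with open resources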

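🚀 Open Instruct Stanford Alpaca 7B

This model is a 7B-parameter LLaMa model fine-tuned on the Stanford Alpaca dataset. Please note that this is a model diff; see below for usage instructions.

The model was trained as part of the paper "How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources". The codebase used to train and evaluate it is available at https://github.com/allenai/open-instruct.

This model is licensed under the AI model license given in LICENSE.txt, along with the original LLaMa license (llama_license.txt).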

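🚀 Quick Start

📦 Installation

We assume you already have access to a LLaMa model in HF format. You can find details on getting access and converting the model here.

Clone https://github.com/allenai/open-instruct and install the required dependencies, or copy only scripts/weight_diff.py and install the minimal requirements listed in weight-diff-requirements.txt. Then download or clone this model diff to the same machine.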

💻 Usage Examples

Basic usage

Run the following command to recover the model:

python scripts/weight_diff.py recover --path_raw ${hf_llama_path} --path_tuned ${output_path} --path_diff ${diff_location}
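The same invocation can be scripted from Python when recovery is part of a larger pipeline. The sketch below only assembles and runs the command shown above; the helper names build_recover_command and run_recover are hypothetical, and it assumes the working directory is the root of the open-instruct checkout so that scripts/weight_diff.py resolves.

```python
import shlex
import subprocess
from typing import List


def build_recover_command(hf_llama_path: str, output_path: str, diff_location: str) -> List[str]:
    """Assemble the argument list for the recover step (hypothetical helper)."""
    return [
        "python",
        "scripts/weight_diff.py",
        "recover",
        "--path_raw", hf_llama_path,     # HF-format LLaMa weights
        "--path_tuned", output_path,     # where the recovered model is written
        "--path_diff", diff_location,    # this model diff
    ]


def run_recover(hf_llama_path: str, output_path: str, diff_location: str) -> None:
    """Run the recover step and raise if the script exits with an error."""
    cmd = build_recover_command(hf_llama_path, output_path, diff_location)
    print("Running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when any of the paths contain spaces.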

After running the command above, you will have a recovered model. Note that recovery uses a large amount of memory, especially for larger models.
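To give an intuition for what a model diff is, here is a toy sketch under one loudly stated assumption: that the diff stores, for every parameter, the element-wise difference between the tuned and the raw weights, so recovery adds the diff back onto the raw weights. This is not the code in scripts/weight_diff.py; the names make_diff and recover below are illustrative, and plain Python lists stand in for tensors.

```python
from typing import Dict, List

# Toy stand-in for a state dict: parameter name -> flat list of weights.
StateDict = Dict[str, List[float]]


def make_diff(tuned: StateDict, raw: StateDict) -> StateDict:
    """Assumed diff construction: tuned minus raw, parameter by parameter."""
    return {
        name: [t - r for t, r in zip(tuned[name], raw[name])]
        for name in tuned
    }


def recover(raw: StateDict, diff: StateDict) -> StateDict:
    """Assumed recovery: raw plus diff, parameter by parameter."""
    if raw.keys() != diff.keys():
        raise ValueError("raw weights and diff must cover the same parameters")
    return {
        name: [r + d for r, d in zip(raw[name], diff[name])]
        for name in diff
    }


raw_weights = {"w": [1.0, 2.0], "b": [0.5]}
tuned_weights = {"w": [1.5, 1.0], "b": [0.25]}
diff_weights = make_diff(tuned_weights, raw_weights)
recovered = recover(raw_weights, diff_weights)
```

Under this assumption, both the raw weights and the diff have to be held in memory at the same time during recovery, which is consistent with the memory warning above.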

📚 Documentation

Input format

The model was trained with the following format (note the newlines):

<|user|>
Your message here!
<|assistant|>

For best results, format all inputs in this manner.
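A minimal sketch of building this prompt in Python follows. The helper names format_prompt and generate_reply are hypothetical, and the trailing newline after <|assistant|> is an assumption based on the blank line in the format above. generate_reply assumes the recovered model directory can be loaded with the Hugging Face transformers Auto classes; it is not taken from the model card.

```python
USER_TAG = "<|user|>"
ASSISTANT_TAG = "<|assistant|>"


def format_prompt(message: str) -> str:
    """Wrap a user message in the <|user|> / <|assistant|> format.

    The newline after the assistant tag is an assumption (see lead-in).
    """
    return f"{USER_TAG}\n{message.strip()}\n{ASSISTANT_TAG}\n"


def generate_reply(model_path: str, message: str, max_new_tokens: int = 256) -> str:
    """Generate a reply from a recovered model directory (assumed loadable)."""
    # Imported lazily so format_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(model_path)
    inputs = tokenizer(format_prompt(message), return_tensors="pt")
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
    )
    # Drop the prompt tokens so only the newly generated text is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling generate_reply with the value used for ${output_path} in the recover step would load the recovered weights; format_prompt can be used on its own with any other inference stack.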

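Performance

Here is the performance of this model across the benchmarks explored in our paper "How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources":

MMLU 0-shot	MMLU 5-shot	GSM Direct	GSM CoT	BBH Direct	BBH CoT	TydiQA Gold-Passage	TydiQA Closed-book	Codex-Eval Pass@1	Codex-Eval Pass@10	AlpacaFarm vs Davinci-003	Average
41.5	40.3	7.0	10.0	32.6	31.8	31.2	7.2	13.2	22.0	21.1	23.3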

📄 Citation

If you use this model, please cite our paper, the LLaMa paper, and the original dataset:

@misc{wang2023far,
      title={How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources}, 
      author={Yizhong Wang and Hamish Ivison and Pradeep Dasigi and Jack Hessel and Tushar Khot and Khyathi Raghavi Chandu and David Wadden and Kelsey MacMillan and Noah A. Smith and Iz Beltagy and Hannaneh Hajishirzi},
      year={2023},
      eprint={2306.04751},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@misc{touvron2023llama,
      title={LLaMA: Open and Efficient Foundation Language Models}, 
      author={Hugo Touvron and Thibaut Lavril and Gautier Izacard and Xavier Martinet and Marie-Anne Lachaux and Timothée Lacroix and Baptiste Rozière and Naman Goyal and Eric Hambro and Faisal Azhar and Aurelien Rodriguez and Armand Joulin and Edouard Grave and Guillaume Lample},
      year={2023},
      eprint={2302.13971},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto },
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}