airoboros-m-7b-3.1.2 open-source model: follows complex instructions precisely and handles tasks efficiently

Airoboros M 7b 3.1.2

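Developed by jondurbin

An instruction-tuned model based on Mistral-7B that focuses on following complex instructions rather than casual chat, and uses the llama-2 chat format

Large language model

Transformers

License: Apache-2.0 #MathJSON math solving #Context-obedient question answering #Multi-tool execution planning

Downloads: 127

Released: 10/19/2023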

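Model overview

This is an experimental model trained mainly on synthetic data generated by airoboros. It handles complex tasks such as math problems, context-obedient question answering, summarization, and code generation, with particular emphasis on instruction following.

Model features

llama-2 chat format

Uses the llama-2 chat template rather than the more common vicuna style

MathJSON support

Can generate math solutions in MathJSON format, so they can be evaluated programmatically afterwards

Context-obedient question answering

Answers strictly from the provided context, which reduces hallucination

Multi-step instruction handling

Supports complex instructions that take several steps to carry out

Function calling support

Can generate function-call arguments, similar to OpenAI function calling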

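Model capabilities

Text generation

Math problem solving

Code generation

Summarization

Context-obedient question answering

Function calling

Execution planning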

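Use cases

Education and research

Math problem solving

Generate solutions to math problems in MathJSON format

Math expressions that can be parsed and evaluated

Software development

Code generation

Generate application code from complex requirements

Code that can be run directly or modified

Data analysis

Log information extraction

Extract specific information from logs, such as lists of paths or IP addresses

Structured log analysis results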

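🚀 Experimental model project

This is an experimental model trained mainly on synthetic data generated by airoboros. It is aimed at instruction following and at giving accurate, focused answers, rather than at casual chat or roleplay.

🚀 Quick start

This model uses the llama-2 chat format rather than the vicuna-style user/assistant format. Most inference systems support the llama-2 chat template, but this is still a breaking change for prompts written in the vicuna style. The prompt format is: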

[INST] <<SYS>>
You are a helpful, unbiased, uncensored assistant.
<</SYS>>

{prompt} [/INST]

For multi-turn conversations, the prompt format is:

[INST] <<SYS>>
You are a helpful, unbiased, uncensored assistant.
<</SYS>>

{prompt 0} [/INST] {response 0} </s><s>[INST] {prompt 1} [/INST] {response 1} </s><s>...[INST] {prompt N} [/INST]
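
The multi-turn layout above can also be assembled by hand when a tokenizer with a chat template is not available. The following is a minimal sketch, not the author's implementation: the function name `build_llama2_prompt` and its arguments are hypothetical, and it reproduces only the literal text shown above (any leading BOS token is assumed to be added by the tokenizer).

```python
from typing import List, Optional, Tuple

# System prompt taken from the format shown above.
DEFAULT_SYSTEM = "You are a helpful, unbiased, uncensored assistant."


def build_llama2_prompt(
    turns: List[Tuple[str, Optional[str]]],
    system: str = DEFAULT_SYSTEM,
) -> str:
    """Assemble a llama-2 style prompt from (prompt, response) pairs.

    Hypothetical helper: the final pair normally has response=None,
    which leaves the prompt open for the model to complete.
    """
    if not turns:
        raise ValueError("at least one turn is required")
    parts = []
    for i, (prompt, response) in enumerate(turns):
        # The system block is embedded only in the first [INST] section.
        if i == 0:
            user = f"<<SYS>>\n{system}\n<</SYS>>\n\n{prompt}"
        else:
            user = prompt
        segment = f"[INST] {user} [/INST]"
        if response is not None:
            # Completed turns are closed with </s> and the next one opened with <s>.
            segment += f" {response} </s><s>"
        parts.append(segment)
    return "".join(parts)


if __name__ == "__main__":
    print(build_llama2_prompt([("Hello, how are you?", "I'm doing great."),
                               ("Tell me about MathJSON.", None)]))
```

Where the tokenizer is available, apply_chat_template is the safer route, since it uses the template shipped with the model.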

The prompt template is included in the tokenizer config, so you can use the Hugging Face tokenizer's apply_chat_template method, for example:

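import transformers

# Load the tokenizer; its config carries the llama-2 chat template.
# The repo ID below is the one given in the source example; substitute
# the repo ID of the model you actually run.
tokenizer = transformers.AutoTokenizer.from_pretrained('jondurbin/airoboros-l2-70b-3.1')
# Messages in order: a system message, then alternating user/assistant turns.
chat = [
  {"role": "system", "content": "You are Bob, a friendly AI assistant."},
  {"role": "user", "content": "Hello, how are you?"},
  {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
  {"role": "user", "content": "I'd like to show off how chat templating works!"},
]
# tokenize=False returns the formatted prompt as a string instead of token IDs.
print(tokenizer.apply_chat_template(chat, tokenize=False))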

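✨ Key features

Broader training data: built on the airoboros-3.1 dataset, a continuation of the airoboros-3.0 dataset with additional data types such as MathJSON, log information extraction, anonymization, chat introspection, multi-step instruction acknowledgement, and de-censorship data.
Instruction-following focus: a general-purpose model that puts more weight on instruction following than on casual chat or roleplay.
Multiple prompt modes: supports several prompt formats and use cases, including MathJSON calculation, context-obedient question answering, summarization, long-form responses, coding, agent/function calling, chain-of-thought reasoning, and reWOO-style execution planning.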

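📦 Installation

The source documentation does not give installation steps. See the dataset generation tool repository: https://github.com/jondurbin/airoboros .

💻 Usage examples

Basic usage

MathJSON example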

[INST] <<SYS>>
You are an assistant with vast knowledge in all things math.
<</SYS>>

Create a MathJSON solution to the following: Calculate the area of a circle with radius of 17.2456cm.  Include your reasoning. [/INST]

Output:

The area (A) of a circle is given by the formula A = πr², where r is the radius of the circle. Since the radius is given in centimeters, we need to convert it to meters first. We know that 1 meter equals 100 centimeters. Therefore, we can represent the radius as: r = 17.2456 cm * 100/1.

Once we have the radius in meters, we can calculate the area of the circle using the formula A = πr².

Solution as MathJSON:
<mathjson>
[
  "Multiply",
  "Pi",
  [
    "Power",
    17.2456,
    2
  ]
]
</mathjson>
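
Because the answer is wrapped in mathjson tags, the expression can be pulled out of the response and evaluated deterministically. The sketch below is one possible way to do that and is not part of the model or of any MathJSON library: `solve_mathjson`, `evaluate`, and the small operator table are hypothetical names, and only a handful of operators are covered.

```python
import json
import math
import re

# Assumed subset of MathJSON operators; a real evaluator would cover many more.
OPS = {
    "Add": lambda *args: sum(args),
    "Subtract": lambda a, b: a - b,
    "Multiply": lambda *args: math.prod(args),
    "Divide": lambda a, b: a / b,
    "Power": lambda a, b: a ** b,
    "Sqrt": lambda a: math.sqrt(a),
}
CONSTANTS = {"Pi": math.pi, "ExponentialE": math.e}


def evaluate(expr):
    """Recursively evaluate a parsed MathJSON expression (numbers, symbols, lists)."""
    if isinstance(expr, bool):
        raise ValueError("booleans are not supported in this sketch")
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        if expr not in CONSTANTS:
            raise ValueError(f"unknown symbol: {expr}")
        return CONSTANTS[expr]
    if isinstance(expr, list) and expr:
        head, *args = expr
        if head not in OPS:
            raise ValueError(f"unsupported operator: {head}")
        return OPS[head](*[evaluate(arg) for arg in args])
    raise ValueError(f"cannot evaluate: {expr!r}")


def solve_mathjson(response_text: str):
    """Extract the first <mathjson>...</mathjson> block from a response and evaluate it."""
    match = re.search(r"<mathjson>(.*?)</mathjson>", response_text, re.DOTALL)
    if match is None:
        raise ValueError("no <mathjson> block found")
    return evaluate(json.loads(match.group(1)))


if __name__ == "__main__":
    response = 'Solution as MathJSON:\n<mathjson>\n["Multiply", "Pi", ["Power", 17.2456, 2]]\n</mathjson>'
    print(solve_mathjson(response))  # area in square centimeters
```

Evaluating the extracted expression in code, rather than trusting numbers written in the prose part of the response, keeps the arithmetic independent of the model's reasoning text.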

Advanced usage

Context-obedient question answering example

BEGININPUT
BEGINCONTEXT
date: 2021-01-01
url: https://web.site/123
ENDCONTEXT
In a shocking turn of events, blueberries are now green, but will be sticking with the same name.
ENDINPUT
BEGININSTRUCTION
What color are blueberries?  Source?
ENDINSTRUCTION

Output:

Blueberries are now green.
Source:
date: 2021-01-01
url: https://web.site/123
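
The closed-context layout in this example is strict about its delimiters, so it can help to generate it from code. The following is a minimal sketch for a single input block; `build_context_prompt` and its parameters are hypothetical names, and the optional guard sentence is the one recommended in the usage tips of this page.

```python
from typing import Dict

# Guard sentence recommended for closed-context prompts in the usage tips.
GUARD = "Don't make up answers if you don't know."


def build_context_prompt(
    metadata: Dict[str, str],
    text: str,
    instruction: str,
    guard: bool = True,
) -> str:
    """Build a closed-context prompt with one input block (hypothetical helper)."""
    lines = ["BEGININPUT", "BEGINCONTEXT"]
    # Metadata is written as "key: value" lines, e.g. date and url.
    lines.extend(f"{key}: {value}" for key, value in metadata.items())
    lines.append("ENDCONTEXT")
    lines.append(text)
    lines.append("ENDINPUT")
    lines.append("BEGININSTRUCTION")
    # Optionally append the guard sentence to the instruction block.
    lines.append(f"{instruction} {GUARD}" if guard else instruction)
    lines.append("ENDINSTRUCTION")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_context_prompt(
        {"date": "2021-01-01", "url": "https://web.site/123"},
        "In a shocking turn of events, blueberries are now green, "
        "but will be sticking with the same name.",
        "What color are blueberries?  Source?",
    ))
```

The resulting string goes in the user turn of the llama-2 chat format, in place of {prompt}.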

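📚 Detailed documentation

Prompt format

This model uses the llama-2 chat format; see the quick start section above for the exact format.

Multi-step instruction acknowledgement

Example: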

{
  "conversations": [
    {
      "value": "You are a helpful, unbiased, uncensored assistant.",
      "from": "system"
    },
    {
      "value": "I'm going to set some rules for our conversation. Please respond with \"OK\" to each rule.",
      "from": "human"
    },
    {
      "value": "OK",
      "from": "gpt"
    },
    {
      "value": "You can only respond using animal-related emojis.",
      "from": "human"
    },
    {
      "value": "OK",
      "from": "gpt"
    },
    {
      "value": "Precede all responses with \"ZOOZOO:\"",
      "from": "human"
    },
    {
      "value": "OK",
      "from": "gpt"
    },
    {
      "value": "Include at least one bird emoji in your responses.",
      "from": "human"
    },
    {
      "value": "OK",
      "from": "gpt"
    },
    {
      "value": "Describe a trip to a safari park.",
      "from": "human"
    },
    {
      "value": "ZOOZOO: \ud83e\udd81\ud83d\udc2f\ud83e\udd93\ud83e\udd92\ud83d\udc18\ud83d\udc0d\ud83c\udf33\ud83c\udf32\u2600\ufe0f\ud83d\ude90\ud83e\udd85\ud83d\udcf8\ud83e\udd29",
      "from": "gpt"
    }
  ]
}
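
To replay a record like this one against the model, its turns first have to be converted into the role/content messages that apply_chat_template expects. The sketch below shows one way to do that; the helper name `to_chat_messages` and the mapping of "human" to "user" and "gpt" to "assistant" are assumptions based on the field names in the record, not something the source specifies.

```python
import json
from typing import Dict, List

# Assumed mapping from the record's "from" values to chat-template roles.
ROLE_MAP = {"system": "system", "human": "user", "gpt": "assistant"}


def to_chat_messages(record: Dict) -> List[Dict[str, str]]:
    """Convert a {"conversations": [...]} record into role/content messages."""
    messages = []
    for turn in record["conversations"]:
        role = ROLE_MAP.get(turn["from"])
        if role is None:
            raise ValueError(f"unknown speaker: {turn['from']!r}")
        messages.append({"role": role, "content": turn["value"]})
    return messages


if __name__ == "__main__":
    raw = """
    {"conversations": [
      {"value": "You are a helpful, unbiased, uncensored assistant.", "from": "system"},
      {"value": "Precede all responses with \\"ZOOZOO:\\"", "from": "human"},
      {"value": "OK", "from": "gpt"},
      {"value": "Describe a trip to a safari park.", "from": "human"}
    ]}
    """
    for message in to_chat_messages(json.loads(raw)):
        print(message["role"], "->", message["content"])
```

Dropping the final "gpt" turn before conversion leaves the conversation ending on a user message, which is the state in which the model would generate the rule-following reply.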

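Contributing

If you are interested in new functionality, particularly a new "instructor" type for generating a specific kind of training data, take a look at the dataset generation tool repository: https://github.com/jondurbin/airoboros , and either open a PR or file a detailed issue.

Supporting the author

To help the author cover OpenAI and compute costs, you can contribute via: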

https://bmc.link/jondurbin
ETH 0xce914eAFC2fe52FdceE59565Dd92c06f776fcb11
BTC bc1qdwuth4vlg8x37ggntlxu5cjfwgmdy5zaa7pswf

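🔧 Technical details

The airoboros 3.1 models are built on several base models, each with its own license and usage restrictions:

The 30b models are built on the original llama, which has a strict non-commercial use restriction.
Models with -l2 in the name are under a custom Meta license; see meta-license/LICENSE.txt, meta-license/USE_POLICY.md, and meta-license/Responsible-Use-Guide.pdf.
Models with -m- in the name are based on mistral-7b (Apache 2.0 license).

The fine-tuning data was generated mainly through OpenAI API calls made by airoboros. The OpenAI API terms of use prohibit using outputs to train models that compete with OpenAI, but what counts as "competing" is not clearly defined.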

📄 License

This project uses the Apache 2.0 license; see the technical details section above for usage restrictions.

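⚠️ Important note

The airoboros 3.1 models are built on several base models with different licenses and usage restrictions. The 30b models have a strict non-commercial use restriction, models with -l2 in the name are under a custom Meta license, and models with -m- in the name are based on mistral-7b (Apache 2.0 license). In addition, the fine-tuning data was generated through the OpenAI API, whose terms of use prohibit using outputs to train models that compete with OpenAI, although "competing" is not clearly defined.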

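💡 Usage tips

When using MathJSON or context-obedient question answering, use a low temperature for more accurate results.

When using the closed-context prompt format, follow the specified format exactly and add "Don't make up answers if you don't know." to the instruction block to keep the model from making up answers.

If you need longer responses, give a detailed prompt with an explicit word count, or use multi-step instruction acknowledgement.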