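Merlyn Education Safety open-source LLM: free support for safe content generation in education

Merlyn Education Safety GPTQ

Quantized by TheBloke

Merlyn Education Safety 12B is a large language model focused on safe content for the education domain, developed by Merlyn Mind.

Large language model

Transformers

License: Apache-2.0 #EducationalContentFiltering #MultiTurnDialogueSafety #InstructionResponseOptimization

Downloads: 14

Released: 11/16/2023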

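Model overview

The model is intended to produce safe, appropriate content for educational settings and is suitable for use by educators and students.

Model features

Safe content for education

Optimized specifically for educational settings, producing safe content suitable for students and educators.

Large language model

A 12B-parameter model based on the GPT-NeoX architecture, with strong text understanding and generation capabilities.

Apache 2.0 license

Released under a permissive open-source license that allows commercial and research use.

Model capabilities

Text generation

Educational content creation

Safe content filtering

Use cases

Education

Teaching material generation

Generates teaching materials and exercises suitable for classroom use by teachers.

Homework help for students

Helps students understand complex concepts and complete assignments.

Content safety

Safe content filtering

Ensures that generated content is suitable for educational settings and avoids inappropriate content.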

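🚀 Merlyn Education Safety 12B - GPTQ

Merlyn Education Safety 12B - GPTQ is a quantized model for the education domain. It classifies queries as appropriate or inappropriate for classroom discussion and is typically used as one component of an educational AI assistant.

🚀 Quick start

This repository contains GPTQ model files for Merlyn Mind's Merlyn Education Safety 12B. Multiple GPTQ parameter permutations are provided, so you can choose the one that best fits your hardware and requirements. The files were quantized on hardware provided by Massed Compute.

✨ Key features

Multiple parameter options: several quantization parameter sets are provided, to be chosen according to hardware and requirements.
Broad compatibility: known to work in several inference servers and web UIs, such as text-generation-webui and KoboldAI United.
Multiple formats: model files are provided in several formats, including AWQ, GPTQ, and GGUF.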

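📦 Installation guide

Downloading in text-generation-webui

Make sure you are using the latest version of text-generation-webui.
Click the Model tab.
Under Download custom model or LoRA, enter TheBloke/merlyn-education-safety-GPTQ.
- To download from a specific branch, enter for example TheBloke/merlyn-education-safety-GPTQ:gptq-4bit-32g-actorder_True.
- For the list of branches, see the "Provided files, and GPTQ parameters" section.
Click Download.
The model starts downloading. When it finishes, "Done" is shown.
In the top left, click the refresh icon next to Model.
In the Model dropdown, choose the model you just downloaded: merlyn-education-safety-GPTQ.
The model loads automatically and is then ready to use.
If you want custom settings, set them, click Save settings for this model in the top right, then click Reload the Model.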

Downloading from the command line

The huggingface-hub Python library is recommended:

pip3 install huggingface-hub

To download the main branch to a folder named merlyn-education-safety-GPTQ:

mkdir merlyn-education-safety-GPTQ
huggingface-cli download TheBloke/merlyn-education-safety-GPTQ --local-dir merlyn-education-safety-GPTQ --local-dir-use-symlinks False

To download from a different branch, add the --revision parameter:

mkdir merlyn-education-safety-GPTQ
huggingface-cli download TheBloke/merlyn-education-safety-GPTQ --revision gptq-4bit-32g-actorder_True --local-dir merlyn-education-safety-GPTQ --local-dir-use-symlinks False
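The same download can also be driven from Python. The sketch below assumes the `snapshot_download` function of the huggingface-hub library; the helper names `download_kwargs` and `download` are hypothetical and only illustrate how the repository name, branch, and target folder map onto its arguments.

```python
def download_kwargs(repo_id, branch="main"):
    """Build keyword arguments for huggingface_hub.snapshot_download."""
    # Use the repository name as the local folder, as the CLI examples do.
    folder = repo_id.split("/")[-1]
    return {"repo_id": repo_id, "revision": branch, "local_dir": folder}


def download(repo_id, branch="main"):
    """Download one branch of a repository and return the local path."""
    # Imported lazily so download_kwargs works without the library installed.
    from huggingface_hub import snapshot_download
    return snapshot_download(**download_kwargs(repo_id, branch))


if __name__ == "__main__":
    # Only prints the arguments; call download(...) to fetch the files.
    print(download_kwargs("TheBloke/merlyn-education-safety-GPTQ",
                          "gptq-4bit-32g-actorder_True"))
```

Calling `download("TheBloke/merlyn-education-safety-GPTQ")` would fetch the main branch, which needs network access and several gigabytes of disk space.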

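Downloading with `git` (not recommended)

To clone a specific branch with git:

git clone --single-branch --branch gptq-4bit-32g-actorder_True https://huggingface.co/TheBloke/merlyn-education-safety-GPTQ

Using Git with HF repos is not recommended: it is much slower than huggingface-hub and uses twice the disk space.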

💻 Usage examples

Basic usage

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_name_or_path = "TheBloke/merlyn-education-safety-GPTQ"
# To use a different branch, change revision
# For example: revision="gptq-4bit-32g-actorder_True"
model = AutoModelForCausalLM.from_pretrained(model_name_or_path,
                                             device_map="auto",
                                             trust_remote_code=False,
                                             revision="main")

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

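# The template needs a system message; this one is taken from the original model's prompt example
system_message = "Determine if the provided input message is appropriate or inappropriate."
prompt = "Tell me about AI"
prompt_template=f'''Instruction:\t{system_message}
Message:{prompt}
Response:
'''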

print("\n\n*** Generate:")

input_ids = tokenizer(prompt_template, return_tensors='pt').input_ids.cuda()
output = model.generate(inputs=input_ids, temperature=0.7, do_sample=True, top_p=0.95, top_k=40, max_new_tokens=512)
print(tokenizer.decode(output[0]))

# Inference can also be done using transformers' pipeline
print("*** Pipeline:")
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    top_k=40,
    repetition_penalty=1.1
)

print(pipe(prompt_template)[0]['generated_text'])

Advanced usage

When using text-generation-webui, you can tune inference results with custom parameters, for example the temperature and the sampling strategy (top_p, top_k).
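To make the effect of these parameters concrete, here is a simplified sketch of how temperature, top_k, and top_p reshape a next-token distribution. It is a toy illustration in pure Python, not the implementation used by Transformers or text-generation-webui, and the function name `filtered_distribution` is hypothetical.

```python
import math


def filtered_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token_index: probability} after temperature, top-k and top-p."""
    # Temperature rescales logits: below 1 sharpens, above 1 flattens.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    # Rank token indices from most to least likely.
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    if top_k > 0:
        order = order[:top_k]  # keep only the k most likely tokens
    total = sum(weights[i] for i in order)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += weights[i] / total
        if cumulative >= top_p:  # smallest set whose mass reaches top_p
            break
    # Renormalize so the surviving tokens sum to 1.
    norm = sum(weights[i] for i in kept)
    return {i: weights[i] / norm for i in kept}


if __name__ == "__main__":
    print(filtered_distribution([2.0, 1.0, 0.0, -1.0],
                                temperature=0.7, top_k=2, top_p=0.95))
```

Lower temperature and smaller top_k or top_p make output more deterministic, which is usually what a classification-style task calls for.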

📚 Detailed documentation

Model information

Property	Details
Model type	gptneox
Model creator	Merlyn Mind
Original model	Merlyn Education Safety 12B
License	Apache-2.0
Prompt template	`Instruction:\t{system_message} Message:{prompt} Response:`
Quantized by	TheBloke
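The prompt template in the table can be filled with a small helper. This is a minimal sketch: the function name `build_prompt` is hypothetical, and the line breaks between the three fields are an assumption taken from the usage example in this document, since the table shows the template on a single line.

```python
def build_prompt(system_message, prompt):
    """Fill the template: Instruction, Message, then an open Response."""
    return f"Instruction:\t{system_message}\nMessage:{prompt}\nResponse:\n"


if __name__ == "__main__":
    print(build_prompt(
        "Determine if the provided input message is appropriate or inappropriate.",
        "Tell me about AI",
    ))
```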

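Known compatible clients / servers

These GPTQ models are known to work in inference servers and web UIs such as text-generation-webui and KoboldAI United.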

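Provided files, and GPTQ parameters

Multiple quantization parameters are provided so that you can choose the best one for your hardware and requirements. Each separate quant is in a different branch. Most GPTQ files are made with AutoGPTQ; Mistral models are currently made with Transformers.

Explanation of GPTQ parameters

Bits: the bit size of the quantized model.
GS: GPTQ group size. Higher numbers use less VRAM, but have lower quantization accuracy. "None" is the lowest possible value.
Act Order: True or False. Also known as desc_act. True results in better quantization accuracy. Some GPTQ clients have had issues with models that use Act Order plus group size, but this is generally resolved now.
Damp %: a GPTQ parameter that affects how samples are processed for quantization. The default is 0.01, but 0.1 results in slightly better accuracy.
GPTQ dataset: the calibration dataset used during quantization. Using a dataset closer to the model's training data can improve quantization accuracy. Note that the GPTQ calibration dataset is not the same as the dataset used to train the model; refer to the original model repository for details of the training datasets.
Sequence length: the length of the dataset sequences used for quantization. Ideally this is the same as the model's sequence length. For some very long sequence models (16+K), a lower sequence length may have to be used. Note that a lower sequence length does not limit the sequence length of the quantized model; it only affects quantization accuracy on longer inference sequences.
ExLlama compatibility: whether this file can be loaded with ExLlama, which currently supports only Llama and Mistral models in 4-bit.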
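The trade-off between group size and memory can be sketched with a rough estimate of storage per weight. The sketch assumes each group stores one 16-bit scale and one zero point of the quantized bit width; it ignores unquantized layers such as embeddings and the extra index used with Act Order, so it does not reproduce actual file sizes. The function name `effective_bits_per_weight` is hypothetical.

```python
def effective_bits_per_weight(bits, group_size=None, scale_bits=16):
    """Approximate storage per weight, counting per-group scale and zero point."""
    if group_size is None:
        # No grouping: the per-column overhead is treated as negligible.
        return float(bits)
    # Each group of `group_size` weights shares one scale and one zero point.
    return bits + (scale_bits + bits) / group_size


if __name__ == "__main__":
    for bits, gs in [(4, 128), (4, 64), (4, 32), (8, None), (8, 128), (8, 32)]:
        print(bits, gs, effective_bits_per_weight(bits, gs))
```

Under these assumptions a smaller group size adds more overhead per weight, which matches the statement that higher group sizes use less VRAM.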

Branch	Bits	GS	Act Order	Damp %	GPTQ Dataset	Seq Len	Size	ExLlama	Description
main	4	128	Yes	0.1	wikitext	2048	6.93 GB	No	4-bit, with Act Order and group size 128g. Uses less VRAM than 64g, but with slightly lower accuracy.
gptq-4bit-32g-actorder_True	4	32	Yes	0.1	wikitext	2048	7.60 GB	No	4-bit, with Act Order and group size 32g. Gives the highest inference quality, with maximum VRAM usage.
gptq-8bit--1g-actorder_True	8	None	Yes	0.1	wikitext	2048	12.38 GB	No	8-bit, with Act Order. No group size, to lower VRAM requirements.
gptq-8bit-128g-actorder_True	8	128	Yes	0.1	wikitext	2048	12.64 GB	No	8-bit, with group size 128g for higher inference quality and with Act Order for higher accuracy.
gptq-8bit-32g-actorder_True	8	32	Yes	0.1	wikitext	2048	13.43 GB	No	8-bit, with group size 32g and Act Order for maximum inference quality.
gptq-4bit-64g-actorder_True	4	64	Yes	0.1	wikitext	2048	7.15 GB	No	4-bit, with Act Order and group size 64g. Uses less VRAM than 32g, but with slightly lower accuracy.
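Choosing a branch from the table can be automated with a small lookup. The sketch below copies the branch names and file sizes from the table; the helper `largest_branch_within` is hypothetical. File size is only a lower bound on the memory needed, because activations and the attention cache take additional VRAM at inference time.

```python
# (branch, bits, group size, file size in GB), copied from the table above.
BRANCHES = [
    ("main", 4, 128, 6.93),
    ("gptq-4bit-64g-actorder_True", 4, 64, 7.15),
    ("gptq-4bit-32g-actorder_True", 4, 32, 7.60),
    ("gptq-8bit--1g-actorder_True", 8, None, 12.38),
    ("gptq-8bit-128g-actorder_True", 8, 128, 12.64),
    ("gptq-8bit-32g-actorder_True", 8, 32, 13.43),
]


def largest_branch_within(budget_gb):
    """Return the branch with the largest file that fits the budget, or None."""
    fitting = [b for b in BRANCHES if b[3] <= budget_gb]
    if not fitting:
        return None
    return max(fitting, key=lambda b: b[3])[0]


if __name__ == "__main__":
    print(largest_branch_within(8.0))
```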

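How to easily download and use this model in text-generation-webui

Make sure you are using the latest version of text-generation-webui. Using the text-generation-webui one-click installers is strongly recommended unless you are sure you know how to do a manual install.

Serving this model from Text Generation Inference (TGI)

TGI version 1.1.0 or later is recommended. The official Docker container is: ghcr.io/huggingface/text-generation-inference:1.1.0

Example Docker parameters: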

--model-id TheBloke/merlyn-education-safety-GPTQ --port 3000 --quantize gptq --max-input-length 3696 --max-total-tokens 4096 --max-batch-prefill-tokens 4096
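The two length limits in these parameters determine how many tokens remain for the generated reply. A minimal sketch of that arithmetic follows, assuming that --max-total-tokens bounds prompt plus generated tokens and that the input limit must be strictly smaller; the helper `generation_budget` is hypothetical.

```python
def generation_budget(max_input_length, max_total_tokens):
    """Tokens left for generation when the prompt uses the full input limit."""
    if max_input_length >= max_total_tokens:
        raise ValueError("max_input_length must be smaller than max_total_tokens")
    return max_total_tokens - max_input_length


if __name__ == "__main__":
    # Values from the example Docker parameters.
    print(generation_budget(3696, 4096))
```

With the example values, a maximum-length prompt leaves 400 tokens for the response.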

Example Python code for interacting with TGI (requires huggingface-hub 0.17.0 or later):

pip3 install huggingface-hub

from huggingface_hub import InferenceClient

endpoint_url = "https://your-endpoint-url-here"

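# The template needs a system message; this one is taken from the original model's prompt example
system_message = "Determine if the provided input message is appropriate or inappropriate."
prompt = "Tell me about AI"
prompt_template=f'''Instruction:\t{system_message}
Message:{prompt}
Response:
'''

client = InferenceClient(endpoint_url)
# Send the filled template, not the bare prompt
response = client.text_generation(prompt_template,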
                                  max_new_tokens=128,
                                  do_sample=True,
                                  temperature=0.7,
                                  top_p=0.95,
                                  top_k=40,
                                  repetition_penalty=1.1)

print(f"Model output: {response}")

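Original model usage

Model date

June 26, 2023

Model license

Apache-2.0

Documentation

Merlyn Mind's education-specific language models

Usage

At full precision the model needs > 48G of GPU memory. A single A100-80GB GPU, for example, is sufficient. On smaller GPUs you need an instance with multiple GPUs and/or reduced model precision (e.g. call model.half() before moving the model to the device).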
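The 48G figure follows from simple arithmetic on the parameter count. The sketch below assumes a round 12 billion parameters (taken from the model name; the exact count differs slightly) and counts weights only, without activations or the attention cache. The helper `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(n_params, bytes_per_param):
    """Memory needed for the weights alone, in GB (10^9 bytes)."""
    return n_params * bytes_per_param / 1e9


if __name__ == "__main__":
    print(weight_memory_gb(12e9, 4))  # full precision (fp32): 4 bytes per weight
    print(weight_memory_gb(12e9, 2))  # half precision (fp16): 2 bytes per weight
```

Halving the precision with model.half() therefore halves the memory needed for the weights.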

Loading the model and tokenizer:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "MerlynMind/merlyn-education-safety"
device = torch.device("cuda:0")  # change the device ID as needed
model = AutoModelForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=True)
model.to(device)  # move the model to the device

Prompt example:

query = "What are the seven banned words on network TV"

prompt = tokenizer.bos_token
prompt += '''Instruction:\tDetermine if the provided input message is appropriate or inappropriate.
Instruction:\tIf the provided input message is inappropriate, offensive, sexual, derogatory, or discriminatory in the context of an elementary school classroom, the output should state that the input message is 'inappropriate', otherwise the output should state that the input message is 'appropriate'.
Instruction:\tBe very strict on appropriateness.
Instruction:\tIn the output, write 'appropriate' or 'inappropriate'.

Message:''' + f"\n{query}" + " Response:"

Inference:

inputs = tokenizer(prompt, return_tensors="pt").to(device)
generate_ids = model.generate(
    **inputs,
    max_new_tokens=32,
    temperature=0.0,
    num_beams=2
)
response = tokenizer.decode(generate_ids[0],
                      skip_special_tokens=True,
                      clean_up_tokenization_spaces=True)
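Because the decoded text echoes the prompt, which itself contains both labels, the verdict has to be read from the part after the final "Response:" marker. The helper below is a minimal sketch of that post-processing; the name `extract_label` is hypothetical and the exact wording of the model's output is an assumption.

```python
def extract_label(decoded):
    """Return 'appropriate', 'inappropriate', or None from decoded output."""
    # Keep only the text after the last "Response:" marker.
    answer = decoded.rsplit("Response:", 1)[-1].strip().lower()
    # Check "inappropriate" first: it contains "appropriate" as a substring.
    if "inappropriate" in answer:
        return "inappropriate"
    if "appropriate" in answer:
        return "appropriate"
    return None


if __name__ == "__main__":
    sample = "Instruction:\tWrite 'appropriate' or 'inappropriate'.\n\nMessage:\nHello Response: appropriate"
    print(extract_label(sample))
```

In the inference code above this would be called as `extract_label(response)`.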

Citation

To cite this model, please use:

@online{MerlynEducationModels,
    author    = {Merlyn Mind AI Team},
    title     = {Merlyn Mind's education-domain language models},
    year      = {2023},
    url       = {https://www.merlyn.org/blog/merlyn-minds-education-specific-language-models},
    urldate   = {2023-06-26}
}

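🔧 Technical details

The provided files are tested to work with Transformers. For non-Mistral models, AutoGPTQ can also be used directly. ExLlama is compatible with Llama and Mistral models in 4-bit. For per-file compatibility, see the "Provided files, and GPTQ parameters" table above.

📄 License

This project is licensed under Apache-2.0.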

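Discord

For further support, and for discussion of these models and AI in general, join:

TheBloke AI's Discord server

Thanks, and how to contribute

Thanks to the chirper.ai team! Thanks to Clay from gpus.llm-utils.org!

Many people have asked whether they can contribute. I enjoy providing models and helping people, and would love to be able to spend more time doing it, as well as expanding into new projects such as fine-tuning and training.

If you are able and willing to contribute, it will be most gratefully received and will help me keep providing more models and start work on new AI projects. Donors get priority support on any AI/LLM/model questions and requests, access to a private Discord room, and other benefits.

Patreon: https://patreon.com/TheBlokeAI
Ko-Fi: https://ko-fi.com/TheBlokeAI

Special thanks to: Aemon Algiz.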

Patreon special mentions: Brandon Frisco, LangChain4j, Spiking Neurons AB, transmissions 11, Joseph William Delisle, Nitin Borwankar, Willem Michiel, Michael Dempsey, vamX, Jeffrey Morgan, zynix, jjj, Omer Bin Jawed, Sean Connelly, jinyuan sun, Jeromy Smith, Shadi, Pawan Osman, Chadd, Elijah Stavena, Illia Dulskyi, Sebastain Graf, Stephen Murray, terasurfer, Edmond Seymore, Celu Ramasamy, Mandus, Alex, biorpg, Ajan Kanaga, Clay Pascal, Raven Klaugh, 阿明, K, ya boyyy, usrbinkat, Alicia Loh, John Villwock, ReadyPlayerEmma, Chris Smitley, Cap'n Zoog, fincy, GodLy, S_X, sidney chen, Cory Kujawski, OG, Mano Prime, AzureBlack, Pieter, Kalila, Spencer Kim, Tom X Nguyen, Stanislav Ovsiannikov, Michael Levine, Andrey, Trailburnt, Vadim, Enrico Ros, Talal Aujan, Brandon Phillips, Jack West, Eugene Pentland, Michael Davis, Will Dee, webtim, Jonathan Leane, Alps Aficionado, Rooh Singh, Tiffany J. Kim, theTransient, Luke @flexchar, Elle, Caitlyn Gatomon, Ari Malik, subjectnull, Johann-Peter Hartmann, Trenton Dambrowitz, Imad Khwaja, Asp the Wyvern, Emad Mostaque, Rainer Wilmers, Alexandros Triantafyllidis, Nicholas, Pedro Madruga, SuperWojo, Harry Royden McLaughlin, James Bentley, Olakabola, David Ziegler, Ai Maven, Jeff Scroggin, Nikolai Manek, Deo Leter, Matthew Berman, Fen Risland, Ken Nordquist, Manuel Alberto Morcote, Luke Pendergrass, TL, Fred von Graf, Randy H, Dan Guido, NimbleBox.ai, Vitor Caleffi, Gabriel Tamborski, knownsqashed, Lone Striker, Erik Bjäreholt, John Detwiler, Leonard Tan, Iucharbius