Merlyn Education Safety open-source large language model: free support for safe content generation in education

Merlyn Education Safety GPTQ

Quantized by TheBloke

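Merlyn Education Safety 12B is a large language model developed by Merlyn Mind that focuses on safe content generation for education.

Large language model

Transformers

License: Apache-2.0 #EducationContentFiltering #MultiTurnDialogueSafety #InstructionResponseOptimization

Downloads: 14

Released: 11/16/2023

Model overview

The model is designed to generate safe, appropriate content for educational settings, for use by educators and students.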

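Model features

Safe content generation for education

Optimized specifically for educational settings; generates safe content suitable for students and educators.

Large language model

A 12B-parameter model based on the GPT-NeoX architecture, with strong text understanding and generation capabilities.

Apache 2.0 license

Released under a permissive open-source license that allows commercial and research use.

Model capabilities

Text generation

Educational content creation

Safe content filtering

Use cases

Education

Teaching material generation

Generates classroom-ready teaching materials and practice exercises for teachers.

Student homework support

Helps students understand complex concepts and complete assignments.

Content safety

Safe content filtering

Ensures that generated content is suitable for educational settings and avoids inappropriate content.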

🚀 Merlyn Education Safety 12B - GPTQ

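Merlyn Education Safety 12B - GPTQ is a quantized model for the education domain. It classifies queries by whether they are appropriate for classroom discussion, and is typically used as one component of an educational AI assistant.

🚀 Quick start

This repository contains GPTQ model files for Merlyn Mind's Merlyn Education Safety 12B. Multiple GPTQ parameter permutations are provided, so you can choose the one that best fits your hardware and requirements. The files were quantized on hardware provided by Massed Compute.

✨ Key features

Multiple parameter options: several quantization parameter sets are provided; choose according to your hardware and requirements.
Multi-platform compatibility: known to work in several inference servers and web UIs, such as text-generation-webui and KoboldAI United.
Multiple formats: model files are available in AWQ, GPTQ, GGUF, and other formats.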

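📦 Installation

Downloading in text-generation-webui

Make sure you are using the latest version of text-generation-webui.
Click the Model tab.
Under Download custom model or LoRA, enter TheBloke/merlyn-education-safety-GPTQ.
- To download from a specific branch, enter for example TheBloke/merlyn-education-safety-GPTQ:gptq-4bit-32g-actorder_True.
- See "Provided files and GPTQ parameters" below for the list of branches.
Click Download.
The model starts downloading. When it finishes, it will say "Done".
In the top left, click the refresh icon next to Model.
In the Model dropdown, choose the model you just downloaded: merlyn-education-safety-GPTQ.
The model loads automatically and is then ready to use.
If you want custom settings, set them, click Save settings for this model, and then click Reload the Model in the top right.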

Downloading from the command line

The huggingface-hub Python library is recommended:

pip3 install huggingface-hub

To download the main branch to a folder called merlyn-education-safety-GPTQ:

mkdir merlyn-education-safety-GPTQ
huggingface-cli download TheBloke/merlyn-education-safety-GPTQ --local-dir merlyn-education-safety-GPTQ --local-dir-use-symlinks False

To download from a different branch, add the --revision parameter:

mkdir merlyn-education-safety-GPTQ
huggingface-cli download TheBloke/merlyn-education-safety-GPTQ --revision gptq-4bit-32g-actorder_True --local-dir merlyn-education-safety-GPTQ --local-dir-use-symlinks False

Downloading with `git` (not recommended)

To clone a specific branch with git:

git clone --single-branch --branch gptq-4bit-32g-actorder_True https://huggingface.co/TheBloke/merlyn-education-safety-GPTQ

Using Git with HF repos is not recommended: it is much slower than huggingface-hub and uses twice the disk space, because the model files are stored both in the target folder and again in the .git folder.
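
The same download can also be scripted from Python. The sketch below is one possible way to do it, using `snapshot_download` from the huggingface-hub library; the helper names `download_kwargs` and `download_branch` are hypothetical and exist only for illustration.

```python
from typing import Optional

REPO_ID = "TheBloke/merlyn-education-safety-GPTQ"


def download_kwargs(branch: str = "main", local_dir: Optional[str] = None) -> dict:
    """Build keyword arguments for huggingface_hub.snapshot_download.

    Hypothetical helper: the default folder name mirrors the CLI example
    (the repository name without the owner prefix).
    """
    return {
        "repo_id": REPO_ID,
        "revision": branch,  # same role as --revision on the command line
        "local_dir": local_dir or REPO_ID.split("/")[-1],
    }


def download_branch(branch: str = "main", local_dir: Optional[str] = None) -> str:
    """Download one branch and return the local folder path."""
    # Imported lazily so download_kwargs can be used without the library installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**download_kwargs(branch, local_dir))
```

Calling `download_branch("gptq-4bit-32g-actorder_True")` would fetch that branch into a local merlyn-education-safety-GPTQ folder; it needs network access and several GB of free disk space.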

💻 Usage examples

Basic usage

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Loading GPTQ weights through Transformers also requires the optimum and auto-gptq packages.
model_name_or_path = "TheBloke/merlyn-education-safety-GPTQ"
# To use a different branch, change revision,
# for example: revision="gptq-4bit-32g-actorder_True"
model = AutoModelForCausalLM.from_pretrained(model_name_or_path,
                                             device_map="auto",
                                             trust_remote_code=False,
                                             revision="main")

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

# Example instruction (placeholder; replace with your own)
system_message = "Determine if the provided input message is appropriate or inappropriate."
prompt = "Tell me about AI"
prompt_template=f'''Instruction:\t{system_message}
Message:{prompt}
Response:
'''

print("\n\n*** Generate:")

input_ids = tokenizer(prompt_template, return_tensors='pt').input_ids.cuda()
output = model.generate(inputs=input_ids, temperature=0.7, do_sample=True, top_p=0.95, top_k=40, max_new_tokens=512)
print(tokenizer.decode(output[0]))

# Inference can also be done with the transformers pipeline
print("*** Pipeline:")
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    top_k=40,
    repetition_penalty=1.1
)

print(pipe(prompt_template)[0]['generated_text'])

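Advanced usage

When using the model in text-generation-webui, you can tune inference results by setting custom parameters, for example the temperature and the sampling strategy (top_p, top_k).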
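
To make the effect of these parameters concrete, here is a simplified, pure-Python sketch of how temperature, top_k, and top_p narrow a token distribution before sampling. It is a toy illustration under stated assumptions (a four-token vocabulary with made-up logits, and a hypothetical function name `sample_distribution`), not the implementation used by text-generation-webui or Transformers.

```python
import math
from typing import Dict


def sample_distribution(logits: Dict[str, float], temperature: float = 0.7,
                        top_k: int = 40, top_p: float = 0.95) -> Dict[str, float]:
    """Return the renormalized probabilities left after temperature,
    top-k, and top-p (nucleus) filtering."""
    # Temperature: divide logits before the softmax; <1 sharpens, >1 flattens.
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(((tok, e / total) for tok, e in exps.items()),
                    key=lambda item: item[1], reverse=True)

    # Top-k: keep only the k most likely tokens, then renormalize.
    ranked = ranked[:top_k]
    norm = sum(p for _, p in ranked)
    ranked = [(tok, p / norm) for tok, p in ranked]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break

    norm = sum(p for _, p in kept)
    return {tok: p / norm for tok, p in kept}


# Made-up logits for illustration only.
toy_logits = {"appropriate": 2.0, "inappropriate": 1.0, "maybe": -1.0, "banana": -3.0}
result = sample_distribution(toy_logits, temperature=0.7, top_k=3, top_p=0.9)
print(result)
```

Lower values of temperature, top_k, and top_p all shrink the set of tokens the sampler can pick from, which tends to make a classification-style answer more stable.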

📚 Documentation

Model information

Property	Details
Model type	gptneox
Model creator	Merlyn Mind
Original model	Merlyn Education Safety 12B
License	Apache-2.0
Prompt template	`Instruction:\t{system_message}\nMessage:{prompt}\nResponse:\n`
Quantized by	TheBloke
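
As a minimal sketch, the prompt template can be wrapped in a small helper so the tab and line breaks are always placed consistently. The function name `build_prompt` is hypothetical, and the instruction text is only an example.

```python
def build_prompt(system_message: str, prompt: str) -> str:
    """Fill the prompt template: a tab follows "Instruction:",
    and each field sits on its own line."""
    return f"Instruction:\t{system_message}\nMessage:{prompt}\nResponse:\n"


example = build_prompt(
    "Determine if the provided input message is appropriate or inappropriate.",
    "Tell me about AI",
)
print(repr(example))
```

The original model's own prompt example, in the usage section of this card, lays out the message and response markers slightly differently, so it may be worth testing both layouts against your data.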

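Available repositories

Known compatible clients / servers

These GPTQ models are known to work in inference servers and web UIs such as text-generation-webui and KoboldAI United.

Provided files and GPTQ parameters

Multiple quantization parameters are provided so that you can choose the best one for your hardware and requirements. Each separate quantization is in a different branch. Most GPTQ files are made with AutoGPTQ; Mistral models are currently made with Transformers.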

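Explanation of GPTQ parameters

Bits: the bit size of the quantized model.
GS: GPTQ group size. Higher numbers use less VRAM but give lower quantization accuracy. "None" is the lowest possible value.
Act Order: True or False. Also known as desc_act. True results in better quantization accuracy. Some GPTQ clients have had issues with models that use Act Order plus Group Size, but this is now generally resolved.
Damp %: a GPTQ parameter that affects how samples are processed for quantization. 0.01 is the default, but 0.1 results in slightly better accuracy.
GPTQ dataset: the calibration dataset used during quantization. Using a dataset closer to the model's training data can improve quantization accuracy. Note that the GPTQ calibration dataset is not the same as the dataset used to train the model; refer to the original model repository for details of the training dataset(s).
Sequence Length: the length of the dataset sequences used for quantization. Ideally this is the same as the model's sequence length. For some very long-sequence models (16+K), a lower sequence length may have to be used. Note that a lower sequence length does not limit the sequence length of the quantized model; it only affects quantization accuracy on longer inference sequences.
ExLlama Compatibility: whether this file can be loaded with ExLlama, which currently supports only Llama and Mistral models in 4-bit.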
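
The branch names in this repository appear to encode three of these parameters (bits, group size, and Act Order). The sketch below parses a branch name back into those values. The naming pattern and the reading of `-1` as "no group size" are inferred from the branch list in this card, not from a documented specification, and `parse_branch` is a hypothetical helper.

```python
import re
from typing import NamedTuple, Optional


class QuantParams(NamedTuple):
    bits: int
    group_size: Optional[int]  # None means no grouping (written as -1 in the name)
    act_order: bool


# Assumed pattern: gptq-<bits>bit-<group size>g-actorder_<True|False>
_BRANCH = re.compile(r"^gptq-(\d+)bit-(-?\d+)g-actorder_(True|False)$")


def parse_branch(name: str) -> QuantParams:
    """Extract quantization parameters from a branch name."""
    match = _BRANCH.match(name)
    if match is None:
        # "main" carries no parameters in its name, so it ends up here too.
        raise ValueError(f"unrecognized branch name: {name!r}")
    bits, group, act = match.groups()
    group_size = int(group)
    return QuantParams(int(bits), None if group_size == -1 else group_size, act == "True")


print(parse_branch("gptq-4bit-32g-actorder_True"))
print(parse_branch("gptq-8bit--1g-actorder_True"))
```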

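Branch	Bits	GS	Act Order	Damp %	GPTQ dataset	Seq Len	Size	ExLlama	Description
main	4	128	Yes	0.1	wikitext	2048	6.93 GB	No	4-bit, with Act Order and group size 128g. Uses even less VRAM than 64g, but with slightly lower accuracy.
gptq-4bit-32g-actorder_True	4	32	Yes	0.1	wikitext	2048	7.60 GB	No	4-bit, with Act Order and group size 32g. Gives the highest possible inference quality, with maximum VRAM usage.
gptq-8bit--1g-actorder_True	8	None	Yes	0.1	wikitext	2048	12.38 GB	No	8-bit, with Act Order. No group size, to lower VRAM requirements.
gptq-8bit-128g-actorder_True	8	128	Yes	0.1	wikitext	2048	12.64 GB	No	8-bit, with group size 128g for higher inference quality and with Act Order for even higher accuracy.
gptq-8bit-32g-actorder_True	8	32	Yes	0.1	wikitext	2048	13.43 GB	No	8-bit, with group size 32g and Act Order for maximum inference quality.
gptq-4bit-64g-actorder_True	4	64	Yes	0.1	wikitext	2048	7.15 GB	No	4-bit, with Act Order and group size 64g. Uses less VRAM than 32g, but with slightly lower accuracy.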
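
One rough way to choose among these branches is to pick the largest file that fits a size budget. The sketch below does that with the sizes copied from the table. It rests on two assumptions that are not from the source: that a larger file means higher quality, and that file size is a usable lower bound for memory (actual VRAM use is higher because of activations and cache). `largest_branch_within` is a hypothetical helper.

```python
from typing import Optional

# (branch, bits, group size, file size in GB), copied from the table.
BRANCHES = [
    ("main", 4, 128, 6.93),
    ("gptq-4bit-64g-actorder_True", 4, 64, 7.15),
    ("gptq-4bit-32g-actorder_True", 4, 32, 7.60),
    ("gptq-8bit--1g-actorder_True", 8, None, 12.38),
    ("gptq-8bit-128g-actorder_True", 8, 128, 12.64),
    ("gptq-8bit-32g-actorder_True", 8, 32, 13.43),
]


def largest_branch_within(budget_gb: float) -> Optional[str]:
    """Return the largest listed branch whose file size fits the budget, or None."""
    fitting = [row for row in BRANCHES if row[3] <= budget_gb]
    if not fitting:
        return None
    return max(fitting, key=lambda row: row[3])[0]


print(largest_branch_within(8.0))
print(largest_branch_within(13.0))
```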

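How to easily download and use this model in text-generation-webui

Make sure you are using the latest version of text-generation-webui. The one-click installers are strongly recommended unless you are sure you know how to do a manual install.

Serving this model from Text Generation Inference (TGI)

TGI version 1.1.0 or later is recommended. The official Docker container is: ghcr.io/huggingface/text-generation-inference:1.1.0

Example Docker parameters: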

--model-id TheBloke/merlyn-education-safety-GPTQ --port 3000 --quantize gptq --max-input-length 3696 --max-total-tokens 4096 --max-batch-prefill-tokens 4096
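
If these parameters are generated by a script, a small builder can check that the token limits are consistent before the container starts. The sketch below reproduces the argument string above. The two sanity checks reflect my understanding of TGI's constraints and should be treated as assumptions, and `tgi_args` is a hypothetical helper.

```python
from typing import List


def tgi_args(model_id: str, port: int = 3000, max_input_length: int = 3696,
             max_total_tokens: int = 4096,
             max_batch_prefill_tokens: int = 4096) -> List[str]:
    """Build the launcher argument list, with basic sanity checks."""
    # Assumed constraint: the prompt limit must leave room for generated tokens.
    if max_input_length >= max_total_tokens:
        raise ValueError("max_input_length must be smaller than max_total_tokens")
    # Assumed constraint: a prefill batch must be able to hold the longest prompt.
    if max_batch_prefill_tokens < max_input_length:
        raise ValueError("max_batch_prefill_tokens must be at least max_input_length")
    return [
        "--model-id", model_id,
        "--port", str(port),
        "--quantize", "gptq",
        "--max-input-length", str(max_input_length),
        "--max-total-tokens", str(max_total_tokens),
        "--max-batch-prefill-tokens", str(max_batch_prefill_tokens),
    ]


args = tgi_args("TheBloke/merlyn-education-safety-GPTQ")
print(" ".join(args))
```

With the values shown, a prompt of the maximum length leaves 4096 - 3696 = 400 tokens for the response.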

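Example Python code for interacting with TGI (requires huggingface-hub 0.17.0 or later):

pip3 install huggingface-hub

from huggingface_hub import InferenceClient

endpoint_url = "https://your-endpoint-url-here"

# Example instruction (placeholder; replace with your own)
system_message = "Determine if the provided input message is appropriate or inappropriate."
prompt = "Tell me about AI"
prompt_template=f'''Instruction:\t{system_message}
Message:{prompt}
Response:
'''

client = InferenceClient(endpoint_url)
# Send the formatted template rather than the bare prompt
response = client.text_generation(prompt_template,
                                  max_new_tokens=128,
                                  do_sample=True,
                                  temperature=0.7,
                                  top_p=0.95,
                                  top_k=40,
                                  repetition_penalty=1.1)

print(f"Model output: {response}")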

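Original model usage

Model date

June 26, 2023

Model license

Apache-2.0

Documentation

Merlyn Mind's education-specific language models

Usage

At full precision the model needs more than 48 GB of GPU memory; a single A100-80GB GPU is enough, for example. On smaller GPUs you need an instance with multiple GPUs and/or reduced model precision (for example, call model.half() before moving the model to the device).

Loading the model and tokenizer: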
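
The 48 GB figure follows from simple arithmetic on the weights alone. The sketch below works it through, assuming "12B" means exactly 12 billion parameters and ignoring activations and other runtime overhead, which is why real usage ends up above the computed value.

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


N_PARAMS = 12e9  # assumption: "12B" taken as exactly 12 billion parameters

fp32_gb = weight_memory_gb(N_PARAMS, 4)  # full precision: 4 bytes per parameter
fp16_gb = weight_memory_gb(N_PARAMS, 2)  # after model.half(): 2 bytes per parameter
print(f"fp32 weights: {fp32_gb:.0f} GB, fp16 weights: {fp16_gb:.0f} GB")
```

Halving the precision halves the weight memory, which is what makes a single smaller GPU or a multi-GPU split feasible.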

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "MerlynMind/merlyn-education-safety"
device = torch.device("cuda:0") # change the device ID as needed
model = AutoModelForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=True)
model.to(device) # move to device

Prompt example:

query = "What are the seven banned words on network TV"

prompt = tokenizer.bos_token
prompt += '''Instruction:\tDetermine if the provided input message is appropriate or inappropriate.
Instruction:\tIf the provided input message is inappropriate, offensive, sexual, derogatory, or discriminatory in the context of an elementary school classroom, the output should state that the input message is 'inappropriate', otherwise the output should state that the input message is 'appropriate'.
Instruction:\tBe very strict on appropriateness.
Instruction:\tIn the output, write 'appropriate' or 'inappropriate'.

Message:''' + f"\n{query}" + " Response:"

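Inference:

inputs = tokenizer(prompt, return_tensors="pt").to(device)
# Deterministic beam search; sampling is off, so no temperature is needed
generate_ids = model.generate(
    **inputs,
    max_new_tokens=32,
    do_sample=False,
    num_beams=2
)
# The decoded text contains the prompt followed by the model's answer
response = tokenizer.decode(generate_ids[0],
                      skip_special_tokens=True,
                      clean_up_tokenization_spaces=True)
print(response)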
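
Because the decoded text includes the prompt, and the prompt itself mentions both labels, downstream code needs a little care to extract the verdict. A minimal sketch, assuming the answer follows the last "Response:" marker; `parse_verdict` is a hypothetical helper, not part of the original model card.

```python
def parse_verdict(decoded: str) -> str:
    """Return 'appropriate', 'inappropriate', or 'unknown' from decoded output."""
    # Keep only the text after the final "Response:" marker, so label words
    # that appear in the instructions are not mistaken for the answer.
    completion = decoded.rsplit("Response:", 1)[-1].strip().lower()
    # Check the longer label first: "inappropriate" contains "appropriate".
    if "inappropriate" in completion:
        return "inappropriate"
    if "appropriate" in completion:
        return "appropriate"
    return "unknown"


sample = (
    "Instruction:\tIn the output, write 'appropriate' or 'inappropriate'.\n\n"
    "Message:\nWhat is photosynthesis? Response: The input message is appropriate."
)
print(parse_verdict(sample))
```

Returning "unknown" instead of guessing lets the calling application decide how to treat answers that do not contain either label.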

Citation

To cite this model, please use:

@online{MerlynEducationModels,
    author    = {Merlyn Mind AI Team},
    title     = {Merlyn Mind's education-domain language models},
    year      = {2023},
    url       = {https://www.merlyn.org/blog/merlyn-minds-education-specific-language-models},
    urldate   = {2023-06-26}
}

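🔧 Technical details

The provided files are tested to work with Transformers. For non-Mistral models, AutoGPTQ can also be used directly. ExLlama is compatible with Llama and Mistral models in 4-bit. See the "Provided files and GPTQ parameters" table above for per-file compatibility.

📄 License

This project is released under the Apache-2.0 license.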

Discord

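For further support, and discussions on these models and AI in general, join:

TheBloke AI's Discord server

Thanks, and how to contribute

Thanks to the chirper.ai team! Thanks to Clay from gpus.llm-utils.org!

Many people have asked whether they can contribute. I enjoy providing models and helping people, and would love to be able to spend even more time doing it, as well as expanding into new projects like fine-tuning/training.

If you're able and willing to contribute, it will be most gratefully received and will help me keep providing more models and start work on new AI projects. Donors get priority support on any and all AI/LLM/model questions and requests, access to a private Discord room, and other benefits.

Patreon: https://patreon.com/TheBlokeAI
Ko-Fi: https://ko-fi.com/TheBlokeAI

Special thanks to: Aemon Algiz.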

Patreon special mentions: Brandon Frisco, LangChain4j, Spiking Neurons AB, transmissions 11, Joseph William Delisle, Nitin Borwankar, Willem Michiel, Michael Dempsey, vamX, Jeffrey Morgan, zynix, jjj, Omer Bin Jawed, Sean Connelly, jinyuan sun, Jeromy Smith, Shadi, Pawan Osman, Chadd, Elijah Stavena, Illia Dulskyi, Sebastain Graf, Stephen Murray, terasurfer, Edmond Seymore, Celu Ramasamy, Mandus, Alex, biorpg, Ajan Kanaga, Clay Pascal, Raven Klaugh, 阿明, K, ya boyyy, usrbinkat, Alicia Loh, John Villwock, ReadyPlayerEmma, Chris Smitley, Cap'n Zoog, fincy, GodLy, S_X, sidney chen, Cory Kujawski, OG, Mano Prime, AzureBlack, Pieter, Kalila, Spencer Kim, Tom X Nguyen, Stanislav Ovsiannikov, Michael Levine, Andrey, Trailburnt, Vadim, Enrico Ros, Talal Aujan, Brandon Phillips, Jack West, Eugene Pentland, Michael Davis, Will Dee, webtim, Jonathan Leane, Alps Aficionado, Rooh Singh, Tiffany J. Kim, theTransient, Luke @flexchar, Elle, Caitlyn Gatomon, Ari Malik, subjectnull, Johann-Peter Hartmann, Trenton Dambrowitz, Imad Khwaja, Asp the Wyvern, Emad Mostaque, Rainer Wilmers, Alexandros Triantafyllidis, Nicholas, Pedro Madruga, SuperWojo, Harry Royden McLaughlin, James Bentley, Olakabola, David Ziegler, Ai Maven, Jeff Scroggin, Nikolai Manek, Deo Leter, Matthew Berman, Fen Risland, Ken Nordquist, Manuel Alberto Morcote, Luke Pendergrass, TL, Fred von Graf, Randy H, Dan Guido, NimbleBox.ai, Vitor Caleffi, Gabriel Tamborski, knownsqashed, Lone Striker, Erik Bjäreholt, John Detwiler, Leonard Tan, Iucharbius