h2ogpt-gm-oasst1-en-2048-open-llama-7b-preview-700bt: an open-source large language model


Developed by h2oai

A large language model fine-tuned from the pretrained OpenLlama 7B model on the OpenAssistant dataset. It supports English text generation tasks.

Large language model

Transformers

Language: English | License: Apache-2.0 | #English dialogue generation #7B parameters #Instruction-tuned model

Downloads: 58

Released: 5/24/2023

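Model Overview

A text generation model trained with H2O LLM Studio. It is based on the OpenLlama architecture and suited to dialogue and question answering.

Model Features

Based on the OpenLlama architecture

Uses the OpenLlama 7B model, pretrained on 700B tokens, as its base.

Fine-tuned on the OpenAssistant dataset

Fine-tuned on the OpenAssistant conversation dataset to improve dialogue ability.

2048-token context length

Supports a context window of up to 2048 tokens.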
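
Because the prompt and the generated answer share one context window, it can help to check how many new tokens still fit before calling the model. The following is a minimal sketch, assuming the 2048-token figure stated above covers the prompt and the output together; the helper name `max_new_tokens_budget` is hypothetical and not part of any library.

```python
CONTEXT_LENGTH = 2048  # context window stated on this page


def max_new_tokens_budget(prompt_tokens: int, requested: int,
                          context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many new tokens fit after a prompt of the given length.

    Hypothetical helper: assumes prompt and output share one window.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = max(context_length - prompt_tokens, 0)
    return min(requested, remaining)


# A 1500-token prompt leaves room for 548 tokens, not the requested 1024.
print(max_new_tokens_budget(1500, 1024))
```

In practice `prompt_tokens` would be the length of the tokenized prompt, and the result would be passed as `max_new_tokens`.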

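Model Capabilities

Text generation

Dialogue systems

Question answering systems

Use Cases

Dialogue systems

Intelligent assistants

Build conversational assistants that understand and respond to user questions.

Content generation

Text writing

Generate various kinds of text, such as articles and stories.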

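🚀 H2O GPT Model

This model was trained with H2O LLM Studio. It can be used for natural language processing tasks: given an input prompt, it generates related text, which makes it suitable for language-based interaction with users.

🚀 Quick Start

To use the model with the transformers library on a machine with a GPU, first make sure the transformers, accelerate, and torch libraries are installed.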

pip install transformers==4.28.1
pip install accelerate==0.18.0
pip install torch==2.0.0

✨ Key Features

Base model: [openlm-research/open_llama_7b_700bt_preview](https://huggingface.co/openlm-research/open_llama_7b_700bt_preview).
Dataset: trained on [OpenAssistant/oasst1](https://github.com/h2oai/h2o-llmstudio/blob/1935d84d9caafed3ee686ad2733eb02d2abfce57/app_utils/utils.py#LL1896C5-L1896C28).

📦 Installation

On a machine with a GPU, install the required libraries with the following commands:

pip install transformers==4.28.1
pip install accelerate==0.18.0
pip install torch==2.0.0
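
After installing, the environment can be compared against the pinned versions. This is a small sketch using only the standard library's `importlib.metadata`; the function name `check_versions` is hypothetical, and stripping a local build suffix such as `+cu117` from the torch version is an assumption about how your wheel is labeled.

```python
from importlib import metadata

# Versions pinned by the install commands above.
PINNED = {"transformers": "4.28.1", "accelerate": "0.18.0", "torch": "2.0.0"}


def check_versions(pinned=PINNED):
    """Map each package to (installed_version_or_None, matches_pin)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore a local build suffix such as "+cu117" when comparing.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, matches)
    return report


for package, (installed, ok) in check_versions().items():
    status = "ok" if ok else "differs from pin or missing"
    print(f"{package}: installed={installed} ({status})")
```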

💻 Usage Examples

Basic Usage

import torch
from transformers import pipeline

generate_text = pipeline(
    model="h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b-preview-700bt",
    torch_dtype=torch.float16,
    trust_remote_code=True,
    use_fast=False,
    device_map={"": "cuda:0"},
)

res = generate_text(
    "Why is drinking water so healthy?",
    min_new_tokens=2,
    max_new_tokens=1024,
    do_sample=False,
    num_beams=1,
    temperature=float(0.3),
    repetition_penalty=float(1.2),
    renormalize_logits=True
)
print(res[0]["generated_text"])
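
The generation settings above combine `do_sample=False` and `num_beams=1`, which means greedy decoding: at each step the highest-scoring token is chosen. The toy sketch below, in plain Python with made-up scores, illustrates two consequences. Dividing scores by a positive temperature cannot change which token is highest, while a repetition penalty above 1 can. The helper functions are hypothetical illustrations, not the transformers implementation.

```python
def apply_temperature(logits, temperature):
    """Scale every score by 1 / temperature (temperature must be > 0)."""
    return [x / temperature for x in logits]


def apply_repetition_penalty(logits, seen_token_ids, penalty):
    """Make already generated tokens less likely.

    Positive scores are divided by the penalty and negative scores are
    multiplied by it, so both move toward "less likely".
    """
    out = list(logits)
    for i in set(seen_token_ids):
        out[i] = out[i] * penalty if out[i] < 0 else out[i] / penalty
    return out


def greedy_pick(logits):
    """Return the index of the highest score."""
    return max(range(len(logits)), key=logits.__getitem__)


logits = [2.0, 1.9, -1.0]  # made-up scores for a 3-token vocabulary
same = greedy_pick(logits) == greedy_pick(apply_temperature(logits, 0.3))
print("temperature changes greedy choice:", not same)

# Token 0 was generated before, so its score is penalized.
penalized = apply_repetition_penalty(logits, [0], 1.2)
print("choice after repetition penalty:", greedy_pick(penalized))
```

Under these assumptions, `temperature` only matters once sampling is enabled with `do_sample=True`.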

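Advanced Usage

If you prefer not to use trust_remote_code=True, you can download h2oai_pipeline.py, place it in the same directory as your notebook, and construct the pipeline yourself from the loaded model and tokenizer: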

import torch
from h2oai_pipeline import H2OTextGenerationPipeline
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b-preview-700bt",
    use_fast=False,
    padding_side="left"
)
model = AutoModelForCausalLM.from_pretrained(
    "h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b-preview-700bt",
    torch_dtype=torch.float16,
    device_map={"": "cuda:0"}
)
generate_text = H2OTextGenerationPipeline(model=model, tokenizer=tokenizer)

res = generate_text(
    "Why is drinking water so healthy?",
    min_new_tokens=2,
    max_new_tokens=1024,
    do_sample=False,
    num_beams=1,
    temperature=float(0.3),
    repetition_penalty=float(1.2),
    renormalize_logits=True
)
print(res[0]["generated_text"])

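You can also skip the pipeline, call the loaded model and tokenizer directly, and handle the preprocessing steps yourself:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b-preview-700bt"  # either a local folder or a Hugging Face model name
# Important: the prompt must be formatted the same way it was during training.
# Example prompts can be found in the experiment logs.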
prompt = "<|prompt|>How are you?</s><|answer|>"

tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.cuda().eval()
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to("cuda")

# The generation configuration can be modified as needed
tokens = model.generate(
    **inputs,
    min_new_tokens=2,
    max_new_tokens=1024,
    do_sample=False,
    num_beams=1,
    temperature=float(0.3),
    repetition_penalty=float(1.2),
    renormalize_logits=True
)[0]

tokens = tokens[inputs["input_ids"].shape[1]:]
answer = tokenizer.decode(tokens, skip_special_tokens=True)
print(answer)
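
Since the prompt has to match the training format, it can be safer to build it with a helper than to type the special tokens by hand. This sketch wraps the single-turn format used in the example above; `build_prompt` and `PROMPT_TEMPLATE` are hypothetical names, and multi-turn formatting is not covered because this page does not describe it.

```python
# Single-turn format taken from the example prompt above.
PROMPT_TEMPLATE = "<|prompt|>{question}</s><|answer|>"


def build_prompt(question: str) -> str:
    """Wrap a user question in the single-turn prompt format."""
    return PROMPT_TEMPLATE.format(question=question.strip())


print(build_prompt("How are you?"))
```

The returned string would be passed to the tokenizer with `add_special_tokens=False`, as in the example above.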

🔧 Technical Details

Model Architecture

LlamaForCausalLM(
  (model): LlamaModel(
    (embed_tokens): Embedding(32000, 4096, padding_idx=0)
    (layers): ModuleList(
      (0-31): 32 x LlamaDecoderLayer(
        (self_attn): LlamaAttention(
          (q_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (k_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (v_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (o_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (rotary_emb): LlamaRotaryEmbedding()
        )
        (mlp): LlamaMLP(
          (gate_proj): Linear(in_features=4096, out_features=11008, bias=False)
          (down_proj): Linear(in_features=11008, out_features=4096, bias=False)
          (up_proj): Linear(in_features=4096, out_features=11008, bias=False)
          (act_fn): SiLUActivation()
        )
        (input_layernorm): LlamaRMSNorm()
        (post_attention_layernorm): LlamaRMSNorm()
      )
    )
    (norm): LlamaRMSNorm()
  )
  (lm_head): Linear(in_features=4096, out_features=32000, bias=False)
)
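
The layer shapes printed above are enough to estimate the parameter count and the memory the weights need in float16. The sketch below does that arithmetic. It assumes each LlamaRMSNorm holds one weight vector of the hidden size and that the rotary embedding has no trainable parameters; neither detail is visible in the printout.

```python
VOCAB, HIDDEN, INTERMEDIATE, LAYERS = 32000, 4096, 11008, 32

embed = VOCAB * HIDDEN                    # embed_tokens
attention = 4 * HIDDEN * HIDDEN           # q_proj, k_proj, v_proj, o_proj (no bias)
mlp = 3 * HIDDEN * INTERMEDIATE           # gate_proj, up_proj, down_proj (no bias)
layer_norms = 2 * HIDDEN                  # assumption: one weight vector per RMSNorm
per_layer = attention + mlp + layer_norms

final_norm = HIDDEN                       # assumption: same as above
lm_head = HIDDEN * VOCAB                  # output projection (no bias)

total_params = embed + LAYERS * per_layer + final_norm + lm_head
fp16_gib = total_params * 2 / 2**30       # 2 bytes per parameter in float16

print(f"parameters: {total_params:,}")    # 6,738,415,616
print(f"float16 weights: {fp16_gib:.1f} GiB")
```

This covers the weights only. Activations and the key/value cache during generation need additional GPU memory.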

Model Configuration

This model was trained with H2O LLM Studio using the configuration in cfg.yaml. Visit [H2O LLM Studio](https://github.com/h2oai/h2o-llmstudio) to learn how to train your own large language models.

Model Validation

The model was validated with the [EleutherAI lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness):

CUDA_VISIBLE_DEVICES=0 python main.py --model hf-causal-experimental --model_args pretrained=h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b-preview-700bt --tasks openbookqa,arc_easy,winogrande,hellaswag,arc_challenge,piqa,boolq --device cuda &> eval.log
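
When the evaluation is launched from a script, the same command can be assembled as an argument list, which avoids shell quoting mistakes. This is a sketch only: `build_eval_command` and `run_eval` are hypothetical helpers, the flags are copied from the command above, and `run_eval` assumes it is called from a checkout of the harness that contains `main.py`.

```python
import os
import shlex
import subprocess

MODEL_ID = "h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b-preview-700bt"
TASKS = ["openbookqa", "arc_easy", "winogrande", "hellaswag",
         "arc_challenge", "piqa", "boolq"]


def build_eval_command(model_id, tasks, device="cuda"):
    """Assemble the evaluation command shown above as an argument list."""
    return [
        "python", "main.py",
        "--model", "hf-causal-experimental",
        "--model_args", f"pretrained={model_id}",
        "--tasks", ",".join(tasks),
        "--device", device,
    ]


def run_eval(cmd, log_path="eval.log", gpu="0"):
    """Run the command on one GPU, writing stdout and stderr to a log file."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    with open(log_path, "w") as log:
        subprocess.run(cmd, env=env, stdout=log,
                       stderr=subprocess.STDOUT, check=True)


cmd = build_eval_command(MODEL_ID, TASKS)
print(shlex.join(cmd))  # inspect the command; call run_eval(cmd) to execute it
```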

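📄 License

This project is released under the Apache-2.0 license.

⚠️ Important Notice

Biases and offensiveness: Large language models are trained on a wide range of internet text, which may contain biased, racist, offensive, or otherwise inappropriate content. By using this model, you acknowledge and accept that the generated content may sometimes exhibit bias or be offensive or inappropriate. The developers of this repository do not endorse, support, or promote any such content or viewpoints.
Limitations: A large language model is an AI-based tool, not a human. It may produce incorrect, nonsensical, or irrelevant responses. Users are responsible for critically evaluating the generated content and for deciding whether to use it.
Use at your own risk: Users of this large language model must assume full responsibility for any consequences that may arise from using it. The developers and contributors of this repository are not liable for any damage, loss, or harm resulting from the use or misuse of the provided model.
Ethical considerations: Users are encouraged to use the large language model responsibly and ethically. By using this model, you agree not to use it for purposes that promote hate speech, discrimination, harassment, or any form of illegal or harmful activity.
Reporting issues: If you encounter biased, offensive, or otherwise inappropriate content generated by the large language model, please report it to the repository maintainers through the provided channels. Your feedback helps improve the model and mitigate potential issues.
Changes to this disclaimer: The developers of this repository reserve the right to modify or update this disclaimer at any time without prior notice. Users are responsible for reviewing the disclaimer periodically to stay informed of any changes.

By using the large language model provided in this repository, you agree to accept and comply with the terms and conditions set out in this disclaimer. If you do not agree with any part of this disclaimer, you should not use the model or any content it generates.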