Drug_Ollama_v3-2 Open-Source Large Language Model - Free to Deploy, Specialized for Drug-Domain Text Generation

Drug Ollama V3 2

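Developed by Ketak-ZoomRx

A large language model fine-tuned from open_llama_3b with H2O LLM Studio, focused on text generation for drug-related topics.

Large Language Model

Transformers

English #PharmaceuticalQA #LowParameterCount #PrecisionMedicine

Downloads: 99

Released: 11/17/2023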

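Model Overview

A large language model built on the Llama architecture and fine-tuned for the drug domain. It generates drug-related text.

Model Features

Drug-domain optimization

Fine-tuned specifically on drug-related content

Efficient inference

Supports quantized loading (4-bit/8-bit) and sharded inference across multiple GPUs

Controllable generation

Exposes generation parameters such as temperature and repetition_penalty to control the output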

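Model Capabilities

Drug-related text generation

Question answering

Text completion

Use Cases

Healthcare

Drug information Q&A

Answers specialized questions about drug effects, side effects, and similar topics

Medical report generation

Assists in drafting medical report text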

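🚀 Model Card

This model was trained with H2O LLM Studio by fine-tuning a pretrained base model. It handles natural language processing tasks by generating text in response to an input prompt.

🚀 Quick Start

The model works with the transformers library. The sections below cover setup and usage examples.

📦 Installation

To use the model with the transformers library on a machine with GPUs, first make sure transformers, einops, accelerate, and torch are installed. The following commands install the pinned versions: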

pip install transformers==4.29.2
pip install einops==0.6.1
pip install accelerate==0.19.0
pip install torch==2.0.0
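
Since the commands above pin exact versions, it can help to confirm what is actually installed before loading the model. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `find_mismatches` are illustrative and not part of any of the listed packages.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions pinned by the install commands above.
PINNED = {
    "transformers": "4.29.2",
    "einops": "0.6.1",
    "accelerate": "0.19.0",
    "torch": "2.0.0",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def find_mismatches(pinned):
    """Map each package that is missing or off-pin to a (wanted, found) tuple."""
    mismatches = {}
    for package, wanted in pinned.items():
        found = installed_version(package)
        # torch wheels may carry a local suffix such as "+cu117"; ignore it.
        if found is None or found.split("+")[0] != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, found) in find_mismatches(PINNED).items():
        print(f"{package}: expected {wanted}, found {found}")
```

An empty result from `find_mismatches(PINNED)` means the environment matches the pins. Newer versions may also work, but they are not what the examples here were written against.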

💻 Usage Examples

Basic Usage

The following code calls the model through a pipeline to generate text:

import torch
from transformers import pipeline

generate_text = pipeline(
    model="Ketak-ZoomRx/Drug_Ollama_v3-2",
    torch_dtype="auto",
    trust_remote_code=True,
    use_fast=True,
    device_map={"": "cuda:0"},
)

res = generate_text(
    "Why is drinking water so healthy?",
    min_new_tokens=2,
    max_new_tokens=256,
    do_sample=False,
    num_beams=1,
    temperature=float(0.0),
    repetition_penalty=float(1.2),
    renormalize_logits=True
)
print(res[0]["generated_text"])

You can print a sample prompt after the preprocessing step to see how it is fed to the tokenizer:

print(generate_text.preprocess("Why is drinking water so healthy?")["prompt_text"])

The output is:

<|prompt|>Why is drinking water so healthy?</s><|answer|>
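
The output above shows the prompt format the model expects: the question wrapped between `<|prompt|>` and `</s><|answer|>`. When calling the model without the pipeline, the same wrapping has to be applied by hand. A minimal sketch, where `PROMPT_TEMPLATE` and `build_prompt` are illustrative names rather than part of transformers or h2oai_pipeline.py:

```python
# Template taken from the preprocessed prompt printed above.
PROMPT_TEMPLATE = "<|prompt|>{question}</s><|answer|>"


def build_prompt(question: str) -> str:
    """Wrap a raw question in the prompt format shown above."""
    return PROMPT_TEMPLATE.format(question=question)


if __name__ == "__main__":
    print(build_prompt("Why is drinking water so healthy?"))
```

Because the special tokens are already part of the string, the tokenizer should be called with `add_special_tokens=False` when this prompt is encoded manually.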

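Advanced Usage

You can also download h2oai_pipeline.py, place it in the same directory as your notebook, and build the pipeline yourself from the loaded model and tokenizer. If the model and tokenizer are fully supported by the transformers package, you can set trust_remote_code=False: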

import torch
from h2oai_pipeline import H2OTextGenerationPipeline
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "Ketak-ZoomRx/Drug_Ollama_v3-2",
    use_fast=True,
    padding_side="left",
    trust_remote_code=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "Ketak-ZoomRx/Drug_Ollama_v3-2",
    torch_dtype="auto",
    device_map={"": "cuda:0"},
    trust_remote_code=True,
)
generate_text = H2OTextGenerationPipeline(model=model, tokenizer=tokenizer)

res = generate_text(
    "Why is drinking water so healthy?",
    min_new_tokens=2,
    max_new_tokens=256,
    do_sample=False,
    num_beams=1,
    temperature=float(0.0),
    repetition_penalty=float(1.2),
    renormalize_logits=True
)
print(res[0]["generated_text"])

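You can also run generation directly on the loaded model and tokenizer, handling the preprocessing step (prompt formatting) yourself: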

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Ketak-ZoomRx/Drug_Ollama_v3-2"  # either local folder or huggingface model name
# Important: The prompt needs to be in the same format the model was trained with.
# You can find an example prompt in the experiment logs.
prompt = "<|prompt|>How are you?</s><|answer|>"

tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    use_fast=True,
    trust_remote_code=True,
)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map={"": "cuda:0"},
    trust_remote_code=True,
)
model.eval()  # device_map already placed the model on cuda:0
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to("cuda")

# generate configuration can be modified to your needs
tokens = model.generate(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    min_new_tokens=2,
    max_new_tokens=256,
    do_sample=False,
    num_beams=1,
    temperature=float(0.0),
    repetition_penalty=float(1.2),
    renormalize_logits=True
)[0]

tokens = tokens[inputs["input_ids"].shape[1]:]
answer = tokenizer.decode(tokens, skip_special_tokens=True)
print(answer)
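
The examples above all use greedy decoding (`do_sample=False`, `num_beams=1`). As far as the transformers generation API goes, `temperature` is applied only when sampling is enabled, so the `temperature=0.0` passed above should have no effect, and a temperature of 0 is not a valid sampling setting. The sketch below keeps the two modes apart; `generation_kwargs` is a hypothetical helper, and the default sampling temperature of 0.7 is an arbitrary illustration, not a recommendation from the model authors.

```python
def generation_kwargs(sample=False, temperature=0.7,
                      max_new_tokens=256, repetition_penalty=1.2):
    """Build keyword arguments for model.generate() or the pipeline call."""
    kwargs = {
        "min_new_tokens": 2,
        "max_new_tokens": max_new_tokens,
        "num_beams": 1,
        "repetition_penalty": repetition_penalty,
        "renormalize_logits": True,
        "do_sample": sample,
    }
    if sample:
        # Temperature only matters when sampling, and it must be positive.
        if temperature <= 0:
            raise ValueError("temperature must be > 0 when sampling")
        kwargs["temperature"] = temperature
    return kwargs


# Usage with the objects from the example above (not executed here):
# tokens = model.generate(
#     input_ids=inputs["input_ids"],
#     attention_mask=inputs["attention_mask"],
#     **generation_kwargs(sample=True, temperature=0.7),
# )[0]
```

Greedy decoding gives repeatable output, which suits factual drug Q&A. Sampling trades that repeatability for more varied text.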

🔧 Technical Details

Quantization and Sharding

You can load the model in quantized form by passing load_in_8bit=True or load_in_4bit=True. Setting device_map="auto" shards the model across multiple GPUs.
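
The options above are keyword arguments to `from_pretrained`. The following sketch shows one way to assemble them. `build_load_kwargs` and `load_model` are hypothetical helpers, and the sketch assumes the bitsandbytes package is installed, since quantized loading in transformers depends on it. 4-bit loading also appears to need a transformers release newer than the 4.29.2 pinned in the installation section, so verify that before relying on it.

```python
def build_load_kwargs(quantization=None, shard=False):
    """Assemble from_pretrained keyword arguments.

    quantization: None, "8bit", or "4bit".
    shard: if True, let accelerate spread layers over all visible GPUs.
    """
    if quantization not in (None, "8bit", "4bit"):
        raise ValueError(f"unsupported quantization: {quantization!r}")
    kwargs = {
        "torch_dtype": "auto",
        "trust_remote_code": True,
        # Either shard automatically or pin everything to the first GPU.
        "device_map": "auto" if shard else {"": "cuda:0"},
    }
    if quantization == "8bit":
        kwargs["load_in_8bit"] = True
    elif quantization == "4bit":
        kwargs["load_in_4bit"] = True
    return kwargs


def load_model(model_name="Ketak-ZoomRx/Drug_Ollama_v3-2", **options):
    """Load the model with the chosen options (needs a GPU and network access)."""
    # Imported lazily so build_load_kwargs works without transformers installed.
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        model_name, **build_load_kwargs(**options)
    )
```

For example, `load_model(quantization="8bit", shard=True)` would request 8-bit weights spread over the available GPUs. A quantized model should not be moved afterwards with `.cuda()` or `.to(...)`, because the device placement is already handled at load time.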

Model Architecture

The model architecture is as follows:

LlamaForCausalLM(
  (model): LlamaModel(
    (embed_tokens): Embedding(32000, 3200, padding_idx=0)
    (layers): ModuleList(
      (0-25): 26 x LlamaDecoderLayer(
        (self_attn): LlamaAttention(
          (q_proj): Linear(in_features=3200, out_features=3200, bias=False)
          (k_proj): Linear(in_features=3200, out_features=3200, bias=False)
          (v_proj): Linear(in_features=3200, out_features=3200, bias=False)
          (o_proj): Linear(in_features=3200, out_features=3200, bias=False)
          (rotary_emb): LlamaRotaryEmbedding()
        )
        (mlp): LlamaMLP(
          (gate_proj): Linear(in_features=3200, out_features=8640, bias=False)
          (down_proj): Linear(in_features=8640, out_features=3200, bias=False)
          (up_proj): Linear(in_features=3200, out_features=8640, bias=False)
          (act_fn): SiLUActivation()
        )
        (input_layernorm): LlamaRMSNorm()
        (post_attention_layernorm): LlamaRMSNorm()
      )
    )
    (norm): LlamaRMSNorm()
  )
  (lm_head): Linear(in_features=3200, out_features=32000, bias=False)
)
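
The layer shapes printed above are enough to work out the parameter count and a rough weight-memory footprint, which is useful when choosing between full precision and the quantized loading options. The sketch below assumes the standard Llama layout: each LlamaRMSNorm holds one weight vector of the hidden size, the rotary embedding has no trainable parameters, and lm_head is not tied to embed_tokens. The memory figures cover weights only, not activations or the KV cache.

```python
# Dimensions read from the architecture printout above.
VOCAB, HIDDEN, INTERMEDIATE, LAYERS = 32000, 3200, 8640, 26

embed = VOCAB * HIDDEN                # embed_tokens
attention = 4 * HIDDEN * HIDDEN       # q_proj, k_proj, v_proj, o_proj (no bias)
mlp = 3 * HIDDEN * INTERMEDIATE       # gate_proj, up_proj, down_proj (no bias)
norms = 2 * HIDDEN                    # input and post-attention RMSNorm weights
per_layer = attention + mlp + norms
final_norm = HIDDEN
lm_head = HIDDEN * VOCAB

total = embed + LAYERS * per_layer + final_norm + lm_head  # 3,426,473,600


def weight_gib(n_params, bits):
    """Approximate weight memory in GiB at the given bits per parameter."""
    return n_params * bits / 8 / 1024**3


if __name__ == "__main__":
    print(f"parameters: {total:,}")
    for bits in (16, 8, 4):
        print(f"{bits}-bit weights: {weight_gib(total, bits):.2f} GiB")
```

The total of roughly 3.4 billion parameters is consistent with the open_llama_3b base model named at the top of the page.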

Model Configuration

This model was trained with H2O LLM Studio. The full configuration is in cfg.yaml. Visit H2O LLM Studio to learn how to train your own large language models.

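📚 Documentation

Disclaimer

Please read this disclaimer carefully before using the large language model provided in this repository. By using the model, you agree to the following terms and conditions:

Biases and offensiveness: The large language model is trained on a diverse range of internet text data, which may contain biased, racist, offensive, or otherwise inappropriate content. By using this model, you acknowledge and accept that the generated content may sometimes exhibit biases or be offensive or inappropriate. The developers of this repository do not endorse, support, or promote any such content or viewpoints.
Limitations: The large language model is an AI-based tool, not a human. It may produce incorrect, nonsensical, or irrelevant responses. It is the user's responsibility to critically evaluate the generated content and to use it at their own discretion.
Use at your own risk: Users of this large language model assume full responsibility for any consequences that arise from their use of the tool. The developers and contributors of this repository shall not be held liable for any damages, losses, or harm resulting from the use or misuse of the provided model.
Ethical considerations: Users are encouraged to use the large language model responsibly and ethically. By using this model, you agree not to use it for purposes that promote hate speech, discrimination, harassment, or any form of illegal or harmful activity.
Reporting issues: If you encounter biased, offensive, or otherwise inappropriate content generated by the large language model, please report it to the repository maintainers through the provided channels. Your feedback will help improve the model and mitigate potential issues.
Changes to this disclaimer: The developers of this repository reserve the right to modify or update this disclaimer at any time without prior notice. It is the user's responsibility to review the disclaimer periodically to stay informed of any changes.

By using the large language model provided in this repository, you agree to accept and comply with the terms and conditions outlined in this disclaimer. If you do not agree with any part of this disclaimer, you should refrain from using the model and any content generated by it.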