orca_alpaca_3b: an open-source explain-tuned AI model, free to deploy, for understanding and following instructions with inputs


Orca Alpaca 3b

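Developed by pankajmathur

An explain-tuned model trained on Open_LLaMA-3B, using the instructions and inputs from the Alpaca dataset and the dataset construction approach from the Orca research paper.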

Large language model

Transformers

English #explain-tuned instruction tuning #chain-of-thought learning #lightweight LLM

Downloads: 85

Released: 6/16/2023

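Model Overview

This is an explain-tuned language model. By combining instructions from the Alpaca dataset with the approach from the Orca research paper, it learns the reasoning process of a teacher model (ChatGPT).

Model Features

Explain tuning

Follows the approach of the Orca research paper: the model learns the teacher model's reasoning process, not just its final outputs.

Multiple system instructions

The custom dataset was generated with 15 system instructions, which broadens the range of prompts the model can adapt to.

Efficient training

Training finished in about 20 hours on 4x A600 (50G) GPUs, at a cost of $66.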

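Model Capabilities

Instruction following

Step-by-step reasoning

Text generation

Task answering

Use Cases

Education

Solving math problems

Explains step by step how to solve a math problem

Provides detailed solution steps and the reasoning behind them

Research assistance

Explaining data analysis

Explains data analysis results and statistical methods

Lays out the analysis process and how the conclusions are derived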

🚀 Orca_alpaca_3b

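This is a model based on Open_LLaMA-3B and trained on an explain-tuning dataset. The dataset uses the instructions and inputs from the Alpaca dataset and applies the dataset construction approach from the Orca research paper.

🚀 Quick Start

The Orca_alpaca_3b model was trained on a purpose-built explain-tuning dataset. The sections below cover dataset construction, training, and a usage example.

✨ Key Features

Based on Open_LLaMA-3B and trained on an explain-tuning dataset.
The dataset is built with the approach from the Orca research paper, which helps the model learn the "thought" process.
Trained in parallel across GPUs using DeepSpeed with ZeRO-3.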


💻 Usage Example

Basic usage

The following example shows how to use alpaca_orca_open_llama_3b:

import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

# change model_path between 3b,7b or 13b
model_path = 'psmathur/alpaca_orca_open_llama_3b'
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map='auto',
)


# Build an Orca-style prompt, sample a completion, and print it
def generate_text(system, instruction, input=None):

    # The "### Input:" section is only included when an input is given
    if input:
        prompt = f"### System:\n{system}\n\n#\n\n### User:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
    else:
        prompt = f"### System:\n{system}\n\n#\n\n### User:\n{instruction}\n\n### Response:\n"

    # Tokenize the prompt and move it to the GPU as a batch of size 1
    tokens = tokenizer.encode(prompt)
    tokens = torch.LongTensor(tokens).unsqueeze(0)
    tokens = tokens.to('cuda')

    # Sampling settings; generate_len is the maximum number of new tokens
    instance = {'top_p': 1.0, 'temperature': 0.7, 'generate_len': 1024}

    length = len(tokens[0])
    with torch.no_grad():
        rest = model.generate(
            input_ids=tokens,
            max_length=length + instance['generate_len'],
            use_cache=True,
            do_sample=True,
            top_p=instance['top_p'],
            temperature=instance['temperature']
        )
    # Drop the prompt tokens and decode only the newly generated ones
    output = rest[0][length:]
    string = tokenizer.decode(output, skip_special_tokens=True)
    print(f'[!] Response: {string}')

# same prompt as provided by Orca Research Paper
system = 'You are an AI assistant. User will you give you a task. Your goal is to complete the task as faithfully as you can. While performing the task think step-by-step and justify your steps.'
instruction = 'Use the given data to calculate the median.'
input = '[5,2,3,4,1]'
generate_text(system, instruction, input)
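
To inspect the prompt layout without loading the model, the template can be pulled out into a small pure function. This is only a sketch that mirrors the two f-strings in generate_text above; the names build_prompt and input_text are hypothetical and not part of the original example, and the system message is shortened for illustration.

```python
def build_prompt(system, instruction, input_text=None):
    """Return the prompt string in the same layout as generate_text above."""
    if input_text:
        # Variant with an "### Input:" section between User and Response
        return (
            f"### System:\n{system}\n\n#\n\n### User:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n### Response:\n"
        )
    # Variant without an input section
    return f"### System:\n{system}\n\n#\n\n### User:\n{instruction}\n\n### Response:\n"


prompt = build_prompt(
    "You are an AI assistant.",  # shortened system message, illustration only
    "Use the given data to calculate the median.",
    "[5,2,3,4,1]",
)
print(prompt)
```

Printing the prompt this way makes it easy to check that the section headers and blank lines match what the model saw during training before spending GPU time on generation.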

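📚 Detailed Documentation

Dataset

We built an explain-tuned version of the Alpaca dataset (~52K examples) using the approach from the Orca research paper. Unlike the plain instruction tuning used with the original dataset, we used the 15 system instructions provided in the Orca research paper to generate a custom dataset. This helps the student model (this model) learn the "thought" process from the teacher model (ChatGPT, version gpt-3.5-turbo-0301). The usage example above shows how the system prompt is added before each instruction.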
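
The dataset description above can be sketched roughly as follows: each Alpaca record is paired with one system message, and the resulting query would then be sent to the teacher model to collect an explained answer. This is a minimal sketch under stated assumptions, not the author's pipeline. The function name attach_system_messages, the uniform random choice of system message, the placeholder second message, and the second sample record are all assumptions for illustration; only the first system message is quoted in this document.

```python
import random

# Only the first message appears in this document. The second is a made-up
# placeholder standing in for the other Orca system instructions.
SYSTEM_MESSAGES = [
    "You are an AI assistant. User will you give you a task. Your goal is to "
    "complete the task as faithfully as you can. While performing the task "
    "think step-by-step and justify your steps.",
    "PLACEHOLDER: another system instruction from the Orca paper.",
]


def attach_system_messages(records, system_messages, seed=0):
    """Pair each Alpaca-style record with one system message.

    Assumption: the message is chosen uniformly at random. The real
    selection rule is not described in this document.
    """
    rng = random.Random(seed)
    queries = []
    for rec in records:
        queries.append({
            "system": rng.choice(system_messages),
            "instruction": rec["instruction"],
            "input": rec.get("input", ""),
        })
    return queries


# Illustrative records in the Alpaca instruction/input layout
records = [
    {"instruction": "Use the given data to calculate the median.", "input": "[5,2,3,4,1]"},
    {"instruction": "Name three primary colors.", "input": ""},
]
queries = attach_system_messages(records, SYSTEM_MESSAGES)
for q in queries:
    print(q["system"][:40], "|", q["instruction"])
```

In a full pipeline, the teacher model's response to each query would become the training target for the student model.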

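Training

The training setup is summarized in the table below:

Property	Details
Training hardware	4x A600 (50G) GPUs
Training time	About 20 hours
Training cost	$66 (using Lambda Labs)
Parallel training method	DeepSpeed with ZeRO-3, driven by a custom fine-tuning script that borrows some model training code from the OpenAlpaca repository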
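
As a rough sanity check on the figures in the table above, the reported cost can be converted into a per-GPU-hour rate. The calculation below only uses the numbers stated in the table; since the training time is given as "about 20 hours", the result is an approximation, not a quoted price.

```python
num_gpus = 4          # GPUs listed in the table
hours = 20            # approximate training time in hours
total_cost_usd = 66   # reported total cost

gpu_hours = num_gpus * hours
cost_per_gpu_hour = total_cost_usd / gpu_hours
print(f"{gpu_hours} GPU-hours, about ${cost_per_gpu_hour:.3f} per GPU-hour")
```

This works out to 80 GPU-hours, or a little over 80 cents per GPU-hour.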

Some of the parameters used during training:

Parameter	Value
batch_size	16
train_micro_batch_size_per_gpu	2
gradient_accumulation_steps	2
Learning rate	2e-5
Max length	1024
Epochs	3
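
The batch-size values above are consistent with the relation DeepSpeed expects: the global batch size equals the micro batch size per GPU times the gradient accumulation steps times the number of GPUs. The fragment below is an illustrative sketch of how these values might appear in a DeepSpeed config. It is not the author's actual configuration file, and it covers only the batch-size and ZeRO settings named in this document.

```python
import json

num_gpus = 4  # from the training table: 4 GPUs

# Illustrative DeepSpeed config fragment (assumed, not the author's file)
ds_config = {
    "train_batch_size": 16,
    "train_micro_batch_size_per_gpu": 2,
    "gradient_accumulation_steps": 2,
    "zero_optimization": {"stage": 3},  # ZeRO-3
}

# Global batch = micro batch per GPU * accumulation steps * number of GPUs
effective = (
    ds_config["train_micro_batch_size_per_gpu"]
    * ds_config["gradient_accumulation_steps"]
    * num_gpus
)
assert effective == ds_config["train_batch_size"]
print(json.dumps(ds_config, indent=2))
```

With 2 samples per GPU, 2 accumulation steps, and 4 GPUs, each optimizer step sees 16 samples, which matches the batch_size row in the table.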

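Future Goals

Try more data, such as Dolly V2 and WizardLM (suggestions are welcome).
Try larger OpenLLaMA models, such as 7B and 13B.
Try training on better GPUs. 8x A100 (40GB) instances could not be obtained at the time, probably because of high demand.
Provide more options for a text generation UI (possibly using https://github.com/oobabooga/text-generation-webui).
Provide 4-bit GGML/GPTQ quantized models (may need help from TheBloke).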

Citation

If you find alpaca_orca_open_llama_3b useful in your research or applications, please cite it using the following BibTeX:

@misc{alpaca_orca_open_llama_3b,
  author = {Pankaj Mathur},
  title = {alpaca_orca_open_llama_3b: A custom explain tuned Alpaca Model Based On OpenLLaMA},
  year = {2023},
  publisher = {GitHub, HuggingFace},
  journal = {GitHub repository, HuggingFace repository},
  howpublished = {\url{https://github.com/pankajarm/alpaca_orca_open_llama_3b}, \url{https://huggingface.co/psmathur/alpaca_orca_open_llama_3b}},
}

@software{openlm2023openllama,
  author = {Xinyang Geng and Hao Liu},
  title = {OpenLLaMA: An Open Reproduction of LLaMA},
  month = May,
  year = 2023,
  url = {https://github.com/openlm-research/open_llama}
}

@misc{openalpaca,
  author = {Yixuan Su and Tian Lan and Deng Cai},
  title = {OpenAlpaca: A Fully Open-Source Instruction-Following Model Based On OpenLLaMA},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/yxuansu/OpenAlpaca}},
}

@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto },
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}