🚀 orca_mini_3b
orca_mini_3b is an OpenLLaMA-3B model trained on explain-tuned datasets. It is built from the instructions and inputs of the WizardLM, Alpaca, and Dolly-V2 datasets, processed with the dataset construction approach described in the Orca research paper.
🚀 Quick Start
You can use orca-mini-3b for free on Google Colab with a T4 GPU. Click the button below to open it in Colab:
✨ Key Features
- Based on the OpenLLaMA-3B model, trained on explain-tuned datasets.
- Uses the instructions and inputs from the WizardLM, Alpaca, and Dolly-V2 datasets, combined with the dataset construction approach from the Orca research paper.
- Helps the student model (i.e. this model) learn the thought process of the teacher model (ChatGPT).
📦 Installation
The original documentation does not list explicit installation steps.
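As a hedged, minimal setup (an assumption, not part of the original card), the usage example below only needs PyTorch, 🤗 Transformers, SentencePiece (required by `LlamaTokenizer`), and Accelerate (for `device_map='auto'`), e.g. `pip install torch transformers sentencepiece accelerate`.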
💻 Usage Examples
Basic Usage
```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

# Hugging Face model_path
model_path = 'psmathur/orca_mini_3b'
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map='auto',
)


# generate text function
def generate_text(system, instruction, input=None):
    if input:
        prompt = f"### System:\n{system}\n\n### User:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
    else:
        prompt = f"### System:\n{system}\n\n### User:\n{instruction}\n\n### Response:\n"

    tokens = tokenizer.encode(prompt)
    tokens = torch.LongTensor(tokens).unsqueeze(0)
    tokens = tokens.to('cuda')

    instance = {'input_ids': tokens, 'top_p': 1.0, 'temperature': 0.7, 'generate_len': 1024, 'top_k': 50}

    length = len(tokens[0])
    with torch.no_grad():
        rest = model.generate(
            input_ids=tokens,
            max_length=length + instance['generate_len'],
            use_cache=True,
            do_sample=True,
            top_p=instance['top_p'],
            temperature=instance['temperature'],
            top_k=instance['top_k'],
        )
    output = rest[0][length:]
    string = tokenizer.decode(output, skip_special_tokens=True)
    return f'[!] Response: {string}'


# Sample Test Instruction Used by Youtuber Sam Witteveen https://www.youtube.com/@samwitteveenai
system = 'You are an AI assistant that follows instruction extremely well. Help as much as you can.'
instruction = 'Write a letter to Sam Altman, CEO of OpenAI, requesting him to convert GPT4 a private model by OpenAI to an open source project'
print(generate_text(system, instruction))
```
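The `generate_text` helper also accepts an optional `input` argument, which switches to the System / User / Input / Response prompt layout shown above. A small, hypothetical usage sketch (the instruction and context strings below are invented for illustration):

```python
# Hypothetical example: pass extra context via the optional `input` argument.
system = 'You are an AI assistant that follows instruction extremely well. Help as much as you can.'
instruction = 'Summarize the following paragraph in one sentence.'
context = ('Orca-style explain tuning trains a small student model on responses '
           'that include the teacher model\'s step-by-step reasoning.')
print(generate_text(system, instruction, input=context))
```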
📚 Documentation
Dataset
We built explain-tuned versions of the WizardLM (~70K), Alpaca (~52K), and Dolly-V2 (~15K) datasets using the approach from the Orca research paper.
Unlike the vanilla instruction tuning used by the original datasets, we leveraged all 15 system instructions provided in the Orca research paper to generate the custom datasets.
This helps the student model (i.e. this model) learn the thought process of the teacher model (ChatGPT, version gpt-3.5-turbo-0301).
See the usage examples above for how a system prompt is added before each instruction.
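To make that concrete, here is a small sketch of the prompt template used throughout this card. The system message shown is an illustrative Orca-style instruction; the full set of 15 system instructions is listed in the Orca research paper, and the exact strings used during training are not reproduced here.

```python
# Prompt template (System / User / optional Input / Response) with an
# illustrative Orca-style system instruction prepended to the user instruction.
system = ("You are an AI assistant. Provide a detailed answer so user don't need "
          "to search outside to understand the answer.")
instruction = 'Explain why the sky is blue.'
prompt = f"### System:\n{system}\n\n### User:\n{instruction}\n\n### Response:\n"
print(prompt)
```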
Training
The training configuration is described below.
Training was done on 8x A100 (80G) GPUs and took around 4 hours, costing $48 using Lambda Labs.
We used DeepSpeed with fully sharded data parallelism (i.e. ZeRO stage 3), writing our own fine-tuning script and leveraging some of the model training code provided by the excellent OpenAlpaca repo.
Here are some of the parameters used during training:

| Parameter | Value |
|---|---|
| batch_size | 64 |
| train_micro_batch_size_per_gpu | 4 |
| gradient_accumulation_steps | 2 |
| Learning rate | 2e-5 |
| Max length | 1024 |
| Epochs | 3 |
| Optimizer | AdamW |
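The training script itself is not published with this card, but the table maps directly onto a DeepSpeed configuration. The snippet below is a hedged sketch of what such a ZeRO stage 3 config could look like (the actual config used is an assumption, not part of the original card); note that 4 micro-batches x 2 accumulation steps x 8 GPUs gives the global batch size of 64.

```python
# Hypothetical DeepSpeed ZeRO stage 3 config reconstructed from the table above.
# The real training config was not released; treat this as an illustrative sketch.
ds_config = {
    "train_batch_size": 64,                     # 4 micro-batch x 2 grad-accum x 8 GPUs
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 2,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 3},          # fully sharded data parallelism
    "optimizer": {
        "type": "AdamW",
        "params": {"lr": 2e-5},
    },
}
```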
Sample Output
[!] Response:
Dear Sam Altman,
I am writing to request that you convert the GPT4 private model developed by OpenAI to an open source project. As a user of OpenAI, I have been waiting for the day when I can use the advanced natural language processing capabilities of GPT4 in a more open and accessible way.
While OpenAI has made significant progress in developing AI applications, it has primarily focused on building private models that are not accessible to the general public. However, with the recent release of GPT-3, there is a growing demand for more open and accessible AI tools.
Converting GPT4 to an open source project would allow for greater transparency, collaboration, and innovation. It would also help to build trust in the technology and ensure that it is used ethically and responsibly.
I urge you to consider converting GPT4 to an open source project. This would be a significant contribution to the AI community and would help to create a more open and accessible future.
Thank you for your consideration.
Sincerely,
[Your Name]
Next Goals
- Try more data, such as actually using FLAN-v2, as in the Orca research paper (suggestions are welcome).
- Provide more options for text-generation UIs (perhaps via https://github.com/oobabooga/text-generation-webui).
- Provide 4-bit GGML/GPTQ quantized models (perhaps TheBloke can help there); see the hedged quantized-loading sketch after this list.
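GGML/GPTQ builds are a future goal rather than something shipped with this card, so the sketch below instead shows a related option that works with the stock Hugging Face weights: loading the model in 4-bit with bitsandbytes through Transformers. Everything except the model path is standard Transformers API; whether the quality/latency trade-off suits your use case is something to verify yourself.

```python
# Hedged sketch: 4-bit inference with bitsandbytes via Transformers.
# This is NOT a GGML/GPTQ build; it quantizes the published fp16 weights at load time.
# Requires the bitsandbytes and accelerate packages in addition to transformers.
import torch
from transformers import BitsAndBytesConfig, LlamaForCausalLM, LlamaTokenizer

model_path = 'psmathur/orca_mini_3b'
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16 while weights stay 4-bit
)
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path,
    quantization_config=bnb_config,
    device_map='auto',
)
```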
Limitations and Bias
This model may produce factually incorrect output and should not be relied on to produce factually accurate information. It was trained on various public datasets. While great efforts have been taken to clean the pretraining data, it is possible that this model could generate lewd, biased, or otherwise offensive outputs.
Disclaimer
The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes.
Citation
If you find wizardlm_alpaca_dolly_orca_open_llama_3b useful in your research or applications, please cite it using the following BibTeX:
@misc{orca_mini_3b,
author = {Pankaj Mathur},
title = {wizardlm_alpaca_dolly_orca_open_llama_3b: An explain tuned OpenLLaMA-3b model on custom wizardlm, alpaca, & dolly datasets},
year = {2023},
publisher = {GitHub, HuggingFace},
journal = {GitHub repository, HuggingFace repository},
howpublished = {\url{https://github.com/pankajarm/wizardlm_alpaca_dolly_orca_open_llama_3b}, \url{https://huggingface.co/psmathur/wizardlm_alpaca_dolly_orca_open_llama_3b}},
}
@misc{mukherjee2023orca,
title={Orca: Progressive Learning from Complex Explanation Traces of GPT-4},
author={Subhabrata Mukherjee and Arindam Mitra and Ganesh Jawahar and Sahaj Agarwal and Hamid Palangi and Ahmed Awadallah},
year={2023},
eprint={2306.02707},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
@software{openlm2023openllama,
author = {Xinyang Geng and Hao Liu},
title = {OpenLLaMA: An Open Reproduction of LLaMA},
month = May,
year = 2023,
url = {https://github.com/openlm-research/open_llama}
}
@misc{openalpaca,
author = {Yixuan Su and Tian Lan and Deng Cai},
title = {OpenAlpaca: A Fully Open-Source Instruction-Following Model Based On OpenLLaMA},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/yxuansu/OpenAlpaca}},
}
@misc{alpaca,
author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
title = {Stanford Alpaca: An Instruction-following LLaMA model},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}
Open LLM Leaderboard Evaluation Results
Detailed results can be found here.
| Metric | Value |
|---|---|
| Average | 35.5 |
| ARC (25-shot) | 41.55 |
| HellaSwag (10-shot) | 61.52 |
| MMLU (5-shot) | 26.79 |
| TruthfulQA (0-shot) | 42.42 |
| Winogrande (5-shot) | 61.8 |
| GSM8K (5-shot) | 0.08 |
| DROP (3-shot) | 14.33 |
Open LLM Leaderboard Evaluation Results
Detailed results can be found here.

| Metric | Value |
|---|---|
| Average | 39.03 |
| AI2 Reasoning Challenge (25-Shot) | 41.55 |
| HellaSwag (10-Shot) | 61.52 |
| MMLU (5-Shot) | 26.79 |
| TruthfulQA (0-shot) | 42.42 |
| Winogrande (5-shot) | 61.80 |
| GSM8k (5-shot) | 0.08 |
📄 License
This model is released under the CC BY-NC-SA 4.0 license.



