LaMini-GPT-1.5B Open-Source Large Language Model - Free to Deploy for Instruction-Following Tasks

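LaMini-GPT-1.5B

Developed by MBZUAI

LaMini-GPT-1.5B is a large language model fine-tuned from the GPT-2 XL (gpt2-xl) architecture. It belongs to the LaMini-LM series and focuses on instruction-following tasks.

Large Language Model

Transformers

English #instruction-tuned-model #natural-language-generation #diverse-distillation

Downloads: 365

Released: 4/16/2023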

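Model Overview

This model is a version of GPT-2 XL (gpt2-xl) fine-tuned on the LaMini-instruction dataset, which contains 2.58 million instructions. It is designed to respond to natural-language instructions.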

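Model Features

Instruction-tuning optimization

Fine-tuned on 2.58 million diverse instruction samples to improve instruction understanding and execution.

Efficient inference

At 1.5B parameters, the model offers relatively efficient inference while maintaining good performance.

Support for diverse tasks

Handles a range of natural-language tasks, including question answering, advice generation, and content creation.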

Model Capabilities

Natural language understanding

Instruction following

Text generation

Question answering

Content creation

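Use Cases

Intelligent assistants

Health advice generation

Provides personalized suggestions based on a user's health needs.

Can generate structured recommendations for a healthy lifestyle.

Educational applications

Study guidance

Answers students' questions and suggests learning resources.

Can generate educational content and suggested learning paths.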

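🚀 LaMini-GPT-1.5B

LaMini-GPT-1.5B is a member of the LaMini-LM model series. It is distilled from large-scale instructions, responds to natural-language instructions, and performs well on a range of downstream NLP tasks.

This model is part of the LaMini-LM model series from the paper "LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions". It is a version of gpt2-xl fine-tuned on the LaMini-instruction dataset, which contains 2.58 million samples for instruction fine-tuning. For more information about our dataset, please refer to the project repository.

You can view the other models in the LaMini-LM series below. Models marked with ✩ have the best overall performance for their size/architecture, so we recommend using them. More details can be found in our paper.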

Base model	LaMini-LM series (number of parameters)
T5	LaMini-T5-61M, LaMini-T5-223M, LaMini-T5-738M
Flan-T5	LaMini-Flan-T5-77M✩, LaMini-Flan-T5-248M✩, LaMini-Flan-T5-783M✩
Cerebras-GPT	LaMini-Cerebras-111M, LaMini-Cerebras-256M, LaMini-Cerebras-590M, LaMini-Cerebras-1.3B
GPT-2	LaMini-GPT-124M✩, LaMini-GPT-774M✩, LaMini-GPT-1.5B✩
GPT-Neo	LaMini-Neo-125M, LaMini-Neo-1.3B
GPT-J	Coming soon
LLaMA	Coming soon
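
As an illustration of how the ✩ recommendations in the table might be used, the sketch below picks the largest recommended model that fits a given parameter budget. The model names are copied from the table; the helper names `parse_params` and `pick_recommended` are hypothetical and not part of any LaMini-LM tooling, and the parameter counts are read from the name suffixes rather than measured.

```python
import re
from typing import Optional

# Models marked with ✩ in the table above (names copied from the table).
RECOMMENDED = [
    "LaMini-Flan-T5-77M",
    "LaMini-Flan-T5-248M",
    "LaMini-Flan-T5-783M",
    "LaMini-GPT-124M",
    "LaMini-GPT-774M",
    "LaMini-GPT-1.5B",
]

_UNITS = {"M": 1e6, "B": 1e9}


def parse_params(name: str) -> float:
    """Read the nominal parameter count from a suffix such as '-248M' or '-1.5B'."""
    match = re.search(r"-(\d+(?:\.\d+)?)([MB])$", name)
    if match is None:
        raise ValueError(f"no size suffix in model name: {name!r}")
    return float(match.group(1)) * _UNITS[match.group(2)]


def pick_recommended(max_params: float) -> Optional[str]:
    """Return the largest ✩ model whose nominal size fits the budget, or None."""
    fitting = [name for name in RECOMMENDED if parse_params(name) <= max_params]
    return max(fitting, key=parse_params) if fitting else None


print(pick_recommended(2e9))  # budget of two billion parameters
```

This only ranks models by nominal size; whether an encoder-decoder (Flan-T5) or decoder-only (GPT-2) model suits a task better is a separate question that the paper's evaluation addresses.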

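🚀 Quick Start

Intended Use

We recommend using this model to respond to human instructions written in natural language. Because this decoder-only model was fine-tuned with wrapper text, we recommend using the same wrapper text to get the best performance. Please refer to the code below.

We now show you how to load and use our model with HuggingFace's pipeline().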

# pip install -q transformers
from transformers import pipeline

# Hugging Face Hub ID of this model
checkpoint = "MBZUAI/LaMini-GPT-1.5B"

# Build a text-generation pipeline around the checkpoint
model = pipeline('text-generation', model=checkpoint)

instruction = 'Please let me know your thoughts on the given place and why you think it deserves to be visited: \n"Barcelona, Spain"'

# Wrap the instruction in the same text used during fine-tuning
input_prompt = f"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:"

# The returned text contains the prompt followed by the generated response
generated_text = model(input_prompt, max_length=512, do_sample=True)[0]['generated_text']

print("Response:", generated_text)
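
The wrapper text used in the example above can be factored into a small helper so that every instruction is wrapped the same way. The following is a minimal sketch, not part of the LaMini-LM code: `build_prompt` and `extract_response` are hypothetical names, and the second helper assumes the pipeline returns the prompt followed by the continuation, which is the default behaviour of the text-generation pipeline.

```python
# Wrapper text copied from the usage example above.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)

RESPONSE_MARKER = "### Response:"


def build_prompt(instruction: str) -> str:
    """Wrap a plain instruction in the fine-tuning wrapper text."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


def extract_response(generated_text: str, prompt: str) -> str:
    """Return only the model's answer from text that still contains the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    # Fallback: keep whatever follows the first response marker, if present.
    _, found, tail = generated_text.partition(RESPONSE_MARKER)
    return tail.strip() if found else generated_text.strip()


prompt = build_prompt("Name one reason to visit Barcelona, Spain.")
print(prompt)
```

With these helpers, the string passed to the pipeline would be `build_prompt(instruction)`, and `extract_response(generated_text, prompt)` would drop the echoed prompt before the answer is shown to a user.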

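📚 Detailed Documentation

Training Procedure

We initialize with gpt2-xl and fine-tune it on our LaMini-instruction dataset. Its total number of parameters is 1.5 billion.

Training Hyperparameters

The documentation does not yet provide the specific training hyperparameters.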
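
As a rough sanity check on the 1.5 billion figure, the parameter count of a GPT-2 style decoder can be worked out from its shape. The sketch below assumes the publicly documented gpt2-xl configuration (48 layers, hidden size 1600, vocabulary size 50257, 1024 positions) and tied input/output embeddings; these values are not stated in this card, and `gpt2_param_count` is a hypothetical helper name.

```python
def gpt2_param_count(n_layer: int, d_model: int, vocab_size: int, n_positions: int) -> int:
    """Count weights and biases of a GPT-2 style decoder with tied embeddings."""
    # Token embeddings plus learned position embeddings.
    embeddings = vocab_size * d_model + n_positions * d_model
    # Each LayerNorm has a scale and a bias vector.
    layer_norm = 2 * d_model
    # Fused QKV projection plus the attention output projection.
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # Feed-forward block with a 4x hidden expansion.
    mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    per_layer = 2 * layer_norm + attention + mlp
    # The final LayerNorm follows the last block.
    return embeddings + n_layer * per_layer + layer_norm


# Assumed gpt2-xl shape (taken from the public configuration, not from this card).
total = gpt2_param_count(n_layer=48, d_model=1600, vocab_size=50257, n_positions=1024)
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
```

Under these assumptions the count comes to about 1.56 billion, consistent with the "1.5B" in the model name; at two bytes per parameter (16-bit weights) that corresponds to roughly 3 GB for the weights alone.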

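Evaluation

We conducted two sets of evaluations: automatic evaluation on downstream NLP tasks and human evaluation on user-oriented instructions. For more details, please refer to our paper.

Limitations

More information needed.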

📄 License

This model is licensed under CC BY-NC 4.0.

📖 Citation

@article{lamini-lm,
  author       = {Minghao Wu and
                  Abdul Waheed and
                  Chiyu Zhang and
                  Muhammad Abdul-Mageed and
                  Alham Fikri Aji
                  },
  title        = {LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions},
  journal      = {CoRR},
  volume       = {abs/2304.14402},
  year         = {2023},
  url          = {https://arxiv.org/abs/2304.14402},
  eprinttype   = {arXiv},
  eprint       = {2304.14402}
}