🚀 llama-3-cat-8b-instruct-v1
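llama-3-cat-8b-instruct-v1 is a fine-tune of Llama 3 8B that focuses on system prompt fidelity, helpfulness, and character immersion. The model aims to follow the system prompt as closely as possible, to provide useful information in any situation, and to offer maximum character immersion (role play) in a given scene.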
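✨ Key Features
- System instruction fidelity: respects the system prompt closely and generates replies that follow its requirements.
- Chain of thought (COT): reasons coherently and works through problems step by step.
- Character immersion: provides a highly immersive character experience in role-play scenarios.
- Usefulness for biosciences and general science: provides valuable information and help in both areas.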
💻 Usage Example
Basic usage
<|begin_of_text|>Below is a conversation between an user and an AI assistant. The ai assistant gives response regarding user's questions. USER: user: Given the task definition and input, reply with output. In this task, you are given a question and an answer, you would be asked to create the sentence based on the Question-Answer provided. It should be contained within the Question-Answer provided.
Question: What type of diseases in humans are caused by protozoa? Answer: protist diseases
ASSISTANT: AI: Task Explanation:
In this task, you are asked to create a sentence using the given question and the provided answer. The sentence you create should include the information from both the question and the answer in a concise and coherent manner.
Guidelines:
1. Read the given question and answer carefully.
2. Combine the information from the question and answer to create a new sentence.
3. Make sure your new sentence contains both the question's information and the answer provided.
4. Check your sentence for proper grammar and punctuation.
Steps to find the answer:
1. Read the question: "What type of diseases in humans are caused by protozoa?"
2. Read the answer: "protist diseases"
3. Combine the information from the question and answer to create a new sentence: "In humans, protist diseases are caused by protozoa."
4. Double-check the sentence for proper grammar and punctuation, and make necessary adjustments if needed.
Final Output:
In humans, protist diseases are caused by protozoa.<|end_of_text|>
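The example above suggests a simple text template: a begin-of-text token, a system sentence, a `USER: user:` turn, and an `ASSISTANT: AI:` turn closed by an end-of-text token. The following is a minimal sketch of building such a prompt, assuming the template inferred from that single example is representative. The names `build_prompt`, `extract_reply`, and `add_bos` are hypothetical helpers, not part of any published API for this model.

```python
# Sketch only: the template is inferred from the single example above,
# not from an official specification of the prompt format.
BOS = "<|begin_of_text|>"
EOS = "<|end_of_text|>"

# Copied verbatim from the example, including its original wording.
DEFAULT_SYSTEM = (
    "Below is a conversation between an user and an AI assistant. "
    "The ai assistant gives response regarding user's questions."
)


def build_prompt(user_message: str,
                 system: str = DEFAULT_SYSTEM,
                 add_bos: bool = True) -> str:
    """Return a prompt that stops where the model should start writing."""
    prefix = BOS if add_bos else ""
    return f"{prefix}{system} USER: user: {user_message}\nASSISTANT: AI:"


def extract_reply(completion: str) -> str:
    """Cut the generated text at the first end-of-text token, if present."""
    return completion.split(EOS, 1)[0].strip()


if __name__ == "__main__":
    print(build_prompt("What type of diseases in humans are caused by protozoa?"))
```

If the tokenizer in use already prepends the begin-of-text token, passing `add_bos=False` avoids a duplicate.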
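📚 Detailed Documentation
People involved
- Dataset builder: Dr. Kal'tsit (Kat)
- Trainer/Funder: SteelSkull
- Facilitator: Potatooff
Dataset preparation
Datasets containing instruction-response pairs were systematically pulled from Huggingface. A GPT model was trained exclusively on high-quality, helpful responses to serve as a reference model. The data was then filtered for response length and for chain-of-thought responses. Health-related data was also pulled from Chat Doctor, favoring detailed, step-by-step diagnoses.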
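The length and chain-of-thought filtering described above could look roughly like the sketch below. It substitutes simple keyword heuristics for the trained GPT reference model mentioned in the source, and every threshold, marker, and field name (`MIN_RESPONSE_CHARS`, `STEP_MARKERS`, `"response"`) is an assumption made for illustration, not the authors' actual pipeline.

```python
from typing import Iterable

# Hypothetical values: the source does not state the real thresholds or markers.
MIN_RESPONSE_CHARS = 200
STEP_MARKERS = ("step 1", "first,", "1.", "steps to")


def looks_step_by_step(response: str) -> bool:
    """Crude check for chain-of-thought style: any step marker present."""
    text = response.lower()
    return any(marker in text for marker in STEP_MARKERS)


def filter_pairs(pairs: Iterable[dict]) -> list[dict]:
    """Keep pairs whose response is long enough and reads as step by step."""
    kept = []
    for pair in pairs:
        response = pair["response"]
        if len(response) >= MIN_RESPONSE_CHARS and looks_step_by_step(response):
            kept.append(pair)
    return kept
```

A learned quality scorer, as the source describes, would replace `looks_step_by_step` with a model-based score; the keyword check only illustrates where that decision sits in the loop.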
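Model training
Trained on 1 A100 GPU for 6 days, for a total of 4 epochs.
Prompt format
Quantized models
Model showcase
The model performs its chain-of-thought reasoning in the grey area; the black area is the computed response. Note that this behavior is produced through system card instructions, to demonstrate system card fidelity. It is not fine-tuned into the model.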
Evaluation results
| Metric | Value |
|---|---|
| Avg. | 64.74 |
| AI2 Reasoning Challenge (25-shot) | 59.04 |
| HellaSwag (10-shot) | 79.20 |
| MMLU (5-shot) | 62.99 |
| TruthfulQA (0-shot) | 50.80 |
| Winogrande (5-shot) | 75.93 |
| GSM8k (5-shot) | 60.50 |
Detailed results can be found here.
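As a quick sanity check, the reported average of 64.74 can be reproduced as the unweighted mean of the six benchmark scores. The sketch below assumes a simple arithmetic mean, which matches the reported figure; the dictionary keys are just labels chosen for this example.

```python
# Benchmark scores as reported in the evaluation results.
scores = {
    "AI2 Reasoning Challenge (25-shot)": 59.04,
    "HellaSwag (10-shot)": 79.20,
    "MMLU (5-shot)": 62.99,
    "TruthfulQA (0-shot)": 50.80,
    "Winogrande (5-shot)": 75.93,
    "GSM8k (5-shot)": 60.50,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 64.74
```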
📄 License
This model is released under the Llama3 license.