TinyLlama-1.1B-Chat-v1.0: An Open-Source Lightweight Model for Free Deployment in Resource-Constrained Settings

TinyLlama 1.1B Chat v1.0

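Developed by TinyLlama

TinyLlama is a lightweight Llama model with 1.1 billion parameters. It was pretrained on 3 trillion tokens, then fine-tuned for chat and aligned, which makes it a good fit for resource-constrained settings.

Large language model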

Transformers

English

License: Apache-2.0

#lightweight-chat-model #Llama-2-compatible-architecture #DPO-alignment

Downloads: 1.4M

Release date: 12/30/2023

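Model Overview

A lightweight chat model based on the Llama 2 architecture, fine-tuned on the UltraChat and UltraFeedback datasets. It supports English dialogue generation.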

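Model Features

Lightweight design

Only 1.1 billion parameters, suited to applications with limited compute and memory

Efficient training

Training on 3 trillion tokens takes only 90 days on 16 A100-40G GPUs

Strong compatibility

Reproduces the Llama 2 architecture and tokenizer exactly, so it works with open-source projects built on Llama

Chat optimization

Follows the Zephyr training recipe: fine-tuned on UltraChat and aligned on UltraFeedback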

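Model Capabilities

Text generation

Conversational interaction

Coding assistance

Use Cases

Chatbots

Stylized dialogue

Customize the system prompt to produce responses in different styles (for example, pirate style)

Generate natural-language replies that match a given persona

Coding assistance

Code generation

Generate code snippets in Python and other programming languages from natural-language descriptions

For example, generate a function that computes the Fibonacci sequence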

🚀 TinyLlama-1.1B

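The TinyLlama project aims to pretrain a 1.1-billion-parameter Llama model on 3 trillion tokens. With some proper optimization, we can achieve this within a span of "just" 90 days using 16 A100-40G GPUs 🚀🚀. Training started on 2023-09-01.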
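To make the training budget concrete, the sketch below works out the throughput those figures imply. It uses only the numbers quoted above and assumes uninterrupted training for the full 90 days; the function name `required_throughput` is illustrative and not part of the TinyLlama code base.

```python
# Back-of-the-envelope throughput implied by the stated training budget.
TOTAL_TOKENS = 3e12      # 3 trillion tokens (from the text)
NUM_GPUS = 16            # A100-40G GPUs (from the text)
TRAINING_DAYS = 90       # wall-clock budget (from the text)

SECONDS_PER_DAY = 24 * 60 * 60


def required_throughput(total_tokens, days, gpus):
    """Return (cluster tokens/s, per-GPU tokens/s) needed to finish on time.

    Assumption: training runs without interruption for the whole period.
    """
    cluster = total_tokens / (days * SECONDS_PER_DAY)
    return cluster, cluster / gpus


cluster_tps, per_gpu_tps = required_throughput(TOTAL_TOKENS, TRAINING_DAYS, NUM_GPUS)
print(f"cluster: {cluster_tps:,.0f} tokens/s")   # roughly 386 thousand
print(f"per GPU: {per_gpu_tps:,.0f} tokens/s")   # roughly 24 thousand
```

In other words, each GPU has to sustain about 24 thousand tokens per second on average for the schedule to hold.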

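We adopted exactly the same architecture and tokenizer as Llama 2. This means TinyLlama can be plugged into many open-source projects built on Llama. TinyLlama is also compact, with only 1.1 billion parameters, which lets it serve applications that demand a restricted computation and memory footprint.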
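As a rough illustration of that footprint, the sketch below estimates the memory needed just to hold the weights at several numeric precisions. Only the 1.1-billion parameter count comes from the text. The int8 and int4 rows are hypothetical quantized variants included for comparison, and the estimate ignores activations, the KV cache, and framework overhead.

```python
NUM_PARAMS = 1.1e9  # parameter count stated in the text

# Bytes used to store one parameter at each precision.
# int8 and int4 are illustrative quantization levels, not claims about this model.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1, "int4": 0.5}


def weight_memory_gb(num_params, dtype):
    """Approximate size of the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weight_memory_gb(NUM_PARAMS, dtype):.2f} GB")
```

At bfloat16, the precision used in the quick-start snippet, the weights alone come to about 2.2 GB.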

🚀 Quick Start

You will need transformers >= 4.34. See the TinyLlama GitHub page for more information.

# Install transformers from source - only needed for versions <= v4.34
# pip install git+https://github.com/huggingface/transformers.git
# pip install accelerate

import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="TinyLlama/TinyLlama-1.1B-Chat-v1.0", torch_dtype=torch.bfloat16, device_map="auto")

# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
# <|system|>
# You are a friendly chatbot who always responds in the style of a pirate</s>
# <|user|>
# How many helicopters can a human eat in one sitting?</s>
# <|assistant|>
# ...
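
The prompt layout shown in the sample output can be reproduced with plain string formatting, which helps when inspecting or debugging prompts without loading the model. The sketch below is only an approximation inferred from that output: `format_chat` and `EOS_TOKEN` are hypothetical names, and the exact whitespace is decided by the tokenizer's own template, so `apply_chat_template` remains the call to use in practice.

```python
EOS_TOKEN = "</s>"  # end-of-sequence marker as it appears in the sample output


def format_chat(messages, add_generation_prompt=True):
    """Approximate the chat prompt layout; not the tokenizer's actual template."""
    # Each turn is rendered as a role tag, a newline, the content, and the EOS marker.
    parts = [f"<|{m['role']}|>\n{m['content']}{EOS_TOKEN}\n" for m in messages]
    if add_generation_prompt:
        # A trailing assistant tag cues the model to write its reply next.
        parts.append("<|assistant|>\n")
    return "".join(parts)


messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
print(format_chat(messages))
```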

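✨ Key Features

Architecture compatibility: uses the same architecture and tokenizer as Llama 2, so it integrates easily into open-source projects built on Llama.
Compact size: only 1.1 billion parameters, suited to applications with tight compute and memory budgets.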

📦 Installation

You will need transformers >= 4.34. If necessary, install it from source:

# Install transformers from source - only needed for versions <= v4.34
pip install git+https://github.com/huggingface/transformers.git
pip install accelerate
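
Before running the examples, it can be useful to confirm that the installed release meets the stated minimum. The sketch below is one possible check: `meets_minimum` is a hypothetical helper that compares only the leading major.minor part of a version string. In a real environment you would pass it `transformers.__version__`.

```python
import re

MIN_VERSION = (4, 34)  # minimum transformers release stated in the text


def meets_minimum(version, minimum=MIN_VERSION):
    """Compare the leading major.minor of a version string to the minimum."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


# Sample version strings for illustration only.
for candidate in ("4.33.3", "4.34.0", "4.36.0.dev0"):
    print(candidate, meets_minimum(candidate))
```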

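📚 Detailed Documentation

This Model

This chat model is fine-tuned on top of TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T. We follow HF's Zephyr training recipe. The model was first fine-tuned on a variant of the UltraChat dataset, which contains a diverse range of synthetic dialogues generated by ChatGPT. We then further aligned the model with 🤗 TRL's DPOTrainer on the openbmb/UltraFeedback dataset, which contains 64k prompts and model completions ranked by GPT-4.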
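
For readers unfamiliar with DPO, the sketch below computes the standard DPO objective for a single preference pair in plain Python. It is not the authors' training code and does not use TRL. The `beta` value of 0.1 and all log-probabilities are made-up illustrative numbers, and `dpo_loss` is a hypothetical helper name.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair of completions.

    Inputs are summed log-probabilities of each completion under the policy
    being trained and under the frozen reference model. beta is an assumed
    value, not one taken from the model card.
    """
    # Implicit rewards: how far the policy has moved from the reference.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)
    # -log(sigmoid(margin)), written as log(1 + exp(-margin)).
    return math.log1p(math.exp(-margin))


# Policy identical to the reference: zero margin, so the loss equals log(2).
baseline = dpo_loss(-12.0, -13.0, -12.0, -13.0)
# Policy now favors the chosen completion more than the reference does.
improved = dpo_loss(-10.0, -15.0, -12.0, -13.0)
print(f"baseline: {baseline:.4f}")
print(f"improved: {improved:.4f}")
```

The loss falls as the policy raises the likelihood of the preferred completion relative to the rejected one, measured against the reference model.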