MobiLlama-1B-Chat: an open-source instruction-following model for low-resource devices, with fast responses and a low memory footprint


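MobiLlama 1B Chat

Developed by MBZUAI

MobiLlama-1B-Chat is an instruction-following model fine-tuned from MobiLlama-1B. It is designed for resource-constrained devices, with an emphasis on efficiency, a low memory footprint, and fast responses.

Large language model

Transformers

Language: English · License: Apache-2.0 · #lightweight language model #on-device optimization #instruction following

Downloads: 44

Released: 2/25/2024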

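Model Overview

This is a small language model (SLM) for scenarios that call for on-device processing, energy efficiency, a low memory footprint, and responsive inference, with particular attention to privacy, security, and sustainable deployment.

Model Features

Resource efficient

Designed for resource-constrained devices; a parameter sharing scheme reduces pre-training and deployment costs.

Fully transparent

The complete training data pipeline, training code, model weights, and evaluation code are available.

Instruction following

Fine-tuned to understand and carry out human instructions.

Model Capabilities

Text generation

Instruction understanding

Dialogue systems

Use Cases

Education

Activity suggestions for children

Provides parents with activity ideas and suggestions suited to children

Generates detailed activity plans and points to watch for

Health

Mindfulness meditation guidance

Explains mindfulness meditation methods and their benefits

Explains meditation techniques and their scientific basis in detail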

🚀 MobiLlama-1B-Chat

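We present MobiLlama-1.2B-Chat, an instruction-following model fine-tuned from MBZUAI/MobiLlama-1B and published as MBZUAI/MobiLlama-1B-Chat. It targets the specific needs of resource-constrained devices, aiming for better performance at reduced resource demands.

🚀 Quick Start

Loading MobiLlama-1B-Chat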

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("MBZUAI/MobiLlama-1B-Chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("MBZUAI/MobiLlama-1B-Chat", trust_remote_code=True)
model.to('cuda')

# Template adapted from FastChat: a system line, one worked example turn, then the user's prompt
template = (
    "A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.\n"
    "### Human: Got any creative ideas for a 10 year old’s birthday?\n"
    "### Assistant: Of course! Here are some creative ideas for a 10-year-old's birthday party:\n"
    "1. Treasure Hunt: Organize a treasure hunt in your backyard or nearby park. Create clues and riddles for the kids to solve, leading them to hidden treasures and surprises.\n"
    "2. Science Party: Plan a science-themed party where kids can engage in fun and interactive experiments. You can set up different stations with activities like making slime, erupting volcanoes, or creating simple chemical reactions.\n"
    "3. Outdoor Movie Night: Set up a backyard movie night with a projector and a large screen or white sheet. Create a cozy seating area with blankets and pillows, and serve popcorn and snacks while the kids enjoy a favorite movie under the stars.\n"
    "4. DIY Crafts Party: Arrange a craft party where kids can unleash their creativity. Provide a variety of craft supplies like beads, paints, and fabrics, and let them create their own unique masterpieces to take home as party favors.\n"
    "5. Sports Olympics: Host a mini Olympics event with various sports and games. Set up different stations for activities like sack races, relay races, basketball shooting, and obstacle courses. Give out medals or certificates to the participants.\n"
    "6. Cooking Party: Have a cooking-themed party where the kids can prepare their own mini pizzas, cupcakes, or cookies. Provide toppings, frosting, and decorating supplies, and let them get hands-on in the kitchen.\n"
    "7. Superhero Training Camp: Create a superhero-themed party where the kids can engage in fun training activities. Set up an obstacle course, have them design their own superhero capes or masks, and organize superhero-themed games and challenges.\n"
    "8. Outdoor Adventure: Plan an outdoor adventure party at a local park or nature reserve. Arrange activities like hiking, nature scavenger hunts, or a picnic with games. Encourage exploration and appreciation for the outdoors.\n"
    "Remember to tailor the activities to the birthday child's interests and preferences. Have a great celebration!\n"
    "### Human: {prompt}\n"
    "### Assistant:"
)

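prompt = "What are the key benefits of practicing mindfulness meditation?"

# Fill the final "### Human:" turn of the template with the user's prompt
input_str = template.format(prompt=prompt)
input_ids = tokenizer(input_str, return_tensors="pt").to('cuda').input_ids
# max_length counts the prompt tokens plus the newly generated tokens
outputs = model.generate(input_ids, max_length=1000, pad_token_id=tokenizer.eos_token_id)
# Decode only the generated part: skip the prompt tokens and drop the last token
print(tokenizer.batch_decode(outputs[:, input_ids.shape[1]:-1])[0].strip())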
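
The prompt layout used above (a system line followed by alternating "### Human:" and "### Assistant:" turns) can also be assembled programmatically. The following is a minimal sketch, not part of the original model card: `build_prompt` and `SYSTEM` are hypothetical names introduced here, and the layout is inferred only from the template string shown in the quick-start snippet.

```python
# Hypothetical helper (not from the model card): rebuilds the
# "### Human: / ### Assistant:" layout used by the quick-start template.
SYSTEM = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's questions."
)


def build_prompt(turns, system=SYSTEM):
    """Join (human, assistant) pairs into one prompt string.

    Pass None as the assistant reply of the last pair to leave the
    final "### Assistant:" open for the model to complete.
    """
    parts = [system]
    for human, assistant in turns:
        parts.append(f"### Human: {human}")
        if assistant is None:
            parts.append("### Assistant:")
        else:
            parts.append(f"### Assistant: {assistant}")
    return "\n".join(parts)


input_str = build_prompt([
    ("Got any creative ideas for a 10 year old's birthday?", "Of course! ..."),
    ("What are the key benefits of practicing mindfulness meditation?", None),
])
print(input_str)
```

The resulting `input_str` can be passed to the tokenizer in place of `template.format(prompt=prompt)`. Whether shorter or longer example turns change answer quality is not something the source states.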

Alternatively, you can use FastChat:

python3 -m fastchat.serve.cli --model-path MBZUAI/MobiLlama-1B-Chat

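✨ Key Features

Recent development of large language models (LLMs) has been dominated by a "bigger is better" trend. However, LLMs are poorly suited to scenarios that require on-device processing, energy efficiency, a low memory footprint, and response efficiency. These requirements are crucial for privacy, security, and sustainable deployment. This paper explores the "less is more" paradigm by addressing the challenge of designing accurate yet efficient small language models (SLMs) for resource-constrained devices. Our primary contribution is an accurate and fully transparent open-source 0.5 billion (0.5B) parameter SLM, named MobiLlama, which caters to the specific needs of resource-constrained computing with an emphasis on enhanced performance at reduced resource demands. MobiLlama is an SLM design that starts from a larger model and applies a careful parameter sharing scheme to reduce both the pre-training and the deployment cost. Our work strives not only to bridge the gap in open-source SLMs but also to ensure full transparency: the complete training data pipeline, training code, model weights, over 300 checkpoints, and evaluation code are available on our GitHub.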

Paper on arXiv: 2402.16840

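📚 Detailed Documentation

Model Description

Attribute	Details
Model type	Small language model (SLM) built on the LLaMA-7B architecture design
Language (NLP)	English
License	Apache 2.0
Resources for more information	Training code; data preparation; fully processed Amber pretraining data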

MobiLlama-1B-Chat Fine-tuning Details

DataMix

Subset	Rows	License
WizardLM/WizardLM_evol_instruct_V2_196k	143k
icybee/share_gpt_90k_v1	90k	cc0-1.0
Total	233k

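Hyperparameters

Hyperparameter	Value
Total parameters	1.2B
Hidden size	2048
Intermediate size (MLPs)	5632
Number of attention heads	32
Number of hidden layers	22
RMSNorm ɛ	1e-5
Max sequence length	2048
Vocabulary size	32000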
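
As a rough sanity check, the total parameter count can be estimated from the table above. The sketch below rests on assumptions the table does not state: a LLaMA-style block with four bias-free attention projections (no grouped-query attention), a gated MLP with three weight matrices, two RMSNorm vectors per block, and untied input and output embeddings. The `share_mlp` flag is purely illustrative: it shows how reusing one MLP across all layers would shrink the count, which is one possible reading of the "parameter sharing scheme" mentioned under Key Features; this page does not say what is actually shared. The function name is hypothetical.

```python
def llama_style_param_count(hidden, intermediate, layers, vocab,
                            share_mlp=False, tie_embeddings=False):
    """Estimate parameters of a LLaMA-style decoder (assumed layout, see lead-in)."""
    attn = 4 * hidden * hidden        # Q, K, V and output projections, no biases
    mlp = 3 * hidden * intermediate   # gate, up and down projections
    norms = 2 * hidden                # two RMSNorm weight vectors per block
    mlp_copies = 1 if share_mlp else layers
    embeddings = vocab * hidden * (1 if tie_embeddings else 2)
    final_norm = hidden
    return layers * (attn + norms) + mlp_copies * mlp + embeddings + final_norm


# Values taken from the hyperparameter table
dedicated = llama_style_param_count(2048, 5632, 22, 32000)
shared = llama_style_param_count(2048, 5632, 22, 32000, share_mlp=True)

print(f"dedicated MLP per layer:      {dedicated / 1e9:.2f}B")
print(f"one MLP shared by all layers: {shared / 1e9:.2f}B")
```

Under these assumptions the unshared estimate lands near the 1.2B total listed in the table, which suggests the assumed layout is at least plausible for this model.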

Training hyperparameter	Value
Learning rate	2e-5
Number of epochs	3
Per-device train batch size	2
Gradient accumulation steps	16
Warmup ratio	0.04
Model max length	2048
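
To see how these settings interact, the following sketch derives the effective batch size, optimizer steps, and warmup steps. The number of devices is not given in the source, so `num_devices=8` is a made-up value for illustration only; the 233k example count comes from the DataMix table and is itself rounded. `training_schedule` is a hypothetical helper, and a real trainer may round step counts slightly differently.

```python
import math


def training_schedule(num_examples, per_device_batch, grad_accum,
                      num_devices, epochs, warmup_ratio):
    """Return (effective_batch, steps_per_epoch, total_steps, warmup_steps)."""
    # One optimizer step consumes this many examples across all devices
    effective_batch = per_device_batch * grad_accum * num_devices
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    total_steps = steps_per_epoch * epochs
    # Warmup covers the first warmup_ratio fraction of all optimizer steps
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return effective_batch, steps_per_epoch, total_steps, warmup_steps


# num_devices=8 is an assumption; the other values come from the tables above
effective_batch, steps_per_epoch, total_steps, warmup_steps = training_schedule(
    num_examples=233_000, per_device_batch=2, grad_accum=16,
    num_devices=8, epochs=3, warmup_ratio=0.04,
)
print(effective_batch, steps_per_epoch, total_steps, warmup_steps)
```

On a single device the effective batch is 2 × 16 = 32 sequences per optimizer step; each additional device multiplies that figure.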

Evaluation

Benchmark	MobiLlama-05B-Chat	MobiLlama-1.2B-Chat
HellaSwag	0.5042	0.6244
MMLU	0.2677	0.2635
Arc Challenge	0.2935	0.3558
TruthfulQA	0.3997	0.3848
CrowsPairs	0.5694	0.679
PIQA	0.7078	0.7557
Race	0.3320	0.3598
SIQA	0.4165	0.4396
Winogrande	0.5659	0.5966
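
The per-benchmark gap between the two chat models can be read off the table with a few lines of Python. This is only a convenience sketch over the numbers listed above; it makes no claim about which direction is "better" on each benchmark (CrowsPairs, for instance, measures bias rather than accuracy), and the variable names are introduced here for illustration.

```python
# (0.5B score, 1.2B score) copied from the evaluation table
scores = {
    "HellaSwag": (0.5042, 0.6244),
    "MMLU": (0.2677, 0.2635),
    "Arc Challenge": (0.2935, 0.3558),
    "TruthfulQA": (0.3997, 0.3848),
    "CrowsPairs": (0.5694, 0.679),
    "PIQA": (0.7078, 0.7557),
    "Race": (0.3320, 0.3598),
    "SIQA": (0.4165, 0.4396),
    "Winogrande": (0.5659, 0.5966),
}

# Difference (1.2B minus 0.5B) for each benchmark
deltas = {name: round(large - small, 4) for name, (small, large) in scores.items()}

for name, delta in sorted(deltas.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:14s} {delta:+.4f}")

# Benchmarks where the 1.2B model scores lower than the 0.5B model
lower_on_large = sorted(name for name, delta in deltas.items() if delta < 0)
print(lower_on_large)
```

In this table the 1.2B model scores higher on seven of the nine benchmarks; MMLU and TruthfulQA are the two where the 0.5B model's number is slightly higher.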

Citation

BibTeX:

@misc{thawakar2024mobillama,
      title={MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT}, 
      author={Omkar Thawakar and Ashmal Vayani and Salman Khan and Hisham Cholakkal and Rao Muhammad Anwer and Michael Felsberg and Timothy Baldwin and Eric P. Xing and Fahad Shahbaz Khan},
      year={2024},
      eprint={2402.16840},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}