MobiLlama-1B-Chat: an open-source instruction-following model for low-resource devices, with fast responses and low memory use

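MobiLlama 1B Chat

Developed by MBZUAI

MobiLlama-1B-Chat is an instruction-following model fine-tuned from MobiLlama-1B. It is designed for resource-constrained devices, with an emphasis on efficiency, low memory use, and fast responses.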

Large language model

Transformers

Language: English | License: Apache-2.0 (open source) | #lightweight-language-model #on-device-optimization #instruction-following

Downloads: 44

Release date: 2/25/2024

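Model overview

This is a small language model (SLM) suited to scenarios that need on-device processing, energy efficiency, a low memory footprint, and fast responses, with a particular focus on privacy, security, and sustainable deployment.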

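Model features

Resource efficiency

Designed for resource-constrained devices; a parameter-sharing scheme reduces pre-training and deployment costs.

Full transparency

The complete training data pipeline, training code, model weights, and evaluation code are provided.

Instruction following

Fine-tuning lets the model understand and carry out human instructions effectively.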

Model capabilities

Text generation

Instruction understanding

Dialogue systems

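Use cases

Education

Activity suggestions for children

Provides parents with activity ideas and suggestions suitable for children

Generates detailed activity plans and points to watch out for

Health

Mindfulness meditation guide

Explains the methods and benefits of mindfulness meditation

Gives a detailed account of meditation techniques and the scientific evidence behind them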

🚀 MobiLlama-1B-Chat

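MobiLlama-1B-Chat is an instruction-following model fine-tuned from MBZUAI/MobiLlama-1B. It is a small language model (SLM) aimed at resource-constrained computing environments, where it is meant to deliver good accuracy within tight compute and memory budgets.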

🚀 Quick Start

Loading the model

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("MBZUAI/MobiLlama-1B-Chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("MBZUAI/MobiLlama-1B-Chat", trust_remote_code=True)
model.to('cuda')

# Template adapted from FastChat. The string must stay on one line: a raw line break inside the quotes is a syntax error.
template = "A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.\n### Human: Got any creative ideas for a 10 year old’s birthday?\n### Assistant: Of course! Here are some creative ideas for a 10-year-old's birthday party:\n1. Treasure Hunt: Organize a treasure hunt in your backyard or nearby park. Create clues and riddles for the kids to solve, leading them to hidden treasures and surprises.\n2. Science Party: Plan a science-themed party where kids can engage in fun and interactive experiments. You can set up different stations with activities like making slime, erupting volcanoes, or creating simple chemical reactions.\n3. Outdoor Movie Night: Set up a backyard movie night with a projector and a large screen or white sheet. Create a cozy seating area with blankets and pillows, and serve popcorn and snacks while the kids enjoy a favorite movie under the stars.\n4. DIY Crafts Party: Arrange a craft party where kids can unleash their creativity. Provide a variety of craft supplies like beads, paints, and fabrics, and let them create their own unique masterpieces to take home as party favors.\n5. Sports Olympics: Host a mini Olympics event with various sports and games. Set up different stations for activities like sack races, relay races, basketball shooting, and obstacle courses. Give out medals or certificates to the participants.\n6. Cooking Party: Have a cooking-themed party where the kids can prepare their own mini pizzas, cupcakes, or cookies. Provide toppings, frosting, and decorating supplies, and let them get hands-on in the kitchen.\n7. Superhero Training Camp: Create a superhero-themed party where the kids can engage in fun training activities. Set up an obstacle course, have them design their own superhero capes or masks, and organize superhero-themed games and challenges.\n8. Outdoor Adventure: Plan an outdoor adventure party at a local park or nature reserve. Arrange activities like hiking, nature scavenger hunts, or a picnic with games. Encourage exploration and appreciation for the outdoors.\nRemember to tailor the activities to the birthday child's interests and preferences. Have a great celebration!\n### Human: {prompt}\n### Assistant:"

prompt = "What are the key benefits of practicing mindfulness meditation?"

input_str = template.format(prompt=prompt)
input_ids = tokenizer(input_str, return_tensors="pt").to('cuda').input_ids
outputs = model.generate(input_ids, max_length=1000, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.batch_decode(outputs[:, input_ids.shape[1]:-1])[0].strip())
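
The template in the snippet above follows a simple layout: a system sentence, then alternating "### Human:" and "### Assistant:" turns, ending with an open "### Assistant:" for the model to complete. The helper below is a minimal sketch of building that layout for an arbitrary conversation history. The name `build_prompt` and its parameters are hypothetical and not part of the model's code, and the sketch assumes the model responds well to this layout without the long worked example.

```python
# System sentence copied from the template above.
SYSTEM = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's questions."
)


def build_prompt(question, history=(), system=SYSTEM):
    """Build a prompt in the '### Human: / ### Assistant:' layout.

    `history` is a sequence of (human_text, assistant_text) pairs.
    This helper is illustrative only; it is not shipped with the model.
    """
    parts = [system]
    for human_text, assistant_text in history:
        parts.append(f"### Human: {human_text}")
        parts.append(f"### Assistant: {assistant_text}")
    parts.append(f"### Human: {question}")
    parts.append("### Assistant:")  # left open for the model to complete
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_prompt(
        "What are the key benefits of practicing mindfulness meditation?",
        history=[("Hello!", "Hi! How can I help you today?")],
    ))
```

The returned string can be passed to the tokenizer in place of `template.format(prompt=prompt)`.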

Alternatively, you can use FastChat:

python3 -m fastchat.serve.cli --model-path MBZUAI/MobiLlama-1B-Chat

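✨ Key Features

"Bigger is better" has been the dominant trend in recent large language model (LLM) development. However, LLMs are a poor fit for scenarios that require on-device processing, energy efficiency, a low memory footprint, and fast responses. These requirements matter for privacy, security, and sustainable deployment. This work explores the "less is more" paradigm by taking on the challenge of designing accurate yet efficient small language models (SLMs) for resource-constrained devices.

MobiLlama is an SLM design that starts from a larger model and applies a careful parameter-sharing scheme to reduce both pre-training and deployment costs. Beyond bridging the gap in open-source SLMs, the work aims for full transparency: the complete training data pipeline, training code, model weights, more than 300 checkpoints, and the evaluation code are available on GitHub.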
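
The text above does not say which parameters are shared. As a minimal sketch, assume a single feed-forward block is reused by every transformer layer while each layer keeps its own attention weights. The class names and the toy sizes below are hypothetical and are not the model's actual configuration; the point is only to show how sharing changes the parameter count.

```python
class FeedForward:
    """Stand-in for an MLP block; stores only its parameter count."""

    def __init__(self, hidden, intermediate):
        # Assumption: three projection matrices (gate, up, down), no biases.
        self.num_params = 3 * hidden * intermediate


class Block:
    """Stand-in for a transformer layer with its own attention weights."""

    def __init__(self, hidden, ffn):
        # Assumption: Q, K, V, and output projections, no biases.
        self.attn_params = 4 * hidden * hidden
        self.ffn = ffn  # may be the same object in every block


def count_params(blocks):
    """Count parameters, counting each distinct FeedForward object once."""
    seen = set()
    total = 0
    for block in blocks:
        total += block.attn_params
        if id(block.ffn) not in seen:
            seen.add(id(block.ffn))
            total += block.ffn.num_params
    return total


HIDDEN, INTERMEDIATE, LAYERS = 512, 1024, 8  # toy sizes, for illustration only

# Baseline: every layer owns its feed-forward block.
independent = [Block(HIDDEN, FeedForward(HIDDEN, INTERMEDIATE)) for _ in range(LAYERS)]

# Sharing: one feed-forward block referenced by all layers.
shared_ffn = FeedForward(HIDDEN, INTERMEDIATE)
shared = [Block(HIDDEN, shared_ffn) for _ in range(LAYERS)]

print(count_params(independent))  # 20971520
print(count_params(shared))       # 9961472
```

In this toy setup the shared variant has fewer than half the parameters of the baseline, because the feed-forward weights are paid for once rather than once per layer.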

Paper: arXiv:2402.16840

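📚 Documentation

Model details

Attribute	Details
Model type	Small language model (SLM) built using the architecture design of LLaMA-7B
Language	English
License	Apache 2.0
Resources for more information	Training code, data preparation, fully processed Amber pre-training data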

Fine-tuning details

DataMix

Subset	Number of rows	License
WizardLM/WizardLM_evol_instruct_V2_196k	143k
icybee/share_gpt_90k_v1	90k	cc0-1.0
Total	233k

Hyperparameters

Hyperparameter	Value
Total parameters	1.2B
Hidden size	2048
Intermediate size (MLP)	5632
Number of attention heads	32
Number of hidden layers	22
RMSNorm ε	1e-5
Max sequence length	2048
Vocabulary size	32000
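
As a rough cross-check of the table, the parameter count can be estimated from the listed sizes. This is a back-of-the-envelope sketch, not the authors' accounting. It assumes standard LLaMA-style blocks: full multi-head attention without bias terms, an MLP with three projection matrices, two RMSNorm weight vectors per layer, and separate input and output embedding matrices. None of these details are stated in the table, and the function name is hypothetical.

```python
def estimate_llama_params(hidden, intermediate, layers, vocab, tied_embeddings=False):
    """Estimate the parameter count of a LLaMA-style decoder (assumptions above)."""
    attention = 4 * hidden * hidden    # Q, K, V, and output projections
    mlp = 3 * hidden * intermediate    # gate, up, and down projections
    norms = 2 * hidden                 # two RMSNorm weight vectors per layer
    per_layer = attention + mlp + norms
    embeddings = vocab * hidden * (1 if tied_embeddings else 2)
    final_norm = hidden
    return layers * per_layer + embeddings + final_norm


# Values taken from the hyperparameter table above.
total = estimate_llama_params(hidden=2048, intermediate=5632, layers=22, vocab=32000)
print(f"{total / 1e9:.2f}B parameters")  # 1.26B parameters
```

Under these assumptions the estimate lands close to the 1.2B figure in the table. A mismatch in any assumption (for example tied embeddings or grouped-query attention) would shift the number.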

Training hyperparameter	Value
Learning rate	2e-5
Number of epochs	3
Per-device batch size	2
Gradient accumulation steps	16
Warmup ratio	0.04
Model max length	2048
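
The training table gives a per-device batch size and an accumulation count but not the number of devices, so the effective batch size and step counts cannot be read off directly. The sketch below shows the arithmetic under a stated assumption for the device count. The function name and the device counts are hypothetical, the 233k row total is the rounded figure from the DataMix table, and rounding up the warmup step count is also an assumption.

```python
import math


def training_schedule(num_examples, epochs, per_device_batch, grad_accum,
                      warmup_ratio, num_devices):
    """Return (effective_batch, total_steps, warmup_steps) for one run."""
    # One optimizer step consumes this many examples across all devices.
    effective_batch = per_device_batch * grad_accum * num_devices
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return effective_batch, total_steps, warmup_steps


# Table values, with num_devices as a free assumption.
for devices in (1, 8):
    batch, steps, warmup = training_schedule(
        num_examples=233_000, epochs=3, per_device_batch=2,
        grad_accum=16, warmup_ratio=0.04, num_devices=devices,
    )
    print(f"devices={devices}: effective batch {batch}, "
          f"{steps} optimizer steps, {warmup} warmup steps")
```

With one device the effective batch is 32 examples per optimizer step; with eight devices it is 256. The warmup length scales with the total step count, so it depends on the device count too.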

Evaluation

Benchmark	MobiLlama-05B-Chat	MobiLlama-1.2B-Chat
HellaSwag	0.5042	0.6244
MMLU	0.2677	0.2635
Arc Challenge	0.2935	0.3558
TruthfulQA	0.3997	0.3848
CrowsPairs	0.5694	0.679
PIQA	0.7078	0.7557
Race	0.3320	0.3598
SIQA	0.4165	0.4396
Winogrande	0.5659	0.5966
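
The larger model does not score higher on every row. The short sketch below copies the table values and lists the benchmarks where the smaller model's number is higher. It compares raw numbers only: the table does not state the metric behind each score or whether higher is better in every case (CrowsPairs in particular measures bias), so the result is a reading aid rather than a ranking.

```python
# (MobiLlama-05B-Chat, MobiLlama-1.2B-Chat) scores, copied from the table above.
SCORES = {
    "HellaSwag": (0.5042, 0.6244),
    "MMLU": (0.2677, 0.2635),
    "Arc Challenge": (0.2935, 0.3558),
    "TruthfulQA": (0.3997, 0.3848),
    "CrowsPairs": (0.5694, 0.679),
    "PIQA": (0.7078, 0.7557),
    "Race": (0.3320, 0.3598),
    "SIQA": (0.4165, 0.4396),
    "Winogrande": (0.5659, 0.5966),
}


def smaller_model_higher(scores):
    """Return benchmark names where the first (smaller) model's score is higher."""
    return [name for name, (small, large) in scores.items() if small > large]


print(smaller_model_higher(SCORES))  # ['MMLU', 'TruthfulQA']
```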

Citation

BibTeX:

@misc{thawakar2024mobillama,
      title={MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT}, 
      author={Omkar Thawakar and Ashmal Vayani and Salman Khan and Hisham Cholakkal and Rao Muhammad Anwer and Michael Felsberg and Timothy Baldwin and Eric P. Xing and Fahad Shahbaz Khan},
      year={2024},
      eprint={2402.16840},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}