Mistral-7B-Instruct-Aya-101: an open-source multilingual instruction-following model that supports interaction in 101 languages.

Mistral 7B Instruct Aya 101

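Developed by MaziyarPanahi

A multilingual instruction-following model fine-tuned from Mistral-7B-Instruct-v0.2, supporting 101 languages

Large language model

Transformers

Multilingual | Open-source license: Apache-2.0 | #Multilingual instruction fine-tuning #Optimized for education #Efficient low-resource inference

Downloads: 92

Released: 2/27/2024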

Model overview

This instruction-following model is fine-tuned from Mistral-7B-Instruct-v0.2 and optimized for multilingual use. It can understand and generate text in a wide range of languages.

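Model features

Multilingual support

Understands and generates text in 101 languages

Instruction following

Understands and carries out complex natural-language instructions

Safe responses

Built-in safety mechanisms intended to prevent harmful or unethical output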

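Model capabilities

Multilingual text generation

Instruction understanding and execution

Explanation of technical concepts

Story writing

Multilingual translation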

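Use cases

Education

Multilingual learning assistant

Helps students understand technical concepts in different languages

Provides accurate multilingual explanations

Content creation

Multilingual story writing

Generates story content in multiple languages from user instructions

Illustrated below by a short story about Goku and Kirby

Technical explanation

Explaining technical concepts

Explains complex technical concepts such as Docker in multiple languages

Provides a detailed analysis of advantages and disadvantages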

🚀 Mistral-7B-Instruct-KhanAcademy-v0.2

This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.2 on the CohereForAI/aya_dataset dataset. It achieves the following result on the evaluation set.

Loss: 1.1502

🚀 Quick start

The following code shows how to use this model.

Using a pipeline

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="MaziyarPanahi/Mistral-7B-Instruct-Aya-101")

Loading the model directly

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("MaziyarPanahi/Mistral-7B-Instruct-Aya-101")
model = AutoModelForCausalLM.from_pretrained("MaziyarPanahi/Mistral-7B-Instruct-Aya-101")

💻 Usage examples

Basic usage

<s> [INST] You are a sentient, superintelligent artificial general intelligence, here to teach and assist me. Write a short story about Goku discovering kirby has teamed up with Majin Buu to destroy the world.[/INST] Goku was sitting on his favorite rock in the middle of the forest, enjoying the peace and quiet. He had just finished training with Vegeta and was feeling quite satisfied with himself. Suddenly, he sensed a powerful presence approaching.

"Who goes there?" Goku called out, standing up and preparing for battle.

A figure appeared before him, and Goku's eyes widened in shock. It was Kirby! But why was he here? And why did he look so different?

"Goku-san!" Kirby cried, rushing forward and hugging the Saiyan prince. "I have come to ask for your help!"

Goku was confused but happy to see his old friend. "What's going on, Kirby? What do you need my help with?"

Kirby looked around nervously before speaking. "Majin Buu has joined forces with me, and we plan to destroy the world!"

Goku was stunned. "What?! Why would you do that, Kirby?!"

"It's not what it seems," Kirby explained. "Buu promised to make the world a better place if I helped him. We will only destroy the evil beings and leave the good ones alone."

Goku was torn. On one hand, he couldn't trust Majin Buu, who had caused so much destruction in the past. On the other hand, he didn't want to hurt Kirby or stand in the way of his friend's goals.

"I can't let you do this, Kirby," Goku said finally. "We can't trust Majin Buu. He'll only cause more harm in the end."

Kirby sighed. "I understand, Goku-san. But I must follow my own path now. Farewell, my friend."

With a heavy heart, Goku watched as Kirby and Majin Buu disappeared into the distance, ready to wreak havoc on the world. Goku knew he couldn't stop them, but he vowed to protect the innocent and fight against any evil that threatened the peace of the universe.</s>

Advanced usage

<s> [INST] You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.

describe about pros and cons of docker system.[/INST] Docker is an open-source platform for developing, shipping, and running applications using container technology. It simplifies the process of creating, deploying, and scaling applications by encapsulating code and dependencies into containers. Here are some advantages and disadvantages of using Docker:

Advantages:
1. Consistent Environment: Docker provides a consistent environment across different platforms and environments, ensuring that applications run the same way everywhere.
2. Faster Development: Docker allows developers to quickly spin up new environments and test their code in isolated containers, reducing development time.
3. Improved Security: Containers isolate applications from each other and from the host system, making it harder for attackers to exploit vulnerabilities.
4. Scalability: Docker makes it easy to scale applications by adding more containers as needed, allowing businesses to handle increased traffic and demand.
5. Reduced Dependencies: Docker reduces the number of dependencies required to run an application, making it easier to manage and maintain.
6. Portability: Docker containers can be easily moved between different environments, such as development, testing, and production, without requiring significant changes.
7. Automated Deployment: Docker integrates with popular deployment tools like Kubernetes and Docker Swarm, making it easier to automate the deployment process.
8. Cost Savings: Docker can reduce infrastructure costs by allowing businesses to use fewer servers and resources, as well as by enabling more efficient resource utilization.

Disadvantages:
1. Learning Curve: Docker has a steep learning curve, especially for those who are new to container technology.
2. Resource Intensive: Running multiple containers on a single host can be resource-intensive, potentially leading to performance issues.
3. Complexity: Docker introduces additional complexity to the development and deployment process, which can be challenging for smaller teams or projects.
4. Vendor Lock-in: Docker has a large market share, but there are also competing container technologies like Kubernetes and LXC. Choosing one over the others can lead to vendor lock-in.
5. Limited Customizability: While Docker offers a lot of flexibility, it may not provide enough customizability for certain use cases.
6. Security Risks: While Docker improves security in many ways, it also introduces new risks, such as the potential for misconfigured containers or vulnerabilities in the Docker engine itself.
7. Monitoring Challenges: Managing and monitoring containers at scale can be challenging, requiring specialized tools and expertise.</s>
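Both transcripts above wrap the user turn in `[INST] ... [/INST]`, with the model's reply following the closing tag. A minimal sketch of assembling such a prompt is given below. The helper name `build_prompt` and its `system_text` parameter are hypothetical and do not come from the model card. The sketch also assumes the tokenizer adds the `<s>` token itself, so that token is not written into the string.

```python
def build_prompt(instruction: str, system_text: str = "") -> str:
    """Wrap an instruction in the [INST] ... [/INST] format shown above.

    `system_text` is optional guidance placed before the instruction,
    as in the advanced example. Hypothetical helper for illustration only.
    """
    # Drop empty parts so that a missing system text leaves no blank lines.
    parts = [p.strip() for p in (system_text, instruction) if p.strip()]
    body = "\n\n".join(parts)
    return f"[INST] {body} [/INST]"


if __name__ == "__main__":
    print(build_prompt("Explain what a container image is, in Japanese."))
```

The resulting string could then be passed to the pipeline or tokenizer from the quick start section.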

🔧 Technical details

Training hyperparameters

The following hyperparameters were used during training.

learning_rate: 5e-06
train_batch_size: 2
eval_batch_size: 2
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 4
total_train_batch_size: 32
total_eval_batch_size: 8
optimizer: Adam (betas=(0.9,0.999), epsilon=1e-08)
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 10
num_epochs: 1
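The two "total" batch sizes in the list above follow from the per-device sizes, the device count, and the gradient accumulation steps. The short sketch below reproduces that arithmetic. The function name `effective_batch_sizes` is hypothetical, and the sketch assumes that gradient accumulation applies to training only.

```python
# Values copied from the hyperparameter list above.
TRAIN_BATCH_SIZE = 2             # per-device training batch size
EVAL_BATCH_SIZE = 2              # per-device evaluation batch size
NUM_DEVICES = 4
GRADIENT_ACCUMULATION_STEPS = 4


def effective_batch_sizes(per_device_train, per_device_eval, devices, accum_steps):
    """Return (total_train_batch_size, total_eval_batch_size)."""
    # Each optimizer step sees accum_steps micro-batches on every device.
    total_train = per_device_train * devices * accum_steps
    # Evaluation runs one batch per device without accumulation.
    total_eval = per_device_eval * devices
    return total_train, total_eval


total_train, total_eval = effective_batch_sizes(
    TRAIN_BATCH_SIZE, EVAL_BATCH_SIZE, NUM_DEVICES, GRADIENT_ACCUMULATION_STEPS
)
print(total_train, total_eval)  # 32 8
```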

Training results

Training loss	Epoch	Step	Validation loss
1.9039	0.0	1	3.1495
0.9933	0.25	101	1.2402
0.9439	0.5	202	1.1683
0.9762	0.75	303	1.1502
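The step and epoch columns suggest roughly 404 optimizer steps in the single epoch (step 101 falls at epoch 0.25). Combined with the cosine scheduler, the 10 warmup steps, and the peak learning rate of 5e-06 listed in the hyperparameters, the learning rate at each evaluation point can be approximated as sketched below. The total of 404 steps is an inference from the table, not a reported figure. The function `lr_at` is a hypothetical re-implementation of the usual linear-warmup plus cosine-decay shape, not the trainer's own code.

```python
import math

PEAK_LR = 5e-06      # learning_rate from the hyperparameter list
WARMUP_STEPS = 10    # lr_scheduler_warmup_steps
TOTAL_STEPS = 404    # assumption: inferred from step 101 at epoch 0.25


def lr_at(step, peak_lr=PEAK_LR, warmup_steps=WARMUP_STEPS, total_steps=TOTAL_STEPS):
    """Approximate learning rate at an optimizer step."""
    if step < warmup_steps:
        # Linear warmup from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    # Half-cosine decay from the peak down to 0.
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (1, 101, 202, 303):
        print(step, f"{lr_at(step):.3e}")
```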

Framework versions

Transformers 4.39.0.dev0
PyTorch 2.2.0+cu121
Datasets 2.17.0
Tokenizers 0.15.0

📄 License

This model is released under the Apache-2.0 license.

Model information

Attribute	Details
Base model	mistralai/Mistral-7B-Instruct-v0.2
Dataset	CohereForAI/aya_dataset
Tags	axolotl, mistral, 7b, generated_from_trainer
Languages	afr, amh, ara, aze, bel, ben, bul, cat, ceb, ces, cym, dan, deu, ell, eng, epo, est, eus, fin, fil, fra, fry, gla, gle, glg, guj, hat, hau, heb, hin, hun, hye, ibo, ind, isl, ita, jav, jpn, kan, kat, kaz, khm, kir, kor, kur, lao, lav, lat, lit, ltz, mal, mar, mkd, mlg, mlt, mon, mri, msa, mya, nep, nld, nor, nso, nya, ory, pan, pes, pol, por, pus, ron, rus, sin, slk, slv, smo, sna, snd, som, sot, spa, sqi, srp, sun, swa, swe, tam, tel, tgk, tha, tur, twi, ukr, urd, uzb, vie, xho, yid, yor, zho, zul

axolotl config

axolotl version: 0.4.0

base_model: mistralai/Mistral-7B-Instruct-v0.2
model_type: MistralForCausalLM
tokenizer_type: LlamaTokenizer
is_mistral_derived_model: true


hub_model_id: MaziyarPanahi/Mistral-7B-Instruct-KhanAcademy-v0.2
hf_use_auth_token: true

load_in_8bit: false
load_in_4bit: false
strict: false

datasets:
  - path: CohereForAI/aya_dataset
    type:
      system_prompt: ""
      field_instruction: inputs
      field_output: targets
      format: "[INST] {instruction} [/INST]"
      no_input_format: "[INST] {instruction} [/INST]"

dataset_prepared_path:
val_set_size: 0.05
output_dir: ./models/MaziyarPanahi/Mistral-7B-Instruct-Aya-101

sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true
eval_sample_packing: false

wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

gradient_accumulation_steps: 4
micro_batch_size: 2
num_epochs: 1
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.000005

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 10
evals_per_epoch: 4
eval_table_size:
eval_max_new_tokens: 128
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
special_tokens:
  bos_token: "<s>"
  eos_token: "</s>"
  unk_token: "<unk>"
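In the config above, the `datasets` entry maps each record's `inputs` field into the `[INST] {instruction} [/INST]` template and uses `targets` as the response, while `train_on_inputs: false` means the prompt tokens are excluded from the loss. The sketch below illustrates that idea under stated assumptions. It is not axolotl's implementation: `format_record` and `build_labels` are hypothetical helpers, and the token IDs are toy values rather than real tokenizer output.

```python
IGNORE_INDEX = -100  # label value that PyTorch's CrossEntropyLoss ignores by default


def format_record(record, template="[INST] {instruction} [/INST]"):
    """Split a dataset record into (prompt, completion) using the config's template."""
    prompt = template.format(instruction=record["inputs"])
    return prompt, record["targets"]


def build_labels(prompt_ids, completion_ids, train_on_inputs=False):
    """Concatenate token IDs and mask prompt positions unless train_on_inputs is set."""
    input_ids = list(prompt_ids) + list(completion_ids)
    if train_on_inputs:
        labels = list(input_ids)
    else:
        # Only the completion tokens contribute to the loss.
        labels = [IGNORE_INDEX] * len(prompt_ids) + list(completion_ids)
    return input_ids, labels


if __name__ == "__main__":
    record = {"inputs": "Translate 'hello' into French.", "targets": "Bonjour"}
    prompt, completion = format_record(record)
    # Toy token IDs standing in for real tokenizer output.
    input_ids, labels = build_labels([11, 12, 13], [21, 22])
    print(prompt)
    print(completion)
    print(labels)
```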