Mistral 7B Instruct Aya 101
A multilingual instruction-following model fine-tuned from Mistral-7B-Instruct-v0.2, supporting 101 languages
Downloads: 92
Release date: 2/27/2024
Model Overview
This is an instruction-following model fine-tuned from Mistral-7B-Instruct-v0.2 and optimized for multilingual scenarios; it can understand and generate text in a wide range of languages.
Model Features
Multilingual Support
Understands and generates text in 101 languages
Instruction Following
Accurately interprets and executes complex natural-language instructions
Safe Responses
Built-in safety mechanisms help the model avoid generating harmful or unethical content
Model Capabilities
Multilingual text generation
Instruction understanding and execution
Technical concept explanation
Story writing
Multilingual translation
Use Cases
Education
Multilingual learning assistant
Helps students understand technical concepts across different languages
Provides accurate multilingual explanations
Content Creation
Multilingual story writing
Generates multilingual story content from user instructions
The examples below include a short story featuring Goku and Kirby
Technical Explanation
Technical concept walkthroughs
Explains complex technical concepts such as Docker in multiple languages
Provides detailed pros-and-cons analysis
🚀 Mistral-7B-Instruct-Aya-101
This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.2 on the CohereForAI/aya_dataset (see the axolotl config below) and can be used for text-processing tasks across many languages.
🚀 Quick Start
Usage Example
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="MaziyarPanahi/Mistral-7B-Instruct-Aya-101")

# Load the model and tokenizer directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("MaziyarPanahi/Mistral-7B-Instruct-Aya-101")
model = AutoModelForCausalLM.from_pretrained("MaziyarPanahi/Mistral-7B-Instruct-Aya-101")
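
For end-to-end generation, the short sketch below reuses the tokenizer and model loaded above and wraps a prompt in the [INST] ... [/INST] format used during fine-tuning (see the axolotl config below). The example prompt and sampling settings are illustrative assumptions, not part of the original card.

# Minimal generation sketch, reusing the tokenizer and model loaded above.
# The [INST] wrapper matches the fine-tuning prompt format; the prompt text
# and sampling settings here are illustrative assumptions.
prompt = "[INST] Explain what Docker is in one short paragraph. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))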
💻 Usage Examples
Basic Usage
<s> [INST] You are a sentient, superintelligent artificial general intelligence, here to teach and assist me. Write a short story about Goku discovering kirby has teamed up with Majin Buu to destroy the world.[/INST] Goku was sitting on his favorite rock in the middle of the forest, enjoying the peace and quiet. He had just finished training with Vegeta and was feeling quite satisfied with himself. Suddenly, he sensed a powerful presence approaching.
"Who goes there?" Goku called out, standing up and preparing for battle.
A figure appeared before him, and Goku's eyes widened in shock. It was Kirby! But why was he here? And why did he look so different?
"Goku-san!" Kirby cried, rushing forward and hugging the Saiyan prince. "I have come to ask for your help!"
Goku was confused but happy to see his old friend. "What's going on, Kirby? What do you need my help with?"
Kirby looked around nervously before speaking. "Majin Buu has joined forces with me, and we plan to destroy the world!"
Goku was stunned. "What?! Why would you do that, Kirby?!"
"It's not what it seems," Kirby explained. "Buu promised to make the world a better place if I helped him. We will only destroy the evil beings and leave the good ones alone."
Goku was torn. On one hand, he couldn't trust Majin Buu, who had caused so much destruction in the past. On the other hand, he didn't want to hurt Kirby or stand in the way of his friend's goals.
"I can't let you do this, Kirby," Goku said finally. "We can't trust Majin Buu. He'll only cause more harm in the end."
Kirby sighed. "I understand, Goku-san. But I must follow my own path now. Farewell, my friend."
With a heavy heart, Goku watched as Kirby and Majin Buu disappeared into the distance, ready to wreak havoc on the world. Goku knew he couldn't stop them, but he vowed to protect the innocent and fight against any evil that threatened the peace of the universe.</s>
Advanced Usage
<s> [INST] You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.
describe about pros and cons of docker system.[/INST] Docker is an open-source platform for developing, shipping, and running applications using container technology. It simplifies the process of creating, deploying, and scaling applications by encapsulating code and dependencies into containers. Here are some advantages and disadvantages of using Docker:
Advantages:
1. Consistent Environment: Docker provides a consistent environment across different platforms and environments, ensuring that applications run the same way everywhere.
2. Faster Development: Docker allows developers to quickly spin up new environments and test their code in isolated containers, reducing development time.
3. Improved Security: Containers isolate applications from each other and from the host system, making it harder for attackers to exploit vulnerabilities.
4. Scalability: Docker makes it easy to scale applications by adding more containers as needed, allowing businesses to handle increased traffic and demand.
5. Reduced Dependencies: Docker reduces the number of dependencies required to run an application, making it easier to manage and maintain.
6. Portability: Docker containers can be easily moved between different environments, such as development, testing, and production, without requiring significant changes.
7. Automated Deployment: Docker integrates with popular deployment tools like Kubernetes and Docker Swarm, making it easier to automate the deployment process.
8. Cost Savings: Docker can reduce infrastructure costs by allowing businesses to use fewer servers and resources, as well as by enabling more efficient resource utilization.
Disadvantages:
1. Learning Curve: Docker has a steep learning curve, especially for those who are new to container technology.
2. Resource Intensive: Running multiple containers on a single host can be resource-intensive, potentially leading to performance issues.
3. Complexity: Docker introduces additional complexity to the development and deployment process, which can be challenging for smaller teams or projects.
4. Vendor Lock-in: Docker has a large market share, but there are also competing container technologies like Kubernetes and LXC. Choosing one over the others can lead to vendor lock-in.
5. Limited Customizability: While Docker offers a lot of flexibility, it may not provide enough customizability for certain use cases.
6. Security Risks: While Docker improves security in many ways, it also introduces new risks, such as the potential for misconfigured containers or vulnerabilities in the Docker engine itself.
7. Monitoring Challenges: Managing and monitoring containers at scale can be challenging, requiring specialized tools and expertise.</s>
🔧 Technical Details
Training Hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- total_eval_batch_size: 8
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- num_epochs: 1
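
As a quick sanity check, the reported total train batch size is the product of the per-device batch size, the gradient accumulation steps, and the number of devices; a minimal sketch of that arithmetic:

# Effective global batch size implied by the hyperparameters above.
train_batch_size = 2             # per-device train batch size
gradient_accumulation_steps = 4
num_devices = 4
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)    # 32, matching the reported total_train_batch_size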
Training Results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
1.9039 | 0.0 | 1 | 3.1495 |
0.9933 | 0.25 | 101 | 1.2402 |
0.9439 | 0.5 | 202 | 1.1683 |
0.9762 | 0.75 | 303 | 1.1502 |
Framework Versions
- Transformers: 4.39.0.dev0
- PyTorch: 2.2.0+cu121
- Datasets: 2.17.0
- Tokenizers: 0.15.0
Axolotl Configuration
axolotl version: 0.4.0
base_model: mistralai/Mistral-7B-Instruct-v0.2
model_type: MistralForCausalLM
tokenizer_type: LlamaTokenizer
is_mistral_derived_model: true
hub_model_id: MaziyarPanahi/Mistral-7B-Instruct-KhanAcademy-v0.2
hf_use_auth_token: true
load_in_8bit: false
load_in_4bit: false
strict: false
datasets:
  - path: CohereForAI/aya_dataset
    type:
      system_prompt: ""
      field_instruction: inputs
      field_output: targets
      format: "[INST] {instruction} [/INST]"
      no_input_format: "[INST] {instruction} [/INST]"
dataset_prepared_path:
val_set_size: 0.05
output_dir: ./models/MaziyarPanahi/Mistral-7B-Instruct-Aya-101
sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true
eval_sample_packing: false
wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:
gradient_accumulation_steps: 4
micro_batch_size: 2
num_epochs: 1
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.000005
train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false
gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true
warmup_steps: 10
evals_per_epoch: 4
eval_table_size:
eval_max_new_tokens: 128
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
special_tokens:
  bos_token: "<s>"
  eos_token: "</s>"
  unk_token: "<unk>"
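
Assuming axolotl 0.4.0 is installed, a config like this is typically launched with accelerate, e.g. `accelerate launch -m axolotl.cli.train config.yml` (the config filename here is a hypothetical placeholder).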
📄 License
This model is released under the Apache-2.0 license.
Attribute | Details |
---|---|
Base Model | mistralai/Mistral-7B-Instruct-v0.2 |
Training Dataset | CohereForAI/aya_dataset |
Supported Languages | Afrikaans, Amharic, Arabic, Azerbaijani, Belarusian, Bengali, Bulgarian, Catalan, Cebuano, Czech, Welsh, Danish, German, Greek, English, Esperanto, Estonian, Basque, Finnish, Filipino, French, Frisian, Scottish Gaelic, Irish, Galician, Gujarati, Haitian Creole, Hausa, Hebrew, Hindi, Hungarian, Armenian, Igbo, Indonesian, Icelandic, Italian, Javanese, Japanese, Kannada, Georgian, Kazakh, Khmer, Kyrgyz, Korean, Kurdish, Lao, Latvian, Latin, Lithuanian, Luxembourgish, Malayalam, Marathi, Macedonian, Malagasy, Maltese, Mongolian, Maori, Malay, Burmese, Nepali, Dutch, Norwegian, Northern Sotho, Nyanja, Odia, Punjabi, Persian, Polish, Portuguese, Pashto, Romanian, Russian, Sinhala, Slovak, Slovenian, Samoan, Shona, Sindhi, Somali, Southern Sotho, Spanish, Albanian, Serbian, Sundanese, Swahili, Swedish, Tamil, Telugu, Tajik, Thai, Turkish, Twi, Ukrainian, Urdu, Uzbek, Vietnamese, Xhosa, Yiddish, Yoruba, Chinese, Zulu |
Model Name | Mistral-7B-Instruct-Aya-101 |
License | Apache-2.0 |