Mistral 7B Instruct Aya 101
A multilingual instruction-following model fine-tuned from Mistral-7B-Instruct-v0.2, supporting 101 languages
Downloads: 92
Released: 2/27/2024
Model Overview
This is an instruction-following model fine-tuned from Mistral-7B-Instruct-v0.2 and optimized for multilingual use; it understands and generates text in a wide range of languages.
Model Features
Multilingual support
Understands and generates text in 101 languages
Instruction following
Accurately interprets and carries out complex natural-language instructions
Safe responses
Built-in safety behavior to avoid generating harmful or unethical content
Model Capabilities
Multilingual text generation
Instruction understanding and execution
Technical concept explanation
Story writing
Multilingual translation
Use Cases
Education
Multilingual learning assistant
Helps students understand technical concepts across languages
Provides accurate explanations in multiple languages
Content creation
Multilingual story writing
Generates story content in multiple languages from user instructions
The example below shows a short story about Goku and Kirby
Technical explanation
Explaining technical concepts
Explains complex technologies such as Docker in multiple languages
Provides a detailed analysis of pros and cons
🚀 Mistral-7B-Instruct-Aya-101
This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.2 on the CohereForAI/aya_dataset and can be used for text-processing tasks across many languages.
🚀 Quick Start
Usage example
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="MaziyarPanahi/Mistral-7B-Instruct-Aya-101")
# Load the model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("MaziyarPanahi/Mistral-7B-Instruct-Aya-101")
model = AutoModelForCausalLM.from_pretrained("MaziyarPanahi/Mistral-7B-Instruct-Aya-101")
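A minimal generation sketch building on the snippet above. The prompt wrapping follows the `[INST] ... [/INST]` template used throughout this card; the `device_map` setting, sampling parameters, and sample prompt are illustrative assumptions, not part of the original card:

```python
# Sketch: wrap an instruction in the [INST] template and generate a reply.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="MaziyarPanahi/Mistral-7B-Instruct-Aya-101",
    device_map="auto",  # assumption: accelerate is installed; remove to run on CPU
)

prompt = "[INST] Explain Docker in one short paragraph, in French. [/INST]"
result = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7)
print(result[0]["generated_text"])
```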
💻 Usage Examples
Basic usage
<s> [INST] You are a sentient, superintelligent artificial general intelligence, here to teach and assist me. Write a short story about Goku discovering kirby has teamed up with Majin Buu to destroy the world.[/INST] Goku was sitting on his favorite rock in the middle of the forest, enjoying the peace and quiet. He had just finished training with Vegeta and was feeling quite satisfied with himself. Suddenly, he sensed a powerful presence approaching.
"Who goes there?" Goku called out, standing up and preparing for battle.
A figure appeared before him, and Goku's eyes widened in shock. It was Kirby! But why was he here? And why did he look so different?
"Goku-san!" Kirby cried, rushing forward and hugging the Saiyan prince. "I have come to ask for your help!"
Goku was confused but happy to see his old friend. "What's going on, Kirby? What do you need my help with?"
Kirby looked around nervously before speaking. "Majin Buu has joined forces with me, and we plan to destroy the world!"
Goku was stunned. "What?! Why would you do that, Kirby?!"
"It's not what it seems," Kirby explained. "Buu promised to make the world a better place if I helped him. We will only destroy the evil beings and leave the good ones alone."
Goku was torn. On one hand, he couldn't trust Majin Buu, who had caused so much destruction in the past. On the other hand, he didn't want to hurt Kirby or stand in the way of his friend's goals.
"I can't let you do this, Kirby," Goku said finally. "We can't trust Majin Buu. He'll only cause more harm in the end."
Kirby sighed. "I understand, Goku-san. But I must follow my own path now. Farewell, my friend."
With a heavy heart, Goku watched as Kirby and Majin Buu disappeared into the distance, ready to wreak havoc on the world. Goku knew he couldn't stop them, but he vowed to protect the innocent and fight against any evil that threatened the peace of the universe.</s>
Advanced usage
<s> [INST] You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.
describe about pros and cons of docker system.[/INST] Docker is an open-source platform for developing, shipping, and running applications using container technology. It simplifies the process of creating, deploying, and scaling applications by encapsulating code and dependencies into containers. Here are some advantages and disadvantages of using Docker:
Advantages:
1. Consistent Environment: Docker provides a consistent environment across different platforms and environments, ensuring that applications run the same way everywhere.
2. Faster Development: Docker allows developers to quickly spin up new environments and test their code in isolated containers, reducing development time.
3. Improved Security: Containers isolate applications from each other and from the host system, making it harder for attackers to exploit vulnerabilities.
4. Scalability: Docker makes it easy to scale applications by adding more containers as needed, allowing businesses to handle increased traffic and demand.
5. Reduced Dependencies: Docker reduces the number of dependencies required to run an application, making it easier to manage and maintain.
6. Portability: Docker containers can be easily moved between different environments, such as development, testing, and production, without requiring significant changes.
7. Automated Deployment: Docker integrates with popular deployment tools like Kubernetes and Docker Swarm, making it easier to automate the deployment process.
8. Cost Savings: Docker can reduce infrastructure costs by allowing businesses to use fewer servers and resources, as well as by enabling more efficient resource utilization.
Disadvantages:
1. Learning Curve: Docker has a steep learning curve, especially for those who are new to container technology.
2. Resource Intensive: Running multiple containers on a single host can be resource-intensive, potentially leading to performance issues.
3. Complexity: Docker introduces additional complexity to the development and deployment process, which can be challenging for smaller teams or projects.
4. Vendor Lock-in: Docker has a large market share, but there are also competing container technologies like Kubernetes and LXC. Choosing one over the others can lead to vendor lock-in.
5. Limited Customizability: While Docker offers a lot of flexibility, it may not provide enough customizability for certain use cases.
6. Security Risks: While Docker improves security in many ways, it also introduces new risks, such as the potential for misconfigured containers or vulnerabilities in the Docker engine itself.
7. Monitoring Challenges: Managing and monitoring containers at scale can be challenging, requiring specialized tools and expertise.</s>
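Both transcripts above were produced with plain `[INST] ... [/INST]` prompts. The same string can also be built with the tokenizer's chat template; the following is a sketch that assumes the tokenizer inherits Mistral-7B-Instruct-v0.2's template unchanged:

```python
# Sketch: build the [INST] prompt via the tokenizer's chat template
# (assumption: the Mistral-Instruct-v0.2 template is inherited unchanged).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("MaziyarPanahi/Mistral-7B-Instruct-Aya-101")
messages = [{"role": "user", "content": "describe about pros and cons of docker system."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # expected: "<s>[INST] describe about pros and cons of docker system. [/INST]"
```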
🔧 Technical Details
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 4
- total_train_batch_size: 32 (train_batch_size 2 × num_devices 4 × gradient_accumulation_steps 4)
- total_eval_batch_size: 8
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- num_epochs: 1
Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 1.9039 | 0.0 | 1 | 3.1495 |
| 0.9933 | 0.25 | 101 | 1.2402 |
| 0.9439 | 0.5 | 202 | 1.1683 |
| 0.9762 | 0.75 | 303 | 1.1502 |
Framework versions
- Transformers: 4.39.0.dev0
- Pytorch: 2.2.0+cu121
- Datasets: 2.17.0
- Tokenizers: 0.15.0
Axolotl Configuration
See the axolotl config below
axolotl version: 0.4.0
base_model: mistralai/Mistral-7B-Instruct-v0.2
model_type: MistralForCausalLM
tokenizer_type: LlamaTokenizer
is_mistral_derived_model: true
hub_model_id: MaziyarPanahi/Mistral-7B-Instruct-KhanAcademy-v0.2
hf_use_auth_token: true
load_in_8bit: false
load_in_4bit: false
strict: false
datasets:
  - path: CohereForAI/aya_dataset
    type:
      system_prompt: ""
      field_instruction: inputs
      field_output: targets
      format: "[INST] {instruction} [/INST]"
      no_input_format: "[INST] {instruction} [/INST]"
dataset_prepared_path:
val_set_size: 0.05
output_dir: ./models/MaziyarPanahi/Mistral-7B-Instruct-Aya-101
sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true
eval_sample_packing: false
wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:
gradient_accumulation_steps: 4
micro_batch_size: 2
num_epochs: 1
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.000005
train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false
gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true
warmup_steps: 10
evals_per_epoch: 4
eval_table_size:
eval_max_new_tokens: 128
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
special_tokens:
  bos_token: "<s>"
  eos_token: "</s>"
  unk_token: "<unk>"
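To reproduce the run, a config like the one above is passed to axolotl's trainer. A minimal launch sketch, assuming axolotl 0.4.0 is installed and the YAML above is saved as `config.yml` (an illustrative filename):

```bash
# Sketch: launch fine-tuning with the config above (filename is illustrative).
accelerate launch -m axolotl.cli.train config.yml
```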
📄 License
This model is released under the Apache-2.0 license.
| Attribute | Details |
|---|---|
| Base model | mistralai/Mistral-7B-Instruct-v0.2 |
| Training dataset | CohereForAI/aya_dataset |
| Supported languages | Afrikaans, Amharic, Arabic, Azerbaijani, Belarusian, Bengali, Bulgarian, Catalan, Cebuano, Czech, Welsh, Danish, German, Greek, English, Esperanto, Estonian, Basque, Finnish, Filipino, French, Frisian, Scottish Gaelic, Irish, Galician, Gujarati, Haitian Creole, Hausa, Hebrew, Hindi, Hungarian, Armenian, Igbo, Indonesian, Icelandic, Italian, Javanese, Japanese, Kannada, Georgian, Kazakh, Khmer, Kyrgyz, Korean, Kurdish, Lao, Latvian, Latin, Lithuanian, Luxembourgish, Malayalam, Marathi, Macedonian, Malagasy, Maltese, Mongolian, Maori, Malay, Burmese, Nepali, Dutch, Norwegian, Northern Sotho, Nyanja, Odia, Punjabi, Persian, Polish, Portuguese, Pashto, Romanian, Russian, Sinhala, Slovak, Slovenian, Samoan, Shona, Sindhi, Somali, Southern Sotho, Spanish, Albanian, Serbian, Sundanese, Swahili, Swedish, Tamil, Telugu, Tajik, Thai, Turkish, Twi, Ukrainian, Urdu, Uzbek, Vietnamese, Xhosa, Yiddish, Yoruba, Chinese, Zulu |
| Model name | Mistral-7B-Instruct-Aya-101 |
| License | Apache-2.0 |