# 🚀 Pretrained Language Model Merge Project
This project is the result of merging pretrained language models with mergekit. By combining several pretrained models, it aims to improve performance and broaden applicability, providing stronger support for natural language processing tasks.
## Base Model Information

| Attribute | Details |
| --- | --- |
| Base models | DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS, mistralai/Mistral-Nemo-Base-2407, mistralai/Mistral-Nemo-Instruct-2407, redrix/sororicide-12B-Farer-Mell-Unslop, mergekit-community/MN-Chthonia-12B, yamatazen/EtherealAurora-12B-v2, mergekit-community/MN-Anathema-12B, mergekit-community/MN-Ephemeros-12B, jtatman/mistral_nemo_12b_reasoning_psychology_lora |
| Library | transformers |
| Tags | mergekit, merge |
## 🚀 Merge Details
### Merge Method

This model was merged using the Model Stock merge method, with [mistralai/Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) as the base model.
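At a high level, Model Stock interpolates each base weight toward the average of the fine-tuned models' weights, with the interpolation ratio determined by the angle between the fine-tuned "task vectors". The single-tensor NumPy sketch below illustrates that idea; the function name and the pairwise-cosine averaging are illustrative simplifications, not mergekit's exact implementation:

```python
import numpy as np

def model_stock_merge(w0, finetuned):
    """Toy, single-tensor sketch of a Model Stock-style merge.

    w0        -- base-model weight vector
    finetuned -- list of fine-tuned weight vectors (len >= 2)
    """
    k = len(finetuned)
    task_vectors = [w - w0 for w in finetuned]

    # Estimate cos(theta) as the mean pairwise cosine between task vectors.
    cosines = []
    for i in range(k):
        for j in range(i + 1, k):
            a, b = task_vectors[i], task_vectors[j]
            cosines.append(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    cos_theta = float(np.mean(cosines))

    # Interpolation ratio toward the fine-tuned average: more agreement
    # between task vectors (cos closer to 1) pulls further from the base.
    t = k * cos_theta / (1 + (k - 1) * cos_theta)

    w_avg = np.mean(finetuned, axis=0)
    return t * w_avg + (1 - t) * w0
```

With identical fine-tuned models (cos θ = 1) this reduces to their plain average; with orthogonal task vectors (cos θ = 0) it falls back to the base weights.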
### Models Merged

The following models were included in the merge:
- [DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS](https://huggingface.co/DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS)
- [mistralai/Mistral-Nemo-Base-2407](https://huggingface.co/mistralai/Mistral-Nemo-Base-2407)
- [redrix/sororicide-12B-Farer-Mell-Unslop](https://huggingface.co/redrix/sororicide-12B-Farer-Mell-Unslop)
- [mergekit-community/MN-Chthonia-12B](https://huggingface.co/mergekit-community/MN-Chthonia-12B)
- [yamatazen/EtherealAurora-12B-v2](https://huggingface.co/yamatazen/EtherealAurora-12B-v2)
- [mergekit-community/MN-Anathema-12B](https://huggingface.co/mergekit-community/MN-Anathema-12B)
- [mergekit-community/MN-Ephemeros-12B](https://huggingface.co/mergekit-community/MN-Ephemeros-12B) + jtatman/mistral_nemo_12b_reasoning_psychology_lora
### Configuration

The following YAML configuration was used to produce this model:
```yaml
dtype: float32
out_dtype: bfloat16
merge_method: model_stock
base_model: mistralai/Mistral-Nemo-Instruct-2407
models:
  - model: DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS
  - model: mergekit-community/MN-Anathema-12B
  - model: mergekit-community/MN-Chthonia-12B
  - model: mergekit-community/MN-Ephemeros-12B+jtatman/mistral_nemo_12b_reasoning_psychology_lora
    parameters:
      weight: 0.7
  - model: mistralai/Mistral-Nemo-Base-2407
    parameters:
      weight: 0.5
  - model: redrix/sororicide-12B-Farer-Mell-Unslop
  - model: yamatazen/EtherealAurora-12B-v2
    parameters:
      weight: 1.4
tokenizer:
  source: union
  tokens:
    "<|im_start|>":
      source: yamatazen/EtherealAurora-12B-v2
    "<|im_end|>":
      source: yamatazen/EtherealAurora-12B-v2
    "[INST]":
      source: mistralai/Mistral-Nemo-Instruct-2407
    "[/INST]":
      source: mistralai/Mistral-Nemo-Instruct-2407
chat_template: chatml
```
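The `chat_template: chatml` setting means prompts should be rendered in the ChatML format, using the `<|im_start|>`/`<|im_end|>` tokens sourced from EtherealAurora above. In practice you would call `tokenizer.apply_chat_template(...)` from transformers; the sketch below (hypothetical helper name) only shows the wire format that template produces:

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {role, content} dicts in ChatML wire format."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
```

Because the tokenizer is built with `source: union`, the Mistral-style `[INST]`/`[/INST]` tokens are also present in the vocabulary, but ChatML is the declared default template.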