# 🚀 Pre-trained Language Model Merge
This repository contains a merge of pre-trained language models created with [mergekit](https://github.com/arcee-ai/mergekit). By combining several pre-trained models, the merge aims to improve performance and broaden applicability across natural language processing tasks.
## Base Model Information
| Attribute | Details |
| --- | --- |
| Base models | DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS, mistralai/Mistral-Nemo-Base-2407, mistralai/Mistral-Nemo-Instruct-2407, redrix/sororicide-12B-Farer-Mell-Unslop, mergekit-community/MN-Chthonia-12B, yamatazen/EtherealAurora-12B-v2, mergekit-community/MN-Anathema-12B, mergekit-community/MN-Ephemeros-12B, jtatman/mistral_nemo_12b_reasoning_psychology_lora |
| Library | transformers |
| Tags | mergekit, merge |
## 🚀 Merge Details

### Merge Method

This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method, with [mistralai/Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) as the base model.
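A merge defined by a mergekit YAML file (see the configuration below) can be run either with the `mergekit-yaml` command-line tool or through mergekit's Python interface. The following is a minimal, non-authoritative sketch of the latter; it assumes the configuration from this card has been saved as `config.yaml`, and `./merged-model` is a placeholder output path.

```python
# Minimal sketch of running a merge through mergekit's Python interface.
# Assumes the YAML configuration from this card is saved as config.yaml;
# "./merged-model" is a placeholder output directory.
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yaml", "r", encoding="utf-8") as f:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

run_merge(
    merge_config,
    "./merged-model",
    options=MergeOptions(
        cuda=False,            # set True to merge on a GPU
        copy_tokenizer=True,   # materialize the union tokenizer defined below
        lazy_unpickle=True,    # lower peak memory while reading shards
    ),
)
```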
### Models Merged

The following models were included in the merge:
- [DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS](https://huggingface.co/DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS)
- [mistralai/Mistral-Nemo-Base-2407](https://huggingface.co/mistralai/Mistral-Nemo-Base-2407)
- [redrix/sororicide-12B-Farer-Mell-Unslop](https://huggingface.co/redrix/sororicide-12B-Farer-Mell-Unslop)
- [mergekit-community/MN-Chthonia-12B](https://huggingface.co/mergekit-community/MN-Chthonia-12B)
- [yamatazen/EtherealAurora-12B-v2](https://huggingface.co/yamatazen/EtherealAurora-12B-v2)
- [mergekit-community/MN-Anathema-12B](https://huggingface.co/mergekit-community/MN-Anathema-12B)
- [mergekit-community/MN-Ephemeros-12B](https://huggingface.co/mergekit-community/MN-Ephemeros-12B) + jtatman/mistral_nemo_12b_reasoning_psychology_lora
### Configuration

The following YAML configuration was used to produce this model:
```yaml
dtype: float32
out_dtype: bfloat16
merge_method: model_stock
base_model: mistralai/Mistral-Nemo-Instruct-2407
models:
  - model: DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS
  - model: mergekit-community/MN-Anathema-12B
  - model: mergekit-community/MN-Chthonia-12B
  - model: mergekit-community/MN-Ephemeros-12B+jtatman/mistral_nemo_12b_reasoning_psychology_lora
    parameters:
      weight: 0.7
  - model: mistralai/Mistral-Nemo-Base-2407
    parameters:
      weight: 0.5
  - model: redrix/sororicide-12B-Farer-Mell-Unslop
  - model: yamatazen/EtherealAurora-12B-v2
    parameters:
      weight: 1.4
tokenizer:
  source: union
  tokens:
    "<|im_start|>":
      source: yamatazen/EtherealAurora-12B-v2
    "<|im_end|>":
      source: yamatazen/EtherealAurora-12B-v2
    "[INST]":
      source: mistralai/Mistral-Nemo-Instruct-2407
    "[/INST]":
      source: mistralai/Mistral-Nemo-Instruct-2407
chat_template: chatml
```
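Because the configuration sets `chat_template: chatml`, the merged model expects ChatML-formatted prompts wrapped in the `<|im_start|>`/`<|im_end|>` tokens sourced from EtherealAurora. Below is a minimal inference sketch with 🤗 Transformers; the model path `./merged-model` is a placeholder for wherever the merged weights end up.

```python
# Minimal inference sketch; "./merged-model" is a placeholder path or repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "./merged-model"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,  # matches out_dtype in the config above
    device_map="auto",
)

# chat_template: chatml wraps each turn in <|im_start|> ... <|im_end|>.
messages = [{"role": "user", "content": "Explain model merging in one paragraph."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```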