MN-2407-DSK-QwQify-v0.1-12B open-source model: QwQ-style thinking for roleplay writing

MN 2407 DSK QwQify V0.1 12B

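Developed by BeaverAI

A test model fine-tuned from PocketDoc/Dans-SakuraKaze-V1.0.0-12b. It aims to give an existing model QwQ's thinking pattern and is suited to roleplay, adventure, and co-writing scenarios.

Large language model

Transformers

License: Apache-2.0 #RoleplayEnhancement #QwQStyleTransfer #MultiTurnDialogueOptimization

Downloads: 734

Released: 3/15/2025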

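Model overview

The model is fine-tuned from PocketDoc/Dans-SakuraKaze-V1.0.0-12b (a roleplay/adventure/co-writing model) and uses LoRA to give it QwQ's thinking pattern. It is mainly intended for dialogue generation and co-writing.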

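Model features

QwQ thinking pattern

Fine-tuning gives the model QwQ-style thinking and expression, so its dialogue and writing follow that style.

Roleplay optimization

Tuned specifically for roleplay, so it can generate dialogue that stays consistent with a character definition.

Co-writing support

Suited to co-writing tasks, where it generates coherent text that follows from the existing context.

ChatML format support

Uses the ChatML prompt format, which simplifies dialogue management and context control.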

Model capabilities

Text generation

Roleplay dialogue

Co-writing

Stylized text generation

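Use cases

Entertainment

Roleplaying games

Acts as an in-game NPC for interactive dialogue

Generates natural dialogue that matches the character definition

Interactive story creation

Creates interactive stories together with the user

Generates coherent plot developments that fit the context

Creative writing

Co-writing assistant

Helps writers with creative writing

Offers creative suggestions and text continuations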

🚀 BeaverAI/MN-2407-DSK-QwQify-v0.1-12B

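This is a test model that aims to give an existing model QwQ's thinking. This initial version is built on PocketDoc/Dans-SakuraKaze-V1.0.0-12b (a roleplay/adventure/co-writing model), which was trained on top of PocketDoc/Dans-PersonalityEngine-V1.1.0-12b (a general-purpose instruct model), which in turn is based on mistralai/Mistral-Nemo-Base-2407.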

GGUF

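🚀 Quick start

Prompt format and usage

Prompt format and usage should be the same as for the QwQ model: use ChatML, and strip the thinking content from previous turns. If the model does not start thinking on its own, prefill the assistant reply with <think>\n.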
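The prompt handling described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's tooling: the function names are hypothetical, and the closing `</think>` tag is an assumption based on QwQ's usual output, since this card only mentions the opening `<think>\n`.

```python
import re

# Matches one <think>...</think> block plus any whitespace after it.
# Assumption: thinking is closed with </think>, as in QwQ output.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_thinking(text: str) -> str:
    """Remove thinking blocks from an earlier assistant reply."""
    return THINK_BLOCK.sub("", text).strip()


def build_chatml_prompt(messages, prefill_think=True):
    """Render messages as ChatML and open a new assistant turn.

    Earlier assistant turns have their thinking removed. If
    prefill_think is True, the new turn starts with "<think>\n".
    """
    parts = []
    for msg in messages:
        content = msg["content"]
        if msg["role"] == "assistant":
            content = strip_thinking(content)
        parts.append(f"<|im_start|>{msg['role']}\n{content}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")
    if prefill_think:
        parts.append("<think>\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are Mira, a wandering bard."},
    {"role": "user", "content": "Alex: Who goes there?"},
    {
        "role": "assistant",
        "content": "<think>\nStay in character.\n</think>\n\nMira: Only a bard.",
    },
    {"role": "user", "content": "Alex: Sing us a song, then."},
]

prompt = build_chatml_prompt(messages)
print(prompt)
```

The resulting string ends with an open assistant turn, so it can be sent to any raw text-completion endpoint that accepts a preformatted prompt.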

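Conversation format

The model should follow the format of the previous turns. In the first few turns you may need to regenerate replies several times and edit the model's responses until the result is satisfactory.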

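Prefix settings

Consider disabling the automatically inserted {{char}}: prefix for the character, and instead add something like the following to the end of the system prompt: Only speak as "{{char}}" in conversation with "{{user}}". Begin the final response with "{{char}}:".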
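For setups that assemble the system prompt by hand instead of relying on a frontend's {{char}} and {{user}} macros, the suggestion above might look like the sketch below. The helper name is hypothetical, and the instruction wording is an English rendering of the suggestion, not a required exact string.

```python
def add_speaker_instruction(system_prompt: str, char: str, user: str) -> str:
    """Append the speaker instruction to the end of a system prompt."""
    suffix = (
        f'Only speak as "{char}" in conversation with "{user}". '
        f'Begin the final response with "{char}:".'
    )
    # Separate the instruction from the existing prompt with a blank line.
    return f"{system_prompt.rstrip()}\n\n{suffix}"


system_prompt = add_speaker_instruction(
    "You are Mira, a wandering bard.  ", char="Mira", user="Alex"
)
print(system_prompt)
```

With this in place the model is expected to write the "Mira:" prefix itself after its thinking, which is why the frontend's own prefix insertion is turned off.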

✨ Key features

This model was fine-tuned from PocketDoc/Dans-SakuraKaze-V1.0.0-12b on the following datasets:

PJMixers-Dev/allura-org_gryphe-sonnet-3.5-charcards-names-added-qwq-all-aphrodite-Shuffled
PJMixers-Dev/anthracite-org_c2_logs_32k_llama3_qwen2_v1.3-qwq-all-aphrodite-Shuffled
PJMixers-Dev/grimulkan_aicg-logs-augmented-system-qwq-all-aphrodite-Shuffled
PJMixers-Dev/grimulkan_jannie-log-augmented-system-qwq-all-aphrodite-Shuffled
PJMixers-Dev/grimulkan_PIPPA-augmented-dedup-system-qwq-all-aphrodite-Shuffled
PJMixers-Dev/lemonilia_LimaRP-Only-NonSus-Simple-CustomShareGPT-qwq-all-aphrodite-Shuffled
PJMixers-Dev/MinervaAI_Aesir-Preview-Anon-qwq-all-aphrodite-Shuffled
PJMixers-Dev/NyxKrage_chub-logs-sharegpt-longest-CustomShareGPT-qwq-all-aphrodite-Shuffled
PJMixers-Dev/PocketDoc_Dans-Prosemaxx-Cowriter-XL-8192-shrunk-l3-qwq-all-aphrodite-Shuffled
PJMixers-Dev/PocketDoc_Dans-Personamaxx-Rainy-qwq-all-aphrodite-Shuffled

It achieves the following results on the evaluation set:

Loss: 1.2770
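To make the loss figure easier to read, it can be converted to perplexity and bits per token. This is a small worked example that assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this card.

```python
import math

eval_loss = 1.2770  # final evaluation loss reported above

# Assumption: mean per-token cross-entropy measured in nats.
perplexity = math.exp(eval_loss)
bits_per_token = eval_loss / math.log(2)

print(f"perplexity: {perplexity:.2f}")
print(f"bits per token: {bits_per_token:.2f}")
```

Because only the final turn of each sample contributes to the loss in this training setup (train_on_inputs is false), the figure is not directly comparable to perplexities measured over whole documents.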

🔧 Technical details

Axolotl config

axolotl version: 0.8.0.dev0

mlflow_tracking_uri: http://127.0.0.1:7860
mlflow_experiment_name: MN-2407-DSK-QwQify-v0.1-12B-LoRA

# Hugging Face upload settings
hub_model_id: BeaverAI/MN-2407-DSK-QwQify-v0.1-12B-LoRA-WS
hub_strategy: every_save

# Model checkpointing settings
output_dir: ./Outputs/MN-2407-DSK-QwQify-v0.1-12B-LoRA
resume_from_checkpoint:
save_steps: 25
save_safetensors: true
save_total_limit: 3
save_only_model: false

# Model architecture settings
base_model: PocketDoc/Dans-SakuraKaze-V1.0.0-12b
model_type: MistralForCausalLM
tokenizer_type: PreTrainedTokenizerFast

# Mixed-precision training settings
bf16: true
fp16: false
tf32: false

# Model loading settings
load_in_8bit: false
load_in_4bit: false
strict: false

# Sequence settings
sequence_len: 8192
min_sample_len: 256
sample_packing: true
eval_sample_packing: true
pad_to_sequence_len: true
train_on_inputs: false
group_by_length: false

# LoRA adapter settings
adapter: lora
lora_model_dir:
lora_r: 128
lora_alpha: 128
lora_dropout: 0.125
peft_layers_to_transform:
peft_use_dora:
peft_use_rslora:
peft_layer_replication:
lora_target_modules:
  - gate_proj
  - down_proj
  - up_proj
  - q_proj
  - v_proj
  - k_proj
  - o_proj
lora_modules_to_save:

# Fix untrained tokens (e.g. <|start_header_id|> on base L3 models)
fix_untrained_tokens:

# Dataset settings
# https://github.com/xzuyn/axolotl/blob/came-plus-formatters/src/axolotl/prompt_strategies/customchatml-regex-last-only.py
datasets:
  - path: PJMixers-Dev/allura-org_gryphe-sonnet-3.5-charcards-names-added-qwq-all-aphrodite-Shuffled
    split: train
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/anthracite-org_c2_logs_32k_llama3_qwen2_v1.3-qwq-all-aphrodite-Shuffled
    split: train
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/grimulkan_aicg-logs-augmented-system-qwq-all-aphrodite-Shuffled
    split: train
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/grimulkan_jannie-log-augmented-system-qwq-all-aphrodite-Shuffled
    split: train
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/grimulkan_PIPPA-augmented-dedup-system-qwq-all-aphrodite-Shuffled
    split: train
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/lemonilia_LimaRP-Only-NonSus-Simple-CustomShareGPT-qwq-all-aphrodite-Shuffled
    split: train
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/MinervaAI_Aesir-Preview-Anon-qwq-all-aphrodite-Shuffled
    split: train
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/NyxKrage_chub-logs-sharegpt-longest-CustomShareGPT-qwq-all-aphrodite-Shuffled
    split: train
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/PocketDoc_Dans-Prosemaxx-Cowriter-XL-8192-shrunk-l3-qwq-all-aphrodite-Shuffled
    split: train
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/PocketDoc_Dans-Personamaxx-Rainy-qwq-all-aphrodite-Shuffled
    split: train
    type: customchatml-regex-last-only
test_datasets:
  - path: PJMixers-Dev/allura-org_gryphe-sonnet-3.5-charcards-names-added-qwq-all-aphrodite-Shuffled
    split: test
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/anthracite-org_c2_logs_32k_llama3_qwen2_v1.3-qwq-all-aphrodite-Shuffled
    split: test
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/grimulkan_aicg-logs-augmented-system-qwq-all-aphrodite-Shuffled
    split: test
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/grimulkan_jannie-log-augmented-system-qwq-all-aphrodite-Shuffled
    split: test
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/grimulkan_PIPPA-augmented-dedup-system-qwq-all-aphrodite-Shuffled
    split: test
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/lemonilia_LimaRP-Only-NonSus-Simple-CustomShareGPT-qwq-all-aphrodite-Shuffled
    split: test
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/MinervaAI_Aesir-Preview-Anon-qwq-all-aphrodite-Shuffled
    split: test
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/NyxKrage_chub-logs-sharegpt-longest-CustomShareGPT-qwq-all-aphrodite-Shuffled
    split: test
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/PocketDoc_Dans-Prosemaxx-Cowriter-XL-8192-shrunk-l3-qwq-all-aphrodite-Shuffled
    split: test
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/PocketDoc_Dans-Personamaxx-Rainy-qwq-all-aphrodite-Shuffled
    split: test
    type: customchatml-regex-last-only
val_set_size: 0
eval_strategy: steps
eval_steps: 25
dataset_prepared_path: ./00-Tokenized-Datasets/MN-2407-DSK-QwQify-v0.1-12B-customchatml-regex-last-only
shuffle_merged_datasets: true
dataset_processes:

# Training hyperparameters
num_epochs: 2
gradient_accumulation_steps: 1
micro_batch_size: 16  # x4 GPUs = 64
eval_batch_size: 16   # x4 GPUs = 64
warmup_steps: 0
optimizer: came_pytorch
optim_args:
optim_target_modules:
lr_scheduler: rex
learning_rate: 2e-5
cosine_min_lr_ratio:
loraplus_lr_ratio:
loraplus_lr_embedding:
weight_decay: 0.1
max_grad_norm: 1
logging_steps: 1

# Model optimizations
gradient_checkpointing: unsloth
flash_attention: true
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false
liger_fused_linear_cross_entropy: false
lora_mlp_kernel: false
lora_qkv_kernel: false
lora_o_kernel: false

# DeepSpeed
deepspeed: deepspeed_configs/zero3_bf16.json

# Garbage collection
gc_steps: 1

# Debug settings
debug: true
seed: 42

# Token settings
special_tokens:
  bos_token: "<s>"
  eos_token: "<|im_end|>"
  pad_token: "<pad>"
tokens:
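
The LoRA settings in the config above (lora_r: 128, lora_alpha: 128, seven target projections) imply a fairly large adapter. The sketch below gives a rough count of trainable parameters. It assumes the published Mistral-Nemo-Base-2407 layer shapes (hidden size 5120, MLP size 14336, 40 layers, 32 query heads and 8 key/value heads of dimension 128), none of which are stated in this card, and it assumes the standard LoRA scaling of alpha / r, since peft_use_rslora is left unset.

```python
# LoRA values taken from the config above.
LORA_R = 128
LORA_ALPHA = 128

# Assumed Mistral-Nemo layer shapes (not stated in this card).
HIDDEN = 5120
MLP = 14336
NUM_LAYERS = 40
Q_DIM = 32 * 128   # query heads * head dim
KV_DIM = 8 * 128   # key/value heads * head dim

# (input features, output features) for each targeted projection.
TARGETS = {
    "q_proj": (HIDDEN, Q_DIM),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (Q_DIM, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}

# Each adapted projection adds A (r x in) and B (out x r).
per_layer = sum(LORA_R * (n_in + n_out) for n_in, n_out in TARGETS.values())
total_trainable = per_layer * NUM_LAYERS
scaling = LORA_ALPHA / LORA_R

print(f"trainable LoRA parameters per layer: {per_layer:,}")
print(f"trainable LoRA parameters in total: {total_trainable:,}")
print(f"LoRA scaling factor (alpha / r): {scaling}")
```

Under these assumptions the adapter has about 456 million trainable parameters, and the scaling factor of 1.0 means the low-rank update is added to the base weights without extra rescaling.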

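Training hyperparameters

The following hyperparameters were used during training:

Learning rate: 2e-05
Train batch size: 16
Eval batch size: 16
Seed: 42
Distributed type: multi-GPU
Number of devices: 4
Total train batch size: 64
Total eval batch size: 64
Optimizer: OptimizerNames.ADAMW_HF with betas=(0.9,0.999) and epsilon=1e-08, no additional optimizer arguments
LR scheduler type: cosine
Number of epochs: 2.0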
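
The total batch size of 64 follows directly from the per-device values. The short calculation below reproduces it and adds an upper bound on tokens per optimizer step, using gradient_accumulation_steps: 1 and sequence_len: 8192 from the axolotl config. The token figure is a ceiling, not a measurement: it assumes every packed sequence is completely full.

```python
micro_batch_size = 16               # per-device train batch size
num_devices = 4                     # number of GPUs
gradient_accumulation_steps = 1     # from the axolotl config
sequence_len = 8192                 # from the axolotl config

# Sequences consumed per optimizer step across all devices.
total_batch_size = micro_batch_size * num_devices * gradient_accumulation_steps

# Upper bound, assuming every packed sequence is filled to sequence_len.
max_tokens_per_step = total_batch_size * sequence_len

print(f"total train batch size: {total_batch_size}")
print(f"max tokens per optimizer step: {max_tokens_per_step:,}")
```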

Training results

Training loss	Epoch	Step	Validation loss
2.134	0.0038	1	2.0025
1.6185	0.0951	25	1.5748
1.5187	0.1901	50	1.4871
1.4757	0.2852	75	1.4410
1.4008	0.3802	100	1.4100
1.4116	0.4753	125	1.3857
1.357	0.5703	150	1.3630
1.3435	0.6654	175	1.3478
1.3332	0.7605	200	1.3353
1.3042	0.8555	225	1.3308
1.2993	0.9506	250	1.3228
1.3105	1.0456	275	1.3154
1.2782	1.1407	300	1.3094
1.3063	1.2357	325	1.3070
1.3003	1.3308	350	1.3005
1.2937	1.4259	375	1.2952
1.283	1.5209	400	1.2922
1.2692	1.6160	425	1.2887
1.2639	1.7110	450	1.2855
1.2546	1.8061	475	1.2822
1.2711	1.9011	500	1.2787
1.2492	1.9962	525	1.2770
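
A few summary figures can be read off the table above with a short script. The rows are copied verbatim from the table. The steps-per-epoch value is an estimate derived from the last row's step and epoch columns, not a number reported by the trainer.

```python
# Rows copied from the table above:
# training loss, epoch, step, validation loss.
RESULTS = """\
2.134 0.0038 1 2.0025
1.6185 0.0951 25 1.5748
1.5187 0.1901 50 1.4871
1.4757 0.2852 75 1.4410
1.4008 0.3802 100 1.4100
1.4116 0.4753 125 1.3857
1.357 0.5703 150 1.3630
1.3435 0.6654 175 1.3478
1.3332 0.7605 200 1.3353
1.3042 0.8555 225 1.3308
1.2993 0.9506 250 1.3228
1.3105 1.0456 275 1.3154
1.2782 1.1407 300 1.3094
1.3063 1.2357 325 1.3070
1.3003 1.3308 350 1.3005
1.2937 1.4259 375 1.2952
1.283 1.5209 400 1.2922
1.2692 1.6160 425 1.2887
1.2639 1.7110 450 1.2855
1.2546 1.8061 475 1.2822
1.2711 1.9011 500 1.2787
1.2492 1.9962 525 1.2770
"""

rows = [
    tuple(float(value) for value in line.split())
    for line in RESULTS.strip().splitlines()
]

first, last = rows[0], rows[-1]
val_losses = [row[3] for row in rows]

# Total improvement in validation loss over the run.
val_drop = first[3] - last[3]

# Estimated optimizer steps per epoch, from the last logged row.
steps_per_epoch = last[2] / last[1]

# True if validation loss fell at every evaluation.
monotonic = all(a > b for a, b in zip(val_losses, val_losses[1:]))

print(f"validation loss drop: {val_drop:.4f}")
print(f"estimated steps per epoch: {steps_per_epoch:.0f}")
print(f"validation loss decreased at every eval: {monotonic}")
```

Validation loss is still falling at the final evaluation, although the improvement per 25 steps has shrunk to a few thousandths by the end of the second epoch.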