MN-2407-DSK-QwQify-v0.1-12B open-source model: QwQ-style thinking for roleplay writing


MN 2407 DSK QwQify V0.1 12B

Developed by BeaverAI

This is a test model fine-tuned from PocketDoc/Dans-SakuraKaze-V1.0.0-12b. It aims to give an existing model QwQ-style thinking and is suited to roleplay, adventure, and co-writing.

Large language model

Transformers

License: Apache-2.0 #roleplay-enhancement #QwQ-style-transfer #multi-turn-dialogue-optimization

Downloads: 734

Released: 3/15/2025

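Model overview

This model is fine-tuned from PocketDoc/Dans-SakuraKaze-V1.0.0-12b (a roleplay/adventure/co-writing model) and uses LoRA to give the model QwQ-style thinking. It is intended mainly for dialogue generation and co-writing tasks.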

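Model features

QwQ-style thinking

Fine-tuning gives the model QwQ's characteristic way of reasoning and expressing itself, so its dialogue and writing follow that style.

Roleplay optimization

Tuned specifically for roleplay, so it can generate dialogue that stays consistent with a character definition.

Co-writing support

Suited to co-writing tasks, generating coherent text that follows from the existing context.

ChatML format support

Uses the ChatML prompt format, which simplifies dialogue management and context control.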

Model capabilities

Text generation

Roleplay dialogue

Co-writing

Stylized text generation

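Use cases

Entertainment

Roleplaying games

Acts as an in-game NPC for conversational interaction

Generates natural dialogue that matches the character definition

Interactive story creation

Creates interactive stories together with the user

Generates coherent plot lines that fit the context

Creative writing

Co-writing assistant

Helps writers with creative writing

Offers creative suggestions and text continuations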

🚀 BeaverAI/MN-2407-DSK-QwQify-v0.1-12B

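This is a test model that aims to give an existing model QwQ-style thinking. This initial version is built on PocketDoc/Dans-SakuraKaze-V1.0.0-12b (a roleplay/adventure/co-writing model), which was trained from PocketDoc/Dans-PersonalityEngine-V1.1.0-12b (a general-purpose instruct model), which in turn is based on mistralai/Mistral-Nemo-Base-2407.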

GGUF

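🚀 Quick start

Prompt format and usage

The prompt format and usage should match QwQ's: use ChatML and strip the thinking from previous turns. If the model does not start thinking on its own, prefill the assistant reply with <think>\n.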
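The prompt handling described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's tooling: the helper names `strip_thinking` and `build_chatml_prompt` are hypothetical, the sample conversation is made up, and the sketch assumes that thinking is wrapped in `<think>...</think>` tags and that the tokenizer adds the BOS token on its own.

```python
import re

# Matches a complete <think>...</think> block plus any trailing whitespace.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)


def strip_thinking(text: str) -> str:
    """Remove reasoning blocks from an earlier assistant reply."""
    return THINK_BLOCK.sub("", text).strip()


def build_chatml_prompt(messages: list[dict], prefill_think: bool = True) -> str:
    """Render messages as ChatML and open a new assistant turn."""
    parts = []
    for message in messages:
        content = message["content"]
        if message["role"] == "assistant":
            # Earlier turns keep only the final reply, not the reasoning.
            content = strip_thinking(content)
        parts.append(f"<|im_start|>{message['role']}\n{content}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")
    if prefill_think:
        # Nudge the model to start reasoning if it does not do so unprompted.
        parts.append("<think>\n")
    return "".join(parts)


# Hypothetical conversation used only for illustration.
history = [
    {"role": "system", "content": "You are a narrator."},
    {"role": "user", "content": "Hello."},
    {"role": "assistant", "content": "<think>\nGreet back.\n</think>\n\nHi there."},
    {"role": "user", "content": "Where are we?"},
]
prompt = build_chatml_prompt(history)
print(prompt)
```

Most chat frontends can do both steps through their own settings; the sketch only shows what the resulting prompt string should look like.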

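Conversation format

The model should follow the format of previous turns. In the first few turns of a conversation, you may need to regenerate several times and edit the model's replies to get the result you want.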

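Prefix settings

Consider disabling the {{char}}: prefix insertion for the character and instead adding something like the following to the end of the system prompt: Speak only as "{{char}}" in conversation with "{{user}}". Begin the final reply with "{{char}}:".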
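As a rough sketch of the suggestion above, the instruction can be appended to the system prompt with the {{char}} and {{user}} placeholders filled in. The function name `with_speaker_instruction`, the instruction wording, and the sample names are illustrative assumptions; frontends that support these macros normally perform the substitution themselves.

```python
# Illustrative wording only; adapt it to your own system prompt.
SPEAKER_INSTRUCTION = (
    'Speak only as "{{char}}" in conversation with "{{user}}". '
    'Begin the final reply with "{{char}}:".'
)


def with_speaker_instruction(system_prompt: str, char: str, user: str) -> str:
    """Append the speaker instruction to the end of a system prompt."""
    instruction = SPEAKER_INSTRUCTION.replace("{{char}}", char).replace("{{user}}", user)
    return f"{system_prompt.rstrip()}\n\n{instruction}"


# Hypothetical character and user names.
system_prompt = with_speaker_instruction("You are a storyteller.\n", "Mira", "Alex")
print(system_prompt)
```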


✨ Key features

This model was fine-tuned from PocketDoc/Dans-SakuraKaze-V1.0.0-12b on the following datasets:

PJMixers-Dev/allura-org_gryphe-sonnet-3.5-charcards-names-added-qwq-all-aphrodite-Shuffled
PJMixers-Dev/anthracite-org_c2_logs_32k_llama3_qwen2_v1.3-qwq-all-aphrodite-Shuffled
PJMixers-Dev/grimulkan_aicg-logs-augmented-system-qwq-all-aphrodite-Shuffled
PJMixers-Dev/grimulkan_jannie-log-augmented-system-qwq-all-aphrodite-Shuffled
PJMixers-Dev/grimulkan_PIPPA-augmented-dedup-system-qwq-all-aphrodite-Shuffled
PJMixers-Dev/lemonilia_LimaRP-Only-NonSus-Simple-CustomShareGPT-qwq-all-aphrodite-Shuffled
PJMixers-Dev/MinervaAI_Aesir-Preview-Anon-qwq-all-aphrodite-Shuffled
PJMixers-Dev/NyxKrage_chub-logs-sharegpt-longest-CustomShareGPT-qwq-all-aphrodite-Shuffled
PJMixers-Dev/PocketDoc_Dans-Prosemaxx-Cowriter-XL-8192-shrunk-l3-qwq-all-aphrodite-Shuffled
PJMixers-Dev/PocketDoc_Dans-Personamaxx-Rainy-qwq-all-aphrodite-Shuffled

The model achieves the following results on the evaluation set:

Loss: 1.2770
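To give the evaluation loss above a more intuitive scale, it can be converted to perplexity. The sketch assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in the source.

```python
import math

eval_loss = 1.2770  # evaluation loss reported above

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply its exponential.
perplexity = math.exp(eval_loss)
print(f"perplexity: {perplexity:.2f}")
```

Because only the last turn's tokens appear to be trained on (see `train_on_inputs: false` in the config), this figure is not directly comparable with perplexities measured on full text.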

🔧 Technical details

Axolotl config


axolotl version: 0.8.0.dev0

mlflow_tracking_uri: http://127.0.0.1:7860
mlflow_experiment_name: MN-2407-DSK-QwQify-v0.1-12B-LoRA

# Hugging Face saving config
hub_model_id: BeaverAI/MN-2407-DSK-QwQify-v0.1-12B-LoRA-WS
hub_strategy: every_save

# Model checkpointing config
output_dir: ./Outputs/MN-2407-DSK-QwQify-v0.1-12B-LoRA
resume_from_checkpoint:
save_steps: 25
save_safetensors: true
save_total_limit: 3
save_only_model: false

# Model architecture config
base_model: PocketDoc/Dans-SakuraKaze-V1.0.0-12b
model_type: MistralForCausalLM
tokenizer_type: PreTrainedTokenizerFast

# Mixed precision training config
bf16: true
fp16: false
tf32: false

# Model loading config
load_in_8bit: false
load_in_4bit: false
strict: false

# Sequence config
sequence_len: 8192
min_sample_len: 256
sample_packing: true
eval_sample_packing: true
pad_to_sequence_len: true
train_on_inputs: false
group_by_length: false

# LoRA adapter config
adapter: lora
lora_model_dir:
lora_r: 128
lora_alpha: 128
lora_dropout: 0.125
peft_layers_to_transform:
peft_use_dora:
peft_use_rslora:
peft_layer_replication:
lora_target_modules:
  - gate_proj
  - down_proj
  - up_proj
  - q_proj
  - v_proj
  - k_proj
  - o_proj
lora_modules_to_save:

# Fix untrained tokens (e.g. <|start_header_id|> on a base L3 model)
fix_untrained_tokens:

# Dataset config
# https://github.com/xzuyn/axolotl/blob/came-plus-formatters/src/axolotl/prompt_strategies/customchatml-regex-last-only.py
datasets:
  - path: PJMixers-Dev/allura-org_gryphe-sonnet-3.5-charcards-names-added-qwq-all-aphrodite-Shuffled
    split: train
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/anthracite-org_c2_logs_32k_llama3_qwen2_v1.3-qwq-all-aphrodite-Shuffled
    split: train
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/grimulkan_aicg-logs-augmented-system-qwq-all-aphrodite-Shuffled
    split: train
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/grimulkan_jannie-log-augmented-system-qwq-all-aphrodite-Shuffled
    split: train
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/grimulkan_PIPPA-augmented-dedup-system-qwq-all-aphrodite-Shuffled
    split: train
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/lemonilia_LimaRP-Only-NonSus-Simple-CustomShareGPT-qwq-all-aphrodite-Shuffled
    split: train
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/MinervaAI_Aesir-Preview-Anon-qwq-all-aphrodite-Shuffled
    split: train
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/NyxKrage_chub-logs-sharegpt-longest-CustomShareGPT-qwq-all-aphrodite-Shuffled
    split: train
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/PocketDoc_Dans-Prosemaxx-Cowriter-XL-8192-shrunk-l3-qwq-all-aphrodite-Shuffled
    split: train
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/PocketDoc_Dans-Personamaxx-Rainy-qwq-all-aphrodite-Shuffled
    split: train
    type: customchatml-regex-last-only
test_datasets:
  - path: PJMixers-Dev/allura-org_gryphe-sonnet-3.5-charcards-names-added-qwq-all-aphrodite-Shuffled
    split: test
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/anthracite-org_c2_logs_32k_llama3_qwen2_v1.3-qwq-all-aphrodite-Shuffled
    split: test
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/grimulkan_aicg-logs-augmented-system-qwq-all-aphrodite-Shuffled
    split: test
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/grimulkan_jannie-log-augmented-system-qwq-all-aphrodite-Shuffled
    split: test
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/grimulkan_PIPPA-augmented-dedup-system-qwq-all-aphrodite-Shuffled
    split: test
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/lemonilia_LimaRP-Only-NonSus-Simple-CustomShareGPT-qwq-all-aphrodite-Shuffled
    split: test
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/MinervaAI_Aesir-Preview-Anon-qwq-all-aphrodite-Shuffled
    split: test
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/NyxKrage_chub-logs-sharegpt-longest-CustomShareGPT-qwq-all-aphrodite-Shuffled
    split: test
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/PocketDoc_Dans-Prosemaxx-Cowriter-XL-8192-shrunk-l3-qwq-all-aphrodite-Shuffled
    split: test
    type: customchatml-regex-last-only
  - path: PJMixers-Dev/PocketDoc_Dans-Personamaxx-Rainy-qwq-all-aphrodite-Shuffled
    split: test
    type: customchatml-regex-last-only
val_set_size: 0
eval_strategy: steps
eval_steps: 25
dataset_prepared_path: ./00-Tokenized-Datasets/MN-2407-DSK-QwQify-v0.1-12B-customchatml-regex-last-only
shuffle_merged_datasets: true
dataset_processes:

# Training hyperparameters
num_epochs: 2
gradient_accumulation_steps: 1
micro_batch_size: 16  # x4 GPUs = 64
eval_batch_size: 16   # x4 GPUs = 64
warmup_steps: 0
optimizer: came_pytorch
optim_args:
optim_target_modules:
lr_scheduler: rex
learning_rate: 2e-5
cosine_min_lr_ratio:
loraplus_lr_ratio:
loraplus_lr_embedding:
weight_decay: 0.1
max_grad_norm: 1
logging_steps: 1

# Model optimization
gradient_checkpointing: unsloth
flash_attention: true
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false
liger_fused_linear_cross_entropy: false
lora_mlp_kernel: false
lora_qkv_kernel: false
lora_o_kernel: false

# DeepSpeed
deepspeed: deepspeed_configs/zero3_bf16.json

# Garbage collection
gc_steps: 1

# Debug config
debug: true
seed: 42

# Token config
special_tokens:
  bos_token: "<s>"
  eos_token: "<|im_end|>"
  pad_token: "<pad>"
tokens:
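To get a feel for the size of the adapter described by the LoRA settings in the config above (`lora_r: 128`, `lora_alpha: 128`, seven target modules), the number of trainable parameters can be estimated. The layer shapes and layer count below are assumptions about the Mistral-Nemo 12B architecture and are not given in the source; check them against the base model's `config.json` before relying on the result.

```python
LORA_R = 128      # lora_r from the config
LORA_ALPHA = 128  # lora_alpha from the config

# ASSUMED (in_features, out_features) per target module; not stated in the source.
ASSUMED_SHAPES = {
    "q_proj": (5120, 4096),
    "k_proj": (5120, 1024),
    "v_proj": (5120, 1024),
    "o_proj": (4096, 5120),
    "gate_proj": (5120, 14336),
    "up_proj": (5120, 14336),
    "down_proj": (14336, 5120),
}
ASSUMED_NUM_LAYERS = 40  # ASSUMED number of transformer layers


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """A LoRA pair adds A (rank x in_features) and B (out_features x rank)."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, LORA_R) for i, o in ASSUMED_SHAPES.values())
total = per_layer * ASSUMED_NUM_LAYERS

# Standard LoRA scaling (peft_use_rslora is left unset in the config).
scaling = LORA_ALPHA / LORA_R

print(f"trainable LoRA parameters: {total:,}")
print(f"LoRA scaling factor: {scaling}")
```

Under these assumptions the adapter is a few percent of the base model's size, and the scaling factor of 1.0 means the low-rank update is applied without extra amplification.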

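Training hyperparameters

The following hyperparameters were used during training:

Learning rate: 2e-05
Train batch size: 16
Eval batch size: 16
Seed: 42
Distributed type: multi-GPU
Number of devices: 4
Total train batch size: 64
Total eval batch size: 64
Optimizer: OptimizerNames.ADAMW_HF with betas=(0.9,0.999) and epsilon=1e-08, no additional optimizer arguments
Learning rate scheduler type: cosine
Number of epochs: 2.0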
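The total batch size listed above follows from the per-device batch size, the gradient accumulation setting, and the device count. The short sketch below reproduces that arithmetic and adds an upper bound on tokens per optimizer step, assuming every packed sequence is filled to `sequence_len` (the config pads to sequence length, so the count includes padding).

```python
micro_batch_size = 16            # per-device batch size
gradient_accumulation_steps = 1  # from the axolotl config
num_devices = 4                  # number of GPUs
sequence_len = 8192              # from the axolotl config

# Sequences consumed per optimizer step across all devices.
total_batch_size = micro_batch_size * gradient_accumulation_steps * num_devices

# Upper bound on token positions per step, padding included.
max_tokens_per_step = total_batch_size * sequence_len

print(f"total batch size: {total_batch_size}")
print(f"max tokens per step: {max_tokens_per_step:,}")
```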

Training results

Training loss	Epoch	Step	Validation loss
2.134	0.0038	1	2.0025
1.6185	0.0951	25	1.5748
1.5187	0.1901	50	1.4871
1.4757	0.2852	75	1.4410
1.4008	0.3802	100	1.4100
1.4116	0.4753	125	1.3857
1.357	0.5703	150	1.3630
1.3435	0.6654	175	1.3478
1.3332	0.7605	200	1.3353
1.3042	0.8555	225	1.3308
1.2993	0.9506	250	1.3228
1.3105	1.0456	275	1.3154
1.2782	1.1407	300	1.3094
1.3063	1.2357	325	1.3070
1.3003	1.3308	350	1.3005
1.2937	1.4259	375	1.2952
1.283	1.5209	400	1.2922
1.2692	1.6160	425	1.2887
1.2639	1.7110	450	1.2855
1.2546	1.8061	475	1.2822
1.2711	1.9011	500	1.2787
1.2492	1.9962	525	1.2770
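A few quantities can be read off the table above with simple arithmetic. The sketch copies three rows from it (first, middle, last) and derives the approximate number of steps per epoch, the relative drop in validation loss, and the final gap between validation and training loss. The variable names are illustrative.

```python
# (train_loss, epoch, step, val_loss) rows copied from the table above.
rows = [
    (2.134, 0.0038, 1, 2.0025),
    (1.2993, 0.9506, 250, 1.3228),
    (1.2492, 1.9962, 525, 1.2770),
]

first, last = rows[0], rows[-1]

# Steps per epoch, estimated from the last logged step and its epoch fraction.
steps_per_epoch = round(last[2] / last[1])

# Relative reduction in validation loss over the whole run.
val_loss_drop = (first[3] - last[3]) / first[3]

# Gap between validation and training loss at the final step.
final_gap = last[3] - last[0]

print(f"steps per epoch: ~{steps_per_epoch}")
print(f"validation loss reduction: {val_loss_drop:.1%}")
print(f"final val-train gap: {final_gap:.4f}")
```

Note that the training loss is logged per step and is therefore noisier than the validation loss, so a small final gap should be read as a rough indication only.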