tiny-ko-124m-sft-muon开源韩语NLP模型 - 免费助力韩语自然语言处理

首页

Tiny Ko 124m Sft Muon

由 minpeter 开发

基于minpeter/tiny-ko-124m-base-muon模型在多个数据集上微调得到的韩语自然语言处理模型

大型语言模型

Transformers

#韩语指令微调 #小参数高效 #多任务适配

下载量 508

发布时间 : 7/13/2025

模型简介

该模型是在多个韩语数据集上微调得到的，适用于韩语相关的自然语言处理任务，如文本生成、对话系统等。

模型特点

多数据集微调

在多个韩语数据集上进行微调，提升了模型的泛化能力和性能。

优化的训练配置

使用了muon优化器和cosine学习率调度器，训练过程高效稳定。

长上下文支持

支持最大65536的上下文长度，适合处理长文本任务。

模型能力

韩语文本生成

韩语对话系统

指令跟随

使用案例

对话系统

韩语聊天机器人

用于构建韩语对话系统，提供自然流畅的对话体验。

文本生成

韩语内容创作

生成韩语文章、故事或其他创意内容。

🚀 tiny-ko-124m-sft-muon

tiny-ko-124m-sft-muon 是基于 minpeter/tiny-ko-124m-base-muon 模型在多个数据集上微调得到的模型。该模型在评估集上取得了一定的效果，为相关自然语言处理任务提供了新的解决方案。

查看 axolotl 配置

axolotl 版本：0.12.0.dev0

base_model: minpeter/tiny-ko-124m-base-muon

hub_model_id: minpeter/tiny-ko-124m-sft-muon
output_dir: ./outputs/tiny-ko-124m-sft-muon
wandb_project: "axolotl"
wandb_entity: "kasfiekfs-e"

model_type: LlamaForCausalLM
tokenizer_type: AutoTokenizer

strict: false

chat_template: chatml
datasets:
  - path: HuggingFaceTB/smol-smoltalk
    type: chat_template
    split: train
    field_messages: messages
    message_property_mappings:
      role: role
      content: content

  - path: trillionlabs/multisystem-curated
    type: chat_template
    split: train
    field_messages: messages
    message_property_mappings:
      role: role
      content: content

  - path: allenai/tulu-3-sft-personas-instruction-following
    type: chat_template
    split: train
    field_messages: messages
    message_property_mappings:
      role: role
      content: content

  - path: lemon-mint/smol-koreantalk
    type: chat_template
    split: train
    field_messages: messages
    message_property_mappings:
      role: role
      content: content

  - path: lemon-mint/Korean-FineTome-100k
    type: chat_template
    split: train
    field_messages: messages
    message_property_mappings:
      role: role
      content: content

  - path: heegyu/open-korean-instructions-v20231020
    type: chat_template
    split: train
    field_messages: conversations
    message_property_mappings:
      role: from
      content: value
    roles:
      user: ["human", "user"]
      assistant: ["gpt", "assistant", "bot"]
      system: ["system", "input"]

  - path: coastral/korean-writing-style-instruct
    type: chat_template
    split: train
    field_messages: conversations
    message_property_mappings:
      role: from
      content: value

  - path: devngho/korean-instruction-mix
    type: chat_template
    split: train
    field_messages: messages
    message_property_mappings:
      role: from
      content: value

dataset_prepared_path: last_run_prepared
val_set_size: 0.001
save_safetensors: true
sequence_len: 2048
sample_packing: false
pad_to_sequence_len: false
use_pose: true
pose_max_context_len: 65536

overrides_of_model_config:
  rope_theta: 10000.0
  max_position_embeddings: 65536

gradient_accumulation_steps: 8
micro_batch_size: 32
num_epochs: 1
optimizer: muon
lr_scheduler: cosine
learning_rate: 3e-4

train_on_inputs: false
group_by_length: false
bf16: true
fp16:
tf32: true

gradient_checkpointing: false
gradient_checkpointing_kwargs:
  use_reentrant: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true
sdp_attention:
s2_attention:

save_steps: 200
warmup_steps: 20
eval_steps: 200
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:

📚 详细文档

模型概述

该模型是 minpeter/tiny-ko-124m-base-muon 的微调版本，在以下数据集上进行了训练：

HuggingFaceTB/smol-smoltalk
trillionlabs/multisystem-curated
allenai/tulu-3-sft-personas-instruction-following
lemon-mint/smol-koreantalk
lemon-mint/Korean-FineTome-100k
heegyu/open-korean-instructions-v20231020
coastral/korean-writing-style-instruct
devngho/korean-instruction-mix

在评估集上，该模型的损失为 1.6461。

训练和评估数据

具体的数据信息待补充。

训练过程

训练超参数

训练过程中使用了以下超参数：

属性	详情
学习率	0.0003
训练批次大小	32
评估批次大小	32
随机种子	42
分布式类型	多 GPU
设备数量	4
梯度累积步数	8
总训练批次大小	1024
总评估批次大小	128
优化器	使用 OptimizerNames.ADAMW_TORCH，其中 betas=(0.9, 0.999)，epsilon=1e-08，无额外优化器参数
学习率调度器类型	余弦
学习率调度器热身步数	20
训练步数	6865

训练结果

训练损失	轮数	步数	验证损失
无记录	0	0	2.4581
1.8892	0.1165	200	1.9059
1.802	0.2331	400	1.8333
1.7906	0.3496	600	1.7918
1.7761	0.4661	800	1.7638
1.7145	0.5827	1000	1.7423
1.7114	0.6992	1200	1.7255
1.6798	0.8157	1400	1.7123
1.6722	0.9323	1600	1.7006
1.6821	1.0484	1800	1.6928
1.6414	1.1649	2000	1.6864
1.6473	1.2814	2200	1.6794
1.6202	1.3980	2400	1.6729
1.6141	1.5145	2600	1.6689
1.6415	1.6310	2800	1.6645
1.6165	1.7476	3000	1.6603
1.6292	1.8641	3200	1.6573
1.6277	1.9806	3400	1.6541
1.6033	2.0967	3600	1.6537
1.6432	2.2133	3800	1.6517
1.602	2.3298	4000	1.6505
1.6435	2.4463	4200	1.6493
1.5941	2.5629	4400	1.6481
1.594	2.6794	4600	1.6473
1.5986	2.7959	4800	1.6468
1.586	2.9125	5000	1.6464
1.6146	3.0286	5200	1.6462
1.5985	3.1451	5400	1.6462
1.574	3.2616	5600	1.6462
1.5823	3.3782	5800	1.6460
1.597	3.4947	6000	1.6460
1.5859	3.6112	6200	1.6460
1.5769	3.7277	6400	1.6459
1.572	3.8443	6600	1.6459
1.6111	3.9608	6800	1.6461