tiny-ko-124m-sft-muon開源韓語NLP模型 - 免費助力韓語自然語言處理

首頁

Tiny Ko 124m Sft Muon

由minpeter開發

基於minpeter/tiny-ko-124m-base-muon模型在多個數據集上微調得到的韓語自然語言處理模型

大型語言模型

Transformers

#韓語指令微調 #小參數高效 #多任務適配

下載量 508

發布時間 : 7/13/2025

模型概述

該模型是在多個韓語數據集上微調得到的，適用於韓語相關的自然語言處理任務，如文本生成、對話系統等。

模型特點

多數據集微調

在多個韓語數據集上進行微調，提升了模型的泛化能力和性能。

優化的訓練配置

使用了muon優化器和cosine學習率調度器，訓練過程高效穩定。

長上下文支持

支持最大65536的上下文長度，適合處理長文本任務。

模型能力

韓語文本生成

韓語對話系統

指令跟隨

使用案例

對話系統

韓語聊天機器人

用於構建韓語對話系統，提供自然流暢的對話體驗。

文本生成

韓語內容創作

生成韓語文章、故事或其他創意內容。

🚀 tiny-ko-124m-sft-muon

tiny-ko-124m-sft-muon 是基於 minpeter/tiny-ko-124m-base-muon 模型在多個數據集上微調得到的模型。該模型在評估集上取得了一定的效果，為相關自然語言處理任務提供了新的解決方案。

查看 axolotl 配置

axolotl 版本：0.12.0.dev0

base_model: minpeter/tiny-ko-124m-base-muon

hub_model_id: minpeter/tiny-ko-124m-sft-muon
output_dir: ./outputs/tiny-ko-124m-sft-muon
wandb_project: "axolotl"
wandb_entity: "kasfiekfs-e"

model_type: LlamaForCausalLM
tokenizer_type: AutoTokenizer

strict: false

chat_template: chatml
datasets:
  - path: HuggingFaceTB/smol-smoltalk
    type: chat_template
    split: train
    field_messages: messages
    message_property_mappings:
      role: role
      content: content

  - path: trillionlabs/multisystem-curated
    type: chat_template
    split: train
    field_messages: messages
    message_property_mappings:
      role: role
      content: content

  - path: allenai/tulu-3-sft-personas-instruction-following
    type: chat_template
    split: train
    field_messages: messages
    message_property_mappings:
      role: role
      content: content

  - path: lemon-mint/smol-koreantalk
    type: chat_template
    split: train
    field_messages: messages
    message_property_mappings:
      role: role
      content: content

  - path: lemon-mint/Korean-FineTome-100k
    type: chat_template
    split: train
    field_messages: messages
    message_property_mappings:
      role: role
      content: content

  - path: heegyu/open-korean-instructions-v20231020
    type: chat_template
    split: train
    field_messages: conversations
    message_property_mappings:
      role: from
      content: value
    roles:
      user: ["human", "user"]
      assistant: ["gpt", "assistant", "bot"]
      system: ["system", "input"]

  - path: coastral/korean-writing-style-instruct
    type: chat_template
    split: train
    field_messages: conversations
    message_property_mappings:
      role: from
      content: value

  - path: devngho/korean-instruction-mix
    type: chat_template
    split: train
    field_messages: messages
    message_property_mappings:
      role: from
      content: value

dataset_prepared_path: last_run_prepared
val_set_size: 0.001
save_safetensors: true
sequence_len: 2048
sample_packing: false
pad_to_sequence_len: false
use_pose: true
pose_max_context_len: 65536

overrides_of_model_config:
  rope_theta: 10000.0
  max_position_embeddings: 65536

gradient_accumulation_steps: 8
micro_batch_size: 32
num_epochs: 1
optimizer: muon
lr_scheduler: cosine
learning_rate: 3e-4

train_on_inputs: false
group_by_length: false
bf16: true
fp16:
tf32: true

gradient_checkpointing: false
gradient_checkpointing_kwargs:
  use_reentrant: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true
sdp_attention:
s2_attention:

save_steps: 200
warmup_steps: 20
eval_steps: 200
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:

📚 詳細文檔

模型概述

該模型是 minpeter/tiny-ko-124m-base-muon 的微調版本，在以下數據集上進行了訓練：

HuggingFaceTB/smol-smoltalk
trillionlabs/multisystem-curated
allenai/tulu-3-sft-personas-instruction-following
lemon-mint/smol-koreantalk
lemon-mint/Korean-FineTome-100k
heegyu/open-korean-instructions-v20231020
coastral/korean-writing-style-instruct
devngho/korean-instruction-mix

在評估集上，該模型的損失為 1.6461。

訓練和評估數據

具體的數據信息待補充。

訓練過程

訓練超參數

訓練過程中使用了以下超參數：

屬性	詳情
學習率	0.0003
訓練批次大小	32
評估批次大小	32
隨機種子	42
分佈式類型	多 GPU
設備數量	4
梯度累積步數	8
總訓練批次大小	1024
總評估批次大小	128
優化器	使用 OptimizerNames.ADAMW_TORCH，其中 betas=(0.9, 0.999)，epsilon=1e-08，無額外優化器參數
學習率調度器類型	餘弦
學習率調度器熱身步數	20
訓練步數	6865

訓練結果

訓練損失	輪數	步數	驗證損失
無記錄	0	0	2.4581
1.8892	0.1165	200	1.9059
1.802	0.2331	400	1.8333
1.7906	0.3496	600	1.7918
1.7761	0.4661	800	1.7638
1.7145	0.5827	1000	1.7423
1.7114	0.6992	1200	1.7255
1.6798	0.8157	1400	1.7123
1.6722	0.9323	1600	1.7006
1.6821	1.0484	1800	1.6928
1.6414	1.1649	2000	1.6864
1.6473	1.2814	2200	1.6794
1.6202	1.3980	2400	1.6729
1.6141	1.5145	2600	1.6689
1.6415	1.6310	2800	1.6645
1.6165	1.7476	3000	1.6603
1.6292	1.8641	3200	1.6573
1.6277	1.9806	3400	1.6541
1.6033	2.0967	3600	1.6537
1.6432	2.2133	3800	1.6517
1.602	2.3298	4000	1.6505
1.6435	2.4463	4200	1.6493
1.5941	2.5629	4400	1.6481
1.594	2.6794	4600	1.6473
1.5986	2.7959	4800	1.6468
1.586	2.9125	5000	1.6464
1.6146	3.0286	5200	1.6462
1.5985	3.1451	5400	1.6462
1.574	3.2616	5600	1.6462
1.5823	3.3782	5800	1.6460
1.597	3.4947	6000	1.6460
1.5859	3.6112	6200	1.6460
1.5769	3.7277	6400	1.6459
1.572	3.8443	6600	1.6459
1.6111	3.9608	6800	1.6461