🚀 俄羅斯o1 / GigaChat 20B - A3B指令GGUF
本項目是一個基於LoRA的適配器,用於對[GigaChat - 20B - A3B](https://huggingface.co/ai - sage/GigaChat - 20B - A3B - instruct - bf16)模型進行微調。它在[Egor - AI/Russian_thinking_dataset](https://huggingface.co/datasets/Egor - AI/Russian_thinking_dataset)數據集上進行訓練,該數據集是[BintangFortuna/OpenO1 - SFT - EN - SY](https://huggingface.co/datasets/BintangFortuna/OpenO1 - SFT - EN - SY)數據集的俄語機器翻譯版本。經過訓練的模型能夠在俄語環境下模仿OpenAI
的o1
進行邏輯思考。
📦 模型信息
屬性 |
詳情 |
模型類型 |
基於[GigaChat - 20B - A3B](https://huggingface.co/ai - sage/GigaChat - 20B - A3B - instruct - bf16)的LoRA適配器 |
訓練數據 |
[Egor - AI/Russian_thinking_dataset](https://huggingface.co/datasets/Egor - AI/Russian_thinking_dataset) |
基礎模型 |
evilfreelancer/o1_gigachat - 20b - a3b_lora |
任務類型 |
問答 |
標籤 |
chat、o1、cot、thinking、reflection |
許可證 |
MIT |
🚀 快速開始
系統提示
使用該模型時,需要使用以下系統提示:
Вы — ИИ - помощник. Отформатируйте свои ответы следующим образом: <Thought> Ваши мысли (понимание, рассуждения) </Thought> <output> Ваш ответ </output>
🔧 技術細節
訓練信息
- 訓練工具:使用impruver工具進行訓練。
- 配置文件:採用[GigaChat/20B - A3B_lora_o1](https://github.com/EvilFreelancer/impruver/blob/main/recipes/configs/GigaChat/20B - A3B_lora_o1.yaml)配置。
- 訓練時長:在RTX 4090上訓練大約花費117小時,需要23GB顯存。
配置文件
output_dir: ./models/GigaChat_20B - A3B_lora_thinking
train_path: ./train.GigaChat_20B - A3B_lora_thinking.jsonl
val_path: ./val.GigaChat_20B - A3B_lora_thinking.jsonl
datasets:
- name: Egor - AI/Russian_thinking_dataset
converter: impruver.instruction_to_messages
mapping:
system: system
instruction: prompt
output: response
model:
class: custom.gigachat.DeepseekForCausalLM
name: ai - sage/GigaChat_20B - A3B - instruct - bf16
attn_implementation: flash_attention_2
load_in_4bit: true
load_in_8bit: false
dtype: bf16
lora:
r: 8
lora_alpha: 32
lora_dropout: 0.1
bias: none
target_modules: [ q_proj, v_proj, k_proj, o_proj, gate_proj, down_proj, up_proj ]
task_type: CAUSAL_LM
tokenizer:
class: transformers.AutoTokenizer
name: ai - sage/GigaChat_20B - A3B - instruct
max_tokens_count: 1500
special_tokens:
pad_token_id: 1
pad_token: <s>
bos_token_id: 1
bos_token: <s>
eos_token_id: 128001
eos_token: <|message_sep|>
chat_template: >
{% if messages[0]['role'] == 'system' -%}
{%- set loop_messages = messages[1:] -%}
{%- set system_message = bos_token + messages[0]['content'] + additional_special_tokens[1] -%}
{%- else -%}
{%- set loop_messages = messages -%}
{%- set system_message = bos_token + '' -%}
{%- endif -%}
{%- for message in messages %}
{%- if message['role'] == 'system' -%}
{{ system_message -}}
{%- endif -%}
{%- if message['role'] == 'user' -%}
{{ message['role'] + additional_special_tokens[0] + message['content'] + additional_special_tokens[1] -}}
{{ 'available functions' + additional_special_tokens[0] + additional_special_tokens[2] + additional_special_tokens[3] + additional_special_tokens[1] -}}
{%- endif -%}
{%- if message['role'] == 'assistant' -%}
{{ message['role'] + additional_special_tokens[0] + message['content'] + additional_special_tokens[1] -}}
{%- endif -%}
{%- if loop.last and add_generation_prompt -%}
{{ 'assistant' + additional_special_tokens[0] -}}
{%- endif -%}
{%- endfor %}
trainer:
eval_strategy: steps
save_strategy: steps
eval_steps: 100
save_steps: 100
per_device_train_batch_size: 1
per_device_eval_batch_size: 1
gradient_accumulation_steps: 8
logging_steps: 1
learning_rate: 0.0004
num_train_epochs: 2
lr_scheduler_type: cosine
warmup_steps: 16
optim: adamw_torch_4bit
metric_for_best_model: eval_loss
load_best_model_at_end: true
save_total_limit: 2
seed: 42
remove_unused_columns: false
max_grad_norm: 1.0
weight_decay: 0.08
torch_compile: false
📄 許可證
本項目採用MIT許可證。
🔗 相關鏈接
- 模型鏈接:https://huggingface.co/evilfreelancer/o1_gigachat - 20b - a3b_lora
- W&B報告:https://api.wandb.ai/links/evilfreelancer/nlec8bt8