🚀 俄罗斯o1 / GigaChat 20B - A3B指令GGUF
本项目是一个基于LoRA的适配器,用于对[GigaChat - 20B - A3B](https://huggingface.co/ai - sage/GigaChat - 20B - A3B - instruct - bf16)模型进行微调。它在[Egor - AI/Russian_thinking_dataset](https://huggingface.co/datasets/Egor - AI/Russian_thinking_dataset)数据集上进行训练,该数据集是[BintangFortuna/OpenO1 - SFT - EN - SY](https://huggingface.co/datasets/BintangFortuna/OpenO1 - SFT - EN - SY)数据集的俄语机器翻译版本。经过训练的模型能够在俄语环境下模仿OpenAI
的o1
进行逻辑思考。
📦 模型信息
属性 |
详情 |
模型类型 |
基于[GigaChat - 20B - A3B](https://huggingface.co/ai - sage/GigaChat - 20B - A3B - instruct - bf16)的LoRA适配器 |
训练数据 |
[Egor - AI/Russian_thinking_dataset](https://huggingface.co/datasets/Egor - AI/Russian_thinking_dataset) |
基础模型 |
evilfreelancer/o1_gigachat - 20b - a3b_lora |
任务类型 |
问答 |
标签 |
chat、o1、cot、thinking、reflection |
许可证 |
MIT |
🚀 快速开始
系统提示
使用该模型时,需要使用以下系统提示:
Вы — ИИ - помощник. Отформатируйте свои ответы следующим образом: <Thought> Ваши мысли (понимание, рассуждения) </Thought> <output> Ваш ответ </output>
🔧 技术细节
训练信息
- 训练工具:使用impruver工具进行训练。
- 配置文件:采用[GigaChat/20B - A3B_lora_o1](https://github.com/EvilFreelancer/impruver/blob/main/recipes/configs/GigaChat/20B - A3B_lora_o1.yaml)配置。
- 训练时长:在RTX 4090上训练大约花费117小时,需要23GB显存。
配置文件
output_dir: ./models/GigaChat_20B - A3B_lora_thinking
train_path: ./train.GigaChat_20B - A3B_lora_thinking.jsonl
val_path: ./val.GigaChat_20B - A3B_lora_thinking.jsonl
datasets:
- name: Egor - AI/Russian_thinking_dataset
converter: impruver.instruction_to_messages
mapping:
system: system
instruction: prompt
output: response
model:
class: custom.gigachat.DeepseekForCausalLM
name: ai - sage/GigaChat_20B - A3B - instruct - bf16
attn_implementation: flash_attention_2
load_in_4bit: true
load_in_8bit: false
dtype: bf16
lora:
r: 8
lora_alpha: 32
lora_dropout: 0.1
bias: none
target_modules: [ q_proj, v_proj, k_proj, o_proj, gate_proj, down_proj, up_proj ]
task_type: CAUSAL_LM
tokenizer:
class: transformers.AutoTokenizer
name: ai - sage/GigaChat_20B - A3B - instruct
max_tokens_count: 1500
special_tokens:
pad_token_id: 1
pad_token: <s>
bos_token_id: 1
bos_token: <s>
eos_token_id: 128001
eos_token: <|message_sep|>
chat_template: >
{% if messages[0]['role'] == 'system' -%}
{%- set loop_messages = messages[1:] -%}
{%- set system_message = bos_token + messages[0]['content'] + additional_special_tokens[1] -%}
{%- else -%}
{%- set loop_messages = messages -%}
{%- set system_message = bos_token + '' -%}
{%- endif -%}
{%- for message in messages %}
{%- if message['role'] == 'system' -%}
{{ system_message -}}
{%- endif -%}
{%- if message['role'] == 'user' -%}
{{ message['role'] + additional_special_tokens[0] + message['content'] + additional_special_tokens[1] -}}
{{ 'available functions' + additional_special_tokens[0] + additional_special_tokens[2] + additional_special_tokens[3] + additional_special_tokens[1] -}}
{%- endif -%}
{%- if message['role'] == 'assistant' -%}
{{ message['role'] + additional_special_tokens[0] + message['content'] + additional_special_tokens[1] -}}
{%- endif -%}
{%- if loop.last and add_generation_prompt -%}
{{ 'assistant' + additional_special_tokens[0] -}}
{%- endif -%}
{%- endfor %}
trainer:
eval_strategy: steps
save_strategy: steps
eval_steps: 100
save_steps: 100
per_device_train_batch_size: 1
per_device_eval_batch_size: 1
gradient_accumulation_steps: 8
logging_steps: 1
learning_rate: 0.0004
num_train_epochs: 2
lr_scheduler_type: cosine
warmup_steps: 16
optim: adamw_torch_4bit
metric_for_best_model: eval_loss
load_best_model_at_end: true
save_total_limit: 2
seed: 42
remove_unused_columns: false
max_grad_norm: 1.0
weight_decay: 0.08
torch_compile: false
📄 许可证
本项目采用MIT许可证。
🔗 相关链接
- 模型链接:https://huggingface.co/evilfreelancer/o1_gigachat - 20b - a3b_lora
- W&B报告:https://api.wandb.ai/links/evilfreelancer/nlec8bt8