o1_gigachat-20b-a3b_gguf开源模型 - 免费部署模拟俄语逻辑思考过程

首页

O1 Gigachat 20b A3b Gguf

由 evilfreelancer 开发

基于GigaChat-20B-A3B模型训练的LoRA适配器，专门用于俄语逻辑思考过程模拟

大型语言模型支持多种语言开源协议:MIT #俄语思维推理 #LoRA微调 #多轮对话优化

下载量 152

发布时间 : 1/16/2025

模型简介

该模型通过LoRA适配器增强了GigaChat-20B-A3B的俄语逻辑思考能力，能够模仿类似OpenAI o1模型的思考过程，特别适合需要展示推理步骤的俄语对话场景

模型特点

俄语逻辑思考模拟

专门针对俄语优化的思考过程展示能力，能按照指定格式输出推理步骤

LoRA微调

使用低秩适配器技术对基础模型进行高效微调，保留原模型能力的同时增加特定功能

结构化输出

支持<Thought>和<output>标签的结构化响应，清晰分离推理过程和最终答案

模型能力

俄语文本生成

逻辑推理过程展示

结构化问答

多轮对话

使用案例

智能助手

俄语教育助手

帮助学生理解复杂问题的解决过程

展示分步推理，提高学习效果

专业咨询

技术问题诊断

分析技术问题并提供详细解决步骤

清晰的思考过程有助于用户理解解决方案

🚀 俄罗斯o1 / GigaChat 20B - A3B指令GGUF

本项目是一个基于LoRA的适配器，用于对[GigaChat - 20B - A3B](https://huggingface.co/ai - sage/GigaChat - 20B - A3B - instruct - bf16)模型进行微调。它在[Egor - AI/Russian_thinking_dataset](https://huggingface.co/datasets/Egor - AI/Russian_thinking_dataset)数据集上进行训练，该数据集是[BintangFortuna/OpenO1 - SFT - EN - SY](https://huggingface.co/datasets/BintangFortuna/OpenO1 - SFT - EN - SY)数据集的俄语机器翻译版本。经过训练的模型能够在俄语环境下模仿OpenAI的o1进行逻辑思考。

📦 模型信息

属性	详情
模型类型	基于[GigaChat - 20B - A3B](https://huggingface.co/ai - sage/GigaChat - 20B - A3B - instruct - bf16)的LoRA适配器
训练数据	[Egor - AI/Russian_thinking_dataset](https://huggingface.co/datasets/Egor - AI/Russian_thinking_dataset)
基础模型	evilfreelancer/o1_gigachat - 20b - a3b_lora
任务类型	问答
标签	chat、o1、cot、thinking、reflection
许可证	MIT

🚀 快速开始

系统提示

使用该模型时，需要使用以下系统提示：

Вы — ИИ - помощник. Отформатируйте свои ответы следующим образом: <Thought> Ваши мысли (понимание, рассуждения) </Thought> <output> Ваш ответ </output>

🔧 技术细节

训练信息

训练工具：使用impruver工具进行训练。
配置文件：采用[GigaChat/20B - A3B_lora_o1](https://github.com/EvilFreelancer/impruver/blob/main/recipes/configs/GigaChat/20B - A3B_lora_o1.yaml)配置。
训练时长：在RTX 4090上训练大约花费117小时，需要23GB显存。

配置文件

output_dir: ./models/GigaChat_20B - A3B_lora_thinking
train_path: ./train.GigaChat_20B - A3B_lora_thinking.jsonl
val_path: ./val.GigaChat_20B - A3B_lora_thinking.jsonl

datasets:
  - name: Egor - AI/Russian_thinking_dataset
    converter: impruver.instruction_to_messages
    mapping:
      system: system
      instruction: prompt
      output: response

model:
  class: custom.gigachat.DeepseekForCausalLM
  name: ai - sage/GigaChat_20B - A3B - instruct - bf16
  attn_implementation: flash_attention_2
  load_in_4bit: true
  load_in_8bit: false
  dtype: bf16

lora:
  r: 8
  lora_alpha: 32
  lora_dropout: 0.1
  bias: none
  target_modules: [ q_proj, v_proj, k_proj, o_proj, gate_proj, down_proj, up_proj ]
  task_type: CAUSAL_LM

tokenizer:
  class: transformers.AutoTokenizer
  name: ai - sage/GigaChat_20B - A3B - instruct
  max_tokens_count: 1500
  special_tokens:
    pad_token_id: 1
    pad_token: <s>
    bos_token_id: 1
    bos_token: <s>
    eos_token_id: 128001
    eos_token: <|message_sep|>
  chat_template: >
    {% if messages[0]['role'] == 'system' -%}
        {%- set loop_messages = messages[1:] -%}
        {%- set system_message = bos_token + messages[0]['content'] + additional_special_tokens[1] -%}
    {%- else -%}
        {%- set loop_messages = messages -%}
        {%- set system_message = bos_token + '' -%}
    {%- endif -%}
    {%- for message in messages %}
        {%- if message['role'] == 'system' -%}
            {{ system_message -}}
        {%- endif -%}
        {%- if message['role'] == 'user' -%}
            {{ message['role'] + additional_special_tokens[0] + message['content'] + additional_special_tokens[1] -}}
            {{ 'available functions' + additional_special_tokens[0] + additional_special_tokens[2] + additional_special_tokens[3]  + additional_special_tokens[1] -}}
        {%- endif -%}
        {%- if message['role'] == 'assistant' -%}
            {{ message['role'] + additional_special_tokens[0] + message['content'] + additional_special_tokens[1] -}}
        {%- endif -%}
        {%- if loop.last and add_generation_prompt -%}
            {{ 'assistant' + additional_special_tokens[0] -}}
        {%- endif -%}
    {%- endfor %}

trainer:
  eval_strategy: steps
  save_strategy: steps
  eval_steps: 100
  save_steps: 100
  per_device_train_batch_size: 1
  per_device_eval_batch_size: 1
  gradient_accumulation_steps: 8
  logging_steps: 1
  learning_rate: 0.0004
  num_train_epochs: 2
  lr_scheduler_type: cosine
  warmup_steps: 16
  optim: adamw_torch_4bit
  metric_for_best_model: eval_loss
  load_best_model_at_end: true
  save_total_limit: 2
  seed: 42
  remove_unused_columns: false
  max_grad_norm: 1.0
  weight_decay: 0.08
  torch_compile: false