o1_gigachat-20b-a3b_gguf開源模型 - 免費部署模擬俄語邏輯思考過程

首頁

O1 Gigachat 20b A3b Gguf

由evilfreelancer開發

基於GigaChat-20B-A3B模型訓練的LoRA適配器，專門用於俄語邏輯思考過程模擬

大型語言模型支持多種語言開源協議:MIT #俄語思維推理 #LoRA微調 #多輪對話優化

下載量 152

發布時間 : 1/16/2025

模型概述

該模型通過LoRA適配器增強了GigaChat-20B-A3B的俄語邏輯思考能力，能夠模仿類似OpenAI o1模型的思考過程，特別適合需要展示推理步驟的俄語對話場景

模型特點

俄語邏輯思考模擬

專門針對俄語優化的思考過程展示能力，能按照指定格式輸出推理步驟

LoRA微調

使用低秩適配器技術對基礎模型進行高效微調，保留原模型能力的同時增加特定功能

結構化輸出

支持<Thought>和<output>標籤的結構化響應，清晰分離推理過程和最終答案

模型能力

俄語文本生成

邏輯推理過程展示

結構化問答

多輪對話

使用案例

智能助手

俄語教育助手

幫助學生理解複雜問題的解決過程

展示分步推理，提高學習效果

專業諮詢

技術問題診斷

分析技術問題並提供詳細解決步驟

清晰的思考過程有助於用戶理解解決方案

🚀 俄羅斯o1 / GigaChat 20B - A3B指令GGUF

本項目是一個基於LoRA的適配器，用於對[GigaChat - 20B - A3B](https://huggingface.co/ai - sage/GigaChat - 20B - A3B - instruct - bf16)模型進行微調。它在[Egor - AI/Russian_thinking_dataset](https://huggingface.co/datasets/Egor - AI/Russian_thinking_dataset)數據集上進行訓練，該數據集是[BintangFortuna/OpenO1 - SFT - EN - SY](https://huggingface.co/datasets/BintangFortuna/OpenO1 - SFT - EN - SY)數據集的俄語機器翻譯版本。經過訓練的模型能夠在俄語環境下模仿OpenAI的o1進行邏輯思考。

📦 模型信息

屬性	詳情
模型類型	基於[GigaChat - 20B - A3B](https://huggingface.co/ai - sage/GigaChat - 20B - A3B - instruct - bf16)的LoRA適配器
訓練數據	[Egor - AI/Russian_thinking_dataset](https://huggingface.co/datasets/Egor - AI/Russian_thinking_dataset)
基礎模型	evilfreelancer/o1_gigachat - 20b - a3b_lora
任務類型	問答
標籤	chat、o1、cot、thinking、reflection
許可證	MIT

🚀 快速開始

系統提示

使用該模型時，需要使用以下系統提示：

Вы — ИИ - помощник. Отформатируйте свои ответы следующим образом: <Thought> Ваши мысли (понимание, рассуждения) </Thought> <output> Ваш ответ </output>

🔧 技術細節

訓練信息

訓練工具：使用impruver工具進行訓練。
配置文件：採用[GigaChat/20B - A3B_lora_o1](https://github.com/EvilFreelancer/impruver/blob/main/recipes/configs/GigaChat/20B - A3B_lora_o1.yaml)配置。
訓練時長：在RTX 4090上訓練大約花費117小時，需要23GB顯存。

配置文件

output_dir: ./models/GigaChat_20B - A3B_lora_thinking
train_path: ./train.GigaChat_20B - A3B_lora_thinking.jsonl
val_path: ./val.GigaChat_20B - A3B_lora_thinking.jsonl

datasets:
  - name: Egor - AI/Russian_thinking_dataset
    converter: impruver.instruction_to_messages
    mapping:
      system: system
      instruction: prompt
      output: response

model:
  class: custom.gigachat.DeepseekForCausalLM
  name: ai - sage/GigaChat_20B - A3B - instruct - bf16
  attn_implementation: flash_attention_2
  load_in_4bit: true
  load_in_8bit: false
  dtype: bf16

lora:
  r: 8
  lora_alpha: 32
  lora_dropout: 0.1
  bias: none
  target_modules: [ q_proj, v_proj, k_proj, o_proj, gate_proj, down_proj, up_proj ]
  task_type: CAUSAL_LM

tokenizer:
  class: transformers.AutoTokenizer
  name: ai - sage/GigaChat_20B - A3B - instruct
  max_tokens_count: 1500
  special_tokens:
    pad_token_id: 1
    pad_token: <s>
    bos_token_id: 1
    bos_token: <s>
    eos_token_id: 128001
    eos_token: <|message_sep|>
  chat_template: >
    {% if messages[0]['role'] == 'system' -%}
        {%- set loop_messages = messages[1:] -%}
        {%- set system_message = bos_token + messages[0]['content'] + additional_special_tokens[1] -%}
    {%- else -%}
        {%- set loop_messages = messages -%}
        {%- set system_message = bos_token + '' -%}
    {%- endif -%}
    {%- for message in messages %}
        {%- if message['role'] == 'system' -%}
            {{ system_message -}}
        {%- endif -%}
        {%- if message['role'] == 'user' -%}
            {{ message['role'] + additional_special_tokens[0] + message['content'] + additional_special_tokens[1] -}}
            {{ 'available functions' + additional_special_tokens[0] + additional_special_tokens[2] + additional_special_tokens[3]  + additional_special_tokens[1] -}}
        {%- endif -%}
        {%- if message['role'] == 'assistant' -%}
            {{ message['role'] + additional_special_tokens[0] + message['content'] + additional_special_tokens[1] -}}
        {%- endif -%}
        {%- if loop.last and add_generation_prompt -%}
            {{ 'assistant' + additional_special_tokens[0] -}}
        {%- endif -%}
    {%- endfor %}

trainer:
  eval_strategy: steps
  save_strategy: steps
  eval_steps: 100
  save_steps: 100
  per_device_train_batch_size: 1
  per_device_eval_batch_size: 1
  gradient_accumulation_steps: 8
  logging_steps: 1
  learning_rate: 0.0004
  num_train_epochs: 2
  lr_scheduler_type: cosine
  warmup_steps: 16
  optim: adamw_torch_4bit
  metric_for_best_model: eval_loss
  load_best_model_at_end: true
  save_total_limit: 2
  seed: 42
  remove_unused_columns: false
  max_grad_norm: 1.0
  weight_decay: 0.08
  torch_compile: false