Nuke_X_Gemma3_1B_Reasoner_Testing: Open-Source Reasoning Model with Enhanced Logical Reasoning for Efficient Decision-Making

Nuke X Gemma3 1B Reasoner Testing

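Developed by NuclearAi

A reasoning-enhanced model built on Google Gemma-3-1B, fine-tuned with the GRPO algorithm and a high-quality dataset to improve logical reasoning

Large Language Model

Transformers

Language: English · License: Apache-2.0 · Tags: #GRPO-enhanced reasoning #Conversational logical reasoning #Unsloth-optimized

Downloads: 77

Released: 3/31/2025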

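Model Overview

This model is an optimized version of Gemma-3-1B that focuses on improving text generation and logical reasoning, and is suited to conversational AI scenarios

Model Features

Enhanced reasoning

Improves on the reasoning ability of the original Gemma through the GRPO algorithm and a dedicated training dataset

Efficient fine-tuning

Fine-tuned for only 5 steps on 150 high-quality examples, with training completed in about 30 minutes

Unsloth optimization

Uses the Unsloth framework for efficient training and optimized inference

Model Capabilities

Text generation

Logical reasoning

Conversational AI

Story writing

Use Cases

Creative writing

Short story generation

Generates creative short stories that stay logically consistent

For example, the story about a cat who learns to fly from the usage example

Question answering

Structured problem solving

Provides detailed answers that include the reasoning process

The model shows its reasoning first and then gives the final answer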

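🚀 Gemma 3 Model Fine-Tuned by Nuclear AI

This project fine-tunes Google's Gemma 3 model, using GRPO and a specialized dataset to improve its reasoning ability. The fine-tuned model performed well in our testing, and we welcome feedback to help us improve it further.

🚀 Quick Start

Model Information

Property	Details
Model type	Conversational and reasoning model fine-tuned from Google Gemma 3
Training data	NuclearAi/HyperThink-v1
Developer	NuclearAi
License	apache-2.0
Base model	google/gemma-3-1b-it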

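Model Introduction

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. However, Gemma's reasoning ability lags behind that of some other models.

At Nuclear AI, we improve Gemma's reasoning by applying GRPO and training on a dataset built for that purpose. Because this is an experimental model, we used 150 rows of high-quality data and ran five fine-tuning steps, which took about 30 minutes.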
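To make the GRPO idea concrete: GRPO samples several completions for the same prompt, scores each one, and rewards completions relative to the rest of their group. The snippet below is only a minimal sketch of that group-relative normalization step, not the training code used for this model. The function name, the `eps` value, the use of the population standard deviation, and the example rewards are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize one prompt's rewards against the group's mean and spread.

    Completions scoring above the group mean get a positive advantage,
    those below get a negative one. `eps` (an assumed value) avoids
    division by zero when every completion receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four completions sampled from a single prompt.
example_rewards = [1.0, 0.0, 0.0, 1.0]
print(group_relative_advantages(example_rewards))
```

Because the advantage is computed within each group, no separate value model is needed to judge whether a reward is good or bad for a given prompt.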

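When we tested the model, its performance genuinely impressed us. We look forward to your feedback so that we can fine-tune larger versions of the model with more steps and more compute.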

📦 Installation Guide

Install dependencies

# 1. Install the specific transformers build that is compatible with Gemma 3
pip install --no-deps git+https://github.com/huggingface/transformers@v4.49.0-Gemma-3

# 2. Install Unsloth (adjust for your environment, e.g. remove [colab-new] if you are not on Colab)
pip install "unsloth[colab-new]@git+https://github.com/unslothai/unsloth.git"

# 3. Install PyTorch (pick the command for your CUDA version from https://pytorch.org/)
# Example for CUDA 12.1:
# pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
# Example for CPU only:
# pip install torch torchvision torchaudio

# 4. Install accelerate and bitsandbytes
pip install accelerate bitsandbytes
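After the install steps, it can help to confirm that each package is visible to the Python interpreter you plan to use. The check below is a small sketch using only the standard library. The helper name `missing_packages` and the `REQUIRED` list are illustrative assumptions based on the packages named in the steps above, and the check does not verify versions or GPU support.

```python
from importlib.util import find_spec

# Top-level module names for the packages installed above (assumed list).
REQUIRED = ["torch", "transformers", "unsloth", "accelerate", "bitsandbytes"]


def missing_packages(names):
    """Return the names that cannot be located as importable modules."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

`find_spec` only locates a top-level module without importing it, so the check runs quickly even when the heavy libraries are installed.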

💻 Usage Examples

Basic Usage

import torch
from unsloth import FastModel
from transformers import TextStreamer


# 1. Load the model and tokenizer
max_seq_length = 1024
model_name = "NuclearAi/Nuke_X_Gemma3_1B_Reasoner_Testing"

print(f"Loading model: {model_name}...")

model, tokenizer = FastModel.from_pretrained(
    model_name = model_name,
    max_seq_length = max_seq_length,
    dtype = None,         # Let Unsloth pick the best dtype (float16, bf16, float32)
    load_in_4bit = False, # Set to True if you need 4-bit quantization
    device_map = "auto",  # Use the GPU automatically when one is available
)
print("Model loaded.")


# 2. Define the prompt structure
reasoning_start = "<think>"
reasoning_end   = "</think>"
solution_start = "<response>"
solution_end = "</response>"


system_prompt = \
f"""You are given a problem.
Think about the problem and provide your working out.
Place it between {reasoning_start} and {reasoning_end}.
Then, provide your solution between {solution_start}{solution_end}"""


# 3. User input
user_question = "Write a short story about a cat who learns to fly." # Try other questions


# 4. Format the input for the chat model
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user",   "content": user_question},
]

text_input = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True # Appends the assistant turn marker so the model starts its reply
)


# 5. Tokenize and prepare for generation
device = model.device if hasattr(model, 'device') else ('cuda' if torch.cuda.is_available() else 'cpu')
inputs = tokenizer([text_input], return_tensors="pt").to(device)
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)


# 6. Generate the response
print("\n--- Model Response ---")
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        streamer=streamer,
        max_new_tokens=1024,
        temperature=0.7,
        top_p=0.9,
        top_k=50,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )
print("\n--- End of Response ---")
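Since the system prompt asks for the working out inside `<think>` tags and the solution inside `<response>` tags, it may be useful to split a finished generation into those two parts. The helper below is a sketch under the assumption that the model actually emits both tag pairs as plain text. The function name `split_reasoning` is hypothetical, and a small model may skip or truncate a tag, in which case the corresponding part is returned as `None`.

```python
import re


def split_reasoning(text,
                    reasoning_tags=("<think>", "</think>"),
                    solution_tags=("<response>", "</response>")):
    """Return (reasoning, solution) extracted from the generated text.

    Each part is None when its tag pair is missing or incomplete.
    """
    def between(open_tag, close_tag):
        # Non-greedy match across newlines for the first tagged span.
        pattern = re.escape(open_tag) + r"(.*?)" + re.escape(close_tag)
        match = re.search(pattern, text, flags=re.DOTALL)
        return match.group(1).strip() if match else None

    return between(*reasoning_tags), between(*solution_tags)


# Hypothetical model output, used only to illustrate the helper.
sample = "<think>Cats cannot fly, so magic is needed.</think><response>Once upon a time...</response>"
reasoning, solution = split_reasoning(sample)
print("Reasoning:", reasoning)
print("Solution:", solution)

# With the objects from the usage example above, decode only the newly
# generated tokens first, because the prompt itself contains the tags:
# new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
# generated = tokenizer.decode(new_tokens, skip_special_tokens=True)
# reasoning, solution = split_reasoning(generated)
```

Slicing off the prompt tokens matters here: the system prompt mentions both tag pairs literally, so parsing the full decoded sequence would match the instructions rather than the model's answer.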