thinkygemma-4b: Open-Source 4B Pseudo-Reasoning Model for Structured Reasoning Applications

Thinkygemma 4b

Developed by xsanskarx

A pseudo-reasoning model fine-tuned from Google Gemma-3-4b-it, designed for structured reasoning and induced pseudo-reasoning

Large Language Model

Transformers

#Pseudo-Reasoning Expert #Chain-of-Thought Fine-Tuning #Structured Reasoning

Downloads 19

Release Date: 3/14/2025

Model Overview

This model is a fine-tuned version of Google Gemma-3-4b-it. It aims to mimic a strong reasoner and focuses on structured reasoning and induced pseudo-reasoning tasks.

Model Features

Structured Reasoning Capability

Designed for structured reasoning and induced pseudo-reasoning; generates logically coherent, step-by-step reasoning traces.

Efficient Fine-Tuning

Uses LoRA fine-tuning (r = 128, alpha = 256); training took 9 hours (1 epoch) on a single NVIDIA H100.

High-Quality Training Data

Trained on 25,000 validated Chain-of-Thought (CoT) trajectories, sourced from DeepSeek R1 and Qwen QWQ.
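
The exact schema of the training rows is not documented on this page. As a rough sketch only, a CoT trace could be turned into a chat-style training example along these lines. The field names (question, reasoning, answer) and the sample row are hypothetical, chosen to match the <think> tag convention used in the Quick Start system prompt.

```python
# Hypothetical field names; the real dataset schema is not documented here.
def to_chat_example(row: dict) -> list[dict]:
    """Wrap a CoT trace in <think> tags ahead of the final answer."""
    target = f"<think>\n{row['reasoning'].strip()}\n</think>\n{row['answer'].strip()}"
    return [
        {"role": "user", "content": row["question"].strip()},
        {"role": "assistant", "content": target},
    ]

# Made-up row for illustration, not taken from the training set.
row = {
    "question": "What is 2 + 2?",
    "reasoning": "Adding 2 and 2 gives 4.",
    "answer": "4",
}

for message in to_chat_example(row):
    print(message["role"], "->", message["content"])
```

A list of messages in this shape is what tokenizer.apply_chat_template expects, which is why the sketch returns role/content dictionaries rather than a single string.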

Model Capabilities

Text Generation

Structured Reasoning

Induced Pseudo-Reasoning

Use Cases

Education

Logical Reasoning Teaching

Generates worked logical reasoning examples that help students follow how complex problems are solved.

Generates coherent reasoning chains, demonstrating step-by-step problem-solving processes.

Research

Reasoning Capability Research

Used to study the reasoning capabilities and pseudo-reasoning behaviors of AI models.

Provides analyzable reasoning trajectories, aiding in understanding model reasoning mechanisms.

🚀 thinkygemma-4b: your average fake reasoner

A model fine-tuned from Gemma-3-4b-it for structured reasoning.

🚀 Quick Start

This section provides a quick guide on how to set up and use the thinkygemma-4b model.

from transformers import AutoTokenizer, Gemma3ForConditionalGeneration, TextStreamer
import torch

# Load model and tokenizer
model_id = "xsanskarx/thinkygemma-4b"
model = Gemma3ForConditionalGeneration.from_pretrained(model_id, device_map="auto").eval()
tokenizer = AutoTokenizer.from_pretrained(model_id)

def ask_model(prompt: str, max_tokens=8192, temperature=0.7):
    """
    Function to ask a question to the model and stream the response.
    """
    messages = [
        {"role": "system", "content": "You are an expert math problem solver, think and reason inside <think> tags, enclose all reasoning in <think> tags, verifying logic step by step and then return your final structured answer"},
        {"role": "user", "content": prompt}
    ]

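    # add_generation_prompt=True appends the model-turn header so the model
    # answers instead of continuing the user turn
    formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    # The chat template already inserts <bos>, so do not add special tokens again
    inputs = tokenizer(formatted_prompt, return_tensors="pt", add_special_tokens=False).to(model.device)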

    streamer = TextStreamer(tokenizer, skip_special_tokens=True)
    with torch.inference_mode():
        model.generate(**inputs, max_new_tokens=max_tokens, do_sample=True, temperature=temperature, streamer=streamer)

# Example usage
ask_model("do 2+2")
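
The system prompt above asks the model to put its reasoning inside <think> tags. Assuming the output follows that convention (the model is not guaranteed to), the reasoning can be separated from the final answer once the generated text has been captured as a string. Since ask_model only streams to stdout, that means decoding the return value of model.generate first. The helper name split_reasoning is illustrative, not part of the model card.

```python
import re

# Assumption: reasoning is wrapped in <think>...</think>, as the system
# prompt requests. The model may not always follow this format.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from generated text."""
    # Collect every closed <think> block; several blocks are joined by newlines.
    reasoning = "\n".join(part.strip() for part in THINK_BLOCK.findall(text))
    # Whatever remains outside the tags is treated as the final answer.
    answer = THINK_BLOCK.sub("", text).strip()
    return reasoning, answer

sample = "<think>2 + 2 means adding two and two, which gives 4.</think>\nThe answer is 4."
reasoning, answer = split_reasoning(sample)
print(reasoning)
print(answer)
```

If generation stops at max_new_tokens before the closing tag, no block matches and the whole text comes back as the answer, so callers may want to check for an unclosed <think> separately.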

✨ Features

Fine-Tuned for Reasoning: This is a fine-tuned version of Google's Gemma-3-4b-it, adapted for structured reasoning / fake induced reasoning. It is designed to act like a great reasoner.
Model ID: xsanskarx/thinkygemma-4b
Parameters Trained: 1.8 billion
Trained on: 25k rows of verified Chain-of-Thought (CoT) traces from DeepSeek R1 and Qwen QWQ
Next Planned Step: GRPO
Adapters Repo: xsanskarx/thinkgemma-4b

📦 Installation

Installation mainly amounts to having the transformers library available and using it to load the model and tokenizer. The following snippet shows the loading step:

from transformers import AutoTokenizer, Gemma3ForConditionalGeneration
model_id = "xsanskarx/thinkygemma-4b"
model = Gemma3ForConditionalGeneration.from_pretrained(model_id, device_map="auto").eval()
tokenizer = AutoTokenizer.from_pretrained(model_id)
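
The snippet above imports Gemma3ForConditionalGeneration, which older transformers releases do not provide, and device_map="auto" additionally relies on the accelerate package. A small pre-flight check, sketched below, can catch an outdated install before the model download starts. The 4.50.0 minimum is an assumption based on when Gemma 3 support landed in transformers; the helper names are illustrative.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: Gemma 3 classes first shipped in transformers 4.50.0.
MIN_TRANSFORMERS = "4.50.0"

def version_tuple(v: str) -> tuple[int, ...]:
    """Parse the leading numeric components, e.g. '4.50.0.dev0' -> (4, 50, 0)."""
    parts = []
    for piece in v.split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric component (dev/rc suffixes)
        parts.append(int(piece))
    return tuple(parts)

def meets_minimum(installed: str, minimum: str = MIN_TRANSFORMERS) -> bool:
    """Compare versions component by component."""
    return version_tuple(installed) >= version_tuple(minimum)

try:
    installed = version("transformers")
    status = "ok" if meets_minimum(installed) else f"upgrade to >= {MIN_TRANSFORMERS}"
    print(f"transformers {installed}: {status}")
except PackageNotFoundError:
    print("transformers is not installed; try: pip install -U transformers accelerate")
```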

📚 Documentation

Model Description

This is a fine-tuned version of Google's Gemma-3-4b-it, adapted for structured reasoning / fake induced reasoning. It is designed to excel at acting like a great reasoner.

Training Details

Property	Details
Hardware	Single NVIDIA H100
Training Time	9 hours (1 epoch)
Training Method	LoRA fine-tuning (r = 128, alpha = 256)
Dataset	25k CoT traces
Base Model	`google/gemma-3-4b-it`
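
To make the LoRA settings in the table more concrete, the arithmetic below shows what r = 128 and alpha = 256 imply under the standard LoRA formulation, where the low-rank update is scaled by alpha / r. Whether this run used the standard scaling or a variant is not stated on this page, and the layer shape is a hypothetical placeholder rather than a value read from the model config.

```python
# LoRA hyperparameters from the Training Details table.
R = 128
ALPHA = 256

def lora_scaling(r: int, alpha: int) -> float:
    """Standard LoRA scales the low-rank update B @ A by alpha / r."""
    return alpha / r

def lora_added_params(d_in: int, d_out: int, r: int) -> int:
    """A is (r x d_in) and B is (d_out x r): r * (d_in + d_out) new weights."""
    return r * (d_in + d_out)

# Hypothetical layer shape, for illustration only (not from the model config).
D_IN, D_OUT = 2048, 2048

print(lora_scaling(R, ALPHA))             # 2.0
print(lora_added_params(D_IN, D_OUT, R))  # 524288
print(D_IN * D_OUT)                       # 4194304 weights in the full matrix
```

With these placeholder dimensions, one adapted matrix gains about an eighth as many trainable weights as the full matrix holds, which illustrates why a rank as high as 128 still trains far fewer parameters than full fine-tuning.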
