# Gemma-3-4b Reasoning R1 Model Card
Gemma-3-4b Reasoning is a transformer-based language model fine-tuned with GRPO for reasoning tasks, leveraging the DeepSeek-R1 methodology.
## 🚀 Quick Start
The model uses structured XML-style templates for dialogue and reasoning output. The example below shows basic usage:
### Basic Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "ericrisco/gemma-3-4b-reasoning"
prompt = "A cyclist travels 60 km in 3 hours at a constant speed. If he maintains the same speed, how many kilometers will he travel in 5 hours?"

# Load the tokenizer and the model in bfloat16; device_map="auto" lets
# accelerate place the weights on the available device(s).
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, device_map="auto", torch_dtype=torch.bfloat16
)

# Format the conversation with the model's chat template.
messages = [{"role": "user", "content": prompt}]
input_text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
# Use model.device rather than a hard-coded "cuda" so the snippet also
# works on CPU-only or multi-device setups.
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
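R1-style GRPO fine-tunes usually wrap their output in XML tags such as `<reasoning>` and `<answer>`. The exact tag names depend on the training template and are not documented in this card, so treat the following parser as a sketch under that assumption:

```python
import re

def parse_response(text: str) -> dict:
    """Split a response into reasoning and answer parts.

    Assumes the <reasoning>...</reasoning><answer>...</answer> tag
    convention common in R1/GRPO recipes; the tags used by this
    model may differ.
    """
    reasoning = re.search(r"<reasoning>(.*?)</reasoning>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    return {
        "reasoning": reasoning.group(1).strip() if reasoning else None,
        "answer": answer.group(1).strip() if answer else text.strip(),
    }

parsed = parse_response(response)
print(parsed["answer"])  # for the cyclist prompt, the expected answer is 100 km
```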
## ✨ Features
- Reasoning Focused: Gemma-3-4b Reasoning is a fine-tuned model designed to excel at structured, logical problem-solving and mathematical reasoning.
- Enhanced Reasoning Ability: Trained on the GSM8K dataset using GRPO, it reasons step by step and provides structured explanations (see the reward-function sketch after this list).
- Robust CoT Capabilities: The model exhibits robust internal Chain-of-Thought (CoT) capabilities, consistently demonstrating detailed explanations and structured problem-solving skills across reasoning tasks.
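GRPO scores groups of sampled completions with one or more reward functions and updates the policy toward higher-reward samples. The rewards used to train this model are not listed in the card; the snippet below is an illustrative format reward of the kind commonly used in GSM8K GRPO recipes (the tag names and plain-string completion format are assumptions):

```python
import re

def format_reward(completions, **kwargs):
    # Score 1.0 when a completion follows the expected
    # <reasoning>...</reasoning><answer>...</answer> layout, else 0.0.
    # Assumes completions arrive as plain strings.
    pattern = r"<reasoning>.*?</reasoning>\s*<answer>.*?</answer>"
    return [1.0 if re.search(pattern, c, re.DOTALL) else 0.0 for c in completions]
```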
## 📦 Installation
No model-specific installation is required. The Quick Start example only needs the standard Hugging Face stack, e.g. `pip install transformers accelerate torch` (accelerate is required for `device_map="auto"`).
## 📚 Documentation
### Model Details
#### Description
Gemma-3-4b Reasoning is a reasoning-focused fine-tune designed to excel at structured, logical problem-solving and mathematical reasoning. Training was performed on the GSM8K dataset using GRPO, strengthening the model's ability to reason step by step and provide structured explanations.
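The card does not include the training script itself. Assuming the standard TRL `GRPOTrainer` setup, training roughly follows the pattern below; the hyperparameters shown are placeholders, not the values used for this model:

```python
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# GRPOTrainer expects a "prompt" column, so map GSM8K's "question" onto it.
dataset = load_dataset("openai/gsm8k", "main", split="train")
dataset = dataset.map(lambda x: {"prompt": x["question"]})

# Placeholder hyperparameters; the real configuration is not documented here.
config = GRPOConfig(
    output_dir="gemma-3-4b-reasoning-grpo",
    num_generations=8,          # completions sampled per prompt
    max_completion_length=256,
)

trainer = GRPOTrainer(
    model="google/gemma-3-4b-it",   # base model per the table below
    reward_funcs=[format_reward],   # e.g. the format reward sketched above
    args=config,
    train_dataset=dataset,
)
trainer.train()
```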
#### Training Dataset
- GSM8K (English): Specialized dataset of grade-school math word problems for mathematical and logical reasoning; a sample is shown below.
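Each GSM8K entry pairs a word problem with a worked solution whose final line is `#### <number>`, which makes gold answers easy to extract:

```python
from datasets import load_dataset

gsm8k = load_dataset("openai/gsm8k", "main", split="test")
sample = gsm8k[0]
print(sample["question"])

# The gold answer follows the final "####" marker in the solution text.
gold = sample["answer"].split("####")[-1].strip()
print(gold)
```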
### Intended Use
#### Direct Use
The model is specifically designed for structured reasoning tasks, including:
- Mathematical and logical reasoning
- Multi-step problem solving
- Instruction-based reasoning
#### Out-of-scope Use
This model should not be used for unethical or malicious activities that breach legal and ethical standards.
### Performance
The Gemma-3-4b Reasoning model exhibits robust internal Chain-of-Thought (CoT) capabilities, consistently demonstrating detailed explanations and structured problem-solving skills across reasoning tasks.
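No quantitative benchmark results are reported in the card. One way to spot-check the claim is to compare extracted answers against GSM8K gold answers; this sketch reuses `parse_response` from Quick Start and the `gsm8k` split loaded above:

```python
def gsm8k_accuracy(model, tokenizer, dataset, n=50):
    """Rough spot-check: substring match between the model's extracted
    answer and the GSM8K gold answer. Not a rigorous benchmark."""
    correct = 0
    for sample in dataset.select(range(n)):
        messages = [{"role": "user", "content": sample["question"]}]
        text = tokenizer.apply_chat_template(
            messages, tokenize=False, add_generation_prompt=True
        )
        inputs = tokenizer(text, return_tensors="pt").to(model.device)
        outputs = model.generate(**inputs, max_new_tokens=512)
        # Decode only the newly generated tokens.
        reply = tokenizer.decode(
            outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
        )
        gold = sample["answer"].split("####")[-1].strip()
        if gold in parse_response(reply)["answer"]:
            correct += 1
    return correct / n

print(gsm8k_accuracy(model, tokenizer, gsm8k, n=50))
```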
### Limitations
The model is primarily optimized for numeric and structured reasoning and might produce less accurate or unexpected results when applied to unrelated tasks.
### Citations
- Gemma Multimodal Reasoning Model by Google
- GRPO Implementation by TRL
### Author
Eric Risco
## 📄 License
The entire Gemma-3-4b Reasoning family is available under a permissive Apache 2.0 license. All training scripts and configurations used are publicly accessible.
## 📋 Model Information
| Property | Details |
|----------|---------|
| Model Type | Transformer-based language model fine-tuned with GRPO |
| Training Data | GSM8K (English) |
| Base Model | google/gemma-3-4b-it |
| License | Apache 2.0 |