🚀 Phi-4-mini-reasoning
Phi-4-mini-reasoning is a lightweight open model built on high-quality, reasoning-dense data and fine-tuned for advanced math reasoning. It supports a 128K-token context length and delivers efficient performance on math-related tasks.
🚀 Quick Start
To quickly start using Phi-4-mini-reasoning, you need to set up the necessary environment. First, ensure you have the required packages installed:
flash_attn==2.7.4.post1
torch==2.5.1
transformers==4.51.3
accelerate==1.3.0
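If you want to confirm that your environment matches these versions before running inference, a quick check along the following lines can help (a minimal sketch; the package names and versions are simply the ones listed above):
# Optional sanity check (illustrative, not part of the official setup): confirm the
# installed versions match the ones listed above.
from importlib.metadata import PackageNotFoundError, version

expected = {
    "flash-attn": "2.7.4.post1",
    "torch": "2.5.1",
    "transformers": "4.51.3",
    "accelerate": "1.3.0",
}
for package, wanted in expected.items():
    try:
        installed = version(package)
    except PackageNotFoundError:
        installed = "not installed"
    print(f"{package}: {installed} (expected {wanted})")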
After obtaining the model checkpoints, you can use the following sample code for inference:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

torch.random.manual_seed(0)

model_id = "microsoft/Phi-4-mini-reasoning"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Build a chat-format prompt and tokenize it.
messages = [{
    "role": "user",
    "content": "How to solve 3*x^2+4*x+5=1?",
}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
)

# Sample a step-by-step solution; reasoning traces can be long, so allow many new tokens.
outputs = model.generate(
    **inputs.to(model.device),
    max_new_tokens=32768,
    temperature=0.8,
    top_p=0.95,
    do_sample=True,
)

# Decode only the newly generated tokens (strip the prompt).
outputs = tokenizer.batch_decode(outputs[:, inputs["input_ids"].shape[-1]:])
print(outputs[0])
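The pipeline helper imported above offers a higher-level alternative. The sketch below reuses the model, tokenizer, and messages objects from the snippet above and is illustrative rather than part of the official example:
# Alternative: run the same chat-style generation through the text-generation pipeline,
# reusing the model, tokenizer, and messages defined above.
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)
result = pipe(
    messages,
    max_new_tokens=32768,
    temperature=0.8,
    top_p=0.95,
    do_sample=True,
)
# For chat input, generated_text holds the full conversation; the last entry is the reply.
print(result[0]["generated_text"][-1]["content"])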
✨ Features
- Lightweight and Efficient: With only 3.8B parameters, it achieves a level of multilingual language understanding and reasoning ability similar to much larger models, making it suitable for memory/compute-constrained environments.
- Advanced Math Reasoning: Trained on high-quality, reasoning-dense data and fine-tuned for advanced math reasoning, it excels at multi-step, logic-intensive mathematical problem-solving.
- Large Context Length: Supports a 128K-token context length, maintaining context across steps in long problem-solving chains.
📚 Documentation
Model Summary
Phi-4-mini-reasoning is a lightweight open model built upon synthetic data with a focus on high-quality, reasoning-dense data. It is further fine-tuned for more advanced math reasoning capabilities. The model belongs to the Phi-4 model family and supports a 128K-token context length.
Intended Uses
Primary Use Cases
Phi-4-mini-reasoning is designed for multi-step, logic-intensive mathematical problem-solving in memory/compute-constrained environments and latency-bound scenarios. Use cases include formal proof generation, symbolic computation, advanced word problems, and a wide range of mathematical reasoning scenarios.
Use Case Considerations
This model is designed and tested for math reasoning only. Developers should consider the common limitations of language models and performance differences across languages, and should evaluate and mitigate for accuracy, safety, and fairness before using the model in specific downstream use cases, especially high-risk scenarios. They should also adhere to applicable laws and regulations.
Release Notes
This release of Phi-4-mini-reasoning addresses user feedback and market demand for a compact reasoning model. It is a transformer-based language model optimized for mathematical reasoning, delivering high-quality, step-by-step problem solving in constrained environments.
Model Quality
| Model | AIME | MATH-500 | GPQA Diamond |
| --- | --- | --- | --- |
| o1-mini* | 63.6 | 90.0 | 60.0 |
| DeepSeek-R1-Distill-Qwen-7B | 53.3 | 91.4 | 49.5 |
| DeepSeek-R1-Distill-Llama-8B | 43.3 | 86.9 | 47.3 |
| Bespoke-Stratos-7B* | 20.0 | 82.0 | 37.8 |
| OpenThinker-7B* | 31.3 | 83.0 | 42.4 |
| Llama-3.2-3B-Instruct | 6.7 | 44.4 | 25.3 |
| Phi-4-Mini (base model, 3.8B) | 10.0 | 71.8 | 36.9 |
| Phi-4-mini-reasoning (3.8B) | 57.5 | 94.6 | 52.0 |
Overall, the 3.8B-parameter model achieves a level of multilingual language understanding and reasoning ability similar to that of much larger models. However, it is limited by its size for certain tasks, and users may encounter factual inaccuracies. This weakness may be mitigated by augmenting the model with a search engine in a retrieval-augmented generation (RAG) setting.
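As a rough illustration of that RAG-style augmentation, the sketch below simply prepends retrieved reference text to the user question; retrieve is a hypothetical placeholder for whatever search engine or vector store you use, not something shipped with this model.
# Hypothetical RAG-style augmentation: prepend retrieved reference text to the question.
# `retrieve` is an illustrative placeholder, not part of this model or release.
def retrieve(query: str, k: int = 3) -> list[str]:
    # A real implementation would call a search engine or vector store here.
    return []

question = "How to solve 3*x^2+4*x+5=1?"
context = "\n".join(retrieve(question))
messages = [{
    "role": "user",
    "content": f"Reference material:\n{context}\n\nQuestion: {question}",
}]
# `messages` can then be passed to apply_chat_template() and generate() as in Quick Start.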
Usage
Tokenizer
Phi-4-mini-reasoning supports a vocabulary size of up to 200,064 tokens. The tokenizer files already provide placeholder tokens that can be used for downstream fine-tuning, and the vocabulary can be extended up to the model's full vocabulary size.
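If you extend the vocabulary for fine-tuning, the usual transformers pattern is to add tokens and resize the embeddings, roughly as sketched below (the token names here are hypothetical examples):
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-4-mini-reasoning"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Hypothetical task-specific tokens; any new names follow the same pattern.
num_added = tokenizer.add_special_tokens(
    {"additional_special_tokens": ["<|my_tool_call|>", "<|my_tool_result|>"]}
)

# Keep the embedding matrix in sync with the tokenizer, within the model's vocabulary size.
if num_added > 0:
    model.resize_token_embeddings(len(tokenizer))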
Input Formats
The Phi-4-mini-reasoning model is best suited for prompts using the following chat format:
<|system|>Your name is Phi, an AI math expert developed by Microsoft.<|end|><|user|>How to solve 3*x^2+4*x+5=1?<|end|><|assistant|>
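This string can also be produced through the tokenizer's chat template rather than written by hand. The sketch below loads the tokenizer as in the Quick Start section; the rendered prompt should closely match the format shown above.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-4-mini-reasoning")

messages = [
    {"role": "system", "content": "Your name is Phi, an AI math expert developed by Microsoft."},
    {"role": "user", "content": "How to solve 3*x^2+4*x+5=1?"},
]
# Render the template to a string instead of token IDs; the output should closely match
# the <|system|>...<|end|><|user|>...<|end|><|assistant|> format shown above.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt)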
Training
Model
- Architecture: Shares the same architecture as Phi-4-Mini, a 3.8B-parameter dense decoder-only Transformer model. The major changes compared to Phi-3.5-Mini are a 200K vocabulary, grouped-query attention, and shared input and output embeddings.
- Inputs: Text, best suited for chat-format prompts.
- Context length: 128K tokens
- GPUs: 128 H100-80G
- Training time: 2 days
- Training data: 150B tokens
- Outputs: Generated text
- Dates: Trained in February 2025
- Status: A static model trained on offline datasets with a cutoff date of February 2025 for publicly available data.
- Supported languages: English
- Release date: April 2025
Training Datasets
The training data consists of synthetic mathematical content generated by DeepSeek-R1. The synthetic dataset contains over one million diverse math problems, with about 30 billion tokens of math content retained after verification. The dataset integrates three components:
- High-quality, publicly available math questions and part of the SFT data used for the base Phi-4-Mini model.
- Synthetic math data generated by DeepSeek-R1, used for supervised fine-tuning and model distillation.
- Preference data with correct and incorrect answers, used to enhance reasoning capabilities.
Software
See the required package versions listed in the Quick Start section (flash_attn 2.7.4.post1, torch 2.5.1, transformers 4.51.3, accelerate 1.3.0).
Hardware
The Phi-4-mini-reasoning model uses flash attention by default, which requires specific GPU hardware. It has been tested on NVIDIA A100 and NVIDIA H100. To run the model on NVIDIA V100 or earlier-generation GPUs, call AutoModelForCausalLM.from_pretrained() with attn_implementation="eager".
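For example, the Quick Start load call could be adjusted roughly as follows for a V100-class GPU (an illustrative variant, not a separate official setup):
from transformers import AutoModelForCausalLM

# Load without flash attention for V100-class or older GPUs (variant of the Quick Start call).
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-4-mini-reasoning",
    device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=True,
    attn_implementation="eager",
)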
Safety Evaluation and Red-Teaming
The Phi-4 family of models uses a robust safety post-training approach, combining SFT, DPO, and RLHF with human-labeled and synthetic English-language datasets. Phi-4-mini-reasoning was developed according to Microsoft's responsible AI principles, and its safety was assessed using the Azure AI Foundry framework.
Responsible AI Considerations
Developers should be aware of potential limitations such as unfairness, unreliability, offensive content, and information inaccuracy. They should apply responsible AI best practices, fine-tune the model for their specific use cases, and implement appropriate safeguards.
Appendix A: Benchmark Methodology
We aim to ensure fair comparisons in benchmarks by using the same generation configuration for every model. The model is evaluated on three popular reasoning benchmarks:
- MATH-500: Consists of 500 challenging math problems that require complex reasoning and problem-solving.
- AIME 2024: Problems from a highly regarded math competition, used to assess advanced mathematical skill.
- GPQA Diamond: A set of challenging, graduate-level, expert-written science questions used to assess reasoning beyond pure math.
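The exact evaluation harness is not reproduced here. As a hedged sketch of applying "the same generation configuration" to every prompt, something along these lines could be used, reusing the model and tokenizer from the Quick Start section and borrowing its sampling values purely for illustration:
# Illustrative only: one shared generation configuration applied to every benchmark prompt.
# Sampling values are borrowed from the Quick Start example, not the official evaluation
# settings; `model` and `tokenizer` are assumed to be loaded as in the Quick Start section.
GENERATION_CONFIG = dict(
    max_new_tokens=32768,
    temperature=0.8,
    top_p=0.95,
    do_sample=True,
)

def answer(prompt: str) -> str:
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
    )
    outputs = model.generate(**inputs.to(model.device), **GENERATION_CONFIG)
    return tokenizer.batch_decode(outputs[:, inputs["input_ids"].shape[-1]:])[0]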
📄 License
The model is licensed under the MIT license.
Trademarks
This project may contain trademarks or logos. Authorized use of Microsoft trademarks or logos must follow Microsoft’s Trademark & Brand Guidelines. Use of third-party trademarks or logos is subject to those parties’ policies.