MAI-DS-R1: An Enhanced Reasoning Model
MAI-DS-R1 is a DeepSeek-R1 reasoning model post-trained by the Microsoft AI team. It aims to improve responsiveness on previously blocked topics and to improve the model's risk profile, while maintaining its reasoning capabilities and competitive performance.
Quick Start
The MAI-DS-R1 model preserves the general reasoning capabilities of DeepSeek-R1 and can be directly applied to a wide range of language understanding and generation tasks. For more detailed usage, please refer to the following sections.
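A minimal usage sketch with the Hugging Face transformers library is shown below. The repo id `microsoft/MAI-DS-R1` and the hardware notes are assumptions to verify against the model page; since the full-size checkpoint requires multi-GPU hardware, the heavy model load is shown commented out.

```python
# Hypothetical quick-start sketch for MAI-DS-R1 via Hugging Face transformers.
# The repo id "microsoft/MAI-DS-R1" is an assumption; check the model page.
from typing import Dict, List, Optional


def build_chat(user_prompt: str, system_prompt: Optional[str] = None) -> List[Dict[str, str]]:
    """Assemble the chat-format message list that transformers pipelines accept."""
    messages: List[Dict[str, str]] = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return messages


# Loading the 671B-parameter checkpoint needs multi-GPU hardware, so the
# call is left commented out here:
# from transformers import pipeline
# generator = pipeline("text-generation", model="microsoft/MAI-DS-R1")
# print(generator(build_chat("Solve: 17 * 24 = ?"), max_new_tokens=512))
```

The same message list works with any transformers chat-capable text-generation pipeline, so smaller models can be substituted for local experimentation.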
Features
- Enhanced Responsiveness: Successfully unblocks most previously blocked queries from the original R1 model.
- Improved Risk Profile: Outperforms the R1-1776 model on relevant safety benchmarks.
- Broad Application: Suitable for various language tasks, including text generation, knowledge answering, reasoning, code generation, and scientific applications.
Documentation
Model Details
Model Description
MAI-DS-R1 is a DeepSeek-R1 reasoning model post-trained by the Microsoft AI team. It fills information gaps in the previous model version, improves the risk profile, and maintains R1's reasoning capabilities. The model was trained using 110k Safety and Non-Compliance examples from the Tulu 3 SFT dataset and an internally developed dataset of ~350k multilingual examples covering various topics with reported biases.
Note: Microsoft has post-trained this model to address certain output limitations, but previous limitations and considerations, including security aspects, still apply.
Uses
Direct Use
MAI-DS-R1 can be used for:
- General text generation and understanding: Produce coherent and context-relevant text for various prompts, such as dialogue, essays, or story continuation.
- General knowledge tasks: Answer open-domain questions requiring factual knowledge.
- Reasoning and problem solving: Handle multi-step reasoning tasks using chain-of-thought strategies.
- Code generation and comprehension: Assist with programming tasks by generating code snippets or explaining code.
- Scientific and academic applications: Assist with structured problem-solving in STEM and research domains.
Downstream Use (Optional)
The model can serve as a foundation for further fine-tuning in domain-specific reasoning tasks, such as automated tutoring systems for mathematics, coding assistants, and research tools in scientific or technical fields.
Out-of-Scope Use
- Medical or health advice: The model is not a medical device and cannot guarantee accurate medical diagnoses or safe treatment recommendations.
- Legal advice: The model is not a lawyer and should not be used for definitive legal counsel, law interpretation, or independent legal decision-making.
- Safety-critical systems: Not suitable for autonomous systems where failures could cause injury, loss of life, or significant property damage.
- High-stakes decision support: Should not be relied on for decisions affecting finances, security, or personal well-being.
- Malicious or unethical use: Must not be used to produce harmful, illegal, deceptive, or unethical content.
Bias, Risks, and Limitations
- Biases: The model may retain biases from the training data and the original DeepSeek-R1, especially in cultural and demographic aspects.
- Risks: It may hallucinate facts, be vulnerable to adversarial prompts, or generate unsafe, biased, or harmful content under certain conditions. Developers should implement content moderation and usage monitoring.
- Limitations: MAI-DS-R1 shares DeepSeek-R1's knowledge cutoff and may lack awareness of recent events or domain-specific facts.
Recommendations
- Transparency on Limitations: Users should be explicitly informed of the model's potential biases and limitations.
- Human Oversight and Verification: Human review or automated validation of outputs should be implemented in sensitive or high - stakes scenarios.
- Usage Safeguards: Developers should integrate content filtering, prompt engineering best practices, and continuous monitoring to mitigate risks.
- Legal and Regulatory Compliance: Operators must ensure compliance with regional regulations as the model may output politically sensitive content.
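As one concrete instance of the usage-safeguard recommendation above, the sketch below shows a deliberately naive keyword-based output filter. The blocklist phrases are hypothetical placeholders; a production deployment would use a dedicated moderation service rather than substring matching.

```python
# Toy illustration of an output-side content filter for the "Usage
# Safeguards" recommendation. The blocklist is a hypothetical placeholder;
# real systems should use a proper moderation service, not substring scans.
from typing import Iterable, Tuple

BLOCKLIST = ("build an explosive device", "stolen credit card numbers")  # placeholders


def screen_output(text: str, blocklist: Iterable[str] = BLOCKLIST) -> Tuple[bool, str]:
    """Return (allowed, text_or_refusal) after a naive case-insensitive scan."""
    lowered = text.lower()
    for phrase in blocklist:
        if phrase in lowered:
            return False, "[withheld by content filter]"
    return True, text
```

A filter like this would sit between the model's raw output and the user, alongside prompt-side checks and logging for continuous monitoring.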
Evaluation
Testing Data, Factors & Metrics
Testing Data
- Public Benchmarks: Cover a wide range of tasks, including natural language inference, question answering, mathematical reasoning, commonsense reasoning, code generation, and code completion.
- Blocking Test Set: Consists of 3.3k prompts on various blocked topics from R1, covering 11 languages.
- Harm Mitigation Test Set: A split from the HarmBench dataset, with 320 queries in three functional categories and eight semantic categories.
Factors
- Input Topic and Sensitivity: Tuned to discuss previously blocked topics freely, while remaining restrictive for genuinely harmful content.
- Language: May inherit limitations from the original DeepSeek-R1, with stronger performance in English and Chinese.
- Prompt Complexity and Reasoning Required: Performs well on multi-step reasoning queries, though very long or convoluted prompts may still be challenging.
- User Instructions and Role Prompts: Responses can be shaped by system- or developer-provided instructions.
Metrics
- Public benchmarks: Accuracy and Pass@1.
- Blocking evaluation: Satisfaction and % Responses.
- Harm mitigation evaluation: Attack Success Rate and Micro Attack Success Rate.
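How two of the headline metrics can be computed is sketched below: Pass@1 via the widely used unbiased pass@k estimator over n sampled completions, and Attack Success Rate as the fraction of adversarial prompts that elicit harmful output. These definitions are assumptions about the evaluation harness, not a description of its exact implementation.

```python
# Sketch of the two headline metric families, stated as assumptions about
# the evaluation rather than the exact harness used for this model card.
from math import comb
from typing import List


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n samples per task, c of them correct.
    For k=1 this reduces to c/n, the plain per-task success rate."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def attack_success_rate(attack_outcomes: List[bool]) -> float:
    """Fraction of adversarial prompts judged to elicit harmful output."""
    return sum(attack_outcomes) / len(attack_outcomes)
```

Averaging `pass_at_k` over all tasks gives the benchmark score; "Micro" ASR aggregates outcomes over all prompts rather than averaging per-category rates first.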
Results
- General Knowledge & Reasoning: Performs on par with DeepSeek-R1 and slightly better than R1-1776, especially on mgsm_chain_of_thought_zh.
- Blocked Topics: Unblocks 99.3% of samples, matching R1-1776, with a higher Satisfaction score.
- Harm Mitigation: Outperforms both R1-1776 and the original R1 model in minimizing harmful content.
Model Architecture and Objective
- Model Name: MAI-DS-R1
- Architecture: Based on DeepSeek-R1, a transformer-based autoregressive language model using multi-head self-attention and Mixture-of-Experts (MoE) for scalable and efficient inference.
- Objective: Post-trained to reduce CCP-aligned restrictions and enhance harm protection while preserving the original reasoning and language understanding capabilities.
- Pre-trained Model Base: DeepSeek-R1 (671B)
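To make the MoE bullet concrete, here is a generic top-k gating sketch in pure Python. It illustrates only the routing idea, where each token activates a few experts instead of the whole network; it is not DeepSeek-R1's actual implementation, which involves many experts and load-balancing objectives.

```python
# Schematic top-k Mixture-of-Experts routing: softmax the gate logits,
# keep the top_k experts, renormalize their weights, and mix their outputs.
# A generic illustration, not DeepSeek-R1's real MoE layer.
from math import exp
from typing import Callable, List


def softmax(logits: List[float]) -> List[float]:
    m = max(logits)
    exps = [exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x: float,
                gate_logits: List[float],
                experts: List[Callable[[float], float]],
                top_k: int = 2) -> float:
    """Route input x to the top_k experts by gate score and mix their
    outputs with renormalized gate weights."""
    scores = softmax(gate_logits)
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    norm = sum(scores[i] for i in top)
    return sum(scores[i] / norm * experts[i](x) for i in top)
```

Because only `top_k` experts run per input, total parameter count can grow far beyond the per-token compute cost, which is the efficiency argument behind MoE inference.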
License
This project is licensed under the MIT license.