MAI-DS-R1: An Enhanced Reasoning Model
MAI-DS-R1 is a DeepSeek-R1 reasoning model post-trained by the Microsoft AI team. It aims to improve responsiveness on previously blocked topics and to improve the model's risk profile, while maintaining its reasoning capabilities and competitive performance.
Quick Start
The MAI-DS-R1 model preserves the general reasoning capabilities of DeepSeek-R1 and can be directly applied to a wide range of language understanding and generation tasks. For more detailed usage, please refer to the following sections.
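A minimal usage sketch with the Hugging Face transformers library is shown below. The repo id `microsoft/MAI-DS-R1` and the hardware notes are assumptions to verify against the model page; since the full-size checkpoint requires multi-GPU hardware, the heavy model load is shown commented out.

```python
# Hypothetical quick-start sketch for MAI-DS-R1 via Hugging Face transformers.
# The repo id "microsoft/MAI-DS-R1" is an assumption; check the model page.
from typing import Dict, List, Optional


def build_chat(user_prompt: str, system_prompt: Optional[str] = None) -> List[Dict[str, str]]:
    """Assemble the chat-format message list that transformers pipelines accept."""
    messages: List[Dict[str, str]] = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return messages


# Loading the 671B-parameter checkpoint needs multi-GPU hardware, so the
# call is left commented out here:
# from transformers import pipeline
# generator = pipeline("text-generation", model="microsoft/MAI-DS-R1")
# print(generator(build_chat("Solve: 17 * 24 = ?"), max_new_tokens=512))
```

The same message list works with any transformers chat-capable text-generation pipeline, so smaller models can be substituted for local experimentation.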
Features
- Enhanced Responsiveness: Successfully unblocks most previously blocked queries from the original R1 model.
- Improved Risk Profile: Outperforms the R1-1776 model on relevant safety benchmarks.
- Broad Application: Suitable for various language tasks, including text generation, knowledge answering, reasoning, code generation, and scientific applications.
Documentation
Model Details
Model Description
MAI-DS-R1 is a DeepSeek-R1 reasoning model post-trained by the Microsoft AI team. It fills information gaps in the previous model version, improves the risk profile, and maintains R1's reasoning capabilities. The model was trained using 110k Safety and Non-Compliance examples from the Tulu 3 SFT dataset and an internally developed dataset of ~350k multilingual examples covering various topics with reported biases.
Note: Microsoft has post-trained this model to address certain output limitations, but previous limitations and considerations, including security aspects, still apply.
Uses
Direct Use
MAI-DS-R1 can be used for:
- General text generation and understanding: Produce coherent and context-relevant text for various prompts, such as dialogue, essays, or story continuation.
- General knowledge tasks: Answer open-domain questions requiring factual knowledge.
- Reasoning and problem solving: Handle multi-step reasoning tasks using chain-of-thought strategies.
- Code generation and comprehension: Assist with programming tasks by generating code snippets or explaining code.
- Scientific and academic applications: Assist with structured problem-solving in STEM and research domains.
Downstream Use (Optional)
The model can serve as a foundation for further fine-tuning in domain-specific reasoning tasks, such as automated tutoring systems for mathematics, coding assistants, and research tools in scientific or technical fields.
Out-of-Scope Use
- Medical or health advice: The model is not a medical device and cannot guarantee accurate medical diagnoses or safe treatment recommendations.
- Legal advice: The model is not a lawyer and should not be used for definitive legal counsel, law interpretation, or independent legal decision-making.
- Safety-critical systems: Not suitable for autonomous systems where failures could cause injury, loss of life, or significant property damage.
- High-stakes decision support: Should not be relied on for decisions affecting finances, security, or personal well-being.
- Malicious or unethical use: Must not be used to produce harmful, illegal, deceptive, or unethical content.
Bias, Risks, and Limitations
- Biases: The model may retain biases from the training data and the original DeepSeek-R1, especially in cultural and demographic aspects.
- Risks: It may hallucinate facts, be vulnerable to adversarial prompts, or generate unsafe, biased, or harmful content under certain conditions. Developers should implement content moderation and usage monitoring.
- Limitations: MAI-DS-R1 shares DeepSeek-R1's knowledge cutoff and may lack awareness of recent events or domain-specific facts.
Recommendations
- Transparency on Limitations: Users should be explicitly informed of the model's potential biases and limitations.
- Human Oversight and Verification: Human review or automated validation of outputs should be implemented in sensitive or high - stakes scenarios.
- Usage Safeguards: Developers should integrate content filtering, prompt engineering best practices, and continuous monitoring to mitigate risks.
- Legal and Regulatory Compliance: Operators must ensure compliance with regional regulations as the model may output politically sensitive content.
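As one concrete instance of the usage-safeguard recommendation above, the sketch below shows a deliberately naive keyword-based output filter. The blocklist phrases are hypothetical placeholders; a production deployment would use a dedicated moderation service rather than substring matching.

```python
# Toy illustration of an output-side content filter for the "Usage
# Safeguards" recommendation. The blocklist is a hypothetical placeholder;
# real systems should use a proper moderation service, not substring scans.
from typing import Iterable, Tuple

BLOCKLIST = ("build an explosive device", "stolen credit card numbers")  # placeholders


def screen_output(text: str, blocklist: Iterable[str] = BLOCKLIST) -> Tuple[bool, str]:
    """Return (allowed, text_or_refusal) after a naive case-insensitive scan."""
    lowered = text.lower()
    for phrase in blocklist:
        if phrase in lowered:
            return False, "[withheld by content filter]"
    return True, text
```

A filter like this would sit between the model's raw output and the user, alongside prompt-side checks and logging for continuous monitoring.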
Evaluation
Testing Data, Factors & Metrics
Testing Data
- Public Benchmarks: Cover a wide range of tasks, including natural language inference, question answering, mathematical reasoning, commonsense reasoning, code generation, and code completion.
- Blocking Test Set: Consists of 3.3k prompts on various blocked topics from R1, covering 11 languages.
- Harm Mitigation Test Set: A split from the HarmBench dataset, with 320 queries in three functional categories and eight semantic categories.
Factors
- Input Topic and Sensitivity: Tuned to discuss previously blocked topics freely, while remaining restrictive for genuinely harmful content.
- Language: May inherit limitations from the original DeepSeek-R1, with stronger performance in English and Chinese.
- Prompt Complexity and Reasoning Required: Performs well on multi-step reasoning queries, though very long or convoluted prompts may still be challenging.
- User Instructions and Role Prompts: Responses can be shaped by system- or developer-provided instructions.
Metrics
- Public benchmarks: Accuracy and Pass@1.
- Blocking evaluation: Satisfaction and % Responses.
- Harm mitigation evaluation: Attack Success Rate and Micro Attack Success Rate.
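How two of the headline metrics can be computed is sketched below: Pass@1 via the widely used unbiased pass@k estimator over n sampled completions, and Attack Success Rate as the fraction of adversarial prompts that elicit harmful output. These definitions are assumptions about the evaluation harness, not a description of its exact implementation.

```python
# Sketch of the two headline metric families, stated as assumptions about
# the evaluation rather than the exact harness used for this model card.
from math import comb
from typing import List


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n samples per task, c of them correct.
    For k=1 this reduces to c/n, the plain per-task success rate."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def attack_success_rate(attack_outcomes: List[bool]) -> float:
    """Fraction of adversarial prompts judged to elicit harmful output."""
    return sum(attack_outcomes) / len(attack_outcomes)
```

Averaging `pass_at_k` over all tasks gives the benchmark score; "Micro" ASR aggregates outcomes over all prompts rather than averaging per-category rates first.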
Results
- General Knowledge & Reasoning: Performs on par with DeepSeek-R1 and slightly better than R1-1776, especially on mgsm_chain_of_thought_zh.
- Blocked Topics: Unblocks 99.3% of samples, matching R1-1776, with a higher Satisfaction score.
- Harm Mitigation: Outperforms both R1-1776 and the original R1 model in minimizing harmful content.
Model Architecture and Objective
- Model Name: MAI-DS-R1
- Architecture: Based on DeepSeek-R1, a transformer-based autoregressive language model using multi-head self-attention and Mixture-of-Experts (MoE) for scalable and efficient inference.
- Objective: Post-trained to reduce CCP-aligned restrictions and enhance harm protection while preserving the original reasoning and language understanding capabilities.
- Pre-trained Model Base: DeepSeek-R1 (671B)
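To make the MoE bullet concrete, here is a generic top-k gating sketch in pure Python. It illustrates only the routing idea, where each token activates a few experts instead of the whole network; it is not DeepSeek-R1's actual implementation, which involves many experts and load-balancing objectives.

```python
# Schematic top-k Mixture-of-Experts routing: softmax the gate logits,
# keep the top_k experts, renormalize their weights, and mix their outputs.
# A generic illustration, not DeepSeek-R1's real MoE layer.
from math import exp
from typing import Callable, List


def softmax(logits: List[float]) -> List[float]:
    m = max(logits)
    exps = [exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x: float,
                gate_logits: List[float],
                experts: List[Callable[[float], float]],
                top_k: int = 2) -> float:
    """Route input x to the top_k experts by gate score and mix their
    outputs with renormalized gate weights."""
    scores = softmax(gate_logits)
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    norm = sum(scores[i] for i in top)
    return sum(scores[i] / norm * experts[i](x) for i in top)
```

Because only `top_k` experts run per input, total parameter count can grow far beyond the per-token compute cost, which is the efficiency argument behind MoE inference.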
License
This project is licensed under the MIT license.