Gazal-R1-32B-GRPO-preview Open-source Medical Language Model - Empowering Medical Reasoning and Clinical Decision-making!

Gazal R1 32B GRPO Preview

Developed by TachyHealth

Gazal-R1-32B is a language model designed for medical reasoning and clinical decision-making. It is built on Qwen 3 32B and performs strongly on standard medical benchmarks.

Large Language Model

Transformers

Open Source License: Apache-2.0 #Medical reasoning expert #Structured clinical thinking #GRPO reinforcement learning

Downloads 116

Release Time : 5/26/2025

Model Overview

Gazal-R1-32B is a language model designed for medical reasoning and clinical decision-making, intended to support medical research and clinical assistance.

Model Features

Medical expertise

Trained on 107,033 synthetic medical reasoning examples covering diagnostic reasoning, treatment planning, decision-making under uncertainty, and prognostic assessment.

Transparent reasoning

Provides structured clinical thinking with step-by-step explanations inside `<think></think>` tags, following established clinical reasoning frameworks.

Excellent performance

Achieves 87.1% on MedQA, 81.6% on MMLU Pro (Medical), and 79.6% on PubMedQA, surpassing models up to 12 times larger.

Parameter efficiency

Uses parameter-efficient training techniques including Weight-Decomposed Low-Rank Adaptation (DoRA) and Rank-Stabilized LoRA (rsLoRA).

Alignment optimization

Refined through Group Relative Policy Optimization (GRPO) with a multi-component reward system.

Medical knowledge

Has a comprehensive understanding of multiple medical specialties and clinical scenarios.

Model Capabilities

Medical reasoning

Clinical decision-making support

Diagnostic reasoning

Treatment planning

Prognosis assessment

Medical knowledge Q&A

Use Cases

Research and education

Medical education and training

Helps medical students and doctors improve their clinical reasoning abilities.

Clinical reasoning research

Helps researchers analyze complex medical cases.

Professional support

Literature review assistance

Assists with medical literature reviews and provides structured summaries of medical knowledge.

Clinical case analysis support

Supports clinical case analysis with detailed diagnosis and treatment suggestions.

🚀 Gazal-R1-32B: Medical Reasoning Language Model

Gazal-R1-32B is a state-of-the-art language model specialized in medical reasoning and clinical decision-making, demonstrating excellent performance in the medical field.

🚀 Quick Start

Basic Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "TachyHealth/Gazal-R1-32B-GRPO-preview"

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# Medical reasoning prompt
prompt = """A 65-year-old male presents with chest pain, shortness of breath, and elevated troponin levels. 
ECG shows ST-segment elevation in leads II, III, and aVF. What is the most likely diagnosis and immediate management?"""

messages = [
    {"role": "user", "content": prompt}
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

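# Generate a response with medical reasoning.
# do_sample=True ensures temperature, top_p, and top_k are applied
# rather than ignored under greedy decoding.
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=2048,
    do_sample=True,
    temperature=0.7,
    top_p=0.8,
    top_k=20
)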

output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
response = tokenizer.decode(output_ids, skip_special_tokens=True)

print("Medical Assessment:", response)
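Before running the snippet above, it can help to estimate how much memory the weights alone require. The following is a back-of-the-envelope sketch, not a measurement: it uses the 32.8B parameter count from the model overview and standard bytes-per-parameter sizes. Whether `torch_dtype="auto"` resolves to a 16-bit dtype for this checkpoint is an assumption, and activations and the KV cache are not included.

```python
# Rough weight-memory estimate; excludes activations and the KV cache.
PARAM_COUNT = 32.8e9  # total parameters, from the model overview table

# Standard storage sizes per parameter for common dtypes.
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(param_count: float, dtype: str) -> float:
    """Return the approximate weight size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return param_count * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: ~{weight_memory_gb(PARAM_COUNT, dtype):.1f} GB")
```

With `device_map="auto"`, the weights can be spread across several GPUs, so the total rather than the per-device figure is what this estimate approximates.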

Structured Medical Reasoning Format

Gazal-R1 is trained to provide structured medical reasoning in the following format:

<think>
Step 1: Analyze presenting symptoms - chest pain, dyspnea, troponin elevation
Step 2: Interpret ECG findings - ST elevation in inferior leads
Step 3: Consider differential diagnoses - STEMI vs NSTEMI vs unstable angina
Step 4: Identify culprit vessel - likely RCA given inferior lead changes
Step 5: Assess urgency - emergent intervention required
Step 6: Plan immediate management - dual antiplatelet, anticoagulation, cath lab
</think>

## Clinical Assessment

**Primary Diagnosis**: ST-Elevation Myocardial Infarction (STEMI), inferior wall

**Reasoning**: The combination of chest pain, elevated troponin, and ST-elevation in leads II, III, and aVF is pathognomonic for inferior STEMI, likely involving the right coronary artery (RCA).

**Immediate Management**:
1. **Reperfusion therapy**: Emergency cardiac catheterization with primary PCI
2. **Antiplatelet therapy**: Aspirin 325mg + P2Y12 inhibitor (clopidogrel/ticagrelor)
3. **Anticoagulation**: Heparin or bivalirudin
4. **Supportive care**: O2 if hypoxic, nitroglycerin for pain (avoid if hypotensive)

**Follow-up**: Post-PCI monitoring, echocardiogram, cardiac rehabilitation referral
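Because the reasoning and the final assessment arrive in a single string, downstream code usually needs to separate them. The sketch below shows one way to do that with a regular expression. The helper name `split_reasoning` is hypothetical, and it assumes the `<think>` tags survive decoding and appear at most once per response.

```python
import re

# Non-greedy match so only the first <think>...</think> block is captured.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(response: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is "" when no <think> block is found."""
    match = THINK_PATTERN.search(response)
    if match is None:
        return "", response.strip()
    reasoning = match.group(1).strip()
    answer = response[match.end():].strip()
    return reasoning, answer


sample = (
    "<think>\nStep 1: Analyze presenting symptoms\n</think>\n\n"
    "## Clinical Assessment\n\n**Primary Diagnosis**: STEMI, inferior wall"
)
reasoning, answer = split_reasoning(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

Keeping the two parts separate makes it easier to show reviewers the final assessment while still logging the reasoning trace for verification.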

✨ Features

Gazal-R1 is a state-of-the-art 32-billion-parameter language model specifically designed for medical reasoning and clinical decision-making. Built upon Qwen 3 32B, Gazal-R1 demonstrates that strategic training can enable mid-sized models to outperform significantly larger counterparts in specialized medical domains.

Key features include:

Medical Expertise: Specialized training on 107,033 synthetic medical reasoning examples covering diagnostic reasoning, treatment planning, decision-making under uncertainty, and prognostic assessment
Transparent Reasoning: Structured clinical thinking with step-by-step explanations in <think></think> tags, following established clinical reasoning frameworks
State-of-the-Art Performance: Achieves 87.1% on MedQA, 81.6% on MMLU Pro (Medical), and 79.6% on PubMedQA, surpassing models up to 12× larger
Parameter Efficiency: Advanced training techniques including Weight-Decomposed Low-Rank Adaptation (DoRA) and Rank-Stabilized LoRA (rsLoRA)
Alignment Optimization: Refined through Group Relative Policy Optimization (GRPO) with sophisticated multi-component reward systems
Medical Knowledge: Comprehensive understanding across multiple medical specialties and clinical scenarios

📚 Documentation

Model Overview

Gazal-R1-32B has the following characteristics:

Property	Details
Model Type	Causal Language Model (Medical Reasoning Specialist)
Base Model	Qwen 3 32B
Training Stages	Two-stage pipeline (Supervised Fine-Tuning + Reinforcement Learning)
Number of Parameters	32.8B
Number of Parameters (Non-Embedding)	31.2B
Context Length	32,768 tokens natively, extensible to 131,072 with YaRN
Training Data	107,033 synthetic medical reasoning examples + MedReason dataset (32,682 examples)
Fine-tuning Method	DoRA + rsLoRA (Parameter-Efficient Fine-Tuning)
Alignment	Group Relative Policy Optimization (GRPO)
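The context-length row implies a YaRN scaling factor of 131,072 / 32,768 = 4. The sketch below computes that factor and builds a `rope_scaling` dictionary in the shape used by the Qwen 3 base model's published configuration. The key names are an assumption that this fine-tune follows the base model's convention; check the model's own `config.json` before relying on them.

```python
NATIVE_CONTEXT = 32_768     # native context length, from the overview table
EXTENDED_CONTEXT = 131_072  # extended context length, from the overview table


def yarn_factor(native: int, target: int) -> float:
    """Return the RoPE scaling factor needed to stretch native to target."""
    if target < native:
        raise ValueError("target context must be at least the native context")
    return target / native


# Assumed key names, mirroring the Qwen 3 base model's YaRN settings.
rope_scaling = {
    "rope_type": "yarn",
    "factor": yarn_factor(NATIVE_CONTEXT, EXTENDED_CONTEXT),
    "original_max_position_embeddings": NATIVE_CONTEXT,
}
print(rope_scaling)
```

Static scaling of this kind applies to all inputs regardless of length, so it is usually enabled only when prompts are expected to exceed the native window.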

Performance Results

Gazal-R1 achieves exceptional performance across standard medical benchmarks:

Model	Size	MMLU Pro (Medical)	MedMCQA	MedQA	PubMedQA
Gazal-R1 (Final)	32B	81.6	71.9	87.1	79.6
Gazal-R1 (SFT-only)	32B	79.3	72.3	86.9	77.6
Llama 3.1 405B Instruct	405B	70.2	75.8	81.9	74.6
Qwen 2.5 72B Instruct	72B	72.1	66.2	72.7	71.7
Med42-Llama3.1-70B	70B	66.1	72.4	80.4	77.6
Llama 3.1 70B Instruct	70B	74.5	72.5	78.4	78.5
QwQ 32B	32B	70.1	65.6	72.3	73.7
Qwen 3 32B	32B	78.4	71.6	84.4	76.7

Key Achievements:

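Highest scores among the compared models on MMLU Pro (Medical), MedQA, and PubMedQA
GRPO training improves on the SFT-only checkpoint by 2.3 points on MMLU Pro (Medical) and 2.0 points on PubMedQA; MedQA rises by 0.2 points and MedMCQA dips by 0.4 points
Outperforms models up to 12× larger (Llama 3.1 405B) on three of the four benchmarks; Llama 3.1 405B retains the highest MedMCQA score (75.8)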
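The GRPO gains and the size ratio can be checked directly against the results table. The sketch below recomputes them from the two Gazal-R1 rows. It uses the 32.8B parameter count from the model overview for the ratio, which is a choice made here rather than something the source specifies.

```python
BENCHMARKS = ["MMLU Pro (Medical)", "MedMCQA", "MedQA", "PubMedQA"]

# Scores copied from the performance table.
FINAL_SCORES = [81.6, 71.9, 87.1, 79.6]  # Gazal-R1 (Final)
SFT_SCORES = [79.3, 72.3, 86.9, 77.6]    # Gazal-R1 (SFT-only)

# Difference in percentage points between the final and SFT-only checkpoints.
deltas = {
    name: round(final - sft, 1)
    for name, final, sft in zip(BENCHMARKS, FINAL_SCORES, SFT_SCORES)
}

# Parameter-count ratio between Llama 3.1 405B and Gazal-R1 (32.8B).
size_ratio = 405 / 32.8

for name, delta in deltas.items():
    print(f"{name}: {delta:+.1f} points")
print(f"Llama 3.1 405B is about {size_ratio:.1f}x larger")
```

The differences are in percentage points, not relative percent, and the gain is concentrated in two of the four benchmarks.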

Training Methodology

Stage 1: Supervised Fine-Tuning (SFT)

Dataset: 107,033 synthetic medical reasoning examples + MedReason dataset
Techniques: DoRA + rsLoRA with rank 256
Focus: Structured clinical reasoning across diagnostic, therapeutic, and prognostic scenarios
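A rank as high as 256 is where rsLoRA matters most. Standard LoRA scales the low-rank update by `alpha / r`, which shrinks the update as the rank grows, while rsLoRA scales it by `alpha / sqrt(r)`. The sketch below compares the two at the stated rank. The `lora_alpha` value is hypothetical because the source does not report it, and DoRA's magnitude/direction decomposition is not shown. In the Hugging Face PEFT library, for instance, these options are exposed as `use_rslora` and `use_dora` on `LoraConfig`; the source does not say which tooling the authors used.

```python
import math

RANK = 256       # from the Stage 1 description
LORA_ALPHA = 32  # hypothetical value; the source does not state alpha


def lora_scale(alpha: float, rank: int) -> float:
    """Standard LoRA scaling applied to the low-rank update B @ A."""
    return alpha / rank


def rslora_scale(alpha: float, rank: int) -> float:
    """Rank-stabilized LoRA scaling, which decays with sqrt(rank) instead of rank."""
    return alpha / math.sqrt(rank)


standard = lora_scale(LORA_ALPHA, RANK)
stabilized = rslora_scale(LORA_ALPHA, RANK)
print(f"LoRA scale:   {standard}")
print(f"rsLoRA scale: {stabilized}")
print(f"Ratio:        {stabilized / standard}")
```

At rank 256 the two scalings differ by a factor of sqrt(256) = 16 regardless of the alpha chosen.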

Stage 2: Group Relative Policy Optimization (GRPO)

Algorithm: Value-function-free reinforcement learning
Dataset: UltraMedical subset (32K medical MCQs)
Rewards: Multi-component system (accuracy, format, length control, repetition penalty)
Improvements: Enhanced reasoning quality and format adherence
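To make the Stage 2 description concrete, here is a minimal sketch of a four-component reward and of the group-relative advantage that lets GRPO work without a value function. It is not the authors' implementation: the weights, the word limit, the `Answer: X` convention, and all function names are hypothetical, since the source names the reward components but not how they are computed.

```python
import re
import statistics

# All constants below are hypothetical; the source does not publish them.
MAX_WORDS = 1024
W_ACCURACY, W_FORMAT, W_LENGTH, W_REPETITION = 1.0, 0.5, 0.5, 0.5
FORMAT_RE = re.compile(r"^<think>.+?</think>\s*\S", re.DOTALL)
ANSWER_RE = re.compile(r"Answer:\s*([A-E])\b")


def accuracy_reward(response: str, gold: str) -> float:
    """1.0 if the last 'Answer: X' after the reasoning block matches gold."""
    letters = ANSWER_RE.findall(response.split("</think>")[-1])
    return 1.0 if letters and letters[-1] == gold else 0.0


def format_reward(response: str) -> float:
    """1.0 if a non-empty <think> block is followed by answer text."""
    return 1.0 if FORMAT_RE.match(response.strip()) else 0.0


def length_penalty(response: str) -> float:
    """0.0 up to MAX_WORDS, then grows linearly and is capped at 1.0."""
    excess = len(response.split()) - MAX_WORDS
    return min(1.0, max(0.0, excess / MAX_WORDS))


def repetition_penalty(response: str, n: int = 3) -> float:
    """Fraction of word n-grams that are duplicates (0.0 means none)."""
    words = response.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    return 1.0 - len(set(ngrams)) / len(ngrams) if ngrams else 0.0


def total_reward(response: str, gold: str) -> float:
    """Weighted sum: accuracy and format add, length and repetition subtract."""
    return (W_ACCURACY * accuracy_reward(response, gold)
            + W_FORMAT * format_reward(response)
            - W_LENGTH * length_penalty(response)
            - W_REPETITION * repetition_penalty(response))


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize each reward against its own sampling group (no value network)."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# Three sampled completions for one multiple-choice question whose answer is B.
group = [
    "<think>Inferior ST elevation suggests RCA occlusion.</think>\nAnswer: B",
    "<think>Could be pericarditis.</think>\nAnswer: C",
    "Answer: B",
]
rewards = [total_reward(r, "B") for r in group]
advantages = group_relative_advantages(rewards)
print(rewards)
print(advantages)
```

Completions that score above their group's mean receive a positive advantage and are reinforced, while those below the mean are discouraged. This is what replaces the learned value baseline used in actor-critic methods.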

Model Capabilities

Clinical Reasoning Types

Diagnostic Reasoning: Systematic symptom analysis → differential diagnosis
Treatment Planning: Evidence-based therapy selection with patient-specific factors
Decision-Making Under Uncertainty: Risk assessment and clinical judgment
Prognostic Assessment: Outcome prediction based on clinical evidence

Medical Specialties Covered

Internal Medicine
Emergency Medicine
Cardiology
Pulmonology
Infectious Disease
Pharmacology
Pathophysiology
Clinical Laboratory Medicine

Limitations and Important Disclaimers

⚠️ Important Note

NOT A MEDICAL DEVICE: Gazal-R1 is a research model and is NOT intended for direct clinical use, diagnosis, or treatment planning

REQUIRES PROFESSIONAL VERIFICATION: All outputs must be independently verified by qualified medical professionals

NO REAL-TIME UPDATES: Knowledge is static and does not reflect the latest medical research or guidelines

💡 Usage Tip

Knowledge Cutoff: Training data reflects medical knowledge up to the training date

Hallucination Risk: May generate plausible-sounding but factually incorrect information

Evaluation Scope: Primarily evaluated on multiple-choice questions; real-world clinical scenarios may differ

Regional Bias: Training data may contain geographical or demographic biases

⚠️ Ethical Considerations

Professional Responsibility: Final medical decisions must always rest with qualified healthcare providers

Accountability: Users assume responsibility for verifying and appropriately applying model outputs

Patient Safety: Never use for emergency medical situations or time-critical decisions

Use Cases

Research and Education

Medical education and training
Clinical reasoning research
Medical knowledge assessment
Academic medical writing assistance

Professional Support (With Supervision)

Literature review assistance
Clinical case analysis support
Medical documentation aid
Differential diagnosis exploration

NOT Suitable For

Direct patient care
Emergency medical decisions
Replacing clinical judgment
Unsupervised medical advice

Model Access

Model Weights: Available on Hugging Face Hub
Datasets: Training datasets available at TachyHealth/structured_medical and TachyHealth/medical_grpo

📄 License

This model is released under the Apache 2.0 License. Please review the license terms before use.

Contact

For questions about Gazal-R1, please contact:

Research Team: TachyHealth
Website: https://tachyhealth.com/
Gazal Platform: Gazal.ai
