Gazal-R1-32B: Medical Reasoning Language Model
Gazal-R1-32B is a 32-billion-parameter language model specialized in medical reasoning and clinical decision-making, with state-of-the-art results on several medical benchmarks.
Quick Start
Basic Usage
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "TachyHealth/Gazal-R1-32B-GRPO-preview"

# Load the tokenizer and model; dtype and device placement are chosen automatically.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

prompt = """A 65-year-old male presents with chest pain, shortness of breath, and elevated troponin levels.
ECG shows ST-segment elevation in leads II, III, and aVF. What is the most likely diagnosis and immediate management?"""
messages = [
    {"role": "user", "content": prompt}
]

# Format the conversation with the model's chat template.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Sample a response; do_sample=True makes the sampling settings below take effect.
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=2048,
    do_sample=True,
    temperature=0.7,
    top_p=0.8,
    top_k=20
)

# Keep only the newly generated tokens, dropping the prompt.
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
response = tokenizer.decode(output_ids, skip_special_tokens=True)
print("Medical Assessment:", response)
Structured Medical Reasoning Format
Gazal-R1 is trained to provide structured medical reasoning in the following format:
<think>
Step 1: Analyze presenting symptoms - chest pain, dyspnea, troponin elevation
Step 2: Interpret ECG findings - ST elevation in inferior leads
Step 3: Consider differential diagnoses - STEMI vs NSTEMI vs unstable angina
Step 4: Identify culprit vessel - likely RCA given inferior lead changes
Step 5: Assess urgency - emergent intervention required
Step 6: Plan immediate management - dual antiplatelet, anticoagulation, cath lab
</think>
## Clinical Assessment
**Primary Diagnosis**: ST-Elevation Myocardial Infarction (STEMI), inferior wall
**Reasoning**: The combination of chest pain, elevated troponin, and ST-elevation in leads II, III, and aVF is pathognomonic for inferior STEMI, likely involving the right coronary artery (RCA).
**Immediate Management**:
1. **Reperfusion therapy**: Emergency cardiac catheterization with primary PCI
2. **Antiplatelet therapy**: Aspirin 325mg + P2Y12 inhibitor (clopidogrel/ticagrelor)
3. **Anticoagulation**: Heparin or bivalirudin
4. **Supportive care**: O2 if hypoxic, nitroglycerin for pain (avoid if hypotensive)
**Follow-up**: Post-PCI monitoring, echocardiogram, cardiac rehabilitation referral
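When downstream code needs the reasoning trace and the final assessment separately, the tagged format above can be split with a regular expression. The following is a minimal sketch: the helper name `split_reasoning` is hypothetical, and it assumes the decoded response contains at most one `<think>` block.

```python
import re

# Hypothetical helper: separate the <think> trace from the final answer.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(response: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no <think> block is found."""
    match = THINK_PATTERN.search(response)
    if match is None:
        return "", response.strip()
    reasoning = match.group(1).strip()
    answer = response[match.end():].strip()
    return reasoning, answer

sample = (
    "<think>\nStep 1: Analyze presenting symptoms\n</think>\n"
    "## Clinical Assessment\n**Primary Diagnosis**: STEMI"
)
reasoning, answer = split_reasoning(sample)
print(reasoning)
print(answer)
```

Returning an empty reasoning string when the tags are missing lets callers treat a missing trace as a format failure without raising an exception.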
⨠Features
Gazal-R1 is a state-of-the-art 32 - billion - parameter language model specifically designed for medical reasoning and clinical decision - making. Built upon Qwen 3 32B, Gazal-R1 demonstrates that strategic training can enable mid - sized models to outperform significantly larger counterparts in specialized medical domains.
Key features include:
- Medical Expertise: Specialized training on 107,033 synthetic medical reasoning examples covering diagnostic reasoning, treatment planning, decision-making under uncertainty, and prognostic assessment
- Transparent Reasoning: Structured clinical thinking with step-by-step explanations in <think></think> tags, following established clinical reasoning frameworks
- State-of-the-Art Performance: Achieves 87.1% on MedQA, 81.6% on MMLU Pro (Medical), and 79.6% on PubMedQA, surpassing models up to 12× larger
- Parameter Efficiency: Advanced training techniques including Weight-Decomposed Low-Rank Adaptation (DoRA) and Rank-Stabilized LoRA (rsLoRA)
- Alignment Optimization: Refined through Group Relative Policy Optimization (GRPO) with sophisticated multi-component reward systems
- Medical Knowledge: Comprehensive understanding across multiple medical specialties and clinical scenarios
Documentation
Model Overview
Gazal-R1-32B has the following characteristics:
| Property | Details |
| --- | --- |
| Model Type | Causal Language Model (Medical Reasoning Specialist) |
| Base Model | Qwen 3 32B |
| Training Stages | Two-stage pipeline (Supervised Fine-Tuning + Reinforcement Learning) |
| Number of Parameters | 32.8B |
| Number of Parameters (Non-Embedding) | 31.2B |
| Context Length | 32,768 tokens natively, extensible to 131,072 with YaRN |
| Training Data | 107,033 synthetic medical reasoning examples + MedReason dataset (32,682 examples) |
| Fine-tuning Method | DoRA + rsLoRA (Parameter-Efficient Fine-Tuning) |
| Alignment | Group Relative Policy Optimization (GRPO) |
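The extended context length listed above corresponds to a YaRN scaling factor of 131,072 / 32,768 = 4. The sketch below builds a `rope_scaling` entry in the form that the Qwen3 base model documents for its `config.json`. Whether Gazal-R1 accepts the same setting unchanged is an assumption, so treat this as a starting point and not as a verified configuration.

```python
import json

NATIVE_CONTEXT = 32_768      # native context length from the table above
EXTENDED_CONTEXT = 131_072   # target context length with YaRN

# YaRN scaling factor: ratio of the target length to the native length.
factor = EXTENDED_CONTEXT / NATIVE_CONTEXT

# Assumed config fragment, following the convention of the Qwen3 base model.
rope_scaling = {
    "rope_type": "yarn",
    "factor": factor,
    "original_max_position_embeddings": NATIVE_CONTEXT,
}

print(json.dumps({"rope_scaling": rope_scaling}, indent=2))
```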
Performance Results
Gazal-R1 is compared below with same-size and larger models on four standard medical benchmarks (scores in %):
| Model | Size | MMLU Pro (Medical) | MedMCQA | MedQA | PubMedQA |
| --- | --- | --- | --- | --- | --- |
| Gazal-R1 (Final) | 32B | 81.6 | 71.9 | 87.1 | 79.6 |
| Gazal-R1 (SFT-only) | 32B | 79.3 | 72.3 | 86.9 | 77.6 |
| Llama 3.1 405B Instruct | 405B | 70.2 | 75.8 | 81.9 | 74.6 |
| Qwen 2.5 72B Instruct | 72B | 72.1 | 66.2 | 72.7 | 71.7 |
| Med42-Llama3.1-70B | 70B | 66.1 | 72.4 | 80.4 | 77.6 |
| Llama 3.1 70B Instruct | 70B | 74.5 | 72.5 | 78.4 | 78.5 |
| QwQ 32B | 32B | 70.1 | 65.6 | 72.3 | 73.7 |
| Qwen 3 32B | 32B | 78.4 | 71.6 | 84.4 | 76.7 |
Key Achievements:
- Highest scores among the compared models on MMLU Pro (Medical), MedQA, and PubMedQA
- Gains from GRPO training over the SFT-only checkpoint: +2.3 points on MMLU Pro (Medical) and +2.0 points on PubMedQA
- Outperforms Llama 3.1 405B, a model roughly 12× larger, on MMLU Pro (Medical), MedQA, and PubMedQA
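The GRPO gains and the size comparison can be recomputed directly from the benchmark figures reported in this section. This short sketch uses only those figures; the variable names are illustrative.

```python
# Benchmark scores reported for the two Gazal-R1 checkpoints.
scores = {
    "Gazal-R1 (Final)": {
        "MMLU Pro (Medical)": 81.6, "MedMCQA": 71.9, "MedQA": 87.1, "PubMedQA": 79.6,
    },
    "Gazal-R1 (SFT-only)": {
        "MMLU Pro (Medical)": 79.3, "MedMCQA": 72.3, "MedQA": 86.9, "PubMedQA": 77.6,
    },
}

final = scores["Gazal-R1 (Final)"]
sft = scores["Gazal-R1 (SFT-only)"]

# Change from the SFT-only checkpoint to the final GRPO checkpoint, in points.
grpo_delta = {bench: round(final[bench] - sft[bench], 1) for bench in final}
print(grpo_delta)

# Size ratio behind the "12x larger" comparison, using the nominal model sizes.
size_ratio = 405 / 32
print(round(size_ratio, 1))
```

The same calculation shows that GRPO did not help uniformly: MedQA moves by only 0.2 points and MedMCQA drops by 0.4 points relative to the SFT-only checkpoint.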
Training Methodology
Stage 1: Supervised Fine-Tuning (SFT)
- Dataset: 107,033 synthetic medical reasoning examples + MedReason dataset
- Techniques: DoRA + rsLoRA with rank 256
- Focus: Structured clinical reasoning across diagnostic, therapeutic, and prognostic scenarios
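Rank-stabilized LoRA matters at a rank as high as 256 because it changes how the adapter update is scaled: standard LoRA multiplies the update by alpha / r, while rsLoRA uses alpha / sqrt(r), so the update does not shrink as quickly when the rank grows. The sketch below compares the two factors. The `ALPHA` value is hypothetical, since the alpha used in training is not stated here.

```python
import math

def lora_scale(alpha: float, rank: int) -> float:
    """Standard LoRA scaling factor: alpha / r."""
    return alpha / rank

def rslora_scale(alpha: float, rank: int) -> float:
    """Rank-stabilized LoRA scaling factor: alpha / sqrt(r)."""
    return alpha / math.sqrt(rank)

RANK = 256    # rank stated for Stage 1
ALPHA = 16.0  # hypothetical value; the training alpha is not given in this document

print(lora_scale(ALPHA, RANK))    # alpha / 256
print(rslora_scale(ALPHA, RANK))  # alpha / 16
```

In the Hugging Face PEFT library, the two techniques are exposed as the `use_dora` and `use_rslora` flags on `LoraConfig`. Whether the authors used PEFT for training is not stated here.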
Stage 2: Group Relative Policy Optimization (GRPO)
- Algorithm: Value-function-free reinforcement learning
- Dataset: UltraMedical subset (32K medical MCQs)
- Rewards: Multi-component system (accuracy, format, length control, repetition penalty)
- Improvements: Enhanced reasoning quality and format adherence
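"Value-function-free" means that GRPO replaces a learned critic with a baseline computed from a group of completions sampled for the same prompt. The sketch below illustrates that idea under stated assumptions: the component weights and example scores are hypothetical (only the component names come from the list above), and the population standard deviation is one possible normalization choice, not necessarily the one used in training.

```python
from statistics import mean, pstdev

# Hypothetical weights; the component names are from the source, the values are not.
WEIGHTS = {"accuracy": 1.0, "format": 0.5, "length": 0.25, "repetition": 0.25}

def total_reward(components: dict[str, float]) -> float:
    """Weighted sum of per-component scores for one sampled completion."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)

def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize each reward against its own group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four completions sampled for the same question (illustrative scores).
group = [
    {"accuracy": 1.0, "format": 1.0, "length": 1.0, "repetition": 0.0},
    {"accuracy": 1.0, "format": 0.0, "length": 0.5, "repetition": 0.0},
    {"accuracy": 0.0, "format": 1.0, "length": 1.0, "repetition": -1.0},
    {"accuracy": 0.0, "format": 0.0, "length": 0.0, "repetition": -1.0},
]
rewards = [total_reward(c) for c in group]
advantages = group_relative_advantages(rewards)
print(rewards)
print(advantages)
```

Completions that score above the group mean receive positive advantages and are reinforced; those below the mean are discouraged. When every completion in a group earns the same reward, all advantages are zero and the group contributes no learning signal.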
Model Capabilities
Clinical Reasoning Types
- Diagnostic Reasoning: Systematic symptom analysis → differential diagnosis
- Treatment Planning: Evidence-based therapy selection with patient-specific factors
- Decision-Making Under Uncertainty: Risk assessment and clinical judgment
- Prognostic Assessment: Outcome prediction based on clinical evidence
Medical Specialties Covered
- Internal Medicine
- Emergency Medicine
- Cardiology
- Pulmonology
- Infectious Disease
- Pharmacology
- Pathophysiology
- Clinical Laboratory Medicine
Limitations and Important Disclaimers
Important Note
- NOT A MEDICAL DEVICE: Gazal-R1 is a research model and is NOT intended for direct clinical use, diagnosis, or treatment planning
- REQUIRES PROFESSIONAL VERIFICATION: All outputs must be independently verified by qualified medical professionals
- NO REAL-TIME UPDATES: Knowledge is static and does not reflect the latest medical research or guidelines
Usage Tip
- Knowledge Cutoff: Training data reflects medical knowledge up to the training date
- Hallucination Risk: May generate plausible-sounding but factually incorrect information
- Evaluation Scope: Primarily evaluated on multiple-choice questions; real-world clinical scenarios may differ
- Regional Bias: Training data may contain geographical or demographic biases
Ethical Considerations
- Professional Responsibility: Final medical decisions must always rest with qualified healthcare providers
- Accountability: Users assume responsibility for verifying and appropriately applying model outputs
- Patient Safety: Never use for emergency medical situations or time-critical decisions
Use Cases
Research and Education
- Medical education and training
- Clinical reasoning research
- Medical knowledge assessment
- Academic medical writing assistance
Professional Support (With Supervision)
- Literature review assistance
- Clinical case analysis support
- Medical documentation aid
- Differential diagnosis exploration
NOT Suitable For
- Direct patient care
- Emergency medical decisions
- Replacing clinical judgment
- Unsupervised medical advice
Model Access
License
This model is released under the Apache 2.0 License. Please review the license terms before use.
Contact
For questions about Gazal-R1, please contact: