# Interpretability Scoring
RM R1 DeepSeek Distilled Qwen 32B
MIT
RM-R1 is a training framework for reasoning reward models (ReasRMs). A ReasRM evaluates candidate answers by generating scoring criteria or reasoning trajectories, which makes its evaluations interpretable.
Large Language Model
Transformers English

gaotang
506
0
RM R1 Qwen2.5 Instruct 7B
MIT
RM-R1 is a training framework for reasoning reward models (ReasRMs). A ReasRM evaluates candidate answers by generating scoring criteria or reasoning traces, significantly improving accuracy and interpretability over traditional reward models.
Large Language Model
Transformers English

gaotang
23
2