# QA Evaluation
## RM-R1 DeepSeek-Distilled Qwen 14B
- License: MIT
- Author: gaotang · Downloads: 95 · Likes: 1
- Task: Large Language Model · Tags: Transformers, English

RM-R1 is a training framework for reasoning reward models (ReasRM), which evaluate candidate answers by generating scoring criteria or reasoning traces, providing explainable judgments.
## RM-R1 Qwen2.5 Instruct 14B
- License: MIT
- Author: gaotang · Downloads: 21 · Likes: 1
- Task: Large Language Model · Tags: Transformers, English

RM-R1 is a training framework for reasoning reward models (ReasRM), which evaluate candidate answers by generating scoring criteria or reasoning traces, providing explainable assessments.
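
Both RM-R1 models above are generative judges: rather than returning a scalar score, they write out scoring criteria and a reasoning trace before naming the preferred answer. The sketch below shows one way to run a pairwise judgment with `transformers`; the repo id, the judging prompt, and the verdict format are assumptions here, so consult the RM-R1 model cards for the exact template used during training.

```python
# Minimal sketch: pairwise judging with a generative reasoning reward model.
# Assumptions (not taken from the listing): the repo id
# "gaotang/RM-R1-Qwen2.5-Instruct-14B", a standard chat template, and a
# free-form judging prompt. Check the model card for the canonical prompt
# and verdict format.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "gaotang/RM-R1-Qwen2.5-Instruct-14B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

question = "What is the capital of Australia?"
answer_a = "The capital of Australia is Sydney."
answer_b = "The capital of Australia is Canberra."

messages = [
    {
        "role": "user",
        "content": (
            "You are a judge. First write a short rubric, then reason step by step, "
            "and finally state which answer is better as 'Verdict: A' or 'Verdict: B'.\n\n"
            f"Question: {question}\n\nAnswer A: {answer_a}\n\nAnswer B: {answer_b}"
        ),
    }
]

# Build the prompt, generate the judgment, and strip the prompt tokens
# from the decoded output.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=512, do_sample=False)
judgment = tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
print(judgment)  # scoring criteria + reasoning trace + final verdict
```

The generated text is the model's explanation; a small parser over the final verdict line turns it into a preference label for evaluation or RLHF data filtering.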
## Reward Model DeBERTa V3 Large V2
- License: MIT
- Author: OpenAssistant · Downloads: 11.15k · Likes: 219
- Task: Large Language Model · Tags: Transformers, English

This reward model is trained to predict which generated answer a human would prefer for a given question. It is suitable for QA evaluation, reward scoring in RLHF, and detecting toxic answers.
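
Unlike the generative RM-R1 judges, this DeBERTa model is a classifier-style reward model: it maps a (question, answer) pair to a single scalar, where a higher score means the answer is more likely to be preferred by humans, and the same scalar can be used directly as the reward signal in RLHF. A minimal scoring sketch follows; the repo id `OpenAssistant/reward-model-deberta-v3-large-v2` is inferred from the listing, so verify it against the model card.

```python
# Minimal sketch: scoring (question, answer) pairs with a classifier-style
# reward model. The repo id below is assumed from the listing above.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "OpenAssistant/reward-model-deberta-v3-large-v2"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

def reward(question: str, answer: str) -> float:
    # The model emits one logit per pair: higher means the answer is
    # more likely to be preferred by humans for this question.
    inputs = tokenizer(question, answer, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return model(**inputs).logits[0].item()

question = "Explain nuclear fusion like I am five."
good_answer = "Fusion is when two tiny things squeeze together and release lots of energy, like the sun does."
bad_answer = "I don't know."

print(reward(question, good_answer))  # expected: higher score
print(reward(question, bad_answer))   # expected: lower score
```

Comparing the two scores gives a pairwise preference; thresholding a single score can flag low-quality or toxic answers, as the card's description suggests.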