VisualPRM-8B-v1.1

Developed by OpenGVLab
VisualPRM-8B-v1.1 is a multimodal process reward model (PRM) with 8 billion parameters. It improves the reasoning ability of multimodal large language models by scoring their candidate responses under a Best-of-N evaluation strategy.
Downloads 249
Release Time: 4/13/2025

Model Overview

This model aims to improve the reasoning ability of existing multimodal large language models (MLLMs). It uses a process reward mechanism to score the reasoning steps in a model's output, so that the strongest output can be selected.

Model Features

Multimodal process reward
Evaluates multimodal reasoning step by step through a process reward mechanism
Best-of-N evaluation strategy
Uses the Best-of-N (BoN) strategy to select the best response from multiple candidate responses
Large-scale training data
Trained on the VisualPRM400K dataset, which contains about 400,000 samples
Wide applicability
Can improve the performance of multimodal large language models of different scales and architectures
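
The Best-of-N strategy described above can be sketched in a few lines of Python. This is a minimal illustration, not the model's actual interface: the names `response_score`, `select_best_of_n`, and `toy_scorer` are hypothetical, the per-step scores are assumed to be values in [0, 1], and averaging the step scores into one response score is an assumption made here for simplicity. In real use, the scorer would be a call into the process reward model rather than the toy stand-in shown.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

# A scorer maps the reasoning steps of one response to one score per step.
# In practice this would wrap the process reward model (hypothetical here).
StepScorer = Callable[[Sequence[str]], Sequence[float]]


def response_score(step_scores: Sequence[float]) -> float:
    """Aggregate per-step scores into one response score (assumed: the mean)."""
    if not step_scores:
        return 0.0
    return mean(step_scores)


def select_best_of_n(
    candidates: Sequence[Sequence[str]], score_steps: StepScorer
) -> Tuple[int, List[float]]:
    """Score every candidate response and return (best index, all scores)."""
    if not candidates:
        raise ValueError("at least one candidate response is required")
    scores = [response_score(score_steps(steps)) for steps in candidates]
    # max() returns the first index with the highest score on ties.
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores


def toy_scorer(steps: Sequence[str]) -> List[float]:
    """Stand-in scorer: a step containing the word 'wrong' gets 0.0, else 1.0."""
    return [0.0 if "wrong" in step else 1.0 for step in steps]


if __name__ == "__main__":
    candidates = [
        ["angle A = 30", "wrong: angle B = 200"],
        ["angle A = 30", "angle B = 60"],
        ["wrong: angle A = 400"],
    ]
    best, scores = select_best_of_n(candidates, toy_scorer)
    print(best, scores)
```

Other aggregations, such as taking the minimum step score, fit the same structure by changing only `response_score`.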

Model Capabilities

Multimodal reasoning evaluation
Process reward scoring
Best response selection
Geometric problem solving
Visual-language joint understanding

Use Cases

Education
Geometric problem-solving evaluation
Evaluates the step-by-step solutions that a model produces for geometry problems, so that the best solution can be selected
Achieved a performance improvement of 5.9 points on InternVL2.5-78B
Research
Multimodal model optimization
Serves as a reward model to improve the outputs of other multimodal large language models
Improved the reasoning performance of three types of MLLMs at four different model scales