R

R1 VL 2B

Developed by jingyiZ00
R1-VL-2B is a vision-language reasoning model trained through Stepwise Group Relative Policy Optimization (StepGRPO), optimized based on Qwen2-VL-2B-Instruct.
Downloads 272
Release Time : 3/18/2025

Model Overview

R1-VL-2B is a vision-language model focused on image-text-to-text tasks, capable of understanding and generating text content related to images.

Model Features

Stepwise Group Relative Policy Optimization (StepGRPO)
Adopts the StepGRPO training method to optimize the model's performance in vision-language tasks.
Based on Qwen2-VL-2B-Instruct
Built upon Qwen2-VL-2B-Instruct, inheriting its robust vision-language processing capabilities.

Model Capabilities

Image Understanding
Text Generation
Vision-Language Reasoning

Use Cases

Visual Question Answering
Image Caption Generation
Generates detailed textual descriptions based on input images.
Visual Question Answering
Answers questions related to image content.
Featured Recommended AI Models
AIbase
Empowering the Future, Your AI Solution Knowledge Base
Š 2025AIbase