3

3B Curr ReFT

Developed by ZTE-AIM
A multimodal large language model fine-tuned from Qwen2.5-VL using the innovative Curr-ReFT method, significantly enhancing visual-language understanding and reasoning capabilities.
Downloads 37
Release Time : 3/25/2025

Model Overview

Curr-ReFT is a multimodal large language model fine-tuned from Qwen2.5-VL through curriculum reinforcement learning and rejection sample self-optimization, suitable for complex tasks such as visual reasoning, fine-grained image understanding, and multimodal problem-solving.

Model Features

Curriculum Reinforcement Learning
The training process is divided into two stages, first gradually increasing task complexity through curriculum reinforcement learning.
Rejection Sample Self-Optimization
Self-optimization based on rejection samples to maintain foundational capabilities.
Multimodal Reasoning Capability
Possesses strong multimodal reasoning abilities to tackle cross-domain challenges.

Model Capabilities

Visual-Language Understanding
Visual Reasoning
Fine-Grained Image Understanding
Multimodal Problem-Solving
Image-Text Generation

Use Cases

Visual Reasoning
Image Digit Recognition
Recognize digits in images and answer related questions.
High-accuracy digit recognition and reasoning capabilities.
Multimodal Problem-Solving
Complex Question Answering
Answer complex questions by combining image and text information.
Provide accurate and context-aware responses.
Featured Recommended AI Models
AIbase
Empowering the Future, Your AI Solution Knowledge Base
Š 2025AIbase