VLM R1 Qwen2.5VL 3B OVD 0321

Developed by omlab
A zero-shot object detection model based on Qwen2.5-VL-3B-Instruct, enhanced with VLM-R1 reinforcement learning, supporting open vocabulary detection tasks.
Downloads 892
Release Time: 3/21/2025

Model Overview

This model combines a vision-language model with reinforcement learning and is designed specifically for Open Vocabulary Detection (OVD): it can recognize objects from categories that are not explicitly labeled in the training data.

Model Features

Reinforcement Learning Enhancement
Improves detection performance through training with VLM-R1 reinforcement learning
Open Vocabulary Detection
Recognizes objects from categories not included in the training data
Multimodal Understanding
Combines visual and linguistic information for object detection
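
In open vocabulary detection, the target categories are supplied as text at inference time rather than fixed during training. The sketch below illustrates that idea by turning an arbitrary category list into a detection prompt. The function name build_ovd_prompt and the prompt wording are hypothetical and are not taken from the model's documentation; the exact prompt the model expects may differ.

```python
def build_ovd_prompt(categories):
    """Build a text prompt asking for boxes of the given categories.

    The wording is a hypothetical template, not the model's documented prompt.
    """
    # Drop empty entries and case-insensitive duplicates, preserving order.
    seen = set()
    cleaned = []
    for name in categories:
        name = name.strip()
        if name and name.lower() not in seen:
            seen.add(name.lower())
            cleaned.append(name)
    if not cleaned:
        raise ValueError("at least one category name is required")
    return (
        "Detect the following objects in the image: "
        + ", ".join(cleaned)
        + ". Return a JSON list of objects with 'bbox_2d' and 'label' fields, "
        "or None if no target is present."
    )


# Categories are free-form text, so new ones can be added without retraining.
prompt = build_ovd_prompt(["forklift", "traffic cone", "Forklift", " "])
print(prompt)
```

Because the categories are plain strings, detecting a new object type only requires changing the list passed in.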

Model Capabilities

Zero-shot Object Detection
Open Vocabulary Recognition
Multimodal Understanding
Vision-Language Reasoning
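
Since the detector is a generative vision-language model, its detections arrive as text that has to be parsed into boxes. The sketch below shows one way to do that. It assumes, hypothetically, that the response contains a JSON list of objects with "bbox_2d" and "label" keys, optionally wrapped in <answer> tags; the helper name parse_detections and this output format are assumptions, so check them against real model output before relying on them.

```python
import json
import re


def parse_detections(response_text):
    """Extract (label, box) pairs from a model response.

    Assumes (hypothetically) a JSON list such as
    [{"bbox_2d": [x1, y1, x2, y2], "label": "cat"}], optionally wrapped
    in <answer>...</answer> tags.
    """
    # Prefer the content of <answer> tags when present.
    match = re.search(r"<answer>(.*?)</answer>", response_text, re.DOTALL)
    body = match.group(1) if match else response_text
    # Locate the outermost JSON list in the remaining text.
    start, end = body.find("["), body.rfind("]")
    if start == -1 or end <= start:
        return []  # e.g. the response was "None"
    try:
        items = json.loads(body[start:end + 1])
    except json.JSONDecodeError:
        return []
    detections = []
    for item in items:
        box = item.get("bbox_2d") if isinstance(item, dict) else None
        # Keep only entries with a four-element box.
        if isinstance(box, list) and len(box) == 4:
            detections.append((item.get("label", ""), tuple(box)))
    return detections


sample = '<answer>[{"bbox_2d": [10, 20, 110, 220], "label": "forklift"}]</answer>'
detections = parse_detections(sample)
print(detections)
```

Returning an empty list for unparseable or "None" responses keeps downstream code simple, at the cost of not distinguishing "nothing detected" from "malformed output".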

Use Cases

Computer Vision
Smart Surveillance
Detects objects from previously unseen categories in surveillance footage
Autonomous Driving
Identifies types of road obstacles that are not covered by the training data
Retail Analytics
Product Recognition
Identifies the categories and attributes of newly launched products