# Zero-shot Object Detection

Llmdet Swin Large Hf

LLMDet is an open-vocabulary object detector trained under the supervision of large language models. The paper was selected as a highlight at CVPR 2025.

Object Detection

Llmdet Swin Base Hf

LLMDet is an open-vocabulary object detector supervised by large language models, capable of zero-shot object detection.

Object Detection

VLM R1 Qwen2.5VL 3B OVD 0321

A zero-shot object detection model based on Qwen2.5-VL-3B-Instruct and enhanced with VLM-R1 reinforcement learning. It supports open-vocabulary detection (OVD) tasks.

Safetensors English

Inference Endpoint For Omdet Turbo Swin Tiny Hf

A zero-shot object detection model based on the Swin-Tiny architecture, supporting French and English, suitable for various object detection tasks.

Object Detection

Transformers, Supports Multiple Languages

YOLOE is a real-time visual omni-model that supports various vision tasks including zero-shot object detection.

Object Detection

YOLOE is a real-time visual omni-model that combines object detection and visual understanding capabilities, suitable for various visual tasks.

Object Detection

YOLOE is a zero-shot object detection model that detects objects in visual scenes in real time.

Object Detection

Qwen2.5vl 3B VLM R1 REC 500steps

A vision-language model based on Qwen2.5-VL-3B-Instruct and enhanced with VLM-R1 reinforcement learning. It focuses on referring expression comprehension (REC) tasks.

Safetensors English

Grounding Dino Tiny ONNX

A lightweight zero-shot object detection model in ONNX format, compatible with Transformers.js, suitable for browser-side deployment.

Object Detection

Omdet Turbo Swin Tiny Hf

OmDet-Turbo is a real-time, Transformer-based open-vocabulary detection model with an efficient fusion head, suitable for zero-shot object detection tasks.

Object Detection

Owlv2 Base Patch16

OWLv2 is a vision-language pre-trained model focused on object detection and localization tasks.

Object Detection

Owlv2 Base Patch16 Ensemble

OWLv2 is a zero-shot text-conditioned object detection model that can locate objects in images through text queries.

Object Detection

Owlv2 Large Patch14 Ensemble

OWLv2 is a zero-shot text-conditioned object detection model that can locate objects in images through text queries.

Owlv2 Large Patch14

OWLv2 is a zero-shot text-conditioned object detection model that can detect objects in images through text queries without requiring category-specific training data.

Grounding Dino Base

Grounding DINO is an open-set object detection model that achieves zero-shot object detection capabilities by combining the DINO detector with a text encoder.

Object Detection

Owlvit Large Patch14

OWL-ViT is a zero-shot text-conditioned object detection model that can retrieve objects in images through text queries.

Owlvit Base Patch16

OWL-ViT is a zero-shot text-conditioned object detection model that can detect objects in images via text queries.
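
Several entries above describe text-conditioned detection, where objects are located by comparing image regions against free-text queries. The following toy sketch illustrates that matching idea only. The embedding vectors are hand-made, the helper names (`cosine`, `match_queries`) are hypothetical, and plain cosine similarity with a fixed threshold stands in for the learned scoring that real models such as OWL-ViT or OWLv2 use.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_queries(region_embeddings, query_embeddings, threshold=0.5):
    """Assign each region its best-matching text query.

    Returns a list of (region_index, label, score) tuples, keeping only
    regions whose best score reaches the threshold.
    """
    detections = []
    for i, region in enumerate(region_embeddings):
        scores = {label: cosine(region, emb) for label, emb in query_embeddings.items()}
        label = max(scores, key=scores.get)
        if scores[label] >= threshold:
            detections.append((i, label, scores[label]))
    return detections


# Hand-made toy vectors (assumption): a real model would produce these
# from a text encoder and from per-region image features.
queries = {"cat": [1.0, 0.0, 0.0], "remote control": [0.0, 1.0, 0.0]}
regions = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.0, 1.0]]

detections = match_queries(regions, queries, threshold=0.5)
for idx, label, score in detections:
    print(idx, label, round(score, 3))
```

Because the labels are ordinary strings supplied at query time, changing the set of detectable categories means changing the `queries` dictionary rather than retraining, which is the property the "zero-shot" and "open-vocabulary" descriptions refer to. The third region matches neither query and is dropped by the threshold.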


© 2025 AIbase