D-FINE-xlarge-obj2coco Open-source Object Detection Model - Highly Practical for High-precision Object Localization

Dfine Xlarge Obj2coco

Developed by ustc-community

D-FINE is an object detection model that achieves high localization accuracy by redefining the bounding box regression task in DETR models.

Object Detection

Transformers

English

Open Source License: Apache-2.0

#Fine-grained object detection #Bounding box regression optimization #Autonomous driving adaptation

Downloads 4,191

Release Time: 3/28/2025

Model Overview

D-FINE is a real-time object detector that improves localization accuracy through two key components: Fine-grained Distribution Refinement (FDR) and Global Optimal Localization Self-Distillation (GO-LSD).

Model Features

Fine-grained Distribution Refinement (FDR)

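Redefines bounding box regression as the iterative refinement of probability distributions rather than the direct prediction of fixed coordinates, which improves localization accuracy.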

Global Optimal Localization Self-Distillation (GO-LSD)

Transfers localization knowledge from the refined distributions to shallower layers through self-distillation.

Real-time object detection

Suitable for scenarios that require real-time processing, such as autonomous driving and monitoring systems.

Model Capabilities

Object detection

Real-time processing

High-precision positioning

Use Cases

Autonomous driving

Vehicle and pedestrian detection

Detect vehicles and pedestrians in real-time in autonomous driving systems.

High-precision positioning capabilities enhance the safety of autonomous driving.

Monitoring systems

Abnormal behavior detection

Detect abnormal behaviors or suspicious objects in surveillance videos.

Real-time processing capabilities ensure timely response.

Retail analysis

Product recognition

Identify and locate products in a retail environment.

High-precision detection improves inventory management and customer experience.

🚀 D-FINE

D-FINE is a powerful real-time object detector. It redefines the bounding box regression task in DETR models, achieving outstanding localization precision. This README provides an overview, performance details, usage examples, training information, and application scenarios of the D-FINE model.

✨ Features

High Precision: Redefines the bounding box regression task in DETR models to achieve outstanding localization precision.
Two Key Components: Comprises Fine-grained Distribution Refinement (FDR) and Global Optimal Localization Self-Distillation (GO-LSD).
Versatile Applications: Ideal for real-time object detection in various fields such as autonomous driving, surveillance systems, robotics, and retail analytics.
Flexible Deployment: Suitable for both edge devices and large-scale systems, ensuring high accuracy and speed in dynamic, real-world environments.

📦 Installation

D-FINE is available through the transformers library. The usage example below also imports torch, Pillow, and requests, so install them together, making sure the transformers release is recent enough to include D-FINE:

pip install transformers torch pillow requests
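Older transformers releases do not include the D-FINE classes, so it can help to check the installed package before running the usage example. The following is a minimal sketch; the helper name dfine_available is hypothetical and not part of any library.

```python
import importlib
import importlib.util


def dfine_available() -> bool:
    """Return True if the installed transformers exposes DFineForObjectDetection.

    Hypothetical helper for illustration only.
    """
    # find_spec returns None when the package is not installed at all.
    if importlib.util.find_spec("transformers") is None:
        return False
    try:
        transformers = importlib.import_module("transformers")
        return hasattr(transformers, "DFineForObjectDetection")
    except Exception:
        # Treat any import-time failure as "not available".
        return False


print("D-FINE support found:", dfine_available())
```

If this prints False, upgrading with pip install --upgrade transformers is the likely fix.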

💻 Usage Examples

Basic Usage

import torch
import requests

from PIL import Image
from transformers import DFineForObjectDetection, AutoImageProcessor

# Download a sample image from the COCO val2017 set
url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)

# Load the image processor and the pretrained detector
image_processor = AutoImageProcessor.from_pretrained("ustc-community/dfine-xlarge-obj2coco")
model = DFineForObjectDetection.from_pretrained("ustc-community/dfine-xlarge-obj2coco")

# Resize and normalize the image into model-ready tensors
inputs = image_processor(images=image, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Keep detections scoring at least 0.3; image.size is (width, height), so reverse it to (height, width)
results = image_processor.post_process_object_detection(outputs, target_sizes=torch.tensor([image.size[::-1]]), threshold=0.3)

# Print the class name, confidence, and box coordinates of each detection
for result in results:
    for score, label_id, box in zip(result["scores"], result["labels"], result["boxes"]):
        score, label = score.item(), label_id.item()
        box = [round(i, 2) for i in box.tolist()]
        print(f"{model.config.id2label[label]}: {score:.2f} {box}")
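The post-processed boxes are corner coordinates (x_min, y_min, x_max, y_max) in pixels of the original image. Downstream tools often expect (x, y, width, height) instead, so a small conversion step is a common follow-up. The sketch below works on plain Python lists, such as those obtained with .tolist() on the result tensors; the function name to_xywh_records and the sample values are hypothetical.

```python
import json


def to_xywh_records(scores, labels, boxes, id2label):
    """Convert corner-format detections into JSON-friendly records.

    scores: list of floats, labels: list of ints,
    boxes: list of [x_min, y_min, x_max, y_max], id2label: dict int -> str.
    """
    records = []
    for score, label_id, (x_min, y_min, x_max, y_max) in zip(scores, labels, boxes):
        records.append({
            "label": id2label[label_id],
            "score": round(score, 4),
            # Store the top-left corner plus width and height.
            "bbox": [
                round(x_min, 2),
                round(y_min, 2),
                round(x_max - x_min, 2),
                round(y_max - y_min, 2),
            ],
        })
    return records


# Made-up values standing in for one image's detections.
example = to_xywh_records(
    scores=[0.97],
    labels=[0],
    boxes=[[10.0, 20.0, 110.0, 80.0]],
    id2label={0: "cat"},
)
print(json.dumps(example, indent=2))
```

With the Basic Usage variables, the call would look like to_xywh_records(result["scores"].tolist(), result["labels"].tolist(), result["boxes"].tolist(), model.config.id2label).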

📚 Documentation

Overview

The D-FINE model was proposed in "D-FINE: Redefine Regression Task in DETRs as Fine-grained Distribution Refinement" by Yansong Peng, Hebei Li, Peixi Wu, Yueyi Zhang, Xiaoyan Sun, and Feng Wu. This model was contributed by VladOS95-cyber with the help of @qubvel-hf. This is the HF transformers implementation of D-FINE.

The suffix of each checkpoint name indicates its training data:

_coco -> model trained on COCO
_obj365 -> model trained on Objects365
_obj2coco -> model trained on Objects365 and then fine-tuned on COCO

Performance

D-FINE is a powerful real-time object detector that achieves outstanding localization precision by redefining the bounding box regression task in DETR models. It comprises two key components: Fine-grained Distribution Refinement (FDR) and Global Optimal Localization Self-Distillation (GO-LSD).
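To make the idea behind FDR more concrete, the sketch below models one box edge as a probability distribution over candidate offsets and lets each layer add residual logits to the previous layer's logits. It is a simplified illustration under stated assumptions, not the D-FINE implementation: the bin offsets are evenly spaced here, and the logits and the names expected_offset and refine_edge are invented for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_offset(logits, bin_offsets):
    """Expected edge offset under the distribution defined by the logits."""
    probs = softmax(logits)
    return sum(p * w for p, w in zip(probs, bin_offsets))


def refine_edge(initial_edge, layer_residuals, bin_offsets):
    """Accumulate residual logits layer by layer and return the edge estimate after each layer."""
    logits = [0.0] * len(bin_offsets)  # start from a uniform distribution
    estimates = []
    for residual in layer_residuals:
        # Each layer refines the previous distribution instead of predicting a new coordinate.
        logits = [l + r for l, r in zip(logits, residual)]
        estimates.append(initial_edge + expected_offset(logits, bin_offsets))
    return estimates


# Assumed values: five evenly spaced offsets (in pixels) and made-up residual logits.
bin_offsets = [-4.0, -2.0, 0.0, 2.0, 4.0]
layer_residuals = [
    [0.0, 0.0, 0.5, 1.0, 0.5],
    [0.0, 0.0, 0.0, 1.5, 0.5],
]
estimates = refine_edge(100.0, layer_residuals, bin_offsets)
for layer, edge in enumerate(estimates, start=1):
    print(f"layer {layer}: edge estimate {edge:.3f}")
```

The distribution that each layer produces is the kind of fine-grained intermediate representation that GO-LSD can distill from the final layer into shallower ones.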

Training

D-FINE is trained on COCO (Lin et al. [2014]) train2017 and validated on the COCO val2017 dataset. We report the standard AP metrics (averaged over uniformly sampled IoU thresholds ranging from 0.50 to 0.95 with a step size of 0.05), and APval5000 commonly used in real scenarios.
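The IoU threshold grid behind the AP metric can be illustrated with a few lines of code. This sketch only shows how a single predicted box fares against one ground-truth box at each threshold; a full AP computation also ranks detections by score and integrates precision over recall, which is omitted here. The function names and box values are made up for the example.

```python
def box_iou(a, b):
    """Intersection over union of two boxes in (x_min, y_min, x_max, y_max) format."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Ten thresholds: 0.50, 0.55, ..., 0.95
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_matched(pred, gt):
    """Return the thresholds at which pred counts as a match for gt."""
    iou = box_iou(pred, gt)
    return [t for t in IOU_THRESHOLDS if iou >= t]


# Made-up boxes: the prediction is shifted 10 pixels to the right of the ground truth.
gt_box = (0.0, 0.0, 100.0, 100.0)
pred_box = (10.0, 0.0, 110.0, 100.0)
matched = thresholds_matched(pred_box, gt_box)
print(f"IoU = {box_iou(pred_box, gt_box):.3f}, matched at {len(matched)} of {len(IOU_THRESHOLDS)} thresholds")
```

Because the stricter thresholds only accept tightly fitting boxes, improvements in localization precision show up most clearly at the upper end of this grid.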

Applications

D-FINE is ideal for real-time object detection in diverse applications such as autonomous driving, surveillance systems, robotics, and retail analytics. Its enhanced flexibility and deployment-friendly design make it suitable for both edge devices and large-scale systems, and ensure high accuracy and speed in dynamic, real-world environments.

📄 License

This project is licensed under the Apache-2.0 license.
