
YOLOS Base

Developed by hustvl
YOLOS is a Vision Transformer (ViT)-based object detection model trained with the DETR bipartite matching loss, achieving 42 AP on the COCO 2017 validation set.
Downloads 2,638
Release Date: 4/26/2022

Model Overview

YOLOS is a Vision Transformer (ViT) trained with the DETR loss, designed specifically for object detection. On the COCO 2017 validation set it matches the performance of more complex frameworks such as DETR and Faster R-CNN.
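A detection pass with this model can be sketched using the Hugging Face `transformers` library (a minimal sketch only: the `hustvl/yolos-base` checkpoint name and the `post_process_object_detection` helper are assumed from recent `transformers` releases; the blank placeholder image is for illustration):

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForObjectDetection

# Assumed checkpoint name on the Hugging Face Hub.
CHECKPOINT = "hustvl/yolos-base"
processor = AutoImageProcessor.from_pretrained(CHECKPOINT)
model = AutoModelForObjectDetection.from_pretrained(CHECKPOINT)

# Placeholder image; replace with a real photo for meaningful detections.
image = Image.new("RGB", (640, 480))
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Convert raw logits and normalized boxes to labeled boxes
# in absolute pixel coordinates of the input image.
target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
results = processor.post_process_object_detection(
    outputs, threshold=0.9, target_sizes=target_sizes
)[0]

for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    print(model.config.id2label[label.item()], round(score.item(), 3), box.tolist())
```

Each entry in `results` pairs a confidence score with a COCO class label and an `(x1, y1, x2, y2)` bounding box.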

Model Features

Transformer-based object detection
YOLOS adopts the Vision Transformer architecture, recasting object detection as a sequence prediction problem and dispensing with much of the hand-crafted complexity of traditional detection pipelines.
Bipartite matching loss
A Hungarian matching algorithm establishes the optimal one-to-one correspondence between predictions and ground-truth annotations; the model is then optimized with a combination of cross-entropy, L1, and generalized IoU losses.
High performance
Achieves 42 AP on the COCO 2017 validation set, on par with DETR and more complex frameworks such as Faster R-CNN.
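The bipartite matching step above can be illustrated with a toy example (a sketch only: SciPy's Hungarian solver stands in for the training-time matcher, the boxes are made up, and the classification cost term of the full DETR-style cost is omitted):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def box_giou(a, b):
    """Generalized IoU for axis-aligned boxes in (x1, y1, x2, y2) format."""
    # Intersection rectangle (empty if the boxes do not overlap).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    iou = inter / union
    # Smallest box enclosing both; GIoU penalizes empty enclosing area,
    # which gives a useful gradient even for non-overlapping boxes.
    c_area = ((max(a[2], b[2]) - min(a[0], b[0]))
              * (max(a[3], b[3]) - min(a[1], b[1])))
    return iou - (c_area - union) / c_area


# Toy example: 3 predicted boxes vs. 2 ground-truth boxes.
pred = np.array([[0.4, 0.4, 0.6, 0.6],
                 [0.0, 0.0, 0.1, 0.1],
                 [0.7, 0.7, 1.0, 1.0]])
gt = np.array([[0.68, 0.66, 1.0, 1.0],
               [0.42, 0.40, 0.62, 0.60]])

# Pairwise cost: L1 distance minus GIoU (more overlap -> lower cost).
cost = np.zeros((len(pred), len(gt)))
for i, p in enumerate(pred):
    for j, g in enumerate(gt):
        cost[i, j] = np.abs(p - g).sum() - box_giou(p, g)

# Hungarian algorithm: optimal one-to-one assignment minimizing total cost.
pred_idx, gt_idx = linear_sum_assignment(cost)
```

Here the first and third predictions are matched to their nearby ground-truth boxes, and the unmatched prediction would be trained toward the "no object" class; the same L1 and GIoU terms then serve as the box regression loss for the matched pairs.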

Model Capabilities

Object detection
Image analysis
Bounding box prediction

Use Cases

Computer vision
Scene understanding
Detects objects and their positions in images, suitable for scenarios such as surveillance and autonomous driving.
Accurately identifies and locates multiple objects in images.
Image annotation
Automatically generates annotations for images, including object categories and positions.
Provides high-quality image annotations, reducing manual labeling costs.