
YOLOS Small

Developed by hustvl
A vision Transformer (ViT)-based object detection model trained with a DETR-style loss function, achieving competitive performance on the COCO dataset.
Downloads 154.46k
Release Time: 4/26/2022

Model Overview

YOLOS is a concise and efficient vision Transformer model designed for object detection. It is trained with a DETR-style bipartite matching loss and achieves detection accuracy on the COCO dataset comparable to DETR and Faster R-CNN (see the usage sketch below).
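As a quick usage illustration, the sketch below runs the model through the Hugging Face Transformers library. The model id "hustvl/yolos-small", the 0.9 score threshold, and the sample image path are assumptions for this example, not values taken from this page.

# Minimal inference sketch, assuming the checkpoint is available on the
# Hugging Face Hub as "hustvl/yolos-small" and that transformers, torch,
# and Pillow are installed.
import torch
from PIL import Image
from transformers import AutoImageProcessor, YolosForObjectDetection

image = Image.open("example.jpg")  # hypothetical local image

processor = AutoImageProcessor.from_pretrained("hustvl/yolos-small")
model = YolosForObjectDetection.from_pretrained("hustvl/yolos-small")

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert raw query predictions into thresholded, pixel-space boxes.
target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
results = processor.post_process_object_detection(
    outputs, threshold=0.9, target_sizes=target_sizes
)[0]

for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    print(f"{model.config.id2label[label.item()]}: {score:.2f} at {box.tolist()}")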

Model Features

Transformer Architecture
Utilizes a pure vision Transformer structure, enabling efficient object detection without traditional CNN components.
Bipartite Matching Loss
Employs the Hungarian algorithm for optimal matching between predictions and ground-truth annotations, combining cross-entropy and bounding-box losses for end-to-end training (see the matching sketch after this list).
Concise Design
Simple yet powerful structure, with the base-size model achieving 42 AP on COCO.
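
To make the bipartite matching idea concrete, the sketch below pairs predicted queries with ground-truth objects using the Hungarian algorithm. The cost terms and the 5.0 box-cost weight are simplified illustrations (the actual DETR/YOLOS matching cost also includes a generalized IoU term), not the exact training code.

# Simplified Hungarian matching sketch using SciPy; assumes NumPy arrays as inputs.
import numpy as np
from scipy.optimize import linear_sum_assignment

def hungarian_match(pred_probs, pred_boxes, tgt_labels, tgt_boxes):
    # pred_probs: (N, num_classes) softmax scores for N predicted queries
    # pred_boxes: (N, 4) normalized (cx, cy, w, h) boxes
    # tgt_labels: (M,) ground-truth class indices; tgt_boxes: (M, 4)
    cls_cost = -pred_probs[:, tgt_labels]                             # (N, M) classification cost
    box_cost = np.abs(pred_boxes[:, None] - tgt_boxes[None]).sum(-1)  # (N, M) L1 box cost
    cost = cls_cost + 5.0 * box_cost                                  # illustrative weighting
    pred_idx, tgt_idx = linear_sum_assignment(cost)                   # Hungarian algorithm
    return pred_idx, tgt_idx  # matched pairs feed the cross-entropy and box losses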

Model Capabilities

Multi-object detection in images
Bounding box prediction
Object classification
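
Continuing the inference sketch above, these capabilities map directly onto the model's raw outputs; the shapes noted in the comments are assumptions for a single input image.

# Inspect the raw outputs from the earlier sketch (batch of one image).
logits = outputs.logits          # (1, num_queries, num_labels + 1): class scores per query
pred_boxes = outputs.pred_boxes  # (1, num_queries, 4): normalized (cx, cy, w, h) boxes
print(logits.shape, pred_boxes.shape)

# Each query is one candidate detection, so several objects per image come out of
# a single forward pass; the extra "no object" class lets empty queries be
# filtered during post-processing.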

Use Cases

Scene Understanding
Surveillance Video Analysis
Real-time detection of targets such as pedestrians and vehicles in surveillance footage.
Autonomous Driving Perception
Identifying traffic participants and obstacles in road environments.
Content Analysis
Image Content Moderation
Detecting specific objects or sensitive content in images.