Minh

Developed by minh14122003
YOLOS is an object detection model based on the Vision Transformer (ViT) and trained with the DETR loss. It reaches 28.7 AP on the COCO validation set.
Downloads 14
Release Time: 12/1/2024

Model Overview

YOLOS is a Vision Transformer model trained with DETR loss, specifically designed for object detection tasks, and can be directly used to detect objects in images.

Model Features

Transformer-based vision model
Adopts the Vision Transformer architecture, which treats an image as a sequence of patches, for end-to-end object detection.
Bipartite matching loss training
Uses the Hungarian matching algorithm to find a one-to-one assignment between predicted queries and ground-truth annotations, then trains with cross-entropy and bounding-box losses on the matched pairs.
Simple and efficient
A simple architecture whose detection accuracy is comparable to more complex frameworks such as Faster R-CNN.
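
The bipartite matching step described above can be illustrated with a small sketch. This is not the model's training code: the cost function (negative class probability plus a weighted L1 box distance), the weight of 5.0, and the names match_cost and best_assignment are assumptions chosen for illustration, and the GIoU term used in DETR-style matching is omitted. For brevity the sketch finds the optimal assignment by brute force over permutations rather than with the Hungarian algorithm; on a small example both return the same minimum-cost assignment.

```python
from itertools import permutations


def l1_box_distance(box_a, box_b):
    """Sum of absolute differences between two (cx, cy, w, h) boxes."""
    return sum(abs(a - b) for a, b in zip(box_a, box_b))


def match_cost(pred, target, box_weight=5.0):
    """Illustrative matching cost (assumed form, GIoU term omitted)."""
    # A high probability for the target class lowers the cost.
    class_cost = -pred["probs"][target["label"]]
    box_cost = l1_box_distance(pred["box"], target["box"])
    return class_cost + box_weight * box_cost


def best_assignment(preds, targets):
    """Return the minimum-cost one-to-one (query, target) pairs.

    Brute force over permutations; assumes len(preds) >= len(targets).
    A real implementation would use the Hungarian algorithm instead.
    """
    best_total, best_pairs = float("inf"), []
    for chosen in permutations(range(len(preds)), len(targets)):
        total = sum(match_cost(preds[q], targets[t]) for t, q in enumerate(chosen))
        if total < best_total:
            best_total = total
            best_pairs = [(q, t) for t, q in enumerate(chosen)]
    return best_pairs, best_total


# Hypothetical data: three predicted queries, two annotated objects.
preds = [
    {"probs": [0.1, 0.8, 0.1], "box": (0.5, 0.5, 0.2, 0.2)},
    {"probs": [0.7, 0.2, 0.1], "box": (0.2, 0.3, 0.1, 0.1)},
    {"probs": [0.3, 0.3, 0.4], "box": (0.8, 0.8, 0.3, 0.3)},
]
targets = [
    {"label": 0, "box": (0.2, 0.3, 0.1, 0.1)},
    {"label": 1, "box": (0.5, 0.5, 0.2, 0.2)},
]

pairs, total = best_assignment(preds, targets)
print(pairs, total)
```

Each annotation is matched to exactly one query. In DETR-style training, the queries left unmatched (query 2 in this example) are supervised to predict a "no object" class.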

Model Capabilities

Object detection
Image analysis
Object localization

Use Cases

General object detection
Everyday scene object detection
Detects common objects in images, such as people, animals, vehicles, etc.
Achieves 28.7 AP on the COCO validation set
Surveillance video analysis
Used for object detection and tracking in surveillance videos.
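
For the use cases above, raw detections usually need a small post-processing step before they can be drawn or passed to a tracker. The sketch below assumes each detection carries a confidence score and a box in normalized (center x, center y, width, height) form, which is the convention of DETR-style models. The dictionary layout, the 0.9 threshold, and the function names are hypothetical and not taken from this model's code.

```python
def to_pixel_xyxy(box, image_width, image_height):
    """Convert a normalized (cx, cy, w, h) box to pixel (x_min, y_min, x_max, y_max)."""
    cx, cy, w, h = box
    return (
        (cx - w / 2) * image_width,
        (cy - h / 2) * image_height,
        (cx + w / 2) * image_width,
        (cy + h / 2) * image_height,
    )


def filter_detections(detections, image_width, image_height, threshold=0.9):
    """Keep detections scoring at least `threshold` and convert their boxes to pixels."""
    kept = []
    for det in detections:
        if det["score"] >= threshold:
            kept.append(
                {
                    "label": det["label"],
                    "score": det["score"],
                    "box": to_pixel_xyxy(det["box"], image_width, image_height),
                }
            )
    return kept


# Hypothetical raw detections for a 640x480 frame.
raw = [
    {"label": "person", "score": 0.97, "box": (0.5, 0.5, 0.2, 0.4)},
    {"label": "dog", "score": 0.42, "box": (0.3, 0.7, 0.1, 0.1)},
]

results = filter_detections(raw, image_width=640, image_height=480)
for det in results:
    print(det["label"], det["score"], det["box"])
```

The threshold trades recall for precision: a lower value keeps more candidate objects, which may suit surveillance footage where missed detections are costly.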