YOLOS (small-sized) model
This YOLOS (small-sized) model is fine-tuned for object detection, in particular license plate detection, and is based on the original YOLOS architecture.
Quick Start
The original YOLOS model was fine-tuned on COCO 2017 object detection (118k annotated images). It was introduced in the paper "You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection" by Fang et al. and first released in this repository.
This model was further fine-tuned on the license plate dataset from Kaggle. The dataset consists of 735 images with annotations categorised as "vehicle" and "license-plate". The model was trained for 200 epochs on a single GPU using Google Colab.
Features
- Fine-tuned on the COCO 2017 object detection dataset.
- Further fine-tuned on a license plate detection dataset.
- Capable of object detection, especially for vehicles and license plates.
Usage Examples
Basic Usage
import requests
from PIL import Image
from transformers import YolosFeatureExtractor, YolosForObjectDetection

# Download the example image and decode it with PIL.
url = 'https://drive.google.com/uc?id=1p9wJIqRz3W50e2f_A0D8ftla8hoXz4T5'
image = Image.open(requests.get(url, stream=True).raw)

# Load the preprocessing pipeline and the fine-tuned detection model.
feature_extractor = YolosFeatureExtractor.from_pretrained('nickmuchi/yolos-small-rego-plates-detection')
model = YolosForObjectDetection.from_pretrained('nickmuchi/yolos-small-rego-plates-detection')

# Preprocess the image into PyTorch tensors and run a forward pass.
inputs = feature_extractor(images=image, return_tensors="pt")
outputs = model(**inputs)

# Raw predictions: per-query class logits and normalized bounding boxes.
logits = outputs.logits
bboxes = outputs.pred_boxes
Currently, both the feature extractor and model support PyTorch.
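The `logits` and `pred_boxes` values in the usage example are raw outputs and still need decoding. As a minimal sketch of what that decoding might look like, the pure-Python helper below assumes the DETR-style conventions: the last logit of each query is a "no object" class, and boxes are normalized (center_x, center_y, width, height). The function name `decode_detections`, the 0.9 threshold, and the toy inputs are illustrative assumptions, not part of the model's API; in practice the library's own post-processing utilities would normally be used instead.

```python
import math


def decode_detections(logits, boxes, image_size, threshold=0.9):
    """Turn per-query logits and normalized boxes into pixel-space detections.

    logits: list of per-query lists; the last entry is assumed to be "no object".
    boxes: list of (cx, cy, w, h) tuples normalized to [0, 1].
    image_size: (width, height) of the original image in pixels.
    """
    img_w, img_h = image_size
    detections = []
    for query_logits, (cx, cy, w, h) in zip(logits, boxes):
        # Numerically stable softmax over all classes, including "no object".
        peak = max(query_logits)
        exps = [math.exp(v - peak) for v in query_logits]
        total = sum(exps)
        probs = [e / total for e in exps]
        # Ignore the trailing "no object" class when picking the label.
        class_probs = probs[:-1]
        label = max(range(len(class_probs)), key=class_probs.__getitem__)
        score = class_probs[label]
        if score < threshold:
            continue
        # Convert center format to corner format, then scale to pixels.
        box = (
            (cx - w / 2) * img_w,
            (cy - h / 2) * img_h,
            (cx + w / 2) * img_w,
            (cy + h / 2) * img_h,
        )
        detections.append({"label": label, "score": score, "box": box})
    return detections


# Toy example (made-up numbers): two queries, two real classes plus "no object".
toy_logits = [[0.1, 6.0, 0.2], [0.0, 0.1, 5.0]]
toy_boxes = [(0.5, 0.5, 0.2, 0.1), (0.3, 0.3, 0.1, 0.1)]
results = decode_detections(toy_logits, toy_boxes, image_size=(640, 480))
```

In this toy input the second query is dominated by the "no object" class, so only the first query survives the threshold.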
Documentation
Model description
YOLOS is a Vision Transformer (ViT) trained using the DETR loss. Despite its simplicity, a base-sized YOLOS model is able to achieve 42 AP on COCO validation 2017 (similar to DETR and more complex frameworks such as Faster R-CNN).
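The DETR loss mentioned above relies on a one-to-one (bipartite) matching between predicted and ground-truth objects before any classification or box loss is computed. The sketch below illustrates only that matching step, using a brute-force search over permutations and a simplified cost (negative class probability plus L1 box distance). Real implementations typically use the Hungarian algorithm and add a generalized IoU term; all names, labels, and numbers here are illustrative assumptions.

```python
from itertools import permutations


def match_cost(pred, target):
    """Simplified matching cost: -p(target class) + L1 distance between boxes."""
    class_cost = -pred["probs"][target["label"]]
    box_cost = sum(abs(p - t) for p, t in zip(pred["box"], target["box"]))
    return class_cost + box_cost


def bipartite_match(preds, targets):
    """Assign each target to a distinct prediction with minimal total cost.

    Brute force over permutations: fine for a toy example, too slow in practice.
    Assumes there are at least as many predictions as targets.
    Returns a list of (prediction_index, target_index) pairs.
    """
    best_cost, best_assignment = float("inf"), ()
    for pred_indices in permutations(range(len(preds)), len(targets)):
        cost = sum(
            match_cost(preds[p], targets[t]) for t, p in enumerate(pred_indices)
        )
        if cost < best_cost:
            best_cost, best_assignment = cost, pred_indices
    return [(p, t) for t, p in enumerate(best_assignment)]


# Toy data: three predictions, two ground-truth objects (hypothetical labels 0 and 1).
preds = [
    {"probs": [0.1, 0.8], "box": (0.50, 0.50, 0.20, 0.10)},
    {"probs": [0.9, 0.05], "box": (0.30, 0.40, 0.40, 0.30)},
    {"probs": [0.2, 0.1], "box": (0.80, 0.80, 0.10, 0.10)},
]
targets = [
    {"label": 0, "box": (0.30, 0.40, 0.40, 0.30)},
    {"label": 1, "box": (0.50, 0.50, 0.20, 0.10)},
]
matches = bipartite_match(preds, targets)
```

Predictions left unmatched (index 2 in this toy case) would be trained toward the "no object" class.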
Intended uses & limitations
You can use the raw model for object detection. See the model hub to look for all available YOLOS models.
Technical Details
Training data
The YOLOS model was pre-trained on ImageNet-1k and fine-tuned on COCO 2017 object detection, a dataset consisting of 118k/5k annotated images for training/validation respectively.
Training
This model was fine-tuned for 200 epochs on the license plate dataset.
Evaluation results
This model achieves an AP (average precision) of 47.9.
| Metric | IoU | Area | Max Dets | Value |
|---|---|---|---|---|
| Average Precision (AP) | 0.50:0.95 | all | 100 | 0.479 |
| Average Precision (AP) | 0.50 | all | 100 | 0.752 |
| Average Precision (AP) | 0.75 | all | 100 | 0.555 |
| Average Precision (AP) | 0.50:0.95 | small | 100 | 0.147 |
| Average Precision (AP) | 0.50:0.95 | medium | 100 | 0.420 |
| Average Precision (AP) | 0.50:0.95 | large | 100 | 0.804 |
| Average Recall (AR) | 0.50:0.95 | all | 1 | 0.437 |
| Average Recall (AR) | 0.50:0.95 | all | 10 | 0.641 |
| Average Recall (AR) | 0.50:0.95 | all | 100 | 0.676 |
| Average Recall (AR) | 0.50:0.95 | small | 100 | 0.268 |
| Average Recall (AR) | 0.50:0.95 | medium | 100 | 0.641 |
| Average Recall (AR) | 0.50:0.95 | large | 100 | 0.870 |
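The AP and AR figures above are reported at different intersection-over-union (IoU) thresholds and object-size ranges. As a rough illustration of what those parameters mean, the sketch below computes IoU for two corner-format boxes and assigns a COCO-style size bucket (small below 32² pixels, large from 96² pixels upward). The helper names and example boxes are assumptions for illustration only; the reported numbers were presumably produced by a COCO evaluation tool, not by this code.

```python
def box_area(box):
    """Area of an (x_min, y_min, x_max, y_max) box; zero if degenerate."""
    x_min, y_min, x_max, y_max = box
    return max(0.0, x_max - x_min) * max(0.0, y_max - y_min)


def iou(box_a, box_b):
    """Intersection over union of two corner-format boxes."""
    inter = box_area((
        max(box_a[0], box_b[0]),
        max(box_a[1], box_b[1]),
        min(box_a[2], box_b[2]),
        min(box_a[3], box_b[3]),
    ))
    union = box_area(box_a) + box_area(box_b) - inter
    return inter / union if union > 0 else 0.0


def size_bucket(box):
    """COCO-style size category based on pixel area of the box."""
    area = box_area(box)
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"


ground_truth = (100, 100, 200, 150)  # hypothetical plate, 100 x 50 px
prediction = (110, 105, 210, 155)    # hypothetical detection, shifted slightly
overlap = iou(ground_truth, prediction)
bucket = size_bucket(ground_truth)
```

Here the IoU is about 0.68, so this detection would count as a match at the 0.50 threshold but not at 0.75, and the 5,000-pixel box falls in the "medium" bucket.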
License
The model is licensed under the Apache 2.0 license.
| Property | Details |
|---|---|
| Model Type | YOLOS (small-sized) for object and license plate detection |
| Training Data | Pre-trained on ImageNet-1k, fine-tuned on COCO 2017 object detection and the license plate dataset from Kaggle |