SegFormer B3 fine-tuned for clothes segmentation
This is a SegFormer model fine-tuned on the ATR dataset for clothes segmentation. It can also be applied to human segmentation. The dataset on Hugging Face is named "mattmdjaga/human_parsing_dataset".
Quick Start
Installation
To use this model, install the transformers library. The usage example below also imports torch, Pillow, requests, and matplotlib, so install them together:
pip install transformers torch pillow requests matplotlib
Usage
from transformers import SegformerImageProcessor, AutoModelForSemanticSegmentation
from PIL import Image
import requests
import matplotlib.pyplot as plt
import torch
import torch.nn as nn

# Load the image processor and the fine-tuned model from the Hugging Face Hub
processor = SegformerImageProcessor.from_pretrained("sayeed99/segformer_b3_clothes")
model = AutoModelForSemanticSegmentation.from_pretrained("sayeed99/segformer_b3_clothes")

# Download an example image and make sure it has three color channels
url = "https://plus.unsplash.com/premium_photo-1673210886161-bfcc40f54d1f?ixlib=rb-4.0.3&ixid=MnwxMjA3fDB8MHxzZWFyY2h8MXx8cGVyc29uJTIwc3RhbmRpbmd8ZW58MHx8MHx8&w=1000&q=80"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# Run inference without tracking gradients
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits.cpu()

# Upsample the logits to the original image size (PIL size is (width, height))
upsampled_logits = nn.functional.interpolate(
    logits,
    size=image.size[::-1],
    mode="bilinear",
    align_corners=False,
)

# Pick the most likely label for every pixel and display the label map
pred_seg = upsampled_logits.argmax(dim=1)[0]
plt.imshow(pred_seg)
plt.show()
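The `plt.imshow(pred_seg)` call renders the label indices with matplotlib's default colormap. As a minimal sketch, assuming the prediction has been converted to a NumPy integer array, the helper below maps each of the 18 label indices to a fixed RGB color. The names `make_palette` and `colorize` and the randomly generated palette are illustrative assumptions, not part of the model or of transformers.

```python
import numpy as np

NUM_LABELS = 18  # label indices 0..17, as listed in the Labels section


def make_palette(num_labels: int = NUM_LABELS, seed: int = 0) -> np.ndarray:
    """Return a (num_labels, 3) uint8 palette; index 0 (Background) is black."""
    rng = np.random.default_rng(seed)
    palette = rng.integers(0, 256, size=(num_labels, 3)).astype(np.uint8)
    palette[0] = 0
    return palette


def colorize(label_map, palette: np.ndarray) -> np.ndarray:
    """Map an (H, W) array of label indices to an (H, W, 3) RGB image."""
    label_map = np.asarray(label_map)
    if label_map.min() < 0 or label_map.max() >= len(palette):
        raise ValueError("label index outside palette range")
    # Fancy indexing looks up one palette row per pixel
    return palette[label_map]
```

With the usage example above, this could be called as `plt.imshow(colorize(pred_seg.numpy(), make_palette()))`.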
Labels
| Label Index | Label Name |
|---|---|
| 0 | Background |
| 1 | Hat |
| 2 | Hair |
| 3 | Sunglasses |
| 4 | Upper-clothes |
| 5 | Skirt |
| 6 | Pants |
| 7 | Dress |
| 8 | Belt |
| 9 | Left-shoe |
| 10 | Right-shoe |
| 11 | Face |
| 12 | Left-leg |
| 13 | Right-leg |
| 14 | Left-arm |
| 15 | Right-arm |
| 16 | Bag |
| 17 | Scarf |
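To work with label names rather than raw indices, the label list can be written as a dictionary. This is a minimal sketch: `ID2LABEL` and `label_coverage` are illustrative names, and if the loaded model's config exposes its own mapping (`model.config.id2label`), that mapping should take precedence wherever the two differ.

```python
import numpy as np

# Label indices and names copied from the Labels section
ID2LABEL = {
    0: "Background", 1: "Hat", 2: "Hair", 3: "Sunglasses",
    4: "Upper-clothes", 5: "Skirt", 6: "Pants", 7: "Dress",
    8: "Belt", 9: "Left-shoe", 10: "Right-shoe", 11: "Face",
    12: "Left-leg", 13: "Right-leg", 14: "Left-arm", 15: "Right-arm",
    16: "Bag", 17: "Scarf",
}


def label_coverage(label_map) -> dict:
    """Return {label name: fraction of pixels} for labels present in label_map."""
    label_map = np.asarray(label_map)
    ids, counts = np.unique(label_map, return_counts=True)
    total = label_map.size
    return {ID2LABEL[int(i)]: int(c) / total for i, c in zip(ids, counts)}
```

Calling `label_coverage` on a predicted label map gives a quick summary of which categories the model found and how much of the image each one covers.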
Features
- Multi-purpose: Can be used for both clothes segmentation and human segmentation.
- New Training Code: Training code is available, and more user-friendly versions will be added soon.
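The multi-purpose use can be sketched as follows: treating every non-background pixel as "human" gives a person mask, while restricting to garment labels gives a clothes mask. Both choices are assumptions made for illustration. In particular, the set `CLOTHES_IDS` is a judgment call (accessories such as Hat, Sunglasses, or Bag could be included or excluded depending on the application), and the function names are hypothetical.

```python
import numpy as np

BACKGROUND_ID = 0
# Assumed garment labels: Upper-clothes, Skirt, Pants, Dress, Belt, Scarf
CLOTHES_IDS = (4, 5, 6, 7, 8, 17)


def human_mask(label_map) -> np.ndarray:
    """Boolean mask that is True wherever the label is not Background."""
    return np.asarray(label_map) != BACKGROUND_ID


def clothes_mask(label_map, clothes_ids=CLOTHES_IDS) -> np.ndarray:
    """Boolean mask that is True wherever the label is one of clothes_ids."""
    return np.isin(np.asarray(label_map), clothes_ids)
```

Either mask has the same height and width as the label map, so it can be used directly to index or blend the original image array.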
Documentation
Evaluation
| Label Index | Label Name | Category Accuracy | Category IoU |
|---|---|---|---|
| 0 | Background | 0.99 | 0.99 |
| 1 | Hat | 0.73 | 0.68 |
| 2 | Hair | 0.91 | 0.82 |
| 3 | Sunglasses | 0.73 | 0.63 |
| 4 | Upper-clothes | 0.87 | 0.78 |
| 5 | Skirt | 0.76 | 0.65 |
| 6 | Pants | 0.90 | 0.84 |
| 7 | Dress | 0.74 | 0.55 |
| 8 | Belt | 0.35 | 0.30 |
| 9 | Left-shoe | 0.74 | 0.58 |
| 10 | Right-shoe | 0.75 | 0.60 |
| 11 | Face | 0.92 | 0.85 |
| 12 | Left-leg | 0.90 | 0.82 |
| 13 | Right-leg | 0.90 | 0.81 |
| 14 | Left-arm | 0.86 | 0.74 |
| 15 | Right-arm | 0.82 | 0.73 |
| 16 | Bag | 0.91 | 0.84 |
| 17 | Scarf | 0.63 | 0.29 |
Overall Evaluation Metrics:
- Evaluation Loss: 0.15
- Mean Accuracy: 0.80
- Mean IoU: 0.69
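For reference, per-category accuracy and IoU of this kind are conventionally derived from a confusion matrix: accuracy for a class is TP / (TP + FN), IoU is TP / (TP + FP + FN), and the means are unweighted averages over classes. The sketch below assumes that convention; the evaluation script that produced the numbers above is not shown here, and `segmentation_metrics` is a hypothetical helper name.

```python
import numpy as np


def segmentation_metrics(pred, target, num_labels: int) -> dict:
    """Compute per-category accuracy and IoU plus their unweighted means."""
    pred = np.asarray(pred).ravel().astype(np.int64)
    target = np.asarray(target).ravel().astype(np.int64)
    # Confusion matrix: rows are ground-truth labels, columns are predictions
    conf = np.bincount(
        target * num_labels + pred, minlength=num_labels ** 2
    ).reshape(num_labels, num_labels)
    tp = np.diag(conf).astype(float)
    gt_pixels = conf.sum(axis=1)    # TP + FN per class
    pred_pixels = conf.sum(axis=0)  # TP + FP per class
    union = gt_pixels + pred_pixels - tp
    with np.errstate(divide="ignore", invalid="ignore"):
        accuracy = tp / gt_pixels   # NaN for classes absent from the ground truth
        iou = tp / union            # NaN for classes absent from both maps
    return {
        "per_category_accuracy": accuracy,
        "per_category_iou": iou,
        "mean_accuracy": float(np.nanmean(accuracy)),
        "mean_iou": float(np.nanmean(iou)),
    }
```

As a sanity check on the reported figures, the unweighted averages of the Category Accuracy and Category IoU values listed above come to about 0.80 and 0.69, matching the overall means.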
License
The license for this model can be found here.
Technical Details
BibTeX entry and citation info
@article{DBLP:journals/corr/abs-2105-15203,
  author     = {Enze Xie and
                Wenhai Wang and
                Zhiding Yu and
                Anima Anandkumar and
                Jose M. Alvarez and
                Ping Luo},
  title      = {SegFormer: Simple and Efficient Design for Semantic Segmentation with
                Transformers},
  journal    = {CoRR},
  volume     = {abs/2105.15203},
  year       = {2021},
  url        = {https://arxiv.org/abs/2105.15203},
  eprinttype = {arXiv},
  eprint     = {2105.15203},
  timestamp  = {Wed, 02 Jun 2021 11:46:42 +0200},
  biburl     = {https://dblp.org/rec/journals/corr/abs-2105-15203.bib},
  bibsource  = {dblp computer science bibliography, https://dblp.org}
}