SegFormer (b4-sized) model fine-tuned on ADE20k
A SegFormer model fine-tuned on ADE20k at 512x512 resolution, designed for efficient semantic segmentation.
Quick Start
The SegFormer model presented here is fine-tuned on the ADE20k dataset at a resolution of 512x512. It was first introduced in the paper SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers by Xie et al. and initially released in this repository.
Disclaimer: The team that released SegFormer did not create a model card for this model. This model card was written by the Hugging Face team.
Features
SegFormer combines a hierarchical Transformer encoder with a lightweight all-MLP decode head, achieving excellent results on semantic segmentation benchmarks like ADE20K and Cityscapes. The hierarchical Transformer is pre-trained on ImageNet-1k, and then a decode head is added and fine-tuned on a downstream dataset.
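To make the "all-MLP decode head" idea concrete, here is a hypothetical PyTorch sketch, not the exact SegFormer implementation: each encoder stage's feature map is projected to a shared embedding dimension with a linear layer, upsampled to the resolution of the finest stage, concatenated, fused, and classified per pixel. The class name and the default channel dimensions below are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AllMLPDecodeHeadSketch(nn.Module):
    """Illustrative sketch of an all-MLP decode head (assumed structure,
    not the reference SegFormer code): linear projection per stage,
    upsampling to the finest stage, concatenation, fusion, classification."""

    def __init__(self, in_dims=(32, 64, 160, 256), embed_dim=256, num_classes=150):
        super().__init__()
        self.projs = nn.ModuleList(nn.Linear(d, embed_dim) for d in in_dims)
        self.fuse = nn.Linear(embed_dim * len(in_dims), embed_dim)
        self.classify = nn.Linear(embed_dim, num_classes)

    def forward(self, feats):
        # feats: one (B, C_i, H_i, W_i) tensor per encoder stage, finest first
        batch = feats[0].size(0)
        height, width = feats[0].shape[2:]  # finest stage sets the output size
        upsampled = []
        for f, proj in zip(feats, self.projs):
            x = proj(f.flatten(2).transpose(1, 2))          # (B, H_i*W_i, E)
            x = x.transpose(1, 2).reshape(batch, -1, *f.shape[2:])
            x = F.interpolate(x, size=(height, width),
                              mode="bilinear", align_corners=False)
            upsampled.append(x)
        x = torch.cat(upsampled, dim=1)                      # (B, 4*E, H, W)
        x = torch.relu(self.fuse(x.flatten(2).transpose(1, 2)))
        x = self.classify(x)                                 # (B, H*W, classes)
        return x.transpose(1, 2).reshape(batch, -1, height, width)
```

With four feature maps of shapes (1, 32, 16, 16), (1, 64, 8, 8), (1, 160, 4, 4), and (1, 256, 2, 2), the head returns a (1, 150, 16, 16) map of per-class scores at the finest stage's resolution.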
Usage Examples
Basic Usage
Here is an example of using this model to segment an image from the COCO 2017 dataset into the 150 ADE20k classes:
from transformers import SegformerFeatureExtractor, SegformerForSemanticSegmentation
from PIL import Image
import requests

# Load the feature extractor and the fine-tuned model from the Hugging Face Hub
feature_extractor = SegformerFeatureExtractor.from_pretrained("nvidia/segformer-b4-finetuned-ade-512-512")
model = SegformerForSemanticSegmentation.from_pretrained("nvidia/segformer-b4-finetuned-ade-512-512")

# Download a test image from the COCO 2017 validation set
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Preprocess the image and run a forward pass
inputs = feature_extractor(images=image, return_tensors="pt")
outputs = model(**inputs)
logits = outputs.logits  # shape (batch_size, num_labels, height/4, width/4)
For more code examples, refer to the documentation.
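The model's logits come out at a quarter of the input resolution, so producing a per-pixel label map typically involves upsampling them back to the image size and taking the argmax over the class dimension. Below is a minimal sketch of that post-processing step using dummy logits in place of `outputs.logits` (the shapes assume a 512x512 input):

```python
import torch
import torch.nn.functional as F

# Dummy logits standing in for outputs.logits:
# (batch_size, 150 ADE20k classes, height/4, width/4) for a 512x512 input
logits = torch.randn(1, 150, 128, 128)
original_size = (512, 512)  # (height, width) of the input image

# Upsample the logits to the input resolution
upsampled = F.interpolate(logits, size=original_size,
                          mode="bilinear", align_corners=False)

# Per-pixel argmax gives a (512, 512) map of ADE20k class indices
seg_map = upsampled.argmax(dim=1)[0]
```

The resulting `seg_map` tensor can be colorized with an ADE20k palette or overlaid on the original image for visualization.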
Documentation
You can use the raw model for semantic segmentation. Check the model hub to find fine-tuned versions for tasks that interest you.
License
The license for this model can be found here.
BibTeX entry and citation info
@article{DBLP:journals/corr/abs-2105-15203,
author = {Enze Xie and
Wenhai Wang and
Zhiding Yu and
Anima Anandkumar and
Jose M. Alvarez and
Ping Luo},
title = {SegFormer: Simple and Efficient Design for Semantic Segmentation with
Transformers},
journal = {CoRR},
volume = {abs/2105.15203},
year = {2021},
url = {https://arxiv.org/abs/2105.15203},
eprinttype = {arXiv},
eprint = {2105.15203},
timestamp = {Wed, 02 Jun 2021 11:46:42 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-2105-15203.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
| Property | Details |
| --- | --- |
| Model Type | Vision, Image-Segmentation |
| Training Data | scene_parse_150 |
Usage Tip
You can find fine-tuned versions of the model on the model hub according to your specific needs.