MiT-B0

Developed by NVIDIA
SegFormer is a Transformer-based semantic segmentation model that combines a hierarchical encoder with a lightweight MLP decoder and performs strongly on benchmarks such as ADE20K and Cityscapes.
Downloads 83.99k
Release Time: 3/2/2022

Model Overview

This model is the hierarchical Transformer encoder of SegFormer, pretrained on ImageNet-1k. It is intended to be fine-tuned on downstream semantic segmentation tasks.

Model Features

Hierarchical Transformer Architecture
Employs a hierarchically designed Transformer encoder to effectively capture multi-scale features.
Lightweight MLP Decoder
In the full SegFormer model, the encoder is paired with a lightweight all-MLP decoder for efficient semantic segmentation; this model provides the encoder part.
ImageNet Pretraining
The encoder is pretrained on the ImageNet-1k dataset, so it can serve as a general-purpose feature extractor before fine-tuning.
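
To make the "multi-scale features" of the hierarchical encoder concrete, the following minimal sketch computes the shape of the feature map produced by each of the four encoder stages. The channel widths and the overlapping patch-embedding settings (kernel, stride, padding) are assumptions taken from the publicly released MiT-B0 configuration, not from this page, and the function names are illustrative.

```python
from typing import List, Tuple

# Assumed MiT-B0 settings (public configuration, not stated on this page):
# per-stage channel widths and patch-embedding (kernel, stride, padding).
HIDDEN_SIZES = [32, 64, 160, 256]
PATCH_EMBEDS = [(7, 4, 3), (3, 2, 1), (3, 2, 1), (3, 2, 1)]


def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Standard output-size formula for a strided convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def stage_shapes(height: int, width: int) -> List[Tuple[int, int, int]]:
    """Return (channels, height, width) for each encoder stage's feature map."""
    shapes = []
    h, w = height, width
    for channels, (k, s, p) in zip(HIDDEN_SIZES, PATCH_EMBEDS):
        # Each stage downsamples the previous stage's output.
        h, w = conv_out(h, k, s, p), conv_out(w, k, s, p)
        shapes.append((channels, h, w))
    return shapes


if __name__ == "__main__":
    for i, shape in enumerate(stage_shapes(512, 512), start=1):
        print(f"stage {i}: {shape}")
```

Under these assumptions, the four stages yield feature maps at 1/4, 1/8, 1/16, and 1/32 of the input resolution, which is what lets a decoder combine fine detail with coarse context.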

Model Capabilities

Image Feature Extraction
Semantic Segmentation Task Fine-tuning
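
As a rough sketch of how fine-tuning might be set up, the code below builds the label mappings for a custom dataset and loads the pretrained encoder into a segmentation model using the Hugging Face transformers library. The checkpoint id "nvidia/mit-b0", the helper names, and the class names in the usage note are assumptions for illustration. The decoder head is not part of the pretrained encoder, so it starts from random weights and must be trained.

```python
from typing import Dict, List, Tuple


def make_label_maps(class_names: List[str]) -> Tuple[Dict[int, str], Dict[str, int]]:
    """Build the id-to-label and label-to-id mappings for a dataset's classes."""
    id2label = {i: name for i, name in enumerate(class_names)}
    label2id = {name: i for i, name in id2label.items()}
    return id2label, label2id


def build_segmenter(class_names: List[str], checkpoint: str = "nvidia/mit-b0"):
    """Load the pretrained encoder with a freshly initialized segmentation head.

    The checkpoint id is an assumed Hugging Face Hub name. Calling this
    function requires the transformers library and network access.
    """
    # Imported lazily so make_label_maps works without transformers installed.
    from transformers import SegformerForSemanticSegmentation

    id2label, label2id = make_label_maps(class_names)
    return SegformerForSemanticSegmentation.from_pretrained(
        checkpoint,
        id2label=id2label,
        label2id=label2id,
    )
```

For example, build_segmenter(["road", "building", "sky"]) would return a model with three output classes, ready to be trained on labeled segmentation masks.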

Use Cases

Computer Vision
Scene Understanding
After fine-tuning, used for semantic segmentation on scene parsing datasets such as ADE20K.
Urban Scene Analysis
After fine-tuning, segments urban scene elements (e.g., roads, buildings) on the Cityscapes dataset.