MiT-b4

Developed by NVIDIA
SegFormer (MiT-b4) encoder pretrained on ImageNet-1k, featuring a hierarchical Transformer architecture for semantic segmentation tasks
Downloads 3,573
Release Time: 3/2/2022

Model Overview

SegFormer is a Transformer-based semantic segmentation model that pairs a hierarchical Transformer encoder with a lightweight all-MLP decoder. This checkpoint contains only the MiT-b4 encoder, pretrained on ImageNet-1k, intended as a backbone for fine-tuning on downstream tasks.
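As a minimal sketch of loading this checkpoint, the example below runs the pretrained encoder as an ImageNet-1k classifier using the Hugging Face transformers library; the hub id "nvidia/mit-b4" and the example image URL are assumptions.

```python
# Minimal sketch, assuming the `transformers` Segformer classes and the
# hub id "nvidia/mit-b4" for this checkpoint.
import requests
from PIL import Image
from transformers import SegformerImageProcessor, SegformerForImageClassification

# Example COCO image (URL is an illustrative assumption).
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

processor = SegformerImageProcessor.from_pretrained("nvidia/mit-b4")
model = SegformerForImageClassification.from_pretrained("nvidia/mit-b4")

inputs = processor(images=image, return_tensors="pt")
outputs = model(**inputs)

# The encoder-only checkpoint was pretrained on ImageNet-1k classification,
# so the logits cover 1,000 classes.
predicted_class = outputs.logits.argmax(-1).item()
print(model.config.id2label[predicted_class])
```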

Model Features

Hierarchical Transformer Architecture
Uses a hierarchical Transformer encoder to effectively capture multi-scale features
Lightweight Design
In the full SegFormer model, the encoder is paired with a lightweight all-MLP decoder that keeps computational cost low while maintaining accuracy
Pretrained Encoder
Provides an encoder pretrained on ImageNet-1k for easy fine-tuning on downstream tasks (see the sketch after this list)
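The sketch below illustrates the fine-tuning setup referenced above: the MiT-b4 encoder is loaded into a SegFormer segmentation model via the Hugging Face transformers library. The hub id "nvidia/mit-b4" and the 150-class (ADE20K-style) label count are assumptions for illustration.

```python
# Sketch of initializing SegFormer for downstream fine-tuning; the hub id
# and the ADE20K-style label count (150) are illustrative assumptions.
from transformers import SegformerForSemanticSegmentation

model = SegformerForSemanticSegmentation.from_pretrained(
    "nvidia/mit-b4",
    num_labels=150,  # adjust to the number of classes in your dataset
)

# The MiT-b4 encoder weights are loaded from the checkpoint, while the
# lightweight all-MLP decode head is newly initialized and must be trained
# on the downstream segmentation dataset.
```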

Model Capabilities

Image Semantic Segmentation
Multi-scale Feature Extraction (see the sketch below)
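For the multi-scale feature extraction capability, the sketch below pulls per-stage feature maps from the encoder using transformers' SegformerModel; the hub id and the quoted stage resolutions are assumptions based on the b4 configuration.

```python
# Sketch of extracting the encoder's hierarchical feature maps; the stage
# widths and strides mentioned below are assumptions for the b4 configuration.
import torch
from transformers import SegformerModel

model = SegformerModel.from_pretrained("nvidia/mit-b4")
pixel_values = torch.randn(1, 3, 512, 512)  # dummy input batch

with torch.no_grad():
    outputs = model(pixel_values, output_hidden_states=True)

# One feature map per encoder stage, at progressively coarser resolution
# (roughly strides 4, 8, 16, and 32 relative to the input).
for feature_map in outputs.hidden_states:
    print(tuple(feature_map.shape))
```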

Use Cases

Computer Vision
Scene Understanding
Used for semantic segmentation and scene parsing on datasets such as ADE20K
When paired with the SegFormer decoder, achieves strong results on benchmarks such as ADE20K and Cityscapes
Object Recognition
Identifies and segments specific objects in images