Segformer B4 Finetuned Cityscapes 1024 1024

Developed by NVIDIA
SegFormer is a Transformer-based semantic segmentation model fine-tuned on the Cityscapes dataset, suitable for 1024x1024 resolution image segmentation tasks.
Downloads 10.06k
Release Time: 3/2/2022

Model Overview

This model combines a hierarchical Transformer encoder with a lightweight all-MLP decoder head. The architecture is designed for semantic segmentation and performs well on benchmarks such as Cityscapes.

Model Features

Hierarchical Transformer encoder: Captures multi-scale features by producing feature maps at several resolutions.
Lightweight MLP decoder head: Uses an all-MLP decoder head to keep inference efficient.
High-resolution support: Fine-tuned on 1024x1024 resolution images.
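
To make "multi-scale features" concrete, the sketch below computes the feature-map shape each encoder stage would produce for a 1024x1024 input. The strides (4, 8, 16, 32) and channel widths (64, 128, 320, 512) are assumptions taken from the MiT-B4 configuration in the SegFormer paper, not from this page, and the 19-class default assumes the usual Cityscapes evaluation classes.

```python
# Illustrative shape arithmetic for a four-stage hierarchical encoder.
# ASSUMPTION: strides and channel widths follow the MiT-B4 configuration
# described in the SegFormer paper; they are not stated on this page.
STAGE_STRIDES = (4, 8, 16, 32)
STAGE_CHANNELS = (64, 128, 320, 512)


def stage_shapes(height, width):
    """Return (channels, height, width) of each encoder stage's feature map."""
    return [
        (channels, height // stride, width // stride)
        for stride, channels in zip(STAGE_STRIDES, STAGE_CHANNELS)
    ]


def decoder_output_shape(height, width, num_classes=19):
    """Shape of the class logits, predicted at 1/4 of the input resolution.

    ASSUMPTION: the all-MLP head brings every stage to the first stage's
    scale before fusing them, and num_classes=19 matches Cityscapes.
    """
    return (num_classes, height // STAGE_STRIDES[0], width // STAGE_STRIDES[0])


for shape in stage_shapes(1024, 1024):
    print(shape)
print(decoder_output_shape(1024, 1024))
```

Under these assumptions, the coarsest stage sees the image at 1/32 scale while the finest keeps 1/4 scale, which is what lets the decoder mix broad context with local detail.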

Model Capabilities

Image semantic segmentation
Road scene understanding
Multi-category object recognition
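
A minimal sketch of how these capabilities might be exercised with the Hugging Face transformers library. The Hub identifier below is an assumption inferred from the model name on this page, `segment_image` is a hypothetical helper name, and the code needs `torch`, `transformers`, and `Pillow` installed plus network access on first use.

```python
# ASSUMPTION: this Hub id is inferred from the model name, not given on the page.
MODEL_ID = "nvidia/segformer-b4-finetuned-cityscapes-1024-1024"


def segment_image(image_path, model_id=MODEL_ID):
    """Return a (height, width) tensor of predicted class ids for one image."""
    # Imported lazily so this module loads without the heavy dependencies.
    import torch
    from PIL import Image
    from transformers import (
        SegformerForSemanticSegmentation,
        SegformerImageProcessor,
    )

    processor = SegformerImageProcessor.from_pretrained(model_id)
    model = SegformerForSemanticSegmentation.from_pretrained(model_id)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # The raw logits are lower resolution than the input; this helper resizes
    # them to the original image size and takes the per-pixel argmax.
    return processor.post_process_semantic_segmentation(
        outputs, target_sizes=[(image.height, image.width)]
    )[0]
```

Calling `segment_image("street.jpg")` (a hypothetical file) would yield one class id per pixel; the model's `config.id2label` mapping translates those ids into category names.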

Use Cases

Autonomous driving
Road scene segmentation: Performs pixel-level semantic segmentation of urban road scenes, identifying elements such as roads, vehicles, and pedestrians.
Urban mapping
Urban landscape analysis: Extracts building, vegetation, and other information from aerial or street-view images.
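
Pixel-level segmentation amounts to picking the highest-scoring class for every pixel. The sketch below shows that step, and a simple coverage summary of the kind a landscape analysis might use, in plain Python on a toy 2x2 grid. The class list assumes the standard 19-class Cityscapes training label order, and `argmax_labels` and `class_coverage` are hypothetical helper names, not part of the model's API.

```python
from collections import Counter

# ASSUMPTION: standard 19-class Cityscapes training label order.
CITYSCAPES_CLASSES = (
    "road", "sidewalk", "building", "wall", "fence", "pole",
    "traffic light", "traffic sign", "vegetation", "terrain", "sky",
    "person", "rider", "car", "truck", "bus", "train", "motorcycle", "bicycle",
)


def argmax_labels(logits):
    """Map logits[c][y][x] to label_map[y][x], the highest-scoring class id."""
    num_classes = len(logits)
    height, width = len(logits[0]), len(logits[0][0])
    return [
        [max(range(num_classes), key=lambda c: logits[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


def class_coverage(label_map, class_names=CITYSCAPES_CLASSES):
    """Fraction of pixels assigned to each class that appears in the map."""
    counts = Counter(label for row in label_map for label in row)
    total = sum(counts.values())
    return {class_names[label]: count / total for label, count in counts.items()}


# Toy 2x2 image scored against the first three classes only.
toy_logits = [
    [[2.0, 0.1], [1.5, 0.2]],  # road scores
    [[0.5, 0.3], [0.1, 0.1]],  # sidewalk scores
    [[0.1, 3.0], [0.2, 0.9]],  # building scores
]
toy_labels = argmax_labels(toy_logits)
print(toy_labels)
print(class_coverage(toy_labels))
```

In this toy case the left column is labeled road and the right column building, so each class covers half of the image.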