Upernet Swin Small
UPerNet semantic segmentation model built on the Swin Transformer Small backbone, suited to scene parsing benchmarks such as ADE20K
Downloads 100
Release Time: 4/12/2025
Model Overview
This model pairs the UPerNet architecture with a Swin-Small encoder for semantic segmentation, and is aimed at scene parsing and general image segmentation applications
Model Features
Swin Transformer Backbone
Uses Swin-Small as the encoder; its window-based self-attention and hierarchical stage design produce feature maps at multiple scales
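To make the multi-scale idea concrete, the sketch below computes the feature-map shape after each encoder stage and splits one map into attention windows. The numbers used (patch size 4, embedding width 96, four stages, window size 7) are the standard Swin-Small settings from the Swin Transformer paper, not values stated on this card, and the function names are illustrative only.

```python
import numpy as np


def swin_stage_shapes(height, width, embed_dim=96, patch_size=4, num_stages=4):
    """Return (channels, height, width) of the feature map after each stage.

    Assumes the standard Swin layout: the first stage works on patch_size x
    patch_size patches, and every later stage halves the resolution while
    doubling the channel count.
    """
    shapes = []
    for stage in range(num_stages):
        stride = patch_size * 2 ** stage    # 4, 8, 16, 32
        channels = embed_dim * 2 ** stage   # 96, 192, 384, 768
        shapes.append((channels, height // stride, width // stride))
    return shapes


def window_partition(features, window_size=7):
    """Split an (H, W, C) map into non-overlapping windows of tokens.

    Returns an array of shape (num_windows, window_size * window_size, C).
    Self-attention is then computed inside each window independently.
    """
    h, w, c = features.shape
    if h % window_size or w % window_size:
        raise ValueError("height and width must be multiples of window_size")
    x = features.reshape(h // window_size, window_size,
                         w // window_size, window_size, c)
    x = x.transpose(0, 2, 1, 3, 4)  # group the two window indices together
    return x.reshape(-1, window_size * window_size, c)


shapes = swin_stage_shapes(224, 224)
channels, h, w = shapes[0]
stage1 = np.zeros((h, w, channels), dtype=np.float32)
windows = window_partition(stage1)
print(shapes)
print(windows.shape)
```

The sketch covers only the regular window split; Swin also alternates it with a shifted split so that information can cross window borders, which is left out here.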
UPerNet Decoder Architecture
Uses the Unified Perceptual Parsing Network (UPerNet) as the decoder, which fuses the encoder's multi-scale feature maps into a single per-pixel prediction
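The fusion step can be illustrated with a simplified, FPN-style top-down pathway in NumPy. This is a sketch under stated assumptions rather than the model's actual decoder: the weights are random, upsampling is nearest-neighbour, the pyramid pooling module that UPerNet applies to the coarsest level is omitted, and all function names and the `out_channels` value are hypothetical.

```python
import numpy as np


def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) array by an integer factor."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


def conv1x1(x, weight):
    """1x1 convolution: weight is (C_out, C_in), x is (C_in, H, W)."""
    return np.einsum("oc,chw->ohw", weight, x)


def fuse_top_down(features, out_channels=8, seed=0):
    """Fuse a fine-to-coarse list of (C, H, W) maps into one (K, H0, W0) map.

    Assumes each level is an integer factor smaller than the previous one,
    with the same factor in height and width.
    """
    rng = np.random.default_rng(seed)
    # Lateral 1x1 projections bring every level to the same channel count
    # (random weights here, learned weights in a real model).
    laterals = [
        conv1x1(f, rng.standard_normal((out_channels, f.shape[0])) * 0.01)
        for f in features
    ]
    # Top-down pathway: add each coarser level into the next finer one.
    for i in range(len(laterals) - 1, 0, -1):
        factor = laterals[i - 1].shape[1] // laterals[i].shape[1]
        laterals[i - 1] = laterals[i - 1] + upsample_nearest(laterals[i], factor)
    # Resize every level to the finest resolution and concatenate channels.
    target = laterals[0].shape[1]
    resized = [upsample_nearest(l, target // l.shape[1]) for l in laterals]
    return np.concatenate(resized, axis=0)


# Dummy pyramid with Swin-Small channel widths at reduced spatial sizes.
pyramid = [np.ones((c, s, s)) for c, s in [(96, 16), (192, 8), (384, 4), (768, 2)]]
fused = fuse_top_down(pyramid)
print(fused.shape)
```

In a trained decoder a classification head would follow the concatenated map; here the point is only how coarse, semantically rich levels are merged into the high-resolution ones.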
Pre-trained Support
Ships with pre-trained weights that can be loaded directly from the Hugging Face Hub
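This card does not name the repository id or the library used for loading, so the following is only one possible route. It assumes the `transformers` library and the `openmmlab/upernet-swin-small` checkpoint (a UPerNet model with a Swin-Small backbone), which may not be the exact checkpoint described here. `REPO_ID` and `segment_image` are illustrative names.

```python
REPO_ID = "openmmlab/upernet-swin-small"  # assumption: not stated on this card


def segment_image(image, repo_id=REPO_ID):
    """Return an (H, W) tensor of class ids for a PIL image.

    Heavy dependencies are imported inside the function so the module can be
    imported without them; the first call downloads the weights from the Hub.
    """
    import torch
    from transformers import AutoImageProcessor, UperNetForSemanticSegmentation

    processor = AutoImageProcessor.from_pretrained(repo_id)
    model = UperNetForSemanticSegmentation.from_pretrained(repo_id)
    model.eval()

    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # PIL reports (width, height); the processor expects (height, width).
    target_size = image.size[::-1]
    masks = processor.post_process_semantic_segmentation(
        outputs, target_sizes=[target_size]
    )
    return masks[0]
```

A typical call would be `segment_image(Image.open("scene.jpg").convert("RGB"))` with `from PIL import Image`, where `scene.jpg` stands in for any local image. If the checkpoint behind this card is published under a different repository or library, the loading calls will differ.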
ADE20K Optimization
Targets the ADE20K scene parsing dataset and predicts its 150 semantic classes
Model Capabilities
Image Semantic Segmentation
Scene Parsing
Pixel-Level Classification
Multi-Scale Feature Extraction
Use Cases
Computer Vision
Scene Understanding
Performs pixel-level recognition and segmentation of various objects in complex scenes
Outputs segmentation masks covering up to 150 object classes
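As a rough illustration of how such a mask is obtained and summarised, the sketch below turns a 150-channel score map into per-pixel class ids and then reports how much of the image each class occupies. The scores are random placeholders, not model output, and the function names are hypothetical.

```python
import numpy as np

NUM_CLASSES = 150  # class count stated on this card


def logits_to_mask(logits):
    """Convert (num_classes, H, W) scores into an (H, W) map of class ids."""
    return logits.argmax(axis=0)


def class_pixel_share(mask, num_classes=NUM_CLASSES):
    """Return the fraction of pixels assigned to each class id."""
    counts = np.bincount(mask.ravel(), minlength=num_classes)
    return counts / mask.size


rng = np.random.default_rng(0)
logits = rng.standard_normal((NUM_CLASSES, 4, 6))  # placeholder scores
mask = logits_to_mask(logits)
share = class_pixel_share(mask)

# List the classes that actually appear, largest area first.
present = np.flatnonzero(share)
for class_id in present[np.argsort(-share[present])]:
    print(class_id, round(float(share[class_id]), 3))
```

Mapping class ids to names requires the label list that ships with the checkpoint, which this sketch does not assume.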
Autonomous Driving Perception
Parses various elements in road scenes (vehicles, pedestrians, roads, etc.)
Remote Sensing Image Analysis
Classifies and segments ground objects in satellite/aerial images
© 2025 AIbase