oneformer_ade20k_dinat_large: an open-source image segmentation model that handles semantic, instance, and panoptic segmentation with a single model

Oneformer Ade20k Dinat Large

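Developed by shi-labs

The first multi-task universal image segmentation framework: a single model handles semantic, instance, and panoptic segmentation.

Image Segmentation

Transformers

License: MIT #multi-task image segmentation #universal Transformer architecture #dynamic task adaptation

Downloads: 2,275

Released: 11/15/2022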

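Model Overview

OneFormer is a Transformer-based universal image segmentation model that handles semantic, instance, and panoptic segmentation with a single architecture and a single training run. This checkpoint was trained on the ADE20k dataset.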

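Model Features

Unified multi-task architecture

A single model supports semantic, instance, and panoptic segmentation.

Dynamic task adaptation

A task token guides the model toward one task during training and lets the task be switched dynamically at inference.

Outperforms specialized models

Outperforms purpose-built single-task models on several segmentation tasks.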

Model Capabilities

Semantic segmentation

Instance segmentation

Panoptic segmentation

Scene parsing

Object recognition

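Use Cases

Computer vision

Scene understanding

Pixel-level semantic parsing of indoor and outdoor scenes

Recognizes 150 classes of scene elements (based on the ADE20k dataset)

Object instance segmentation

Identifies and segments individual object instances in an image

Can handle overlapping objects in complex scenes

Autonomous driving

Road scene parsing

Segments roads, vehicles, pedestrians, and other elements in real time

Suitable for the environment perception module of an autonomous driving system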
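
For the scene-parsing use cases above, a common follow-up step is to report how much of the image each class covers. The sketch below is a minimal illustration under stated assumptions: `class_area_fractions` is a hypothetical helper (not part of the transformers library), the semantic map is passed as a nested list of class ids (a tensor can be converted with `.tolist()`), and the three labels in the toy example are made up. With a real checkpoint, the id-to-name mapping would normally come from `model.config.id2label`.

```python
from collections import Counter

def class_area_fractions(semantic_map, id2label):
    """Return the fraction of pixels covered by each class.

    semantic_map: nested list (rows of class ids), e.g. tensor.tolist().
    id2label: dict mapping class id -> human-readable name.
    """
    # Count how many pixels carry each class id
    counts = Counter(label for row in semantic_map for label in row)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Convert raw counts to fractions, falling back to a generic name
    return {
        id2label.get(class_id, f"class_{class_id}"): n / total
        for class_id, n in counts.items()
    }

# Toy 2x3 map with made-up labels (illustrative only)
toy_map = [[0, 0, 1],
           [0, 2, 1]]
toy_labels = {0: "road", 1: "car", 2: "person"}
fractions = class_area_fractions(toy_map, toy_labels)
```

Working on plain lists keeps the sketch dependency-free; for full-resolution maps a vectorized count (for example with NumPy or PyTorch) would be considerably faster.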

🚀 OneFormer

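OneFormer is the first multi-task universal image segmentation framework. It needs to be trained only once, with a single universal architecture, a single model, and a single dataset, to outperform existing specialized models on semantic, instance, and panoptic segmentation.

🚀 Quick Start

This OneFormer model was trained on the ADE20k dataset (large-sized version, Dinat backbone). It was introduced by Jain et al. in the paper OneFormer: One Transformer to Rule Universal Image Segmentation and first released in this repository.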

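✨ Key Features

Model Description

OneFormer is the first multi-task universal image segmentation framework. It needs to be trained only once, with a single universal architecture, a single model, and a single dataset, to outperform existing specialized models on semantic, instance, and panoptic segmentation. OneFormer uses a task token to condition the model on the task in focus, which makes the architecture task-guided during training and task-dynamic at inference, all with a single model.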
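
The task-token idea can be made concrete with a small sketch. It assumes the text template "the task is {task}" that the OneFormer paper describes for the task input; the helper name `build_task_prompt` and its validation are hypothetical and only illustrate how one model can be pointed at three tasks by changing a short string. In practice the processor builds and tokenizes this input from `task_inputs`.

```python
VALID_TASKS = ("semantic", "instance", "panoptic")

def build_task_prompt(task: str) -> str:
    """Build the text prompt that conditions the model on one task.

    Assumption: the template follows the "the task is {task}" form
    described in the OneFormer paper. This helper is illustrative only.
    """
    if task not in VALID_TASKS:
        raise ValueError(f"task must be one of {VALID_TASKS}, got {task!r}")
    return f"the task is {task}"

# The same model weights are reused; only the prompt changes per task
prompts = [build_task_prompt(task) for task in VALID_TASKS]
```

Switching tasks at inference therefore means changing one input string rather than loading a different model.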

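Intended Uses and Limitations

You can use this particular checkpoint for semantic, instance, and panoptic segmentation. See the model hub for other versions fine-tuned on different datasets.

💻 Usage Examples

Basic Usage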

from transformers import OneFormerProcessor, OneFormerForUniversalSegmentation
from PIL import Image
import requests

# Use the "resolve" URL to fetch the raw image; the "blob" URL returns an HTML page
url = "https://huggingface.co/datasets/shi-labs/oneformer_demo/resolve/main/ade20k.jpeg"
image = Image.open(requests.get(url, stream=True).raw)

# Load a single model for all three tasks
# Note: the Dinat backbone requires the natten package to be installed
processor = OneFormerProcessor.from_pretrained("shi-labs/oneformer_ade20k_dinat_large")
model = OneFormerForUniversalSegmentation.from_pretrained("shi-labs/oneformer_ade20k_dinat_large")

# image.size is (width, height); post-processing expects (height, width)
target_size = image.size[::-1]

# Semantic segmentation
semantic_inputs = processor(images=image, task_inputs=["semantic"], return_tensors="pt")
semantic_outputs = model(**semantic_inputs)
# Post-process the raw outputs into a per-pixel class map
predicted_semantic_map = processor.post_process_semantic_segmentation(semantic_outputs, target_sizes=[target_size])[0]

# Instance segmentation
instance_inputs = processor(images=image, task_inputs=["instance"], return_tensors="pt")
instance_outputs = model(**instance_inputs)
# Post-process the raw outputs into a per-pixel instance map
predicted_instance_map = processor.post_process_instance_segmentation(instance_outputs, target_sizes=[target_size])[0]["segmentation"]

# Panoptic segmentation
panoptic_inputs = processor(images=image, task_inputs=["panoptic"], return_tensors="pt")
panoptic_outputs = model(**panoptic_inputs)
# Post-process the raw outputs into a per-pixel segment map
predicted_panoptic_map = processor.post_process_panoptic_segmentation(panoptic_outputs, target_sizes=[target_size])[0]["segmentation"]
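
The panoptic post-processing call returns, for each image, a dictionary with a `"segmentation"` map of segment ids and a `"segments_info"` list describing each segment. The sketch below shows one way to turn that result into a readable summary. It is a minimal illustration: `summarize_segments` is a hypothetical helper, the map is passed as a nested list (a tensor can be converted with `.tolist()`), and the toy segment data and labels are invented so the block runs without downloading the model.

```python
from collections import Counter

def summarize_segments(segmentation, segments_info, id2label):
    """List each panoptic segment with its label, score, and pixel count.

    segmentation: nested list of segment ids, one per pixel.
    segments_info: list of dicts with "id", "label_id", and "score" keys.
    id2label: dict mapping class id -> human-readable name.
    """
    # Count pixels per segment id
    pixel_counts = Counter(seg_id for row in segmentation for seg_id in row)
    summary = [
        {
            "segment_id": info["id"],
            "label": id2label.get(info["label_id"], "unknown"),
            "score": info["score"],
            "pixels": pixel_counts.get(info["id"], 0),
        }
        for info in segments_info
    ]
    # Largest segments first
    return sorted(summary, key=lambda s: s["pixels"], reverse=True)

# Toy data shaped like a panoptic result (values are made up)
toy_segmentation = [[1, 1, 2],
                    [1, 2, 2],
                    [1, 1, 1]]
toy_segments_info = [
    {"id": 2, "label_id": 5, "was_fused": False, "score": 0.91},
    {"id": 1, "label_id": 0, "was_fused": False, "score": 0.98},
]
toy_id2label = {0: "wall", 5: "tree"}
summary = summarize_segments(toy_segmentation, toy_segments_info, toy_id2label)
```

With a real result, the same helper could be called with the `"segmentation"` tensor converted via `.tolist()`, the `"segments_info"` list, and `model.config.id2label`.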

For more examples, please refer to the documentation.

📚 Documentation

Citation

@article{jain2022oneformer,
  title={{OneFormer: One Transformer to Rule Universal Image Segmentation}},
  author={Jitesh Jain and Jiachen Li and MangTik Chiu and Ali Hassani and Nikita Orlov and Humphrey Shi},
  journal={arXiv},
  year={2022}
}

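📄 License

This project is released under the MIT License.

Property	Details
Model type	OneFormer model trained on the ADE20k dataset (large-sized version, Dinat backbone)
Training data	ADE20k dataset
Supported tasks	Semantic, instance, and panoptic segmentation
Paper	OneFormer: One Transformer to Rule Universal Image Segmentation
Repository	https://github.com/SHI-Labs/OneFormer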