BiRefNet_HR Open-Source Image Segmentation Model - Free Deployment for High-Resolution Background Removal and Mask Generation

BiRefNet_HR

Developed by ZhengPeng7

BiRefNet is a bilateral reference framework model for high-resolution binary image segmentation, focusing on background removal and mask generation tasks.

Image Segmentation

Safetensors

Open Source License: MIT #High-resolution image segmentation #Bilateral reference framework #Camouflaged object detection

Downloads 35.07k

Release Time : 2/1/2025

Model Overview

BiRefNet is an image segmentation model designed for binary (dichotomous) segmentation of high-resolution images, for tasks such as background removal and mask generation. It uses a bilateral reference framework, is trained on 2048x2048 images, and can also run inference at higher resolutions.

Model Features

High-resolution support

The model is trained on 2048x2048 images and supports inference at higher resolutions.

Bilateral reference framework

It adopts a unique bilateral reference framework design to improve segmentation accuracy.

Multi-task support

It supports multiple tasks such as background removal, mask generation, binary image segmentation, camouflaged object detection, and salient object detection.

Model Capabilities

Image segmentation

Background removal

Mask generation

Binary image segmentation

Camouflaged object detection

Salient object detection

Use Cases

Image editing

Product image background removal

Used for automatic background removal of product images on e-commerce platforms

Generate product images with a transparent background

Portrait segmentation

Used for separating portraits from backgrounds in photo editing applications

Generate accurate portrait masks

Computer vision

Camouflaged object detection

Used for identifying camouflaged objects in military or security fields

High-precision object segmentation results

🚀 BiRefNet

BiRefNet is a model designed for high-resolution dichotomous image segmentation. It performs well in tasks such as background removal, mask generation, and camouflaged or salient object detection.

🚀 Quick Start

0. Install Packages:

pip install -qr https://raw.githubusercontent.com/ZhengPeng7/BiRefNet/main/requirements.txt

1. Load BiRefNet:

Use codes + weights from HuggingFace

Load both the code and the weights from Hugging Face. Pro: no need to download the BiRefNet code manually. Con: the code on Hugging Face might not be the latest version (the author tries to keep it up to date).

# Load BiRefNet with weights
from transformers import AutoModelForImageSegmentation
birefnet = AutoModelForImageSegmentation.from_pretrained('ZhengPeng7/BiRefNet_HR', trust_remote_code=True)

Use codes from GitHub + weights from HuggingFace

Use the code from GitHub and only the weights from Hugging Face. Pro: the code is always the latest. Con: you need to clone the BiRefNet repo from GitHub.

# Download codes (run these two lines in a shell)
git clone https://github.com/ZhengPeng7/BiRefNet.git
cd BiRefNet

# Use codes locally (Python, run from inside the BiRefNet directory)
from models.birefnet import BiRefNet

# Load weights from Hugging Face Models
birefnet = BiRefNet.from_pretrained('ZhengPeng7/BiRefNet_HR')

Use codes from GitHub + weights from local space

Use both the code and the weights locally.

# Use codes and weights locally (run from inside the cloned BiRefNet directory)
import torch
from models.birefnet import BiRefNet
from utils import check_state_dict

birefnet = BiRefNet(bb_pretrained=False)
# PATH_TO_WEIGHT is the path to your downloaded weights file
state_dict = torch.load(PATH_TO_WEIGHT, map_location='cpu')
state_dict = check_state_dict(state_dict)
birefnet.load_state_dict(state_dict)
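
The `check_state_dict` helper is imported from the repository's `utils` module, and its body is not shown here. A common reason for such a step is that checkpoints saved from a wrapped model (for example with `torch.nn.DataParallel` or `torch.compile`) carry key prefixes such as `module.` or `_orig_mod.` that a plain model does not expect. The sketch below illustrates that general idea on an ordinary dictionary. The function name `strip_wrapper_prefixes` and the prefix list are hypothetical; this is not the repository's implementation.

```python
# Hypothetical illustration: remove wrapper prefixes from checkpoint keys.
# This is NOT the repository's check_state_dict; it only shows the general idea.
from typing import Any, Dict, Iterable


def strip_wrapper_prefixes(
    state_dict: Dict[str, Any],
    prefixes: Iterable[str] = ("module.", "_orig_mod."),  # assumed prefixes
) -> Dict[str, Any]:
    """Return a new dict whose keys have any leading wrapper prefixes removed."""
    prefixes = tuple(prefixes)
    cleaned = {}
    for key, value in state_dict.items():
        stripped = True
        # Repeat until no prefix matches, so stacked prefixes are all removed.
        while stripped:
            stripped = False
            for prefix in prefixes:
                if key.startswith(prefix):
                    key = key[len(prefix):]
                    stripped = True
        cleaned[key] = value
    return cleaned


# Plain values stand in for tensors so the example runs without PyTorch.
example = {"module._orig_mod.bb.weight": 1, "decoder.bias": 2}
print(strip_wrapper_prefixes(example))
```

If `load_state_dict` reports unexpected or missing keys, comparing the key names in the checkpoint with those of the model is usually the first thing to check.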

Use the loaded BiRefNet for inference

# Imports
from PIL import Image
import matplotlib.pyplot as plt
import torch
from torchvision import transforms
from models.birefnet import BiRefNet  # only needed when using the cloned GitHub code

birefnet = ... # -- BiRefNet should be loaded with codes above, either way.
torch.set_float32_matmul_precision(['high', 'highest'][0])
birefnet.to('cuda')
birefnet.eval()
birefnet.half()

def extract_object(birefnet, imagepath):
    # Data settings
    image_size = (2048, 2048)
    transform_image = transforms.Compose([
        transforms.Resize(image_size),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])

    # Convert to RGB so grayscale or RGBA inputs match the 3-channel Normalize
    image = Image.open(imagepath).convert('RGB')
    input_images = transform_image(image).unsqueeze(0).to('cuda').half()

    # Prediction
    with torch.no_grad():
        preds = birefnet(input_images)[-1].sigmoid().cpu()
    pred = preds[0].squeeze()
    pred_pil = transforms.ToPILImage()(pred)
    mask = pred_pil.resize(image.size)
    image.putalpha(mask)
    return image, mask

# Visualization
plt.axis("off")
plt.imshow(extract_object(birefnet, imagepath='PATH-TO-YOUR_IMAGE.jpg')[0])
plt.show()
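
The `transforms.Normalize` step in the code above uses the standard ImageNet channel means and standard deviations. As a worked example of what the preprocessing does to a single pixel, the sketch below reproduces the arithmetic in plain Python: `ToTensor` scales 8-bit values to the range [0, 1], and `Normalize` then computes `(x - mean) / std` per channel. The helper name `normalize_pixel` is hypothetical and only for illustration.

```python
# Worked example of ToTensor + Normalize on a single RGB pixel (no PyTorch needed).
IMAGENET_MEAN = (0.485, 0.456, 0.406)  # same values as in the transform above
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Map an 8-bit (R, G, B) pixel to the values the network receives."""
    scaled = [channel / 255.0 for channel in rgb]  # what ToTensor does
    return tuple(
        (value - mean) / std  # what Normalize does, per channel
        for value, mean, std in zip(scaled, IMAGENET_MEAN, IMAGENET_STD)
    )


# A mid-gray pixel maps to small positive values; a black pixel to negative ones.
print([round(v, 3) for v in normalize_pixel((128, 128, 128))])
print([round(v, 3) for v in normalize_pixel((0, 0, 0))])
```

Because the statistics are per channel, the transform expects exactly three channels, which is why non-RGB inputs need to be converted before preprocessing.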

2. Use inference endpoint locally:

You may need to click "Deploy" and set up the endpoint yourself, which may incur costs.

import requests
import base64
from io import BytesIO
from PIL import Image


YOUR_HF_TOKEN = 'xxx'
API_URL = "xxx"
headers = {
    "Authorization": "Bearer {}".format(YOUR_HF_TOKEN)
}

def base64_to_bytes(base64_string):
    # Remove the data URI prefix if present
    if "data:image" in base64_string:
        base64_string = base64_string.split(",")[1]

    # Decode the Base64 string into bytes
    image_bytes = base64.b64decode(base64_string)
    return image_bytes

def bytes_to_image(image_bytes):
    # Wrap the raw bytes in a file-like object so Pillow can read them
    image_stream = BytesIO(image_bytes)

    # Open the image using Pillow (PIL)
    image = Image.open(image_stream)
    return image

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    # Fail early on HTTP errors instead of trying to decode an error body
    response.raise_for_status()
    return response.json()

output = query({
    "inputs": "https://hips.hearstapps.com/hmg-prod/images/gettyimages-1229892983-square.jpg",
    "parameters": {}
})

# The response is expected to be a Base64-encoded image string
output_image = bytes_to_image(base64_to_bytes(output))
output_image
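
The helper functions above assume the endpoint returns the image as a Base64 string, optionally wrapped in a data URI. That decoding step can be checked offline before paying for an endpoint. The sketch below uses only the standard library; the fake payload and the name `decode_image_payload` are made up for illustration and are not taken from the endpoint's actual response.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def decode_image_payload(payload: str) -> bytes:
    """Decode a Base64 string, dropping a leading data-URI header if present."""
    if payload.startswith("data:"):
        # A data URI looks like "data:image/png;base64,<data>".
        payload = payload.split(",", 1)[1]
    return base64.b64decode(payload)


# Fake payload: the PNG signature followed by placeholder bytes (not a real image).
raw = PNG_SIGNATURE + b"placeholder"
encoded = base64.b64encode(raw).decode("ascii")

# Both the bare string and the data-URI form decode to the same bytes.
assert decode_image_payload(encoded) == raw
assert decode_image_payload("data:image/png;base64," + encoded) == raw
print(decode_image_payload(encoded).startswith(PNG_SIGNATURE))
```

Checking the first bytes of the decoded data is a cheap way to confirm that the response really is an image before handing it to Pillow.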

✨ Features

Multitask Capability: Suitable for background removal, mask generation, Dichotomous Image Segmentation, Camouflaged Object Detection, and Salient Object Detection.
High-Resolution Inference: Trained with 2048x2048 images for better performance in high-resolution scenarios.
SOTA Performance: Achieved state-of-the-art performance on three tasks (DIS, HRSOD, and COD).

📦 Installation

The installation steps are included in the "Quick Start" section.

💻 Usage Examples

Basic Usage

The basic usage of loading and using BiRefNet is demonstrated in the "Quick Start" section, including different ways to load the model and perform inference.

Advanced Usage

The inference endpoint code in the "Quick Start" section covers a more advanced scenario: it sends requests to an API for inference instead of running the model in your own process.

📚 Documentation

This BiRefNet for standard dichotomous image segmentation (DIS) is trained on DIS-TR and validated on DIS-TEs and DIS-VD.

This repo holds the official model weights of "Bilateral Reference for High-Resolution Dichotomous Image Segmentation" (CAAI AIR 2024).

This repo contains the weights of BiRefNet proposed in our paper, which has achieved the SOTA performance on three tasks (DIS, HRSOD, and COD).

Go to my GitHub page for BiRefNet codes and the latest updates: https://github.com/ZhengPeng7/BiRefNet :)

Try our online demos for inference:

Online Image Inference on Colab:
Online Inference with GUI on Hugging Face with adjustable resolutions:
Inference and evaluation of your given weights:

🔧 Technical Details

Performance:

All tested in FP16 mode.

Dataset	Method	Resolution	maxFm	wFmeasure	MAE	Smeasure	meanEm	HCE	maxEm	meanFm	adpEm	adpFm	mBA	maxBIoU	meanBIoU
DIS-VD	BiRefNet_HR-general-epoch_130	2048x2048	.925	.894	.026	.927	.952	811	.960	.909	.944	.888	.828	.837	.817
DIS-VD	BiRefNet_HR-general-epoch_130	1024x1024	.876	.840	.041	.893	.913	1348	.926	.860	.930	.857	.765	.769	.742
DIS-VD	BiRefNet-general-epoch_244	2048x2048	.888	.858	.037	.898	.934	811	.941	.878	.927	.862	.802	.790	.776
DIS-VD	BiRefNet-general-epoch_244	1024x1024	.908	.877	.034	.912	.943	1128	.953	.894	.944	.881	.796	.812	.789
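
One way to read the table is to compare, for each model, how the scores change with the inference resolution. The short script below copies two columns (maxFm, where higher is better, and MAE, where lower is better) from the table above and picks the resolution with the highest maxFm per method. It is only a reading aid; the function and variable names are illustrative.

```python
# Values copied from the DIS-VD table above: (method, resolution, maxFm, MAE).
ROWS = [
    ("BiRefNet_HR-general-epoch_130", "2048x2048", 0.925, 0.026),
    ("BiRefNet_HR-general-epoch_130", "1024x1024", 0.876, 0.041),
    ("BiRefNet-general-epoch_244", "2048x2048", 0.888, 0.037),
    ("BiRefNet-general-epoch_244", "1024x1024", 0.908, 0.034),
]


def best_resolution_by_max_fm(rows):
    """Return {method: resolution} for the row with the highest maxFm per method."""
    best = {}
    for method, resolution, max_fm, _mae in rows:
        if method not in best or max_fm > best[method][1]:
            best[method] = (resolution, max_fm)
    return {method: resolution for method, (resolution, _) in best.items()}


for method, resolution in best_resolution_by_max_fm(ROWS).items():
    print(f"{method}: best maxFm at {resolution}")
```

On these numbers, BiRefNet_HR scores higher at 2048x2048, while the earlier BiRefNet checkpoint scores higher at 1024x1024, so the inference resolution should be chosen to match the checkpoint in use.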

📄 License

This project is licensed under the MIT license.

Acknowledgement:

Many thanks to @freepik for their generous support on GPU resources for training this model!

Citation

@article{zheng2024birefnet,
  title={Bilateral Reference for High-Resolution Dichotomous Image Segmentation},
  author={Zheng, Peng and Gao, Dehong and Fan, Deng-Ping and Liu, Li and Laaksonen, Jorma and Ouyang, Wanli and Sebe, Nicu},
  journal={CAAI Artificial Intelligence Research},
  volume = {3},
  pages = {9150038},
  year={2024}
}

This repo is the official implementation of "Bilateral Reference for High-Resolution Dichotomous Image Segmentation" (CAAI AIR 2024).

Visit our GitHub repo: https://github.com/ZhengPeng7/BiRefNet for more details -- codes, docs, and model zoo!
