BiRefNet
BiRefNet is a model designed for high-resolution dichotomous image segmentation, excelling in tasks such as background removal, mask generation, and object detection.
Quick Start
0. Install Packages:
pip install -qr https://raw.githubusercontent.com/ZhengPeng7/BiRefNet/main/requirements.txt
1. Load BiRefNet:
Use code + weights from Hugging Face
Load both the code and the weights from Hugging Face. Pro: no need to download the BiRefNet code manually. Con: the code on Hugging Face might not be the latest version (I'll try to keep it up to date).
from transformers import AutoModelForImageSegmentation
birefnet = AutoModelForImageSegmentation.from_pretrained('ZhengPeng7/BiRefNet_HR', trust_remote_code=True)
Use code from GitHub + weights from Hugging Face
Use only the weights from Hugging Face. Pro: the code is always the latest. Con: you need to clone the BiRefNet repo from my GitHub.
# Download the code (run in a shell)
git clone https://github.com/ZhengPeng7/BiRefNet.git
cd BiRefNet
# Load the weights (run in Python, from inside the BiRefNet directory)
from models.birefnet import BiRefNet
birefnet = BiRefNet.from_pretrained('ZhengPeng7/BiRefNet_HR')
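Use code from GitHub + weights stored locally
Use both the code and the weights locally.
import torch
from models.birefnet import BiRefNet
from utils import check_state_dict

# Build the model without downloading pretrained backbone weights
birefnet = BiRefNet(bb_pretrained=False)
# PATH_TO_WEIGHT is a placeholder for the path to your local checkpoint file
state_dict = torch.load(PATH_TO_WEIGHT, map_location='cpu')
state_dict = check_state_dict(state_dict)
birefnet.load_state_dict(state_dict)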
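Checkpoints saved from a model wrapped in `torch.nn.DataParallel` or `torch.compile` usually carry key prefixes such as `module.` or `_orig_mod.`, which a plain model rejects in `load_state_dict`. The sketch below illustrates that kind of key cleanup with a hypothetical helper, `strip_prefixes`. It is an assumption about the purpose of a helper like `check_state_dict`, not a copy of the repository's implementation, and it works on any dict, so it runs without PyTorch.

```python
def strip_prefixes(state_dict, unwanted_prefixes=("module.", "_orig_mod.")):
    """Return a copy of state_dict with wrapper prefixes removed from each key.

    Hypothetical helper for illustration; prefixes may be stacked in any order.
    """
    cleaned = {}
    for key, value in state_dict.items():
        stripped = True
        while stripped:  # repeat until no known prefix is left at the front
            stripped = False
            for prefix in unwanted_prefixes:
                if key.startswith(prefix):
                    key = key[len(prefix):]
                    stripped = True
        cleaned[key] = value
    return cleaned


# Plain integers stand in for tensors here.
raw = {"_orig_mod.module.bb.weight": 1, "module.decoder.bias": 2, "head.weight": 3}
cleaned = strip_prefixes(raw)
print(sorted(cleaned))  # ['bb.weight', 'decoder.bias', 'head.weight']
```

The helper returns a new dict rather than mutating its input, which keeps the original checkpoint available for inspection if loading still fails.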
Use the loaded BiRefNet for inference
from PIL import Image
import matplotlib.pyplot as plt
import torch
from torchvision import transforms
from models.birefnet import BiRefNet

birefnet = ...  # load BiRefNet with one of the options above
torch.set_float32_matmul_precision(['high', 'highest'][0])
birefnet.to('cuda')
birefnet.eval()
birefnet.half()

def extract_object(birefnet, imagepath):
    # Resize to the training resolution and normalize with ImageNet statistics
    image_size = (2048, 2048)
    transform_image = transforms.Compose([
        transforms.Resize(image_size),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])

    # Convert to RGB so grayscale or RGBA inputs match the 3-channel normalization
    image = Image.open(imagepath).convert('RGB')
    input_images = transform_image(image).unsqueeze(0).to('cuda').half()

    # Predict: take the last output and map logits to [0, 1]
    with torch.no_grad():
        preds = birefnet(input_images)[-1].sigmoid().cpu()
    pred = preds[0].squeeze()

    # Resize the mask back to the original size and use it as the alpha channel
    pred_pil = transforms.ToPILImage()(pred)
    mask = pred_pil.resize(image.size)
    image.putalpha(mask)
    return image, mask

# Visualize the extracted object
plt.axis("off")
plt.imshow(extract_object(birefnet, imagepath='PATH-TO-YOUR_IMAGE.jpg')[0])
plt.show()
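For reference, `transforms.ToTensor()` scales 8-bit channel values to [0, 1], and `transforms.Normalize(mean, std)` then computes `(x - mean) / std` per channel. The following is a minimal pure-Python sketch of that arithmetic for a single RGB pixel, using the same mean and standard deviation as in the preprocessing above. `normalize_pixel` is a hypothetical name used only for illustration.

```python
MEAN = (0.485, 0.456, 0.406)  # per-channel mean used in the transform above
STD = (0.229, 0.224, 0.225)   # per-channel standard deviation used above


def normalize_pixel(rgb):
    """Map an 8-bit (R, G, B) pixel to the values the model receives."""
    scaled = [c / 255.0 for c in rgb]  # the scaling ToTensor applies to uint8 input
    return [(x - m) / s for x, m, s in zip(scaled, MEAN, STD)]


black = normalize_pixel((0, 0, 0))
white = normalize_pixel((255, 255, 255))
print([round(v, 3) for v in black])
print([round(v, 3) for v in white])
```

Black maps to negative values and white to positive ones in every channel, so the model sees inputs roughly centered around zero.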
2. Use inference endpoint locally:
You may need to click Deploy and set up the endpoint yourself, which may incur some costs.
import requests
import base64
from io import BytesIO
from PIL import Image

YOUR_HF_TOKEN = 'xxx'
API_URL = "xxx"
headers = {
    "Authorization": "Bearer {}".format(YOUR_HF_TOKEN)
}

def base64_to_bytes(base64_string):
    # Remove the data URI prefix if present
    if "data:image" in base64_string:
        base64_string = base64_string.split(",")[1]
    # Decode the Base64 string into bytes
    image_bytes = base64.b64decode(base64_string)
    return image_bytes

def bytes_to_base64(image_bytes):
    # Despite its name, this returns a PIL image decoded from the raw bytes
    image_stream = BytesIO(image_bytes)
    image = Image.open(image_stream)
    return image

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({
    "inputs": "https://hips.hearstapps.com/hmg-prod/images/gettyimages-1229892983-square.jpg",
    "parameters": {}
})

output_image = bytes_to_base64(base64_to_bytes(output))
output_image
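The `base64_to_bytes` helper accepts either a bare base64 string or a data URI. The round trip below is a small standard-library sketch of that behavior. The sample payload and the `data:image/png;base64,` prefix are illustrative assumptions, since the exact response format depends on how the endpoint is configured.

```python
import base64


def base64_to_bytes(base64_string):
    """Decode base64 text, dropping a leading data-URI header if present."""
    if base64_string.startswith("data:image"):
        base64_string = base64_string.split(",", 1)[1]
    return base64.b64decode(base64_string)


payload = b"\x89PNG\r\n\x1a\n"  # the 8-byte PNG signature, standing in for a real image
encoded = base64.b64encode(payload).decode("ascii")

print(base64_to_bytes(encoded) == payload)                             # True
print(base64_to_bytes("data:image/png;base64," + encoded) == payload)  # True
```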
⨠Features
- Multitask Capability: Suitable for background removal, mask generation, Dichotomous Image Segmentation, Camouflaged Object Detection, and Salient Object Detection.
- High - Resolution Inference: Trained with
2048x2048
images for better performance in high - resolution scenarios.
- SOTA Performance: Achieved state - of - the - art performance on three tasks (DIS, HRSOD, and COD).
Installation
The installation steps are included in the "Quick Start" section.
Usage Examples
Basic Usage
The basic usage of loading and using BiRefNet is demonstrated in the "Quick Start" section, including different ways to load the model and perform inference.
Advanced Usage
The code for using the inference endpoint locally in the "Quick Start" section can be considered an advanced usage scenario, which involves making requests to an API for inference.
Documentation
This BiRefNet for standard dichotomous image segmentation (DIS) is trained on DIS-TR and validated on DIS-TEs and DIS-VD.
This repo holds the official model weights of "Bilateral Reference for High-Resolution Dichotomous Image Segmentation" (CAAI AIR 2024).
This repo contains the weights of BiRefNet proposed in our paper, which has achieved SOTA performance on three tasks (DIS, HRSOD, and COD).
Go to my GitHub page for the BiRefNet code and the latest updates: https://github.com/ZhengPeng7/BiRefNet :)
Try our online demos for inference:
- Online Image Inference on Colab
- Online Inference with GUI on Hugging Face with adjustable resolutions
- Inference and evaluation of your given weights
Technical Details
Performance:
All tested in FP16 mode.

| Dataset | Method | Resolution | maxFm | wFmeasure | MAE | Smeasure | meanEm | HCE | maxEm | meanFm | adpEm | adpFm | mBA | maxBIoU | meanBIoU |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| DIS-VD | BiRefNet_HR-general-epoch_130 | 2048x2048 | .925 | .894 | .026 | .927 | .952 | 811 | .960 | .909 | .944 | .888 | .828 | .837 | .817 |
| DIS-VD | BiRefNet_HR-general-epoch_130 | 1024x1024 | .876 | .840 | .041 | .893 | .913 | 1348 | .926 | .860 | .930 | .857 | .765 | .769 | .742 |
| DIS-VD | BiRefNet-general-epoch_244 | 2048x2048 | .888 | .858 | .037 | .898 | .934 | 811 | .941 | .878 | .927 | .862 | .802 | .790 | .776 |
| DIS-VD | BiRefNet-general-epoch_244 | 1024x1024 | .908 | .877 | .034 | .912 | .943 | 1128 | .953 | .894 | .944 | .881 | .796 | .812 | .789 |
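Among the metrics above, MAE is the simplest to state: the mean absolute difference between the predicted mask and the ground-truth mask, both scaled to [0, 1], so lower is better. A minimal pure-Python sketch on a toy 2x2 mask follows. `mean_absolute_error` is a hypothetical helper, and the evaluation code in the BiRefNet repository may differ in detail (for example, in how predictions are resized or normalized).

```python
def mean_absolute_error(pred, gt):
    """Average |pred - gt| over all pixels of two equally sized 2D masks."""
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("pred and gt must have the same shape")
    diffs = [abs(p - g) for prow, grow in zip(pred, gt) for p, g in zip(prow, grow)]
    return sum(diffs) / len(diffs)


pred = [[0.9, 0.2], [0.1, 0.8]]  # toy soft prediction
gt = [[1.0, 0.0], [0.0, 1.0]]    # toy binary ground truth
print(round(mean_absolute_error(pred, gt), 3))  # 0.15
```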
License
This project is licensed under the MIT license.
Acknowledgement:
- Many thanks to @freepik for their generous support on GPU resources for training this model!
Citation
@article{zheng2024birefnet,
  title={Bilateral Reference for High-Resolution Dichotomous Image Segmentation},
  author={Zheng, Peng and Gao, Dehong and Fan, Deng-Ping and Liu, Li and Laaksonen, Jorma and Ouyang, Wanli and Sebe, Nicu},
  journal={CAAI Artificial Intelligence Research},
  volume={3},
  pages={9150038},
  year={2024}
}
Sample results: DIS-Sample_1 and DIS-Sample_2 (images not shown here).
This repo is the official implementation of "Bilateral Reference for High-Resolution Dichotomous Image Segmentation" (CAAI AIR 2024).
Visit our GitHub repo, https://github.com/ZhengPeng7/BiRefNet, for more details: code, docs, and model zoo!