Open-source face-parsing model - Precise facial parsing for semantic segmentation in multiple application scenarios

Face Parsing

Developed by jonathandinu

Semantic segmentation model fine-tuned from nvidia/mit-b5 for face parsing tasks

Image Segmentation

Transformers

English

#Facial Part Segmentation #High-Precision Semantic Segmentation #Celebrity Face Parsing

Downloads 398.59k

Release Time: 7/6/2022

Model Overview

This is a semantic segmentation model based on the Segformer architecture, designed specifically for face parsing. It assigns every pixel of a facial image to one of 19 classes: background plus 18 regions such as skin, eyes, nose, and lips.

Model Features

High-Precision Face Parsing

Segments a face image into 19 classes, covering facial parts such as skin, eyes, eyebrows, and lips as well as hair, accessories, and clothing.

Transformer-Based Architecture

Uses the Segformer architecture, which pairs a hierarchical Transformer encoder (here MiT-B5) with a lightweight all-MLP decode head.

Browser Compatibility

Ships ONNX weights, so inference can run directly in the browser with Transformers.js.

Model Capabilities

Facial Region Segmentation

Semantic Segmentation

Image Analysis

Face Parsing

Use Cases

Computer Vision

Beauty Applications

Identifies individual facial regions so that beauty effects can be applied only to the areas they target.

Virtual Makeup

Identifies regions such as the lips and eyes so that virtual cosmetics are applied at the correct facial positions.
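As a rough illustration of the virtual makeup idea, the sketch below alpha-blends a color into the lip pixels of an image, given a per-pixel label map like the one the model produces. The function name `tint_region`, the default color, and the blending weight are assumptions made for this example; only the label ids (11 for the upper lip, 12 for the lower lip) come from the model's label table.

```python
import numpy as np

# Label ids taken from the model's label table: 11 = u_lip, 12 = l_lip.
LIP_IDS = (11, 12)


def tint_region(image, labels, ids=LIP_IDS, color=(200, 30, 60), alpha=0.5):
    """Alpha-blend `color` into the pixels of `image` whose label is in `ids`.

    image:  (H, W, 3) uint8 RGB array
    labels: (H, W) integer label map with the same height and width as `image`
    """
    if image.shape[:2] != labels.shape:
        raise ValueError("image and labels must have the same height and width")
    mask = np.isin(labels, ids)  # True where the pixel belongs to the region
    out = image.astype(np.float32)
    out[mask] = (1.0 - alpha) * out[mask] + alpha * np.asarray(color, dtype=np.float32)
    return out.round().astype(np.uint8)
```

Pixels outside the selected region are returned unchanged, so the same helper could be reused for other regions by passing different ids.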

Facial Feature Analysis

Analyzes the features and proportions of different facial regions.

The resulting masks can feed downstream applications such as facial recognition or emotion analysis.
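One simple way to look at region proportions is to count how many pixels fall into each class. The sketch below is a minimal example under the assumption that `labels` is an integer label map with ids 0 to 18, as in the model's label table; the helper name `region_fractions` is made up for illustration.

```python
import numpy as np

NUM_LABELS = 19  # ids 0..18 in the model's label table


def region_fractions(labels, num_labels=NUM_LABELS, ignore_background=True):
    """Return the fraction of pixels assigned to each label id."""
    counts = np.bincount(labels.ravel(), minlength=num_labels).astype(np.float64)
    if ignore_background:
        counts[0] = 0.0  # id 0 is background
    total = counts.sum()
    # Avoid dividing by zero when the map contains only background.
    return counts / total if total > 0 else counts
```

Ignoring the background makes the fractions comparable across images that are cropped differently.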

🚀 Face Parsing

This semantic segmentation model is fine-tuned from nvidia/mit-b5 on the CelebAMask-HQ dataset for face parsing.

[Image: example image and output]

For more options, refer to the Transformers Segformer docs.

The ONNX model for web inference is contributed by Xenova.

🚀 Quick Start

💻 Usage Examples

Basic Usage in Python

An exhaustive list of labels can be extracted from config.json.

id	label	note
0	background
1	skin
2	nose
3	eye_g	eyeglasses
4	l_eye	left eye
5	r_eye	right eye
6	l_brow	left eyebrow
7	r_brow	right eyebrow
8	l_ear	left ear
9	r_ear	right ear
10	mouth	area between lips
11	u_lip	upper lip
12	l_lip	lower lip
13	hair
14	hat
15	ear_r	earring
16	neck_l	necklace
17	neck
18	cloth	clothing
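The label table can also be read programmatically. The sketch below assumes a local copy of the model's config.json that stores the mapping under the standard Hugging Face `id2label` key; the helper name `load_id2label` is hypothetical.

```python
import json


def load_id2label(config_path):
    """Read the `id2label` mapping from a Hugging Face `config.json` file."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    # JSON object keys are strings, so convert them back to integer ids.
    return {int(k): v for k, v in config["id2label"].items()}
```

If the model is already loaded with Transformers, `model.config.id2label` should hold the same mapping without reading the file by hand.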

import torch
from torch import nn
from transformers import SegformerImageProcessor, SegformerForSemanticSegmentation

from PIL import Image
import matplotlib.pyplot as plt
import requests

# convenience expression for automatically determining device
device = (
    "cuda"
    # Device for NVIDIA or AMD GPUs
    if torch.cuda.is_available()
    else "mps"
    # Device for Apple Silicon (Metal Performance Shaders)
    if torch.backends.mps.is_available()
    else "cpu"
)

# load models
image_processor = SegformerImageProcessor.from_pretrained("jonathandinu/face-parsing")
model = SegformerForSemanticSegmentation.from_pretrained("jonathandinu/face-parsing")
model.to(device)

# the processor accepts a PIL.Image or torch.Tensor; convert to RGB in case
# the source image is grayscale or has an alpha channel
url = "https://images.unsplash.com/photo-1539571696357-5a69c17a67c6"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# run inference on image (gradients are not needed for prediction)
inputs = image_processor(images=image, return_tensors="pt").to(device)
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits  # shape (batch_size, num_labels, ~height/4, ~width/4)

# resize output to match input image dimensions
upsampled_logits = nn.functional.interpolate(
    logits,
    size=image.size[::-1],  # PIL size is (W, H); interpolate expects (H, W)
    mode="bilinear",
    align_corners=False,
)

# get label masks
labels = upsampled_logits.argmax(dim=1)[0]

# move to CPU to visualize in matplotlib
labels_viz = labels.cpu().numpy()
plt.imshow(labels_viz)
plt.show()
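The label map from the Python example holds one class id per pixel. To get one mask per label, similar in spirit to what the Transformers.js pipeline returns, it can be split as sketched below. The helper name `split_masks` is an assumption, and `ID2LABEL` lists only a few entries of the label table above for brevity.

```python
import numpy as np

# Subset of the label table above, for illustration; the full mapping has 19 entries.
ID2LABEL = {0: "background", 1: "skin", 2: "nose", 13: "hair"}


def split_masks(labels, id2label=ID2LABEL):
    """Return {label_name: boolean mask} for every label id present in `labels`."""
    masks = {}
    for label_id in np.unique(labels):
        # Fall back to a generic name for ids missing from the mapping.
        name = id2label.get(int(label_id), f"id_{int(label_id)}")
        masks[name] = labels == label_id
    return masks
```

With the variables from the example above, `split_masks(labels_viz)["hair"]` would be a boolean array that is True wherever the model predicted hair, provided hair appears in the image.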

Basic Usage in the browser (Transformers.js)

import {
  pipeline,
  env,
} from "https://cdn.jsdelivr.net/npm/@xenova/transformers@2.14.0";

// important to prevent errors since the model files are likely remote on HF hub
env.allowLocalModels = false;

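// instantiate image segmentation pipeline with pretrained face parsing model
const model = await pipeline("image-segmentation", "jonathandinu/face-parsing");

// image to segment (same URL as in the Python example)
const url = "https://images.unsplash.com/photo-1539571696357-5a69c17a67c6";

// async inference since it could take a few seconds
const output = await model(url);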

// each label is a separate mask object
// [
//   { score: null, label: 'background', mask: transformers.js RawImage { ... }}
//   { score: null, label: 'hair', mask: transformers.js RawImage { ... }}
//    ...
// ]
for (const m of output) {
  console.log(`Found ${m.label}`);
  m.mask.save(`${m.label}.png`);
}

Advanced Usage - p5.js

Since p5.js uses an animation loop abstraction, care is needed when loading the model and making predictions: the model should be loaded once, before the loop starts.

// ...

// asynchronously load transformers.js and instantiate model
async function preload() {
  // load transformers.js library with a dynamic import
  const { pipeline, env } = await import(
    "https://cdn.jsdelivr.net/npm/@xenova/transformers@2.14.0"
  );

  // important to prevent errors since the model files are remote on HF hub
  env.allowLocalModels = false;

  // instantiate image segmentation pipeline with pretrained face parsing model
  model = await pipeline("image-segmentation", "jonathandinu/face-parsing");

  print("face-parsing model loaded");
}

// ...

full p5.js example

📚 Documentation

Model Description

Property	Details
Developed by	Jonathan Dinu
Model Type	Transformer-based semantic segmentation image model
License	Non-commercial research and educational purposes
Resources for more information	Transformers docs on Segformer and the original research paper

🔧 Technical Details

Limitations and Bias

⚠️ Important Note

While the capabilities of computer vision models are impressive, they can also reinforce or exacerbate social biases. The CelebAMask-HQ dataset used for fine-tuning is large but not necessarily diverse or representative. Also, its images are of... just celebrities.
