Social Bias NER
This NER model, fine-tuned from BERT, performs multi-label token classification of generalizations, unfairness, and stereotypes to detect social bias in text.
Quick Start
The Transformers pipeline API has no class for multi-label token classification, but the code below loads the model, runs it, and formats the output.
Basic Usage
import json
import torch
from transformers import BertTokenizerFast, BertForTokenClassification

# Load the base tokenizer and the fine-tuned model, then switch to inference mode.
tokenizer = BertTokenizerFast.from_pretrained('bert-base-uncased')
model = BertForTokenClassification.from_pretrained('ethical-spectacle/social-bias-ner')
model.eval()
model.to('cuda' if torch.cuda.is_available() else 'cpu')

# Map output indices to BIO tags for the three entity types.
id2label = {
    0: 'O',
    1: 'B-STEREO',
    2: 'I-STEREO',
    3: 'B-GEN',
    4: 'I-GEN',
    5: 'B-UNFAIR',
    6: 'I-UNFAIR'
}

def predict_ner_tags(sentence):
    inputs = tokenizer(sentence, return_tensors="pt", padding=True, truncation=True, max_length=128)
    input_ids = inputs['input_ids'].to(model.device)
    attention_mask = inputs['attention_mask'].to(model.device)

    with torch.no_grad():
        outputs = model(input_ids=input_ids, attention_mask=attention_mask)
        logits = outputs.logits
        # Multi-label: score each label independently with a sigmoid, then threshold at 0.5.
        probabilities = torch.sigmoid(logits)
        predicted_labels = (probabilities > 0.5).int()

    result = []
    tokens = tokenizer.convert_ids_to_tokens(input_ids[0])
    for i, token in enumerate(tokens):
        # Skip special tokens such as [CLS], [SEP], and [PAD].
        if token not in tokenizer.all_special_tokens:
            label_indices = (predicted_labels[0][i] == 1).nonzero(as_tuple=False).squeeze(-1)
            # Fall back to 'O' when no label clears the threshold.
            labels = [id2label[idx.item()] for idx in label_indices] if label_indices.numel() > 0 else ['O']
            result.append({"token": token, "labels": labels})

    return json.dumps(result, indent=4)
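The JSON returned by `predict_ner_tags` has one entry per WordPiece token, so a phrase is spread over several entries and a word may be split into `##` pieces. The following is a minimal sketch of how that output could be grouped into text spans per entity type. `extract_spans` is a hypothetical helper, not part of the model's code, and the sample labels are made up for illustration rather than produced by the model.

```python
import json
from collections import defaultdict

def extract_spans(ner_json):
    """Group BIO-tagged tokens into text spans per entity type (hypothetical helper)."""
    tokens = json.loads(ner_json)
    spans = defaultdict(list)  # entity type -> finished span strings
    open_spans = {}            # entity type -> words of the span currently being built

    def close(entity):
        # Move a span under construction into the finished list.
        words = open_spans.pop(entity, None)
        if words:
            spans[entity].append(" ".join(words))

    for item in tokens:
        token = item["token"]
        is_subword = token.startswith("##")
        word = token[2:] if is_subword else token

        # Collect the BIO prefixes seen for each entity type on this token.
        tagged = defaultdict(set)
        for label in item["labels"]:
            if label != "O":
                prefix, entity = label.split("-", 1)
                tagged[entity].add(prefix)

        # A span ends as soon as its entity type is absent from a token.
        for entity in list(open_spans):
            if entity not in tagged:
                close(entity)

        for entity, prefixes in tagged.items():
            if is_subword and entity in open_spans:
                # Glue a WordPiece continuation onto the previous word.
                open_spans[entity][-1] += word
            elif "B" in prefixes or entity not in open_spans:
                # A B- tag (or an I- tag with nothing open) starts a new span.
                close(entity)
                open_spans[entity] = [word]
            else:
                open_spans[entity].append(word)

    for entity in list(open_spans):
        close(entity)
    return dict(spans)

# Made-up labels in the same format that predict_ner_tags returns.
sample = json.dumps([
    {"token": "nurses", "labels": ["B-STEREO", "B-GEN"]},
    {"token": "are", "labels": ["I-STEREO"]},
    {"token": "always", "labels": ["I-STEREO"]},
    {"token": "women", "labels": ["I-STEREO", "B-GEN"]},
])
print(extract_spans(sample))
```

Because each entity type is tracked separately, overlapping annotations survive: in the sample, the whole sentence forms one STEREO span while "nurses" and "women" are each their own GEN span.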
Features
This NER model is fine-tuned from BERT for multi-label token classification of:
- (GEN)eralizations
- (UNFAIR)ness
- (STEREO)types
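To illustrate what multi-label means here, the sketch below scores each tag independently with a sigmoid and keeps every tag above a threshold, so one token can carry several tags at once. It uses plain Python instead of PyTorch to stay dependency-free, and the logit values are invented for illustration, not taken from the model. The single-label decoder is included only for contrast.

```python
import math

# Same tag order as the id2label mapping in the Quick Start code.
LABELS = ['O', 'B-STEREO', 'I-STEREO', 'B-GEN', 'I-GEN', 'B-UNFAIR', 'I-UNFAIR']

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def multi_label_decode(logits, threshold=0.5):
    """Keep every label whose independent sigmoid probability exceeds the threshold."""
    labels = [label for label, z in zip(LABELS, logits) if sigmoid(z) > threshold]
    # Mirror the Quick Start fallback: no label above the threshold means 'O'.
    return labels if labels else ['O']

def single_label_decode(logits):
    """Keep only the highest-scoring label, as an argmax classifier would."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return LABELS[best]

# Hypothetical logits for one token; the values are made up.
token_logits = [-3.0, 2.1, -1.5, 1.4, -2.0, -0.7, -2.5]

print(multi_label_decode(token_logits))   # ['B-STEREO', 'B-GEN']
print(single_label_decode(token_logits))  # B-STEREO
```

With an argmax decoder the token would be tagged as a stereotype only; the thresholded sigmoid also keeps the generalization tag.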
You can try it out in Spaces :).
Documentation
GUS-Net Project Details:
Resources:
Please cite:
@article{powers2024gusnet,
  title={{GUS-Net: Social Bias Classification in Text with Generalizations, Unfairness, and Stereotypes}},
  author={Maximus Powers and Umang Mavani and Harshitha Reddy Jonala and Ansh Tiwari and Hua Wei},
  journal={arXiv preprint arXiv:2410.08388},
  year={2024},
  url={https://arxiv.org/abs/2410.08388}
}
Give our research group, Ethical Spectacle, a follow ;).
License
This project is licensed under the MIT License.
| Property | Details |
|---|---|
| Model Type | Fine-tuned from BERT for multi-label token classification |
| Training Data | Not specified |
| Metrics | F1: 0.7864, Recall: 0.7617 |
| Base Model | bert-base-uncased |
| CO2 Eq Emissions | Emissions: 8, Training Type: fine-tuning, Geographical Location: Phoenix, AZ, Hardware Used: T4 |