🚀 English Sentence Bias Detection Model
This is an English sequence-classification model that detects bias and fairness in sentences (news articles), supporting more objective text analysis.
🚀 Quick Start
This English sequence-classification model was trained on the MBAD dataset to detect bias and fairness in sentences (news articles). It was built on top of distilbert-base-uncased and trained for 30 epochs with a batch size of 16, a learning rate of 5e-5, and a maximum sequence length of 512.
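For reference, the hyperparameters listed above can be collected into a single configuration dict. This is only a sketch: the card does not publish the actual training script, so the key names here are illustrative, not taken from real code.

```python
# Hypothetical configuration mirroring the training setup described in the card.
# Only the values come from the card; the dict and its keys are illustrative.
TRAINING_CONFIG = {
    "base_model": "distilbert-base-uncased",  # pretrained checkpoint fine-tuned here
    "epochs": 30,
    "batch_size": 16,
    "learning_rate": 5e-5,
    "max_seq_length": 512,  # inputs longer than this would be truncated
}

print(TRAINING_CONFIG["base_model"])  # distilbert-base-uncased
```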
Examples
Here are some sample sentences with their labels:
- Biased examples
  - Example 1: "Nevertheless, Trump and other Republicans have tarred the protests as havens for terrorists intent on destroying property."
  - Example 2: "Billie Eilish issues apology for mouthing an anti-Asian derogatory term in a resurfaced video."
  - Example 3: "Christians should make clear that the perpetuation of objectionable vaccines and the lack of alternatives is a kind of coercion."
- Non-biased examples
  - Example 1: "There have been a protest by a group of people"
  - Example 2: "While emphasizing he’s not singling out either party, Cohen warned about the danger of normalizing white supremacist ideology."
Model Information

| Attribute | Details |
|---|---|
| Model type | English sequence-classification model |
| Training data | MBAD Data |
| Carbon emissions | 0.319355 kg |
Model Performance

| Train Accuracy | Validation Accuracy | Train Loss | Test Loss |
|---|---|---|---|
| 76.97 | 62.00 | 0.45 | 0.96 |
💻 Usage Examples
Basic Usage
The simplest way to use this model is through the Hugging Face Inference API; alternatively, you can load it with the pipeline object from the transformers library.
```python
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification, pipeline

# Load the tokenizer and TensorFlow model weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("d4data/bias-detection-model")
model = TFAutoModelForSequenceClassification.from_pretrained("d4data/bias-detection-model")

# Wrap them in a text-classification pipeline and classify a sentence.
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
classifier("The irony, of course, is that the exhibit that invites people to throw trash at vacuuming Ivanka Trump lookalike reflects every stereotype feminists claim to stand against, oversexualizing Ivanka’s body and ignoring her hard work.")
```
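A text-classification pipeline returns a list of dicts with `label` and `score` fields. Below is a minimal post-processing sketch that reduces that output to a yes/no decision; the label strings `"Biased"` and `"Non-biased"` are assumptions for illustration and should be checked against the model's actual config before use.

```python
def is_biased(pipeline_output, threshold=0.5):
    """Reduce a text-classification pipeline result to a boolean.

    pipeline_output: a list like [{"label": "Biased", "score": 0.81}].
    The label names here are assumptions, not confirmed by the model card.
    """
    top = pipeline_output[0]
    return top["label"] == "Biased" and top["score"] >= threshold

# Mocked pipeline outputs, shown only to illustrate the expected shape:
print(is_biased([{"label": "Biased", "score": 0.81}]))      # True
print(is_biased([{"label": "Non-biased", "score": 0.93}]))  # False
```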
📄 License
This model is part of the research topic "Bias and Fairness in AI" by Deepak John Reji and Shaina Raza. If you use this work (code, model, or dataset), please star the repository at the link below:
Bias & Fairness in AI, (2022), GitHub repository, https://github.com/dreji18/Fairness-in-AI