deberta-v3-large-mlm-reddit-gab - open-source model for detecting online sexism

Deberta V3 Large Mlm Reddit Gab

Developed by MilaNLProc

A domain-adapted model trained by the MilaNLP team for SemEval-2023 Task 10 (Explainable Detection of Online Sexism, EDOS). It adapts DeBERTa-v3-large to Reddit and Gab text.

Large language model

Transformers

Language: English · License: Apache-2.0 · #SexismDetection #DomainAdaptation #RedditTextAnalysis

Downloads: 436

Released: 2/28/2023

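Model overview

A domain-adapted pretrained language model, released as part of an ensemble of domain-adapted and regularized models for robust sexism detection.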

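Model features

Domain adaptation

Trained with masked language modeling (MLM) on domain-specific Reddit and Gab text to better capture the language of online sexism.

Regularization in the ensemble

According to the paper, regularized models generate more conservative predictions, mitigating the effects of lexical overfitting.

Debatable instances

The paper's error analysis finds that many misclassified instances are debatable, which reflects how subjective hate speech annotation can be.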

Model capabilities

Sexist text classification

Hate speech detection

Social media text analysis

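Use cases

Content moderation

Filtering sexist content on social media

Automatically flag posts with sexist content on Reddit and similar platforms

Validated in SemEval-2023 Task 10

Academic research

Hate speech analysis

Study the linguistic features and spread of online sexist speech

The paper includes an analysis of misclassified cases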

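🚀 MilaNLP EDOS task model

This model was trained and released as part of MilaNLP's solution to the EDOS shared task. It is intended as a tool for the explainable detection of online sexism and to support research and applications in this area.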

🚀 Quick start

For more details, see the paper MilaNLP at SemEval-2023 Task 10: Ensembling Domain-Adapted and Regularized Pretrained Language Models for Robust Sexism Detection.
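As a hedged starting point, the sketch below shows one way to load the checkpoint with the Hugging Face transformers library. The Hub ID is an assumption inferred from the developer and model names on this page, and the two-label classification head is also an assumption (a binary sexist / not sexist setup). Because the released checkpoint is an MLM-adapted model, such a head would be freshly initialised and would need fine-tuning on labelled data before its predictions mean anything.

```python
# Assumed Hugging Face Hub ID, inferred from the developer and model
# names shown on this page; adjust it if the actual repository differs.
MODEL_ID = "MilaNLProc/deberta-v3-large-mlm-reddit-gab"


def load_mlm(model_id: str = MODEL_ID):
    """Load the tokenizer and the masked-LM model (downloads weights on first call)."""
    # Imported lazily so this module can be read without transformers installed.
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForMaskedLM.from_pretrained(model_id)
    return tokenizer, model


def load_classifier(model_id: str = MODEL_ID, num_labels: int = 2):
    """Load the adapted encoder with a new, untrained classification head.

    num_labels=2 is an assumption (sexist vs. not sexist); the head must be
    fine-tuned on labelled data before use.
    """
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_id, num_labels=num_labels
    )
    return tokenizer, model
```

Calling `load_mlm()` gives a model suited to fill-mask style probing of the adapted vocabulary, while `load_classifier()` is the usual entry point for fine-tuning on a labelled sexism dataset.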

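📚 Documentation

Adaptation details

We adapted the pretrained DeBERTa with standard masked language modeling (MLM) on the unlabelled Reddit corpus provided by the task organizers (1M posts) (Kirk et al., 2023) and the Gab Hate Corpus (87k posts) (Kennedy et al., 2022). After concatenating and shuffling the two datasets, we held out 5% as validation data, stratified by data source. The final training dataset has around 20M words.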
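The hold-out step described above (concatenate, shuffle, keep 5% for validation, stratified by data source) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `stratified_holdout`, the `(source, text)` tuple layout, the seed, and the toy posts are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_holdout(posts, holdout_frac=0.05, seed=0):
    """Split (source, text) pairs into train/validation sets.

    The same fraction is held out from every source, so the validation
    set keeps the Reddit/Gab proportions of the full corpus.
    """
    rng = random.Random(seed)

    # Group posts by their data source (e.g. "reddit" or "gab").
    by_source = defaultdict(list)
    for source, text in posts:
        by_source[source].append((source, text))

    train, valid = [], []
    for source in sorted(by_source):  # sorted for reproducibility
        items = by_source[source]
        rng.shuffle(items)
        n_valid = round(len(items) * holdout_frac)
        valid.extend(items[:n_valid])
        train.extend(items[n_valid:])

    # Shuffle again so the sources are interleaved in each split.
    rng.shuffle(train)
    rng.shuffle(valid)
    return train, valid


# Toy stand-in for the two corpora (made-up posts, not the real data).
posts = [("reddit", f"reddit post {i}") for i in range(1000)]
posts += [("gab", f"gab post {i}") for i in range(100)]
train, valid = stratified_holdout(posts)
```

Splitting per source before merging is one simple way to guarantee that the smaller corpus is represented in the validation set in the same proportion as in the training set.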

See the paper above for full details.

📄 License

This project is licensed under Apache-2.0.

📚 Citation

If you use this model, please consider citing the following paper:

@inproceedings{cercas-curry-etal-2023-milanlp,
    title = "{M}ila{NLP} at {S}em{E}val-2023 Task 10: Ensembling Domain-Adapted and Regularized Pretrained Language Models for Robust Sexism Detection",
    author = "Cercas Curry, Amanda  and
      Attanasio, Giuseppe  and
      Nozza, Debora  and
      Hovy, Dirk",
    booktitle = "Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.semeval-1.285",
    doi = "10.18653/v1/2023.semeval-1.285",
    pages = "2067--2074",
    abstract = "We present the system proposed by the MilaNLP team for the Explainable Detection of Online Sexism (EDOS) shared task. We propose an ensemble modeling approach to combine different classifiers trained with domain adaptation objectives and standard fine-tuning. Our results show that the ensemble is more robust than individual models and that regularized models generate more {``}conservative{''} predictions, mitigating the effects of lexical overfitting. However, our error analysis also finds that many of the misclassified instances are debatable, raising questions about the objective annotatability of hate speech data.",
}