deberta-v3-large-mlm-reddit-gab Open-Source Model - Free Detection of Online Sexist Speech

Deberta V3 Large Mlm Reddit Gab

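Developed by MilaNLProc

This is a domain-adapted model trained by the MilaNLP team for SemEval-2023 Task 10 (Explainable Detection of Online Sexism). It starts from DeBERTa-v3-large and is further trained on Reddit and Gab corpora for domain adaptation.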

Large Language Model

Transformers

Language: English

License: Apache-2.0

#SexismDetection #DomainAdaptation #RedditTextAnalysis

Downloads: 436

Released: 2/28/2023

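Model Overview

A language model built for robust sexism detection, released as part of an approach that ensembles domain-adapted and regularized pretrained language models.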

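Model Features

Domain-adaptive training

The model is trained with an MLM objective on domain-specific corpora from Reddit and Gab, which improves its handling of online sexist content.

Ensembling and regularization

Regularization mitigates lexical overfitting and yields more conservative, more reliable predictions.

Debatable instances

The paper's error analysis finds that many misclassified instances are debatable borderline cases, which reflects how subjective hate speech annotation can be.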

Model Capabilities

Sexist text classification

Hate speech detection

Social media text analysis

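Use Cases

Content moderation

Filtering sexist content on social media

Automatically identify posts with sexist tendencies on Reddit and similar platforms

Validated in SemEval-2023 Task 10

Academic research

Hate speech analysis

Study the linguistic features and propagation patterns of online sexist speech

The paper includes an analysis of misclassified cases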

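🚀 MilaNLP EDOS Task Model

This model was trained and released as part of MilaNLP's solution to the EDOS shared task. It supports the explainable detection of online sexism, as well as related research and applications.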

🚀 Quick Start

For more details on the model and the EDOS shared task solution it belongs to, see the paper MilaNLP at SemEval-2023 Task 10: Ensembling Domain-Adapted and Regularized Pretrained Language Models for Robust Sexism Detection.
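As a minimal sketch of how the checkpoint might be queried, the snippet below loads it through the Hugging Face `transformers` fill-mask pipeline. The Hub ID `MilaNLProc/deberta-v3-large-mlm-reddit-gab` is an assumption pieced together from the developer and model names on this page, and the helper names `load_fill_mask`, `top_tokens`, and `demo` are illustrative only. Since the model was adapted with an MLM objective, it predicts masked tokens; sexism classification would need a fine-tuning step on top, which is not shown here.

```python
from typing import Dict, List, Tuple

# Assumed Hub ID, built from the developer and model names on this page.
MODEL_ID = "MilaNLProc/deberta-v3-large-mlm-reddit-gab"


def load_fill_mask(model_id: str = MODEL_ID):
    """Build a fill-mask pipeline (downloads the checkpoint on first use)."""
    # Imported lazily so the pure helper below works without transformers.
    from transformers import pipeline

    return pipeline("fill-mask", model=model_id)


def top_tokens(predictions: List[Dict], k: int = 3) -> List[Tuple[str, float]]:
    """Keep the k highest-scoring (token, score) pairs from pipeline output."""
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [(p["token_str"].strip(), round(p["score"], 4)) for p in ranked[:k]]


def demo() -> None:
    """Query the model once; needs network access and a large download."""
    fill = load_fill_mask()
    # Use the tokenizer's own mask token rather than hard-coding it.
    text = f"I saw that post on {fill.tokenizer.mask_token} yesterday."
    for token, score in top_tokens(fill(text)):
        print(f"{token}\t{score}")
```

Calling `demo()` runs one query end to end. It requires `transformers` with a deep learning backend installed, and DeBERTa-v3 tokenizers typically also need `sentencepiece`.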

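📚 Detailed Documentation

Adaptation Details

We domain-adapted the pretrained DeBERTa with the standard masked language modeling (MLM) objective. The training data came from the unlabeled Reddit corpus provided by the task organizers (1M posts) (Kirk et al., 2023) and the Gab Hate Corpus (87K posts) (Kennedy et al., 2022). After concatenating and shuffling the two datasets, we held out 5% as validation data, stratified by data source. The final training set contains about 20M words.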
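The data preparation described above (concatenate, shuffle, hold out 5% stratified by source) can be sketched in plain Python. This is an illustration under assumptions, not the authors' actual pipeline: the function name `stratified_split`, the `(source, text)` tuple layout, the fixed seed, and the toy corpus sizes are all hypothetical.

```python
import random
from typing import Dict, List, Tuple

Post = Tuple[str, str]  # (source, text)


def stratified_split(
    posts: List[Post], val_fraction: float = 0.05, seed: int = 0
) -> Tuple[List[Post], List[Post]]:
    """Shuffle posts and hold out val_fraction of each source for validation."""
    rng = random.Random(seed)
    by_source: Dict[str, List[Post]] = {}
    for post in posts:
        by_source.setdefault(post[0], []).append(post)

    train: List[Post] = []
    val: List[Post] = []
    for source in sorted(by_source):
        group = by_source[source]
        rng.shuffle(group)
        n_val = round(len(group) * val_fraction)
        val.extend(group[:n_val])
        train.extend(group[n_val:])

    # Shuffle again so the sources are interleaved within each split.
    rng.shuffle(train)
    rng.shuffle(val)
    return train, val


# Toy stand-ins for the two corpora; the real ones hold 1M and 87K posts.
reddit = [("reddit", f"reddit post {i}") for i in range(1000)]
gab = [("gab", f"gab post {i}") for i in range(100)]
train, val = stratified_split(reddit + gab)
print(len(train), len(val))  # 1045 55
```

Splitting per source keeps the Reddit-to-Gab ratio the same in both splits, so validation loss is not dominated by whichever corpus happens to land in the held-out sample.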

See the paper above for full details.

📄 License

This project is licensed under Apache-2.0.

📚 Citation

If you use this model, please consider citing the following paper:

@inproceedings{cercas-curry-etal-2023-milanlp,
    title = "{M}ila{NLP} at {S}em{E}val-2023 Task 10: Ensembling Domain-Adapted and Regularized Pretrained Language Models for Robust Sexism Detection",
    author = "Cercas Curry, Amanda  and
      Attanasio, Giuseppe  and
      Nozza, Debora  and
      Hovy, Dirk",
    booktitle = "Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.semeval-1.285",
    doi = "10.18653/v1/2023.semeval-1.285",
    pages = "2067--2074",
    abstract = "We present the system proposed by the MilaNLP team for the Explainable Detection of Online Sexism (EDOS) shared task. We propose an ensemble modeling approach to combine different classifiers trained with domain adaptation objectives and standard fine-tuning. Our results show that the ensemble is more robust than individual models and that regularized models generate more {``}conservative{''} predictions, mitigating the effects of lexical overfitting. However, our error analysis also finds that many of the misclassified instances are debatable, raising questions about the objective annotatability of hate speech data.",
}