Inclusively Classification Model
This is a classification model for inclusive language in Italian, fine-tuned from the Italian BERT model.
Quick Start
This model is designed to classify inclusive language in Italian. It can detect three classes: `inclusive`, `not_inclusive`, and `not_pertinent`.
Features
- Fine-tuned from Italian BERT: Leveraging the pre-trained Italian BERT model for high-quality classification.
- Three-class classification: Capable of distinguishing between inclusive, non-inclusive, and non-pertinent sentences.
Usage Examples
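The original card does not include loading code; the snippet below is a minimal inference sketch using the Hugging Face `transformers` library, where `MODEL_ID` is a placeholder to be replaced with this model's actual Hub repository id.

```python
# Minimal inference sketch (requires the transformers and torch packages).
# MODEL_ID is a placeholder: replace it with this model's Hugging Face Hub id.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_ID = "<this-model's-hub-id>"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

# Example sentence: "The professors (masculine form) are invited to the meeting."
sentence = "I professori sono invitati alla riunione."

inputs = tokenizer(sentence, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.softmax(logits, dim=-1).squeeze()
predicted_label = model.config.id2label[int(probs.argmax())]
print(predicted_label)  # one of: inclusive, not_inclusive, not_pertinent
```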
Documentation
Training data
The model has been trained on a dataset containing:
- 8580 training sentences
- 1073 validation sentences
- 1072 test sentences
The data has been manually annotated by experts in the field of inclusive language (the dataset is not publicly available yet).
Training procedure
The model has been fine-tuned from the Italian BERT model using the following hyperparameters:
- `max_length`: 128
- `batch_size`: 128
- `learning_rate`: 5e-5
- `warmup_steps`: 500
- `epochs`: 10 (the best model is selected based on validation accuracy)
- `optimizer`: AdamW
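For illustration only, these settings could be mapped onto the Hugging Face `Trainer` API roughly as sketched below. This is not the authors' training script, and the base checkpoint `dbmdz/bert-base-italian-cased` is an assumption (any Italian BERT checkpoint could be substituted).

```python
# Hypothetical fine-tuning configuration mirroring the hyperparameters above.
# The base checkpoint and dataset handling are assumptions, not taken from the card.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    TrainingArguments,
)

checkpoint = "dbmdz/bert-base-italian-cased"  # assumed Italian BERT base model
labels = ["inclusive", "not_inclusive", "not_pertinent"]

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint,
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={label: i for i, label in enumerate(labels)},
)

def tokenize(batch):
    # max_length = 128
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

training_args = TrainingArguments(
    output_dir="inclusively-classification",
    per_device_train_batch_size=128,   # batch_size = 128
    learning_rate=5e-5,                # learning_rate = 5e-5
    warmup_steps=500,                  # warmup_steps = 500
    num_train_epochs=10,               # epochs = 10
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,       # keep the best checkpoint ...
    metric_for_best_model="accuracy",  # ... selected by validation accuracy
)
# The default Trainer optimizer is AdamW, matching the listed setting.
# A Trainer would then be built with the tokenized train/validation splits
# and an accuracy-based compute_metrics function, and trained with .train().
```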
Evaluation results
The model has been evaluated on the test set and obtained the following results:
| Model | Accuracy | Inclusive F1 | Not inclusive F1 | Not pertinent F1 |
|---|---|---|---|---|
| TF-IDF + MLP | 0.68 | 0.63 | 0.69 | 0.66 |
| TF-IDF + SVM | 0.61 | 0.53 | 0.60 | 0.78 |
| TF-IDF + GB | 0.74 | 0.74 | 0.76 | 0.72 |
| Multilingual | 0.86 | 0.88 | 0.89 | 0.83 |
| This model | 0.89 | 0.88 | 0.92 | 0.85 |
Compared with a multilingual model trained on the same data, this model achieves better overall results.
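As a reference for how the table's metrics could be reproduced, the following is a small sketch using scikit-learn; the label lists here are illustrative placeholders, not the actual test-set predictions.

```python
# Illustrative computation of accuracy and per-class F1, as reported above.
from sklearn.metrics import accuracy_score, f1_score

labels = ["inclusive", "not_inclusive", "not_pertinent"]

# Placeholder gold labels and predictions; in practice these come from the test set.
y_true = ["inclusive", "not_inclusive", "not_pertinent", "inclusive"]
y_pred = ["inclusive", "not_inclusive", "not_inclusive", "inclusive"]

accuracy = accuracy_score(y_true, y_pred)
per_class_f1 = f1_score(y_true, y_pred, labels=labels, average=None)

print(f"Accuracy: {accuracy:.2f}")
for label, f1 in zip(labels, per_class_f1):
    print(f"{label} F1: {f1:.2f}")
```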
Technical Details
The model is fine-tuned from the Italian BERT model, which provides a strong foundation for the classification task. The hyperparameters were carefully selected to optimize performance on the inclusive language classification dataset.
License
This project is licensed under the CC BY-NC-SA 4.0 license.
Citation
If you use this model, please make sure to cite the following papers:
Main paper:
@article{10.1145/3729237,
author = {Greco, Salvatore and La Quatra, Moreno and Cagliero, Luca and Cerquitelli, Tania},
title = {Towards AI-Assisted Inclusive Language Writing in Italian Formal Communications},
year = {2025},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
issn = {2157-6904},
url = {https://doi.org/10.1145/3729237},
doi = {10.1145/3729237},
note = {Just Accepted},
journal = {ACM Trans. Intell. Syst. Technol.},
month = apr,
}
Demo paper:
@InProceedings{PKDD23_inclusively,
author="La Quatra, Moreno
and Greco, Salvatore
and Cagliero, Luca
and Cerquitelli, Tania",
title="Inclusively: An AI - Based Assistant for Inclusive Writing",
booktitle="Machine Learning and Knowledge Discovery in Databases: Applied Data Science and Demo Track",
year="2023",
publisher="Springer Nature Switzerland",
address="Cham",
pages="361--365",
isbn="978-3-031-43430-3",
doi="10.1007/978-3-031-43430-3_31"
}