Inclusively Reformulation It5

Developed by E-MIMIC

An Italian sequence-to-sequence model fine-tuned from IT5-large and designed for inclusive language rewriting

Machine Translation

Transformers

#Italian language rewriting #Inclusive language #Sequence-to-sequence

Downloads 70

Release Time : 6/23/2023

Model Overview

This model analyzes Italian sentences and rewrites them to be more inclusive, for example by replacing gender-specific expressions with gender-neutral ones.

Model Features

Inclusive rewriting

Automatically rewrites non-inclusive expressions into inclusive ones

Professionally annotated training data

Trained on 4,705 expert-annotated sentence pairs to ensure rewriting quality

Synthetic data augmentation

Improves model performance by incorporating rule-generated synthetic data

Model Capabilities

Italian text rewriting

Inclusive language conversion

Gender-neutral expression generation

Use Cases

Formal document writing

Academic document rewriting

Rewriting gender-specific expressions in academic documents into gender-neutral ones

For example, rewriting 'professors' into 'teaching staff'

Corporate document rewriting

Making official corporate documents more inclusive

Content creation

News writing

Helping journalists create more inclusive content

🚀 Inclusively Rewriting model

This is an Italian sequence-to-sequence model fine-tuned from IT5-large for inclusive language rewriting. It analyzes Italian sentences and, when necessary, rewrites them to be more inclusive. For instance, it can rewrite "I professori devono essere preparati" (The professors must be prepared) as "Il personale docente deve essere preparato" (The teaching staff must be prepared).
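A minimal sketch of how such a rewrite might be run with the Hugging Face transformers library. The model identifier `E-MIMIC/inclusively-reformulation-it5`, the beam size, and the absence of a task prefix on the input are assumptions rather than details confirmed here, so check the model repository before relying on them. The helper names `load_rewriter` and `rewrite` are hypothetical.

```python
# Assumed Hub identifier; verify it against the actual model repository.
MODEL_ID = "E-MIMIC/inclusively-reformulation-it5"
MAX_LENGTH = 128  # matches the max_length listed under the training procedure


def load_rewriter(model_id=MODEL_ID):
    """Load tokenizer and seq2seq model (downloads weights on first use)."""
    # Imported lazily so the module can be inspected without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def rewrite(sentence, tokenizer, model, max_length=MAX_LENGTH, num_beams=4):
    """Return the model's rewrite of one Italian sentence.

    num_beams=4 is an illustrative choice, not a documented setting.
    """
    inputs = tokenizer(
        sentence, return_tensors="pt", truncation=True, max_length=max_length
    )
    output_ids = model.generate(**inputs, max_length=max_length, num_beams=num_beams)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage (requires network access and the transformers + torch packages):
# tokenizer, model = load_rewriter()
# print(rewrite("I professori devono essere preparati", tokenizer, model))
```

Keeping loading and generation in separate functions lets the model be loaded once and reused across many sentences.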

📚 Documentation

📊 Training data

The model was trained on a dataset of 4705 sentence pairs, each pairing a non-inclusive sentence with an inclusive one. The dataset is split as follows:

Training set: 3764 pairs
Validation set: 470 pairs
Test set: 471 pairs

A small set of rule-generated synthetic data was added to improve the model's test-set performance, bringing the total number of training pairs to 3764 + 75 = 3839. The data was manually annotated by inclusive language experts, and the dataset is not yet publicly available.
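The split sizes above can be checked with a few lines of arithmetic. This is only a sanity check on the reported numbers; the count of 75 synthetic pairs is read off the sum "3764 + 75" rather than stated separately.

```python
# Split sizes as reported in the training data section.
splits = {"train": 3764, "validation": 470, "test": 471}
SYNTHETIC_PAIRS = 75  # inferred from "3764 + 75 = 3839"

total_annotated = sum(splits.values())
# Share of each split as a percentage of the annotated data, one decimal place.
shares = {name: round(100 * n / total_annotated, 1) for name, n in splits.items()}
train_with_synthetic = splits["train"] + SYNTHETIC_PAIRS

print(total_annotated)       # 4705
print(shares)                # {'train': 80.0, 'validation': 10.0, 'test': 10.0}
print(train_with_synthetic)  # 3839
```

The annotated data therefore follows an 80/10/10 split, with synthetic pairs added to the training portion only.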

⚙️ Training procedure

The model was fine-tuned from the IT5-large model with these hyperparameters:

max_length: 128
batch_size: 8
learning_rate: 5e-5
warmup_steps: 500
epochs: 25 (the best model is selected based on validation BLEU score)
optimizer: AdamW
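To get a feel for what these settings imply, the sketch below collects them in a small config object and derives the number of optimizer steps. It assumes 3839 training pairs (the figure given in the training data section), a single device, no gradient accumulation, and that the last partial batch is kept; none of these are stated for the actual run. `FineTuneConfig` and `schedule_summary` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters as listed above."""
    max_length: int = 128
    batch_size: int = 8
    learning_rate: float = 5e-5
    warmup_steps: int = 500
    epochs: int = 25
    optimizer: str = "AdamW"


def schedule_summary(cfg, num_train_pairs):
    """Return (steps per epoch, total steps, warmup share of total steps)."""
    # Assumes one optimizer step per batch and that the final partial batch is used.
    steps_per_epoch = math.ceil(num_train_pairs / cfg.batch_size)
    total_steps = steps_per_epoch * cfg.epochs
    warmup_share = cfg.warmup_steps / total_steps
    return steps_per_epoch, total_steps, warmup_share


cfg = FineTuneConfig()
steps_per_epoch, total_steps, warmup_share = schedule_summary(cfg, 3839)
print(steps_per_epoch, total_steps, f"{warmup_share:.1%}")  # 480 12000 4.2%
```

Under these assumptions the 500 warmup steps cover roughly the first epoch, about 4% of a full 25-epoch run.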

📈 Evaluation results

The model was evaluated on the test set, and here are the results:

Model	BLEU	ROUGE-2 F1	Human Correct	Human Partial (L)	Human Incorrect (L)
IT5 (no synth. data)	80.32	87.17	64.76	15.71	19.52
This model	80.79	87.47	69.52	17.14	13.22

(L) marks metrics for which lower is better. Compared with the model trained without synthetic data, adding synthetic data improves test-set performance. Further comparisons can be found in the paper.
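The per-metric differences between the two rows can be worked out directly. The sketch below only re-computes deltas from the reported values and applies the (L) convention from the table; `compare` is a hypothetical helper, not part of any evaluation toolkit used by the authors.

```python
METRICS = ["BLEU", "ROUGE-2 F1", "Human Correct", "Human Partial", "Human Incorrect"]
LOWER_IS_BETTER = {"Human Partial", "Human Incorrect"}  # columns marked (L)

# Values copied from the evaluation table.
baseline = [80.32, 87.17, 64.76, 15.71, 19.52]    # IT5 (no synth. data)
this_model = [80.79, 87.47, 69.52, 17.14, 13.22]  # with synthetic data


def compare(base, new):
    """Map each metric to (delta, improved) where delta = new - base."""
    rows = {}
    for name, b, n in zip(METRICS, base, new):
        delta = round(n - b, 2)
        improved = delta < 0 if name in LOWER_IS_BETTER else delta > 0
        rows[name] = (delta, improved)
    return rows


for name, (delta, improved) in compare(baseline, this_model).items():
    print(f"{name:16s} {delta:+.2f} {'better' if improved else 'worse'}")
```

By the table's own convention, four of the five metrics move in the favorable direction, with the largest change in Human Incorrect (down 6.30 points), while Human Partial rises by 1.43 points.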

📄 Citation

If you use this model, please cite the following papers:

Main paper:

@article{10.1145/3729237,
author = {Greco, Salvatore and La Quatra, Moreno and Cagliero, Luca and Cerquitelli, Tania},
title = {Towards AI-Assisted Inclusive Language Writing in Italian Formal Communications},
year = {2025},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
issn = {2157-6904},
url = {https://doi.org/10.1145/3729237},
doi = {10.1145/3729237},
note = {Just Accepted},
journal = {ACM Trans. Intell. Syst. Technol.},
month = apr,
}

Demo paper:

@InProceedings{PKDD23_inclusively,
author="La Quatra, Moreno
and Greco, Salvatore
and Cagliero, Luca
and Cerquitelli, Tania",
title="Inclusively: An AI-Based Assistant for Inclusive Writing",
booktitle="Machine Learning and Knowledge Discovery in Databases: Applied Data Science and Demo Track",
year="2023",
publisher="Springer Nature Switzerland",
address="Cham",
pages="361--365",
isbn="978-3-031-43430-3",
doi="10.1007/978-3-031-43430-3_31"
}

📄 License

The model is released under the cc-by-nc-sa-4.0 license.
