🚀 Mukayese: Turkish NLP Strikes Back
This model performs Turkish text summarization. It was trained from scratch on the mlsum/tu dataset, without any pre-training, and reaches a Rouge1 score of 43.20 on the evaluation set.
🚀 Quick Start
This README describes the mukayese/transformer-turkish-summarization model: its performance, training procedure, and citation details.
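As a starting point, the model can probably be loaded through the generic summarization pipeline of the Transformers library. The snippet below is a minimal sketch, not an officially documented recipe: it assumes the checkpoint is compatible with `pipeline("summarization")`, and the `summarize` helper and its `max_length` default are hypothetical names and values chosen for illustration.

```python
MODEL_ID = "mukayese/transformer-turkish-summarization"


def summarize(text: str, max_length: int = 128) -> str:
    """Return a summary of a Turkish text (hypothetical helper).

    Assumes the checkpoint loads through the generic summarization pipeline.
    """
    # Imported lazily so that defining the helper does not require the library.
    from transformers import pipeline

    # The first call downloads the model weights and tokenizer from the Hub.
    summarizer = pipeline("summarization", model=MODEL_ID)
    # truncation=True cuts inputs that exceed the model's maximum length.
    result = summarizer(text, max_length=max_length, truncation=True)
    return result[0]["summary_text"]
```

Calling `summarize(article)` with a Turkish news article as `article` should return a single summary string. Building the pipeline once and reusing it is preferable when summarizing many documents.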
✨ Features
- Uncased Model: The model is uncased and was initialized from scratch.
- Trained on a Single Dataset: It was trained solely on the mlsum/tu dataset, without pre-training.
- Evaluation Results: It scores 43.20 Rouge1, 30.71 Rouge2, and 38.20 RougeL on the evaluation set.
📚 Documentation
Model Performance
The model achieves the following results on the evaluation set:
| Metric | Score |
|---|---|
| Rouge1 | 43.2049 |
| Rouge2 | 30.7082 |
| RougeL | 38.1981 |
| RougeLsum | 39.9453 |
See the Mukayese paper (arXiv:2203.01215, listed under Citation) for more details on the model and the dataset.
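For readers unfamiliar with the metric, the following sketch shows how a ROUGE-N F1 score is formed from n-gram overlap between a reference summary and a generated one. It is a simplified illustration and not the scorer used to produce the numbers above: it assumes plain whitespace tokenization and lowercasing, and the function name `rouge_n_f1` is hypothetical. Scores are scaled to 0-100 to match the table.

```python
from collections import Counter


def rouge_n_f1(reference: str, candidate: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1 over whitespace tokens, scaled to 0-100."""

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


# Example with a toy Turkish reference and candidate summary.
score = rouge_n_f1("hava bugün çok güzel", "bugün hava güzel", n=1)
```

Production scorers differ in tokenization and stemming, and RougeL is based on the longest common subsequence rather than fixed-size n-grams, so values from this sketch are not directly comparable to the reported ones.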
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- total_eval_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 15.0
- mixed_precision_training: Native AMP
- label_smoothing_factor: 0.1
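The total batch sizes in the list follow from the per-device values, the device count, and gradient accumulation. The sketch below reproduces that arithmetic and adds an illustrative linear learning-rate decay. The variable names and the `linear_lr` helper are hypothetical, and the schedule assumes no warmup, since no warmup setting is listed above; the total number of optimizer steps is not given in this README and is left as a parameter.

```python
# Values taken from the hyperparameter list above.
per_device_train_batch_size = 4
per_device_eval_batch_size = 8
num_devices = 8
gradient_accumulation_steps = 2
learning_rate = 1e-4

# Effective training batch per optimizer step: 4 * 8 * 2 = 64.
total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
# Evaluation does not accumulate gradients: 8 * 8 = 64.
total_eval_batch_size = per_device_eval_batch_size * num_devices


def linear_lr(step: int, total_steps: int, base_lr: float = learning_rate) -> float:
    """Linear decay from base_lr to 0 (assumes zero warmup steps)."""
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / total_steps)
```

With this schedule, the learning rate starts at 0.0001 and reaches zero at the final optimizer step.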
Framework versions
- Transformers 4.11.3
- PyTorch 1.8.2+cu111
- Datasets 1.14.0
- Tokenizers 0.10.3
Citation
@misc{safaya-etal-2022-mukayese,
title={Mukayese: Turkish NLP Strikes Back},
author={Ali Safaya and Emirhan Kurtuluş and Arda Göktoğan and Deniz Yuret},
year={2022},
eprint={2203.01215},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
📄 License
This project is licensed under the MIT license.