🚀 mT5-small based Turkish Multitask (Answer Extraction, Question Generation and Question Answering) System
This single model, fine-tuned from Google's multilingual T5-small (mT5-small) on the Turkish Question Answering dataset, serves three downstream tasks: Answer Extraction, Question Generation, and Question Answering. All three tasks are cast as text2text problems, so one mT5 checkpoint handles them all.
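Because one sequence-to-sequence model covers all three tasks, the tasks are distinguished only by how the input text is formatted. The prefixes and the `<hl>` highlight token below are an assumption based on the question_generation-style pipeline this work builds on (see Acknowledgement), shown purely for illustration:

# Illustrative input formats (assumed, not the verified training format):
# 1) Answer extraction: the sentence of interest is wrapped in <hl> tokens
answer_extraction_input = "extract answers: <hl> Özcan Gündeş, 1993 yılı Tarsus doğumludur. <hl>"
# 2) Question generation: the desired answer span is highlighted inside the context
question_generation_input = "generate question: Özcan Gündeş, <hl> 1993 <hl> yılı Tarsus doğumludur."
# 3) Question answering: question and context are concatenated with plain prefixes
question_answering_input = "question: Özcan Gündeş nerede doğmuştur? context: Özcan Gündeş, 1993 yılı Tarsus doğumludur."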
All data processing, training, and pipeline code can be found on my GitHub. I will share the training details in the repo as soon as possible.
The mT5-small model has about 300 million parameters and a size of roughly 1.2 GB, so fine-tuning it takes a significant amount of time.
Training used 8 epochs and a learning rate of 1e-4 with 0 warmup steps. These and other hyperparameters can be tuned further for even better results.
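As a rough sketch, these settings map onto the Hugging Face Trainer API as follows; everything except the epoch count, learning rate, and warmup steps is an illustrative assumption rather than the configuration actually used:

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-multitask-qa-qg-turkish",  # assumed output path
    num_train_epochs=8,                         # as reported above
    learning_rate=1e-4,                         # as reported above
    warmup_steps=0,                             # as reported above
    per_device_train_batch_size=8,              # assumption, adjust to your GPU memory
    predict_with_generate=True,                 # assumption, useful for seq2seq evaluation
)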
🚀 Quick Start
📦 Installation
!pip install transformers==4.4.2
!pip install sentencepiece==0.1.95
!git clone https://github.com/ozcangundes/multitask-question-generation.git
%cd multitask-question-generation/
💻 Usage Examples
🔍 Basic Usage
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("ozcangundes/mt5-multitask-qa-qg-turkish")
model = AutoModelForSeq2SeqLM.from_pretrained("ozcangundes/mt5-multitask-qa-qg-turkish")

# `pipelines` comes from the cloned multitask-question-generation repository
from pipelines import pipeline
multimodel = pipeline("multitask-qa-qg", tokenizer=tokenizer, model=model)
text="Özcan Gündeş, 1993 yılı Tarsus doğumludur. Orta Doğu Teknik Üniversitesi \\\\
Endüstri Mühendisliği bölümünde 2011 2016 yılları arasında lisans eğitimi görmüştür. \\\\
Yüksek lisansını ise 2020 Aralık ayında, 4.00 genel not ortalaması ile \\\\
Boğaziçi Üniversitesi, Yönetim Bilişim Sistemleri bölümünde tamamlamıştır.\\\\
Futbolla yakından ilgilenmekle birlikte, Galatasaray kulübü taraftarıdır."
📋 Example - Both Question Generation and Question Answering
multimodel(text)
=> [{'answer': 'Tarsus', 'question': 'Özcan Gündeş nerede doğmuştur?'},
{'answer': '1993', 'question': 'Özcan Gündeş kaç yılında doğmuştur?'},
{'answer': '2011 2016',
'question': 'Özcan Gündeş lisans eğitimini hangi yıllar arasında tamamlamıştır?'},
{'answer': 'Boğaziçi Üniversitesi, Yönetim Bilişim Sistemleri',
'question': 'Özcan Gündeş yüksek lisansını hangi bölümde tamamlamıştır?'},
{'answer': 'Galatasaray kulübü',
'question': 'Özcan Gündeş futbolla yakından ilgilenmekle birlikte hangi kulübü taraftarıdır?'}]
From this text, the model generates five questions and answers each of them.
📝 Example - Question Answering
For question answering, both the context text and the related question should be passed to the pipeline as a dictionary.
multimodel({"context":text,"question":"Özcan hangi takımı tutmaktadır?"})
=> Galatasaray
multimodel({"context":text,"question":"Özcan, yüksek lisanstan ne zaman mezun oldu?"})
=> 2020 Aralık ayında
multimodel({"context":text,"question":"Özcan'ın yüksek lisans bitirme notu kaçtır?"})
=> 4.00
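The pipeline wrapper can also be bypassed by calling the model directly with the tokenizer, model, and text loaded above. The "question: ... context: ..." input format here is an assumption based on common T5-style QA fine-tuning; the pipeline handles the exact formatting for you:

# Minimal sketch of direct generation, reusing `tokenizer`, `model`, and `text` from above
question = "Özcan hangi takımı tutmaktadır?"
input_text = f"question: {question} context: {text}"  # assumed prompt format
inputs = tokenizer(input_text, return_tensors="pt")
output_ids = model.generate(**inputs, max_length=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))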
📄 License
This project is licensed under the Apache-2.0 license.
👏 Acknowledgement
This work is inspired by Suraj Patil's great repo. I would like to thank him for the clean code, and also Okan Çiftçi for the Turkish dataset 🙏
📊 Information Table
| Property | Details |
|----------|---------|
| Model Type | mT5-small based Turkish Multitask System |
| Training Data | Turkish Question Answering dataset |
| Tags | question-answering, question-generation, multitask-model |
| License | apache-2.0 |