# Model Card of lmqg/t5-base-squad-qg-ae
This model is a fine-tuned version of [t5-base](https://huggingface.co/t5-base) for joint question generation and answer extraction. It was trained on the lmqg/qg_squad dataset (dataset_name: default) via [lmqg](https://github.com/asahi417/lm-question-generation).
## Features
- Language Model: Uses [t5-base](https://huggingface.co/t5-base) as the foundation.
- Language: English (`en`).
- Training Data: Trained on lmqg/qg_squad (default).
- Online Demo: Available at https://autoqg.net/.
- Repository: Source code at [https://github.com/asahi417/lm-question-generation](https://github.com/asahi417/lm-question-generation).
- Paper: https://arxiv.org/abs/2210.03992.
## Installation
The model is used through existing libraries, so you need to install the relevant dependencies. For example, to use it with `lmqg` or `transformers`, install both via `pip`:

```bash
pip install lmqg transformers
```
## Usage Examples
### Basic Usage

With [lmqg](https://github.com/asahi417/lm-question-generation#lmqg-language-model-for-question-generation-):

```python
from lmqg import TransformersQG

# Load the model for English question & answer generation
model = TransformersQG(language="en", model="lmqg/t5-base-squad-qg-ae")

# Generate question-answer pairs from a paragraph
question_answer_pairs = model.generate_qa("William Turner was an English painter who specialised in watercolour landscapes")
```
With `transformers`:

```python
from transformers import pipeline

pipe = pipeline("text2text-generation", "lmqg/t5-base-squad-qg-ae")

# Question generation: highlight the answer span with <hl> and use the "generate question:" prefix
question = pipe("generate question: <hl> Beyonce <hl> further expanded her acting career, starring as blues singer Etta James in the 2008 musical biopic, Cadillac Records.")

# Answer extraction: highlight the target sentence with <hl> and use the "extract answers:" prefix
answer = pipe("extract answers: <hl> Beyonce further expanded her acting career, starring as blues singer Etta James in the 2008 musical biopic, Cadillac Records. <hl> Her performance in the film received praise from critics, and she garnered several nominations for her portrayal of James, including a Satellite Award nomination for Best Supporting Actress, and a NAACP Image Award nomination for Outstanding Supporting Actress.")
```
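The inputs above follow a fixed pattern: a task prefix (`generate question:` or `extract answers:`) plus a span highlighted with `<hl>` tokens. A minimal sketch of helpers that build such inputs (the function names here are illustrative, not part of the `lmqg` or `transformers` APIs):

```python
def qg_input(paragraph: str, answer: str) -> str:
    """Build a question-generation input: wrap the answer span in <hl> tokens."""
    highlighted = paragraph.replace(answer, f"<hl> {answer} <hl>", 1)
    return f"generate question: {highlighted}"

def ae_input(paragraph: str, sentence: str) -> str:
    """Build an answer-extraction input: wrap the target sentence in <hl> tokens."""
    highlighted = paragraph.replace(sentence, f"<hl> {sentence} <hl>", 1)
    return f"extract answers: {highlighted}"

text = ("Beyonce further expanded her acting career, starring as blues singer "
        "Etta James in the 2008 musical biopic, Cadillac Records.")
print(qg_input(text, "Beyonce"))
```

This reproduces the string passed to the pipeline in the question-generation example above.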
## Documentation
### Evaluation

- Question Generation: [raw metric file](https://huggingface.co/lmqg/t5-base-squad-qg-ae/raw/main/eval/metric.first.sentence.paragraph_answer.question.lmqg_qg_squad.default.json)
- Question & Answer Generation: [raw metric file](https://huggingface.co/lmqg/t5-base-squad-qg-ae/raw/main/eval/metric.first.answer.paragraph.questions_answers.lmqg_qg_squad.default.json)
- Answer Extraction: [raw metric file](https://huggingface.co/lmqg/t5-base-squad-qg-ae/raw/main/eval/metric.first.answer.paragraph_sentence.answer.lmqg_qg_squad.default.json)
### Training Hyperparameters

The following hyperparameters were used during fine-tuning:

| Property | Details |
|---|---|
| dataset_path | lmqg/qg_squad |
| dataset_name | default |
| input_types | ['paragraph_answer', 'paragraph_sentence'] |
| output_types | ['question', 'answer'] |
| prefix_types | ['qg', 'ae'] |
| model | t5-base |
| max_length | 512 |
| max_length_output | 32 |
| epoch | 6 |
| batch | 32 |
| lr | 0.0001 |
| fp16 | False |
| random_seed | 1 |
| gradient_accumulation_steps | 4 |
| label_smoothing | 0.15 |

The full configuration can be found in the [fine-tuning config file](https://huggingface.co/lmqg/t5-base-squad-qg-ae/raw/main/trainer_config.json).
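Note that with a per-step batch of 32 and 4 gradient accumulation steps, the effective batch size works out to 128, as a quick check shows:

```python
# Selected values from the hyperparameter table above
config = {"batch": 32, "gradient_accumulation_steps": 4}

# Effective batch size = per-step batch x accumulation steps
effective_batch = config["batch"] * config["gradient_accumulation_steps"]
print(effective_batch)  # 128
```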
## License

This project is licensed under the CC-BY-4.0 license.
## Citation

```bibtex
@inproceedings{ushio-etal-2022-generative,
    title = "{G}enerative {L}anguage {M}odels for {P}aragraph-{L}evel {Q}uestion {G}eneration",
    author = "Ushio, Asahi  and
      Alva-Manchego, Fernando  and
      Camacho-Collados, Jose",
    booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2022",
    address = "Abu Dhabi, U.A.E.",
    publisher = "Association for Computational Linguistics",
}
```