# Model Card of lmqg/t5-base-squad-qg-ae
This model is a fine-tuned version of [t5-base](https://huggingface.co/t5-base) for joint question generation and answer extraction. It was trained on the lmqg/qg_squad dataset (dataset_name: default) via [lmqg](https://github.com/asahi417/lm-question-generation).
## Features
- Language Model: Uses [t5-base](https://huggingface.co/t5-base) as the foundation.
- Language: English (`en`).
- Training Data: Trained on lmqg/qg_squad (default).
- Online Demo: Available at https://autoqg.net/.
- Repository: Source code at [https://github.com/asahi417/lm-question-generation](https://github.com/asahi417/lm-question-generation).
- Paper: https://arxiv.org/abs/2210.03992.
## Installation
The model is used through existing libraries, so you need to install the relevant dependencies. For example, to use it with `lmqg` or `transformers`, install both via `pip`:

```bash
pip install lmqg transformers
```
## Usage Examples
### Basic Usage

With [lmqg](https://github.com/asahi417/lm-question-generation#lmqg-language-model-for-question-generation-):

```python
from lmqg import TransformersQG

# Load the model for English question & answer generation
model = TransformersQG(language="en", model="lmqg/t5-base-squad-qg-ae")

# Generate question-answer pairs from a paragraph
question_answer_pairs = model.generate_qa("William Turner was an English painter who specialised in watercolour landscapes")
```
With `transformers`:

```python
from transformers import pipeline

pipe = pipeline("text2text-generation", "lmqg/t5-base-squad-qg-ae")

# Question generation: highlight the answer span with <hl> and use the "generate question:" prefix
question = pipe("generate question: <hl> Beyonce <hl> further expanded her acting career, starring as blues singer Etta James in the 2008 musical biopic, Cadillac Records.")

# Answer extraction: highlight the target sentence with <hl> and use the "extract answers:" prefix
answer = pipe("extract answers: <hl> Beyonce further expanded her acting career, starring as blues singer Etta James in the 2008 musical biopic, Cadillac Records. <hl> Her performance in the film received praise from critics, and she garnered several nominations for her portrayal of James, including a Satellite Award nomination for Best Supporting Actress, and a NAACP Image Award nomination for Outstanding Supporting Actress.")
```
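The inputs above follow a fixed pattern: a task prefix (`generate question:` or `extract answers:`) plus a span highlighted with `<hl>` tokens. A minimal sketch of helpers that build such inputs (the function names here are illustrative, not part of the `lmqg` or `transformers` APIs):

```python
def qg_input(paragraph: str, answer: str) -> str:
    """Build a question-generation input: wrap the answer span in <hl> tokens."""
    highlighted = paragraph.replace(answer, f"<hl> {answer} <hl>", 1)
    return f"generate question: {highlighted}"

def ae_input(paragraph: str, sentence: str) -> str:
    """Build an answer-extraction input: wrap the target sentence in <hl> tokens."""
    highlighted = paragraph.replace(sentence, f"<hl> {sentence} <hl>", 1)
    return f"extract answers: {highlighted}"

text = ("Beyonce further expanded her acting career, starring as blues singer "
        "Etta James in the 2008 musical biopic, Cadillac Records.")
print(qg_input(text, "Beyonce"))
```

This reproduces the string passed to the pipeline in the question-generation example above.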
## Documentation
### Evaluation

- Question Generation: [raw metric file](https://huggingface.co/lmqg/t5-base-squad-qg-ae/raw/main/eval/metric.first.sentence.paragraph_answer.question.lmqg_qg_squad.default.json)
- Question & Answer Generation: [raw metric file](https://huggingface.co/lmqg/t5-base-squad-qg-ae/raw/main/eval/metric.first.answer.paragraph.questions_answers.lmqg_qg_squad.default.json)
- Answer Extraction: [raw metric file](https://huggingface.co/lmqg/t5-base-squad-qg-ae/raw/main/eval/metric.first.answer.paragraph_sentence.answer.lmqg_qg_squad.default.json)
### Training Hyperparameters

The following hyperparameters were used during fine-tuning:

| Property | Details |
|---|---|
| dataset_path | lmqg/qg_squad |
| dataset_name | default |
| input_types | ['paragraph_answer', 'paragraph_sentence'] |
| output_types | ['question', 'answer'] |
| prefix_types | ['qg', 'ae'] |
| model | t5-base |
| max_length | 512 |
| max_length_output | 32 |
| epoch | 6 |
| batch | 32 |
| lr | 0.0001 |
| fp16 | False |
| random_seed | 1 |
| gradient_accumulation_steps | 4 |
| label_smoothing | 0.15 |

The full configuration can be found in the [fine-tuning config file](https://huggingface.co/lmqg/t5-base-squad-qg-ae/raw/main/trainer_config.json).
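Note that with a per-step batch of 32 and 4 gradient accumulation steps, the effective batch size works out to 128, as a quick check shows:

```python
# Selected values from the hyperparameter table above
config = {"batch": 32, "gradient_accumulation_steps": 4}

# Effective batch size = per-step batch x accumulation steps
effective_batch = config["batch"] * config["gradient_accumulation_steps"]
print(effective_batch)  # 128
```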
## License

This project is licensed under the CC-BY-4.0 license.
## Citation

```bibtex
@inproceedings{ushio-etal-2022-generative,
    title = "{G}enerative {L}anguage {M}odels for {P}aragraph-{L}evel {Q}uestion {G}eneration",
    author = "Ushio, Asahi  and
      Alva-Manchego, Fernando  and
      Camacho-Collados, Jose",
    booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2022",
    address = "Abu Dhabi, U.A.E.",
    publisher = "Association for Computational Linguistics",
}
```