mt5-small-jaquad-qg Open-source Japanese Question Generation Model - Quickly Generate Related Questions from Text

Mt5 Small Jaquad Qg

Developed by lmqg

This is a Japanese question generation model fine-tuned from google/mt5-small, specifically designed to generate relevant questions from given texts.

Question Answering System

Transformers

Japanese #Japanese Question Generation #MT5-based #Educational Assistance

Downloads 209

Release Time: 3/2/2022

Model Overview

The model can automatically generate questions based on input Japanese text (especially with highlighted answer sections), suitable for educational and Q&A system scenarios.

Model Features

Japanese Specialization

Question generation capability specifically optimized for Japanese text

Answer-based Question Generation

Can generate relevant questions based on highlighted answer sections in the text

Multi-metric Evaluation

Evaluated with BLEU, METEOR, ROUGE-L, BERTScore, and MoverScore; on lmqg/qg_jaquad it reaches 30.49 BLEU-4, 29.03 METEOR, and 50.88 ROUGE-L

Model Capabilities

Japanese text comprehension

Question generation

Context-aware Q&A

Educational content generation

Use Cases

Education

Reading Comprehension Question Generation

Automatically generates reading comprehension questions from textbook passages

Generated questions can be used for student practice or testing

Q&A Systems

Generates relevant questions for knowledge base documents

Enhances coverage and diversity of Q&A systems

Content Creation

Article Interactive Content

Generates reader engagement questions for online articles

Improves reader participation and comprehension depth

🚀 Model Card of `lmqg/mt5-small-jaquad-qg`

This model is a fine-tuned version of google/mt5-small for the question generation task on the lmqg/qg_jaquad dataset (dataset_name: default), trained via lmqg. Given a Japanese paragraph with an answer span highlighted, it generates a question whose answer is that span.

✨ Features

Language Model: Utilizes google/mt5-small as the base language model.
Language Support: Specialized for the Japanese language (ja).
Training Data: Trained on the lmqg/qg_jaquad dataset.
Online Demo: An online demonstration is available at https://autoqg.net/.
Repository: The source code and related information can be found in the GitHub repository.
Paper: Refer to the research paper at https://arxiv.org/abs/2210.03992 for more details.

📦 Installation

The original README does not provide installation steps. To run the usage examples, install the library they import: lmqg or transformers.
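Because no installation steps are given, it can help to check which of the two libraries is already importable before picking a usage path. The following is a minimal sketch; the helper name `available_backends` is made up for illustration, and the package names are assumed to match the import names used in the usage examples in this card.

```python
from importlib.util import find_spec

# Import names taken from the usage examples in this card (assumed to be
# the same as the installable package names).
CANDIDATE_PACKAGES = ("lmqg", "transformers")


def available_backends(packages=CANDIDATE_PACKAGES):
    """Return a mapping of package name -> True if the package can be imported."""
    return {name: find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    for name, installed in available_backends().items():
        print(f"{name}: {'installed' if installed else 'missing'}")
```

If both are missing, either one is enough for the examples in this card: lmqg for the `TransformersQG` wrapper, transformers for the plain pipeline.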

💻 Usage Examples

Basic Usage

With `lmqg`

from lmqg import TransformersQG

# initialize model
model = TransformersQG(language="ja", model="lmqg/mt5-small-jaquad-qg")

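# model prediction: pass the paragraph and the answer span the question should target
context = "フェルメールの作品では、17世紀のオランダの画家、ヨハネス・フェルメールの作品について記述する。フェルメールの作品は、疑問作も含め30数点しか現存しない。現存作品はすべて油彩画で、版画、下絵、素描などは残っていない。"
questions = model.generate_q(list_context=context, list_answer="30数点")
print(questions)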

With `transformers`

from transformers import pipeline

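# load the model as a text2text-generation pipeline
pipe = pipeline("text2text-generation", "lmqg/mt5-small-jaquad-qg")

# the answer span (6月28日) is wrapped in <hl> tokens inside the paragraph
output = pipe("ゾフィーは貴族出身ではあったが王族出身ではなく、ハプスブルク家の皇位継承者であるフランツ・フェルディナントとの結婚は貴賤結婚となった。皇帝フランツ・ヨーゼフは、2人の間に生まれた子孫が皇位を継がないことを条件として結婚を承認していた。視察が予定されている<hl>6月28日<hl>は2人の14回目の結婚記念日であった。")
print(output[0]["generated_text"])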
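In the `transformers` example the answer span is marked by hand with `<hl>` tokens. The sketch below shows one way to build that input from a plain paragraph and an answer string. The helper names `highlight_answer` and `generate_question` are hypothetical, and the choice to highlight only the first occurrence, with no spaces around `<hl>`, is an assumption based on the single example input above.

```python
HL = "<hl>"  # highlight token, as it appears in the example input above


def highlight_answer(paragraph: str, answer: str) -> str:
    """Wrap the first occurrence of `answer` in `paragraph` with <hl> tokens."""
    start = paragraph.find(answer) if answer else -1
    if start < 0:
        raise ValueError("answer must be a non-empty substring of paragraph")
    end = start + len(answer)
    return f"{paragraph[:start]}{HL}{answer}{HL}{paragraph[end:]}"


def generate_question(paragraph: str, answer: str) -> str:
    """Hypothetical wrapper: highlight the answer, then run the pipeline."""
    # Imported lazily so the highlighting helper works without transformers.
    from transformers import pipeline

    pipe = pipeline("text2text-generation", "lmqg/mt5-small-jaquad-qg")
    return pipe(highlight_answer(paragraph, answer))[0]["generated_text"]
```

Raising an error when the answer is not found keeps a silent mismatch between paragraph and answer from reaching the model as an unhighlighted input.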

📚 Documentation

Evaluation

Metric (Question Generation)

The raw metric file can be found here.

Metric	Score (dataset_name, dataset)
BERTScore	80.87 (default, lmqg/qg_jaquad)
Bleu_1	56.34 (default, lmqg/qg_jaquad)
Bleu_2	44.28 (default, lmqg/qg_jaquad)
Bleu_3	36.31 (default, lmqg/qg_jaquad)
Bleu_4	30.49 (default, lmqg/qg_jaquad)
METEOR	29.03 (default, lmqg/qg_jaquad)
MoverScore	58.67 (default, lmqg/qg_jaquad)
ROUGE_L	50.88 (default, lmqg/qg_jaquad)

Metric (Question & Answer Generation, Reference Answer)

Each question is generated from the gold answer. The raw metric file is available here.

Metric	Score (dataset_name, dataset)
QAAlignedF1Score (BERTScore)	86.07 (default, lmqg/qg_jaquad)
QAAlignedF1Score (MoverScore)	61.83 (default, lmqg/qg_jaquad)
QAAlignedPrecision (BERTScore)	86.08 (default, lmqg/qg_jaquad)
QAAlignedPrecision (MoverScore)	61.85 (default, lmqg/qg_jaquad)
QAAlignedRecall (BERTScore)	86.06 (default, lmqg/qg_jaquad)
QAAlignedRecall (MoverScore)	61.81 (default, lmqg/qg_jaquad)
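As a quick consistency check on the reference-answer table above, the sketch below assumes the QAAligned F1 behaves like the usual harmonic mean of precision and recall. The card does not say how the scores are aggregated, so this is an assumption, and the helper name `f1_from_pr` is made up for illustration.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall values copied from the reference-answer table above.
bertscore_f1 = f1_from_pr(86.08, 86.06)
moverscore_f1 = f1_from_pr(61.85, 61.81)

print(round(bertscore_f1, 2))   # 86.07, the reported BERTScore F1
print(round(moverscore_f1, 2))  # 61.83, the reported MoverScore F1
```

Agreement to two decimals here does not prove how the metric is computed; if F1 is averaged per example, it can differ slightly from the harmonic mean of the averaged precision and recall.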

Metric (Question & Answer Generation, Pipeline Approach)

Each question is generated on the answer generated by lmqg/mt5-small-jaquad-ae. The raw metric file can be accessed here.

Metric	Score (dataset_name, dataset)
QAAlignedF1Score (BERTScore)	79.78 (default, lmqg/qg_jaquad)
QAAlignedF1Score (MoverScore)	55.85 (default, lmqg/qg_jaquad)
QAAlignedPrecision (BERTScore)	76.84 (default, lmqg/qg_jaquad)
QAAlignedPrecision (MoverScore)	53.8 (default, lmqg/qg_jaquad)
QAAlignedRecall (BERTScore)	83.06 (default, lmqg/qg_jaquad)
QAAlignedRecall (MoverScore)	58.22 (default, lmqg/qg_jaquad)
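The pipeline approach evaluated above chains two models: an answer extractor proposes answers, and the question generator writes one question per answer. The sketch below illustrates that control flow only. Both stages are passed in as plain callables, because the input format of lmqg/mt5-small-jaquad-ae is not described in this card; the function name `generate_qa_pairs` and its parameters are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple


def generate_qa_pairs(
    paragraph: str,
    extract_answers: Callable[[str], Iterable[str]],
    generate_question: Callable[[str], str],
    hl_token: str = "<hl>",
) -> List[Tuple[str, str]]:
    """Return (question, answer) pairs for every extracted answer found in the paragraph."""
    pairs = []
    for answer in extract_answers(paragraph):
        # Skip answers that cannot be highlighted in the paragraph.
        if not answer or answer not in paragraph:
            continue
        start = paragraph.index(answer)
        end = start + len(answer)
        highlighted = (
            paragraph[:start] + hl_token + answer + hl_token + paragraph[end:]
        )
        pairs.append((generate_question(highlighted), answer))
    return pairs
```

In practice `extract_answers` would wrap the answer extraction model and `generate_question` would wrap this model; keeping them as parameters makes the flow testable with stubs and independent of either model's exact interface.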

Training hyperparameters

The following hyperparameters were used during fine-tuning:

dataset_path: lmqg/qg_jaquad
dataset_name: default
input_types: ['paragraph_answer']
output_types: ['question']
prefix_types: None
model: google/mt5-small
max_length: 512
max_length_output: 32
epoch: 21
batch: 64
lr: 0.0005
fp16: False
random_seed: 1
gradient_accumulation_steps: 1
label_smoothing: 0.0

The full configuration can be found in the fine-tuning config file.
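For scripts that need these values, the listed hyperparameters can be kept in one typed object. This is a minimal sketch: the class name `FineTuningConfig` is hypothetical and is not part of lmqg, and only the values listed above are used.

```python
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class FineTuningConfig:
    """Hyperparameters as listed in this card (hypothetical container)."""

    dataset_path: str = "lmqg/qg_jaquad"
    dataset_name: str = "default"
    input_types: Tuple[str, ...] = ("paragraph_answer",)
    output_types: Tuple[str, ...] = ("question",)
    prefix_types: Optional[Tuple[str, ...]] = None
    model: str = "google/mt5-small"
    max_length: int = 512        # maximum input length
    max_length_output: int = 32  # maximum generated question length
    epoch: int = 21
    batch: int = 64
    lr: float = 0.0005
    fp16: bool = False
    random_seed: int = 1
    gradient_accumulation_steps: int = 1
    label_smoothing: float = 0.0


config = FineTuningConfig()
print(asdict(config))
```

Freezing the dataclass keeps the recorded settings from being changed by accident when the object is passed around.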

🔧 Technical Details

The model is fine-tuned from the base model google/mt5-small on the lmqg/qg_jaquad dataset. Each training input is a paragraph with its answer marked (input type paragraph_answer), and the target output is the corresponding question. Inputs are limited to a max_length of 512 and outputs to a max_length_output of 32. Training ran for 21 epochs with a batch size of 64 and a learning rate of 0.0005, without fp16, gradient accumulation, or label smoothing.

📄 License

The model is licensed under cc-by-4.0.

📚 Citation

@inproceedings{ushio-etal-2022-generative,
    title = "{G}enerative {L}anguage {M}odels for {P}aragraph-{L}evel {Q}uestion {G}eneration",
    author = "Ushio, Asahi  and
        Alva-Manchego, Fernando  and
        Camacho-Collados, Jose",
    booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2022",
    address = "Abu Dhabi, U.A.E.",
    publisher = "Association for Computational Linguistics",
}
