t5-large-generation-race-Distractor: Open Source Model for Generating Multiple-choice Question Distractors

T5 Large Generation Race Distractor

Developed by potsawee

This is a T5-large model fine-tuned on the RACE dataset, specifically designed for generating distractors for multiple-choice questions.

Text Generation

Transformers

Language: English
Open Source License: Apache-2.0
Tags: #Multiple-choice distractor generation #Educational assessment assistance #Text generation fine-tuning

Downloads 262

Release Time: 2/23/2023

Model Overview

The model takes a question, its answer, and the context passage as a single combined input and outputs a list of 3 distractors. It is intended for the distractor generation step of multiple-choice question creation.

Model Features

Distractor generation

Specialized in generating semantically relevant but incorrect distractors for multiple-choice questions

RACE dataset fine-tuning

Fine-tuned on the reading comprehension evaluation dataset RACE to optimize distractor generation quality

Structured input-output

Uses a fixed input format (question <sep> answer <sep> context) and a fixed output format (a list of 3 distractors)
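The input and output formats can be sketched as two small string helpers that need no model download. This is only an illustration: `build_input` and `split_distractors` are hypothetical names, and the literal token strings are assumptions (`<sep>` is taken from the format above, `<pad>` and `</s>` are the usual T5 special tokens). In real use, read them from the tokenizer instead.

```python
from typing import List

SEP = "<sep>"  # assumed value of tokenizer.sep_token for this checkpoint
PAD = "<pad>"  # assumed value of tokenizer.pad_token (usual T5 padding token)
EOS = "</s>"   # assumed value of tokenizer.eos_token (usual T5 end-of-sequence token)


def build_input(question: str, answer: str, context: str, sep: str = SEP) -> str:
    """Join the three fields in the order question, answer, context."""
    return f" {sep} ".join(part.strip() for part in (question, answer, context))


def split_distractors(decoded: str, sep: str = SEP) -> List[str]:
    """Remove pad/eos markers, then split the decoded text on the separator."""
    cleaned = decoded.replace(PAD, "").replace(EOS, "")
    return [piece.strip() for piece in cleaned.split(sep) if piece.strip()]


print(build_input("Q?", "A", "C"))
print(split_distractors("<pad> first <sep> second <sep> third</s>"))
```

Keeping the separator in one constant means the same value is used for both building the input and splitting the output.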

Model Capabilities

Multiple-choice distractor generation

Text generation

Reading comprehension assistance

Use Cases

Educational technology

Automatic test item generation

Automatically generates distractors for multiple-choice questions on online learning platforms

Produces semantically relevant and plausible distractors

Reading comprehension assessment

Creates reading comprehension test questions

Improves test item development efficiency
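For the test item generation use case, the generated distractors still have to be combined with the correct answer to form a complete item. The sketch below shows one way to do that. `assemble_item` is a hypothetical helper, not part of this model or of the MQAG code, and the strings are taken from the usage example in this model card.

```python
import random
from typing import Dict, List


def assemble_item(question: str, answer: str, distractors: List[str], seed: int = 0) -> Dict:
    """Shuffle the correct answer in among the distractors and record its position."""
    options = [answer] + list(distractors)
    random.Random(seed).shuffle(options)  # seeded so the option order is reproducible
    return {
        "question": question,
        "options": options,
        "answer_index": options.index(answer),
    }


item = assemble_item(
    "What is the best title for the passage?",
    "Djokovic's application for special permission to enter the United States",
    [
        "The United States has extended its requirement for international visitors to be vaccinated against Covid-19",
        "Djokovic's return to the ATP tour in Dubai",
        "Djokovic's hope for a positive decision to allow him to play at Indian Wells and the Miami Open",
    ],
)
for label, option in zip("ABCD", item["options"]):
    print(f"{label}. {option}")
```

Storing `answer_index` alongside the shuffled options keeps the answer key recoverable after the order has been randomized.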

🚀 t5-large fine-tuned on RACE for Generating Distractors

This project fine-tunes the t5-large model on the RACE dataset to generate a list of 3 distractors from the input of question, answer, and context.

🚀 Quick Start

Input and Output

Input: question <sep> answer <sep> context
Output: list of 3 distractors

✨ Features

The t5-large model is fine-tuned on the RACE dataset. This model serves as the second component (g2) of the question generation pipeline in our MQAG paper: given the concatenated question, answer, and context, it generates the list of 3 distractors. The code is available in the project's GitHub repository: https://github.com/potsawee/mqag0.

💻 Usage Examples

Basic Usage

>>> from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

>>> tokenizer = AutoTokenizer.from_pretrained("potsawee/t5-large-generation-race-Distractor")
>>> model = AutoModelForSeq2SeqLM.from_pretrained("potsawee/t5-large-generation-race-Distractor")

>>> context = r"""
... World number one Novak Djokovic says he is hoping for a "positive decision" to allow him 
... to play at Indian Wells and the Miami Open next month. The United States has extended 
... its requirement for international visitors to be vaccinated against Covid-19. Proof of vaccination 
... will be required to enter the country until at least 10 April, but the Serbian has previously 
... said he is unvaccinated. The 35-year-old has applied for special permission to enter the country. 
... Indian Wells and the Miami Open - two of the most prestigious tournaments on the tennis calendar 
... outside the Grand Slams - start on 6 and 20 March respectively. Djokovic says he will return to 
... the ATP tour in Dubai next week after claiming a record-extending 10th Australian Open title 
... and a record-equalling 22nd Grand Slam men's title last month.""".replace("\n", "")
>>> question = "What is the best title for the passage?"
>>> answer = "Djokovic's application for special permission to enter the United States"

>>> # The model expects: question <sep> answer <sep> context
>>> input_text = " ".join([question, tokenizer.sep_token, answer, tokenizer.sep_token, context])
>>> inputs = tokenizer(input_text, return_tensors="pt")
>>> outputs = model.generate(**inputs, max_new_tokens=128)
>>> # Keep special tokens while decoding so the <sep> between distractors survives
>>> distractors = tokenizer.decode(outputs[0], skip_special_tokens=False)
>>> # Then strip the pad and end-of-sequence tokens by hand and split on <sep>
>>> distractors = distractors.replace(tokenizer.pad_token, "").replace(tokenizer.eos_token, "")
>>> distractors = [y.strip() for y in distractors.split(tokenizer.sep_token)]
>>> print(distractors)
['The United States has extended its requirement for international visitors to be vaccinated against Covid-19',
"Djokovic's return to the ATP tour in Dubai",
"Djokovic's hope for a positive decision to allow him to play at Indian Wells and the Miami Open"]
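The model card does not say whether the generated distractors are always distinct from each other and from the correct answer, so a light post-processing pass may be worth adding before they are shown to test takers. The following is a minimal sketch under that assumption. `clean_distractors` is a hypothetical helper, and its matching is a plain case-insensitive string comparison, not a semantic check.

```python
from typing import List


def clean_distractors(distractors: List[str], answer: str) -> List[str]:
    """Drop empty strings, case-insensitive duplicates, and copies of the answer."""
    seen = {answer.strip().lower()}  # seeding with the answer filters out exact copies of it
    kept = []
    for distractor in distractors:
        key = distractor.strip().lower()
        if key and key not in seen:
            seen.add(key)
            kept.append(distractor.strip())
    return kept


print(clean_distractors(["Paris", "london", "London ", "", "Rome"], "paris"))
```

If fewer than three distractors survive, one option is to call `model.generate` again with sampling enabled (for example `do_sample=True`) and merge the results.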

📚 Documentation

Citation

@article{manakul2023mqag,
  title={MQAG: Multiple-choice Question Answering and Generation for Assessing Information Consistency in Summarization},
  author={Manakul, Potsawee and Liusie, Adian and Gales, Mark JF},
  journal={arXiv preprint arXiv:2301.12307},
  year={2023}
}

📄 License

This project is licensed under the Apache-2.0 license.
