🚀 Distractor Generation with T5-base
This repository contains a fine-tuned T5-base model for generating distractors for multiple-choice questions, leveraging T5's text-to-text framework and a custom separator token.
🚀 Quick Start
You can load the model with the Hugging Face Transformers library. Here is a minimal example:
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "fares7elsadek/t5-base-distractor-generation"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Custom separator token used during fine-tuning to delimit question, context, and answer.
SEP_TOKEN = "<sep>"

def generate_distractors(question, context, correct, max_length=64):
    # Build the input in the same "question <sep> context <sep> answer" format used for training.
    input_text = f"{question} {SEP_TOKEN} {context} {SEP_TOKEN} {correct}"
    inputs = tokenizer([input_text], return_tensors="pt", truncation=True, padding=True)
    outputs = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_length,
    )
    # The model emits the three distractors separated by SEP_TOKEN; split them apart.
    decoded = tokenizer.decode(outputs[0], skip_special_tokens=True, clean_up_tokenization_spaces=True)
    distractors = [d.strip() for d in decoded.split(SEP_TOKEN)]
    return distractors

question = "What is the capital of France?"
context = "France is a country in Western Europe known for its rich history and cultural heritage."
correct = "Paris"
print(generate_distractors(question, context, correct))
```
✨ Features
- Leverages T5's text-to-text framework and a custom separator token to generate three plausible distractors for multiple-choice questions.
- Fine-tunes the pre-trained T5-base model with PyTorch Lightning.
- Evaluated using BLEU scores for each generated distractor.
📦 Installation
Inference only requires PyTorch and the Hugging Face Transformers library, e.g. `pip install torch transformers` (the T5 tokenizer may additionally need `sentencepiece`).
💻 Usage Examples
Basic Usage
This example reuses the model, tokenizer, and `generate_distractors` helper defined in the Quick Start section above.
```python
# Assumes the Quick Start setup above (model, tokenizer, SEP_TOKEN, generate_distractors).
question = "What is the capital of France?"
context = "France is a country in Western Europe known for its rich history and cultural heritage."
correct = "Paris"
print(generate_distractors(question, context, correct))
```
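Beam Search Variation
The sketch below uses generic Hugging Face `generate` options (beam search with multiple returned sequences) to inspect alternative distractor sets. These settings (`num_beams`, `num_return_sequences`, `early_stopping`) are not documented for this model and are shown only as an illustration; the snippet builds on the variables defined above.
```python
# Sketch: beam search with several returned sequences (generic transformers generation
# options, not settings documented for this model). Reuses question/context/correct,
# tokenizer, model, and SEP_TOKEN from the snippets above.
input_text = f"{question} {SEP_TOKEN} {context} {SEP_TOKEN} {correct}"
inputs = tokenizer([input_text], return_tensors="pt", truncation=True, padding=True)
outputs = model.generate(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_length=64,
    num_beams=5,
    num_return_sequences=3,
    early_stopping=True,
)
for seq in outputs:
    decoded = tokenizer.decode(seq, skip_special_tokens=True, clean_up_tokenization_spaces=True)
    print([d.strip() for d in decoded.split(SEP_TOKEN)])
```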
📚 Documentation
Model Overview
Built with PyTorch Lightning, this implementation fine-tunes the pre-trained T5-base model to generate distractor options. The model takes a single input sequence formatted with the question, context, and correct answer, separated by a custom token, and generates a target sequence containing three distractors. This approach is particularly useful for multiple-choice question generation tasks.
Data Processing
Input Construction
Each input sample is a single string with the following format:
question {SEP_TOKEN} context {SEP_TOKEN} correct
- question: The question text.
- context: The context passage.
- correct: The correct answer.
- SEP_TOKEN: A special token added to the tokenizer to separate the different fields.
Target Construction
Each target sample is constructed as follows:
incorrect1 {SEP_TOKEN} incorrect2 {SEP_TOKEN} incorrect3
This format allows the model to generate three distractors in one pass.
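For illustration, the Quick Start example would be serialized into the following input/target pair. The three reference distractors here are hypothetical placeholders, not samples from the actual training data.
```python
# Hypothetical worked example of one training pair.
SEP_TOKEN = "<sep>"

# Input: question <sep> context <sep> correct answer
input_text = (
    f"What is the capital of France? {SEP_TOKEN} "
    f"France is a country in Western Europe known for its rich history and cultural heritage. "
    f"{SEP_TOKEN} Paris"
)

# Target: three reference distractors (illustrative values only)
target_text = f"London {SEP_TOKEN} Berlin {SEP_TOKEN} Madrid"
```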
Training Details
- Framework: PyTorch Lightning
- Base Model: T5-base
- Optimizer: Adam with linear scheduling (using a warmup scheduler)
- Batch Size: 32
- Number of Epochs: 5
- Learning Rate: 2e-5
- Tokenization:
  - Input: maximum length of 512 tokens
  - Target: maximum length of 64 tokens
- Special Tokens: The custom `SEP_TOKEN` is added to the tokenizer and is used to separate the different parts of the input and target sequences.
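For reference, a minimal PyTorch Lightning sketch consistent with these settings is shown below. The class name, warmup/total step counts, and data handling are assumptions made for illustration; this is not the repository's actual training script.
```python
# Minimal sketch of the fine-tuning setup described above. Hyperparameters mirror the
# list; warmup_steps/total_steps and the module layout are assumptions.
import pytorch_lightning as pl
import torch
from transformers import (
    T5ForConditionalGeneration,
    T5TokenizerFast,
    get_linear_schedule_with_warmup,
)

SEP_TOKEN = "<sep>"

class DistractorGenerationModule(pl.LightningModule):
    def __init__(self, model_name="t5-base", lr=2e-5, warmup_steps=0, total_steps=10_000):
        super().__init__()
        self.save_hyperparameters()
        self.tokenizer = T5TokenizerFast.from_pretrained(model_name)
        # Register the custom separator token and resize the embedding matrix to match.
        self.tokenizer.add_special_tokens({"additional_special_tokens": [SEP_TOKEN]})
        self.model = T5ForConditionalGeneration.from_pretrained(model_name)
        self.model.resize_token_embeddings(len(self.tokenizer))

    def training_step(self, batch, batch_idx):
        # batch holds inputs tokenized to at most 512 tokens and targets to at most 64.
        outputs = self.model(
            input_ids=batch["input_ids"],
            attention_mask=batch["attention_mask"],
            labels=batch["labels"],
        )
        self.log("train_loss", outputs.loss)
        return outputs.loss

    def configure_optimizers(self):
        # Adam with a linear schedule and warmup, as listed above.
        optimizer = torch.optim.Adam(self.model.parameters(), lr=self.hparams.lr)
        scheduler = get_linear_schedule_with_warmup(
            optimizer,
            num_warmup_steps=self.hparams.warmup_steps,
            num_training_steps=self.hparams.total_steps,
        )
        return [optimizer], [{"scheduler": scheduler, "interval": "step"}]
```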
Evaluation Metrics
The model is evaluated using BLEU scores for each generated distractor. Below are the BLEU scores obtained on the test set:
| Distractor   | BLEU-1 | BLEU-2 | BLEU-3 | BLEU-4 |
|--------------|--------|--------|--------|--------|
| Distractor 1 | 29.59  | 21.55  | 17.86  | 15.75  |
| Distractor 2 | 25.21  | 16.81  | 13.00  | 10.78  |
| Distractor 3 | 23.99  | 15.78  | 12.35  | 10.52  |
These scores indicate that the model generates distractors with a high degree of n-gram overlap with the reference distractors.
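The exact evaluation script is not included here; the sketch below shows one way per-distractor corpus BLEU could be computed with NLTK, under the assumption that each generated distractor is scored against the reference distractor in the same position. Treat it as an illustration rather than the repository's code.
```python
# Sketch: corpus-level BLEU-1..4 for a single distractor slot, using NLTK.
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

def distractor_bleu(references, hypotheses):
    # references / hypotheses: lists of strings for one distractor position.
    refs = [[r.split()] for r in references]  # one reference per example
    hyps = [h.split() for h in hypotheses]
    smooth = SmoothingFunction().method1
    weights = {
        "BLEU-1": (1.0, 0, 0, 0),
        "BLEU-2": (0.5, 0.5, 0, 0),
        "BLEU-3": (1 / 3, 1 / 3, 1 / 3, 0),
        "BLEU-4": (0.25, 0.25, 0.25, 0.25),
    }
    return {
        name: 100 * corpus_bleu(refs, hyps, weights=w, smoothing_function=smooth)
        for name, w in weights.items()
    }
```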
🔧 Technical Details
The model uses T5-base as the base model, fine-tuned with PyTorch Lightning. It relies on a custom `SEP_TOKEN` to separate the different parts of the input and target sequences. Inputs are tokenized to a maximum of 512 tokens and targets to a maximum of 64 tokens, and the model is trained with the Adam optimizer and a linear learning-rate schedule with warmup.
📄 License
The model is licensed under the MIT license.