MixQG-large: Open-Source Question Generation Model for Generating Questions from Context and Answers

MixQG Large

Developed by Salesforce

MixQG is a question generation model pre-trained on mixed-answer-type QA datasets, capable of generating relevant questions based on given context and answers.

Question Answering System

Transformers

English #Mixed-answer question generation #QA dataset pre-training #Multi-type answer support

Release Time: 3/2/2022

Model Overview

MixQG is a novel question generation model pre-trained on mixed-answer-type QA datasets. It can generate natural and fluent questions based on provided text context and specific answers.

Model Features

Mixed-answer type support

Capable of handling question generation tasks with multiple answer types

High-quality question generation

Generates natural and fluent questions that align with contextual semantics

English-language support

A question generation model for English text, as indicated by the model's language tag

Model Capabilities

Text generation

Question generation

Natural language processing

Use Cases

Education

Automatic test question generation

Automatically generates test questions based on textbook content

Improves teacher efficiency and enriches teaching resources

Content creation

Article interaction question generation

Generates interactive questions for online articles

Enhances reader engagement and comprehension depth

🚀 MixQG (Large-sized Model)

MixQG is a question generation model pre-trained on a diverse collection of QA datasets with mixed answer types.

🚀 Quick Start

MixQG was introduced in the paper "MixQG: Neural Question Generation with Mixed Answer Types", and the associated code is released in the accompanying repository.

💻 Usage Examples

Basic Usage

Using the Hugging Face pipeline abstraction:

from transformers import pipeline

nlp = pipeline("text2text-generation", model='Salesforce/mixqg-large', tokenizer='Salesforce/mixqg-large')
    
CONTEXT = "In the late 17th century, Robert Boyle proved that air is necessary for combustion."
ANSWER = "Robert Boyle"

def format_inputs(context: str, answer: str):
    # The separator is a literal backslash followed by "n" (two characters), not a newline.
    return f"{answer} \\n {context}"

text = format_inputs(CONTEXT, ANSWER)

print(nlp(text))
# should output [{'generated_text': 'Who proved that air is necessary for combustion?'}]
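
The same input format can be applied to several candidate answers drawn from one passage, producing one model input per answer. The following is a minimal sketch of that idea; `build_batch` and `SEPARATOR` are hypothetical names introduced here for illustration and are not part of the model's released code.

```python
from typing import Iterable, List

# Literal backslash + "n" surrounded by spaces, matching format_inputs above.
SEPARATOR = " \\n "


def format_inputs(context: str, answer: str) -> str:
    """Join an answer and its context in the order the usage example shows."""
    return f"{answer}{SEPARATOR}{context}"


def build_batch(context: str, answers: Iterable[str]) -> List[str]:
    """Hypothetical helper: build one formatted input per candidate answer."""
    return [format_inputs(context, answer) for answer in answers]


CONTEXT = "In the late 17th century, Robert Boyle proved that air is necessary for combustion."
batch = build_batch(CONTEXT, ["Robert Boyle", "air", "the late 17th century"])

for text in batch:
    print(repr(text))

# The list can then be passed to the pipeline created earlier, for example:
# results = nlp(batch)
```

Each answer should appear in the context, since the model is meant to generate a question whose answer is the given span.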

Advanced Usage

Using the pre-trained model directly:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained('Salesforce/mixqg-large')
model = AutoModelForSeq2SeqLM.from_pretrained('Salesforce/mixqg-large')

CONTEXT = "In the late 17th century, Robert Boyle proved that air is necessary for combustion."
ANSWER = "Robert Boyle"

def format_inputs(context: str, answer: str):
    # The separator is a literal backslash followed by "n" (two characters), not a newline.
    return f"{answer} \\n {context}"

text = format_inputs(CONTEXT, ANSWER)

input_ids = tokenizer(text, return_tensors="pt").input_ids
generated_ids = model.generate(input_ids, max_length=32, num_beams=4)
output = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
print(output)
# should output ['Who proved that air is necessary for combustion?']
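
The steps above (format, tokenize, generate, decode) can be wrapped in one function that handles several answers for the same context. This is a sketch under stated assumptions: `generate_questions` is a hypothetical helper name, and the `padding=True` and `truncation=True` tokenizer arguments are additions for batched input rather than settings taken from the original example. The `max_length=32` and `num_beams=4` defaults mirror the example above.

```python
from typing import Iterable, List, Tuple


def generate_questions(
    model,
    tokenizer,
    context: str,
    answers: Iterable[str],
    max_length: int = 32,
    num_beams: int = 4,
) -> List[Tuple[str, str]]:
    """Hypothetical helper: return (answer, generated question) pairs."""
    answers = list(answers)
    # Same input format as format_inputs: answer, literal backslash-n, context.
    texts = [f"{answer} \\n {context}" for answer in answers]
    # Padding lets inputs of different lengths share one batch (assumption).
    encoded = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated_ids = model.generate(**encoded, max_length=max_length, num_beams=num_beams)
    questions = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
    return list(zip(answers, questions))


# Usage with the tokenizer and model loaded in the example above:
# pairs = generate_questions(model, tokenizer, CONTEXT, ["Robert Boyle", "air"])
# for answer, question in pairs:
#     print(answer, "->", question)
```

Returning pairs rather than a dictionary keeps duplicate answers from overwriting each other.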

📚 Documentation

Citation

@misc{murakhovska2021mixqg,
      title={MixQG: Neural Question Generation with Mixed Answer Types}, 
      author={Lidiya Murakhovs'ka and Chien-Sheng Wu and Tong Niu and Wenhao Liu and Caiming Xiong},
      year={2021},
      eprint={2110.08175},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

🔧 Technical Details

Ethical Considerations

This release is for research purposes only in support of an academic paper. Our models, datasets, and code are not specifically designed or evaluated for all downstream purposes. We strongly recommend users evaluate and address potential concerns related to accuracy, safety, and fairness before deploying this model. We encourage users to consider the common limitations of AI, comply with applicable laws, and leverage best practices when selecting use cases, particularly for high-risk scenarios where errors or misuse could significantly impact people’s lives, rights, or safety. For further guidance on use cases, refer to our AUP and AI AUP.
