Flan-t5-large-squad2 Open-source Question Answering Model - Precision Handling of Answerable and Unanswerable Questions

Flan T5 Large Squad2

Developed by sjrhuschlee

An extractive QA model fine-tuned on the SQuAD2.0 dataset based on flan-t5-large, capable of handling both answerable and unanswerable questions.

Question Answering System

Transformers

English

Open Source License: MIT

#Extractive QA #LoRA Fine-tuning #SQuAD2.0 Adaptation

Downloads 57

Release Time : 6/14/2023

Model Overview

This model targets English extractive QA. It is fine-tuned on SQuAD2.0, which mixes answerable questions with unanswerable ones, so it can also signal when the context contains no answer.

Model Features

LoRA Fine-tuning

Efficient fine-tuning using LoRA from the PEFT library, maintaining model performance while reducing computational resource requirements

Special Token Handling

Uses the <cls> token to predict 'no-answer' scenarios, effectively handling unanswerable questions

Multi-dataset Validation

Comprehensively validated on SQuAD, SQuAD2.0, and multiple adversarial datasets

Model Capabilities

Extractive QA

Unanswerable Question Detection

Context Understanding

Use Cases

Customer Support

FAQ Auto-response

Automatically answers common user questions based on knowledge base content

Achieved 86.8% exact match rate on the SQuAD2.0 validation set

Education

Reading Comprehension Assistance

Helps students understand texts and answer related questions

Achieved a 91.3 F1 score on the SQuAD validation set

🚀 flan-t5-large for Extractive QA

This is flan-t5-large fine-tuned on the SQuAD2.0 dataset for extractive question answering: given a question and a context passage, it returns the answer span from the passage or predicts that no answer is present.

🚀 Quick Start

This is the flan-t5-large model, fine-tuned using the SQuAD2.0 dataset. It's been trained on question-answer pairs, including unanswerable questions, for the task of Extractive Question Answering.

UPDATE: As of transformers version 4.31.0, trust_remote_code=True is no longer necessary.
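A minimal sketch of how a script might choose this flag from the installed version. The helper name `needs_trust_remote_code` is hypothetical, and the parsing only looks at the leading numeric components of the version string.

```python
import re

def needs_trust_remote_code(transformers_version: str) -> bool:
    """Return True for transformers releases older than 4.31.0."""
    # Keep only the leading "major.minor.patch" digits, so strings such as
    # "4.31.0.dev0" are compared as (4, 31, 0).
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", transformers_version)
    if match is None:
        raise ValueError(f"Unrecognised version string: {transformers_version!r}")
    major, minor, patch = (int(part) if part else 0 for part in match.groups())
    return (major, minor, patch) < (4, 31, 0)

# Extra keyword arguments to pass to pipeline() or from_pretrained().
# In a real script the version would come from transformers.__version__.
extra_kwargs = {"trust_remote_code": True} if needs_trust_remote_code("4.30.2") else {}
```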

This model was trained using LoRA available through the PEFT library.

NOTE: The <cls> token must be manually added to the beginning of the question for this model to work properly. The model uses the <cls> token to make "no answer" predictions, and the T5 tokenizer does not add this special token automatically.
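The manual step described in the note can be wrapped in a small helper. This is a sketch: `with_cls_prefix` is a hypothetical name, and the default token string `<cls>` is taken from the note above. In practice `tokenizer.cls_token` would be passed in instead of relying on the default.

```python
def with_cls_prefix(question: str, cls_token: str = "<cls>") -> str:
    """Prepend the <cls> token to a question unless it is already there."""
    if question.startswith(cls_token):
        return question
    return f"{cls_token}{question}"

question = with_cls_prefix("Where do I live?")
print(question)  # <cls>Where do I live?
```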

✨ Features

Language model: flan-t5-large
Language: English
Downstream-task: Extractive QA
Training data: SQuAD 2.0
Eval data: SQuAD 2.0
Infrastructure: 1x NVIDIA 3070

Property	Details
Model Type	flan-t5-large
Training Data	SQuAD 2.0

📦 Installation

The original README does not list installation steps. The examples below import torch and transformers, and the advanced example also imports peft.

💻 Usage Examples

Basic Usage

This uses the merged weights (base model weights + LoRA weights) to allow for simple use in Transformers pipelines. It has the same performance as using the weights separately when using the PEFT library.

import torch
from transformers import (
  AutoModelForQuestionAnswering,
  AutoTokenizer,
  pipeline
)
model_name = "sjrhuschlee/flan-t5-large-squad2"

# a) Using pipelines
nlp = pipeline(
  'question-answering',
  model=model_name,
  tokenizer=model_name,
  # trust_remote_code=True, # Do not use if version transformers>=4.31.0
)
qa_input = {
  'question': f'{nlp.tokenizer.cls_token}Where do I live?',  # '<cls>Where do I live?'
  'context': 'My name is Sarah and I live in London'
}
res = nlp(qa_input)
# {'score': 0.984, 'start': 30, 'end': 37, 'answer': ' London'}

# b) Load model & tokenizer
model = AutoModelForQuestionAnswering.from_pretrained(
  model_name,
  # trust_remote_code=True # Do not use if version transformers>=4.31.0
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

question = f'{tokenizer.cls_token}Where do I live?'  # '<cls>Where do I live?'
context = 'My name is Sarah and I live in London'
encoding = tokenizer(question, context, return_tensors="pt")
output = model(
  encoding["input_ids"],
  attention_mask=encoding["attention_mask"]
)

# Take the most likely start and end token positions and decode that span.
start_idx = int(torch.argmax(output["start_logits"]))
end_idx = int(torch.argmax(output["end_logits"]))
answer = tokenizer.decode(encoding["input_ids"][0][start_idx:end_idx + 1])
# 'London'
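
Option b) above always decodes a span, so it never reports "no answer". A common convention for SQuAD2.0-style extractive models is to compare the score of the best real span against the score of the span that starts and ends on the first token, which here would be `<cls>`. The sketch below works on plain lists of logits. It assumes `<cls>` sits at index 0, and the names `pick_answer_span`, `max_answer_len` and `null_threshold` are illustrative rather than part of this model's API.

```python
from typing import List, Optional, Tuple

def pick_answer_span(
    start_logits: List[float],
    end_logits: List[float],
    max_answer_len: int = 30,
    null_threshold: float = 0.0,
) -> Optional[Tuple[int, int]]:
    """Return (start, end) token indices, or None for 'no answer'."""
    # Score of the "null" span that starts and ends on index 0 (assumed <cls>).
    null_score = start_logits[0] + end_logits[0]

    # Best span over the remaining positions with start <= end.
    best_score = float("-inf")
    best_span = None
    for start in range(1, len(start_logits)):
        last = min(start + max_answer_len, len(end_logits))
        for end in range(start, last):
            score = start_logits[start] + end_logits[end]
            if score > best_score:
                best_score, best_span = score, (start, end)

    # Predict "no answer" when the null span beats the best real span
    # by more than the threshold.
    if best_span is None or null_score - best_score > null_threshold:
        return None
    return best_span

# Toy logits for a 5-token input; index 0 plays the role of <cls>.
print(pick_answer_span([0.1, 0.2, 3.0, 0.1, 0.0], [0.0, 0.1, 0.3, 2.5, 0.2]))  # (2, 3)
print(pick_answer_span([5.0, 0.2, 1.0, 0.1, 0.0], [4.0, 0.1, 0.3, 1.5, 0.2]))  # None
```

With the tensors from option b), `output["start_logits"][0].tolist()` and `output["end_logits"][0].tolist()` would supply the two lists. A fuller version would also restrict candidate spans to tokens that belong to the context.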

Advanced Usage

NOTE: This requires code in the PR https://github.com/huggingface/peft/pull/473 for the PEFT library.

#!pip install peft

from peft import LoraConfig, PeftModelForQuestionAnswering
from transformers import AutoModelForQuestionAnswering, AutoTokenizer
model_name = "sjrhuschlee/flan-t5-large-squad2"
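
The statement in Basic Usage that merged weights behave the same as separate base and LoRA weights follows from how LoRA is usually formulated: a frozen weight matrix W gets a low-rank update B A scaled by alpha / r, so W x + (alpha / r) B A x equals (W + (alpha / r) B A) x. The toy example below checks this identity with small hand-written matrices. It is a sketch of the general idea only, not the remainder of the snippet above and not this model's actual configuration. The matrix values, the rank `r` and `alpha` are made up for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b, scale=1.0):
    """Return a + scale * b, element-wise."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Frozen base weight W (2x2) and LoRA factors B (2x1), A (1x2), so rank r = 1.
W = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.5], [1.0]]
A = [[2.0, 0.0]]
r, alpha = 1, 2.0
scale = alpha / r

x = [[1.0], [1.0]]  # input as a column vector

# Separate weights: base output plus the scaled low-rank path.
separate = add(matmul(W, x), matmul(B, matmul(A, x)), scale)

# Merged weights: fold the update into W once, then do a single matmul.
W_merged = add(W, matmul(B, A), scale)
merged = matmul(W_merged, x)

print(separate)  # [[5.0], [11.0]]
print(merged)    # [[5.0], [11.0]]
```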

📚 Documentation

Metrics

# Squad v2
{
    "eval_HasAns_exact": 85.08771929824562,
    "eval_HasAns_f1": 90.598422845031,
    "eval_HasAns_total": 5928,
    "eval_NoAns_exact": 88.47771236333053,
    "eval_NoAns_f1": 88.47771236333053,
    "eval_NoAns_total": 5945,
    "eval_best_exact": 86.78514276088605,
    "eval_best_exact_thresh": 0.0,
    "eval_best_f1": 89.53654936623764,
    "eval_best_f1_thresh": 0.0,
    "eval_exact": 86.78514276088605,
    "eval_f1": 89.53654936623776,
    "eval_runtime": 1908.3189,
    "eval_samples": 12001,
    "eval_samples_per_second": 6.289,
    "eval_steps_per_second": 0.787,
    "eval_total": 11873
}

# Squad
{
    "eval_HasAns_exact": 85.99810785241249,
    "eval_HasAns_f1": 91.296119057944,
    "eval_HasAns_total": 10570,
    "eval_best_exact": 85.99810785241249,
    "eval_best_exact_thresh": 0.0,
    "eval_best_f1": 91.296119057944,
    "eval_best_f1_thresh": 0.0,
    "eval_exact": 85.99810785241249,
    "eval_f1": 91.296119057944,
    "eval_runtime": 1508.9596,
    "eval_samples": 10657,
    "eval_samples_per_second": 7.062,
    "eval_steps_per_second": 0.883,
    "eval_total": 10570
}
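
As a quick consistency check on the SQuAD v2 figures, the overall `eval_exact` and `eval_f1` can be reproduced as averages of the HasAns and NoAns scores weighted by the number of questions in each subset. The snippet below is a small worked example using only the numbers reported above. The function name `overall` is illustrative.

```python
# Values copied from the SQuAD v2 block above.
has_ans = {"exact": 85.08771929824562, "f1": 90.598422845031, "total": 5928}
no_ans = {"exact": 88.47771236333053, "f1": 88.47771236333053, "total": 5945}

def overall(metric: str) -> float:
    """Weighted average of a per-subset score by number of questions."""
    total = has_ans["total"] + no_ans["total"]
    weighted = has_ans[metric] * has_ans["total"] + no_ans[metric] * no_ans["total"]
    return weighted / total

print(round(overall("exact"), 2))  # 86.79
print(round(overall("f1"), 2))     # 89.54
```

The two subset sizes add up to 11873, which matches `eval_total`.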

📄 License

This project is under the MIT license.
