Longformer-base-4096 fine-tuned on SQuAD v2
The Longformer-base-4096 model fine-tuned on SQuAD v2 for the Q&A downstream task.
Features
- Tags: QA, long context, Q&A
- Datasets: squad_v2
Model Index
| Property | Details |
|---|---|
| Model Name | mrm8488/longformer-base-4096-finetuned-squadv2 |
| Task Type | question-answering |
| Dataset Name | squad_v2 |
| Split | validation |
| Exact Match | 79.9242 |
| F1 | 83.3467 |
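These are the standard SQuAD v2 exact-match and F1 scores. For reference, the same metrics can be computed with the `evaluate` library's `squad_v2` metric; a minimal sketch on a dummy prediction/reference pair (the id and texts below are illustrative, not from the card):

import evaluate

# SQuAD v2 predictions also carry a no-answer probability,
# since the dataset contains unanswerable questions
squad_v2_metric = evaluate.load("squad_v2")

predictions = [{"id": "dummy-1", "prediction_text": "democratized NLP",
                "no_answer_probability": 0.0}]
references = [{"id": "dummy-1",
               "answers": {"text": ["democratized NLP"], "answer_start": [17]}}]

results = squad_v2_metric.compute(predictions=predictions, references=references)
print(results["exact"], results["f1"])  # 100.0 100.0 for this toy pair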
Quick Start
Longformer-base-4096
Longformer is a transformer model designed for long documents. longformer-base-4096 is a BERT-like model initialized from the RoBERTa checkpoint and pre-trained for masked language modeling (MLM) on long documents. It supports sequences of up to 4,096 tokens. Longformer combines sliding-window (local) attention with global attention, and the global attention pattern can be configured per task so the model learns task-specific representations (see the sketch below).
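For illustration (not part of the original card), the local/global split is exposed in `transformers` through the `global_attention_mask` argument; a minimal sketch on the base checkpoint, with placeholder input text:

import torch
from transformers import LongformerModel, LongformerTokenizerFast

tokenizer = LongformerTokenizerFast.from_pretrained("allenai/longformer-base-4096")
model = LongformerModel.from_pretrained("allenai/longformer-base-4096")

inputs = tokenizer("A very long document goes here ...", return_tensors="pt")

# 0 = sliding-window (local) attention, 1 = global attention.
# Here only the first token (<s>) attends globally; for QA, the
# question tokens are typically the ones marked global.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

outputs = model(**inputs, global_attention_mask=global_attention_mask)
print(outputs.last_hidden_state.shape)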
Details of the downstream task (Q&A) - Dataset
Dataset ID: squad_v2 from HuggingFace/Datasets
| Dataset | Split | # samples |
|---|---|---|
| squad_v2 | train | 130319 |
| squad_v2 | valid | 11873 |
How to load it from datasets:
!pip install datasets
from datasets import load_dataset
dataset = load_dataset('squad_v2')
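To double-check the split sizes in the table above, note that `datasets` names the evaluation split `validation` (the table's `valid`):

print(dataset["train"].num_rows)       # 130319
print(dataset["validation"].num_rows)  # 11873
print(dataset["train"][0])             # peek at one example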
Check out more about this dataset and others in Datasets Viewer
Model fine-tuning
The training script is a slightly modified version of this one
Usage Examples
Basic Usage
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

ckpt = "mrm8488/longformer-base-4096-finetuned-squadv2"
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForQuestionAnswering.from_pretrained(ckpt)

text = "Huggingface has democratized NLP. Huge thanks to Huggingface for this."
question = "What has Huggingface done?"

encoding = tokenizer(question, text, return_tensors="pt")
input_ids = encoding["input_ids"]
attention_mask = encoding["attention_mask"]

# Recent transformers versions return a model output object,
# not a (start_scores, end_scores) tuple, so read the logits from it
with torch.no_grad():
    outputs = model(input_ids, attention_mask=attention_mask)
start_scores = outputs.start_logits
end_scores = outputs.end_logits

# Decode the span between the most likely start and end positions
all_tokens = tokenizer.convert_ids_to_tokens(input_ids[0].tolist())
answer_tokens = all_tokens[torch.argmax(start_scores) : torch.argmax(end_scores) + 1]
answer = tokenizer.decode(tokenizer.convert_tokens_to_ids(answer_tokens))
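Printing `answer` should yield the extracted span; for this toy context, something like `democratized NLP`.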
Advanced Usage
from transformers import AutoTokenizer, AutoModelForQuestionAnswering, pipeline
ckpt = "mrm8488/longformer-base-4096-finetuned-squadv2"
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForQuestionAnswering.from_pretrained(ckpt)
qa = pipeline("question-answering", model=model, tokenizer=tokenizer)
text = "Huggingface has democratized NLP. Huge thanks to Huggingface for this."
question = "What has Huggingface done?"
qa({"question": question, "context": text})
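The pipeline call should return a dict with `score`, `start`, `end`, and `answer` keys, e.g. `{'answer': 'democratized NLP', ...}` (values illustrative).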
If, given the same context, we ask something that is not in it, the model signals "no answer" by returning the `<s>` token as the output span (SQuAD v2 contains unanswerable questions, so the model is trained to do this).
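For example (the question below is illustrative, not from the original card):

# Ask about something the context does not mention; per the note
# above, the returned answer text collapses to "<s>" (no answer)
qa({"question": "Who founded Huggingface?", "context": text})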
Acknowledgments
Created by Manuel Romero/@mrm8488 | LinkedIn
Made with ♥ in Spain
