Longformer-base-4096 fine-tuned on SQuAD v2
The Longformer-base-4096 model fine-tuned on SQuAD v2 for the Q&A downstream task.
Features
- Tags: QA, long context, Q&A
- Datasets: squad_v2
Model Index
| Property | Details |
|---|---|
| Model Name | mrm8488/longformer-base-4096-finetuned-squadv2 |
| Task Type | question-answering |
| Dataset Name | squad_v2 |
| Split | validation |
| Exact Match | 79.9242 |
| F1 | 83.3467 |
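These are the standard SQuAD v2 exact-match and F1 scores. For reference, the same metrics can be computed with the `evaluate` library's `squad_v2` metric; a minimal sketch on a dummy prediction/reference pair (the id and texts below are illustrative, not from the card):

import evaluate

# SQuAD v2 predictions also carry a no-answer probability,
# since the dataset contains unanswerable questions
squad_v2_metric = evaluate.load("squad_v2")

predictions = [{"id": "dummy-1", "prediction_text": "democratized NLP",
                "no_answer_probability": 0.0}]
references = [{"id": "dummy-1",
               "answers": {"text": ["democratized NLP"], "answer_start": [17]}}]

results = squad_v2_metric.compute(predictions=predictions, references=references)
print(results["exact"], results["f1"])  # 100.0 100.0 for this toy pair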
Quick Start
Longformer-base-4096
Longformer is a transformer model designed for long documents. longformer-base-4096 is a BERT-like model initialized from the RoBERTa checkpoint and pre-trained for masked language modeling (MLM) on long documents. It supports sequences of up to 4,096 tokens. Longformer combines sliding-window (local) attention with global attention, and the global attention pattern can be configured per task so the model learns task-specific representations (see the sketch below).
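For illustration (not part of the original card), the local/global split is exposed in `transformers` through the `global_attention_mask` argument; a minimal sketch on the base checkpoint, with placeholder input text:

import torch
from transformers import LongformerModel, LongformerTokenizerFast

tokenizer = LongformerTokenizerFast.from_pretrained("allenai/longformer-base-4096")
model = LongformerModel.from_pretrained("allenai/longformer-base-4096")

inputs = tokenizer("A very long document goes here ...", return_tensors="pt")

# 0 = sliding-window (local) attention, 1 = global attention.
# Here only the first token (<s>) attends globally; for QA, the
# question tokens are typically the ones marked global.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

outputs = model(**inputs, global_attention_mask=global_attention_mask)
print(outputs.last_hidden_state.shape)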
Details of the downstream task (Q&A) - Dataset
Dataset ID: squad_v2 from HuggingFace/Datasets
| Dataset | Split | # samples |
|---|---|---|
| squad_v2 | train | 130319 |
| squad_v2 | valid | 11873 |
How to load it from datasets:
!pip install datasets
from datasets import load_dataset
dataset = load_dataset('squad_v2')
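To double-check the split sizes in the table above, note that `datasets` names the evaluation split `validation` (the table's `valid`):

print(dataset["train"].num_rows)       # 130319
print(dataset["validation"].num_rows)  # 11873
print(dataset["train"][0])             # peek at one example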
Check out more about this dataset and others in Datasets Viewer
Model fine-tuning
The training script is a slightly modified version of this one
Usage Examples
Basic Usage
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

ckpt = "mrm8488/longformer-base-4096-finetuned-squadv2"
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForQuestionAnswering.from_pretrained(ckpt)

text = "Huggingface has democratized NLP. Huge thanks to Huggingface for this."
question = "What has Huggingface done?"

encoding = tokenizer(question, text, return_tensors="pt")
input_ids = encoding["input_ids"]
attention_mask = encoding["attention_mask"]

# Recent transformers versions return a model output object,
# not a (start_scores, end_scores) tuple, so read the logits from it
with torch.no_grad():
    outputs = model(input_ids, attention_mask=attention_mask)
start_scores = outputs.start_logits
end_scores = outputs.end_logits

# Decode the span between the most likely start and end positions
all_tokens = tokenizer.convert_ids_to_tokens(input_ids[0].tolist())
answer_tokens = all_tokens[torch.argmax(start_scores) : torch.argmax(end_scores) + 1]
answer = tokenizer.decode(tokenizer.convert_tokens_to_ids(answer_tokens))
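Printing `answer` should yield the extracted span; for this toy context, something like `democratized NLP`.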
Advanced Usage
from transformers import AutoTokenizer, AutoModelForQuestionAnswering, pipeline
ckpt = "mrm8488/longformer-base-4096-finetuned-squadv2"
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForQuestionAnswering.from_pretrained(ckpt)
qa = pipeline("question-answering", model=model, tokenizer=tokenizer)
text = "Huggingface has democratized NLP. Huge thanks to Huggingface for this."
question = "What has Huggingface done?"
qa({"question": question, "context": text})
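The pipeline call should return a dict with `score`, `start`, `end`, and `answer` keys, e.g. `{'answer': 'democratized NLP', ...}` (values illustrative).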
If, given the same context, we ask something that is not in it, the model signals "no answer" by returning the `<s>` token as the output span (SQuAD v2 contains unanswerable questions, so the model is trained to do this).
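For example (the question below is illustrative, not from the original card):

# Ask about something the context does not mention; per the note
# above, the returned answer text collapses to "<s>" (no answer)
qa({"question": "Who founded Huggingface?", "context": text})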
Acknowledgments
Created by Manuel Romero/@mrm8488 | LinkedIn
Made with ♥ in Spain
