# LONGFORMER-BASE-4096 fine-tuned on SQuAD v1
This is a longformer-base-4096 model fine-tuned on the SQuAD v1 dataset for question answering, leveraging the Longformer architecture's ability to handle long documents.
## Datasets

SQuAD v1
## License
This project is licensed under the MIT license.
## Quick Start
This is the longformer-base-4096 model fine-tuned on the SQuAD v1 dataset for the question answering task.

The Longformer model was created by Iz Beltagy, Matthew E. Peters, and Arman Cohan from AllenAI. As the paper explains, Longformer "is a BERT-like model for long documents." The pre-trained model can handle sequences with up to 4096 tokens.
## Documentation
### Model Training
This model was trained on a Google Colab V100 GPU. You can find the fine-tuning Colab here.
A few things to keep in mind while training Longformer for the QA task:
By default, Longformer uses sliding-window local attention on all tokens, but for QA all question tokens should have global attention. For more details, please refer to the paper. The `LongformerForQuestionAnswering` model automatically does this for you. To allow it to do that:
- The input sequence must have three sep tokens, i.e. the sequence should be encoded like this: `<s> question</s></s> context</s>`. If you encode the question and context as an input pair, the tokenizer already takes care of that, so you shouldn't worry about it (see the sketch after this list).
- `input_ids` should always be a batch of examples.
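To make this concrete, here is a minimal sketch (not from the original card) that encodes a question/context pair with this model's tokenizer and verifies that the three `</s>` separator tokens are present; the question and context strings are placeholders:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("valhalla/longformer-base-4096-finetuned-squadv1")

# Placeholder question/context pair, used only to inspect the encoding.
question = "What has Huggingface done ?"
context = "Huggingface has democratized NLP."

encoding = tokenizer(question, context, return_tensors="pt")
tokens = tokenizer.convert_ids_to_tokens(encoding["input_ids"][0].tolist())

print(tokens)
# The encoded pair should follow the <s> question</s></s> context</s> layout,
# i.e. contain exactly three </s> separator tokens.
print(tokens.count("</s>"))  # expected: 3
```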
## Results
| Metric      | Value   |
|-------------|---------|
| Exact Match | 85.1466 |
| F1          | 91.5415 |
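For reference, this is how Exact Match and F1 are typically computed for SQuAD v1. The sketch below assumes the Hugging Face `evaluate` library (not used in the original card), and the prediction/reference pair is illustrative only:

```python
import evaluate

# Load the standard SQuAD metric, which reports Exact Match and F1.
squad_metric = evaluate.load("squad")

# Illustrative prediction/reference pair; real evaluation would cover
# the whole SQuAD v1 validation set.
predictions = [{"id": "1", "prediction_text": "Denver Broncos"}]
references = [{"id": "1", "answers": {"text": ["Denver Broncos"], "answer_start": [177]}}]

print(squad_metric.compute(predictions=predictions, references=references))
# {'exact_match': 100.0, 'f1': 100.0}
```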
## Usage Examples
### Basic Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

tokenizer = AutoTokenizer.from_pretrained("valhalla/longformer-base-4096-finetuned-squadv1")
model = AutoModelForQuestionAnswering.from_pretrained("valhalla/longformer-base-4096-finetuned-squadv1")

text = "Huggingface has democratized NLP. Huge thanks to Huggingface for this."
question = "What has Huggingface done ?"

# Encoding the question and context as a pair inserts the separator
# tokens Longformer expects.
encoding = tokenizer(question, text, return_tensors="pt")
input_ids = encoding["input_ids"]
attention_mask = encoding["attention_mask"]

# Recent versions of transformers return an output object rather than a
# (start_scores, end_scores) tuple.
outputs = model(input_ids, attention_mask=attention_mask)
start_scores = outputs.start_logits
end_scores = outputs.end_logits

# Decode the most likely answer span.
all_tokens = tokenizer.convert_ids_to_tokens(input_ids[0].tolist())
answer_tokens = all_tokens[torch.argmax(start_scores) : torch.argmax(end_scores) + 1]
answer = tokenizer.decode(tokenizer.convert_tokens_to_ids(answer_tokens))
```
## ⚠️ Important Note
`LongformerForQuestionAnswering` isn't yet supported in `pipeline`. I'll update this card once the support has been added.
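In the meantime, a small hand-rolled wrapper can mimic what a QA pipeline would do. This is a hypothetical sketch (the `answer_question` helper is not part of transformers); it just packages the manual steps from the usage example above:

```python
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

tokenizer = AutoTokenizer.from_pretrained("valhalla/longformer-base-4096-finetuned-squadv1")
model = AutoModelForQuestionAnswering.from_pretrained("valhalla/longformer-base-4096-finetuned-squadv1")

def answer_question(question: str, context: str) -> str:
    """Hypothetical helper that mimics a QA pipeline: encode, predict, decode."""
    encoding = tokenizer(question, context, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**encoding)
    start = torch.argmax(outputs.start_logits).item()
    end = torch.argmax(outputs.end_logits).item() + 1
    return tokenizer.decode(encoding["input_ids"][0][start:end]).strip()

print(answer_question("What has Huggingface done ?",
                      "Huggingface has democratized NLP."))
```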
Created with ❤️ by Suraj Patil
