# 🚀 BART-LARGE finetuned on SQuADv2
This is a bart-large model finetuned on the SQuADv2 dataset for the question answering task.
## 🚀 Quick Start
This BART-LARGE model, finetuned on the SQuADv2 dataset, is designed for extractive question answering: given a question and a context passage, it predicts the answer span within the passage.
## ✨ Features
- Powerful Architecture: BART is a sequence-to-sequence model suited to both NLG and NLU tasks, introduced in the paper [BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension](https://arxiv.org/abs/1910.13461).
- Long Sequence Handling: BART can handle input sequences of up to 1024 tokens (see the tokenization sketch after this list).
- Comparable Performance: As reported in the paper, bart-large achieves performance comparable to RoBERTa on SQuAD.
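A minimal sketch of keeping inputs inside that 1024-token limit when tokenizing question/context pairs; the checkpoint name is this card's, while the question and context strings are placeholders:

```python
from transformers import BartTokenizer

tokenizer = BartTokenizer.from_pretrained('a-ware/bart-squadv2')

# Truncate anything beyond BART's 1024-token limit, cutting only the
# context ('only_second') so the question is never lost.
encoding = tokenizer(
    "Who was Jim Henson?",           # question (placeholder)
    "Jim Henson was a nice puppet",  # context (placeholder)
    max_length=1024,
    truncation='only_second',
    return_tensors='pt',
)
print(encoding['input_ids'].shape)  # at most (1, 1024)
```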
## 📚 Documentation
### Model details
BART is a sequence-to-sequence model suitable for both NLG and NLU tasks. It was proposed in the paper [BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension](https://arxiv.org/abs/1910.13461).
To use BART for question answering, the complete document is fed into both the encoder and the decoder, and the top hidden state of the decoder is used as a representation for each token. This representation is then classified into the start and end positions of the answer span.
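A minimal sketch of that span-classification head, assuming it mirrors the linear start/end head used by `BartForQuestionAnswering`; the batch and sequence sizes are illustrative:

```python
import torch
import torch.nn as nn

hidden_size = 1024  # bart-large d_model

# One linear layer maps each decoder hidden state to two logits:
# a start score and an end score for the answer span.
qa_head = nn.Linear(hidden_size, 2)

# Stand-in for the decoder's top hidden states: (batch, seq_len, hidden)
decoder_hidden_states = torch.randn(1, 20, hidden_size)

logits = qa_head(decoder_hidden_states)   # (1, 20, 2)
start_logits, end_logits = logits.split(1, dim=-1)
start_logits = start_logits.squeeze(-1)   # (1, 20)
end_logits = end_logits.squeeze(-1)       # (1, 20)

# The predicted answer span is the argmax start/end pair over positions.
print(start_logits.argmax(dim=-1), end_logits.argmax(dim=-1))
```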
| Property | Details |
|----------|---------|
| Encoder Layers | 12 |
| Decoder Layers | 12 |
| Hidden Size (d_model) | 1024 |
| Feed-Forward Size | 4096 |
| Num Attention Heads | 16 |
| On Disk Size | 1.63 GB |
### Model training
This model was trained with the following parameters using the [simpletransformers](https://github.com/ThilinaRajapakse/simpletransformers) wrapper:
```python
train_args = {
    'learning_rate': 1e-5,
    'max_seq_length': 512,
    'doc_stride': 512,
    'overwrite_output_dir': True,
    'reprocess_input_data': False,
    'train_batch_size': 8,
    'num_train_epochs': 2,
    'gradient_accumulation_steps': 2,
    'no_cache': True,
    'use_cached_eval_features': False,
    'save_model_every_epoch': False,
    'output_dir': "bart-squadv2",
    'eval_batch_size': 32,
    'fp16_opt_level': 'O2',
}
```
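A minimal sketch of launching the fine-tune with those args via simpletransformers; the base checkpoint (`facebook/bart-large`) and training-file path are illustrative assumptions, not details taken from the original run:

```python
from simpletransformers.question_answering import QuestionAnsweringModel

# 'facebook/bart-large' and 'train-v2.0.json' are placeholder choices.
model = QuestionAnsweringModel('bart', 'facebook/bart-large', args=train_args)

# train_model accepts a SQuAD-format JSON file or a list of SQuAD-style dicts.
model.train_model('train-v2.0.json')
```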
You can also train your own model using this Colab notebook.
### Results
```json
{"correct": 6832, "similar": 4409, "incorrect": 632, "eval_loss": -14.950117511952177}
```
## 💻 Usage Examples
### Basic Usage
```python
from transformers import BartTokenizer, BartForQuestionAnswering
import torch

tokenizer = BartTokenizer.from_pretrained('a-ware/bart-squadv2')
model = BartForQuestionAnswering.from_pretrained('a-ware/bart-squadv2')

question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"
encoding = tokenizer(question, text, return_tensors='pt')

with torch.no_grad():
    outputs = model(**encoding)

# Most likely start and end token positions of the answer span
start_index = torch.argmax(outputs.start_logits)
end_index = torch.argmax(outputs.end_logits)

# Decode the predicted span back into text
answer_ids = encoding['input_ids'][0][start_index : end_index + 1]
answer = tokenizer.decode(answer_ids, skip_special_tokens=True).strip()
# answer => 'a nice puppet'
```
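The same checkpoint should also be usable through the high-level `pipeline` API; a sketch, assuming the checkpoint is compatible with the question-answering pipeline:

```python
from transformers import pipeline

qa = pipeline('question-answering', model='a-ware/bart-squadv2', tokenizer='a-ware/bart-squadv2')
result = qa(question="Who was Jim Henson?", context="Jim Henson was a nice puppet")
print(result['answer'], result['score'])
```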
Created with ❤️ by A-ware UG 