# RoBERTa-base fine-tuned on SQuAD v1
This project presents a RoBERTa-base model fine-tuned on the SQuAD v1.1 dataset for extractive question answering: given a question and a context passage, it returns the answer span found in the context. It reaches 83.0 EM / 90.4 F1 on the SQuAD v1.1 dev set (see Results below).
## Quick Start
This model was fine-tuned from the HuggingFace RoBERTa base checkpoint on SQuAD1.1. It's case-sensitive, distinguishing between words like "english" and "English".
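A minimal sketch of what case-sensitivity means in practice, assuming the `transformers` library is installed and using the published checkpoint name that appears in the usage example below:

```python
from transformers import AutoTokenizer

# The cased BPE vocabulary gives different token sequences to the two spellings,
# so the model sees "english" and "English" as different inputs.
tokenizer = AutoTokenizer.from_pretrained("csarron/roberta-base-squad-v1")
print(tokenizer.tokenize("english"))
print(tokenizer.tokenize("English"))
```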
## Features
- Question answering: answers questions by extracting the relevant span from a provided context.
- Case-sensitive: letter case is preserved, so spellings such as "English" and "english" are treated as different tokens.
## Installation
### Prerequisites

The steps below assume a working Python 3 environment with PyTorch and the Hugging Face Transformers library installed, plus a checkout of the repository that provides the `run_energy_squad.py` fine-tuning script under `examples/question-answering`.
### Fine-tuning Steps
- Navigate to the question-answering examples directory:

  ```bash
  cd examples/question-answering
  ```

- Create a data directory:

  ```bash
  mkdir -p data
  ```

- Download the SQuAD v1.1 training and development datasets:

  ```bash
  wget -O data/train-v1.1.json https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v1.1.json
  wget -O data/dev-v1.1.json https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json
  ```

- Run the fine-tuning script:

  ```bash
  python run_energy_squad.py \
    --model_type roberta \
    --model_name_or_path roberta-base \
    --do_train \
    --do_eval \
    --train_file train-v1.1.json \
    --predict_file dev-v1.1.json \
    --per_gpu_train_batch_size 12 \
    --per_gpu_eval_batch_size 16 \
    --learning_rate 3e-5 \
    --num_train_epochs 2.0 \
    --max_seq_length 320 \
    --doc_stride 128 \
    --data_dir data \
    --output_dir data/roberta-base-squad-v1 2>&1 | tee train-roberta-base-squad-v1.log
  ```
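The script trains on all visible GPUs, so on the two-GPU machine described below the effective training batch size is 2 × 12 = 24. An optional sanity check (a small sketch assuming PyTorch is installed) before launching the roughly two-hour run:

```python
import torch

# List the GPUs the fine-tuning script will see.
print("CUDA available:", torch.cuda.is_available())
for i in range(torch.cuda.device_count()):
    print(f"GPU {i}: {torch.cuda.get_device_name(i)}")
```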
### Machine Specifications
- CPU: Intel(R) Core(TM) i7-6800K @ 3.40 GHz
- Memory: 32 GiB
- GPUs: 2 × GeForce GTX 1070, 8 GiB memory each
- GPU driver: 418.87.01, CUDA: 10.1
- Python Version: 3.7.5
The fine-tuning process took about 2 hours to complete.
## Usage Examples
### Basic Usage
```python
from transformers import pipeline

qa_pipeline = pipeline(
    "question-answering",
    model="csarron/roberta-base-squad-v1",
    tokenizer="csarron/roberta-base-squad-v1"
)

predictions = qa_pipeline({
    'context': "The game was played on February 7, 2016 at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California.",
    'question': "What day was the game played on?"
})

print(predictions)
```
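For lower-level control, here is a hedged sketch that loads the same checkpoint with the model and tokenizer classes directly; the argmax span decoding below is a simplification of the pipeline's full post-processing:

```python
import torch
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

model_name = "csarron/roberta-base-squad-v1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

question = "What day was the game played on?"
context = ("The game was played on February 7, 2016 at Levi's Stadium "
           "in the San Francisco Bay Area at Santa Clara, California.")

inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Take the most likely start and end positions and decode the tokens in between.
start = int(outputs.start_logits.argmax())
end = int(outputs.end_logits.argmax())
print(tokenizer.decode(inputs["input_ids"][0][start:end + 1]).strip())
```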
## Documentation
### Dataset Details
| Property | Details |
|----------|---------|
| Dataset | SQuAD 1.1 |
| Train Samples | 96.8K |
| Eval Samples | 11.8K |
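For reference, the SQuAD v1.1 splits can also be inspected programmatically with the Hugging Face `datasets` library; this is only a convenience sketch, since the fine-tuning script above reads the raw JSON files directly:

```python
from datasets import load_dataset

# Raw SQuAD v1.1 has ~87.6K train / ~10.6K validation examples; the sample
# counts in the table above likely also include the extra windows produced by
# max_seq_length/doc_stride when long contexts are split into features.
squad = load_dataset("squad")
print(squad)
print(squad["train"][0]["question"], "->", squad["train"][0]["answers"]["text"])
```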
### Results
| Property | Details |
|----------|---------|
| Model Size | 477M |
| EM | 83.0 |
| F1 | 90.4 |
Note that the above results didn't involve any hyperparameter search.
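EM (exact match) and F1 are the standard SQuAD metrics. A minimal sketch of computing them with the `evaluate` library (assumed to be installed; the prediction and reference below are dummy values, not actual dev-set outputs):

```python
import evaluate

squad_metric = evaluate.load("squad")

# Each prediction and reference must share an "id"; references use the SQuAD answer format.
predictions = [{"id": "example-0", "prediction_text": "February 7, 2016"}]
references = [{"id": "example-0",
               "answers": {"text": ["February 7, 2016"], "answer_start": [23]}}]
print(squad_metric.compute(predictions=predictions, references=references))
# -> {'exact_match': 100.0, 'f1': 100.0}
```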
## Technical Details
Fine-tuning was run on the machine described under Machine Specifications (2 × GeForce GTX 1070, 8 GiB each) using the command in the Installation section, with a per-GPU batch size of 12, a learning rate of 3e-5, 2 training epochs, a maximum sequence length of 320, and a document stride of 128.
## License
This project is licensed under the MIT license.
Created by Qingqing Cao | GitHub | Twitter
Made with ❤️ in New York.