# 🚀 MobileBERT fine-tuned on SQuAD v2

MobileBERT is a lightweight version of BERT_LARGE, featuring bottleneck structures and a carefully designed balance between self-attention and feed-forward networks. This model was fine-tuned on SQuAD2.0 from the HuggingFace checkpoint `google/mobilebert-uncased`, aiming to solve question-answering tasks effectively.
## 🚀 Quick Start

The following sections cover the details of fine-tuning this model: dataset information, the fine-tuning environment, scripts, results, and usage examples.
## ✨ Features

- Lightweight design: MobileBERT is a thin version of BERT_LARGE, suitable for resource-constrained environments.
- Effective for question answering: fine-tuned on the SQuAD2.0 dataset, it handles question-answering tasks well.
## 📦 Installation

Before fine-tuning the model, you need to install the `transformers` library from https://github.com/huggingface/transformers.
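One possible setup, assuming a source install (the `run_squad.py` example script used below ships with the repository, so cloning it is the simplest route):

```bash
# Clone the repository to get both the library and the example scripts,
# then install it in editable mode.
git clone https://github.com/huggingface/transformers
cd transformers
pip install -e .
```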
## 📚 Documentation

### Details

| Property | Details |
|----------|---------|
| Dataset | SQuAD2.0 |
| Train Samples | 130k |
| Eval Samples | 12.3k |
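The sample counts above can be checked by loading the dataset from the Hugging Face Hub; this sketch assumes the `datasets` library for illustration, while the fine-tuning script below works from the raw JSON files instead:

```python
from datasets import load_dataset

# SQuAD v2 from the Hugging Face Hub; the train split has ~130k examples.
squad_v2 = load_dataset("squad_v2")

# Note: eval feature counts after tokenization can differ slightly from
# the raw number of validation examples printed here.
print(len(squad_v2["train"]), len(squad_v2["validation"]))
```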
### Fine-tuning

- Python version: 3.7.5
- Machine specs:
  - CPU: Intel(R) Core(TM) i7-6800K CPU @ 3.40GHz
  - Memory: 32 GiB
  - GPUs: 2 × GeForce GTX 1070, each with 8 GiB memory
  - GPU driver: 418.87.01, CUDA: 10.1
- Script:

```bash
# after installing https://github.com/huggingface/transformers
cd examples/question-answering
mkdir -p data

wget -O data/train-v2.0.json https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v2.0.json
wget -O data/dev-v2.0.json https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v2.0.json

export SQUAD_DIR=`pwd`/data

python run_squad.py \
  --model_type mobilebert \
  --model_name_or_path google/mobilebert-uncased \
  --do_train \
  --do_eval \
  --do_lower_case \
  --version_2_with_negative \
  --train_file $SQUAD_DIR/train-v2.0.json \
  --predict_file $SQUAD_DIR/dev-v2.0.json \
  --per_gpu_train_batch_size 16 \
  --per_gpu_eval_batch_size 16 \
  --learning_rate 4e-5 \
  --num_train_epochs 5.0 \
  --max_seq_length 320 \
  --doc_stride 128 \
  --warmup_steps 1400 \
  --save_steps 2000 \
  --output_dir $SQUAD_DIR/mobilebert-uncased-warmup-squad_v2 2>&1 | tee train-mobilebert-warmup-squad_v2.log
```
It took about 3.5 hours to finish the fine-tuning process.
### Results

| Property | Details |
|----------|---------|
| Model Size | 95M |
| EM (Exact Match) | 75.2 (Original: 76.2) |
| F1 Score | 78.8 (Original: 79.2) |
Note that the above results didn't involve any hyperparameter search.
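As an illustration of how these metrics are computed, the `evaluate` library exposes the official SQuAD v2 metric; the toy prediction/reference pair below is made up for demonstration, and `run_squad.py` above already reports EM/F1 itself:

```python
import evaluate

# SQuAD v2 metric: expects a no-answer probability for each prediction.
squad_v2_metric = evaluate.load("squad_v2")

# Hypothetical example id and answer, purely for demonstration.
predictions = [
    {"id": "example-1", "prediction_text": "February 7, 2016",
     "no_answer_probability": 0.0}
]
references = [
    {"id": "example-1",
     "answers": {"text": ["February 7, 2016"], "answer_start": [23]}}
]

results = squad_v2_metric.compute(predictions=predictions, references=references)
print(results["exact"], results["f1"])  # 100.0 100.0 on this toy pair
```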
## 💻 Usage Examples

### Basic Usage

```python
from transformers import pipeline

qa_pipeline = pipeline(
    "question-answering",
    model="csarron/mobilebert-uncased-squad-v2",
    tokenizer="csarron/mobilebert-uncased-squad-v2"
)

predictions = qa_pipeline({
    'context': "The game was played on February 7, 2016 at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California.",
    'question': "What day was the game played on?"
})

print(predictions)
```
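Beyond the pipeline, the checkpoint can also be driven directly; the following is a minimal sketch using the standard `AutoTokenizer`/`AutoModelForQuestionAnswering` API, with greedy argmax span selection for illustration only:

```python
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

model_name = "csarron/mobilebert-uncased-squad-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

question = "What day was the game played on?"
context = ("The game was played on February 7, 2016 at Levi's Stadium "
           "in the San Francisco Bay Area at Santa Clara, California.")

inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Pick the most likely start/end token positions and decode the span.
start = torch.argmax(outputs.start_logits)
end = torch.argmax(outputs.end_logits) + 1
answer = tokenizer.decode(inputs["input_ids"][0][start:end])
print(answer)
```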
## 📄 License
This project is licensed under the MIT license.
Created by Qingqing Cao | GitHub | Twitter
Made with ❤️ in New York.