🚀 BERT-Tiny fine-tuned on SQuAD v2
This project features BERT-Tiny, developed by Google Research and fine-tuned on SQuAD 2.0 for the question-answering (Q&A) downstream task. It offers an efficient solution for question answering in environments with limited computational resources.
Model size (after training): 16.74 MB
✨ Features
Details of BERT-Tiny and its 'family' (from their documentation)
Released on March 11th, 2020, this model is one of 24 smaller BERT models (English only, uncased, trained with WordPiece masking) referenced in *Well-Read Students Learn Better: On the Importance of Pre-training Compact Models*.
The smaller BERT models are designed for environments with restricted computational resources. They can be fine-tuned in the same way as the original BERT models. However, they are most effective in knowledge distillation scenarios, where the fine-tuning labels are generated by a larger and more accurate teacher model.
Details of the downstream task (Q&A) - Dataset
SQuAD2.0 combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to resemble answerable ones. To perform well on SQuAD2.0, systems must not only answer questions when possible but also determine when no answer is supported by the paragraph and refrain from answering.
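In the SQuAD v2 data format, the unanswerable questions described above are marked by an empty answers list. A minimal sketch with hypothetical example records (not taken from the actual dataset) showing the two cases and how to tell them apart:

```python
# Two hypothetical records in the SQuAD v2 style: one answerable,
# one unanswerable (empty "answers" field).
answerable = {
    "question": "Where is the Eiffel Tower?",
    "context": "The Eiffel Tower is located in Paris.",
    "answers": {"text": ["Paris"], "answer_start": [31]},
}
unanswerable = {
    "question": "When was the Eiffel Tower demolished?",
    "context": "The Eiffel Tower is located in Paris.",
    "answers": {"text": [], "answer_start": []},  # no supported answer
}

def is_impossible(example):
    """A SQuAD v2 question is unanswerable when its answers list is empty."""
    return len(example["answers"]["text"]) == 0

print(is_impossible(answerable))    # False
print(is_impossible(unanswerable))  # True
```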
| Dataset  | Split | # samples |
|----------|-------|-----------|
| SQuAD2.0 | train | 130k      |
| SQuAD2.0 | eval  | 12.3k     |
📦 Installation
The script for fine-tuning can be found here.
🔧 Technical Details
Model training
The model was trained on a Tesla P100 GPU with 25GB of RAM.
📚 Documentation
Results
| Metric | Value |
|--------|-------|
| EM     | 48.60 |
| F1     | 49.73 |
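EM and F1 above follow the standard SQuAD scoring scheme: answers are normalized (lowercased, punctuation and articles stripped) before comparison, and F1 is computed over overlapping tokens. A simplified, self-contained sketch of that scoring (not the official evaluation script):

```python
import re
import string
from collections import Counter

def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))

def f1_score(prediction, gold):
    """Token-level F1 between normalized prediction and gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # For empty (no-answer) predictions, F1 is 1.0 only if both are empty.
        return float(pred_tokens == gold_tokens)
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("The Eiffel Tower", "eiffel tower"))  # 1.0
print(round(f1_score("in Paris, France", "Paris"), 2))  # 0.5
```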
💻 Usage Examples
Basic Usage
Fast usage with pipelines:
```python
from transformers import pipeline

qa_pipeline = pipeline(
    "question-answering",
    model="mrm8488/bert-tiny-finetuned-squadv2",
    tokenizer="mrm8488/bert-tiny-finetuned-squadv2"
)

qa_pipeline({
    'context': "Manuel Romero has been working hardly in the repository hugginface/transformers lately",
    'question': "Who has been working hard for hugginface/transformers lately?"
})
```
Output:

```json
{
  "answer": "Manuel Romero",
  "end": 13,
  "score": 0.05684709993458714,
  "start": 0
}
```
Advanced Usage
```python
qa_pipeline({
    'context': "Manuel Romero has been working hardly in the repository hugginface/transformers lately",
    'question': "For which company has worked Manuel Romero?"
})
```

Output:

```json
{
  "answer": "hugginface/transformers",
  "end": 79,
  "score": 0.11613431826808274,
  "start": 56
}
```
Created by Manuel Romero/@mrm8488 | LinkedIn
Made with ♥ in Spain