🚀 TAPAS mini model fine-tuned on WikiTable Questions (WTQ)
This TAPAS mini model, fine-tuned on WikiTable Questions (WTQ), is available in two versions. It is designed for table question answering, leveraging pre-training and fine-tuning on multiple datasets.
✨ Features
- Two Model Versions: The default version corresponds to the `tapas_wtq_wikisql_sqa_inter_masklm_mini_reset` checkpoint of the original GitHub repository. The non-default `no_reset` version corresponds to `tapas_wtq_wikisql_sqa_inter_masklm_mini` (intermediate pre-training, absolute position embeddings). A loading sketch is shown after this list.
- Pre-training and Fine-tuning: Pre-trained with an MLM objective and an intermediate pre-training task, then fine-tuned in a chain on SQA, WikiSQL and WTQ.
- Relative Position Embeddings: The default version uses relative position embeddings, resetting the position index at every cell of the table.
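For illustration, the sketch below shows one way to load each version with the 🤗 Transformers library. The Hub model ID `google/tapas-mini-finetuned-wtq` and the `no_reset` revision name are assumptions based on the description above, not details confirmed by this card.

```python
# Sketch: loading the two versions described above (assumed Hub model ID and
# an assumed "no_reset" revision holding the non-default checkpoint).
from transformers import TapasForQuestionAnswering

# Default version: relative position embeddings, position index reset per cell.
model = TapasForQuestionAnswering.from_pretrained("google/tapas-mini-finetuned-wtq")

# Non-default version: absolute position embeddings.
model_no_reset = TapasForQuestionAnswering.from_pretrained(
    "google/tapas-mini-finetuned-wtq",
    revision="no_reset",
)
```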
📚 Documentation
Model description
TAPAS is a BERT-like transformers model pretrained on a large corpus of English data from Wikipedia in a self-supervised fashion. It was pretrained on raw tables and associated texts with an automatic process to generate inputs and labels. The pre-training objectives are:
- Masked language modeling (MLM): The model randomly masks 15% of the words in the input (a flattened table and associated context), then predicts the masked words. This allows it to learn a bidirectional representation of a table and associated text.
- Intermediate pre-training: To encourage numerical reasoning on tables, the model was additionally pre-trained on a balanced dataset of millions of syntactically created training examples. It must predict whether a sentence is supported or refuted by the contents of a table.
Fine-tuning is done by adding a cell selection head and an aggregation head on top of the pre-trained model, and jointly training these heads with the base model on SQA, WikiSQL and finally WTQ.
Intended uses & limitations
You can use this model for answering questions about a table. For code examples, refer to the TAPAS documentation on the Hugging Face website.
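As a quick start, here is a minimal sketch using the `table-question-answering` pipeline from 🤗 Transformers; the model ID `google/tapas-mini-finetuned-wtq` and the example table are illustrative assumptions.

```python
# Sketch: asking a question about a small table with the pipeline API.
import pandas as pd
from transformers import pipeline

tqa = pipeline(
    task="table-question-answering",
    model="google/tapas-mini-finetuned-wtq",  # assumed Hub model ID
)

# TAPAS expects every table cell to be a string.
table = pd.DataFrame(
    {
        "City": ["Paris", "London", "Berlin"],
        "Population (millions)": ["2.1", "8.9", "3.6"],
    }
)

result = tqa(table=table, query="Which city has the largest population?")
print(result["answer"])      # selected cell(s), e.g. "London"
print(result["aggregator"])  # predicted aggregation operator (NONE/SUM/AVERAGE/COUNT)
```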
Training procedure
Preprocessing
The texts are lowercased and tokenized using WordPiece with a vocabulary size of 30,000. The model inputs are of the form:
`[CLS] Question [SEP] Flattened table [SEP]`
The authors first converted the WTQ dataset into the format of SQA using automatic conversion scripts.
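To make the input format above concrete, here is a small tokenization sketch with `TapasTokenizer`; the table contents and the Hub model ID are illustrative assumptions.

```python
# Sketch: how a question and a table are flattened into
# [CLS] Question [SEP] Flattened table [SEP] by the WordPiece tokenizer.
import pandas as pd
from transformers import TapasTokenizer

tokenizer = TapasTokenizer.from_pretrained("google/tapas-mini-finetuned-wtq")  # assumed ID

table = pd.DataFrame({"Year": ["2019", "2020"], "Sales": ["100", "150"]})
inputs = tokenizer(
    table=table,
    queries=["What were the sales in 2020?"],
    padding="max_length",
    truncation=True,
    return_tensors="pt",
)

print(inputs["input_ids"].shape)                      # (1, 512) with max_length padding
print(tokenizer.decode(inputs["input_ids"][0][:15]))  # lowercased question, then table tokens
```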
Fine-tuning
The model was fine-tuned on 32 Cloud TPU v3 cores for 50,000 steps with a maximum sequence length of 512 and a batch size of 512. Fine-tuning takes around 10 hours. The optimizer used is Adam with a learning rate of 1.93581e-5 and a warmup ratio of 0.128960. An inductive bias is added such that the model only selects cells of the same column, reflected by the `select_one_column` parameter of `TapasConfig`. See the paper (tables 11 and 12) for more details.
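The single-column inductive bias can be inspected through the configuration, as in the sketch below; the Hub model ID is an assumption, and the printed values reflect the WTQ-style setup described above rather than values confirmed by this card.

```python
# Sketch: inspecting the WTQ-style fine-tuning configuration of this checkpoint.
from transformers import TapasConfig

config = TapasConfig.from_pretrained("google/tapas-mini-finetuned-wtq")  # assumed Hub model ID

# Cell selection restricted to a single column, as described above.
print(config.select_one_column)
# Weak supervision for aggregation (NONE/SUM/AVERAGE/COUNT) should give 4 labels.
print(config.num_aggregation_labels)
```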
BibTeX entry and citation info
@misc{herzig2020tapas,
title={TAPAS: Weakly Supervised Table Parsing via Pre-training},
author={Jonathan Herzig and Paweł Krzysztof Nowak and Thomas Müller and Francesco Piccinno and Julian Martin Eisenschlos},
year={2020},
eprint={2004.02349},
archivePrefix={arXiv},
primaryClass={cs.IR}
}
@misc{eisenschlos2020understanding,
title={Understanding tables with intermediate pre-training},
author={Julian Martin Eisenschlos and Syrine Krichene and Thomas Müller},
year={2020},
eprint={2010.00571},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
@article{DBLP:journals/corr/PasupatL15,
author = {Panupong Pasupat and
Percy Liang},
title = {Compositional Semantic Parsing on Semi-Structured Tables},
journal = {CoRR},
volume = {abs/1508.00305},
year = {2015},
url = {http://arxiv.org/abs/1508.00305},
archivePrefix = {arXiv},
eprint = {1508.00305},
timestamp = {Mon, 13 Aug 2018 16:47:37 +0200},
biburl = {https://dblp.org/rec/journals/corr/PasupatL15.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
📄 License
This model is released under the Apache 2.0 license.