🚀 TAPAS base model fine-tuned on WikiSQL (in a supervised fashion)
This model is a fine-tuned version of the TAPAS base model, trained on WikiSQL in a supervised manner. It is available in two versions and can be used for answering questions about a table.
✨ Features
- Two Model Versions: The default version corresponds to the tapas_wikisql_sqa_inter_masklm_base_reset checkpoint of the original GitHub repository. The non-default no_reset version corresponds to tapas_wikisql_sqa_inter_masklm_base, which uses absolute position embeddings (see the loading sketch after this list).
- Pretraining and Fine-tuning: Pretrained with MLM and an intermediate pre-training objective, then fine-tuned in a chain on SQA and WikiSQL.
- Relative Position Embeddings: The default version uses relative position embeddings, resetting the position index at every cell of the table.
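As a rough illustration of how the two versions can be loaded with the Transformers library, see the sketch below. The Hub identifier google/tapas-base-finetuned-wikisql-supervised and the no_reset revision name are assumptions based on the naming conventions of the TAPAS model family; check this repository's model page for the exact values.

```python
from transformers import TapasForQuestionAnswering, TapasTokenizer

# Assumed Hub identifier for this checkpoint; replace with this repository's name.
model_name = "google/tapas-base-finetuned-wikisql-supervised"

# Default version: relative position embeddings (position index reset at every cell).
tokenizer = TapasTokenizer.from_pretrained(model_name)
model = TapasForQuestionAnswering.from_pretrained(model_name)

# Non-default version with absolute position embeddings, assuming it is published
# as a "no_reset" revision of the same repository.
model_no_reset = TapasForQuestionAnswering.from_pretrained(model_name, revision="no_reset")
```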
📚 Documentation
Model description
TAPAS is a BERT-like transformers model pretrained on a large corpus of English data from Wikipedia in a self-supervised fashion. It was pretrained with two objectives:
- Masked language modeling (MLM): Given a (flattened) table and associated context, the model randomly masks 15% of the words in the input and then predicts the masked words. This allows the model to learn a bidirectional representation of a table and associated text.
- Intermediate pre-training: To encourage numerical reasoning on tables, the model was additionally pre-trained on a balanced dataset of millions of syntactically created training examples. It must predict whether a sentence is supported or refuted by the contents of a table.
Fine-tuning is done by adding a cell selection head and an aggregation head on top of the pre-trained model and jointly training these heads with the base model on SQA and WikiSQL.
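For intuition about these two heads, the sketch below runs a question and a small table through TapasForQuestionAnswering and converts the cell-selection and aggregation logits into predictions. The model identifier is an assumption; any TAPAS checkpoint that includes an aggregation head exposes the same outputs.

```python
import pandas as pd
import torch
from transformers import TapasForQuestionAnswering, TapasTokenizer

model_name = "google/tapas-base-finetuned-wikisql-supervised"  # assumed identifier
tokenizer = TapasTokenizer.from_pretrained(model_name)
model = TapasForQuestionAnswering.from_pretrained(model_name)

# TAPAS expects every table cell to be a string.
table = pd.DataFrame({"City": ["Paris", "Berlin"], "Population": ["2100000", "3600000"]})
inputs = tokenizer(table=table, queries=["What is the population of Berlin?"],
                   padding="max_length", return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Cell selection head -> outputs.logits; aggregation head -> outputs.logits_aggregation.
coordinates, aggregation_indices = tokenizer.convert_logits_to_predictions(
    inputs, outputs.logits.detach(), outputs.logits_aggregation.detach()
)
print(coordinates, aggregation_indices)
```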
Intended uses & limitations
You can use this model for answering questions about a table. For full code examples, refer to the TAPAS documentation on the Hugging Face website; a minimal sketch follows.
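The sketch below uses the table-question-answering pipeline, assuming the checkpoint is published under the Hub identifier google/tapas-base-finetuned-wikisql-supervised; substitute the actual identifier of this repository.

```python
import pandas as pd
from transformers import pipeline

# Assumed Hub identifier; replace with this repository's name if it differs.
qa = pipeline("table-question-answering",
              model="google/tapas-base-finetuned-wikisql-supervised")

# All cell values must be passed as strings.
table = pd.DataFrame({
    "Actors": ["Brad Pitt", "Leonardo Di Caprio", "George Clooney"],
    "Number of movies": ["87", "53", "69"],
})

result = qa(table=table, query="How many movies does Leonardo Di Caprio have?")
print(result)  # dict with the answer, selected cells/coordinates, and the aggregation operator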
Training procedure
Preprocessing
The texts are lowercased and tokenized using WordPiece with a vocabulary size of 30,000. The model inputs are in the form:
[CLS] Question [SEP] Flattened table [SEP]
The authors converted the WikiSQL dataset into the format of SQA using automatic conversion scripts.
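As an illustration of this input layout, the TapasTokenizer shipped with Transformers performs the lowercasing, WordPiece tokenization, and table flattening. The checkpoint name below is an assumption; any TAPAS tokenizer applies the same preprocessing.

```python
import pandas as pd
from transformers import TapasTokenizer

# Assumed checkpoint name; the preprocessing is identical across TAPAS tokenizers.
tokenizer = TapasTokenizer.from_pretrained("google/tapas-base-finetuned-wikisql-supervised")

table = pd.DataFrame({"Rank": ["1", "2"], "Country": ["France", "Germany"]})
encoding = tokenizer(table=table, queries=["Which country is ranked first?"],
                     return_tensors="pt")

# The decoded ids follow the [CLS] question [SEP] flattened-table layout described above.
print(tokenizer.decode(encoding["input_ids"][0][:20]))
```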
Fine-tuning
The model was fine-tuned on 32 Cloud TPU v3 cores for 50,000 steps with a maximum sequence length of 512 and a batch size of 512. Fine-tuning takes around 10 hours. The optimizer used is Adam with a learning rate of 6.17164e-5 and a warmup ratio of 0.1424. See the paper (tables 11 and 12) for more details.
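For reference, these hyperparameters map roughly onto a Transformers TrainingArguments configuration as sketched below. This is only an illustrative translation; the original fine-tuning was run with the authors' TensorFlow code on Cloud TPUs, not with this Trainer setup.

```python
from transformers import TrainingArguments

# Illustrative mapping of the reported hyperparameters (not the authors' original script).
training_args = TrainingArguments(
    output_dir="tapas-base-wikisql-supervised",  # hypothetical output directory
    max_steps=50_000,
    per_device_train_batch_size=512,   # the paper reports a global batch size of 512
    learning_rate=6.17164e-5,
    warmup_ratio=0.1424,
    optim="adamw_torch",               # Adam-style optimizer
)
# The maximum sequence length of 512 is enforced at tokenization time, not here.
```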
BibTeX entry and citation info
@misc{herzig2020tapas,
  title={TAPAS: Weakly Supervised Table Parsing via Pre-training},
  author={Jonathan Herzig and Paweł Krzysztof Nowak and Thomas Müller and Francesco Piccinno and Julian Martin Eisenschlos},
  year={2020},
  eprint={2004.02349},
  archivePrefix={arXiv},
  primaryClass={cs.IR}
}

@misc{eisenschlos2020understanding,
  title={Understanding tables with intermediate pre-training},
  author={Julian Martin Eisenschlos and Syrine Krichene and Thomas Müller},
  year={2020},
  eprint={2010.00571},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}

@article{DBLP:journals/corr/abs-1709-00103,
  author={Victor Zhong and Caiming Xiong and Richard Socher},
  title={Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning},
  journal={CoRR},
  volume={abs/1709.00103},
  year={2017},
  url={http://arxiv.org/abs/1709.00103},
  archivePrefix={arXiv},
  eprint={1709.00103},
  timestamp={Mon, 13 Aug 2018 16:48:41 +0200},
  biburl={https://dblp.org/rec/journals/corr/abs-1709-00103.bib},
  bibsource={dblp computer science bibliography, https://dblp.org}
}
📄 License
This model is licensed under the Apache 2.0 license.
| Property | Details |
|----------|---------|
| Model Type | TAPAS base model fine-tuned on WikiSQL |
| Training Data | WikiSQL, SQA |
| Tags | tapas |