# TAPEX (base-sized model)
TAPEX is a pre-training approach that endows existing models with table reasoning skills. It was proposed to solve complex table-related questions; see the paper cited below.
## Quick Start
TAPEX was proposed in TAPEX: Table Pre-training via Learning a Neural SQL Executor by Qian Liu, Bei Chen, Jiaqi Guo, Morteza Ziyadi, Zeqi Lin, Weizhu Chen, Jian-Guang Lou. The original repo can be found here.
## Features
TAPEX (Table Pre-training via Execution) is a conceptually simple and empirically powerful pre-training approach to empower existing models with table reasoning skills. TAPEX realizes table pre-training by learning a neural SQL executor over a synthetic corpus, which is obtained by automatically synthesizing executable SQL queries.
TAPEX is based on the BART architecture: a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder.
This model is the `tapex-base` model fine-tuned on the WikiTableQuestions dataset.
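To make the pre-training objective concrete, below is an illustrative, unofficial sketch (not the released pre-training code) of what one synthetic training example looks like: the encoder consumes a SQL query plus a flattened table, and the decoder is trained to emit the query's execution result. The `linearize` helper is a hypothetical re-implementation of the flattening that `TapexTokenizer` performs for you.

```python
import pandas as pd

# Illustrative sketch of a TAPEX pre-training example. The model learns to
# act as a neural SQL executor: input = SQL query + linearized table,
# target = the query's execution result.
table = pd.DataFrame({"year": [2008, 2012], "city": ["beijing", "london"]})

def linearize(df: pd.DataFrame) -> str:
    # Flatten a table in the "col : ... row i : ..." style that
    # TapexTokenizer uses internally (hypothetical re-implementation).
    header = "col : " + " | ".join(df.columns)
    rows = [
        f"row {i + 1} : " + " | ".join(str(v) for v in row)
        for i, row in enumerate(df.itertuples(index=False))
    ]
    return " ".join([header] + rows)

sql = "select city where year = 2008"
source = sql + " " + linearize(table)  # encoder input
target = "beijing"                     # execution result = decoder target
print(source)
```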
## Usage Examples
### Basic Usage
You can use the model for table question answering on complex questions. Some solvable questions are shown below (the corresponding tables are not shown here):
| Question | Answer |
|---|---|
| according to the table, what is the last title that spicy horse produced? | Akaneiro: Demon Hunters |
| what is the difference in runners-up from coleraine academical institution and royal school dungannon? | 20 |
| what were the first and last movies greenstreet acted in? | The Maltese Falcon, Malaya |
| in which olympic games did arasay thondike not finish in the top 20? | 2012 |
| which broadcaster hosted 3 titles but they had only 1 episode? | Channel 4 |
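On recent versions of transformers, you may also be able to try the checkpoint through the `table-question-answering` pipeline. The snippet below is a minimal sketch under that assumption (the table and query are illustrative); if your installed version does not support generative table QA models in this pipeline, use the explicit example in Advanced Usage instead.

```python
from transformers import pipeline
import pandas as pd

# Minimal sketch; assumes the table-question-answering pipeline supports
# generative (TAPEX-style) models in your transformers version.
pipe = pipeline("table-question-answering", model="microsoft/tapex-base-finetuned-wtq")

table = pd.DataFrame({
    "year": ["2008", "2012"],
    "city": ["beijing", "london"],
})
print(pipe(table=table, query="which city hosted the olympic games in 2012?"))
```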
### Advanced Usage
Here is how to use this model in transformers:
```python
from transformers import TapexTokenizer, BartForConditionalGeneration
import pandas as pd

tokenizer = TapexTokenizer.from_pretrained("microsoft/tapex-base-finetuned-wtq")
model = BartForConditionalGeneration.from_pretrained("microsoft/tapex-base-finetuned-wtq")

data = {
    "year": [1896, 1900, 1904, 2004, 2008, 2012],
    "city": ["athens", "paris", "st. louis", "athens", "beijing", "london"],
}
table = pd.DataFrame.from_dict(data)

# The tokenizer flattens the table and concatenates it with the query.
query = "In which year did beijing host the Olympic Games?"
encoding = tokenizer(table=table, query=query, return_tensors="pt")

# The BART decoder generates the answer autoregressively.
outputs = model.generate(**encoding)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```
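Running this prints the generated answer, e.g. `[' 2008']` (the leading space comes from BART's byte-pair tokenizer).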
## How to Eval
Please find the eval script here.
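The exact evaluation protocol lives in the script linked above. As a rough, hypothetical sketch, denotation-style accuracy on a handful of examples could be computed like this; `examples` is an assumed stand-in for real (table, question, answer) triples from WikiTableQuestions, and the official script additionally applies the benchmark's answer-normalization rules.

```python
import pandas as pd
from transformers import TapexTokenizer, BartForConditionalGeneration

# Hypothetical sketch of denotation-style accuracy; not the official script.
tokenizer = TapexTokenizer.from_pretrained("microsoft/tapex-base-finetuned-wtq")
model = BartForConditionalGeneration.from_pretrained("microsoft/tapex-base-finetuned-wtq")

# Assumed placeholder data; replace with real (table, question, answer) triples.
examples = [
    (pd.DataFrame({"year": [2008, 2012], "city": ["beijing", "london"]}),
     "which city hosted the olympic games in 2012?", "london"),
]

correct = 0
for table, question, gold in examples:
    encoding = tokenizer(table=table, query=question, return_tensors="pt")
    output = model.generate(**encoding)
    prediction = tokenizer.batch_decode(output, skip_special_tokens=True)[0]
    correct += prediction.strip().lower() == gold.strip().lower()

print(f"denotation accuracy: {correct / len(examples):.2%}")
```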
## Documentation
### BibTeX entry and citation info
```bibtex
@inproceedings{liu2022tapex,
  title={{TAPEX}: Table Pre-training via Learning a Neural {SQL} Executor},
  author={Qian Liu and Bei Chen and Jiaqi Guo and Morteza Ziyadi and Zeqi Lin and Weizhu Chen and Jian-Guang Lou},
  booktitle={International Conference on Learning Representations},
  year={2022},
  url={https://openreview.net/forum?id=O50443AsCP}
}
```
## License
This model is released under the MIT license.