Open-source tapex-large-finetuned-wtq model - Providing efficient solutions specifically for table reasoning tasks

Tapex Large Finetuned Wtq

Developed by microsoft

TAPEX is a table pre-training model that learns via neural SQL executor, based on the BART architecture, specifically designed for table reasoning tasks.

Question Answering System

Transformers

EnglishOpen Source License:MIT #Table Reasoning #SQL Execution Pretraining #Complex QA

Downloads 2,431

Release Time : 3/10/2022

Model Overview

TAPEX achieves table pre-training by learning a neural SQL executor on a synthetic corpus, aiming to endow existing models with table reasoning capabilities.

Model Features

Table Pre-training

Learns via neural SQL executor, equipping the model with powerful table reasoning capabilities.

BART-based Architecture

Combines the advantages of bidirectional encoder and autoregressive decoder, suitable for sequence-to-sequence tasks.

Complex Question Answering

Capable of handling complex queries and reasoning problems involving tabular data.

Model Capabilities

Table Question Answering

Table Reasoning

Complex Query Processing

Use Cases

Data Querying

Table Data Question Answering

Answers complex questions about table data, such as comparisons, calculations, and finding specific information.

Can accurately answer questions like 'In which year did Beijing host the Olympics?'

🚀 TAPEX (large-sized model)

TAPEX is a pre - training approach that endows existing models with table reasoning skills. It was proposed in a research paper and can be used for complex table question - answering tasks.

✨ Features

Table Reasoning Empowerment: TAPEX enables existing models to acquire table reasoning skills through a pre - training approach.
Based on BART Architecture: It builds on the BART architecture, a transformer encoder - decoder (seq2seq) model with a bidirectional encoder and an autoregressive decoder.
Fine - tuned on Dataset: This tapex - base model is fine - tuned on the WikiTableQuestions dataset.

📦 Installation

No specific installation steps are provided in the original document, so this section is skipped.

💻 Usage Examples

Basic Usage

from transformers import TapexTokenizer, BartForConditionalGeneration
import pandas as pd

tokenizer = TapexTokenizer.from_pretrained("microsoft/tapex-large-finetuned-wtq")
model = BartForConditionalGeneration.from_pretrained("microsoft/tapex-large-finetuned-wtq")

data = {
    "year": [1896, 1900, 1904, 2004, 2008, 2012],
    "city": ["athens", "paris", "st. louis", "athens", "beijing", "london"]
}
table = pd.DataFrame.from_dict(data)

# tapex accepts uncased input since it is pre-trained on the uncased corpus
query = "In which year did beijing host the Olympic Games?"
encoding = tokenizer(table=table, query=query, return_tensors="pt")

outputs = model.generate(**encoding)

print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
# [' 2008.0']

📚 Documentation

Model description

TAPEX (Table Pre - training via Execution) is a conceptually simple and empirically powerful pre - training approach to empower existing models with table reasoning skills. TAPEX realizes table pre - training by learning a neural SQL executor over a synthetic corpus, which is obtained by automatically synthesizing executable SQL queries.

TAPEX is based on the BART architecture, the transformer encoder - decoder (seq2seq) model with a bidirectional (BERT - like) encoder and an autoregressive (GPT - like) decoder.

This model is the tapex - base model fine - tuned on the WikiTableQuestions dataset.

Intended Uses

You can use the model for table question answering on complex questions. Some solveable questions are shown below (corresponding tables now shown):

Question	Answer
according to the table, what is the last title that spicy horse produced?	Akaneiro: Demon Hunters
what is the difference in runners - up from coleraine academical institution and royal school dungannon?	20
what were the first and last movies greenstreet acted in?	The Maltese Falcon, Malaya
in which olympic games did arasay thondike not finish in the top 20?	2012
which broadcaster hosted 3 titles but they had only 1 episode?	Channel 4

How to Eval

Please find the eval script here.

BibTeX entry and citation info

@inproceedings{
    liu2022tapex,
    title={{TAPEX}: Table Pre-training via Learning a Neural {SQL} Executor},
    author={Qian Liu and Bei Chen and Jiaqi Guo and Morteza Ziyadi and Zeqi Lin and Weizhu Chen and Jian-Guang Lou},
    booktitle={International Conference on Learning Representations},
    year={2022},
    url={https://openreview.net/forum?id=O50443AsCP}
}

📄 License

This project is licensed under the MIT license.

Featured Recommended AI Models

Empowering the Future, Your AI Solution Knowledge Base

English 简体中文繁體中文にほんご