Open-source Tapex-large-sql-execution model - Empowering table reasoning tasks, free deployment, highly practical!

Tapex Large Sql Execution

Developed by Microsoft

TAPEX is a BART-based model, pretrained as a neural SQL executor and designed specifically for table reasoning tasks.

Large Language Model

Transformers

English | Open Source License: MIT | #Table QA #SQL Execution Simulation #Table Reasoning

Downloads 68

Release Time : 3/10/2022

Model Overview

TAPEX implements table pretraining by learning neural SQL executors on a synthetic corpus, which is obtained by automatically synthesizing executable SQL queries. It is mainly used for table QA and table fact verification tasks.

Model Features

Neural SQL Execution

Acts as a neural SQL executor: given a table and a SQL query, it generates the query result.

Table Pretraining

Is pretrained to predict the execution results of SQL queries over tables, which strengthens its table reasoning capabilities.

Based on BART Architecture

Utilizes BART's Transformer encoder-decoder structure, combining the advantages of bidirectional encoding and autoregressive decoding.

Model Capabilities

Table QA

Table Fact Verification

SQL Query Execution

Use Cases

Data Query

Table Data Query

Executes SQL queries on structured table data to retrieve specific information.

Returns the query result as text; in the usage example, the query "select year where city = beijing" yields 2008.

Data Analysis

Table Data Analysis

Performs complex analysis and reasoning on table data.

🚀 TAPEX (large-sized model)

TAPEX is a pre-training approach that endows existing models with table reasoning skills. It targets table-related tasks such as table question answering and table fact verification.

🚀 Quick Start

TAPEX was proposed in "TAPEX: Table Pre-training via Learning a Neural SQL Executor" by Qian Liu, Bei Chen, Jiaqi Guo, Morteza Ziyadi, Zeqi Lin, Weizhu Chen, and Jian-Guang Lou. The original repo can be found [here](https://github.com/microsoft/Table-Pretraining).

✨ Features

TAPEX (Table Pre-training via Execution) is a conceptually simple and empirically powerful pre-training approach to empower existing models with table reasoning skills. TAPEX realizes table pre-training by learning a neural SQL executor over a synthetic corpus, which is obtained by automatically synthesizing executable SQL queries.
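To make the idea of a synthetic corpus concrete, here is a minimal sketch of how one (query, answer) training pair could be produced: fill a SQL template, run it through a real executor, and keep the result as the target the model learns to generate. This is not the authors' pipeline. The single template, the table name `t`, the helper names, and the ", " answer separator are all assumptions made for illustration; `sqlite3` stands in for whatever executor the original work used.

```python
import random
import sqlite3

# Toy table taken from the usage example on this page.
ROWS = [
    (1896, "athens"), (1900, "paris"), (1904, "st. louis"),
    (2004, "athens"), (2008, "beijing"), (2012, "london"),
]


def build_table(rows):
    """Load the toy table into an in-memory SQLite database (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (year INTEGER, city TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    return conn


def reference_answer(conn, city):
    """Run the real SQL and join the results into one target string.

    The ", " separator is an assumption, not a documented TAPEX format.
    """
    rows = conn.execute(
        "SELECT year FROM t WHERE city = ? ORDER BY year", (city,)
    ).fetchall()
    return ", ".join(str(r[0]) for r in rows)


def synthesize_example(conn, rng):
    """Fill one fixed template with a value sampled from the table."""
    cities = [r[0] for r in conn.execute("SELECT DISTINCT city FROM t ORDER BY city")]
    city = rng.choice(cities)
    # Query text in the simplified, uncased style used elsewhere on this page.
    query = f"select year where city = {city}"
    return query, reference_answer(conn, city)


conn = build_table(ROWS)
rng = random.Random(0)
for _ in range(3):
    print(synthesize_example(conn, rng))
```

Each printed pair is what a pre-training step would consume: the query (together with the table) as input and the executor's result as the generation target.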

TAPEX is based on the BART architecture, a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder.

💻 Usage Examples

Basic Usage

Here is how to use this model in transformers:

from transformers import TapexTokenizer, BartForConditionalGeneration
import pandas as pd

tokenizer = TapexTokenizer.from_pretrained("microsoft/tapex-large-sql-execution")
model = BartForConditionalGeneration.from_pretrained("microsoft/tapex-large-sql-execution")

data = {
    "year": [1896, 1900, 1904, 2004, 2008, 2012],
    "city": ["athens", "paris", "st. louis", "athens", "beijing", "london"]
}
table = pd.DataFrame.from_dict(data)

# tapex accepts uncased input since it is pre-trained on the uncased corpus
query = "select year where city = beijing"
encoding = tokenizer(table=table, query=query, return_tensors="pt")

outputs = model.generate(**encoding)

print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
# ['2008']
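Because BART consumes a flat token sequence, the tokenizer has to turn the query and the table into a single string before encoding. The sketch below approximates that flattening in plain Python so the input format is easier to picture. The function names are hypothetical, and the exact layout (`col : ... row 1 : ...`, lowercasing, query placed before the table) is an assumption about what `TapexTokenizer` does internally; the real tokenizer also handles details such as truncating large tables, which are omitted here.

```python
def flatten_table(columns, rows):
    """Flatten a table into one string: a header segment followed by indexed rows."""
    header = "col : " + " | ".join(str(c).lower() for c in columns)
    body = " ".join(
        f"row {i} : " + " | ".join(str(v).lower() for v in row)
        for i, row in enumerate(rows, start=1)
    )
    return f"{header} {body}".strip()


def build_model_input(query, columns, rows):
    """Concatenate the lowercased query with the flattened table."""
    return f"{query.lower()} {flatten_table(columns, rows)}"


columns = ["year", "city"]
rows = [[2008, "beijing"], [2012, "london"]]
print(build_model_input("select year where city = beijing", columns, rows))
# select year where city = beijing col : year | city row 1 : 2008 | beijing row 2 : 2012 | london
```

Seen this way, SQL execution becomes an ordinary text-to-text problem: the encoder reads the query and every cell, and the decoder generates the result string.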

Advanced Usage

⚠️ This model checkpoint is ONLY for simulating neural SQL execution (i.e., using TAPEX to execute a SQL query on a given table); you CANNOT fine-tune it on downstream tasks. The checkpoint intended for fine-tuning is [microsoft/tapex-large](https://huggingface.co/microsoft/tapex-large).

The two checkpoints are kept separate because of a known issue in BART large; see [this comment](https://github.com/huggingface/transformers/issues/15559#issuecomment-1062880564) for details.

📚 Documentation

You can use the raw model to simulate neural SQL execution, i.e., to have TAPEX execute a SQL query on a given table. TAPEX as an approach, however, is mostly meant to be fine-tuned on a supervised dataset. Currently it can be fine-tuned for table question answering and table fact verification. See the model hub for versions fine-tuned on a task that interests you.

📄 License

This project is licensed under the MIT license.

BibTeX entry and citation info

@inproceedings{
    liu2022tapex,
    title={{TAPEX}: Table Pre-training via Learning a Neural {SQL} Executor},
    author={Qian Liu and Bei Chen and Jiaqi Guo and Morteza Ziyadi and Zeqi Lin and Weizhu Chen and Jian-Guang Lou},
    booktitle={International Conference on Learning Representations},
    year={2022},
    url={https://openreview.net/forum?id=O50443AsCP}
}
