đ TableLlama: Towards Open Large Generalist Models for Tables
TableLlama is an open-source large generalist model designed for a variety of table-based tasks. It is trained on a carefully curated instruction-tuning dataset for tables and can handle context lengths of up to 8K tokens.
đ Quick Start
You can use the TableLlama models through Hugging Face's Transformers library. For more advanced usage, check our GitHub repo: https://osu-nlp-group.github.io/TableLlama/
⨠Features
- Tailored for Tables: Specifically designed to handle various table-based tasks.
- Large Context Handling: Can handle up to 8K context.
- Trained on Comprehensive Data: Trained on the 🤗 TableInstruct dataset, which covers a variety of real-world tables and realistic tasks.
đĻ Installation
No dedicated installation steps are required: the models load through the Hugging Face Transformers library, so installing `transformers` and `torch` (e.g., via pip) is sufficient.
đģ Usage Examples
Basic Usage
You can use the models through Hugging Face's Transformers library.
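A minimal loading sketch (not from the original card; the checkpoint ID `osunlp/TableLlama` and the generation settings are assumptions, so verify the exact repository name on the model page):

```python
# Minimal sketch: load TableLlama with Hugging Face Transformers and generate a response.
# The checkpoint ID "osunlp/TableLlama" is an assumption -- verify it on the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "osunlp/TableLlama"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "..."  # build this string with the prompt format shown below
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```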
Advanced Usage
Check our GitHub repo for more advanced usage: https://osu-nlp-group.github.io/TableLlama/
Prompt Format
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that
appropriately completes the request.
### Instruction:
{instruction}
### Input:
{input}
### Question:
{question}
### Response:
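For convenience, the template above can be filled programmatically. A minimal sketch (the `build_prompt` helper is ours, not part of the released code):

```python
# Sketch of filling the TableLlama prompt template; the helper name is illustrative.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input that provides "
    "further context. Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Question:\n{question}\n\n"
    "### Response:\n"
)

def build_prompt(instruction: str, table_input: str, question: str) -> str:
    """Combine a task description, a serialized table, and a question into one prompt."""
    return PROMPT_TEMPLATE.format(
        instruction=instruction, input=table_input, question=question
    )
```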
đ Documentation
Project Page
https://osu-nlp-group.github.io/TableLlama/
Paper
https://arxiv.org/abs/2311.09206
Dataset
https://huggingface.co/datasets/osunlp/TableInstruct/
Model
TableLlama-7B
Data
The models are trained on the 🤗 TableInstruct dataset, a comprehensive table-based instruction-tuning dataset covering a variety of real-world tables and realistic tasks. It includes 14 datasets spanning 11 tasks in total. Check out the dataset card for more details.
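A hedged sketch for inspecting the data with the 🤗 `datasets` library (whether the repository loads directly like this, or requires pointing `load_dataset` at specific data files, depends on how it is packaged; see the dataset card):

```python
# Sketch: load TableInstruct with the Hugging Face datasets library.
# The exact loading call depends on how the repo is packaged (it may require
# specifying data_files), and the "train" split name is an assumption.
from datasets import load_dataset

table_instruct = load_dataset("osunlp/TableInstruct")
print(table_instruct)               # available splits and their sizes
print(table_instruct["train"][0])   # one instruction/input/question/response example
```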
Training Procedure
The models are fine-tuned on the TableInstruct dataset, using the fully fine-tuned LongLoRA (7B) model as the base model; LongLoRA replaces the vanilla attention mechanism of the original Llama-2 (7B) with shift short attention. Training takes 9 days on a cluster of 48 A100 (80 GB) GPUs. Check out our paper for more details.
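For intuition only, here is a toy sketch of the shift short attention (S²-Attn) idea from LongLoRA, not the authors' implementation: attention is computed within fixed-size groups of tokens, and half of the attention heads are shifted by half a group so information can flow across group boundaries. Causal masking and other training details are omitted.

```python
# Toy illustration of shift short attention (S2-Attn); not the released implementation.
import torch
import torch.nn.functional as F

def shift_short_attention(q, k, v, group_size):
    # q, k, v: (batch, heads, seq_len, head_dim); seq_len must be divisible by group_size.
    b, h, n, d = q.shape
    half = h // 2
    shift = group_size // 2

    # Shift the second half of the heads by half a group along the sequence dimension.
    q = torch.cat([q[:, :half], q[:, half:].roll(-shift, dims=2)], dim=1)
    k = torch.cat([k[:, :half], k[:, half:].roll(-shift, dims=2)], dim=1)
    v = torch.cat([v[:, :half], v[:, half:].roll(-shift, dims=2)], dim=1)

    # Reshape so attention is computed independently within each group of tokens.
    g = n // group_size
    q = q.reshape(b, h, g, group_size, d)
    k = k.reshape(b, h, g, group_size, d)
    v = v.reshape(b, h, g, group_size, d)
    out = F.scaled_dot_product_attention(q, k, v)  # (b, h, g, group_size, d)

    # Merge groups back and undo the shift for the second half of the heads.
    out = out.reshape(b, h, n, d)
    out = torch.cat([out[:, :half], out[:, half:].roll(shift, dims=2)], dim=1)
    return out

if __name__ == "__main__":
    q = k = v = torch.randn(1, 8, 64, 16)  # toy tensors: batch=1, heads=8, seq=64, dim=16
    print(shift_short_attention(q, k, v, group_size=16).shape)  # torch.Size([1, 8, 64, 16])
```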
Evaluation
The models are evaluated on 8 in-domain datasets covering 8 tasks and 6 out-of-domain datasets covering 4 tasks.
đ License
The project is released under the CC-BY-4.0 license.
Limitations
We have tried our best to build table generalist models. However, we acknowledge that performance may vary with the complexity and specifics of individual table tasks and datasets, and that not all table-based tasks can be covered comprehensively.
Citation
If you use the models, data, or code from this project, please cite the original paper:
@misc{zhang2023tablellama,
title={TableLlama: Towards Open Large Generalist Models for Tables},
author={Tianshu Zhang and Xiang Yue and Yifei Li and Huan Sun},
year={2023},
eprint={2311.09206},
archivePrefix={arXiv},
primaryClass={cs.CL}
}