🚀 Tiny Time Mixer (TTM) Research-Use Model Card
TinyTimeMixers (TTMs) are compact pre-trained models for Multivariate Time-Series Forecasting, open-sourced by IBM Research. With model sizes starting from 1M parameters, TTM (accepted at NeurIPS 2024) introduces the concept of the first-ever “tiny” pre-trained models for Time-Series Forecasting. This model offers high-performance forecasting with minimal computational resources.
This model card contains the model weights for research use only, enabling full reproducibility of the results published in our paper. If you are looking for TTM model weights for commercial and enterprise use, please refer to our Granite releases [here](https://huggingface.co/ibm-granite/granite-timeseries-ttm-r2).
TTM outperforms several popular benchmarks demanding billions of parameters in zero-shot and few-shot forecasting. It is a lightweight forecaster, pre-trained on publicly available time-series data with various augmentations. TTM provides state-of-the-art zero-shot forecasts and remains competitive when fine-tuned for multivariate forecasts with just 5% of the training data. Refer to our paper for more details.
The current open-source version supports point forecasting use cases ranging from minutely to hourly resolutions (e.g., 10 min, 15 min, 1 hour). Note that zero-shot, fine-tuning, and inference tasks with TTM can easily be executed on a single-GPU machine or even on a laptop!
🚀 Quick Start
To get started with TTM, you can follow the usage examples below. First, make sure you understand the model's capabilities and limitations.
✨ Features
- Focused Pre-trained Models: Each pre-trained TTM is tailored for a particular forecasting setting (governed by the context length and forecast length), resulting in more accurate results.
- Lightweight and Fast: With extremely small model sizes and high speed, TTM can be easily deployed without demanding a large amount of resources.
- State-of-the-Art Performance: Outperforms popular benchmarks in zero/few-shot forecasting while significantly reducing computational requirements.
- Multivariate Forecasting: Supports both channel-independence and channel-mixing approaches, as well as exogenous and categorical data infusion.
📦 Installation
TTM is provided through the tsfm_public toolkit from the [granite-tsfm](https://github.com/ibm-granite/granite-tsfm) repository; install that package before running the examples below.
💻 Usage Examples
Basic Usage
from transformers import Trainer
from tsfm_public.models.tinytimemixer import TinyTimeMixerForPrediction

# Load the model from the HF Model Hub, selecting the release branch via the revision field
model = TinyTimeMixerForPrediction.from_pretrained(
    "ibm/TTM", revision="main"
)

# Zero-shot evaluation: apply the pre-trained model directly to the test set.
# zeroshot_forecast_args is a transformers TrainingArguments instance and
# dset_test is a torch-style forecasting dataset prepared by the user.
zeroshot_trainer = Trainer(
    model=model,
    args=zeroshot_forecast_args,
)
zeroshot_output = zeroshot_trainer.evaluate(dset_test)
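The zeroshot_forecast_args and dset_test objects above are prepared by the user. As a rough sketch (hypothetical values, not the exact benchmark settings), the evaluation arguments could be built with the standard transformers TrainingArguments:

from transformers import TrainingArguments

# Hypothetical evaluation settings; adjust output_dir and batch size for your setup.
zeroshot_forecast_args = TrainingArguments(
    output_dir="./ttm_zeroshot",
    per_device_eval_batch_size=64,
    report_to="none",
)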
Advanced Usage
# Freeze the backbone and enable few-shot or full fine-tuning
for param in model.backbone.parameters():
    param.requires_grad = False

# Fine-tune the prediction head on a subset of the target data
finetune_forecast_trainer = Trainer(
    model=model,
    args=finetune_forecast_args,
    train_dataset=dset_train,
    eval_dataset=dset_val,
    callbacks=[early_stopping_callback, tracking_callback],
    optimizers=(optimizer, scheduler),
)
finetune_forecast_trainer.train()
fewshot_output = finetune_forecast_trainer.evaluate(dset_test)
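The training arguments, callbacks, optimizer, and scheduler referenced above are user-defined; tracking_callback is a toolkit-specific logging callback and is omitted here. One possible setup, as an illustrative sketch rather than the exact settings used in our benchmarks, is:

import torch
from transformers import TrainingArguments, EarlyStoppingCallback

# Illustrative settings only; tune these for your dataset.
finetune_forecast_args = TrainingArguments(
    output_dir="./ttm_finetune",
    num_train_epochs=50,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    eval_strategy="epoch",      # use evaluation_strategy on older transformers releases
    save_strategy="epoch",
    load_best_model_at_end=True,
    report_to="none",
)
early_stopping_callback = EarlyStoppingCallback(early_stopping_patience=10)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=1e-3, epochs=50, steps_per_epoch=len(dset_train) // 64 + 1
)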
📚 Documentation
Model Description
TTM falls under the category of “focused pre-trained models”. Instead of building one massive model supporting all forecasting settings, we construct smaller pre-trained models, each focusing on a specific forecasting setting. This approach ensures that our models remain extremely small and fast, facilitating easy deployment.
In this model card, we plan to release several pre-trained TTMs for common forecasting settings. We have also released our source code and pretraining scripts so that users can pretrain models on their own data. Pretraining TTMs is very easy and fast, taking less than a day compared to several days or weeks with traditional approaches.
Model Releases
- 512-96-ft-r2: Given the last 512 time-points (context length), this model can forecast up to the next 96 time-points (forecast length) in the future. (branch name: main)
- 1024-96-ft-r2: Given the last 1024 time-points (context length), this model can forecast up to the next 96 time-points (forecast length) in the future. (branch name: 1024-96-ft-r2) (see the Benchmarks section below)
- 1536-96-ft-r2: Given the last 1536 time-points (context length), this model can forecast up to the next 96 time-points (forecast length) in the future. (branch name: 1536-96-ft-r2)
- There are also models released for forecast lengths up to 720 time-points. The branch names for these are: 512-192-ft-r2, 1024-192-ft-r2, 1536-192-ft-r2, 512-336-r2, 512-336-ft-r2, 1024-336-ft-r2, 1536-336-ft-r2, 512-720-ft-r2, 1024-720-ft-r2, and 1536-720-ft-r2.
- Use the [get_model](https://github.com/ibm-granite/granite-tsfm/blob/main/tsfm_public/toolkit/get_model.py) utility to automatically select the required model based on your input context length and forecast length requirement, as sketched after this list.
- Currently, 3 context lengths (512, 1024, and 1536) and 4 forecast lengths (96, 192, 336, 720) are supported. Users must provide one of the 3 allowed context lengths as input, but can request any forecast length up to 720 in get_model() to obtain the required model.
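As a rough illustration, assuming get_model accepts the model path along with context_length and prediction_length keyword arguments (check the linked source for the exact signature):

from tsfm_public.toolkit.get_model import get_model

# Assumed keyword names; verify against the get_model source linked above.
model = get_model(
    "ibm/TTM",              # research-use TTM repository on the HF Hub
    context_length=512,     # one of 512, 1024, or 1536
    prediction_length=48,   # any value up to 720; an appropriate released model is selected
)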
Benchmarks
TTM outperforms popular benchmarks such as TimesFM, Moirai, Chronos, Lag-Llama, Moment, GPT4TS, TimeLLM, and LLMTime in zero/few-shot forecasting while significantly reducing computational requirements. Moreover, TTMs are lightweight and can be executed even on CPU-only machines, enhancing usability and fostering wider adoption in resource-constrained environments.
- TTM-B referred to in the paper maps to the 512-context models.
- TTM-E referred to in the paper maps to the 1024-context models.
- TTM-A referred to in the paper maps to the 1536-context models.
Note that the Granite TTM models are pre-trained exclusively on datasets with clear commercial-use licenses approved by our legal team. As a result, the pre-training dataset used in this release differs slightly from the one used in the research paper, which may lead to minor variations in model performance compared to the published results.
Benchmarking Scripts: [here](https://github.com/ibm-granite/granite-tsfm/blob/main/notebooks/hfdemo/tinytimemixer/full_benchmarking/research-use-r2.sh)
Recommended Use
- Data Scaling: Users must externally standard-scale their data independently for every channel before feeding it to the model; see the sketch after this list. Refer to TSP (the TimeSeriesPreprocessor utility in tsfm_public) for data scaling.
- Resolution Support: The current open-source version supports only minutely and hourly resolutions (e.g., 10 min, 15 min, 1 hour). Lower resolutions (e.g., weekly or monthly) are currently not supported, as the model needs a minimum context length of 512 or 1024.
- Context Length: Upsampling or prepending zeros to virtually increase the context length for shorter datasets is not recommended, as it will impact model performance.
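A minimal sketch of per-channel standard scaling with scikit-learn (df and train_end are hypothetical names; fit the scaler on the training split only):

import pandas as pd
from sklearn.preprocessing import StandardScaler

# df holds one column per channel; train_end marks the end of the training split.
scaler = StandardScaler()
scaler.fit(df.iloc[:train_end])  # per-column mean/std computed from training data only
df_scaled = pd.DataFrame(
    scaler.transform(df), columns=df.columns, index=df.index
)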
Model Details
For more details on TTM architecture and benchmarks, refer to our paper.
TTM-1 currently supports 2 modes:
- Zeroshot forecasting: Directly apply the pre-trained model to your target data to get an initial forecast (with no training).
- Finetuned forecasting: Finetune the pre-trained model with a subset of your target data to further improve the forecast.
Since TTM models are extremely small and fast, it is very easy to finetune the model with your available target data in a few minutes to get more accurate forecasts.
The current release supports multivariate forecasting via both channel-independence and channel-mixing approaches. Decoder channel mixing can be enabled during fine-tuning to capture strong channel-correlation patterns across time-series variates, a critical capability lacking in existing counterparts.
In addition, TTM also supports exogenous infusion and categorical data infusion; a sketch of how these options might be enabled follows below.
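As an illustrative sketch only: the decoder_mode, prediction_channel_indices, and exogenous_channel_indices arguments below follow the fine-tuning examples in the granite-tsfm repository, but treat the exact names as assumptions and check the TinyTimeMixer configuration before use.

from tsfm_public.models.tinytimemixer import TinyTimeMixerForPrediction

# Assumed configuration overrides; verify against the TinyTimeMixer config in tsfm_public.
model = TinyTimeMixerForPrediction.from_pretrained(
    "ibm/TTM",
    revision="main",
    num_input_channels=4,               # total channels in the dataset (targets + exogenous)
    prediction_channel_indices=[0, 1],  # target channels to forecast
    exogenous_channel_indices=[2, 3],   # exogenous channels whose future values are known
    decoder_mode="mix_channel",         # enable decoder channel mixing during fine-tuning
)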
Model Sources
- Repository: https://github.com/ibm-granite/granite-tsfm/tree/main/tsfm_public/models/tinytimemixer
- Paper: https://arxiv.org/pdf/2401.03955.pdf
Blogs and articles on TTM
Refer to our [wiki](https://github.com/ibm-granite/granite-tsfm/wiki).
🔧 Technical Details
The technical details are mainly covered in the paper. TTM's architecture and pre-training methods are designed to achieve high performance with small model sizes.
📄 License
The model is released under the cc-by-nc-sa-4.0 license.
📖 Citation
Kindly cite the following paper if you intend to use our model or its associated architectures/approaches in your work:
@inproceedings{ekambaram2024tinytimemixersttms,
  title={Tiny Time Mixers (TTMs): Fast Pre-trained Models for Enhanced Zero/Few-Shot Forecasting of Multivariate Time Series},
  author={Vijay Ekambaram and Arindam Jati and Pankaj Dayama and Sumanta Mukherjee and Nam H. Nguyen and Wesley M. Gifford and Chandra Reddy and Jayant Kalagnanam},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS 2024)},
  year={2024},
}
Model Card Authors
Vijay Ekambaram, Arindam Jati, Pankaj Dayama, Wesley M. Gifford, Sumanta Mukherjee, Chandra Reddy and Jayant Kalagnanam
IBM Public Repository Disclosure
All content in this repository, including code, has been provided by IBM under the associated open source software license, and IBM is under no obligation to provide enhancements, updates, or support. IBM developers produced this code as an open source project (not as an IBM product), and IBM makes no assertions as to the level of quality or security and will not be maintaining this code going forward.


