Sundial
Sundial is a family of generative time series foundation models capable of zero-shot predictions for both point and probabilistic forecasting.
Quick Start
pip install transformers==4.40.1 # Use this version and Python 3.10 for stable compatibility
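import torch
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained('thuml/sundial-base-128m', trust_remote_code=True)  # load the pretrained model
batch_size, lookback_length = 1, 2880  # lookback matches the 2880-point context length
seqs = torch.randn(batch_size, lookback_length)  # random input standing in for real data
forecast_length = 96
num_samples = 20  # number of sampled forecasts per input series
output = model.generate(seqs, max_new_tokens=forecast_length, num_samples=num_samples)  # generate multiple plausible forecasts
print(output.shape)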
More examples for predicting quantiles or confidence intervals are provided in this notebook.
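As a rough illustration of how raw samples could be reduced to quantiles or a confidence interval, the sketch below works on plain Python lists. It assumes the generated output for one series can be read as `num_samples` forecasts of length `forecast_length` (for example via `output[0].tolist()`); that shape is an assumption, so check `output.shape` first. The helper names `quantile` and `summarize_samples` are hypothetical and not part of the Sundial code.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an ascending list, for 0 <= q <= 1."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def summarize_samples(samples, lower_q=0.1, upper_q=0.9):
    """Per-step median and [lower_q, upper_q] interval across sampled forecasts.

    `samples` is a list of num_samples forecasts, each of length forecast_length.
    """
    summary = []
    for step_values in zip(*samples):  # one tuple of sampled values per forecast step
        ordered = sorted(step_values)
        summary.append({
            "lower": quantile(ordered, lower_q),
            "median": quantile(ordered, 0.5),
            "upper": quantile(ordered, upper_q),
        })
    return summary


# Synthetic stand-in for one series: 5 sampled forecasts over 2 steps.
# With the real model this might be `output[0].tolist()` (shape is an assumption).
samples = [[0.0, 10.0], [1.0, 11.0], [2.0, 12.0], [3.0, 13.0], [4.0, 14.0]]
summary = summarize_samples(samples)
print(summary[0]["median"], summary[1]["median"])
```

The median can serve as a point forecast, while the lower and upper quantiles give an 80% interval under the default arguments.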
⨠Features
- Generative Time Series Modeling: Sundial is a family of generative time series foundation models that can make zero - shot predictions for both point and probabilistic forecasting.
- Benchmark Performance:
- In May 2025, it got 1st MASE on the [GIFT - Eval](https://huggingface.co/spaces/Salesforce/GIFT - Eval) Benchmark.
- In May 2025, Sundial was accepted as ICML 2025 Spotlight (Top 2.6%).
- In February 2025, it achieved 1st MSE/MAE zero - shot performance on [Time - Series - Library](https://github.com/thuml/Time - Series - Library) datasets.
Documentation
Overall Architecture
The input time series is divided into patch tokens, which are embedded from the original continuous values. The patch embeddings are fed into a decoder-only Transformer, a stabilized and accelerated variant that learns token representations. The model is optimized with our TimeFlow Loss, a parameterized loss function that models the per-token probability distribution conditioned on the learned representations and generates multiple plausible predictions under the flow-matching framework.
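The patching step described above can be sketched as follows. This is a minimal illustration, assuming non-overlapping patches and a series length that is a multiple of the patch length; how the actual model pads or truncates other lengths is not covered here. The function name `patchify` is hypothetical.

```python
def patchify(series, patch_length=16):
    """Split a series into consecutive, non-overlapping patches.

    Assumption for this sketch: the length must be a multiple of patch_length.
    """
    if len(series) % patch_length != 0:
        raise ValueError("series length must be a multiple of patch_length")
    return [series[i:i + patch_length] for i in range(0, len(series), patch_length)]


# A 2880-point lookback window becomes 180 patch tokens of 16 values each.
lookback = [float(t) for t in range(2880)]
patches = patchify(lookback)
print(len(patches), len(patches[0]))
```

Each patch would then be embedded into one token, so the Transformer attends over 180 tokens rather than 2880 individual time points.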
Model View
Sundial can be viewed as an ARMA model (Auto-Regression and Moving-Average). The Transformer learns autoregressive token representations. Conditioned on them, TimeFlow transforms random noise into non-deterministic predictions.
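To give a feel for how noise can be turned into varied predictions under flow matching, here is a toy one-dimensional sketch. It is not Sundial's implementation: the closed-form `toy_velocity` stands in for the learned, representation-conditioned network, and the target distribution (a Gaussian with mean `mu` and standard deviation `sigma`) is an assumption chosen so the velocity field is known exactly.

```python
import random


def toy_velocity(x, t, mu, sigma):
    """Velocity along the straight path x_t = (1 - t) * x0 + t * (mu + sigma * x0).

    Stand-in for a learned network; requires sigma > 0.
    """
    x0 = (x - t * mu) / (1.0 - t + t * sigma)  # recover the starting noise
    return mu + (sigma - 1.0) * x0


def sample_by_flow(x0, mu, sigma, num_steps=8):
    """Euler-integrate the velocity field from t = 0 (noise) to t = 1 (sample)."""
    x, dt = x0, 1.0 / num_steps
    for k in range(num_steps):
        x += dt * toy_velocity(x, k * dt, mu, sigma)
    return x


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(5)]
# Different noise draws yield different predictions for the same condition.
samples = [sample_by_flow(z, mu=5.0, sigma=2.0) for z in noise]
print(samples)
```

In this toy setting each noise value `z` is carried to `mu + sigma * z`, so repeated draws trace out the target distribution. In the model, the condition would come from the Transformer's token representation rather than fixed `mu` and `sigma`.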
Evaluation
We evaluate performance on the following benchmarks:
- [GIFT-Eval (1st MASE)](https://cdn-uploads.huggingface.co/production/uploads/64fbe24a2d20ced4e91de38a/3BxatwayhK5GAoqMf1oHv.png) [[Leaderboard]](https://huggingface.co/spaces/Salesforce/GIFT-Eval).
- [Time-Series-Library (1st MSE/MAE)](https://cdn-uploads.huggingface.co/production/uploads/64fbe24a2d20ced4e91de38a/5VqnFwWTWoYz877Zkluiw.png).
- [FEV Leaderboard](https://cdn-uploads.huggingface.co/production/uploads/64fbe24a2d20ced4e91de38a/mrKL9QmX-aX8rCiwxKgmA.png).
Inference Time
- Hardware: Apple M1 Pro CPU (16 GB)
| Lookback Length | Prediction Length | # Generated Samples | Inference Time | Accelerate By |
| --- | --- | --- | --- | --- |
| 672 | 16 | 1 | 249ms | - |
| 2880 | 16 | 1 | 510ms | FlashAttention |
| 2880 | 720 | 1 | 510ms | Multi-Patch Prediction |
| 2880 | 1440 | 1 | 789ms | KV Cache |
| 2880 | 720 | 20 | 949ms | Shared Condition |
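To collect comparable latency figures on your own hardware, a small timing harness along these lines may help. The measurement protocol behind the numbers above is not stated, so the warm-up run, the repeat count, and the use of the median are all assumptions, and `time_call` is a hypothetical helper.

```python
import time
from statistics import median


def time_call(fn, warmup=1, repeats=5):
    """Return the median wall-clock time of fn() in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up runs are discarded
    timings_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return median(timings_ms)


# Stand-in workload. With the model loaded, you might instead pass
# lambda: model.generate(seqs, max_new_tokens=16, num_samples=1)
latency_ms = time_call(lambda: sum(range(10_000)))
print(f"{latency_ms:.3f} ms")
```

Results will vary with hardware, library versions, and input sizes, so they are not expected to match the table exactly.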
Specification
| Property | Details |
| --- | --- |
| Architecture | Causal Transformer (Decoder-only) |
| Pre-training Scale | 1032B time points |
| Context Length | up to 2880 |
| ReNorm | Default = True |
| Patch Length | 16 |
| Multi-Patch Prediction Length | 720 |
| Parameter Count | 128M |
| Number of Layers | 12 |
| Precision | FP32 |
| Speedup | With KV Cache & FlashAttention & Shared Condition |
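A few derived figures follow from the specification by simple arithmetic, sketched below. The class `SundialSpec` is hypothetical, "128M" is treated as exactly 128,000,000 parameters, the memory figure covers weights only, and the step count assumes each generation step emits up to 720 points; treat all of these as assumptions rather than documented behavior.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SundialSpec:
    context_length: int = 2880
    patch_length: int = 16
    multi_patch_length: int = 720
    parameter_count: int = 128_000_000  # "128M", taken as an exact count here
    bytes_per_parameter: int = 4        # FP32

    def context_tokens(self):
        """Number of patch tokens in a full context window."""
        return self.context_length // self.patch_length

    def weight_megabytes(self):
        """Approximate size of the weights alone, in MB (10**6 bytes)."""
        return self.parameter_count * self.bytes_per_parameter / 1e6

    def generation_steps(self, horizon):
        """Steps needed for a horizon, assuming up to 720 points per step."""
        return math.ceil(horizon / self.multi_patch_length)


spec = SundialSpec()
print(spec.context_tokens(), spec.weight_megabytes(), spec.generation_steps(1440))
```

Under these assumptions a full context is 180 tokens, the FP32 weights take roughly 512 MB, and a 1440-point horizon needs two generation steps.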
Acknowledgments
This work was supported by the National Natural Science Foundation of China (62022050 and U2342217), the BNRist Innovation Fund (BNR2024RC01010), and the National Engineering Research Center for Big Data Software.
The model is trained mostly on publicly available time series datasets from the Internet, contributed by different research teams and providers. We sincerely thank all individuals and organizations who have contributed data. Without their generous sharing, this model would not exist.
Citation
@article{liu2025sundial,
title={Sundial: A Family of Highly Capable Time Series Foundation Models},
author={Liu, Yong and Qin, Guo and Shi, Zhiyuan and Chen, Zhi and Yang, Caiyin and Huang, Xiangdong and Wang, Jianmin and Long, Mingsheng},
journal={arXiv preprint arXiv:2502.00816},
year={2025}
}
Contact
If you have any questions or want to use the code, feel free to contact:
- Yong Liu (liuyong21@mails.tsinghua.edu.cn)
- Guo Qin (qinguo24@mails.tsinghua.edu.cn)
License
This model is licensed under the Apache-2.0 License.