# Toto-Open-Base-1.0
Toto (Time Series Optimized Transformer for Observability) is a time-series foundation model designed for multi-variate time series forecasting, with a focus on observability metrics. It efficiently handles the high-dimensional, sparse, and non-stationary data common in observability scenarios.
*Overview of the Toto-Open-Base-1.0 architecture.*
## Quick Start

### Installation
```bash
git clone https://github.com/DataDog/toto.git
cd toto
pip install -r requirements.txt
```
### Usage Examples

#### Basic Usage
Inference code is available on GitHub. Here's how to quickly generate forecasts using Toto:
```python
import torch

from data.util.dataset import MaskedTimeseries
from inference.forecaster import TotoForecaster
from model.toto import Toto

DEVICE = 'cuda'

# Load the pretrained checkpoint from the Hugging Face Hub
toto = Toto.from_pretrained('Datadog/Toto-Open-Base-1.0').to(DEVICE)

# Optional: compile the model for faster inference
toto.compile()

forecaster = TotoForecaster(toto.model)

# Example input: 7 variates, 4096 timesteps (random data for illustration)
input_series = torch.randn(7, 4096).to(DEVICE)

# Timestamp metadata (dummy values in this example; 15-minute intervals)
timestamp_seconds = torch.zeros(7, 4096).to(DEVICE)
time_interval_seconds = torch.full((7,), 60 * 15).to(DEVICE)

inputs = MaskedTimeseries(
    series=input_series,
    padding_mask=torch.full_like(input_series, True, dtype=torch.bool),
    id_mask=torch.zeros_like(input_series),
    timestamp_seconds=timestamp_seconds,
    time_interval_seconds=time_interval_seconds,
)

# Generate probabilistic forecasts for the next 336 timesteps
forecast = forecaster.forecast(
    inputs,
    prediction_length=336,
    num_samples=256,
    samples_per_batch=256,
)

# Point estimates, raw samples, and quantiles
mean_prediction = forecast.mean
prediction_samples = forecast.samples
lower_quantile = forecast.quantile(0.1)
upper_quantile = forecast.quantile(0.9)
```
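The quantile outputs make it easy to visualize an uncertainty band around the point forecast. The sketch below assumes the forecast tensors have shape `(variates, prediction_length)`; check the actual shapes in your environment before plotting:

```python
import matplotlib.pyplot as plt

# Sketch: plot the mean forecast for the first variate with a 10%-90%
# quantile band. Assumes tensors of shape (variates, prediction_length).
variate = 0
steps = range(mean_prediction.shape[-1])

plt.plot(steps, mean_prediction[variate].detach().float().cpu(), label='Mean forecast')
plt.fill_between(
    steps,
    lower_quantile[variate].detach().float().cpu(),
    upper_quantile[variate].detach().float().cpu(),
    alpha=0.3,
    label='10%-90% quantile band',
)
plt.xlabel('Forecast step')
plt.legend()
plt.show()
```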
For detailed inference instructions, refer to the inference tutorial notebook.
#### Advanced Usage

> ⚠️ **Important Note:** For optimal speed and reduced memory usage, install xFormers and [flash-attention](https://github.com/Dao-AILab/flash-attention), then set `use_memory_efficient` to `True`.
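The repository defines exactly where `use_memory_efficient` is set; the snippet below is only a hypothetical sketch of toggling it on a loaded model, so consult the inference code for the real entry point:

```python
# Hypothetical sketch: the attribute location is an assumption, not the
# repository's confirmed API. Requires xFormers and flash-attention installed.
toto = Toto.from_pretrained('Datadog/Toto-Open-Base-1.0').to('cuda')
toto.model.use_memory_efficient = True  # assumed attribute; see the Toto repo

forecaster = TotoForecaster(toto.model)
```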
## Features
- Zero-Shot Forecasting
- Multi-Variate Support
- Decoder-Only Transformer Architecture
- Probabilistic Predictions (Student-T mixture model)
- Causal Patch-Wise Instance Normalization (see the sketch below)
- Extensive Pretraining on Large-Scale Data
- High-Dimensional Time Series Support
- Tailored for Observability Metrics
- State-of-the-Art Performance on [GiftEval](https://huggingface.co/spaces/Salesforce/GIFT-Eval) and BOOM
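For intuition on causal patch-wise instance normalization, here is a simplified sketch: each patch is standardized using mean and variance accumulated only over current and past patches, so the normalization statistics never leak information from the future. This illustrates the general idea rather than Toto's actual implementation:

```python
import torch

def causal_patch_instance_norm(x: torch.Tensor, patch_size: int, eps: float = 1e-5) -> torch.Tensor:
    """Simplified causal patch-wise instance normalization (illustrative only).

    x: tensor of shape (variates, timesteps), timesteps divisible by patch_size.
    Each patch is standardized with mean/variance computed over all patches
    up to and including itself, never future ones.
    """
    variates, timesteps = x.shape
    patches = x.reshape(variates, timesteps // patch_size, patch_size)

    # Cumulative (causal) sums across the patch dimension
    csum = patches.sum(dim=-1).cumsum(dim=1)
    csum_sq = (patches ** 2).sum(dim=-1).cumsum(dim=1)
    counts = torch.arange(1, patches.shape[1] + 1, device=x.device) * patch_size

    mean = csum / counts                      # causal mean up to each patch
    var = csum_sq / counts - mean ** 2        # causal variance up to each patch

    normed = (patches - mean.unsqueeze(-1)) / torch.sqrt(var.unsqueeze(-1) + eps)
    return normed.reshape(variates, timesteps)

# Example: normalize a (7, 4096)-shaped series with patch size 64
x = torch.randn(7, 4096)
out = causal_patch_instance_norm(x, patch_size=64)
```

Because the statistics are causal, this kind of normalization can be applied during autoregressive decoding without peeking at future values.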
## Documentation

### Model Information
| Property | Details |
|----------|---------|
| Model Type | Time-Series Foundation Model |
| Training Data | - Observability Metrics: ~1 trillion points from Datadog internal systems (no customer data)<br>- Public Datasets: GiftEval Pretrain, Chronos datasets<br>- Synthetic Data: ~1/3 of training data |
| Available Checkpoints | [Toto-Open-Base-1.0](https://huggingface.co/Datadog/Toto-Open-Base-1.0/blob/main/model.safetensors) (151M parameters, 605 MB), with [config](https://huggingface.co/Datadog/Toto-Open-Base-1.0/blob/main/config.json); initial release with SOTA performance |
## Additional Resources

## License
The model is licensed under the Apache 2.0 license.
## Citation
If you use Toto in your research or applications, please cite us using the following:
```bibtex
@misc{toto2025,
  title={This Time is Different: An Observability Perspective on Time Series Foundation Models},
  author={TODO},
  year={2025},
  eprint={arXiv:TODO},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
```