Daily cycle and nondaily cycle data

The graph below shows the production of a typical solar farm during one day. The goal of this use case is to forecast this production. To forecast the production at 3:00, the model production(t)=0 probably produces good results; the sun simply does not shine at night. However, this model would obviously fail to accurately forecast production at 12:00. The required model complexity therefore differs considerably depending on the time of day being forecast.


Datasets like this are called "daily cycle" datasets, as opposed to "nondaily cycle" datasets. The difference between the two types is easy to spot in the images below.

Daily cycle


Nondaily cycle


How TIM detects a daily cycle

First of all, only a regularly sampled dataset with a sampling period between 10 minutes and 12 hours can be a daily cycle dataset. The lower bound exists because very frequent sampling causes high variance in the signal, so building a separate model for each time of day would become ineffective. The upper bound exists because a day must contain at least 2 samples for it to make sense to distinguish different behaviors within a day.
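The eligibility condition above can be sketched as follows. This is a minimal illustration, not TIM's implementation: the function names are hypothetical, and treating both bounds as inclusive is an assumption, since the text does not say whether exactly 10 minutes or exactly 12 hours qualifies.

```python
from datetime import datetime, timedelta

# Bounds taken from the text; inclusiveness of both ends is an assumption.
MIN_PERIOD = timedelta(minutes=10)
MAX_PERIOD = timedelta(hours=12)


def sampling_period(timestamps):
    """Return the constant gap between consecutive timestamps, or None if irregular."""
    if len(timestamps) < 2:
        return None
    gaps = {later - earlier for earlier, later in zip(timestamps, timestamps[1:])}
    return gaps.pop() if len(gaps) == 1 else None


def is_daily_cycle_candidate(timestamps):
    """True if the data is regularly sampled with a period inside the allowed range."""
    period = sampling_period(timestamps)
    return period is not None and MIN_PERIOD <= period <= MAX_PERIOD


# Usage: two days of hourly data pass the check, daily data does not.
start = datetime(2024, 1, 1)
hourly = [start + timedelta(hours=i) for i in range(48)]
daily = [start + timedelta(days=i) for i in range(30)]
print(is_daily_cycle_candidate(hourly), is_daily_cycle_candidate(daily))
```

Passing this check only makes a dataset a candidate; whether it is actually treated as daily cycle depends on the autocorrelation analysis.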

For datasets that satisfy this condition, a simple autocorrelation analysis checks whether the autocorrelation just decreases with increasing lag or shows regular spikes at daily intervals. Such spikes are a strong indication that the dataset follows a daily cycle.
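To make the idea concrete, here is a rough sketch of such an autocorrelation check. The exact test TIM applies is not documented here, so the decision rule below (the autocorrelation at a one-day lag must exceed a threshold and also exceed the autocorrelation at a half-day lag) and the default threshold of 0.5 are illustrative assumptions, and `has_daily_spike` is a hypothetical name.

```python
import math


def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given lag."""
    n = len(values)
    mean = sum(values) / n
    variance_sum = sum((v - mean) ** 2 for v in values)
    if variance_sum == 0 or lag >= n:
        return 0.0
    covariance_sum = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    )
    return covariance_sum / variance_sum


def has_daily_spike(values, samples_per_day, threshold=0.5):
    """Assumed heuristic: a spike at the one-day lag that stands out from the half-day lag."""
    day = autocorrelation(values, samples_per_day)
    half_day = autocorrelation(values, samples_per_day // 2)
    return day > threshold and day > half_day


# Usage: a synthetic solar-like signal (zero at night) versus a plain upward trend.
hours = range(24 * 14)
solar = [max(0.0, math.sin(2 * math.pi * ((h % 24) - 6) / 24)) for h in hours]
trend = [float(h) for h in hours]
print(has_daily_spike(solar, 24), has_daily_spike(trend, 24))
```

For the trend, the autocorrelation is still high at a one-day lag but only because it decays slowly; comparing against the half-day lag is what separates a steady decrease from a genuine daily spike.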

What TIM does differently for daily cycle datasets

The biggest distinction between daily cycle and nondaily cycle datasets is that, for daily cycle datasets, TIM builds separate models for different times of the day, as described at the beginning of this section. This has many consequences, such as different backtesting and different model chaining behavior.
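The difference between time-specific and time-generic modelling can be illustrated with a toy example. The "models" below are deliberately trivial (a historical mean per time of day versus one overall mean) and all function names are hypothetical; TIM's actual models are far richer, so this only shows the structural idea of keeping one model per time of day.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def fit_time_specific(timestamps, values):
    """Fit one trivial model (a mean) for each time of day seen in the data."""
    buckets = defaultdict(list)
    for ts, value in zip(timestamps, values):
        buckets[ts.time()].append(value)
    return {time_of_day: sum(vs) / len(vs) for time_of_day, vs in buckets.items()}


def fit_time_generic(values):
    """Fit a single trivial model (the overall mean) shared by all times."""
    return sum(values) / len(values)


def forecast(models, timestamp):
    """Pick the model that belongs to the time of day being forecast."""
    return models[timestamp.time()]


# Usage: toy hourly production that is 10.0 at 12:00 and 0.0 otherwise.
start = datetime(2024, 1, 1)
stamps = [start + timedelta(hours=i) for i in range(72)]
production = [10.0 if ts.hour == 12 else 0.0 for ts in stamps]

models = fit_time_specific(stamps, production)
print(forecast(models, datetime(2024, 1, 5, 3, 0)))   # night-time model
print(forecast(models, datetime(2024, 1, 5, 12, 0)))  # midday model
print(fit_time_generic(production))                   # one compromise value
```

The time-specific variant can use a constant zero at 3:00 and a different model at 12:00, whereas the single generic model has to compromise across all times.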

Keep in mind that this is a dataset property and therefore the whole Model Zoo will consist of either time-specific models (daily cycle data) or time-generic models (nondaily cycle data).

To override how TIM handles the dataset, use the configuration.