Raw and aggregated predictions


When TIM is deployed in production, it looks at the current situation and selects the most appropriate model from the ModelZOO.

For example, consider a ModelZOO prepared to forecast at both 07:00 and 10:00, for both one day ahead and two days ahead. If a user wants to make a forecast at "2019-03-02 07:05:35" for "2019-03-03 15:00:00", TIM will automatically recognize that the data availability corresponds to the 07:00 scenario and that the desired forecast corresponds to the one day ahead scenario. TIM stores this information, together with the corresponding prediction, in a JSON file called "prediction".
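The matching step in this example can be illustrated with a small sketch. It is not TIM's implementation: the function name `recognize_situation`, the constants, and the rule "take the latest forecasting time that has already passed" are assumptions made only to mirror the example above.

```python
from datetime import datetime, time

# Hypothetical scenarios taken from the example: two forecasting times, two horizons.
FORECASTING_TIMES = [time(7, 0), time(10, 0)]
DAYS_AHEAD = [1, 2]


def recognize_situation(forecast_made_at: datetime, target: datetime):
    """Return (forecasting_time, days_ahead) for the scenario matching the request."""
    # Assumed rule: use the latest scenario whose forecasting time has already passed.
    passed = [t for t in FORECASTING_TIMES if t <= forecast_made_at.time()]
    if not passed:
        raise ValueError("no data availability scenario has been reached yet")
    forecasting_time = max(passed)

    # The horizon is the number of calendar days between forecasting and target.
    days_ahead = (target.date() - forecast_made_at.date()).days
    if days_ahead not in DAYS_AHEAD:
        raise ValueError(f"no scenario prepared for {days_ahead} day(s) ahead")
    return forecasting_time, days_ahead


situation = recognize_situation(
    datetime(2019, 3, 2, 7, 5, 35),  # time of forecasting
    datetime(2019, 3, 3, 15, 0, 0),  # timestamp to forecast
)
print(situation)
```

For the request in the example, the sketch picks the 07:00 scenario and a horizon of one day.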

TIM's ability to recognize the current situation is important, because the desired forecast could also have been made in other situations, for example using the data availability scenario at 10:00 the previous day with the two days ahead scenario. This would result in a less accurate forecast, as that scenario would ignore the most recent available data. By recognizing the situation, TIM can seamlessly select the best possible model and so produce the best possible forecast. When TIM is used to make regular forecasts with deployed models, all of this is automated.

However, the situation gets more complicated during so-called backtesting. A user might be interested in a model's performance before actually deploying it in production, so models are often tested on historical data first.

In the situation described above, four forecasts would be made for each timestamp: a two days ahead forecast using the data availability from 07:00 two days earlier, a two days ahead forecast using the data availability from 10:00 two days earlier, a one day ahead forecast using the data availability from 07:00 the previous day and a one day ahead forecast using the data availability from 10:00 the previous day.
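As a rough illustration, the four forecasts for a single timestamp can be enumerated by combining every horizon with every forecasting time. The function name `backtest_forecast_origins` and the constants are hypothetical and only restate the example scenarios.

```python
from datetime import datetime, time, timedelta
from itertools import product

# Hypothetical scenarios taken from the example.
FORECASTING_TIMES = [time(7, 0), time(10, 0)]
DAYS_AHEAD = [1, 2]


def backtest_forecast_origins(target: datetime):
    """List (days_ahead, made_at) for every scenario that can forecast `target`."""
    origins = []
    for days_ahead, forecasting_time in product(DAYS_AHEAD, FORECASTING_TIMES):
        # A forecast N days ahead is made N calendar days before the target date.
        made_on = target.date() - timedelta(days=days_ahead)
        origins.append((days_ahead, datetime.combine(made_on, forecasting_time)))
    return origins


origins = backtest_forecast_origins(datetime(2019, 3, 3, 15, 0, 0))
for days_ahead, made_at in origins:
    print(f"{days_ahead} day(s) ahead, made at {made_at}")
```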

The most accurate prediction for each timestamp will again be stored in the JSON file called "prediction", although this has little value during backtesting, since the actual value is also known.

The following image provides a convenient overview of the scenarios used as examples below.

[Image: Aggregated and Raw.png]

Raw predictions

Raw predictions are stored in the same way they would be stored in production. For each time of forecasting, every possible forecast is stored. A unique key for each time of forecasting is used to assign all the respective forecasts to it.
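A minimal sketch of such a structure is shown below. The exact layout of TIM's output is not given here, so the nesting, the key format, and the numeric values are assumptions and placeholders chosen purely for illustration.

```python
import json
from collections import defaultdict

# Placeholder records: (time of forecasting, forecasted timestamp, value).
# The values are dummy numbers, not real TIM output.
records = [
    ("2019-03-01 07:00:00", "2019-03-02 15:00:00", 10.0),
    ("2019-03-01 07:00:00", "2019-03-03 15:00:00", 11.0),
    ("2019-03-01 10:00:00", "2019-03-02 15:00:00", 10.5),
    ("2019-03-01 10:00:00", "2019-03-03 15:00:00", 11.5),
]

# One key per time of forecasting; all forecasts made at that time hang under it.
raw = defaultdict(dict)
for made_at, target, value in records:
    raw[made_at][target] = value

print(json.dumps(raw, indent=2))
```

Looking up everything that was forecast at a given moment is then a single key access, for example `raw["2019-03-01 07:00:00"]`.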

Aggregated predictions

Aggregated predictions are stored in a more advanced way, with the use of two keys: one representing the relative time to the forecast in days and one representing the time of forecasting.

In the previous example this would result in one day ahead forecasts made at 07:00, two days ahead forecasts made at 07:00, one day ahead forecasts made at 10:00 and two days ahead forecasts made at 10:00. This way of storing forecasts is often the most convenient, as it makes it easy to answer common questions such as "What is the day ahead accuracy?".
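The two-key grouping can be sketched as follows. The record layout, the key format `(days_ahead, "HH:MM")`, the placeholder values, and the use of mean absolute error as the accuracy measure are all assumptions for illustration, not TIM's actual output format or metric.

```python
from collections import defaultdict
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Placeholder records: (time of forecasting, forecasted timestamp, forecast, actual).
records = [
    ("2019-03-02 07:00:00", "2019-03-03 15:00:00", 10.0, 12.0),
    ("2019-03-02 10:00:00", "2019-03-03 15:00:00", 11.0, 12.0),
    ("2019-03-01 07:00:00", "2019-03-03 15:00:00", 9.0, 12.0),
    ("2019-03-01 10:00:00", "2019-03-03 15:00:00", 9.5, 12.0),
    ("2019-03-03 07:00:00", "2019-03-04 15:00:00", 14.0, 13.0),
]

# Group by (days ahead, time of day of forecasting).
aggregated = defaultdict(dict)
for made_at_s, target_s, forecast, actual in records:
    made_at = datetime.strptime(made_at_s, FMT)
    target = datetime.strptime(target_s, FMT)
    days_ahead = (target.date() - made_at.date()).days
    aggregated[(days_ahead, made_at.strftime("%H:%M"))][target_s] = (forecast, actual)


def mean_absolute_error(pairs):
    """Average absolute difference between forecast and actual values."""
    pairs = list(pairs)
    return sum(abs(forecast - actual) for forecast, actual in pairs) / len(pairs)


# "What is the day ahead accuracy?" for forecasts made at 07:00.
day_ahead_0700 = mean_absolute_error(aggregated[(1, "07:00")].values())
print(day_ahead_0700)
```

With this grouping, the accuracy of any horizon and forecasting time is computed from a single entry, without filtering the raw forecasts first.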