Model Rebuilding

As mentioned in the overview section, TIM can take a Model Zoo as one of its inputs and modify it to better suit the current forecasting situation; this is what the rebuild-model method offers. This section discusses several ways of using this functionality in a real production deployment. TIM offers three modes of rebuilding a model, selected through the rebuilding policy configuration parameter.

The easiest "rebuild all" way

The easiest way to make sure your production pipeline works well is to build every required model from scratch by calling build-model and using all available data. TIM will make sure that every single model has the highest possible accuracy. Alternatively, the same result can be achieved when rebuilding models by setting the rebuilding policy to "All". The rebuilding process then ignores all input models and proceeds in the same way as build-model.

"rebuildingPolicy": {
  "type": "All"
}

There are two disadvantages to this way of working:

the model building time: interpreting existing models is much faster than building new ones;
data traffic: for evaluating existing models, the entire dataset usually doesn't have to be sent to TIM; only the most recent part of the data should suffice.

The most convenient "rebuild new situations" way

The most convenient way to ensure a working production pipeline is to set the rebuilding policy to "NewSituations". The main advantage of this way of working is that it overcomes the first disadvantage of the previous option, the model building time. When a user selects this option, TIM gradually enriches the existing Model Zoo with new models whenever there is a need to build them. This need may arise from a changed prediction horizon, different predictor availabilities or a different time of day corresponding to the last target timestamp (in the case of daily cycle data). Read more about this topic in the section about different situations.

"rebuildingPolicy": {
  "type": "NewSituations"
}

This approach also has two disadvantages:

data traffic: see the previous approach;
deteriorating accuracy: because the dynamics in time-series data often change significantly, a model should not be built once and then used for long periods of time.
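
The way a Model Zoo is enriched only for situations it does not yet cover can be pictured with a small sketch. It is an illustration, not TIM's implementation: the `Situation` fields, the `ModelZoo` class and the `build_model` callback are hypothetical names, and the sketch assumes that a situation is fully identified by the prediction horizon, the set of available predictors and the time of day of the last target timestamp.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, List


@dataclass(frozen=True)
class Situation:
    """Hypothetical key identifying one forecasting situation."""
    horizon: int                 # number of steps to forecast ahead
    predictors: FrozenSet[str]   # predictors available at forecast time
    last_target_time: str        # time of day of the last target timestamp


@dataclass
class ModelZoo:
    """Toy Model Zoo: maps each known situation to a model (here just a label)."""
    models: Dict[Situation, str] = field(default_factory=dict)

    def get_or_build(self, situation: Situation,
                     build: Callable[[Situation], str]) -> str:
        # Reuse the stored model when the situation is already covered;
        # build and store a new one only for a situation not seen before.
        if situation not in self.models:
            self.models[situation] = build(situation)
        return self.models[situation]


built: List[Situation] = []  # records which situations triggered a build


def build_model(situation: Situation) -> str:
    """Stand-in for an expensive model build (assumption for this sketch)."""
    built.append(situation)
    return f"model-{len(built)}"


zoo = ModelZoo()
morning = Situation(24, frozenset({"temperature"}), "09:00")
longer = Situation(48, frozenset({"temperature"}), "09:00")

zoo.get_or_build(morning, build_model)  # new situation: a model is built
zoo.get_or_build(morning, build_model)  # same situation: the model is reused
zoo.get_or_build(longer, build_model)   # changed horizon: another model is built
```

In this toy version, only two of the three calls trigger a build, which mirrors why this policy saves model building time compared with rebuilding everything.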

The safest "rebuild older than" way

The safest way to ensure a working production pipeline is to consider the age of the models in the Model Zoo by setting the rebuilding policy to "OlderThan". This approach overcomes the second disadvantage of the previous approach, deteriorating accuracy. A user can define what should be considered an "old" model, for example any model older than 7 days. If TIM recognizes that an "old" model would have to be used to forecast, a new model is built instead and replaces the old one.

"rebuildingPolicy": {
  "type": "OlderThan",
  "time": {
    "baseUnit": "Day",
    "value": 7
  }
}

This approach still has one disadvantage:

data traffic: see the previous approaches.
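
The age rule can be sketched as follows. This is an illustration under assumptions rather than TIM's own logic: the helper names `policy_max_age` and `needs_rebuild` are hypothetical, the policy dictionary mirrors the configuration shown above, and only the "Day" base unit from that example is handled, since other unit names are not listed in this section. Whether a model of exactly the configured age counts as "old" is also an assumption here (it does not).

```python
from datetime import datetime, timedelta
from typing import Optional


def policy_max_age(policy: dict) -> Optional[timedelta]:
    """Return the maximum model age for an "OlderThan" policy, else None."""
    if policy.get("type") != "OlderThan":
        return None
    time = policy["time"]
    if time["baseUnit"] != "Day":
        # Other units are not covered by this sketch.
        raise ValueError(f"unit not handled in this sketch: {time['baseUnit']}")
    return timedelta(days=time["value"])


def needs_rebuild(built_at: datetime, now: datetime, policy: dict) -> bool:
    """True when the model is strictly older than the policy allows."""
    max_age = policy_max_age(policy)
    return max_age is not None and now - built_at > max_age


policy = {"type": "OlderThan", "time": {"baseUnit": "Day", "value": 7}}
now = datetime(2024, 1, 15)

fresh = needs_rebuild(datetime(2024, 1, 10), now, policy)  # model is 5 days old
stale = needs_rebuild(datetime(2024, 1, 1), now, policy)   # model is 14 days old
```

With the 7-day policy, the 5-day-old model is kept and the 14-day-old model is flagged for rebuilding.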

Things to bear in mind when rebuilding

Several problems may occur when chaining models using different data-related settings. The parameter columns can change the number of predictors, and the parameter inSampleRows can change the number of rows used to build (train) models.

When using more predictors: TIM will consider this as a new situation and build new models accordingly.
When using fewer predictors: This can work fine, provided that the Model Zoo does not use any of the omitted predictors. If it does, TIM will consider this as a new distinct situation.
When using fewer rows: TIM will return a warning if any new models have to be built with fewer rows for training. A user should pay extra attention to the results if this happens, because this may result in a lower than expected accuracy.
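
The three cases above can be summarised in a short sketch. It only models the rules as stated in this section; the function name `review_data_settings`, its arguments and the returned strings are hypothetical and are not TIM's actual messages. The sketch also assumes that new models are needed only because of predictor changes, although other changes (such as a different horizon) can trigger a build as well.

```python
from typing import List, Set


def review_data_settings(
    zoo_predictors: Set[str],  # predictors actually used by the Model Zoo
    old_columns: Set[str],     # predictors offered when the zoo was built
    new_columns: Set[str],     # predictors offered now
    old_rows: int,             # training rows used previously
    new_rows: int,             # training rows available now
) -> List[str]:
    """Return illustrative notes on how the changed settings would be treated."""
    notes: List[str] = []

    # More predictors: treated as a new situation.
    if new_columns - old_columns:
        notes.append("new situation: additional predictors")

    # Fewer predictors: fine unless the zoo relies on an omitted one.
    omitted = old_columns - new_columns
    if omitted & zoo_predictors:
        notes.append("new situation: omitted predictors used by the Model Zoo")

    # Fewer rows: a warning applies only when new models must be built.
    if notes and new_rows < old_rows:
        notes.append("warning: new models trained on fewer rows")

    return notes


unused_dropped = review_data_settings(
    {"temp"}, {"temp", "wind"}, {"temp"}, 1000, 1000)
used_dropped = review_data_settings(
    {"temp", "wind"}, {"temp", "wind"}, {"temp"}, 1000, 500)
added = review_data_settings(
    {"temp"}, {"temp"}, {"temp", "humidity"}, 1000, 1000)
```

Dropping a predictor the Model Zoo never used produces no notes, while dropping one it relies on, together with a shorter training window, yields both a new situation and a warning.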
