Root cause analysis and how to read it

Root cause analysis

Root cause analysis can be a valuable source of information about your forecasting result. You can trigger its calculation for an already finished forecasting job to see why the forecast looks the way it does and how the Model Zoo is constructed.

Structure of the output

datetime               date_from   model_index  bin  term_1  term_2  term_3   term_N  yhat_1  yhat_2  yhat_3  yhat_N
2014-10-28T03:00:00.0  2014-10-28  1            1    2546    900     943.05   624     1943    1984    1987    1996
2014-10-28T04:00:00.0  2014-10-27  7            2    2451    5000    5409.6           2195    2104    2089
2014-10-28T04:00:00.0  2014-10-27  6            2    2103    200     65363.4          2211    2190    2168
2014-10-28T04:00:00.0  2014-10-27  5            2    2301    100     543.5    545     2189    2154    2167    2153
2014-10-28T04:00:00.0  2014-10-28  4            1    2225    432     983              2567    2592    2598
2014-10-28T04:00:00.0  2014-10-28  3            1    2155    4355    1235.6           2532    2490    2487

What information can I obtain from the root cause analysis?

First of all, each forecast is generated by one particular model within the Model Zoo, identified by its model_index. To understand how a forecast was constructed, compare it only with other forecasts generated by exactly the same model. Each model is additive in its terms, so it is easy to see the impact of each term on the forecast individually. There are two possible views: nominal and relative. The nominal view gives precise information about how much each term contributed to the forecasted value. The relative view decomposes the forecast in a slightly different manner that can give you a better geometrical understanding of how the model is built up gradually.
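Limiting the view to a single model amounts to grouping the output rows by model_index. The following is a minimal sketch in plain Python, assuming the output has already been loaded into a list of dictionaries keyed by column name. The helper name `group_by_model` and the last row of the data are hypothetical and added only for illustration; they are not part of the product or of the sample output.

```python
from collections import defaultdict

# Identifying columns of the sample output; term_* and yhat_* are omitted.
rows = [
    {"datetime": "2014-10-28T03:00:00.0", "date_from": "2014-10-28", "model_index": 1, "bin": 1},
    {"datetime": "2014-10-28T04:00:00.0", "date_from": "2014-10-27", "model_index": 7, "bin": 2},
    {"datetime": "2014-10-28T04:00:00.0", "date_from": "2014-10-27", "model_index": 6, "bin": 2},
    {"datetime": "2014-10-28T04:00:00.0", "date_from": "2014-10-27", "model_index": 5, "bin": 2},
    {"datetime": "2014-10-28T04:00:00.0", "date_from": "2014-10-28", "model_index": 4, "bin": 1},
    {"datetime": "2014-10-28T04:00:00.0", "date_from": "2014-10-28", "model_index": 3, "bin": 1},
    # Hypothetical extra row (not in the sample) so that one model has two forecasts.
    {"datetime": "2014-10-28T05:00:00.0", "date_from": "2014-10-28", "model_index": 1, "bin": 1},
]


def group_by_model(rows):
    """Collect rows under their model_index so each model can be inspected on its own."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["model_index"]].append(row)
    return dict(groups)


by_model = group_by_model(rows)
for model_index in sorted(by_model):
    timestamps = [row["datetime"] for row in by_model[model_index]]
    print(model_index, timestamps)
```

Terms and yhat values should then be compared only within one such group, since the same term number refers to a different term in each model.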

  1. Nominal (term_i): the value of the i-th term of the model with the given model_index that was used to obtain the forecast. The term can be found in the Model Zoo by model_index and the order number i. It is important to note that term_1 of the model with model_index 1 is different from term_1 of the model with model_index 2: they are two separate models and have different terms.

  2. Relative (yhat_i): the forecast you would obtain if the model consisted of only the first i terms (this is different from the sum of the first i terms). The advantage is that the forecasting error decreases as i increases, so you can see the gradual build-up of the model. The nominal view of terms does not have this property. The relative view shows how important individual terms are for the final forecast and how they influence it. If something goes wrong, it also makes it easy to identify which term is responsible.
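One way to read the relative view is to look at how much the forecast moves each time a term is added, that is, the difference between consecutive yhat values. The sketch below uses the four yhat values from the first row of the sample output. The function names `step_changes` and `most_influential_step` are hypothetical helpers for illustration, not part of any product API.

```python
def step_changes(yhats):
    """Return the change in the forecast caused by adding each term after the first."""
    return [later - earlier for earlier, later in zip(yhats, yhats[1:])]


def most_influential_step(yhats):
    """Return (i, change) for the term i whose addition moved the forecast the most.

    Returns None when fewer than two yhat values are available.
    """
    changes = step_changes(yhats)
    if not changes:
        return None
    # changes[k] is the effect of adding term k + 2 (the first change belongs to term 2).
    index, change = max(enumerate(changes), key=lambda pair: abs(pair[1]))
    return index + 2, change


# yhat_1 ... yhat_N from the first row of the sample output.
yhats = [1943, 1984, 1987, 1996]

print(step_changes(yhats))           # [41, 3, 9]
print(most_influential_step(yhats))  # (2, 41)
```

In this row, adding the second term shifts the forecast by 41, while the later terms shift it by only 3 and 9. An unusually large jump at some step is a natural place to start when looking for the term responsible for a suspicious forecast.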