Root cause analysis
Root cause analysis (RCA) output is appended to the end of the tabular output described in the anomaly detection outputs overview. The columns RCA_Influencer 1, RCA_Influencer 2, ..., RCA_Influencer N at the end of the table below represent the root cause analysis output:
| datetime | model_index | ... | normal_behavior | ... | RCA_Influencer 1 | RCA_Influencer 2 | ... | RCA_Influencer N |
RCA gives the customer additional information about anomalies. Without RCA, you can see the actual and normal behavior values, the anomalies, the influencers and the anomaly indicator, as shown in the following picture:
It is quite obvious that something unusual is happening on 23 May. The anomaly indicator went above the threshold, the difference between normal behavior and the actual value is more prominent, and the points are marked with red dots signaling an anomaly. But what was behind this increase in normal behavior remained unclear. Thus, the primary motivation for RCA was to provide information about what drives normal behavior.
RCA should give the customer:
- transparency, explainability, trust in the results and confidence in critical decisions
- a deeper understanding of what drives normal behavior
- a way to explore the possible reasons behind an anomaly
- a basis for making the final decision about an anomaly candidate
What information can I obtain from the root cause analysis?
What drives normal behavior
RCA output reveals the contribution of each influencer to normal behavior for a given data point. It is a straightforward way to see what drives normal behavior.
For a given timestamp t, the sum of the influencer values equals the normal behavior value.
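The additive property can be checked with a minimal sketch like the one below. It assumes each output row is available as a dictionary keyed by column name; the sample values, the helper names `influencer_columns` and `rca_sum_matches`, and the tolerance are hypothetical and only serve as illustration.

```python
import math

def influencer_columns(row):
    """Return the names of the RCA columns present in a row."""
    return [name for name in row if name.startswith("RCA_")]

def rca_sum_matches(row, tol=1e-9):
    """Check that the RCA columns of a row add up to its normal_behavior."""
    total = sum(row[name] for name in influencer_columns(row))
    return math.isclose(total, row["normal_behavior"], abs_tol=tol)

# Made-up example row; real rows come from the anomaly detection output table.
row = {
    "datetime": "t",
    "model_index": 1,
    "normal_behavior": 12.5,
    "RCA_Influencer 1": 7.0,
    "RCA_Influencer 2": 4.0,
    "RCA_Influencer 3": 1.5,
}

print(rca_sum_matches(row))
```

A tolerance is used because floating-point sums of real output values may differ from the stored normal behavior by a rounding error.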
What drives normal behavior change
Using the RCA output, you can also compute differences between the normal behavior at the anomalous point and the most recent normal behaviors preceding it. Apply reasonable filtering so that the differences are calculated only between points produced by the same model (the same model_index) and are not affected by anomalies other than the analyzed anomalous point. It is especially useful to calculate the difference between the anomalous point and the most recent non-anomalous point: this shows how much each influencer contributed to the change in normal behavior.
This gives:
For a given timestamp t, the sum of the influencer changes equals the change in normal behavior.
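The difference calculation can be sketched as follows. This is an illustration under stated assumptions, not the product's implementation: rows are dictionaries sorted by datetime in ascending order, the boolean column `is_anomaly` is a hypothetical name for the anomaly flag, and `rca_changes` is a hypothetical helper.

```python
def rca_changes(rows, anomalous_idx):
    """Per-influencer change between the anomalous row and the most recent
    earlier non-anomalous row with the same model_index.

    Returns None when no suitable reference row exists.
    """
    target = rows[anomalous_idx]
    # Walk backwards in time to find the reference point.
    for prev in reversed(rows[:anomalous_idx]):
        same_model = prev["model_index"] == target["model_index"]
        if same_model and not prev["is_anomaly"]:
            break
    else:
        return None
    changes = {
        name: target[name] - prev[name]
        for name in target
        if name.startswith("RCA_")
    }
    changes["normal_behavior"] = target["normal_behavior"] - prev["normal_behavior"]
    return changes

# Made-up rows: the middle one comes from a different model and is skipped.
rows = [
    {"model_index": 1, "is_anomaly": False, "normal_behavior": 10.0,
     "RCA_Influencer 1": 6.0, "RCA_Influencer 2": 4.0},
    {"model_index": 2, "is_anomaly": False, "normal_behavior": 11.0,
     "RCA_Influencer 1": 7.0, "RCA_Influencer 2": 4.0},
    {"model_index": 1, "is_anomaly": True, "normal_behavior": 14.0,
     "RCA_Influencer 1": 9.0, "RCA_Influencer 2": 5.0},
]

changes = rca_changes(rows, 2)
influencer_total = sum(v for k, v in changes.items() if k.startswith("RCA_"))
print(changes)
print(influencer_total == changes["normal_behavior"])
```

In this made-up example, the normal behavior rose by 4.0, of which 3.0 is attributed to the first influencer and 1.0 to the second.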