Error Measures

When training a model, an appropriate error measure has to be chosen. The model is then trained by adjusting it so that the chosen error measure is minimized, i.e. so that the model's errors are as small as possible. Several error measures are discussed in detail below.


The mean absolute error (MAE) is a statistical measure of the difference between two continuous variables. Assume X and Y are variables of paired observations that express the same phenomenon. Examples include predicted versus observed values, observations at an initial time versus at a subsequent time, and results of one measurement technique versus results of an alternative technique. The MAE is the mean of the absolute differences between the paired observations. Graphically, in a scatter plot of n points where point i has coordinates (x_i, y_i), the MAE is the average vertical distance between each point and the identity line (y = x), which equals the average horizontal distance between each point and that line.

The mean absolute error is given by the following formula:

\[ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - x_i \right| \]
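As a minimal sketch, the MAE of paired observations can be computed as the mean of the absolute pairwise differences |y_i - x_i|. The function name `mean_absolute_error` and the sample values are illustrative assumptions, not part of the original text.

```python
def mean_absolute_error(x, y):
    """Return the mean of |y_i - x_i| over paired observations x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) == 0:
        raise ValueError("at least one pair of observations is required")
    return sum(abs(yi - xi) for xi, yi in zip(x, y)) / len(x)


# Hypothetical observed and predicted values, chosen for illustration only.
observed = [3.0, 5.0, 2.5, 7.0]
predicted = [2.5, 5.0, 4.0, 8.0]

mae = mean_absolute_error(observed, predicted)
print(mae)  # absolute differences 0.5, 0.0, 1.5, 1.0 give 0.75
```

Because the absolute difference is symmetric, swapping the two arguments yields the same value.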



The root-mean-square deviation (RMSD) or root-mean-square error (RMSE) (sometimes root-mean-squared error) is a frequently used measure of the differences between predicted values and observed values. The RMSD represents the square root of the second sample moment of the differences between predicted values and observed values or the quadratic mean of these differences. It is calculated as the square root of the average of the squared errors. The RMSD is always non-negative, and a value of 0 (almost never achieved in practice) would indicate a perfect fit to the data. In general, the lower the RMSD, the better the model's fit to the data.

When calculated over the training data, these deviations are called residuals; when computed out-of-sample, they are called errors (or prediction errors). The RMSD aggregates the magnitudes of the errors in predictions for various times into a single measure of predictive power. It is a measure of accuracy suited to comparing the forecasting errors of different models on a particular dataset, but not to comparisons between datasets, because it is scale-dependent.

The effect of each error on the RMSD is proportional to the size of the squared error; thus larger errors have a disproportionately large effect on the RMSD. Consequently, the RMSD is sensitive to outliers.
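The outlier sensitivity can be illustrated with a small, hedged example: two hypothetical sets of errors with the same mean absolute error, one spread evenly and one concentrated in a single large error. The helper names `mae` and `rmse` and the error values are assumptions made for this sketch.

```python
import math


def mae(errors):
    """Mean of the absolute errors."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse(errors):
    """Square root of the mean of the squared errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Same total absolute error (8.0), distributed differently.
uniform_errors = [2.0, 2.0, 2.0, 2.0]
outlier_errors = [0.0, 0.0, 0.0, 8.0]

print(mae(uniform_errors), rmse(uniform_errors))  # 2.0 2.0
print(mae(outlier_errors), rmse(outlier_errors))  # 2.0 4.0
```

The MAE is identical for both sets, while the single large error doubles the RMSE, since squaring weights it far more heavily than the small errors.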

The RMSD of the predicted values \hat{y}_t of a regression's dependent variable y_t, observed at T times t = 1, ..., T, is computed over the T predictions as the square root of the mean of the squared deviations:

\[ \mathrm{RMSD} = \sqrt{\frac{1}{T} \sum_{t=1}^{T} \left( \hat{y}_t - y_t \right)^2} \]
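A minimal sketch of this computation, taking the square root of the mean squared deviation between predicted and observed values. The function name `root_mean_square_deviation` and the sample data are illustrative assumptions.

```python
import math


def root_mean_square_deviation(predicted, observed):
    """Return sqrt(mean((predicted_t - observed_t)^2)) over all T samples."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if len(predicted) == 0:
        raise ValueError("at least one sample is required")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical values: deviations are 1, 0, 0 and 2.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 2.0, 3.0, 6.0]

rmsd = root_mean_square_deviation(predicted, observed)
print(rmsd)  # sqrt((1 + 0 + 0 + 4) / 4) = sqrt(1.25)
```

A perfect fit, where every prediction equals its observation, returns 0.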



The mean absolute percentage error (MAPE), also known as the mean absolute percentage deviation (MAPD), is a statistical measure of the prediction accuracy of a forecasting method. It is used in trend estimation and as a loss function for regression problems in machine learning. It usually expresses accuracy as a percentage, and is defined by the following formula:

\[ \mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{A(t) - F(t)}{A(t)} \right| \]

where A(t) is the actual value at time t and F(t) is the forecast value at time t. The difference between A(t) and F(t) is divided by A(t). The absolute value of this ratio is then summed over all time points and divided by the number of fitted points n (the number of observations, and hence forecasts, in the given time frame). Multiplying the result by 100% turns it into a percentage error.
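The steps described above can be sketched in a few lines. The function name `mean_absolute_percentage_error` and the sample values are assumptions for illustration. Because each term divides by the actual value, the sketch rejects inputs in which any actual value is zero, where the measure is undefined.

```python
def mean_absolute_percentage_error(actual, forecast):
    """Return the MAPE in percent for paired actual and forecast values."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    # |A(t) - F(t)| / |A(t)| for every time point t
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical series: relative errors of 10%, 10% and 0%.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]

mape = mean_absolute_percentage_error(actual, forecast)
print(round(mape, 2))  # about 6.67 (percent)
```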


The AUC is the area under the receiver operating characteristic (ROC) curve. In data mining, it is often used to measure the overall performance of a classifier independently of the decision threshold that separates predicted positives from predicted negatives.

In the context of anomaly detection, the model is a classifier that treats anomalies as the positive class. The AUC is equivalent to the probability that the anomaly detector assigns a higher score (i.e. a higher chance of belonging to the positive class) to a randomly chosen anomaly than to a randomly chosen normal point. Since the AUC is cutoff-independent (and thus independent of the chosen sensitivity), it measures detection accuracy regardless of the number of anomalies in the dataset.
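This probabilistic reading can be checked directly on a small example by comparing every anomaly score with every normal score. The sketch below is illustrative only: the function name, the scores, and the convention of counting a tie as half a win are assumptions, and the quadratic loop is meant for clarity rather than efficiency.

```python
def auc_by_pairwise_comparison(anomaly_scores, normal_scores):
    """Fraction of (anomaly, normal) pairs in which the anomaly scores higher."""
    if not anomaly_scores or not normal_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for a in anomaly_scores:
        for n in normal_scores:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5  # assumption: a tie counts as half a win
    return wins / (len(anomaly_scores) * len(normal_scores))


# Hypothetical detector scores; higher means "more anomalous".
anomaly_scores = [0.9, 0.8, 0.4]
normal_scores = [0.7, 0.3, 0.2, 0.1]

auc = auc_by_pairwise_comparison(anomaly_scores, normal_scores)
print(auc)  # 11 of the 12 pairs are ordered correctly
```

Only the pair (0.4, 0.7) is ordered incorrectly, so the estimate is 11/12. No cutoff on the scores is needed at any point.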

[Hand and Till 2001] provide the following simple approach to calculate an AUC:
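The rank-based estimate usually attributed to Hand and Till sorts all points by increasing score, sums the ranks of the positive-class points, and normalizes:

\[ \mathrm{AUC} = \frac{S_0 - n_0 (n_0 + 1) / 2}{n_0 \, n_1} \]

Here n_0 and n_1 are the numbers of positive and negative points, and S_0 is the sum of the ranks of the positive points. A hedged Python sketch follows. The function name and sample scores are illustrative, and assigning tied scores their average rank is an assumption of this sketch rather than something stated in the text.

```python
def auc_from_ranks(positive_scores, negative_scores):
    """Estimate the AUC from the rank sum of the positive-class scores."""
    n0 = len(positive_scores)
    n1 = len(negative_scores)
    if n0 == 0 or n1 == 0:
        raise ValueError("both classes need at least one score")

    # Label positives with 1 and negatives with 0, then sort by score.
    labelled = [(s, 1) for s in positive_scores]
    labelled += [(s, 0) for s in negative_scores]
    labelled.sort(key=lambda pair: pair[0])

    s0 = 0.0  # sum of the (1-based) ranks of the positive points
    i = 0
    while i < len(labelled):
        # Find the run of tied scores starting at position i.
        j = i
        while j < len(labelled) and labelled[j][0] == labelled[i][0]:
            j += 1
        # Positions i..j-1 hold ranks i+1..j; tied points share the average.
        average_rank = (i + 1 + j) / 2.0
        positives_in_run = sum(label for _, label in labelled[i:j])
        s0 += average_rank * positives_in_run
        i = j

    return (s0 - n0 * (n0 + 1) / 2.0) / (n0 * n1)


# Hypothetical detector scores; anomalies form the positive class.
anomaly_scores = [0.9, 0.8, 0.4]
normal_scores = [0.7, 0.3, 0.2, 0.1]

auc = auc_from_ranks(anomaly_scores, normal_scores)
print(auc)  # ranks 4, 6 and 7 give S_0 = 17, so (17 - 6) / 12
```

Sorting dominates the cost, so this runs in O(n log n) time instead of comparing every positive point with every negative one.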