Data Specifications

General rules

All TIM tools work with datasets, and the same rules apply to those datasets across all tools. This section describes them.

CSV files

TIM Connector works with CSV files. Any CSV file used by TIM should use either a comma or a semicolon as the delimiter. The file should contain a header row that labels each column by its meaning. All headers must be unique and cannot start with an underscore, apart from the reserved _LABEL column described next.

For anomaly detection, anomalies can be labelled in a column with the header _LABEL and the values 1 and 0, where 1 indicates an anomaly. This column is optional. If it is present, AUC is calculated from it. The column is used only for the AUC calculation and is not included in the model building process.
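The header rules above can be checked before a file is handed to TIM. The following is a minimal sketch using only the Python standard library; the function name check_csv and the delimiter guess (whichever of the two characters appears more often in the header line) are assumptions made for illustration and are not part of TIM Connector.

```python
import csv
import io

def check_csv(text):
    """Return a list of rule violations found in a CSV string (empty if none)."""
    first_line = text.splitlines()[0]
    # Assumption: guess the delimiter from the header line (comma or semicolon).
    delimiter = ";" if first_line.count(";") > first_line.count(",") else ","
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, body = rows[0], rows[1:]

    problems = []
    # Headers must be unique.
    if len(set(header)) != len(header):
        problems.append("duplicate header")
    # Headers cannot start with an underscore; _LABEL is the reserved exception.
    for name in header:
        if name.startswith("_") and name != "_LABEL":
            problems.append(f"header starts with underscore: {name}")
    # If present, _LABEL may only hold 1 (anomaly) or 0 (no anomaly).
    if "_LABEL" in header:
        idx = header.index("_LABEL")
        if any(idx >= len(row) or row[idx] not in ("0", "1") for row in body):
            problems.append("_LABEL must contain only 0 or 1")
    return problems
```

An empty list means the file passed these checks. They cover only the rules stated in this section, not everything TIM may validate.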

Ranges Selection

For both model building and validation, you can specify which ranges of data to include. Data outside the specified ranges is not used for that task.

Example:

data:
  rows:
  - from: 2009-01-01 00:00:00
    to:   2009-06-30 23:00:00
  - from: 2009-09-01 00:00:00
    to:   2010-12-31 23:00:00
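
To illustrate which rows such a configuration selects, here is a small Python sketch that tests a timestamp against the two ranges from the example. It assumes both ends of each range are inclusive, which this section does not state explicitly; the names RANGES and in_ranges are hypothetical.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# The two ranges from the example above, as (from, to) pairs.
RANGES = [
    ("2009-01-01 00:00:00", "2009-06-30 23:00:00"),
    ("2009-09-01 00:00:00", "2010-12-31 23:00:00"),
]

def in_ranges(timestamp, ranges=RANGES):
    """True if the timestamp lies in any range (ends inclusive, by assumption)."""
    t = datetime.strptime(timestamp, FMT)
    return any(
        datetime.strptime(start, FMT) <= t <= datetime.strptime(end, FMT)
        for start, end in ranges
    )
```

With these ranges, a row stamped 2009-03-15 12:00:00 is included, while a row stamped 2009-07-15 00:00:00 falls in the gap between the two ranges and is left out.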

Historical actuals versus historical forecasts

It is often preferable to use historical actuals for training and switch to historical forecasts for evaluation, because this better represents the results that can be expected. TIM Connector supports this. Historical forecasts of a variable that is already present in the dataset must be named with the prefix ‘F_’. For example, if Temperature is the header for the historical actuals, the header for the corresponding historical forecasts should be F_Temperature.
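
The naming convention makes it possible to pair each actuals column with its forecast column from the header alone. The sketch below shows one way to do that in Python; forecast_pairs is a hypothetical helper, not a TIM Connector function.

```python
def forecast_pairs(header):
    """Map each actuals column to its 'F_'-prefixed forecast column, if present."""
    names = set(header)
    return {
        name[2:]: name            # strip the 'F_' prefix to get the actuals name
        for name in header
        if name.startswith("F_") and name[2:] in names
    }
```

For the header Timestamps, Load, Temperature, F_Temperature, this returns a mapping from Temperature to F_Temperature. A column such as F_Wind with no matching Wind column is ignored in this sketch.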

Example

The following table contains an example dataset that follows all the rules described above.

Timestamps           Load    Temp1 °C  Temp2 °C  F_Temp1 °C
2017-01-01 00:00:00  140.55  1.2       1.6       1.3
2017-01-01 01:00:00  125.40  1.3       36.8      1.9
2017-01-01 02:00:00  111.95  1.5       1.5       1.4
2017-01-01 03:00:00  102.04  0.9       1.6       1.2
2017-01-01 04:00:00  95.71   0.1       1.2       0.7
...                  ...     ...       ...       ...
2017-12-31 23:00:00          0.5       0.8       -0.7