Data Formatting

TIM Forecasting in Excel uses the data within the active Excel worksheet to make the desired forecasts. In order to successfully achieve this, this data needs to meet certain conditions, which are discussed below. These conditions conform with the data specifications stated under General TIM Settings in Dataset Properties. However, the restrictions put on timestamps in Excel are less strict than those mentioned in Dataset Properties.

The structure of the dataset

The dataset should contain headers, labelling the different columns by their respective meanings. These headers should be placed in the first row of the dataset. All headers must be unique and cannot start with an underscore.

The first column of the dataset should contain the timestamps, other columns can contain data about the target variable and additional predictors. There needs to be at least one other column, i.e. the column containing the target variable.

Timestamps

The timestamps, contained in the first column of the dataset, need to follow a date-time format that is recognised by Excel. TIM in Excel can interpret any of these recognised date-time formats and will automatically represent the forecasted timestamps in the same format, to preserve the consistency of the dataset.

If the timestamps in Excel are represented as plain text, users will need to indicate the correct date-time format that is followed by these timestamps in Excel. In order to do so, the following steps need to be completed:

  1. Select all timestamps in Excel (i.e. the entire first column of data, except for the header).
  2. Right-click on the selection, and then select 'Format cells'.
  3. Click the category 'Custom' and type in the correct format. For example: to represent text such as 2019-11-13 12:30:00, the format yyyy-mm-dd hh:mm:ss should be typed in. This will convert the timestamps to a recognised datetime format.

If it is unclear to the user in which format the timestamps are represented, a similar procedure can help to determine this. To check the current format of the timestamps, the following steps need to be completed:

  1. Select all timestamps in Excel (i.e. the entire first column of data, except for the header).
  2. Right-click on the selection, and then select 'Format cells'.
  3. The category and format that is represented here, is the current format of the timestamps.

If the format is not either Date or a Custom date-time format, it needs to be changed. See earlier on this page for the steps describing the procedure of changing the format.

Sampling rate

Although the sampling is comprehensively described in Dataset Properties, it is mentioned here to make sure the necessary information isn't missed by users. The sampling rate (defined as the time between subsequent observations), should be constant. Thus, all timestamps should be evenly spaced. The only possible exception is when a gap in measurements occurs. The frequency of observations in the dataset cannot exceed the sampling rate. For more information about sampling rates, please refer to Dataset Properties.

Target and predictor data

The target and predictor data should be rendered as numerical values. The approach described in 'The timestamps' above can also be used to change the formatting of this data to the category 'Number'.

By default, the second column of the dataset is expected to contain the target variable. If any additional columns are present, they are interpreted as additional predictors. However, users can select any of these columns (i.e. any column except for the first one, containing the timestamps) to be the target variable. The remaining columns will then be interpreted as additional predictors. It is possible to choose whether or not to include each predictor for forecasting by (de)selecting the checkbox next to this predictor in the list of predictors.

Multiple datasets in one Excel file

Sometimes, users might wish to observe multiple datasets in the same Excel file, through the use of worksheets. TIM in Excel will automatically recognise when a user changes between worksheets, and the content of the taskpane (list of predictors, target variable) will automatically be adapted to match the currently active worksheet.

Dataset range recognition and manual selection

TIM in Excel automatically extracts the range which (most likely) contains your dataset. This range is assumed to be the used range (i.e. the range containing data). Empty columns and columns that seem to contain notes are automatically excluded from the possible target and selector lists. If multiple datasets seem to be present in the same worksheet (separated by empty columns), TIM in Excel will assume the desired dataset is contained in the first range it comes across, i.e. the one closest to the beginning of the worksheet (cell A1).

Upon starting/reloading the add-in or moving to another worksheet, the user will be asked to confirm the automatically extracted range, or indicate which range they would prefer. If at any point a user is not completely satisfied by the range of the dataset, they can manually change it. The chosen range must be continuous, i.e. it must be uniquely identified by its address in the format upper left cell : lower right cell.