Datasets

Datasets are a prerequisite to any further efforts, as they are required for both model building and forecasting/anomaly detection.

Datasets-screen

On the main Datasets-screen, a list of available (i.e. already uploaded) datasets is displayed. The table below gives an overview of the parameters that are shown to describe a dataset:

Parameter       Meaning
Name            The name the user gives to a new dataset when creating it
ID              A numeric identifier assigned to a dataset by TIM Studio upon its creation
Predictors      The number of predictors in the dataset
Records         The number of records (rows) in the dataset, i.e. the length of the dataset
Sampling Rate   The time between subsequent observations in the dataset
Target          The name of the target variable
Last Updated    The date and time of the latest update of the dataset
Last Target     The date and time of the latest value for the target variable
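
To make the meaning of these parameters more concrete, the following Python sketch derives some of them from a small in-memory dataset. It is an illustration only and not how TIM Studio computes the values. The function name summarize_dataset is hypothetical, and the sketch assumes a header row, the column layout required for CSV uploads (timestamp, target, predictors), and that the sampling rate can be taken as the smallest gap between consecutive timestamps.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def summarize_dataset(header, rows):
    """Derive overview values from column names and rows of strings.

    Assumed layout: column 0 holds timestamps, column 1 the target,
    all remaining columns are predictors.
    """
    timestamps = [datetime.strptime(row[0], TIMESTAMP_FORMAT) for row in rows]

    # Gaps between subsequent observations; the smallest gap is used as
    # the sampling rate (an assumption made for this sketch).
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]

    # Timestamps for which the target column actually holds a value.
    target_times = [ts for ts, row in zip(timestamps, rows) if row[1].strip() != ""]

    return {
        "predictors": len(header) - 2,
        "records": len(rows),
        "sampling_rate": min(gaps) if gaps else None,
        "target": header[1],
        "last_target": max(target_times) if target_times else None,
    }


header = ["timestamp", "load", "temperature"]
rows = [
    ["2021-01-01 00:00:00", "10.5", "3.1"],
    ["2021-01-01 01:00:00", "11.0", "2.9"],
    ["2021-01-01 02:00:00", "", "2.5"],  # predictor known, target not yet
]
summary = summarize_dataset(header, rows)
```

In this example the last row has a predictor value but no target value, so the latest target timestamp is earlier than the latest record.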

Actions regarding datasets

Users can manage datasets on the Datasets-screen. A user can perform the following actions on this screen:

  • browse/see the list of existing datasets,
  • see a basic description of each dataset,
  • inspect a dataset by displaying line-charts per vector (column),
  • search for a dataset by name,
  • upload a new dataset,
  • update an existing dataset and
  • delete an existing dataset.

Some of these actions are explained in more detail below.

Uploading a new dataset

To upload a new dataset, a user has to go through the following steps:

  1. Click the "Add New Dataset"-button in the top right corner of the screen.
  2. Type in the desired name for the dataset.
  3. Select either the CSV-tab or the SQL-tab, depending on the data source.
  4. Define the details for the dataset as follows:

    CSV-file

    image61.png

    Click the "Browse"-button and select the file to be uploaded. The selected CSV-file must meet the following criteria:

    • ";" or "," must be used as the column separator,
    • "." must be used as the decimal separator,
    • the first column must contain the timestamps formatted in ISO 8601 as "YYYY-MM-DD hh:mm:ss" in UTC,
    • the second column must contain the target variable (i.e. the variable to be predicted),
    • the remaining columns contain the predictors and
    • ideally, the dataset is continuous.

    SQL

    image62.png

    Enter the following details:

    • the name of the dataset,
    • the database management system (currently PostgreSQL, MySQL/MariaDB and Microsoft SQL Server are supported),
    • the host: the name of the server the database is running on,
    • the port: the server port on which the database is available (the usual defaults are 5432 for PostgreSQL, 3306 for MySQL/MariaDB and 1433 for Microsoft SQL Server),
    • the username,
    • the password,
    • the name of the database and
    • the name of the table.

  5. Click the "Add Dataset"-button to upload the selected dataset into TIM Studio.
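
Before uploading, it can be useful to check a CSV-file against the criteria listed above. The following Python sketch is one possible pre-upload check and is not part of TIM Studio. The function name check_csv is hypothetical, and the sketch assumes that the file has a header row with column names and that all non-empty values after the timestamp column are numeric.

```python
import csv
import io
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def check_csv(text):
    """Return a list of problems found in the CSV text (empty if none found)."""
    lines = text.splitlines()
    first_line = lines[0] if lines else ""

    # Column separator: ";" or "," (";" is tried first).
    if ";" in first_line:
        delimiter = ";"
    elif "," in first_line:
        delimiter = ","
    else:
        return ["no ';' or ',' column separator found in the first line"]

    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, data = rows[0], rows[1:]
    problems = []

    for line_number, row in enumerate(data, start=2):
        if not row:
            continue  # skip blank lines
        if len(row) != len(header):
            problems.append(f"line {line_number}: expected {len(header)} columns, found {len(row)}")
            continue
        # First column: timestamp in the required format.
        try:
            datetime.strptime(row[0], TIMESTAMP_FORMAT)
        except ValueError:
            problems.append(f"line {line_number}: bad timestamp {row[0]!r}")
        # Target and predictors: "." as decimal separator.
        for value in row[1:]:
            if value.strip() == "":
                continue  # missing values are not checked here
            try:
                float(value)
            except ValueError:
                problems.append(f"line {line_number}: non-numeric value {value!r}")
    return problems


good = "timestamp;load;temperature\n2021-01-01 00:00:00;10.5;3.1\n2021-01-01 01:00:00;11.0;2.9\n"
bad = "timestamp;load\n01/01/2021 00:00;10,5\n"
good_problems = check_csv(good)
bad_problems = check_csv(bad)
```

The check does not verify that timestamps are in UTC or that the dataset is continuous, since neither can be determined reliably from the file contents alone.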

Inspecting a dataset

To inspect a dataset, a user should click the >-icon next to it (in the "View"-column) in the list of available datasets. The area with the line chart will then be expanded below. Below the chart, boxes with the names of the target variable and the different predictors in the dataset are displayed. To switch individual variables that are rendered in the chart on and off, select or deselect them. To zoom in to a specific interval, select a particular area in the chart (press the left mouse button, hold, drag and release). To zoom out, double-click anywhere in the chart.

Updating an existing dataset

A CSV-file

To update an existing dataset by use of a CSV-file, a user has to go through the following steps:

  1. In the list of datasets, click the arrow-icon (image76.png) next to the relevant dataset in the "Update"-column (i.e. the last column).
  2. In the dialogue-window, select the file to be uploaded.
  3. The newly uploaded dataset will then replace the existing one.

SQL

To update an existing dataset by use of SQL, a user has to go through the following steps:

  1. In the list of datasets, click the arrow-icon (image76.png) next to the relevant dataset in the "Update"-column (i.e. the last column).
  2. The dataset is updated from the SQL source following these rules:
    • If the timestamp of a data point in the SQL table (i.e. the row) is already present in the dataset in TIM Studio, then the data point in TIM Studio is updated with the new value.
    • If the timestamp of a data point in the SQL table (i.e. the row) is not yet present in the dataset in TIM Studio, then this data point is added to the TIM Studio dataset.
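
The two update rules above amount to an "update or insert" keyed on the timestamp. The following Python sketch illustrates that behaviour on plain dictionaries. It is an illustration of the rules only, not TIM Studio's implementation, and the function name apply_sql_update and the data layout are hypothetical.

```python
def apply_sql_update(dataset, sql_rows):
    """Merge rows from the SQL source into the dataset, keyed by timestamp.

    dataset:  dict mapping timestamp -> values already stored
    sql_rows: iterable of (timestamp, values) pairs read from the SQL table
    Returns the number of updated and added data points.
    """
    updated = 0
    added = 0
    for timestamp, values in sql_rows:
        if timestamp in dataset:
            updated += 1  # rule 1: timestamp already present, value is replaced
        else:
            added += 1    # rule 2: timestamp not yet present, data point is added
        dataset[timestamp] = values
    return updated, added


dataset = {
    "2021-01-01 00:00:00": [10.5, 3.1],
    "2021-01-01 01:00:00": [11.0, 2.9],
}
sql_rows = [
    ("2021-01-01 01:00:00", [11.2, 2.9]),  # existing timestamp, new value
    ("2021-01-01 02:00:00", [12.4, 2.5]),  # new timestamp
]
counts = apply_sql_update(dataset, sql_rows)
```

Data points that exist in the dataset but not in the SQL rows are left untouched in this sketch. This differs from the CSV update, where the newly uploaded file replaces the existing dataset.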

Deleting an existing dataset

To delete an existing dataset, a user has to go through the following steps:

  1. In the list of datasets, click the bin-icon (image55.png) next to the relevant dataset in the "Update"-column (i.e. the last column).
  2. Confirm the deletion by clicking the "Delete"-button.