Datasets

Datasets.png

Users can upload, preview, manage and explore datasets in TIM Studio. They can leverage TIM’s time series database, and TIM Studio gives them an overview of the metadata and statistics that are relevant in the data exploration and preparation phase.

The repository also includes version management, so you can keep track of your version history and update your datasets as new data becomes available.

The datasets overview

In the datasets overview, a user will find a list of all the datasets they have access to. Relevant metadata - such as the workspace a dataset belongs to, the number of observations and variables it contains, the (estimated) sampling period, the last timestamp and when it was last updated - can be seen directly from this overview.

DatasetsOverview.png
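
The overview lists an estimated sampling period for each dataset. How TIM Studio derives this estimate is not described here; the sketch below shows one plausible approach (the median gap between consecutive timestamps), purely as an illustration. The function name `estimate_sampling_period` is hypothetical.

```python
from datetime import datetime, timedelta
from statistics import median


def estimate_sampling_period(timestamps):
    """Return the median gap between consecutive timestamps (assumed heuristic)."""
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        raise ValueError("need at least two timestamps")
    gaps = [
        (later - earlier).total_seconds()
        for earlier, later in zip(ordered, ordered[1:])
    ]
    return timedelta(seconds=median(gaps))


stamps = [
    datetime(2021, 1, 1, 0, 0),
    datetime(2021, 1, 1, 1, 0),
    datetime(2021, 1, 1, 2, 0),
    datetime(2021, 1, 1, 4, 0),  # the 03:00 observation is missing
    datetime(2021, 1, 1, 5, 0),
]
period = estimate_sampling_period(stamps)
```

Using the median rather than the mean keeps a single gap in the data from distorting the estimate.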

On this page, new datasets can be uploaded, and existing datasets can be downloaded, updated, edited and deleted.

  • Uploading: Uploading a new dataset allows the user to select the data source and set the dataset's name and description. During the uploading process, users will get to see a preview of the dataset, allowing them to visually check whether they selected the right dataset and set the correct properties.
      • CSV: If the user opts to upload a CSV file, they will be able to set its properties: the column separator, the decimal separator, the timestamp format and the timestamp column.
      • SQL: If the user opts to connect to an SQL table, they will be able to set the connection properties: the database name, the database type (supported types include PostgreSQL, MySQL, MariaDB and SQL_Server), the host, the user name, the password, the port and the table name.

  • Downloading: Downloading a dataset allows a user to extract a CSV file containing this dataset, for example to upload the same dataset to another workspace or to use it for purposes that do not involve TIM Studio.

  • Updating: Updating a dataset allows the user to upload a new version of an existing dataset, adding new observations or overwriting existing observations. During the updating process, users will get to see a preview of the dataset, allowing them to visually check whether they selected the right dataset and set the correct properties.

  • Editing: Editing a dataset allows the user to update its name and its description.

  • Deleting: Deleting a dataset will also permanently delete all of its versions.
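
To illustrate how the four CSV properties listed under Uploading interact, here is a minimal sketch that parses a CSV string with Python's standard library. It is not TIM Studio's parser: the function `parse_csv`, the sample column names and the `strptime`-style timestamp format are assumptions made for the example.

```python
import csv
import io
from datetime import datetime


def parse_csv(text, column_separator, decimal_separator,
              timestamp_format, timestamp_column):
    """Parse CSV text into a list of (timestamp, {variable: value}) rows."""
    reader = csv.DictReader(io.StringIO(text), delimiter=column_separator)
    rows = []
    for record in reader:
        timestamp = datetime.strptime(
            record.pop(timestamp_column), timestamp_format
        )
        values = {}
        for name, raw in record.items():
            raw = (raw or "").strip()
            # Empty cells are treated as missing observations.
            values[name] = (
                float(raw.replace(decimal_separator, ".")) if raw else None
            )
        rows.append((timestamp, values))
    return rows


sample = (
    "Date;Load;Temperature\n"
    "01.02.2021 00:00;10,5;3,2\n"
    "01.02.2021 01:00;11,0;\n"
)
rows = parse_csv(sample, ";", ",", "%d.%m.%Y %H:%M", "Date")
```

If any of the four properties is set incorrectly, parsing fails or yields shifted columns, which is the kind of problem the upload preview is meant to reveal.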

The dataset in detail

On a dataset's detail page, all of the information regarding the dataset can be found. This includes the name, the description, when it was created and when it was last updated, the estimated sampling period, the number of observations and variables it contains and how many (and which) versions of this dataset exist. The page also contains a detailed graph and table, enabling users to explore the data itself in detail. Users can find an overview of the dataset's statistics, including relevant information for each of the variables (minimum, maximum and average values, as well as the number of missing observations). To quickly navigate to where the dataset is used in the structure of the TIM repository, the page also shows the use cases that center around it.

DatasetDetail.png
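
The per-variable statistics mentioned above (minimum, maximum, average and missing observations) can be reproduced on downloaded data with a few lines of Python. This is a minimal sketch that assumes missing observations are represented as `None`; the function name `variable_statistics` is hypothetical.

```python
def variable_statistics(values):
    """Summarise one variable; None marks a missing observation."""
    present = [v for v in values if v is not None]
    return {
        "minimum": min(present) if present else None,
        "maximum": max(present) if present else None,
        "average": sum(present) / len(present) if present else None,
        "missing": len(values) - len(present),
    }


stats = variable_statistics([3.0, None, 5.0, 4.0, None])
```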

From this page, a user can easily browse a specific use case of interest. They can also browse through the different versions of the dataset. Additionally, a user can update the dataset (i.e. upload a new version), as well as edit and delete the dataset. To start building models on this dataset, the user can also link an existing use case to this dataset or add a new use case linked to this dataset.

  • Updating: Updating a dataset allows the user to upload a new version of an existing dataset, adding new observations or overwriting existing observations. During the updating process, users will get to see a preview of the dataset, allowing them to visually check whether they selected the right dataset and set the correct properties.

  • Editing: Editing a dataset allows the user to update its name and its description.

  • Deleting: Deleting a dataset will also permanently delete all of its versions.

  • Linking to a use case: Linking a dataset to an existing use case is possible if the relevant workspace contains at least one use case that does not have a linked dataset. By creating this link, the user can start creating experiments and executing jobs in the linked use case.

  • Adding a linked use case: Adding a use case allows the user to create a new use case that already contains a linked dataset, namely the one that the process was initiated from. This empowers the user to start creating experiments and executing jobs in this newly created use case right away.
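
The update behaviour described under Updating (adding new observations or overwriting existing ones) can be pictured as a merge keyed on the timestamp. The sketch below is a simplified model under that assumption, not TIM Studio's implementation; `apply_update` and the sample data are hypothetical.

```python
def apply_update(existing, update):
    """Return a new version: matching timestamps are overwritten, new ones added."""
    merged = dict(existing)   # copy, so the previous version stays intact
    merged.update(update)     # overwrite overlapping timestamps, add new ones
    return dict(sorted(merged.items()))


version_1 = {"2021-01-01": 10.0, "2021-01-02": 11.0}
new_data = {"2021-01-02": 11.5, "2021-01-03": 12.0}
version_2 = apply_update(version_1, new_data)
```

Here `version_2` holds the corrected value for 2021-01-02 and the new observation for 2021-01-03, while `version_1` is left unchanged, mirroring the idea that earlier versions remain available in the version history.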