Single Asset Wind Production

Problem Description

This section considers a single asset problem: the production of an individual wind farm or even an individual wind turbine. Either can be described by a single pair of GPS coordinates and can therefore be treated as a single asset. Typical forecasting scenarios range from a few hours ahead to a few days ahead (usually 1h – 36h or 1h – 48h ahead). The data sampling rate in these scenarios tends to be 15 minutes, 30 minutes or 1 hour. Typical metrics used to evaluate forecast quality are the MAE and RMSE. To express the error as a percentage, nMAE, nRMSE, rMAE or rRMSE are used.
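The metrics named above can be sketched in a few lines of Python. This is a minimal illustration, not TIM's own implementation: the sample values and the `capacity` figure are made up, and normalising by installed capacity is only one common convention (the text does not say which reference value is used for nMAE and nRMSE).

```python
import math


def mae(actual, forecast):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def as_percentage(error, reference):
    """Express an error as a percentage of a reference value."""
    return 100.0 * error / reference


# Hypothetical hourly production values and forecasts (illustrative only).
actual = [100.0, 200.0, 300.0, 400.0]
forecast = [110.0, 190.0, 330.0, 360.0]
capacity = 500.0  # assumed reference value, e.g. installed capacity

mae_value = mae(actual, forecast)
rmse_value = rmse(actual, forecast)
nmae_value = as_percentage(mae_value, capacity)
nrmse_value = as_percentage(rmse_value, capacity)
```

Because RMSE squares the errors before averaging, it penalises large misses more heavily than MAE, which is why both are usually reported together.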

Data Recommendation Template

Good wind speed forecasts are essential for wind production forecasting. For single asset modelling, it is recommended to have wind speed and wind direction forecasts at the hub height of the individual wind turbines. TIM can generate good production forecasts with a combination of wind speeds at different heights and the wind direction at at least one of those heights. The best practice is to use historical actuals of meteorological predictors for model building and meteorological forecasts for out-of-sample validation. This ensures the highest quality available data is used at every stage. Other meteorological predictors such as wind gusts, temperature, irradiation and pressure can also improve the quality of the forecast. It is recommended to only include those predictors for further fine-tuning of the model(s).

TIM Setup

TIM requires no setup of its mathematical internals and works well in business user mode. All that is required from a user is to let TIM know a forecasting routine and a desired prediction horizon. TIM can automatically recognize appropriate values for these mathematical internals, for example by recognizing that there is no weekly pattern. In some cases, however (e.g. short datasets), this can be difficult to recognize. It is therefore recommended to switch off the weekdays dictionary, since common sense already indicates that it will not contribute to the quality of the models in this scenario. If the target variable is only available with a delay of more than 6 hours, target offsets do not improve the model's quality and may thus be switched off. This allows faster model building while achieving the same accuracy.

Demo example

Data description

A prepared dataset for this problem, along with a prepared configuration YAML-file, can be downloaded here.

Target

The data used in this example comes from a wind farm in Spain. The GPS coordinates of this wind farm are 43.3544, -7.8811. Production data is available and can be downloaded from the following web page: http://www.sotaventogalicia.com/en/real-time-data/historical. The production of this wind farm is the target variable. It corresponds to the second column in the CSV file, right after the column with timestamps. In this case, the name of the target is Energy. This dataset contains hourly data.

Predictor candidates

The meteorological predictors used in this scenario are the wind speeds at heights of 10m, 80m, 100m and 120m and the wind direction at a height of 100m. In this demo, historical actuals are used for both model building and out-of-sample forecasting. Data used in this example range from 2018-04-16 to 2019-04-14.

Forecasting scenario

This example considers the scenario of one day ahead. Each day at 12:05, forecasts from 13:00 of the same day to the end of the next day are desired. The last available target value corresponds to the energy produced between 11:00 and 12:00, since the data is already updated at 12:05. The interval 11:00 – 12:00 is denoted with timestamp 12:00 in the original dataset, but TIM's convention is to denote the interval 11:00-12:00 as 11:00. Therefore, the timestamps in the dataset are moved one hour back. That means the last available timestamp is the one from 11:00 (11:00 - 12:00). Forecasts are made in hour 12:00 (12:00 – 13:00), for values from 13:00 to the end of the following day (day + 1). Thus, the target availability is S-1 (sample - 1) and the forecasting horizon is S+1 to D+1 (sample + 1 to day + 1). The meteorological predictors are available for every hour of the forecast.
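The timestamp convention and the resulting horizon can be checked with a short sketch. It is an illustration only, using Python's standard library; the function names are hypothetical and the date 2018-12-15 is simply picked from the demo's date range.

```python
from datetime import datetime, timedelta

SAMPLE = timedelta(hours=1)  # the demo dataset is hourly


def to_interval_start(interval_end):
    """Relabel an interval-end timestamp (original dataset) as interval-start (TIM)."""
    return interval_end - SAMPLE


def forecast_timestamps(forecast_time):
    """Timestamps from S+1 up to the last sample of D+1 for a given forecasting time."""
    current_sample = forecast_time.replace(minute=0, second=0, microsecond=0)
    first = current_sample + SAMPLE  # S+1
    # Last sample of the following day, i.e. 23:00 on D+1.
    last = current_sample.replace(hour=0) + timedelta(days=2) - SAMPLE
    stamps = []
    stamp = first
    while stamp <= last:
        stamps.append(stamp)
        stamp += SAMPLE
    return stamps


forecast_time = datetime(2018, 12, 15, 12, 5)

# The original dataset labels the interval 11:00-12:00 as 12:00.
last_available = to_interval_start(datetime(2018, 12, 15, 12, 0))  # 11:00, i.e. S-1

horizon = forecast_timestamps(forecast_time)  # 13:00 today .. 23:00 tomorrow
```

With hourly data this yields 35 forecast timestamps per run: 11 for the remainder of the current day and 24 for the next day.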

Model building and validation

The model is built using data from the range 2018-04-16 00:00:00 - 2018-12-14 23:00:00. Out-of-sample forecasts are made for the range 2018-12-15 00:00:00 - 2019-04-14 23:00:00. In this demo dataset, out-of-sample validation is performed using historical actuals of the meteorological data. A more representative validation may be obtained by using historical forecasts of the meteorological data instead.
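As a rough sketch of this split, the two ranges can be applied to a list of timestamped rows as follows. The rows here are synthetic placeholders on a gap-free hourly grid, not the actual dataset, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta

HOUR = timedelta(hours=1)

# Ranges taken from the text above (both bounds inclusive).
TRAIN = (datetime(2018, 4, 16, 0), datetime(2018, 12, 14, 23))
VALID = (datetime(2018, 12, 15, 0), datetime(2019, 4, 14, 23))


def hourly_grid(start, end):
    """All hourly timestamps from start to end, inclusive."""
    count = (end - start) // HOUR + 1
    return [start + i * HOUR for i in range(count)]


def split(rows):
    """Split (timestamp, value) rows into model building and out-of-sample parts."""
    train = [row for row in rows if TRAIN[0] <= row[0] <= TRAIN[1]]
    valid = [row for row in rows if VALID[0] <= row[0] <= VALID[1]]
    return train, valid


# Synthetic placeholder rows; a real run would read them from the CSV file.
rows = [(stamp, 0.0) for stamp in hourly_grid(TRAIN[0], VALID[1])]
train_rows, valid_rows = split(rows)
```

On a gap-free hourly grid, the model building range covers 243 days (5832 samples) and the out-of-sample range covers 121 days (2904 samples), so roughly two thirds of the data is used for model building.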

Demonstration

TIM Studio

This section covers the use of TIM Studio to solve the challenge described above. Additional information regarding TIM Studio can be found here.

Select workspace

In the Workspaces screen, select the workspace in which the dataset should be added. If there is no available workspace, create one by clicking "Add Workspace". In this solution template, the workspace called "TIM Solution Templates" is used.

In the Datasets screen, click "Add New Dataset". Stay in the CSV-File tab and enter the name of the dataset. In this example, the dataset is called Wind_single. Click "Browse" and select the dataset from the computer. Click "Add Dataset" to confirm.

Model building definition

Go to the Model Building Definition screen in the panel on the left. Click "Add New Definition" and fill in the desired definition name. In this demo, the MBD is called Wind_single day-ahead. In the next screen, select the dataset that was previously uploaded (Wind_single).

In step 2, define the desired forecasting scenario. In this example, the model is used each day at 12:00. Therefore, leave all Weekdays ranges checked. Then, set Hour ranges to 12 and leave Minutes ranges and Seconds ranges at 00. Look into the section about the Cron notation for more details. Since forecasts are to be made starting one sample ahead until the end of the next day, set "Forecast from" to Sample with offset 1 and set "Forecast to" to Day with offset 1. Look into the section about the relative time notation to learn more about this.

Click "Next" to advance to the next step. It is also possible to finalize all settings at this point, in which case everything else would be set up automatically. In this example, some more changes will be made to the data updates in the third step. The target variable, Energy, is updated at the fifth minute of each hour. Click on the small arrow next to Energy and change the settings of this variable. Leave all Weekdays ranges checked. Select "All" for the Hour ranges, since this variable is updated each hour, and set the Minutes ranges to 5, since the data is updated in the fifth minute of every hour. Leave the Seconds ranges at 0. Then, set "Update until" to Sample with offset -1, since the target variable is updated with a delay of one sample (at 12:00 it is updated until 11:00).

Leave the default settings for all other predictors, i.e. they are set to update at 8:20 until Day+1. Since forecasts are made at 12:00 and forecasts of the predictor values will be available for the next day, these default settings are alright; at what time they update exactly does not matter in this case.

Click "Next" to advance to step 4. Here, training regions can be selected. Since the goal is to move on to back-testing, this screen will be left in its default settings (i.e. Use All Data). Click "Next" to advance to the next screen.

In this fifth step, the mathematical settings can be changed. Weekdays will be switched off in TIM Transformations, since this dictionary is not relevant for wind forecasting. Then, click "Finalize" to complete the model building definition.

Experiments

Click "Experiments" in the panel on the left to move on to backtesting. Then, click "Make experiment" next to the correct model building definition (Wind_single day-ahead).

Click "Build Model" and select the appropriate training range, i.e. 2018-04-16 00:00:00 - 2018-12-14 23:00:00.

The in-sample prediction as well as the Model Tree Map become visible.

Click "Validate model" and select the correct Out-of-sample period for backtesting, i.e. 2018-12-15 00:00:00 - 2019-04-14 23:00:00.

This generates the aggregated forecasts for D+0 and D+1.

TIM Connector

This section covers the use of TIM Connector to solve the challenge described above. Additional information regarding TIM Connector can be found in the respective section.

2. Create folder with dataset

Create a folder, e.g. TIM_Datasets, which contains the dataset folder (Wind_single) with the dataset file [data.csv] and the configuration file [conf.yaml].

TIM_Datasets/
    Wind_single/
        data.csv
        conf.yaml
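Before calling the connector, it can be useful to confirm that each dataset folder has the expected layout. The following sketch is not part of TIM Connector; it is a hypothetical helper that only checks for the two file names mentioned above, and it demonstrates itself on a temporary directory with placeholder files.

```python
import tempfile
from pathlib import Path

REQUIRED_FILES = ("data.csv", "conf.yaml")  # file names taken from the text above


def missing_files(datasets_root):
    """Map each dataset folder under datasets_root to the required files it lacks."""
    problems = {}
    folders = sorted(p for p in Path(datasets_root).iterdir() if p.is_dir())
    for folder in folders:
        missing = [name for name in REQUIRED_FILES if not (folder / name).is_file()]
        if missing:
            problems[folder.name] = missing
    return problems


# Demonstration on a throwaway directory with placeholder file contents.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "TIM_Datasets"
    dataset = root / "Wind_single"
    dataset.mkdir(parents=True)

    (dataset / "data.csv").write_text("placeholder\n")
    before = missing_files(root)  # conf.yaml is still missing here

    (dataset / "conf.yaml").write_text("placeholder\n")
    after = missing_files(root)  # nothing missing any more
```

An empty result means every dataset folder contains both files; the check says nothing about whether their contents are valid.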

The YAML configuration file defines the forecasting scenario described above.

Model building: Model building considers the range 2018-04-16 00:00:00 - 2018-12-14 23:00:00. The target variable, Energy, is updated with a delay of one sample (at 12:00 only until 11:00). It is updated at the fifth minute of each hour.

Configuration: The model is used repeatedly each day at 12:00. Forecasts are made starting one sample ahead up to the end of the next day. All features except calendar variables (WeekRest) are used; for more information about feature dictionaries, take a look at the appropriate section.

Forecasting: Out-of-sample forecasts are made for the range 2018-12-15 00:00:00 - 2019-04-14 23:00:00.

The content of the exemplary YAML configuration file is shown below:

version: "1.0"
type: Forecasting

modelBuilding:
  data:
    rows:
      - from: 2018-04-16 00:00:00
        to:   2018-12-14 23:00:00
      - uniqueName: Energy
        updateUntil:
          baseUnit: Sample
          offset: -1
        updateTime:
          - type: Hour
            value: "*"
          - type: Minute
            value: "5"

  configuration:
    usage:
      usageType: Repeating
      usageTime:
        - type: Day
          value: "*"
        - type: Hour
          value: "12"
        - type: Minute
          value: "0"
      predictionFrom:
        baseUnit: Sample
        offset: 1
      predictionTo:
        baseUnit: Day
        offset: 1
    features: [MovingAverage, Intercept, PiecewiseLinear, TimeOffsets, Polynomial, PeriodicComponents]

forecasting:
  configuration:
    predictionScope:
      type: Ranges
      ranges:
        - from: 2018-12-15 00:00:00
          to:   2019-04-14 23:00:00

3. Call connector from the command line

Using a terminal, change the current directory to TIM Connector's builddir with the command:

> cd pathToConnector\builddir

Then, call the connector with the following command:

> pathToConnector\timconnect.exe path\to\TIM_Datasets
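The same call could also be scripted. The sketch below only assembles the command and working directory described above and hands them to Python's `subprocess` module. The `C:\` prefixes are hypothetical stand-ins for the placeholder paths, and the assumption that a failed run is signalled by a non-zero exit code (`check=True`) is not confirmed by the text.

```python
import subprocess
from pathlib import PureWindowsPath


def connector_call(connector_dir, datasets_dir):
    """Build the command line and working directory for one connector run."""
    connector = PureWindowsPath(connector_dir)
    command = [
        str(connector / "timconnect.exe"),  # executable named in the text
        str(PureWindowsPath(datasets_dir)),  # folder that holds the dataset folders
    ]
    working_dir = str(connector / "builddir")
    return command, working_dir


def run_connector(connector_dir, datasets_dir):
    """Run the connector from its builddir (Windows only, not executed here)."""
    command, working_dir = connector_call(connector_dir, datasets_dir)
    # Assumption: a non-zero exit code indicates failure.
    return subprocess.run(command, cwd=working_dir, check=True)


# Hypothetical absolute paths standing in for pathToConnector and path\to\TIM_Datasets.
command, working_dir = connector_call(r"C:\pathToConnector", r"C:\path\to\TIM_Datasets")
```

Passing the arguments as a list avoids quoting problems when a path contains spaces. The credentials prompt described in the next step would still appear when `run_connector` is called.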

4. Fill in user credentials

Following the previous command, the user will be prompted to fill in their user credentials. Fill in the correct information and click "OK" to continue.

Output in console:

Output in folder:

Predictions: Report/timeStamp/conf/prediction.csv
Errors: Report/timeStamp/conf/accuracy.txt

The following accuracies were reported by TIM:

Model building stage: RMSE = 1473.86, MAE = 1028.11
Validation stage: RMSE = 1911.4, MAE = 1332.54