GEFCom 2014 Electricity Price
- Problem description
- Data Recommendation Template
- TIM Setup
- Demo example
- Predictor candidates
- Forecasting scenario
- Model building and validation
The Global Energy Forecasting Competition (GEFCom) is a competition conducted by a team led by Dr. Tao Hong that invites submissions from around the world for forecasting energy demand. GEFCom was first held in 2012 on Kaggle, and the second GEFCom was held in 2014 on CrowdANALYTIX. Tangent Works participated in the 2017 competition using TIM and was among the winning teams. Before that competition, we first tried TIM on the 2014 problems. This solution shows how to use TIM to solve one of them.
The probabilistic price forecasting track asked contestants to forecast the probability distribution (as quantiles) of the electricity price for one zone, on a rolling basis, 24 hours ahead. Contestants provided forecasts in 15 rounds, each for a different day. Incremental price and load data were provided in each round.
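To make the idea of a forecast "in quantiles" concrete, the sketch below scores a set of quantile forecasts with the pinball loss, a common measure for quantile forecasts. The function names and the numbers are hypothetical and serve only as illustration; this is not TIM's own scoring code.

```python
def pinball_loss(actual, quantile_forecast, q):
    """Pinball loss of a single quantile forecast at level q (0 < q < 1)."""
    if actual >= quantile_forecast:
        # Forecast was too low: penalise the gap with weight q.
        return q * (actual - quantile_forecast)
    # Forecast was too high: penalise the gap with weight (1 - q).
    return (1.0 - q) * (quantile_forecast - actual)


def mean_pinball_loss(actual, forecasts_by_level):
    """Average pinball loss over a dict {quantile level: forecast value}."""
    losses = [pinball_loss(actual, f, q) for q, f in forecasts_by_level.items()]
    return sum(losses) / len(losses)


# Hypothetical observed price and three hypothetical quantile forecasts.
example_forecasts = {0.1: 40.0, 0.5: 48.0, 0.9: 60.0}
score = mean_pinball_loss(50.0, example_forecasts)
print(round(score, 6))
```

A lower score means the quantiles sit closer to the observed value, with errors weighted asymmetrically according to the quantile level.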
Data Recommendation Template
In the price forecasting track of the competition that we will look at, only 2 predictors were available: forecasts of electricity load for two different zones. In general, however, more predictors influence the price and should be included where possible. These are mostly meteorological data, and their influence varies from country to country depending on the composition of the energy sources. For example, in countries with a large share of renewables, solar-related forecasts (such as global horizontal irradiation) matter a lot.
TIM requires no setup of its mathematical internals and works well in business user mode. All that is required from the user is to tell TIM the forecasting routine and the desired prediction horizon.
The prepared dataset can be downloaded here.
The target variable is the electricity price. The data are measured hourly from the beginning of 2011 to the end of 2013.
The predictor candidates are zonal load forecasts from 2 different areas.
The timestamp is the first column, and each timestamp marks the beginning of the period it corresponds to. For example, ‘Price’ in the row with timestamp 2011-01-01 00:00:00 is the average price over the period between 2011-01-01 00:00:00 and 2011-01-01 01:00:00.
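The timestamp convention can be sketched in a few lines of Python. The helper name `covered_period` is hypothetical, and the one-hour step is an assumption taken from the hourly sampling of the dataset.

```python
from datetime import datetime, timedelta


def covered_period(timestamp, step=timedelta(hours=1)):
    """Return (start, end) of the period that a row's timestamp stands for.

    The timestamp marks the beginning of the period, so the end is simply
    the timestamp plus one sampling step.
    """
    start = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    return start, start + step


start, end = covered_period("2011-01-01 00:00:00")
print(start, "to", end)
```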
In this example we simulate the day-ahead scenario used in the competition. Each day at 23:59 we want forecasts for each hour of the next day. The last available target value is from 23:00 of the same day, and both load predictors are available for every hour of the prediction horizon.
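As a rough illustration of this scenario, the sketch below lists the timestamps to be forecast and the last available target timestamp for a given forecasting time. The function names and the example date are hypothetical; TIM derives all of this from its own configuration.

```python
from datetime import datetime, timedelta


def day_ahead_targets(forecast_time):
    """Return the 24 hourly timestamps of the day after forecast_time."""
    next_midnight = (forecast_time + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return [next_midnight + timedelta(hours=h) for h in range(24)]


def last_available_target(forecast_time):
    """Return the timestamp of the last target value known at forecast_time.

    Assumes the target is updated with no delay, so the row for the
    current hour is already available.
    """
    return forecast_time.replace(minute=0, second=0, microsecond=0)


forecast_time = datetime(2013, 6, 16, 23, 59)  # hypothetical forecasting time
targets = day_ahead_targets(forecast_time)
print(last_available_target(forecast_time), targets[0], targets[-1])
```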
Model building and validation
The model is built on the range from 2011-01-01 00:00:00 to 2013-06-15 23:00:00. Out-of-sample forecasts are made on the ranges as described in the table above. The proper way to emulate the competition setup would be to create 14 different models and their day-ahead forecasts (the building period would grow with new data every time). For simplicity, however, we keep the building period static.
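The difference between the static building period used here and an expanding one can be sketched as follows. The helper `building_range` and the example day are hypothetical, and the expanding variant assumes that all data up to 23:00 of the day before the forecast day would be used.

```python
from datetime import date, datetime, time, timedelta

BUILD_START = datetime(2011, 1, 1, 0, 0)
STATIC_BUILD_END = datetime(2013, 6, 15, 23, 0)


def building_range(forecast_day, expanding=False):
    """Return (start, end) of the model building range for forecast_day."""
    if not expanding:
        # Static setup used in this solution: same range for every forecast.
        return BUILD_START, STATIC_BUILD_END
    # Expanding setup: use everything up to 23:00 of the previous day.
    last_hour = datetime.combine(forecast_day, time(0, 0)) - timedelta(hours=1)
    return BUILD_START, last_hour


hypothetical_day = date(2013, 7, 5)
static_range = building_range(hypothetical_day)
expanding_range = building_range(hypothetical_day, expanding=True)
print(static_range[1], expanding_range[1])
```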
This section covers the use of TIM Studio to solve the challenge described above. Additional information regarding TIM Studio can be found here.
In the Workspaces screen, select the workspace in which the dataset should be added. If there is no available workspace, create one by clicking "Add Workspace". In this solution template, the workspace called "TIM Solution Templates" is used.
In the Datasets screen click on Add New Dataset. Stay in the tab CSV-File and enter the name of the dataset. In this example, the dataset is called "electricity price_ISO". Click "Browse" and select the dataset from the computer. Click "Add Dataset" to confirm.
To get a feel for what TIM is capable of, try going straight to the forecasting screen. There you can see the last 10 days of this dataset. By clicking on the flash icon next to its name and filling in "Day" with offset 10, you can produce a 10-day-ahead forecast. No additional information is needed, and after a couple of seconds you arrive at a forecast.
However, to understand how accurate TIM is, we will go back to the original backtest scenario where we first build our model on one period of data and then deploy it on another.
Model building definition
Go to the Model Building Definition screen in the panel on the left. Click "Add New Definition" and fill in the desired definition name. In this demo, the MBD is called "day-ahead". In the next screen, select your dataset.
In step 2, define the desired forecasting scenario. In this example, the model is used each day at 23:59. Therefore, leave all Weekdays ranges checked on. Then, set Hour ranges to 23, Minutes to 59 and Seconds to 00. Look into the section about the Cron notation for more details. Since forecasts are to be made for the whole next day, set "Forecasts from" to Day with offset 1 and set "Forecast to" to Day with offset 1. Look into the section about the relative time notation to learn more about this notation.
Click "Next" to advance to the next step. It is also possible to finalize all settings at this point, in which case everything else would be set up automatically. In this example, some more changes will be made to the data updates in the third step. Suppose that the target variable is updated at the same time as we forecast, at 23:59. Click on the small arrow next to Price and change the settings of this variable. Leave all Weekdays ranges checked on. Set Hour ranges to 23, Minutes to 59 and Seconds to 00. Then, set "Update until" to Sample with offset 0, since the target is fully updated when we forecast.
Leave the default settings for all other predictors, i.e. they are set to update at 8:20 until Day+1. Since forecasts are made at 23:59 and forecasts of the predictor values will be available for the next day, these default settings are alright; at what time they update exactly does not matter in this case.
Then, click "Finalize" to complete the model building definition. Everything else can be left on automatic.
Click "Experiments" in the panel on the left to move on to backtesting. Then, click "Make experiment" next to the correct model building definition ("day-ahead").
Click "Build Model" and select the appropriate training range, i.e. 2011-01-01 00:00:00 - 2013-06-15 23:00:00.
The in-sample prediction as well as the Model Tree Map become visible.
Click "Validate model" and select the Out-of-sample period for backtesting, i.e. 2013-06-16 00:00:00 - 2013-04-29 23:00:00.
This generates the aggregated forecasts for D+1. Now you can compare your in-sample and out-of-sample accuracy.
1. Download TIM Connector
You can find download links in TIM Connector's section.
2. Create folder with dataset
Create a folder, e.g. gefcom2014, containing the dataset file [electricity price_ISO.csv] and the configuration file [conf.yaml]:
gefcom2014/
    electricity price_ISO.csv
    conf.yaml
The YAML configuration defines the forecasting scenario described above.
Model building: The model is built on the range from 2011-01-01 00:00:00 to 2013-06-15 23:00:00. The target, named Price, is updated with no delay every day at 23:59:00. Predictors are automatically set to be updated at that time, up until the end of the forecasting horizon.
Configuration: The model is used repeatedly each day at 23:59. Forecasts are made for each hour of the next day.
Forecasting: Out-of-sample forecasts are made on the multiple listed ranges.
version: "1.0"
type: Forecasting
modelBuilding:
  data:
    rows:
      - from: 2011-01-01 00:00:00
        to: 2013-06-15 23:00:00
    updates:
      - uniqueName: Price
        updateUntil:
          baseUnit: Sample
          offset: 0
        updateTime:
          - type: Day
            value: "*"
          - type: Hour
            value: "23"
          - type: Minute
            value: "59"
  configuration:
    usage:
      usageType: Repeating
      usageTime:
        - type: Day
          value: "*"
        - type: Hour
          value: "23"
        - type: Minute
          value: "59"
      predictionFrom:
        baseUnit: Day
        offset: 1
      predictionTo:
        baseUnit: Day
        offset: 1
forecasting:
  configuration:
    predictionScope:
      type: Ranges
      ranges:
        - from: 2013-06-16 23:00:00
          to: 2013-06-16 23:00:00
        - from: 2013-06-17 23:00:00
          to: 2013-06-17 23:00:00
        - from: 2013-06-24 23:00:00
          to: 2013-06-24 23:00:00
        - from: 2013-07-04 23:00:00
          to: 2013-07-04 23:00:00
        - from: 2013-07-09 23:00:00
          to: 2013-07-09 23:00:00
        - from: 2013-07-13 23:00:00
          to: 2013-07-13 23:00:00
        - from: 2013-07-16 23:00:00
          to: 2013-07-16 23:00:00
        - from: 2013-07-18 23:00:00
          to: 2013-07-18 23:00:00
        - from: 2013-07-19 23:00:00
          to: 2013-07-19 23:00:00
        - from: 2013-07-20 23:00:00
          to: 2013-07-20 23:00:00
        - from: 2013-07-24 23:00:00
          to: 2013-07-24 23:00:00
        - from: 2013-07-25 23:00:00
          to: 2013-07-25 23:00:00
        - from: 2013-12-07 23:00:00
          to: 2013-12-07 23:00:00
        - from: 2013-12-08 23:00:00
          to: 2013-12-08 23:00:00
        - from: 2013-12-17 23:00:00
          to: 2013-12-17 23:00:00
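Typing the ranges list in the configuration above by hand is error-prone, so one option is to generate its entries from a list of dates. The sketch below does this with plain string formatting; the helper name `range_entries` is hypothetical, and the indentation width is an assumption about how deeply the list is nested in conf.yaml.

```python
# Dates of the out-of-sample ranges, copied from the configuration above.
ROUND_DATES = [
    "2013-06-16", "2013-06-17", "2013-06-24", "2013-07-04", "2013-07-09",
    "2013-07-13", "2013-07-16", "2013-07-18", "2013-07-19", "2013-07-20",
    "2013-07-24", "2013-07-25", "2013-12-07", "2013-12-08", "2013-12-17",
]


def range_entries(dates, clock="23:00:00", indent=8):
    """Return YAML lines with one from/to pair per date."""
    pad = " " * indent
    lines = []
    for day in dates:
        lines.append(f"{pad}- from: {day} {clock}")
        lines.append(f"{pad}  to: {day} {clock}")
    return lines


yaml_lines = range_entries(ROUND_DATES)
print("\n".join(yaml_lines))
```

The printed lines can be pasted under the `ranges:` key after checking that the indentation matches the rest of the file.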
3. Call connector from the command line (terminal)
First, change the directory to TIM Connector's builddir with the command:
> cd pathToConnector\builddir
Then, call the connector with the following command:
> pathToConnector\timconnect.exe path\to\gefcom2014\conf.yaml
4. Fill in user credentials
Following the previous command, the user will be prompted to fill in their user credentials. Fill in the correct information and click "OK" to continue.
Output in console:
Output in folder:
The following accuracies were reported by TIM:
Model building stage: MAPE = 8.46%, RMSE = 7.32, MAE = 4.41
Validation stage: MAPE = 13.72%, RMSE = 12.21, MAE = 7.57
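For reference, the three accuracy measures can be computed as in the sketch below, which uses their standard textbook definitions. It is assumed, not confirmed, that TIM computes them in exactly this way, and the numbers in the example are hypothetical.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(actual)


# Hypothetical prices and forecasts, for illustration only.
actual_prices = [100.0, 200.0]
forecast_prices = [110.0, 180.0]
print(mae(actual_prices, forecast_prices))
print(rmse(actual_prices, forecast_prices))
print(mape(actual_prices, forecast_prices))
```

MAPE divides by the actual value, so it becomes unstable when prices are close to zero; MAE and RMSE do not have this limitation.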