Predicting demand, i.e., how many units of a certain product will be sold at certain times, is one of the most important tasks in retail. It is directly linked to costs incurred and to sales opportunities, and thus has an impact on P&L; the implications of an incorrect forecast are significant. If there is too much of a particular product in the warehouse, too much capital is tied up in inventory that just sits there and occupies space (and capital) that could have been used elsewhere (or a smaller warehouse might have sufficed). On the other hand, if there is too little, the product runs out of stock too soon, which may translate into lower revenue.
It is not only about the hard numbers, though. Reducing situations when a product is out of stock helps to preserve a positive customer experience, and demand forecasting impacts a number of other areas of the business as well. Various use cases can be supported by demand forecasting; we present two of them as examples.
| | |
|---|---|
| Business objective: | Availability of product(s) in demand |
| KPI: | Revenue per given product in given timeframe |
| | |
|---|---|
| Business objective: | Minimize turnaround time of items on stock |
| Value: | Maximize value of warehouse |
| KPI: | Average turnaround time for given product |
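As an illustration of the second KPI, the average turnaround time per product could be computed from stock movement records. A minimal pandas sketch, assuming hypothetical columns `product`, `received` and `sold` (none of these come from the dataset used below):

```python
import pandas as pd

# Hypothetical stock records; 'received' and 'sold' are the dates a unit
# entered and left the warehouse (column names are assumptions for illustration).
stock = pd.DataFrame({
    'product': ['jacket', 'jacket', 'shorts'],
    'received': pd.to_datetime(['2021-01-01', '2021-01-05', '2021-01-02']),
    'sold': pd.to_datetime(['2021-01-08', '2021-01-09', '2021-01-06']),
})

# Turnaround time of each unit, then the KPI: average per product
stock['turnaround_days'] = (stock['sold'] - stock['received']).dt.days
avg_turnaround = stock.groupby('product')['turnaround_days'].mean()
print(avg_turnaround.to_dict())  # -> {'jacket': 5.5, 'shorts': 4.0}
```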
```python
import logging
import json
import datetime

import numpy as np
import pandas as pd
import plotly as plt
import plotly.express as px
import plotly.graph_objects as go

import tim_client
```
Credentials and logging
(Do not forget to fill in your credentials in the credentials.json file)
```python
with open('credentials.json') as f:
    credentials_json = json.load(f)  # loading the credentials from credentials.json

TIM_URL = 'https://timws.tangent.works/v4/api'  # URL to which the requests are sent
SAVE_JSON = False  # if True - JSON requests and responses are saved to JSON_SAVING_FOLDER
JSON_SAVING_FOLDER = 'logs/'  # folder where the requests and responses are stored
LOGGING_LEVEL = 'INFO'
```
```python
level = logging.getLevelName(LOGGING_LEVEL)
logging.basicConfig(level=level,
                    format='[%(levelname)s] %(asctime)s - %(name)s:%(funcName)s:%(lineno)s - %(message)s')
logger = logging.getLogger(__name__)
```
```python
credentials = tim_client.Credentials(credentials_json['license_key'],
                                     credentials_json['email'],
                                     credentials_json['password'],
                                     tim_url=TIM_URL)
api_client = tim_client.ApiClient(credentials)
api_client.save_json = SAVE_JSON
api_client.json_saving_folder_path = JSON_SAVING_FOLDER
```
```
[INFO] 2021-01-15 13:50:24,436 - tim_client.api_client:save_json:66 - Saving JSONs functionality has been disabled
[INFO] 2021-01-15 13:50:24,438 - tim_client.api_client:json_saving_folder_path:75 - JSON destination folder changed to logs
```
The data contain units sold for a particular product category (jackets and shorts), enhanced with information about holidays, weather, lockdowns and other factors.

The data are sampled on a daily basis and may contain occasional gaps.
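Before modelling, it can be useful to locate such gaps. A minimal sketch in plain pandas (this is not part of the TIM API; the tiny frame below is synthetic, though the column names DATE and UNITS match the dataset):

```python
import pandas as pd

# Synthetic daily series with one missing day (2021-01-03) for illustration
df = pd.DataFrame({
    'DATE': pd.to_datetime(['2021-01-01', '2021-01-02', '2021-01-04', '2021-01-05']),
    'UNITS': [10, 12, 9, 11],
})

# Reindex to the full daily range; days absent from the data become NaN
full_range = pd.date_range(df['DATE'].min(), df['DATE'].max(), freq='D')
series = df.set_index('DATE')['UNITS'].reindex(full_range)

missing_days = series[series.isna()].index
print(list(missing_days.strftime('%Y-%m-%d')))  # -> ['2021-01-03']
```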
| Column | Description | Type | Availability |
|---|---|---|---|
| Min_TEMP_C | Min. temperature for particular day | Predictor | t+2 |
| Max_TEMP_C | Max. temperature for particular day | Predictor | t+2 |
| Clear | Indicator of weather - clear sky | Predictor | t+2 |
| Clouds | Indicator of weather - clouds | Predictor | t+2 |
| Drizzle | Indicator of weather - drizzle | Predictor | t+2 |
| Fog | Indicator of weather - fog | Predictor | t+2 |
| Mist | Indicator of weather - mist | Predictor | t+2 |
| Rain | Indicator of weather - rain | Predictor | t+2 |
| Snow | Indicator of weather - snow | Predictor | t+2 |
| Thunderstorm | Indicator of weather - thunderstorm | Predictor | t+2 |
| Christmas | Christmas period indicator (binary) | Predictor | t+2 |
| Black Friday | Black Friday period indicator (binary) | Predictor | t+2 |
| New Year | New Year period indicator (binary) | Predictor | t+2 |
| Peak Period | The most active period indicator (binary) | Predictor | t+2 |
| Lockdown | Indicator of lockdown (binary) | Predictor | t+2 |
For predictors with availability t+N, where N > 0, it is assumed that forecast values are available N steps ahead of the last target value.
TIM detects the forecasting situation from the current "shape" of the data; e.g., if the last target value is available for Jan 21st, it will start forecasting as of Jan 22nd. It also takes the Jan 21st timestamp as the reference point against which the availability of each column in the dataset is determined. The same rule is then followed during back-testing when calculating results for the out-of-sample interval.
In our case, all predictor values were available 2 steps ahead.
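To illustrate how this availability can be read off from the shape of the data, here is a small synthetic frame in which the target ends two days before the predictors (the values are invented; only the column names match the dataset):

```python
import numpy as np
import pandas as pd

# Synthetic illustration of the forecasting situation: the target (UNITS) ends
# two days before the predictors, so predictors are available t+2 steps ahead.
dates = pd.date_range('2021-01-18', '2021-01-23', freq='D')
df = pd.DataFrame({
    'DATE': dates,
    'UNITS': [5, 7, 6, 8, np.nan, np.nan],  # last actual value on Jan 21st
    'Min_TEMP_C': [1, 0, 2, 3, 2, 1],       # forecast values through Jan 23rd
})

# Reference point: timestamp of the last available target value
last_target = df.loc[df['UNITS'].notna(), 'DATE'].max()
last_predictor = df.loc[df['Min_TEMP_C'].notna(), 'DATE'].max()
availability = (last_predictor - last_target).days
print(last_target.date(), availability)  # -> 2021-01-21 2, i.e. predictor available t+2
```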
We want to back-test forecasting of units sold for the day after tomorrow, and we will do it every 2 days. A prediction horizon of 2 days ahead will automatically produce out-of-sample values predicted with a rolling window of 2.
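The resulting back-testing schedule can be sketched as follows (the dates and helper names are illustrative, not part of the TIM API):

```python
import pandas as pd

# Sketch of the back-testing schedule described above: forecast the day after
# tomorrow (horizon = 2 days), with a new forecast origin every 2 days.
horizon_days = 2
step_days = 2

origins = pd.date_range('2021-10-01', '2021-10-07', freq=f'{step_days}D')
schedule = [(o.date(), (o + pd.Timedelta(days=horizon_days)).date()) for o in origins]
for origin, target_day in schedule:
    print(f'origin {origin} -> predict {target_day}')
```

Each origin only predicts a day for which all predictors are already available (t+2), so the out-of-sample values cover every second day of the interval.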
CSV files used in experiments can be downloaded here.
```python
dataset1 = 'data_jackets.csv'
dataset2 = 'data_shorts.csv'
```
```python
data = tim_client.load_dataset_from_csv_file(dataset1, sep=',')
```
The NaN values at the end of the target (UNITS) column reflect the state of the data at the time of forecasting; e.g., on October 4th, with the last available data point for October 3rd, we want to predict UNITS for October 5th.
Columns of the dataset: DATE, UNITS, Min_TEMP_C, Max_TEMP_C, Clear, Clouds, Drizzle, Fog, Mist, Rain, Snow, Thunderstorm, Christmas, Black Friday, New Year, Peak Period, Lockdown.
Visualisation of the data:
```python
timestamp_column = 'DATE'
target_column = 'UNITS'
```
```python
fig = plt.subplots.make_subplots(rows=1, cols=1, shared_xaxes=True, vertical_spacing=0.02)
fig.add_trace(go.Scatter(x=data.loc[:, timestamp_column],
                         y=data.loc[:, target_column],
                         name=target_column,
                         line=dict(color='blue')),
              row=1, col=1)
fig.update_layout(height=500, width=1000)
fig.show()
```