Contact center volumes forecasting (week-ahead)

Title: Contact center volumes forecasting (week-ahead)
Author: Tangent Works
Industry: Contact centers, Shared services centers
Area: Workforce management
Type: Forecasting

Description

Contact centers rely on a pool of resources ready to help customers when they reach out via call, email, chat, or another channel. For contact centers, predicting the volume of incoming requests at specific times is a critical input to resource scheduling (very short- and short-term horizons) and resource management (mid- to long-term horizons). A typical short-term task is predicting volumes for the next 7 days, hour by hour. A high-quality forecast brings confidence that the FTEs (full-time equivalents, a measure of employee workload) planned for the next week are just right for delivering on SLAs. It also brings other benefits, such as higher confidence when planning absences (vacation, education, etc.) and better employee morale, since staff are not overloaded by "sudden" volume peaks.

To build a high-quality forecast, it is necessary to gather relevant, valid data with predictive power. With such data in place, ML technology like TIM RTInstantML can build models for time-series data in a fraction of the usual time.

In this sample use case we will showcase how TIM can predict request volumes for the next 7 days on an hourly basis.

Business parameters

Business objective | Business value | KPI
Reduce risk of resource shortage | Optimal resource planning | -
Reduce risk of not meeting SLAs | Better customer relations, lower/no penalties | -
Reduce effort spent on forecasting | Free up capacity of highly skilled personnel | -
In [1]:
import logging
import pandas as pd
import plotly as plt
import plotly.express as px
import plotly.graph_objects as go
import numpy as np
import json

import tim_client
In [2]:
with open('credentials.json') as f:
    credentials_json = json.load(f)                     # loading the credentials from credentials.json

TIM_URL = 'https://timws.tangent.works/v4/api'          # URL to which the requests are sent

SAVE_JSON = False                                       # if True - JSON requests and responses are saved to JSON_SAVING_FOLDER
JSON_SAVING_FOLDER = 'logs/'                            # folder where the requests and responses are stored

LOGGING_LEVEL = 'INFO'
In [3]:
level = logging.getLevelName(LOGGING_LEVEL)
logging.basicConfig(level=level, format='[%(levelname)s] %(asctime)s - %(name)s:%(funcName)s:%(lineno)s - %(message)s')
logger = logging.getLogger(__name__)
In [4]:
credentials = tim_client.Credentials(credentials_json['license_key'], credentials_json['email'], credentials_json['password'], tim_url=TIM_URL)
api_client = tim_client.ApiClient(credentials)

api_client.save_json = SAVE_JSON
api_client.json_saving_folder_path = JSON_SAVING_FOLDER
[INFO] 2021-08-17 09:48:33,191 - tim_client.api_client:save_json:66 - Saving JSONs functionality has been disabled
[INFO] 2021-08-17 09:48:33,192 - tim_client.api_client:json_saving_folder_path:75 - JSON destination folder changed to logs

Dataset

The dataset contains request volumes, temperature, a public-holiday flag, the number of regular customers, a marketing-campaign flag, the number of customers whose contracts will expire within the next 30 or 60 days, the number of invoices sent, a flag indicating whether invoices are sent at the given timestamp, and a flag indicating whether the contact center is open at the given timestamp.

Sampling

Hourly.

Data

Structure of CSV file:

Column name | Description | Type | Availability
Date | Timestamp | Timestamp column | -
Volumes | No. of requests | Target | t+0
Temperature | Temperature in Celsius | Predictor | t+168
PublicHolidays | Binary flag for holidays | Predictor | t+168
IsOpen | Binary flag to show if contact center is open at given timestamp | Predictor | t+168
IsMktingCampaign | Binary flag to show if product team is running marketing campaign at given timestamp | Predictor | t+168
ContractsToExpireIn30days | No. of regular contracts that will expire within 30 days | Predictor | t+168
ContractsToExpireIn60days | No. of regular contracts that will expire within 60 days | Predictor | t+168
RegularCustomers | No. of active contracts for regular customers | Predictor | t+168
InvoiceDay | Binary flag to show if invoices are sent at given timestamp | Predictor | t+168
InvoicesSent | No. of invoices sent at given timestamp | Predictor | t+168

Data situation

We want to predict the volume for each hour of the next 7 days, with the forecast made at 23:00 every day. This situation is reflected in the values present in the CSV file: the target is missing beyond the time of prediction, while the predictors are filled in for the whole horizon. TIM will simulate this situation throughout the whole out-of-sample interval to calculate accuracy metrics.
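This can be verified directly on the file (a minimal sanity-check sketch, assuming the data2B.csv file introduced below is available locally): the trailing rows, one full prediction horizon, should have the target missing while every predictor column is still populated.

import pandas as pd

df = pd.read_csv('data2B.csv')

horizon = 7 * 24                 # 168 hourly samples ahead
tail = df.tail(horizon)          # the forecast horizon at the end of the file

print(tail['Volumes'].isna().all())                          # True - target unknown ahead
print(tail.drop(columns=['Volumes']).notna().all().all())    # True - predictors known up to t+168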

The CSV file used in the experiments can be downloaded here.

Source

This is a synthetic dataset generated by simulating the outcome of events relevant to the operations of a contact center.

In [5]:
data = tim_client.load_dataset_from_csv_file('data2B.csv', sep=',')          # load the dataset from CSV

data
Out[5]:
Date Volumes Temperature PublicHolidays IsOpen IsMktingCampaign ContractsToExpireIn30days ContractsToExpireIn60days RegularCustomers InvoiceDay InvoicesSent
0 2012-05-13 08:00:00 0.0 7.5 0 0 0 23663 42428 56976 0 0
1 2012-05-13 09:00:00 0.0 7.5 0 0 0 23665 42428 56976 0 0
2 2012-05-13 10:00:00 0.0 7.5 0 0 0 23667 42428 56976 0 0
3 2012-05-13 11:00:00 0.0 8.6 0 0 0 23669 42428 56976 0 0
4 2012-05-13 12:00:00 0.0 8.6 0 0 0 23670 42428 56976 0 0
... ... ... ... ... ... ... ... ... ... ... ...
23099 2014-12-31 19:00:00 NaN -0.6 0 0 0 57389 58832 79186 0 0
23100 2014-12-31 20:00:00 NaN -0.6 0 0 0 57390 58833 79187 0 0
23101 2014-12-31 21:00:00 NaN -0.6 0 0 0 57390 58834 79188 0 0
23102 2014-12-31 22:00:00 NaN -0.3 0 0 0 57391 58834 79189 0 0
23103 2014-12-31 23:00:00 NaN -0.3 0 0 0 57392 58835 79190 0 0

23104 rows × 11 columns

In [6]:
target_column = 'Volumes'

timestamp_column = 'Date'

Visualization

In [7]:
fig = go.Figure()

fig.add_trace( go.Scatter( x=data.iloc[:]['Date'], y=data.iloc[:][ target_column ] ) )     

fig.update_layout( width=1300, height=700, title='Volumes' )

fig.show()

Engine settings

Parameters that need to be set:

  • predictionTo defines the prediction horizon; it is set to 7*24 samples as we want to predict volumes for the next 7 days.
  • backtestLength defines the length of the out-of-sample interval.

We also ask the engine for additional output to see details of sub-models, so we define the extendedOutputConfiguration parameter as well.

In [8]:
back_test_length = int( data.shape[0] * .33 )    # out-of-sample interval: the last 33% of the data

prediction_horizon_samples = 7*24                # 168 hourly samples, i.e. 7 days ahead
In [9]:
configuration_backtest = {
    'usage': {
        'predictionTo': {
            'baseUnit': 'Sample',                      # horizon defined in samples
            'offset': prediction_horizon_samples       # predict 168 samples (7 days) ahead
        },
        'backtestLength': back_test_length             # length of the out-of-sample interval
    },
    'extendedOutputConfiguration': {
        'returnExtendedImportances': True              # return details of sub-models
    }
}

Experiment iteration(s)

In [10]:
backtest = api_client.prediction_build_model_predict(data, configuration_backtest)
              
backtest.status                                                                                 
Out[10]:
'Finished'
In [11]:
backtest.result_explanations
Out[11]:
[]

Insights - inspecting ML models

Simple and extended importances show to what extent each predictor contributes to explaining the variance of the target variable.

In [12]:
simple_importances = backtest.predictors_importances['simpleImportances']
simple_importances = sorted(simple_importances, key = lambda i: i['importance'], reverse=True) 

simple_importances = pd.DataFrame.from_dict( simple_importances )
In [13]:
fig = go.Figure()

fig.add_trace( go.Bar( x = simple_importances['predictorName'],
                      y = simple_importances['importance'] ) )

fig.update_layout(
        title='Simple importances',
        width = 1200,
        height = 700
)

fig.show()
In [14]:
extended_importances = backtest.predictors_importances['extendedImportances']
extended_importances = sorted(extended_importances, key = lambda i: i['importance'], reverse=True) 

extended_importances = pd.DataFrame.from_dict( extended_importances )
In [15]:
extended_importances[ extended_importances['time']=='11:00:00' ]
Out[15]:
time type termName importance
20 11:00:00 Interaction IsOpen & RegularCustomers 16.62
31 11:00:00 Interaction IsOpen & Temperature 13.18
50 11:00:00 Interaction IsOpen & RegularCustomers(t-32) 10.95
57 11:00:00 Interaction IsOpen & Temperature(t-7) 9.73
80 11:00:00 Interaction cos(2πt / 12.0 hours) & cos(2πt / 2.0 hours) 7.12
87 11:00:00 Interaction IsOpen & Temperature(t-21) 6.47
88 11:00:00 Interaction IsOpen & cos(2πt / 2.0 hours) 6.44
93 11:00:00 Interaction IsOpen & ContractsToExpireIn30days(t-42) 5.98
100 11:00:00 Interaction IsOpen & sin(2πt / 3.0 hours) 5.49
151 11:00:00 Interaction IsOpen & IsMktingCampaign 2.88
174 11:00:00 Interaction IsOpen & IsOpen(t-1) 2.24
181 11:00:00 Interaction IsOpen(t-1) & cos(2πt / 3.0 hours) 2.08
202 11:00:00 Interaction IsOpen & IsMktingCampaign(t-142) 1.72
208 11:00:00 Interaction cos(2πt / 2.0 hours) & sin(2πt / 12.0 hours) 1.64
217 11:00:00 Predictor IsOpen 1.53
233 11:00:00 Interaction Volumes(t-168) & IsMktingCampaign 1.37
238 11:00:00 Interaction IsOpen & cos(2πt / 3.0 hours) 1.33
242 11:00:00 Interaction IsMktingCampaign(t-142) & IsMktingCampaign 1.22
252 11:00:00 Interaction IsOpen & cos(2πt / 8.0 hours) 1.04
260 11:00:00 Interaction cos(2πt / 3.0 hours) & IsOpen(t-1) 0.99
In [16]:
fig = go.Figure()

fig.add_trace( go.Bar( x = extended_importances[ extended_importances['time'] == '11:00:00' ]['termName'],
                      y = extended_importances[ extended_importances['time'] == '11:00:00' ]['importance'] ) )

fig.update_layout(
        title='Features generated from predictors used by model for 11:00',
        width = 1200,
        height = 700
)

fig.show()

Evaluation of results

Results are evaluated for both the in-sample and out-of-sample intervals.

In [17]:
# Helper function, merges actual and predicted values together
def create_eval_df( predictions ):
    data2 = data.copy()
    data2[ timestamp_column ] = pd.to_datetime( data2[ timestamp_column ]).dt.tz_localize('UTC')
    data2.rename( columns={ timestamp_column: 'Timestamp' }, inplace=True)
    data2.set_index( 'Timestamp', inplace=True)

    eval_data = data2[ [ target_column ] ].join( predictions, how='inner' )

    return eval_data
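The merged frame also makes it easy to cross-check the engine's accuracy metrics (a minimal sketch, reusing target_column from above and assuming, as in the plots below, that predictions come back in a column named 'Prediction'):

def rmse( eval_data ):
    errors = eval_data['Prediction'] - eval_data[ target_column ]    # per-sample prediction errors
    return np.sqrt( ( errors ** 2 ).mean() )                         # root mean squared error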

In-sample

In [22]:
for i in range(0,7):
    print('Day:', i+1, 'RMSE:', backtest.aggregated_predictions[i]['accuracyMetrics']['RMSE'] )
Day: 1 RMSE: 1514.396105849587
Day: 2 RMSE: 1598.8268294503896
Day: 3 RMSE: 1607.5194335875638
Day: 4 RMSE: 1628.1118796911414
Day: 5 RMSE: 1611.4158701397384
Day: 6 RMSE: 1602.467316085566
Day: 7 RMSE: 1613.7498960976457
In [23]:
# backtest.aggregated_predictions[0]['type'], backtest.aggregated_predictions[6]['type']
In [24]:
edf = create_eval_df( backtest.aggregated_predictions[0]['values'] )
In [25]:
fig = go.Figure()

fig.add_trace( go.Scatter( x = edf.index, y=edf['Prediction'], name='In-Sample') )     
fig.add_trace( go.Scatter( x = edf.index, y=edf[ target_column ], name='Actual') )    

fig.update_layout( width=1200, height=700, title='Actual vs. predicted'  )

fig.show()

Out-of-sample

In [26]:
for i in range(7,14):
    print('Day:',i-6,'RMSE:',backtest.aggregated_predictions[i]['accuracyMetrics']['RMSE'] )
Day: 1 RMSE: 2060.2152459019217
Day: 2 RMSE: 2177.948830850947
Day: 3 RMSE: 2229.520122467636
Day: 4 RMSE: 2262.223084425699
Day: 5 RMSE: 2211.5367573571143
Day: 6 RMSE: 2201.0999567584945
Day: 7 RMSE: 2229.889504434829
In [28]:
#backtest.aggregated_predictions[7]['type'], backtest.aggregated_predictions[13]['type']
In [29]:
edf = create_eval_df( backtest.aggregated_predictions[7]['values'] )
In [30]:
fig = go.Figure()

fig.add_trace( go.Scatter( x = edf.index, y=edf['Prediction'], name='Out-of-Sample') )     
fig.add_trace( go.Scatter( x = edf.index, y=edf[ target_column ], name='Actual') )   

fig.update_layout( width=1200, height=700, title='Actual vs. predicted' )

fig.show()

Summary

We demonstrated how TIM can automate the forecasting of a key input to resource planning and scheduling: the volume of incoming requests.

The predictors used, and their quality, play a vital role in building such a forecasting system; it is therefore assumed that cooperation is established with the LoBs (lines of business) that possess the relevant information, preferably including forecasted values of the predictors.

Contact centers that support multiple channels through which customers can submit queries may benefit from forecasts from various perspectives. With TIM RTInstantML it is possible to build a new model and make predictions for each perspective, e.g. volume per channel (incoming calls, social media messages, emails, etc.), volume per region, consolidated volumes, and others. Equally, the need for various prediction horizons does not place any additional burden on TIM; depending on the sampling of your data, you can predict from minutes to years ahead.
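As a sketch of what per-channel forecasting could look like, the same configuration can simply be reused with a different dataset (the per-channel CSV files and their names below are hypothetical, not part of this use case):

# Hypothetical per-channel datasets, each with the same structure as data2B.csv
# but with that channel's volume as the target column.
channels = [ 'calls', 'emails', 'social_media' ]

channel_backtests = {}
for channel in channels:
    channel_data = tim_client.load_dataset_from_csv_file( f'data_{channel}.csv', sep=',' )
    channel_backtests[ channel ] = api_client.prediction_build_model_predict( channel_data, configuration_backtest )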