Contact center volumes forecasting (quarter-ahead)

Title: Contact center volumes forecasting (quarter-ahead)
Author: Tangent Works
Industry: Contact centers, Shared services centers
Area: Workforce management
Type: Forecasting

Description

Contact centers rely on a pool of resources ready to help customers when they reach out via call, email, chat, or another channel. For contact centers, predicting the volume of incoming requests at specific times is critical for resource scheduling (very short- and short-term horizons) and resource management (mid- to long-term horizons). It takes time before an action taken within a workforce management framework becomes effective (and is eventually reflected in financial reports); moving people around, hiring, upskilling, or downsizing the pool of resources takes weeks if not longer. Because of this, a forecast for longer horizons is needed, starting from one month ahead and beyond.

To build a high-quality forecast, it is necessary to gather relevant, valid data with predictive power. With such data it is possible to employ ML technology like TIM RTInstantML, which can build models for time-series data in a fraction of the time.

In our sample use case, we will showcase how TIM can predict volumes of requests for the next quarter, week by week.

Business parameters

Business objective: Reduce the risk of a resource shortage
Business value: Optimal resource planning
KPI: -

Business parameters

Business objective: Reduce the effort spent on forecasting
Business value: Free up the capacity of highly skilled personnel
KPI: -
In [25]:
import logging
import pandas as pd
import plotly
import plotly.express as px
import plotly.graph_objects as go
import numpy as np
import json

import tim_client
import os
In [2]:
with open('credentials.json') as f:
    credentials_json = json.load(f)                     # loading the credentials from credentials.json

TIM_URL = 'https://timws.tangent.works/v4/api'          # URL to which the requests are sent

SAVE_JSON = False                                       # if True - JSON requests and responses are saved to JSON_SAVING_FOLDER
JSON_SAVING_FOLDER = 'logs/'                            # folder where the requests and responses are stored

LOGGING_LEVEL = 'INFO'
In [3]:
level = logging.getLevelName(LOGGING_LEVEL)
logging.basicConfig(level=level, format='[%(levelname)s] %(asctime)s - %(name)s:%(funcName)s:%(lineno)s - %(message)s')
logger = logging.getLogger(__name__)
In [4]:
credentials = tim_client.Credentials(credentials_json['license_key'], credentials_json['email'], credentials_json['password'], tim_url=TIM_URL)
api_client = tim_client.ApiClient(credentials)

api_client.save_json = SAVE_JSON
api_client.json_saving_folder_path = JSON_SAVING_FOLDER
[INFO] 2021-08-17 10:28:15,725 - tim_client.api_client:save_json:66 - Saving JSONs functionality has been disabled
[INFO] 2021-08-17 10:28:15,726 - tim_client.api_client:json_saving_folder_path:75 - JSON destination folder changed to logs

Dataset

The dataset contains weekly aggregated information about request volumes, temperature, holidays, the number of regular customers, a marketing campaign, the number of customers whose contracts will expire within the next 30 or 60 days, the number of invoices sent, invoicing days, and hours open.

Sampling

Weekly.
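If your raw interaction logs arrive at a finer granularity (per request or per day), they must be aggregated to this weekly sampling before modeling. A minimal sketch with pandas, assuming a hypothetical raw log (the 'Timestamp' and 'Volume' column names are illustrative, not part of this dataset):

In [ ]:
# Hypothetical raw log: one row per day with a request count
raw = pd.DataFrame({
    'Timestamp': pd.date_range('2012-05-14', periods=21, freq='D'),
    'Volume': 1
})

# Aggregate to weekly totals; weeks ending on Sunday, matching the dataset below
weekly = raw.set_index('Timestamp').resample('W-SUN').sum()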

Data

Structure of CSV file:

Column name | Description | Type | Availability
Date | Timestamp | Timestamp column | -
Sum of Volumes | Sum of all requests in the given week | Target | t+0
Avg temperature | Mean temperature | Predictor | t+13
Hours of public holidays | Public holiday days in the given week × 24 | Predictor | t+13
Hours open | Total hours the center was/will be open to requests | Predictor | t+13
Hours of mkting campaign | How many hours the campaign ran/will run | Predictor | t+13
Avg contracts to expire in 30 days | Average no. of regular contracts that will expire within 30 days | Predictor | t+13
Avg contracts to expire in 60 days | Average no. of regular contracts that will expire within 60 days | Predictor | t+13
Avg no. of regular customers | Average no. of active contracts for regular customers | Predictor | t+13
No. of invoicing hours | Total hours during which invoices were/will be sent | Predictor | t+13
No. of invoices | No. of invoices sent | Predictor | t+13

Data situation

We want to predict the total volume of requests for each week of the next quarter (13 weeks). We assume that forecasted values for the predictors are available; this situation is reflected in the values present in the CSV files. To simulate the out-of-sample period thoroughly (i.e. to always use the latest model for each forecast), each forecasting situation has its own CSV file reflecting the data situation at the time of the respective forecast.

The CSV files used in the experiments can be downloaded here as a ZIP package.

Source

This is a synthetic dataset generated by simulating the outcome of events relevant to the operations of a contact center.

In [5]:
# Sample from the first CSV file
data = tim_client.load_dataset_from_csv_file('dataL/data2LB1.csv', sep=',')          
                       
data      
Out[5]:
Date Sum of Volumes Avg temperature Avg contracts to expire in 30 days Avg contracts to expire in 60 days Avg no. of regular customers Hours of public holidays Hours open Hours of mkting campaign No. of invoicing hours No. of invoices
0 2012-05-20 1009189.0 11.698214 23887.994048 42446.601190 56967.791667 0 66 0 24 56940
1 2012-05-27 781528.0 17.617857 24220.125000 42385.767857 56932.523810 24 66 0 0 0
2 2012-06-03 961166.0 14.111905 32031.994048 42457.255952 56918.892857 0 66 0 0 0
3 2012-06-10 968952.0 15.504762 42451.392857 42429.035714 56732.327381 24 55 0 0 0
4 2012-06-17 1192165.0 16.595833 42491.714286 42412.315476 56597.630952 0 66 0 24 56652
... ... ... ... ... ... ... ... ... ... ... ...
132 2014-11-30 NaN 0.479762 56540.351190 57953.000000 78312.869048 0 66 0 0 0
133 2014-12-07 NaN -0.514881 56777.446429 58175.309524 78446.595238 0 66 0 0 0
134 2014-12-14 NaN 2.689881 56885.446429 58218.291667 78650.113095 0 66 0 0 0
135 2014-12-21 NaN 5.559524 57093.029762 58483.880952 78788.196429 0 66 0 24 78674
136 2014-12-28 NaN 2.175595 57308.446429 58685.148810 78906.744048 72 33 0 0 0

137 rows × 11 columns
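As the tail of the dataframe shows, target values for the last 13 weeks are missing while all predictor columns are populated up to t+13; this is exactly the data situation described above. A quick sanity check on the loaded dataframe:

In [ ]:
# The last 13 rows (the quarter to forecast) have no target values ...
assert data['Sum of Volumes'].tail(13).isna().all()

# ... while all predictor columns are fully populated over the same period
assert data.drop(columns=['Date', 'Sum of Volumes']).tail(13).notna().all().all()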

In [6]:
target_column = 'Sum of Volumes'  # sum of requests per given week

timestamp_column = 'Date'

Visualization

In [7]:
fig = go.Figure()

fig.add_trace( go.Scatter( x=data[ timestamp_column ], y=data[ target_column ] ) )

fig.update_layout( width=1300, height=700, title='Sum of Volumes' )

fig.show()

Engine settings

Parameters that need to be set:

  • predictionTo defines the prediction horizon; it is set to 13 samples because we want to predict volumes for the next quarter.
  • backtestLength defines the length of the out-of-sample interval; in our case it is 0, as we evaluate out-of-sample results by simulating production forecasting with the respective datasets.

We also ask the engine for additional data to inspect the details of the sub-models, so we set the extendedOutputConfiguration parameter as well.

In [8]:
back_test_length = 0

prediction_horizon = 13
In [9]:
configuration_backtest = {
    'usage': {                                 
        'predictionTo': { 
            'baseUnit': 'Sample',              
            'offset': prediction_horizon                   
        },
        'backtestLength': back_test_length             
    },
    'extendedOutputConfiguration': {
        'returnExtendedImportances': True      
    }
}

Experiment iteration(s)

This experiment runs on the first CSV file; in the next section we will simulate 40 production forecasts.

In [10]:
backtest = api_client.prediction_build_model_predict( data, configuration_backtest )
  
backtest.status                                                                                 
Out[10]:
'FinishedWithWarning'
In [11]:
backtest.result_explanations
Out[11]:
[{'index': 1,
  'message': 'Predictor Avg contracts to expire in 60 days contains an outlier or a structural change in its most recent records.'},
 {'index': 2,
  'message': 'Predictor Avg no. of regular customers contains an outlier or a structural change in its most recent records.'}]
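The 'FinishedWithWarning' status does not invalidate the results, but in an automated pipeline such warnings are worth surfacing. A minimal sketch using only the status and result_explanations attributes shown above (the check against 'Finished' assumes that status denotes a run without warnings):

In [ ]:
# Log any engine warnings instead of ignoring them silently
if backtest.status != 'Finished':   # assumption: 'Finished' marks a warning-free run
    for explanation in backtest.result_explanations:
        logger.warning('TIM engine: %s', explanation['message'])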

Insights - inspecting ML models

Simple and extended importances are available, showing to what extent each predictor contributes to explaining the variance of the target variable.

In [12]:
simple_importances = backtest.predictors_importances['simpleImportances']
simple_importances = sorted(simple_importances, key = lambda i: i['importance'], reverse=True) 

simple_importances = pd.DataFrame.from_dict( simple_importances )

# simple_importances
In [13]:
fig = go.Figure()

fig.add_trace( go.Bar( x = simple_importances['predictorName'],
                       y = simple_importances['importance'] ) )

fig.update_layout(
        title='Simple importances',
        width = 1200,
        height = 700
)

fig.show()
In [14]:
extended_importances = backtest.predictors_importances['extendedImportances']
extended_importances = sorted(extended_importances, key = lambda i: i['importance'], reverse=True) 

extended_importances = pd.DataFrame.from_dict( extended_importances )
In [15]:
fig = go.Figure()

fig.add_trace( go.Bar( x = extended_importances[ extended_importances['time'] == '[11]' ]['termName'],
                      y = extended_importances[ extended_importances['time'] == '[11]' ]['importance'] ) )

fig.update_layout(
        title='Features generated from predictors used by the model for the 11th week in the prediction horizon',
        width = 1200,
        height = 700
)

fig.show()
In [32]:
# Helper function, merges actual and predicted values together
def create_eval_df( predictions, prediction_only = False ):
    if prediction_only:
        # dataset with actual values covering the forecasted (out-of-sample) period
        data2 = tim_client.load_dataset_from_csv_file('data2L.csv', sep=',')
    else:
        data2 = data.copy()

    data2[ timestamp_column ] = pd.to_datetime( data2[ timestamp_column ] ).dt.tz_localize('UTC')
    data2.rename( columns={ timestamp_column: 'Timestamp' }, inplace=True )
    data2.set_index( 'Timestamp', inplace=True )

    # keep only the timestamps present in both actuals and predictions
    eval_data = data2[ [ target_column ] ].join( predictions, how='inner' )

    return eval_data

Evaluation of results

In-sample

In [33]:
edf = create_eval_df( backtest.aggregated_predictions[0]['values'] )
In [34]:
backtest.aggregated_predictions[0]['accuracyMetrics']['MAPE']
Out[34]:
5.7336983263846335
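For reference, the same metric can be recomputed from the merged dataframe; a sketch assuming the usual MAPE definition, the mean of |actual - predicted| / actual expressed in percent:

In [ ]:
# Recompute in-sample MAPE (in percent) from actuals and predictions;
# assuming the standard definition, this should be close to the engine's figure
( ( edf[ target_column ] - edf['Prediction'] ).abs() / edf[ target_column ] ).mean() * 100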
In [35]:
fig = go.Figure()

fig.add_trace( go.Scatter( x = edf.index, y=edf['Prediction'], name='In-Sample') )     
fig.add_trace( go.Scatter( x = edf.index, y=edf[ target_column ], name='Actual') )    

fig.update_layout( width=1200, height=700,  title='Actual vs. predicted (in-sample)'  )

fig.show()

Out-of-sample result for one forecasting situation

In [36]:
edf = create_eval_df( backtest.prediction, True )
In [37]:
fig = go.Figure()

fig.add_trace( go.Scatter( x = edf.index, y=edf['Prediction'], name='Prediction') )     
fig.add_trace( go.Scatter( x = edf.index, y=edf[ target_column ], name='Actual') )    

fig.update_layout( width=1200, height=700, title='Actual vs. predicted' )

fig.show()

Simulation of 40 production forecasts

In [38]:
results = list()
mapes = list()
In [39]:
configuration_backtest
Out[39]:
{'usage': {'predictionTo': {'baseUnit': 'Sample', 'offset': 13},
  'backtestLength': 0},
 'extendedOutputConfiguration': {'returnExtendedImportances': True}}
In [40]:
datadir = 'dataL'

# Iterate over all forecasting situations (one CSV file per situation);
# sorting the file names gives a deterministic order.
for fname in sorted( os.listdir(datadir) ):
    fpath_ = os.path.join( datadir, fname )

    # Build a fresh model and forecast for this data situation
    data_ = tim_client.load_dataset_from_csv_file( fpath_, sep=',' )
    backtest_ = api_client.prediction_build_model_predict( data_, configuration_backtest )

    # Merge predictions with actuals and compute per-week absolute percentage error
    edf_ = create_eval_df( backtest_.prediction, True )
    edf_['err_pct'] = abs( edf_[ target_column ] - edf_[ 'Prediction' ] ) / edf_[ target_column ]

    results.append( edf_ )
    mapes.append( edf_['err_pct'].mean() )   # MAPE for this forecast (as a fraction)

MAPE statistics across the 40 simulated forecasts

In [ ]:
pd.DataFrame(mapes).describe()
Out[ ]:
0
count 40.000000
mean 0.075285
std 0.022402
min 0.037157
25% 0.053423
50% 0.081945
75% 0.094086
max 0.109226
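Note that the loop stores each MAPE as a fraction; multiplying by 100 expresses it in percent, comparable with the in-sample figure above:

In [ ]:
pd.Series(mapes).mean() * 100   # mean MAPE in percent across the 40 simulated forecasts (approx. 7.5)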
In [ ]:
fig = go.Figure()

fig.add_trace( go.Bar( x = list(range(len(mapes))), y= mapes, name='MAPE') )     

fig.update_layout( width=1200, height=700, title='MAPE per forecast' )

fig.show()

Summary

We demonstrated how TIM can be used for mid-term volume forecasting with weekly data.

Having relevant data with predictive power available at the time of forecasting is a prerequisite for any ML/AI solution; however, not every ML solution can build a new model in a fraction of the time, adapting to the most recent reality reflected in the data.

Contact centers that support multiple channels through which customers can submit queries may benefit from forecasts from various perspectives. With TIM RTInstantML it is possible to build a new model and make predictions for each of these perspectives, e.g. volume per channel (incoming calls, social media messages, emails, etc.), volume per region, consolidated volumes, and others. Equally, the need for different prediction horizons places no additional burden on TIM; depending on the sampling of your data, you can predict from minutes to years ahead, as the sketch below illustrates.
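For illustration, a minimal sketch of how the horizon could be extended to two quarters by changing a single configuration value (the 26-sample horizon is illustrative; everything else reuses the configuration structure from above):

In [ ]:
# Same configuration shape as before; only the horizon changes (26 weekly samples = two quarters)
configuration_half_year = {
    'usage': {
        'predictionTo': {
            'baseUnit': 'Sample',
            'offset': 26
        },
        'backtestLength': 0
    }
}

backtest_half_year = api_client.prediction_build_model_predict( data, configuration_half_year )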