Predicting car traffic for on-ramp in LA

Title: Predicting car traffic for on-ramp in LA
Author: Michal Bezak - Tangent Works
Industry: Transportation, Smart cities and infrastructure
Area: Traffic management
Type: Forecasting

Description

Smart traffic solutions are becoming increasingly important and play a vital role in making our cities (and infrastructure) smarter. They comprise multiple parts, spanning hardware, software and, in recent years, AI/ML.

With predictions of utilization (and potential congestion) of particular road segments, it is possible to optimize the routes taken and thus cut the time needed to transport goods, people, etc.

The value derived from such a capability can be estimated with proxy indicators such as the time people save or the money spent on fuel.

Business parameters

Business objective: Cut the time required to deliver goods in a certain area
Business value: Higher utilization of vans (measured by goods delivered in a given time frame); shorter delivery times
KPI: -
In [2]:
import logging
import pandas as pd
import plotly as plt
import plotly.express as px
import plotly.graph_objects as go
import numpy as np
import json
import datetime
import copy
import math

import tim_client

Credentials and logging

(Do not forget to fill in your credentials in the credentials.json file)

In [3]:
with open('credentials.json') as f:
    credentials_json = json.load(f)                     # loading the credentials from credentials.json

TIM_URL = 'https://timws.tangent.works/v4/api'          # URL to which the requests are sent

SAVE_JSON = False                                       # if True - JSON requests and responses are saved to JSON_SAVING_FOLDER
JSON_SAVING_FOLDER = 'logs/'                            # folder where the requests and responses are stored

LOGGING_LEVEL = 'INFO'
In [4]:
level = logging.getLevelName(LOGGING_LEVEL)
logging.basicConfig(level=level, format='[%(levelname)s] %(asctime)s - %(name)s:%(funcName)s:%(lineno)s - %(message)s')
logger = logging.getLogger(__name__)
In [5]:
credentials = tim_client.Credentials(credentials_json['license_key'], credentials_json['email'], credentials_json['password'], tim_url=TIM_URL)
api_client = tim_client.ApiClient(credentials)

api_client.save_json = SAVE_JSON
api_client.json_saving_folder_path = JSON_SAVING_FOLDER
[INFO] 2021-01-15 11:56:53,486 - tim_client.api_client:save_json:66 - Saving JSONs functionality has been disabled
[INFO] 2021-01-15 11:56:53,489 - tim_client.api_client:json_saving_folder_path:75 - JSON destination folder changed to logs
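
The cells above read three keys from credentials.json: license_key, email and password. A minimal sketch of a check that fails early when one of them is missing could look as follows; the helper name check_credentials and the placeholder values are hypothetical and are not part of tim_client.

```python
import json

REQUIRED_KEYS = ("license_key", "email", "password")  # keys read by the notebook above


def check_credentials(creds: dict) -> dict:
    """Return creds unchanged, or raise if a required key is missing or empty."""
    missing = [key for key in REQUIRED_KEYS if not creds.get(key)]
    if missing:
        raise KeyError(f"credentials.json is missing: {', '.join(missing)}")
    return creds


# Placeholder content standing in for credentials.json (all values are made up).
example = json.loads(
    '{"license_key": "XXXX", "email": "user@example.com", "password": "secret"}'
)
checked = check_credentials(example)
```

Running such a check right after json.load(f) would surface an incomplete file before any request is sent.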

Dataset

The dataset is a combination of two original datasets:

  • volumes (sensor data),
  • event times of games played at the nearby stadium.

It was also enhanced with holiday information for the given timestamps.

Loop sensor data was collected for the Glendale on-ramp to the 101 North freeway in Los Angeles, US. It is close enough to the stadium to see unusual traffic after a Dodgers game, but not so close, or so heavily used by game traffic, that the effect is overly obvious.

Sampling and gaps

Data are sampled at 5-minute intervals and can contain gaps.
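
Gaps of this kind can be located by comparing consecutive timestamps with the sampling period. The sketch below uses a few synthetic timestamps (not the real data) and assumes pandas is available; the variable names are illustrative.

```python
import pandas as pd

STEP = pd.Timedelta(minutes=5)  # sampling period stated above

# Synthetic timestamps with two samples missing (00:15 and 00:20).
ts = pd.Series(pd.to_datetime([
    "2005-04-10 00:00:00",
    "2005-04-10 00:05:00",
    "2005-04-10 00:10:00",
    "2005-04-10 00:25:00",
    "2005-04-10 00:30:00",
]))

delta = ts.diff()                # time elapsed since the previous sample
gap_rows = delta[delta > STEP]   # rows that directly follow a gap
# Number of samples skipped before each of those rows.
missing_samples = (gap_rows / STEP - 1).astype(int)

print(missing_samples)
```

On the real dataset the same idea could be applied to the TS column after parsing it as datetimes.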

Data

| Column name | Description | Type | Availability |
|---|---|---|---|
| TS | Timestamp | Timestamp column | |
| volume | Number of cars measured for the previous five minutes | Target | t-1 |
| event | Binary value indicating whether a match was played at the nearby stadium | Predictor | t+6 |

Forecasting situations

Our goal is to predict the next 30 minutes of traffic, i.e. six 5-minute samples.

The CSV file used in this experiment can be downloaded here.
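
The relation between the horizon and the sampling period can be written down directly. The sketch below uses only the standard library; the last observed timestamp is an illustrative value and the variable names are assumptions.

```python
from datetime import datetime, timedelta

SAMPLING = timedelta(minutes=5)   # sampling period of the dataset
HORIZON = timedelta(minutes=30)   # forecasting horizon stated above

steps = HORIZON // SAMPLING       # number of samples to predict

# Illustrative timestamp of the last observed sample.
last_observed = datetime(2005, 9, 30, 23, 35)

# Timestamps for which a forecast is needed, one per sampling step.
forecast_ts = [last_observed + SAMPLING * k for k in range(1, steps + 1)]

for stamp in forecast_ts:
    print(stamp)
```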

Source

Data were published at the UCI Machine Learning Repository. Loop sensor measurements were obtained from the Freeway Performance Measurement System (PeMS).

In [10]:
data = tim_client.load_dataset_from_csv_file('data.csv', sep=',')

The last 6 values of Volume are NaN, i.e. missing from the dataset, because we want to back-test predictions of the next 30 minutes (6 x 5 min = 30 min).

In [11]:
data.tail(7)
Out[11]:
                        TS  Volume  Event  Holiday
49728  2005-09-30 23:35:00    15.0      0        0
49729  2005-09-30 23:40:00     NaN      0        0
49730  2005-09-30 23:45:00     NaN      0        0
49731  2005-09-30 23:50:00     NaN      0        0
49732  2005-09-30 23:55:00     NaN      0        0
49733  2005-10-01 00:00:00     NaN      0        0
49734  2005-10-01 00:05:00     NaN      0        0
In [24]:
data.shape
Out[24]:
(49735, 4)
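
A quick way to confirm that the number of trailing missing target values matches the intended horizon is sketched below. It rebuilds the seven rows shown in the tail output as a small stand-alone frame, so it is an illustration rather than a check on the full dataset; it assumes pandas is available.

```python
import pandas as pd

# Small frame mirroring the tail shown above (values copied from that output).
tail = pd.DataFrame({
    "TS": pd.date_range("2005-09-30 23:35:00", periods=7, freq="5min"),
    "Volume": [15.0] + [float("nan")] * 6,
    "Event": [0] * 7,
})

last_known = tail["Volume"].last_valid_index()   # label of the last observed target
position = tail.index.get_loc(last_known)        # its position in the frame
rows_to_predict = len(tail) - 1 - position       # rows after the last observation

# Time span covered by the rows that still need a prediction.
horizon = tail["TS"].iloc[-1] - tail.loc[last_known, "TS"]

print(rows_to_predict, horizon)
```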

Zoom in on the chart below to see the event (red) lines.

In [14]:
# Replace zeros with NaN (not drawn) and scale events to 40 so they are visible next to Volume
data_for_chart_event = data['Event'].where(data['Event'] != 0) * 40
In [15]:
from plotly.subplots import make_subplots  # import the submodule explicitly instead of relying on plt.subplots

# Single chart with the traffic volume (blue) and the scaled event indicator (red)
fig = make_subplots(rows=1, cols=1, shared_xaxes=True, vertical_spacing=0.02)

fig.add_trace(go.Scatter(x=data.loc[:, "TS"], y=data.loc[:, "Volume"], name="Volume", line=dict(color='blue')), row=1, col=1)

fig.add_trace(go.Scatter(x=data.loc[:, "TS"], y=data_for_chart_event, name="Event", line=dict(color='red')), row=1, col=1)

fig.update_layout(height=500, width=1000)

fig.show()
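
Instead of zooming by hand, the data can be narrowed to a window around the events before plotting. The sketch below uses a small synthetic frame (not the real data) and assumes pandas is available; the two-hour margin and the variable names are illustrative choices.

```python
import pandas as pd

# Synthetic hourly frame with one event lasting two samples (19:00 and 20:00).
frame = pd.DataFrame({
    "TS": pd.date_range("2005-04-12 12:00:00", periods=12, freq="60min"),
    "Event": [0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0],
})

event_ts = frame.loc[frame["Event"] == 1, "TS"]   # timestamps flagged as events
margin = pd.Timedelta(hours=2)                    # context kept on both sides

start = event_ts.min() - margin
end = event_ts.max() + margin

# Rows that fall inside the window around the events.
window = frame[(frame["TS"] >= start) & (frame["TS"] <= end)]

print(window)
```

The same bounds could also be applied to an existing Plotly figure, for example with fig.update_xaxes(range=[start, end]).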