LSTM-BNN for Traffic Flow Prediction

KSCE Journal of Civil Engineering (2024) 28(1):363-374
pISSN 1226-7988, eISSN 1976-3808
www.springer.com/12205
DOI 10.1007/s12205-023-2457-y
Transportation Engineering
A Hybrid Framework Combining LSTM NN and BNN for Short-term Traffic
Flow Prediction and Uncertainty Quantification
Yinpu Wang^a, Siping Ke^a, Chengchuan An^a, Zhenbo Lu^a, and Jingxin Xia^a
^a Intelligent Transportation System Research Center, Southeast University, Nanjing 211189, China
ARTICLE HISTORY
Received 14 January 2023
Accepted 25 August 2023
Published Online 18 October 2023
ABSTRACT
Short-term traffic flow prediction plays a critical role in Intelligent Transportation System (ITS),
and has attracted continuous attention. Previous studies have focused on improving the
prediction accuracy of mean traffic flow. Due to the dynamics and propagation of traffic
system, reliable traffic control and induction measures have been considered to be dependent
on prediction intervals of short-term traffic flows. The current parametric models used to
quantify uncertainty in traffic flow prediction cannot well capture the nonlinear patterns of
traffic flow series, and may not apply to situations without long-term continuous observations.
This paper proposes a hybrid framework combining long short-term memory neural network
(LSTM NN) and Bayesian neural network (BNN) for real-time traffic flow prediction and
uncertainty quantification based on sequence data. Caltrans Performance Measurement
System (PeMS) traffic flow data for 6 freeways in Sacramento city is aggregated at 15-min
intervals to evaluate the proposed model. Compared to the SARIMA-GARCH model, the
proposed LSTM-BNN model outperforms in predicting both the mean and interval of the
traffic flow. Especially, the experiments show that the LSTM-BNN model is superior during the
daytime and under non-seasonal traffic conditions. The proposed LSTM-BNN model can be
utilized in ITS for making reliable management decisions.
KEYWORDS
Short-term traffic flow prediction
Uncertainty quantification
Long short-term memory neural network
Bayesian neural network
Intelligent transportation system
1. Introduction
Short-term traffic flow is a critical parameter of the Intelligent
Transportation System (ITS) (Luo et al., 2018). Using the future
near-term traffic flow as input, active traffic management and
control, including traffic signal timing optimization and freeway
ramp control, could be developed to alleviate traffic congestion
and air pollution. In both research and application domains, the
prediction of short-term traffic flow has drawn extensive attention.
Traffic management relying solely on the mean of the traffic
flow may not be reliable due to the dynamics and propagation of
traffic flow (Long et al., 2008; Treiber and Kesting, 2013). In this
regard, the uncertainty quantification of traffic flow prediction is
still a vital yet challenging task in ITS development.
To date, there are still limited studies on quantifying uncertainty
in short-term traffic flow prediction. In contrast to the conventional
parametric models, Bayesian neural network (BNN) has shown
its superiority in enhancing the accuracy and reliability of the
prediction with its capability to measure predictive uncertainty.
CORRESPONDENCE Jingxin Xia
xiajingxin@seu.edu.cn
ⓒ 2024 Korean Society of Civil Engineers
Provided with the predictive uncertainty, more reliable applications
can be achieved in the ITS domain: 1) traffic managers can know
the best and worst consequences of the decisions they will make;
2) travelers can make travel choices according to the upper and
lower bounds of traffic conditions. This study establishes a hybrid
framework combining long short-term memory neural network
(LSTM NN) and BNN for traffic flow prediction and uncertainty
quantification. The contributions of this study are summarized
below:
1. A hybrid framework combining LSTM and BNN is designed
for short-term traffic flow prediction and uncertainty
quantification. The LSTM-BNN framework does not require
building complex statistical models, and can accurately
predict both the mean and interval of traffic flow in real-time
based on short-time sequences.
2. This is the first neural network-based study to measure
uncertainty in traffic flow prediction. The proposed framework
can capture the nonlinear uncertainty patterns of traffic flow
series. This work highlights the importance of uncertainty
quantification in short-term traffic flow prediction and
demonstrates the potential of BNN in addressing
transportation problems.
The remaining part of the paper is organized into the following
sections. Section 2 reviews the related literature for traffic flow
prediction. Section 3 provides the details of the hybrid framework,
which is composed of a mean prediction module and an interval
prediction module. In Section 4, the deployment of the case
study is illustrated and the performance is evaluated. Finally, the
conclusions and future work are separately discussed in Section 5.
2. Literature Review
There are two primary approaches to predicting traffic flow, i.e.,
parametric and nonparametric models. Time series methods and
Kalman filtering methods predominate among the parametric
models. Ahmed and Cook (1979) used the autoregressive integrated
moving average (ARIMA) (0,1,3) model to forecast traffic flow
on freeways in Los Angeles, Detroit, and Minneapolis surveillance
systems. Hamed et al. (1995) found that the ARIMA (0,1,1)
model adequately forecasted 1-min traffic volume and verified
the model’s performance during the peak morning period on 5
urban arterials in Amman, Jordan. Williams (1999) and Williams
and Hoel (2003) justified the application of time series methods
to traffic flow forecasting by considering the Wold decomposition
theorem, and a seasonal ARIMA (SARIMA) model was presented
based on 15-min traffic flow data on a London motorway. In the
model, weakly smooth transitions were generated with weekly
seasonal differentials, and weekly patterns were set as the key
seasonal influences. For the SARIMA model, Williams and Hoel
(2003) and Smith et al. (2002) found that the parameters $(1,0,1)(0,1,1)_{672}$
performed best using 15-min aggregated traffic
flow data. To better forecast real-time traffic flow online, the Kalman
filtering algorithm is widely adopted. Okutani and Stephanedes
(1984) built two Kalman filtering models to forecast weekly and
daily traffic flows. Based on vehicle speed data from Texas,
USA, Ye et al. (2006) delved into the unbiased Kalman filter
prediction method and found that the method can highly meet the
accuracy requirements of real-time traffic flow prediction. Apart
from the mean traffic flow prediction, few scholars have explored
the use of parametric methods for uncertainty quantification in
traffic flow forecasting. Xia et al. (2013) presented a vector
autoregressive (VAR) plus multivariate generalized autoregressive
conditional heteroscedasticity (MGARCH) method for short-term
traffic flow forecasting. The presented MGARCH model
produces reliable time-varying confidence intervals for traffic
flow. Guo et al. (2014) designed an adaptive Kalman filtering for
the SARIMA-GARCH model to generate real-time traffic flow
level and associated interval prediction. The proposed method
shows better performance when the real-time traffic flow is
highly unstable.
Parametric models might have two limitations: 1) they are
applicable to linear systems, so their accuracy still could be
improved; 2) a complete long sequence is required to predict
near-term traffic flow, e.g., the SARIMA$(1,0,1)(0,1,1)_{672}$ model needs
more than a full week of observation to forecast traffic flow for
the next 15 minutes. Traffic flow sequences are often inherently
nonlinear and have missing values. Therefore, some studies have
explored the use of nonparametric models for predicting traffic
flow.
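The data requirement in the second limitation follows from simple arithmetic on the seasonal period. The sketch below is only an illustration: the helper `seasonal_difference` is a hypothetical name, and the series is made-up data rather than PeMS observations.

```python
INTERVAL_MINUTES = 15

intervals_per_day = 24 * 60 // INTERVAL_MINUTES   # 15-min periods in one day
seasonal_period = 7 * intervals_per_day           # 15-min periods in one week

def seasonal_difference(series, period):
    """Return y[t] - y[t - period]; needs more than `period` observations."""
    if len(series) <= period:
        raise ValueError("need more than one full seasonal period of data")
    return [series[t] - series[t - period] for t in range(period, len(series))]

# One week plus a single extra period yields exactly one differenced value.
one_week_plus_one = list(range(seasonal_period + 1))
diffs = seasonal_difference(one_week_plus_one, seasonal_period)
```

With 15-min aggregation the weekly period is 672 intervals, so a weekly seasonal difference is undefined until more than a full week of continuous observations is available.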
In terms of nonparametric models, Clark et al. (1993) conducted
a forecasting model utilizing a back-propagation (BP) neural
network. Park et al. (1998) developed a radial basis function (RBF)
neural network to predict traffic volume and concluded that RBF
neural network outperformed the BP neural network and consumed
less computational time. Due to the capability of capturing the
temporal trends of sequence data, recurrent neural network
(RNN) has recently been increasingly chosen to predict traffic
flow (Tedjopurnomo et al., 2020). Amongst RNN-based methods,
LSTM NN is by far the most popular one because it overcomes
the shortcomings of conventional RNN in modeling long-term
dependencies of sequence data (Hochreiter and Schmidhuber,
1997). Shao and Soong (2016) explored applying LSTM to
short-term traffic flow prediction and pointed out that LSTM can
learn more abstract representations in the sequence of non-linear
traffic flow. Jia et al. (2017) utilized the LSTM model to forecast
urban traffic flow taking into account rainfall factor. The model
yielded better prediction performance based on rainfall intensity
data and arterial traffic flow in Beijing, China. Some researchers
treated the evolution of traffic flow as a spatial-temporal process
and combined LSTM with other neural networks to improve
prediction accuracy. Wu and Tan (2016) and Du et al. (2017)
respectively designed a hybrid neural network model that
combined LSTM NN and convolutional neural networks (CNN)
for traffic flow forecasting. The accuracy and effectiveness of
their developed models were demonstrated experimentally compared
to other traditional neural network models. To utilize the advantage
of the attention mechanisms that can select the relatively critical
information for the current task from all inputs (Vaswani et al.,
2017), Tang et al. (2021) presented an attention-based LSTM
learning structure with the genetic algorithm to forecast entrance-level
traffic volume. Experiment results showed that the attention
mechanism can diminish the influence of cumulative errors
generated from long traffic flow sequences and further improve
forecast accuracy and stability. Though the neural network-based
models are performing increasingly better in traffic flow mean
prediction, to our best knowledge, they have not dealt with
uncertainty quantification in traffic flow forecasting.
3. Methodology
3.1 Overall Framework
Figure 1 describes the proposed hybrid framework for short-term
traffic flow prediction and uncertainty quantification. The hybrid
framework consists of two key modules. The first is a mean
prediction module based on LSTM NN, which is well suited
for making a mean prediction based on time series data (Hochreiter
and Schmidhuber, 1997). The second is an interval prediction
module built on BNN, which can measure uncertainty by combining
the neural network with Bayesian inference (Blundell et al., 2015;
Kendall and Gal, 2017; Zhu and Laptev, 2017). There is a consensus
that a large number of Bayesian layers is quite redundant in
accounting for uncertainty (Zeng et al., 2018; Jospin et al., 2022).
Therefore, this study seeks to quantify the uncertainty in traffic
flow prediction using only a few Bayesian fully connected layers.
These two modules take advantage of the strengths of LSTM
NN and BNN respectively.
Fig. 1. The Overall Framework
The traffic flow data needs to be preprocessed before being
fed into the two modules. The traffic flow data is first aggregated
according to the length of the time period. The inputs of the
proposed model are the traffic flows collected from periods
before the current moment, and the output is the traffic flow for
the upcoming 1 time-step. The historical traffic flow data is utilized
to train the neural networks, and the future data is predicted
based on the trained modules. The outputs of these two modules
constitute the predicted traffic flow. The structures and training
algorithms of these two modules are described below.
Fig. 2. Mean Prediction Module Using LSTM NN
3.2 Traffic Flow Mean Prediction Using LSTM NN
3.2.1 The Structure of the LSTM NN
LSTM NN, as a variant of RNN, is a powerful deep learning
method to handle sequence data. It can capture long-term temporal
trends of traffic flow sequence data and has been popular in the
field of traffic flow prediction for nearly a decade (Tedjopurnomo et
al., 2020). To improve the convergence speed and the accuracy
of the LSTM NN model, the traffic flow data is standardized
before training. As shown in Fig. 2, the traffic flow mean prediction
module includes several LSTM NN layers. The square nodes
represent the LSTM neurons, and the circular nodes represent the
input data and hidden output states of the LSTM NN layer. The
input data consists of a sequence of traffic flows for the t periods
before the future period t + 1. The number of LSTM NN hidden
layers and the number of LSTM neurons in each layer are
determined by the characteristics of actual traffic flow. The grey
circular node represents the output of the LSTM model, i.e., the
mean of the traffic flow for the period t + 1.
A fundamental element of the LSTM NN is the memory cell
consisting of three gates, i.e., a forget gate $f_{t+1}$, an input gate
$i_{t+1}$, and an output gate $o_{t+1}$ (Gers and Schmidhuber, 2000).
The three gates are utilized to determine and update the cell state
$C_{t+1}$. The relevant definitions are given as
$$f_{t+1} = \mathrm{sigmoid}\left(W_f d_{t+1} + W_f h_t + b_f\right), \tag{1}$$
$$i_{t+1} = \mathrm{sigmoid}\left(W_i d_{t+1} + W_i h_t + b_i\right), \tag{2}$$
$$o_{t+1} = \mathrm{sigmoid}\left(W_o d_{t+1} + W_o h_t + b_o\right), \tag{3}$$
$$\tilde{C}_{t+1} = \tanh\left(W_c d_{t+1} + W_c h_t + b_c\right), \tag{4}$$
$$C_{t+1} = f_{t+1} \odot C_t + i_{t+1} \odot \tilde{C}_{t+1}, \tag{5}$$
$$h_{t+1} = o_{t+1} \odot \tanh\left(C_{t+1}\right), \tag{6}$$
where $d_{t+1}$ is the input data at time interval $t+1$, i.e., the traffic
flow sequence for the $t$ periods; $h_t$ is the hidden state of the
memory cell at time interval $t$; $W_f$, $W_i$, $W_o$, $W_c$ are the weight
matrices; and $b_f$, $b_i$, $b_o$, $b_c$ are the bias vectors.
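The memory cell update in Eqs. (1)-(6) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in this study: separate input matrices `W` and recurrent matrices `U` are assumed for each gate (the text names one weight matrix per gate), the weights are random, and the input flows are made-up standardized values.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(input_dim, hidden_dim, seed=0):
    """Random parameters for the gates f, i, o and the candidate state c (illustrative only)."""
    rng = np.random.default_rng(seed)
    gates = "fioc"
    return {
        "W": {g: rng.normal(0.0, 0.1, (hidden_dim, input_dim)) for g in gates},
        "U": {g: rng.normal(0.0, 0.1, (hidden_dim, hidden_dim)) for g in gates},
        "b": {g: np.zeros(hidden_dim) for g in gates},
    }

def lstm_step(d_next, h_t, c_t, params):
    """One memory-cell update following Eqs. (1)-(6)."""
    W, U, b = params["W"], params["U"], params["b"]
    f = sigmoid(W["f"] @ d_next + U["f"] @ h_t + b["f"])        # forget gate, Eq. (1)
    i = sigmoid(W["i"] @ d_next + U["i"] @ h_t + b["i"])        # input gate, Eq. (2)
    o = sigmoid(W["o"] @ d_next + U["o"] @ h_t + b["o"])        # output gate, Eq. (3)
    c_tilde = np.tanh(W["c"] @ d_next + U["c"] @ h_t + b["c"])  # candidate state, Eq. (4)
    c_next = f * c_t + i * c_tilde                              # cell state, Eq. (5)
    h_next = o * np.tanh(c_next)                                # hidden state, Eq. (6)
    return h_next, c_next

hidden_dim = 10
params = init_params(input_dim=1, hidden_dim=hidden_dim)
h = np.zeros(hidden_dim)
c = np.zeros(hidden_dim)
for flow in [0.2, 0.5, 0.9, 0.4]:   # standardized flows, made-up values
    h, c = lstm_step(np.array([flow]), h, c, params)
```

After the loop, `h` summarizes the input sequence; a mean predictor would pass it through a final output layer, which is omitted here.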
3.2.2 Backpropagation for Training LSTM NN
LSTM NN can be regarded as a probabilistic model $P(y|x, w)$.
Given an input $x$, an LSTM NN uses a set of parameters or
weights $w$ to assign a probability to each possible output $y$. For
the purpose of regression in traffic flow prediction, $P(y|x, w)$ is
a Gaussian distribution that corresponds to a squared loss. The
weights $w$, based on maximum likelihood estimation (MLE), can
be calculated as
$$w^{\mathrm{MLE}} = \arg\max_w \log P(D|w) = \arg\min_w -\log P(D|w), \tag{7}$$
where $-\log P(D|w)$ is the loss of the LSTM NN. Backpropagation
is chosen for training the LSTM NN, where $\log P(D|w)$ is
assumed to be differentiable in $w$. The gradient of $w$ is calculated
as
$$\Delta_w = \frac{\partial\left(-\log P(D|w)\right)}{\partial w}. \tag{8}$$
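The link between a Gaussian likelihood and the squared loss can be checked numerically. The sketch below assumes a fixed unit noise variance and made-up targets and predictions; the function names are illustrative.

```python
import math

def gaussian_nll(y_true, y_pred, sigma=1.0):
    """Negative log-likelihood of the targets under N(y_pred, sigma^2), summed over samples."""
    n = len(y_true)
    sq = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    return 0.5 * n * math.log(2.0 * math.pi * sigma ** 2) + sq / (2.0 * sigma ** 2)

def half_sse(y_true, y_pred):
    """Half the sum of squared errors."""
    return 0.5 * sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))

y = [1.0, 2.0, 3.0]
pred_a = [1.1, 1.9, 3.2]
pred_b = [0.5, 2.5, 2.0]

# With sigma = 1 the two losses differ only by a constant that does not depend on w.
const = 0.5 * len(y) * math.log(2.0 * math.pi)
gap_a = gaussian_nll(y, pred_a) - half_sse(y, pred_a)
gap_b = gaussian_nll(y, pred_b) - half_sse(y, pred_b)
```

Because the gap is the same constant for any prediction, minimizing the negative log-likelihood in Eq. (7) and minimizing the squared loss select the same weights.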
The LSTM NN is trained by the Adam optimizer (Kingma and
Ba, 2014), a stochastic optimization method
with high computational efficiency, simple implementation, low
memory requirements, and invariance to diagonal rescaling
of the gradients. The weights $w$, based on
the Adam optimizer, are optimized to minimize the prediction
loss as
$$w \leftarrow w - \alpha_1 \Delta_w, \tag{9}$$
where $\alpha_1$ is the learning rate of the LSTM NN.
Fig. 3. Interval Prediction Module Using BNN
3.3 Traffic Flow Interval Prediction Using BNN
3.3.1 The Structure of the BNN
BNN, as a variant of the standard neural network, combines the
neural network with Bayesian inference. Since its weights, biases,
and outputs are considered variables, a BNN can be regarded as
an ensemble of multiple neural networks. Before training the
BNN model, the traffic flow data is standardized for the same
purpose as the mean prediction module. As shown in Fig. 3, the
traffic flow interval prediction module includes several fully
connected BNN layers. The circular nodes represent the input
data and hidden output states of the BNN layer. The input data is
the traffic flow sequence for the t periods before the future period
t + 1. The hidden layers are fully connected BNN layers with
nonlinear activation functions. The number of layers and the
number of neurons in each hidden layer also need to be set on the
basis of the characteristics of actual traffic flow. The grey square
node represents the output of the BNN model, i.e., the interval of
the traffic flow for the future period t + 1.
3.3.2 Bayesian by Backpropagation for Training BNN
BNN seeks to find the posterior distribution $P(w|D)$ of the
weights on the training traffic flow data. This distribution answers
predictive queries about future traffic flow data by taking expectations.
Given an observation $\hat{x}$, the predictive distribution of the future traffic
flow $\hat{y}$ can be written as
$$P(\hat{y}|\hat{x}) = \mathbb{E}_{P(w|D)}\left[P(\hat{y}|\hat{x}, w)\right], \tag{10}$$
where $D$ is the training traffic flow data and $w$ is the weights. In
Eq. (10), the expectation $\mathbb{E}_{P(w|D)}\left[P(\hat{y}|\hat{x}, w)\right]$ is intractable for
neural networks of any real traffic flow data.
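Although the expectation in Eq. (10) is intractable in general, it can be approximated by sampling weights and pushing the input through each sampled network. The sketch below illustrates the idea on a toy one-weight model; the Gaussian stand-in for $P(w|D)$, the noise level, and the input value are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior over the single weight of the toy model y = w * x + noise.
post_mean, post_std = 2.0, 0.1   # stand-in for P(w | D)
noise_std = 0.5                  # assumed observation noise

def sample_predictions(x_new, n_samples=5000):
    """Draw y ~ P(y | x_new, w) with w ~ P(w | D): a Monte Carlo view of Eq. (10)."""
    w = rng.normal(post_mean, post_std, size=n_samples)
    return w * x_new + rng.normal(0.0, noise_std, size=n_samples)

samples = sample_predictions(x_new=3.0)
mean_pred = samples.mean()
lower, upper = np.percentile(samples, [2.5, 97.5])   # central 95% interval
```

The sample mean plays the role of a point prediction, and the empirical percentiles give one simple way to read a prediction interval off the sampled outputs.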
To approximate Bayesian processes in neural networks, three
ways are usually applied, including Monte Carlo dropout (MC
dropout) method, Markov chain Monte Carlo (MCMC) method,
and Bayesian by backpropagation method. The Bayesian by
backpropagation, one of the variational inference methods, is
adopted in this study for the following reasons: 1) the MC dropout
method might not fully capture the uncertainty associated with
the model predictions and is only applicable to models with
dropout layer(s) (Chan et al., 2020); 2) the MCMC method requires
storing a very large number of samples and is suitable for small
and average models (Blei et al., 2017); 3) the Bayesian by
backpropagation method fits any parametric distribution as posterior
and is applicable to large-scale models (Blundell et al., 2015).
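A minimal sketch of the Bayesian by backpropagation procedure (Blundell et al., 2015) on a toy one-weight regression may make the method concrete. It assumes a Gaussian variational posterior with standard deviation $\log(1 + \exp(\rho))$, a standard normal prior, unit-variance Gaussian noise, made-up data, and plain gradient descent in place of Adam; none of these settings are taken from this study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data for y = w * x + unit-variance noise (made-up values).
x = np.linspace(-1.0, 1.0, 20)
y = 1.5 * x + rng.normal(0.0, 1.0, size=x.size)

def softplus(rho):
    return np.log1p(np.exp(rho))

def partials(w, mu, rho):
    """Partial derivatives of f = log q(w|theta) - log P(w) - log P(D|w)."""
    sigma = softplus(rho)
    df_dw = -(w - mu) / sigma**2 + w - np.sum((y - w * x) * x)
    df_dmu = (w - mu) / sigma**2
    df_dsigma = -1.0 / sigma + (w - mu) ** 2 / sigma**3
    df_drho = df_dsigma / (1.0 + np.exp(-rho))   # chain rule through the softplus
    return df_dw, df_dmu, df_drho

mu, rho = 0.0, -3.0
lr, n_samples = 0.01, 10   # plain gradient descent here, not Adam
for _ in range(3000):
    g_mu = g_rho = 0.0
    for eps in rng.standard_normal(n_samples):
        w = mu + softplus(rho) * eps                              # Eq. (14)
        df_dw, df_dmu, df_drho = partials(w, mu, rho)
        g_mu += df_dw + df_dmu                                    # Eq. (15)
        g_rho += df_dw * eps / (1.0 + np.exp(-rho)) + df_drho     # Eq. (16)
    mu -= lr * g_mu / n_samples                                   # Eq. (17)
    rho -= lr * g_rho / n_samples                                 # Eq. (18)

# Closed-form posterior of this conjugate toy model, used only as a reference.
post_var = 1.0 / (1.0 + np.sum(x**2))
post_mean = post_var * np.sum(x * y)
```

The gradients are averaged over the Monte Carlo samples, which rescales the summed loss without changing its minimizer. In this conjugate toy case the learned `mu` and `softplus(rho)` can be compared with the exact posterior mean and standard deviation; for a real network no such closed form exists, which is why the variational approximation is needed.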
The method finds the best parameters $\theta$ on a distribution
$q(w|\theta)$ to approximate the true posterior distribution $P(w|D)$.
By minimizing the Kullback-Leibler divergence (KL divergence)
(Kullback and Leibler, 1951) of the two distributions, the
approximation can be achieved. Combined with Bayes' theorem,
parameters $\theta$ can be calculated as
$$\begin{aligned}
\theta^{*} &= \arg\min_\theta \mathrm{KL}\left[q(w|\theta)\,\|\,P(w|D)\right] \\
&= \arg\min_\theta \int q(w|\theta) \log\frac{q(w|\theta)}{P(w|D)}\,dw \\
&= \arg\min_\theta \int q(w|\theta) \log\frac{q(w|\theta)}{P(w)P(D|w)}\,dw \\
&= \arg\min_\theta \mathrm{KL}\left[q(w|\theta)\,\|\,P(w)\right] - \mathbb{E}_{q(w|\theta)}\left[\log P(D|w)\right].
\end{aligned} \tag{11}$$
The loss function of BNN is further simplified as follows
$$F(D, \theta) = \mathrm{KL}\left[q(w|\theta)\,\|\,P(w)\right] - \mathbb{E}_{q(w|\theta)}\left[\log P(D|w)\right]. \tag{12}$$
Equation (12) is a sum of a data-dependent part and a prior-dependent
part that can be respectively referred to as likelihood
cost and complexity cost (Blundell et al., 2015). Monte Carlo
sampling is adopted because computing this cost exactly is
computationally expensive; thus the prediction loss of Eq. (12) can be approximated as
$$F(D, \theta) \approx \sum_{i=1}^{n} \log q(w^{(i)}|\theta) - \log P(w^{(i)}) - \log P(D|w^{(i)}), \tag{13}$$
where $w^{(i)}$ is the $i$th Monte Carlo sample from the variational
posterior $q(w^{(i)}|\theta)$. For $w \sim N(\mu, \sigma^2)$, where $\mu$ is the mean and
$\sigma$ is the standard deviation, Monte Carlo sampling directly
from $N(\mu, \sigma^2)$ makes $\mu$ and $\sigma$ non-differentiable. To solve the
non-differentiable problem, a reparametrization trick (Kingma
and Welling, 2019) is utilized to guarantee the operation of
backpropagation. The $\sigma$ is parameterized as $\sigma = \log(1 + \exp(\rho))$,
so $\sigma$ is non-negative. $\theta = (\mu, \rho)$ are the variational posterior
parameters, and the posterior sample of the weights $w$ is
$$w = t(\theta, \epsilon) = \mu + \sigma \circ \epsilon = \mu + \log(1 + \exp(\rho)) \circ \epsilon, \tag{14}$$
where $\epsilon$ is a random variable. The gradients of the mean $\mu$ and
the standard deviation parameter $\rho$ are therefore calculated as
$$\Delta_\mu = \frac{\partial f(w, \theta)}{\partial w} + \frac{\partial f(w, \theta)}{\partial \mu}, \tag{15}$$
$$\Delta_\rho = \frac{\partial f(w, \theta)}{\partial w}\,\frac{\epsilon}{1 + \exp(-\rho)} + \frac{\partial f(w, \theta)}{\partial \rho}, \tag{16}$$
where $f(w, \theta) = \log q(w|\theta) - \log P(w) - \log P(D|w)$ is the summand of
Eq. (13), following Blundell et al. (2015).
In this module, the Adam optimizer is also chosen for training
the BNN with large-scale data and parameters. The variational
posterior parameters $\theta = (\mu, \rho)$ are optimized as
$$\mu \leftarrow \mu - \alpha_2 \Delta_\mu, \tag{17}$$
$$\rho \leftarrow \rho - \alpha_2 \Delta_\rho, \tag{18}$$
where $\alpha_2$ is the learning rate of the BNN.
4. Case Study
To assess the effectiveness of the proposed LSTM-BNN model,
the Caltrans Performance Measurement System (PeMS) dataset
collected on freeways is utilized for the case study. The
SARIMA$(1,0,1)(0,1,1)_{672}$-GARCH(1,1) model, a state-of-the-art
method for traffic flow prediction and uncertainty quantification,
is selected as the benchmark. For comparison purposes, the
prediction intervals for both models are computed at the 95%
confidence level.
4.1 Data Description
The Caltrans PeMS dataset is widely used in traffic parameter
prediction tasks (Xu et al., 2013; Guo et al., 2019; Li et al., 2019;
Yao et al., 2022). The data is collected every 30 seconds from
over 15,000 individual vehicle detector stations (VDSs) located
throughout California. Sacramento, as the capital of California,
benefits from a well-developed freeway network that facilitates
efficient travel within the city and connects Sacramento to other
regions. The traffic flow data for these freeways is readily available
and well-documented in Caltrans PeMS. As shown in Table 1,
we use traffic flow data from 6 VDSs on 6 freeways in Sacramento.
The specific locations of these VDSs on the freeways are presented
in Fig. 4. These particular VDSs are strategically chosen from
major freeways to ensure representative coverage of the city. The
prediction results obtained from our study have the potential to
assist Sacramento traffic authorities in active traffic management
and control. The study data was collected from September 1st,
2018 to November 30th, 2018.
Table 1. The Detailed Information for Selected VDSs in Sacramento City
Number  VDS     Freeway  Location name            Lanes
1       314909  I5-N     WB Florin Rd             4
2       318626  I5-S     Seamas Ave               4
3       318282  US50-E   25th St                  4
4       312220  US50-W   NB Howe Ave              4
5       312694  SR51-N   51NB at J ST             4
6       312857  SR51-S   51SB at Elvas Underpass  3
Fig. 4. The Locations of the Selected VDSs
Fig. 5. The Traffic Flow Distributions for 6 Selected VDSs
The raw data was aggregated at 15-min intervals. Fig. 5 displays
the traffic flow distribution for the selected VDSs, with median
flows ranging from approximately 750 to 1000, 15th percentile
flows ranging from approximately 150 to 400 and 85th percentile
flows ranging from approximately 1050 to 1400. The traffic flow
distributions of all 6 VDSs are different, which can effectively
verify the model's validity. The first two months of data were
used to train the models and the last month of data was utilized
for evaluation. The training data was transformed to generate the
training set and validation set, and the test data was transformed
to generate the test set. The data transformation is conducted
according to the period number t of the input data. In this case
study, t ranges from 1 to 20. The optimal t for each module is
determined in Section 4.3.
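The data transformation by period number t can be pictured as a sliding window over the aggregated flow series. The following sketch is an assumed formulation: `make_windows` is a hypothetical helper name and the flow values are made up.

```python
import numpy as np

def make_windows(flows, t):
    """Return (X, y): each row of X holds t consecutive flows, y the flow that follows."""
    flows = np.asarray(flows, dtype=float)
    n = len(flows) - t
    if n <= 0:
        raise ValueError("series must be longer than the window length t")
    X = np.stack([flows[i:i + t] for i in range(n)])
    y = flows[t:]
    return X, y

flows = [810, 795, 840, 900, 950, 1010]   # 15-min flows, made-up values
X, y = make_windows(flows, t=3)
```

Each sample pairs the flows of t past periods with the flow of the next period, which matches the one-step-ahead setup described in Section 3.1.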
4.2 Case Study Design
Traffic flow prediction is commonly formulated as a regression
problem in the literature (Lv et al., 2015; Yu et al., 2018; Pavlyuk,
2019; Razail et al., 2021). Thus, four regression measures are
chosen to evaluate the performance of the baseline and proposed
model. Mean absolute error (MAE) and mean absolute percentage
error (MAPE) are utilized to determine the accuracy of the mean
of the predicted traffic flow. Mean interval width (MIW) and
kickoff percentage (KP) are used to quantify the uncertainty of
predicted traffic flow intervals. The four performance measures
are defined as
$$\mathrm{MAE} = \frac{1}{N} \sum_{i \in T} \left|\hat{f}_i - f_i\right|, \tag{19}$$
$$\mathrm{MAPE} = \frac{1}{N} \sum_{i \in T} \frac{\left|\hat{f}_i - f_i\right|}{f_i} \times 100\%, \tag{20}$$
$$\mathrm{MIW} = \frac{1}{N} \sum_{i \in T} \left(\hat{f}_i^{\,\mathrm{upper}} - \hat{f}_i^{\,\mathrm{lower}}\right), \tag{21}$$
$$\mathrm{KP} = \frac{KN}{N} \times 100\%, \tag{22}$$
where $N$ is the number of overall prediction samples, $KN$ is the
number of kickoffs, with a kickoff indicating that the real flow
lies outside of the prediction interval. In period $i$, $f_i$ is the actual
15-min traffic flow, $\hat{f}_i$ is the mean predicted 15-min traffic flow,
and $\hat{f}_i^{\,\mathrm{upper}}$ and $\hat{f}_i^{\,\mathrm{lower}}$ are the upper and lower bounds of a prediction
interval. All performance measures are expected to be small.
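The four measures in Eqs. (19)-(22) translate directly into code. The sketch below uses a hypothetical helper `interval_metrics`; the two sample rows are the first two LSTM-BNN entries of Table 3 and serve only as a worked example.

```python
import numpy as np

def interval_metrics(actual, predicted, lower, upper):
    """Compute MAE, MAPE (%), MIW, and KP (%) as defined in Eqs. (19)-(22)."""
    actual, predicted, lower, upper = (
        np.asarray(a, dtype=float) for a in (actual, predicted, lower, upper)
    )
    abs_err = np.abs(predicted - actual)
    kickoff = (actual < lower) | (actual > upper)   # real flow outside the interval
    return {
        "MAE": abs_err.mean(),
        "MAPE": (abs_err / actual).mean() * 100.0,
        "MIW": (upper - lower).mean(),
        "KP": kickoff.mean() * 100.0,
    }

metrics = interval_metrics(
    actual=[1034, 762],
    predicted=[1374, 788],
    lower=[1310, 688],
    upper=[1616, 965],
)
```

For these two rows the absolute errors are 340 and 26 veh/15 min and the interval widths are 306 and 277 veh/15 min; the first real flow falls outside its interval, so half of the samples are kickoffs.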
The performance of the LSTM-BNN model and the SARIMA-GARCH
model are compared by VDS based on all performance
metrics. To demonstrate the detailed performance of the two
models, all metrics are also computed by time of day. The
daytime is simply defined as from 6:00 am to 7:00 pm, while the
remainder of the time is considered nighttime. In addition, we
define non-seasonal traffic conditions as those in which the traffic flow in the
current period changes by more than 15% compared to the flow
in the same period of the previous day. The model performance
under non-seasonal traffic conditions is also evaluated.
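The non-seasonal condition can be expressed as a simple mask over the 15-min flow series. This sketch assumes that the change ratio is measured relative to the previous-day flow and that the previous-day flow is non-zero; `non_seasonal_mask` is a hypothetical helper name, and the short example uses a lag of 2 instead of 96 purely to keep the data small.

```python
import numpy as np

PERIODS_PER_DAY = 96   # number of 15-min periods in 24 h

def non_seasonal_mask(flows, threshold=0.15, lag=PERIODS_PER_DAY):
    """Flag periods whose flow differs by more than `threshold` from `lag` periods earlier."""
    flows = np.asarray(flows, dtype=float)
    mask = np.zeros(flows.shape, dtype=bool)   # the first `lag` periods have no reference
    previous = flows[:-lag]
    change = (flows[lag:] - previous) / previous
    mask[lag:] = np.abs(change) > threshold
    return mask

# Tiny example with lag=2: 120 vs 100 is a 20% change, 210 vs 200 is a 5% change.
mask = non_seasonal_mask([100, 200, 120, 210], lag=2)
```

Applying the mask to the test set selects the samples on which the non-seasonal evaluation is based.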
4.3 Hyperparameter Tuning of the LSTM-BNN Model
To enhance the model performance, hyperparameter tuning was
performed on the two LSTM-BNN modules. Both the mean
prediction module and the interval prediction module contain
hyperparameters such as the length of the input layer, the number
of hidden layers, the number of memory cells in each layer, etc.
Fig. 6 shows the main experiments of the hyperparameter
tuning. For the LSTM NN of the mean prediction module, MAE is
selected as the tuning measure. The traffic flows for the past 10
periods are suitable as the input layer for learning the traffic flow
sequence pattern. The experiments indicate that one LSTM layer of 10
memory cells is optimal. A batch size of 500 and a learning
rate of 0.001 are appropriate for propagation through the network.
For the BNN of the interval prediction module, KP is chosen as the
tuning measure based on a fixed MIW. The optimal length of
the input layer is 10. One BNN layer of 30 neurons is
sufficient to obtain a good performance. A batch size of 1000 and
an initial learning rate of 0.01 are suitable for training the network.
The values of the key hyperparameters of the LSTM-BNN
model are summarized in Table 2.
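Table 2 lists a step_size of 5000 and a gamma of 0.5 alongside the initial BNN learning rate of 0.01. Assuming these follow the conventional step-decay semantics (as in common deep learning schedulers), the learning rate at a given step can be sketched as below. Whether a step counts iterations or epochs is not stated in the text, so the interpretation here is an assumption, and `step_decay_lr` is a hypothetical helper name.

```python
def step_decay_lr(step, base_lr=0.01, step_size=5000, gamma=0.5):
    """Learning rate after `step` updates when it is multiplied by gamma every step_size updates."""
    return base_lr * gamma ** (step // step_size)

schedule = {s: step_decay_lr(s) for s in (0, 4999, 5000, 12000)}
```

Under this reading the rate stays at 0.01 for the first 5000 steps, halves to 0.005, and halves again to 0.0025 after 10000 steps.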
4.4 Performance Evaluation
4.4.1 Accuracy Evaluation
Figure 7 presents the overall accuracy of traffic flow prediction
for the LSTM-BNN model and the SARIMA-GARCH model,
including MAE and MAPE measures. For all 6 VDSs, the MAE
of the LSTM-BNN model is 47.8 veh/15 min, and the MAE of
the SARIMA-GARCH model is 58.7 veh/15 min. The MAPE of
the LSTM-BNN model is 8.9%, and the MAPE of the SARIMA-GARCH
model is 10.6%. For each VDS, the MAEs and MAPEs
of the LSTM-BNN model are smaller than those of the SARIMA-GARCH
model. Overall, the LSTM-BNN model tends to perform
better than the SARIMA-GARCH model in predicting the mean
15-min traffic flow.
Fig. 6. Hyperparameter Tuning of the LSTM-BNN Model: (a) LSTM-The Length of the Input Layer, (b) LSTM-The Number of LSTM Layers, (c) LSTM-The
Number of Memory Cells in Each Layer, (d) LSTM-Batch Size, (e) LSTM-Learning Rate, (f) BNN-The Length of the Input Layer, (g) BNN-The
Number of BNN Layers, (h) BNN-The Number of Memory Cells in Each Layer, (i) BNN-Batch Size, (j) BNN-Learning Rate
The detailed accuracy performances of 15-min traffic flow
prediction by time of day are shown in Fig. 8. For most times of
the day, both MAEs and MAPEs of the LSTM-BNN model are
smaller than those of the SARIMA-GARCH model. During the
daytime, the accuracy improvement of the LSTM-BNN model over
the SARIMA-GARCH model is more pronounced. This
may be because the training data contains more
high-traffic scenes during the daytime than low-traffic scenes
during the nighttime, so the LSTM-BNN model is more
likely to learn the traffic flow sequence patterns during the
daytime.
Table 2. Hyperparameters of the LSTM-BNN Model
Module                      Hyperparameter                             Value
Mean prediction (LSTM NN)   The length of the input layer              10
                            The number of LSTM layers                  1
                            The number of memory cells in each layer   10
                            Batch size                                 500
                            Learning rate                              0.001
Interval prediction (BNN)   The length of the input layer              10
                            The number of BNN layers                   1
                            The number of memory cells in each layer   30
                            Batch size                                 1000
                            Learning rate, step_size, gamma            0.01, 5000, 0.5
4.4.2 Uncertainty Quantification
Figure 9 displays the overall uncertainty of traffic flow prediction
for the LSTM-BNN model and the SARIMA-GARCH model,
including MIW and KP measures. For all 6 VDSs, the MIW of
the LSTM-BNN model is 310.4 veh/15 min, and the MIW of the
SARIMA-GARCH model is 314.5 veh/15 min. For each VDS,
the LSTM-BNN model predicts a narrower MIW than the SARIMA-GARCH
model. For all 6 VDSs, the KP of the LSTM-BNN
model is 5.1%, and the KP of the SARIMA-GARCH model is
5.4%. For each VDS, the KPs of the LSTM-BNN model are not larger
than those of the SARIMA-GARCH model. Together, these results
suggest that LSTM-BNN performs better in predicting the
interval of the traffic flow.
The uncertainty of 15-min traffic flow prediction in terms of
time of day is presented in Fig. 10. It can be found that the MIWs
of the LSTM-BNN model, ranging approximately from 250 veh/
15 min to 400 veh/15 min, are more stable than those of the
SARIMA-GARCH model. During the nighttime with low traffic
flow, based on the wider predicted MIWs, the LSTM-BNN model
predicts smaller KPs. During the daytime with heavy traffic flow,
the KPs of the LSTM-BNN model are closer to those of the
benchmark, while the MIWs of the LSTM-BNN model are
much narrower. Consistent with the conclusions of the accuracy
evaluation, the LSTM-BNN model outperforms during the
daytime with high traffic flow.
4.4.3 Extended Evaluation under Non-Seasonal Traffic
Conditions
To further evaluate the efficacy of the proposed LSTM-BNN
model, Table 3 lists several typical examples of traffic flow
prediction and uncertainty quantification under non-seasonal traffic
conditions. In this work, a non-seasonal traffic condition is simply
defined as one in which the traffic flow in the current period changes by more
than 15% compared to the flow in the same period of the previous
day. It is apparent from this table that the absolute errors (AEs) and
absolute percentage errors (APEs) of the LSTM-BNN model are much smaller than those of the
benchmark under all listed non-seasonal traffic conditions.
Meanwhile, the LSTM-BNN model predicts narrower interval widths (IWs) than
the SARIMA-GARCH model. In the last column, "Yes" and
"No" indicate whether the real flow lies in the prediction
interval of the model. For all 12 non-seasonal traffic conditions,
8 real flows lie in the prediction intervals of the LSTM-BNN
model and 5 lie in those of the SARIMA-GARCH model. The
superiority of the LSTM-BNN model is mostly identified in
conditions with increased traffic flow. Overall, the findings reveal
that the proposed model outperforms the benchmark under non-seasonal
traffic conditions. A possible reason is that the
LSTM-BNN model can capture the nonlinear relationship in the
traffic flow series, whereas the SARIMA-GARCH model relies more on the
seasonal patterns of the traffic flow sequences.
Fig. 7. Accuracy Comparison of Traffic Flow Prediction by VDS: (a) MAE, (b) MAPE
Fig. 8. Accuracy Comparison of Traffic Flow Prediction by Time of Day: (a) MAE, (b) MAPE
Fig. 9. Uncertainty Comparison of Traffic Flow Prediction by VDS: (a) MIW, (b) KP
Fig. 10. Uncertainty Comparison of Traffic Flow Prediction by Time of Day: (a) MIW, (b) KP
ㄴ
Table 3. Comparison of Traffic Flow Prediction under Unseasonal Traffic Conditions
(Real flow, predicted flow, bounds, AE, and IW are in veh/15 min; change ratio and APE are in %.)

VDS     Real flow  Change ratio  Model         Predicted flow  Lower bound  Upper bound  AE   APE    IW    Real flow in prediction interval?
314909  1034       -34%          LSTM-BNN      1374            1310         1616         340  33%    306   No
314909  1034       -34%          SARIMA-GARCH  1511            1273         1750         477  46%    477   No
314909  762        -28%          LSTM-BNN      788             688          965          26   3%     277   Yes
314909  762        -28%          SARIMA-GARCH  844             339          1349         82   11%    1010  Yes
318626  1480       -20%          LSTM-BNN      1455            1255         1720         25   2%     465   Yes
318626  1480       -20%          SARIMA-GARCH  1653            1338         1968         173  12%    630   Yes
318626  1415       -22%          LSTM-BNN      1689            1499         1872         274  19%    373   No
318626  1415       -22%          SARIMA-GARCH  1709            1513         1905         294  21%    392   No
318282  614        -49%          LSTM-BNN      988             738          1178         374  61%    440   No
318282  614        -49%          SARIMA-GARCH  1098            812          1383         484  79%    571   No
318282  711        -37%          LSTM-BNN      767             543          984          56   8%     441   Yes
318282  711        -37%          SARIMA-GARCH  990             412          1568         279  39%    1156  Yes
312220  1030       -20%          LSTM-BNN      1145            875          1370         115  11%    495   Yes
312220  1030       -20%          SARIMA-GARCH  1501            739          2264         471  45.7%  1525  Yes
312220  1093       -17%          LSTM-BNN      1279            1079         1503         186  17%    424   Yes
312220  1093       -17%          SARIMA-GARCH  1640            1278         2001         547  50%    723   No
312694  754        -32%          LSTM-BNN      1078            912          1219         324  43%    307   No
312694  754        -32%          SARIMA-GARCH  1134            864          1403         380  50%    539   No
312694  1079       41%           LSTM-BNN      965             850          1122         114  11%    272   Yes
312694  1079       41%           SARIMA-GARCH  811             604          1020         268  25%    416   No
312857  1092       23%           LSTM-BNN      1077            1008         1226         15   1%     218   Yes
312857  1092       23%           SARIMA-GARCH  937             799          1075         155  14%    276   No
312857  858        -20%          LSTM-BNN      956             768          1048         98   11%    262   Yes
312857  858        -20%          SARIMA-GARCH  1025            814          1237         167  19%    423   Yes
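The columns of Table 3 are linked by simple arithmetic, which can be checked directly. The following is a minimal Python sketch, assuming that AE is the absolute difference between predicted and real flow, APE is AE divided by the real flow, IW is the upper bound minus the lower bound, and a real flow is "in the interval" when it lies between the two bounds inclusive. The names `row_metrics` and `summarize` are illustrative and do not come from the paper; the tuples are transcribed from Table 3.

```python
# Rows transcribed from Table 3 as (real, predicted, lower, upper), in veh/15 min.
LSTM_BNN = [
    (1034, 1374, 1310, 1616), (762, 788, 688, 965),
    (1480, 1455, 1255, 1720), (1415, 1689, 1499, 1872),
    (614, 988, 738, 1178), (711, 767, 543, 984),
    (1030, 1145, 875, 1370), (1093, 1279, 1079, 1503),
    (754, 1078, 912, 1219), (1079, 965, 850, 1122),
    (1092, 1077, 1008, 1226), (858, 956, 768, 1048),
]
SARIMA_GARCH = [
    (1034, 1511, 1273, 1750), (762, 844, 339, 1349),
    (1480, 1653, 1338, 1968), (1415, 1709, 1513, 1905),
    (614, 1098, 812, 1383), (711, 990, 412, 1568),
    (1030, 1501, 739, 2264), (1093, 1640, 1278, 2001),
    (754, 1134, 864, 1403), (1079, 811, 604, 1020),
    (1092, 937, 799, 1075), (858, 1025, 814, 1237),
]


def row_metrics(real, pred, lower, upper):
    """Return (AE, APE in %, IW, covered) for one table row.

    The formulas are assumptions inferred from the printed values.
    """
    ae = abs(pred - real)
    ape = 100.0 * ae / real
    iw = upper - lower
    covered = lower <= real <= upper
    return ae, ape, iw, covered


def summarize(rows):
    """Collect per-row metrics and count how many real flows are covered."""
    metrics = [row_metrics(*row) for row in rows]
    return {
        "ae": [m[0] for m in metrics],
        "ape": [m[1] for m in metrics],
        "iw": [m[2] for m in metrics],
        "covered": sum(1 for m in metrics if m[3]),
    }


if __name__ == "__main__":
    for name, rows in (("LSTM-BNN", LSTM_BNN), ("SARIMA-GARCH", SARIMA_GARCH)):
        s = summarize(rows)
        print(name, "covered:", s["covered"], "of", len(rows))
```

Under these assumptions the coverage counts come out to 8 for the LSTM-BNN model and 5 for the SARIMA-GARCH model, in agreement with the text. Recomputing IW from the bounds reproduces the printed widths for every row except the last LSTM-BNN row, where the bounds give 280 against a printed 262.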
5. Conclusions
Short-term traffic flow prediction has been studied extensively
because it serves many intelligent traffic management applications.
However, only a limited number of parametric studies focus on
measuring uncertainty in traffic flow prediction. Over the past
decade, artificial intelligence techniques have become a successful
means of solving many transportation problems. This paper
proposes an LSTM-BNN framework for short-term traffic flow
prediction and uncertainty quantification. Caltrans PeMS traffic
flow data for 6 freeways in Sacramento city are aggregated at
15-min intervals to evaluate the LSTM-BNN model. The
SARIMA-GARCH model is used as the benchmark. MAE, MAPE, MIW,
and KP are chosen as the performance measures. For accuracy,
the LSTM-BNN model achieves an MAE of 47.8 veh/15 min and a
MAPE of 8.9%. For uncertainty quantification, it achieves an MIW
of 310.4 veh/15 min and a KP of 5.1%.
The experimental results show that the LSTM-BNN model
outperforms the SARIMA-GARCH model in predicting both the
mean and the interval of 15-min traffic flow. In particular, the
LSTM-BNN model is superior during the daytime and under
non-seasonal traffic conditions. In practice, the proposed
LSTM-BNN model can be used in ITS to support reliable
management decisions.
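The four performance measures named above can be written as short aggregate functions. The following is a minimal Python sketch, not the authors' implementation. It assumes that MAE and MAPE are the means of the absolute and absolute percentage errors, that MIW is the mean width of the prediction intervals, and that KP is the percentage of real flows falling outside their prediction intervals. The function names and the toy inputs are illustrative only.

```python
from statistics import mean


def mae(real, pred):
    """Mean absolute error, in the same unit as the flows."""
    return mean(abs(p - r) for r, p in zip(real, pred))


def mape(real, pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * mean(abs(p - r) / r for r, p in zip(real, pred))


def miw(lower, upper):
    """Mean interval width, in the same unit as the flows."""
    return mean(u - l for l, u in zip(lower, upper))


def kp(real, lower, upper):
    """Percentage of real flows outside [lower, upper] (assumed KP definition)."""
    outside = sum(1 for r, l, u in zip(real, lower, upper) if not l <= r <= u)
    return 100.0 * outside / len(real)


# Toy numbers chosen only to exercise the functions; they are not from the paper.
real = [100, 200, 400, 500]
pred = [110, 190, 400, 450]
lower = [90, 195, 350, 510]
upper = [130, 230, 450, 560]

if __name__ == "__main__":
    print("MAE :", mae(real, pred))
    print("MAPE:", mape(real, pred))
    print("MIW :", miw(lower, upper))
    print("KP  :", kp(real, lower, upper))
```

With these definitions, a lower MAE or MAPE indicates a more accurate mean prediction, while MIW and KP have to be read together: a narrow interval is only useful if the share of real flows outside it stays close to the nominal level.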
Future research could explore combining a spatial-temporal
neural network with a BNN to measure network-level uncertainty
in traffic flow forecasting. Additionally, considering the dynamics
and propagation of the traffic system, it would be worthwhile
to investigate using BNNs to address other traffic uncertainty
problems.
Acknowledgments
This study was funded by the National Natural Science Foundation
of China (No. 71971060). The authors thank the anonymous
reviewers for their useful comments and suggestions, which improved
the quality of this paper.
ORCID
Yinpu Wang https://orcid.org/0000-0002-0280-6978
Siping Ke https://orcid.org/0000-0003-1599-2359
Chengchuan An https://orcid.org/0000-0002-5254-8751
Zhenbo Lu https://orcid.org/0000-0001-5887-872X
Jingxin Xia https://orcid.org/0000-0003-2298-3303
References
Ahmed MS, Cook AR (1979) Analysis of freeway traffic time-series
data by using Box-Jenkins techniques. Transportation Research
Record 722:1-9
Blei DM, Kucukelbir A, McAuliffe JD (2017) Variational inference: A
review for statisticians. Journal of the American Statistical Association
112(518):859-877, DOI: 10.1080/01621459.2017.1285773
Blundell C, Cornebise J, Kavukcuoglu K, Wierstra D (2015) Weight
uncertainty in neural networks. In International Conference on Machine
Learning 1613-1622, DOI: 10.48550/arXiv.1505.05424
Chan A, Alaa A, Qian Z, Schaar MVD (2020) Unlabelled data improves
Bayesian uncertainty calibration under covariate shift. Proceedings
of the 37th International Conference on Machine Learning, PMLR,
1392-1402, DOI: 10.48550/arXiv.2006.14988
Clark SD, Dougherty MS, Kirby HR (1993) The use of neural networks
and time series models for short term traffic forecasting: A comparative
study. In Transportation Planning Methods. Proceedings of Seminar
D Held at the Ptrc European Transport, Highways and Planning 21st
Summer Annual Meeting, 363
Du S, Li T, Gong X, Yang Y, Horng SJ (2017) Traffic flow forecasting
based on hybrid deep learning framework. 2017 12th International
Conference on Intelligent Systems and Knowledge Engineering
(ISKE), IEEE, Nanjing, 1-6, DOI: 10.1109/ISKE.2017.8258813
Gers FA, Schmidhuber J (2000) Recurrent nets that time and count.
Proceedings of the IEEE-INNS-ENNS International Joint Conference
on Neural Networks. IJCNN 2000. Neural Computing: New Challenges
and Perspectives for the New Millennium, IEEE, Como, Italy 3:189194, DOI: 10.1109/IJCNN.2000.861302
Guo J, Huang W, Williams BM (2014) Adaptive Kalman filter approach
for stochastic short-term traffic flow rate prediction and uncertainty
quantification. Transportation Research Part C: Emerging Technologies
43:50-64, DOI: 10.1016/j.trc.2014.02.006
Guo S, Lin Y, Feng N, Song C, Wan H (2019) Attention based spatialtemporal graph convolutional networks for traffic flow forecasting.
Proceedings of the AAAI Conference on Artificial Intelligence 33:922929, DOI: 10.1609/aaai.v33i01.3301922
Hamed MM, Al-Masaeid HR, Said ZMB (1995) Short-term prediction of
traffic volume in urban arterials. Journal of Transportation Engineering
121(3):249-254, DOI: 10.1061/(ASCE)0733-947X(1995)121:3(249)
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural
Computation 9(8):1735-1780, DOI: 10.1162/neco.1997.9.8.1735
Jia Y, Wu J, Xu M (2017) Traffic flow prediction with rainfall impact using
a deep learning method. Journal of Advanced Transportation 2017:
e6575947, DOI: 10.1155/2017/6575947
Jospin LV, Laga H, Boussaid F, Buntine W, Bennamoun M (2022) Handson Bayesian neural networks — a tutorial for deep learning. IEEE
Computational Intelligence Magazine 17(2):29-48, DOI: 10.1109/
MCI.2022.3155327
Kendall A, Gal Y (2017) What uncertainties do we need in Bayesian
deep learning for computer vision. Advances in Neural Information
Processing Systems, 30, DOI: 10.48550/arXiv.1703.04977
Kingma DP, Ba J (2014) Adam: A method for stochastic optimization.
DOI: 10.48550/arXiv.1412.6980
Kingma DP, Welling M (2019) An introduction to variational autoencoders.
Foundations and Trends in Machine Learning 12(4):307-392, DOI:
10.1561/2200000056
Kullback S, Leibler RA (1951) On information and sufficiency. The
Annals of Mathematical Statistics 22(1):79-86, DOI: 10.1214/aoms/
1177729694
Li Z, Jiang S, Li L, Li Y (2019) Building sparse models for traffic flow
prediction: An empirical comparison between statistical heuristics and
geometric heuristics for Bayesian network approaches. Transportmetrica
B: Transport Dynamics 7(1):107-123, DOI: 10.1080/21680566.2017.
1354737
Long J, Gao Z, Ren H, Lian A (2008) Urban traffic congestion propagation
and bottleneck identification. Science in China Series F: Information
374 Y. Wang et al.
Sciences 51(7):948-964, DOI: 10.1007/s11432-008-0038-9
Luo X, Niu L, Zhang S (2018) An algorithm for traffic flow prediction
based on improved SARIMA and GA. KSCE Journal of Civil
Engineering 22(10):4107-4115, DOI: 10.1007/s12205-018-0429-4
Lv Y, Duan Y, Kang W, Li Z, Wang F-Y (2015) Traffic flow prediction
with big data: A deep learning approach. IEEE Transactions on
Intelligent Transportation Systems 16(2):865-873, DOI: 10.1109/
TITS.2014.2345663
Okutani I, Stephanedes YJ (1984) Dynamic prediction of traffic volume
through Kalman filtering theory. Transportation Research Part B:
Methodological 18(1):1-11, DOI: 10.1016/0191-2615(84)90002-X
Park B, Messer CJ, Urbanik T (1998) Short-term freeway traffic volume
forecasting using radial basis function neural network. Transportation
Research Record: Journal of the Transportation Research Board
1651(1):39-47, DOI: 10.3141/1651-06
Pavlyuk D (2019) Feature selection and extraction in spatiotemporal
traffic forecasting: A systematic literature review. European Transport
Research Review 11(1):6, DOI: 10.1186/s12544-019-0345-9
Razali NAM, Shamsaimon N, Ishak KK, Ramli S, Amran MFM, Sukardi S
(2021) Gap, techniques and evaluation: Traffic flow prediction using
machine learning and deep learning. Journal of Big Data 8(1):152,
DOI: 10.1186/s40537-021-00542-7
Shao H, Soong BH (2016) Traffic flow prediction with long short-term
memory networks (LSTMs). 2016 IEEE Region 10 Conference
(TENCON), 2986-2989, DOI: 10.1109/TENCON.2016.7848593
Smith BL, Williams BM, Oswald RK (2002) Comparison of parametric
and nonparametric models for traffic flow forecasting. Transportation
Research Part C: Emerging Technologies 10(4):303-321, DOI:
10.1016/S0968-090X(02)00009-8
Tang J, Zeng J, Wang Y, Yuan H, Liu F, Huang H (2021) Traffic flow
prediction on urban road network based on license plate recognition
data: Combining attention-LSTM with genetic algorithm.
Transportmetrica A: Transport Science 17(4):1217-1243, DOI:
10.1080/23249935.2020.1845250
Tedjopurnomo DA, Zheng B, Choudhury FM, Qin K (2020) A survey
on modern deep neural network for traffic prediction: Trends, methods
and challenges. IEEE Transactions on Knowledge and Data
Engineering, DOI: 10.1109/TKDE.2020.3001195
Treiber M, Kesting A (2013) Traffic flow dynamics. Traffic Flow Dynamics:
Data, Models and Simulation, Springer-Verlag Berlin Heidelberg
983-1000, DOI: 10.1007/978-3-642-32460-4
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN,
Kaiser L, Polosukhin I (2017) Attention is all you need. Advances in
Neural Information Processing Systems 30:6000-6010, DOI: 10.48550/
arXiv.1706.03762
Williams BM (1999) Modeling and forecasting vehicular traffic flow as
a seasonal stochastic time series process. PhD Thesis, University of
Virginia, Charlottesville, USA
Williams BM, Hoel LA (2003) Modeling and forecasting vehicular
traffic flow as a seasonal ARIMA process: Theoretical basis and
empirical results. Journal of Transportation Engineering 129(6):664672, DOI: 10.1061/(ASCE)0733-947X(2003)129:6(664)
Wu Y, Tan H (2016) Short-term traffic flow forecasting with spatialtemporal correlation in a hybrid deep learning framework. Computer
Vision and Pattern Recognition, DOI: 10.48550/arXiv.1612.01022
Xia J, Nie Q, Huang W, Qian Z (2013) Reliable short-Term traffic flow
forecasting for urban roads: Multivariate generalized autoregressive
conditional heteroscedasticity approach. Transportation Research
Record: Journal of the Transportation Research Board 2343(1):7785, DOI: 10.3141/2343-10
Xu C, Liu P, Wang W, Jiang X (2013) Development of a crash risk index
to identify real time crash risks on freeways. KSCE Journal of Civil
Engineering 17(7):1788-1797, DOI: 10.1007/s12205-013-0353-6
Yao R, Zhang W, Long M (2022) DLW-Net model for traffic flow
prediction under adverse weather. Transportmetrica B: Transport
Dynamics 10(1):499-524, DOI: 10.1080/21680566.2021.2008280
Ye Z, Zhang Y, Middleton DR (2006) Unscented Kalman filter method
for speed estimation using single loop detector data. Transportation
Research Record: Journal of the Transportation Research Board
1968(1):117-125, DOI: 10.1177/0361198106196800114
Yu B, Yin H, Zhu Z (2018) Spatio-temporal graph convolutional networks:
A deep learning framework for traffic forecasting. Proceedings of
the Twenty-Seventh International Joint Conference on Artificial
Intelligence, International Joint Conferences on Artificial Intelligence
Organization, Stockholm, Sweden, 3634-3640, DOI: 10.48550/
arXiv.1709.04875
Zeng J, Lesnikowski A, Alvarez JM (2018) The relevance of Bayesian
layer positioning to model uncertainty in deep Bayesian active
learning. Machine Learning, DOI: 10.48550/arXiv.1811.12535
Zhu L, Laptev N (2017) Deep and confident prediction for time series at
Uber. 2017 IEEE International Conference on Data Mining Workshops
(ICDMW), 103-110, DOI: 10.1109/ICDMW.2017.19