Development and Evaluation of a Procedure for the Calibration of

advertisement
Development and Evaluation
of a Procedure for the Calibration
of Simulation Models
Byungkyu (Brian) Park and Hongtu (Maggie) Qi
behavior and whether one can trust the decisions made on the basis of
the simulations. Microscopic simulation models contain numerous
independent parameters that can be used to describe traffic flow characteristics, driver behavior, and traffic control operations. These models provide a default value for each parameter, but they also allow
users to change the values to represent local traffic conditions (3). The
process of adjusting and fine-tuning model parameters by using realworld data to reflect local traffic conditions is model calibration.
Simulation model-based analyses are often performed with default
parameter values or manually adjusted values. Rigorous calibration
procedures are often omitted because doing so requires a great deal of
time as well as a huge amount of field data. The findings and conclusions based on the uncalibrated or inappropriately calibrated models
could be misleading and even erroneous. Thus, proper calibration is
a crucial step in simulation applications.
To achieve adequate reliability of the simulation models, it is
important that a rigorous calibration procedure be applied before any
further study and analysis are conducted. Changes to the parameters
during calibration should be justified and defensible by the users.
Most of the calibration efforts were performed to achieve a reasonable correspondence between the field data and the output of the simulation model. Recently, more and more transportation researchers
and practitioners have realized the importance of model calibration
and have spent significant time and effort to demonstrate the validities of their models. However, it was found that simulation model calibration was often informally practiced among researchers and had
seldom been formally proposed as a procedure or guideline (3).
Therefore, proposing a general and practical procedure for simulation
model calibration is an urgent task.
Several researchers have discussed the general requirements of a
simulation model calibration procedure (3–9). Park and Schneeberger
proposed a general calibration procedure based on a linear regression
model (3). However, they did not consider the correlations among the
parameters. Sacks et al. demonstrated an informal validation process
and emphasized the importance of data quality and visualization (4).
Hellinga proposed general requirements for model calibration (5).
However, a formal procedure for conducting calibration was not
provided.
The objective of this paper is to propose and evaluate a general procedure for calibration of microscopic simulation models. The validity
of the proposed procedure was evaluated and demonstrated by use of
a case study with an actuated signalized intersection by using a widely
used microscopic traffic simulation model, Verkehr in Staedten Simulation (VISSIM). The remainder of the paper is organized as follows:
a brief background of the test site and simulation model, VISSIM, is
provided; the methodology of the proposed procedure is described
and is followed by the implantation and evaluation of the procedure
Microscopic traffic simulation models have been playing an important
role in the evaluation of transportation engineering and planning practices for the past few decades, particularly in cases in which field implementation is difficult or expensive to conduct. To achieve high fidelity
and credibility for a traffic simulation model, model calibration and validation are of utmost importance. Most calibration efforts reported in the
literature have focused on the informal practice, and they have seldom
proposed a systematic procedure or guideline for the calibration and
validation of simulation models. This paper proposes a procedure for
microscopic simulation model calibration. The validity of the proposed
procedure was demonstrated by use of a case study of an actuated signalized intersection by using a widely used microscopic traffic simulation model, Verkehr in Staedten Simulation (VISSIM). The simulation
results were compared with multiple days of field data to determine the
performance of the calibrated model. It was found that the calibrated
parameters obtained by the proposed procedure generated performance
measures that were representative of the field conditions, while the simulation results obtained with the default and best-guess parameters were
significantly different from the field data.
Microscopic traffic simulation models have been widely and increasingly applied for the evaluation of transportation system design, traffic operations, and management alternatives for the past few decades
because simulation is inexpensive, risk-free, and fast. As recognized
by the Highway Capacity Manual (HCM), simulation becomes a
valuable aid in assessing the performance of transportation systems (1). Furthermore, traffic simulation models sometimes provide significant advantages over traditional planning or analytical
models, such as Highway Capacity Software (HCS). For example,
the estimation of delay from HCS is often inappropriate when the
impacts of a large-scale system are considered or oversaturated conditions are prevalent (2). Other common reasons for the popularity
of simulation models include their attractive animations; their stochastic variability for the capture of real-world traffic conditions; and
their capabilities to model complex roadway geometries, such as
combined systems of urban streets and freeways.
When more and more people begin to rely on simulation as an
important evaluation tool, an essential concern is its credibility. Potential issues are whether these models accurately replicate realistic driver
B. Park, Department of Civil Engineering, University of Virginia, and Virginia
Transportation Research Council, P.O. Box 400742, Charlottesville, VA 22904-4742.
H. Qi, BMI-SG, 8330 Boone Boulevard, Suite 400, Vienna, VA 22182-2624.
Transportation Research Record: Journal of the Transportation Research Board,
No. 1934, Transportation Research Board of the National Academies, Washington,
D.C., 2005, pp. 208–217.
208
Park and Qi
209
by use of a case study; lastly, conclusions and recommendations are
provided.
TEST SITE AND SIMULATION MODEL
Test Site
An actuated signalized intersection was modeled in VISSIM to test
the proposed procedure for simulation model calibration. The intersection is located at the junction of U.S. Route 15 and U.S. Route 250
in Virginia. The test site is referred to as “Site 15” throughout the
remainder of this paper. As shown in Figure 1, it has four legs with a
single-lane approach for all directions, and there are exclusive rightturn bays 110 ft long on all approaches. The southbound approach
carries a relatively heavy traffic volume and has a large proportion
of left-turning vehicles during peak hours. This isolated actuated
intersection has two phases and functions, such as minimum and
maximum green times, added initial, passage time, and minimum gap.
Simulation Model
VISSIM, a microscopic and stochastic simulation model, was chosen
in this research for evaluation of the proposed procedure. VISSIM is
capable of simulating traffic operations on urban streets and freeways,
with a special emphasis on public transportation and multimodal
transportation (10). VISSIM consists of two different programs: the
traffic simulator and the signal state generator. The traffic simulator
is a microscopic simulation model that comprises car-following and
lane-changing logics, and it provides a simulation environment. The
simulator is capable of simulation at a resolution of up to 0.1 s. The
signal state generator allows users to define their own signal control
logic, including actuated signal control. VISSIM uses the psychophysical driver behavior model developed by Wiedemann (11), which
uses dynamic speeds and stochastic car-following models and, hence,
models driving behavior close to that detected in the field. In this
research, all simulation work was carried out with Version 3.70 of
VISSIM on a personal computer with a Pentium 2.53-GHz central
processing unit and 1.00 GB of random access memory.
PROPOSED PROCEDURE
The proposed procedure for simulation model calibration consists of
simulation model setup, initial calibration, feasibility test, parameter
calibration by use of a genetic algorithm (GA), and model evaluation,
as shown in Figure 2.
Simulation Model Setup
Simulation model setup comprises those tasks that are conducted
before the commencement of model calibration. These tasks include
the definition of the study scope and purpose, site selection, determination of measures of effectiveness (MOEs), field data collection, and
network coding.
Initial Evaluation
Once a simulation model is correctly set up, the simulation performance data based on the default parameter are obtained and compared
with the field data. If a close match between them is found, the default
model could be deemed appropriate for use for further analysis.
Otherwise, the subsequent steps must be performed.
Route 15
Route 250
FIGURE 1
Site 15 in VISSIM.
210
Transportation Research Record 1934
Simulation Model Setup
- Study scope and purpose
- Site selection
- Determination of measures of effectiveness
- Field data collection
- Network coding
Satisfied
Initial Evaluation on
Default Parameters
Unsatisfied
Initial Calibration
- Identification of calibration parameters
- Experimental design for calibration (Latin
Hypercube Design)
- Multiple runs
number of combinations to a practical amount while still reasonably
covering the entire parameter surface.
LHD provides an orthogonal array that randomly samples the entire
design space broken into regions of equal probability. This type of
sampling can be regarded as a stratified Monte Carlo sampling method
in which the pairwise correlations can be minimized to a small value,
which is essential for uncorrelated parameter estimates (12). LHD
is especially useful for exploring the interior of the parameter space
and for limiting the experiment to a fixed or a user-defined number
of combinations. This technique ensures that the entire range of
each parameter is sampled. In this study, a Latin Hypercube Sampling toolbox in MatLab (13) was used to generate a predetermined
number of scenarios by using the calibration parameters of each
simulation model.
Multiple Runs
Adjust key
parameter
ranges
Feasibility Test
- Statistical plots
- Analysis of variance
- Identification of key parameters
Feasibility Test
Passed?
No
To reduce stochastic variability, multiple runs are conducted for each
scenario from the experimental design. The number of replications
should be a function of the variability of the system, the acceptable
error level, and computation time constraints. The average performance measure and standard deviation are recorded for each of the
runs. The simulation results are used in the feasibility test.
Feasibility Test
Yes
Parameter
Calibration Using
Genetic Algorithm
Evaluation of the Parameter Sets
Unsatisfied - Distribution of simulation output
- Statistical test
- Visualization check
Satisfied
End
FIGURE 2
Methodology flowchart.
Initial Calibration
Identification of Calibration Parameters
All calibration parameters within the simulation model except those
that do not have any relevant impact on the simulation results are identified. For example, for a simple network without a route diversion
possibility, the parameters related to the route changes could be eliminated from the calibration parameter list. The acceptable ranges of
selected calibration parameters are determined.
Experimental Design for Calibration
The number of combinations among feasible controllable parameters
is extremely large, such that all possible scenarios cannot be evaluated in a reasonable amount of time. A Latin hypercube design (LHD)
algorithm (12), an experimental design method, is used to reduce the
The purpose of the feasibility test is to check whether the field data
are reasonably covered by the distribution of simulation results and,
hence, to determine whether the current parameter ranges are feasible. If field data fall within the middle 90% of the distribution of the
simulation results, defined as the acceptable range, the current parameter ranges are deemed sufficient to generate the field condition.
Otherwise, the ranges of key parameters are adjusted to achieve the
desired results. However, the adjusted value should be reasonable
and realistic. With the adjusted parameters, another set of a predetermined number of scenarios is generated and evaluated. This process
is repeated until the field data fall within the acceptable range.
Statistical Plots
With the performance measure recorded in each of the runs, statistical plots such as scatter plots and histograms are drawn to help
interpret the results. The plots include the distribution of performance measures from multiple runs and the performance measure
versus each calibration parameter. For the key parameters, the trend
is usually easy to observe.
Analysis of Variance
Analysis of variance (ANOVA), a more rigorous statistical method,
is applied to identify key parameters. ANOVA tests the null hypothesis that the means for several groups in the population are equal by
comparing the sample variance estimated from the group means with
that estimated within the groups (14).
In this study each pair of calibration parameter and simulation
results was examined by one-way ANOVA by use of a statistical
package, SPSS (15). It can identify statistically significant parameters
on the basis of the significance value of the F-test and can quantify the
Park and Qi
211
degree to which each parameter explains the performance measure on
the basis of the sum of squares (SSRs) between the groups. When the
P-value is smaller than the user-defined confidence level, the null
hypothesis is rejected, thereby indicating that group means are statistically different. Therefore, the parameters with small P values and
large SSRs are identified as key parameters.
evaluations of travel time distributions, statistical tests, and animations are satisfactory. Otherwise, either the previous step, which is
parameter calibration by use of a GA, or the initial calibration stage
must be reperformed.
IMPLEMENTATION OF PROPOSED PROCEDURE
Parameter Calibration by Use of GA
After the identification of appropriate parameters and their acceptable
ranges, a GA is applied to find the optimal parameter values. Analysis by use of a GA is a heuristic optimization technique based on the
mechanics of natural selection and evolution (16 ). It works with a
population of individuals, each of which represents a possible solution to a given problem. A fitness value is assigned to each individual
according to how good its represented solution is. Selection of the best
individuals from the current generation and mating of the best individuals to produce a new set of individuals produce a new population
of possible solutions. Three basic operators used in GA analysis are
reproduction, crossover, and mutation. The reproduction operator
selects individuals with higher fitness. The crossover operator creates
the next population from the intermediate population, whereas the
mutation operator is used to explore some areas that have not been
searched. Schema theorem and building blocks hypothesis are rigorous explanations of how the highly fit individuals survive to the next
generations (17 ).
In this study the relative error of the average travel time between
the simulation output and the field data was used as the fitness value
for the GA. The fitness function takes the form
FV =
TTfield − TTsim
TTfield
(1)
where
FV = fitness value,
TTfield = average travel time from the field, and
TTsim = average travel time from the simulation.
The proposed procedure was implemented and evaluated in a case
study by using VISSIM. This section describes the main steps in detail
and highlights the strategies of fine-tuning of the key parameters.
Simulation Model Setup
The tasks under the first step, simulation model setup, consist of the
definition of the study scope and purpose, site selection, determination of MOEs, field data collection, and network coding. The data
needed in this case study included the simulation input data and
MOEs for calibration purposes. To code Site 15 in the simulation
environment, input data such as traffic counts, the length of each
approach lane, intersection geometric characteristics, detector locations, and the posted speed limit were required. To account for dayto-day variability, data were collected during the evening peak
hour between 5:00 and 6:00 p.m. on multiple days, April 15 and
22 and May 13, 2003. The MOE used in this network was the average travel time on the southbound through and left-turn movements
because it directly reflected the level of service of the signalized
intersection. Furthermore, the collection of data from both the field
and the simulation was relatively easy.
Initial Evaluation
Once the simulation model was correctly set up, multiple simulations based on the default parameters were conducted and the average travel time was recorded. It was found that the average travel
time of 23.4 s from the simulation was far less than the field travel
time of 56.75 s. Thus, it was deemed that calibration was necessary
and that the following steps needed to be practiced.
Evaluation of Parameter Sets
Once GA finishes parameter calibration, multiple runs are separately
conducted for the default parameters and the GA-based parameters
for comparison. The best-guess parameter set is another parameter set
that could be included in the comparison. Here, the best guess is
derived by assignment of the most adequate value to each parameter
on the basis of engineering judgment and knowledge of local traffic
conditions. For each parameter set, a distribution of performance
measures is developed and compared with the field measures.
Statistical testing, for example, by the t-test and visualization
checking, is conducted. The t-test is used to verify whether the calibrated parameter set from GA can generate statistically significant
results and perform better than the default parameter set. In this study,
the performance measure used in the t-test was the average travel
time within the study period. Visualization is used to evaluate the
calibrated models. A model cannot be considered calibrated if the
animations are not realistic. Animations from multiple runs of each
model are viewed to ensure that the animations are acceptable. The
model is regarded as properly calibrated only if the results of the
Initial Calibration
Identification of Calibration Parameters
In the initial phase, users may not be able to identify correctly all the
critical parameters or the reasonable range for each parameter. Two
criteria could potentially help users select the parameter candidates to
be calibrated:
1. Conduct a trial-and-error test for each parameter. If the parameter does not affect the simulation result at all, it can be excluded
from the parameter candidate list.
2. Judge from a traffic engineer’s point of view. For instance,
because the study network has only one lane for each approach and
because the origin–destination was simple without any diversion possibility, parameters related to lane change or route choice behaviors
should not have an impact on the simulation result. Thus, the minimum headway (front and rear) and the waiting time before diffusion
were eliminated from the parameter list.
212
Transportation Research Record 1934
Users can determine the acceptable range for each parameter
selected on the basis of a review of the literature, including the manuals for simulation models and other existing research findings. So
that the calibration direction is not misled, the use of unrealistic
values in the range should be avoided. The feasibility of selected
parameters and their ranges was later verified through a feasibility test.
The following is the list of parameters and acceptable ranges determined initially. A detailed description of each parameter can be found
in the VISSIM manual (10).
Multiple Runs
Five random seeded runs were conducted in VISSIM for each of
the 200 cases, for a total of 1,000 runs. The average travel time was
recorded for each of the 1,000 runs. The results from the five multiple runs of each scenario were then averaged to represent each of the
200 parameter sets.
Feasibility Test
1. Simulation resolution: 1 to 9 time steps per simulation second
2. Number of observed preceding vehicles: 1 to 4
3. Average standstill distance: 1 to 5 m
4. Saturation flow rate: 1,756, 1,800, 1,846, 1,895, and 1,946
vehicles per hour
– Additive part of desired safety distance: 2.0 to 2.5
– Multiple part of desired safety distance: 3.0 to 3.7
5. Priority rules for minimum headway: 5 to 20 m
6. Priority rules for minimum gap time: 3 to 6 s
7. Desired speed distribution: 40 to 50, 30 to 40, and 20 to 30 mph
Both the additive and the multiple parts of the desired safety distances in the Wiedemann 74 car-following model determine the saturation flow rate for VISSIM. The values were based on Table 5.2.6
(flow rate, 25 s of through green time) in the VISSIM manual. The priority rules and the two associated parameters determine the priorities
of conflicting traffic movements in the intersection and designate the
right-of-way.
Experimental Design for Calibration
The Latin Hypercube Sampling toolbox in MATLAB was used to
generate 200 scenarios by using the first set of parameters and their
ranges, as defined above.
In this case, for eight controllable input parameters and five different values for each parameter, it would require 58 (i.e., 390,625) possible combinations. Any number of scenarios within this value could
have been used. However, a significant amount of time would be
required to conduct such a large number of experiments. Furthermore,
the large number of simulation runs would be increased if multiple
simulation runs for each scenario were required to reduce stochastic
variability. Initial tests indicated that the use of 200 scenarios was
adequate to cover the entire parameter surface and for computational
simulation and calculation.
With the simulation results based on the initial parameter set, it
is important to answer the following key questions by applying
statistical methods and analyzing plots.
• Are the simulation results based on the current parameters, and
do their ranges include the field conditions? If not, some parameters
and their ranges need to be adjusted or new parameters should be
explored. Histogram analysis is appropriate for this purpose. Figure 3
shows that the average field travel time, 56.75 s, falls outside of the
simulated distribution generated from the initial ranges of parameters. It indicates that the current parameters and their ranges were
not satisfactory and needed adjustment.
• Is the simulation result sensitive to the different values of each
parameter? Also, which parameters are critical to the simulation
results? The use of plots and the ANOVA test is appropriate. Scatter
plots of each parameter versus the travel time from the simulations
were investigated. It was found that the minimum gap and the desired
speed distribution were two parameters important to the results. The
travel time became higher when the minimum gap increased or the
mean desired speed decreased, or both. Figure 4 provides two examples of the scatter plots. The mean desired speed demonstrates an
apparent trend, while the simulation resolution has no effect on the
simulation result. However, the identification of the critical parameters through statistical analysis is more rigorous. The values of each
parameter were made discrete, separated into several groups, and
indexed. A one-way ANOVA procedure was then used to test the null
hypothesis that the means for two or more groups were equal. If the
parameter was sensitive to the result, the means for different groups
should be statistically different. Table 1 shows the ANOVA results
and lists the significance value of the F-test, the mean difference
between groups, and the SSRs between groups. A small P-value and
a large SSR indicate that the null hypothesis is rejected and that this
parameter is important to the result. The group pairs were listed if the
mean difference was significant at either the .5 or the .05 level.
70
Frequency
60
50
40
30
56.75 sec
20
10
0
FIGURE 3
25
30
35
40
45
Travel Time (sec)
50
Feasibility test results for Site 15 with VISSIM.
55
60
Park and Qi
213
60
Travel Time (sec)
50
40
30
20
10
0
20
25
30
35
40
Mean Desired Speed (mph)
(a)
45
50
60
Travel Time (sec)
50
40
30
20
10
0
0
1
2
3
4
5
6
Simulation Resolution
(b)
7
8
9
10
FIGURE 4 Calibration parameters versus VISSIM travel time: (a) mean desired speed and
(b) simulation resolution.
On the basis of the ANOVA results in Table 1, the desired speed
distribution and the minimum gap time were identified as critical
parameters. The desired speed distribution is an important parameter
because it has a significant influence on roadway capacity and achievable speeds. If a driver is not hindered by other vehicles, the driver
will travel at a desired speed with small stochastic variation. The min-
TABLE 1
imum gap is an important parameter for the modeling of conflicting
traffic movements and determination of when the movement with a
lower priority could proceed. To achieve a longer travel time from the
simulation, the mean desired speed should be decreased or the minimum gap should be increased. However, because the upper bound of
the minimum gap (i.e., 6 s) has already been reached, the use of higher
ANOVA Results for Site 15 with VISSIM
Site 15—VISSIM
Simulation resolution
No. of preceding vehicles
Average standstill distance
Saturation flow
Min. headway (m)
Min. gap time (s)
Desired speed distribution
Significance Value
(P-value)
Significant Mean
Differences (sig. < 0.5)
SSR Between
Groups
0.989
0.336
0.369
0.910
0.220
0.000
0.000
NA
1–3, 1–4
1–3
NA
1– 4, 2– 4
1–3, 1– 4*, 2– 4*, 3– 4
All
12.8
137.0
80.8
40.9
178.2
834.8
6655.9
* = The significance value is less than .05; NA = not applicable.
214
Transportation Research Record 1934
values for the minimum gap is not realistic. It was believed that some
other factors that might have affected the result were not yet identified. Therefore, additional tools, like the HCM procedure and field
data, were used to help clarify the ranges of some parameters, such as
the saturation flow on the southbound approach and the field speed
within the intersection.
Check Speed Within the Intersection
Although a posted speed limit of 45 mph was observed in the field,
the actual speed within the intersection could be less than 45 mph
because of inadequate sight distance, permitted left turns, a narrow
intersection, and other factors, such as the presence of gas stations
about 500 ft upstream of the stop line on the southbound approach. It
is more reasonable to use the field average speed to set the link speed
and turning speed. By viewing the videotapes, the speed data were
retrieved as the tail of the vehicle crossed the stop bar on the southbound approach. The first few vehicles in the queue were not considered because they were still accelerating. The result shows that the
mean field speed within the intersection is 35.3 mph, with a standard
deviation of 14.9 mph.
and the new ranges of the additive part of desired safety distance
and the multiple part of desired safety distance were 1.0 to 5.0 and
1.0 to 6.0, respectively. The updated parameter set and its ranges
are as follows:
1.
2.
3.
4.
5.
Simulation resolution: 1 to 9 time steps per simulation second
Number of observed preceding vehicles: 1 to 4
Maximum look-ahead distance: 200 to 300 m
Average standstill distance: 1 to 5 m
Saturation flow rate
– Additive part of desired safety distance: 1.0 to 5.0
– Multiple part of desired safety distance: 1.0 to 6.0
6. Priority rules for minimum headway: 5 to 20 m
7. Priority rules for minimum gap time: 3 to 6 s
8. Desired speed distribution: 30 to 40, 32.5 to 37.5, and 27.5 to
42.5 mph
With the new parameters, another 200 cases were generated by the
LHD method. The result shows that the new simulated distribution
covers the field data and indicates that the new parameter ranges were
able to capture the field condition.
Check Saturation Flow
Parameter Calibration by Use of GA
The saturation flow rate is another important factor for determination
of the intersection capacity, and this in turn affects the vehicle travel
time. It can be obtained from field data and by the HCM procedure
(1). The following shows how the saturation flow rate was calculated
and estimated by using these approaches:
A GA was integrated with the VISSIM model to calibrate the parameters. First, the GA is started by generating a number of individuals in
the population, each of which represents a feasible solution. Second,
each individual is decoded into actual parameter values and inserted
into the simulation model. Multiple simulations were then carried out
for each set of decoded values, with the simulation results channeled
into a fitness function. In this case, five simulation runs were conducted for each parameter set to reduce simulation variability. Third,
a fitness value is calculated by comparison of the simulation results
and field data by using Equation 1. If the comparison result does not
meet the stopping criterion, then a new population is generated after
the implementation of selection, crossover, mutation, and elitism
operations. This process is continued until the stopping criterion is
met or the maximum number of generations is reached.
Figure 5a shows the convergence of the GA. The results were
based on 10 generations and a population size of 20. Other parameter
settings included a crossover rate of 0.8 and a mutation rate of 0.05.
Both the best fitness and the average fitness values decrease as the
number of generations increases, which indicates that the simulation
result becomes closer to the field value. The small oscillation of fitness values in the last generations is due to the stochastic variability
of the VISSIM simulation because only five runs were conducted for
each set of parameters. For the parameter set with the best fitness
value, VISSIM was run 100 times and the average travel time was
recorded for each run. The resulting distribution of travel times is
shown in Figure 5b. The field travel times from 3 days also are shown,
and they all fall within the simulated distribution.
• Calculation of queue discharge headway from field data. The
field saturation flow rate on the southbound approach was obtained
from the videotapes. First, the time stamps of the third and the last
vehicles in the queue were recorded as they passed the stop line in
each cycle. The saturation flow rate was then calculated by using
Equation 2. The average headway was 3.13 s, which resulted in a
saturation flow of 1,149 vehicles per hour per lane.
SF =
3,600
(t3 − tn ) (n − 3)
(2)
where
SF = saturation flow rate (vehicles per hour per lane),
t3 = discharge headway of the third vehicle, and
tn = discharge headway of the nth vehicle.
• Calculation by HCM procedure. A saturation flow rate was calculated by the HCM procedure (1). The ideal saturation flow rate of
1,900 vehicles per hour per lane was adjusted for the prevailing conditions. The adjustment is made by introducing factors such as the
number of lanes, the lane width, the heavy vehicle percentage, and
right and left turns. The adjusted saturation flow rate on the southbound
approach was 1,313 vehicles per hour per lane.
Parameter Range Modification
The parameter ranges were modified on the basis of the field speed
data and the estimated saturation flow rates. The modified desired
speed ranges were 30 to 40, 32.5 to 37.5, and 27.5 to 42.5 mph;
Evaluation of Parameter Sets
In addition to the default and GA-optimized parameter sets, a bestguess parameter set was prepared to evaluate the performances of
the calibrated simulation models. The best-guess parameter set was
prepared because default simulation parameters were often adjusted
on the basis of the engineers’ knowledge of local traffic and other
conditions. As both the default and the GA-optimized parameter sets
Park and Qi
215
0.45
0.4
Best
0.35
Avg.
Fitness
0.3
0.25
0.2
0.15
0.1
0.05
0
0
1
2
3
4
5
6
7
Number of Generations
(a)
8
9
10
60
Field 2:
53.32 secs
50
Field 3:
46.51 secs
Frequency
40
30
Field 1:
70.43 secs
20
10
0
30
40
50
60
Travel Time (sec)
(b)
70
80
FIGURE 5 Initial calibration result obtained by use of GA: (a) convergence of
fitness value with generation and (b) travel time distribution of GA-based
parameter set.
were already identified, this section describes how the best-guess
parameter set was obtained.
Best-guess parameters were obtained on the basis of the users’
experience with the study network and with the help of the analytical tool in HCM. For instance, in this case study, additional information such as saturation flow was obtained from the HCM procedure.
According to the tables provided in the VISSIM manual, a linear
regression model with an R2 value of .92 was derived to estimate the
saturation flow rate:
SF = 2, 469.105 − (172.93 bx_add ) − (65.51 bx_mult )
(3)
where
SF = saturation flow rate (vehicles per hour per lane),
bx_add = additive part of desired safety distance, and
bx_mult = multiple part of desired safety distance.
By use of the Solver function in Microsoft Excel software, an optimal combination of these two parameters was obtained to minimize
the difference between the HCM and VISSIM saturation flow. The
values for bx_add and bx_mult were 5 and 2.85 m, respectively.
For comparison purposes, the parameter values for each set are
listed in Table 2. The evaluation of three parameter sets (i.e., default,
GA optimized, and best guess) were based on three criteria: (a) comparison of the distributions of 100 VISSIM simulation outputs,
(b) paired t-test, and (c) animations. Comparison of these three
parameter sets shows the importance of calibration for microscopic
simulation models. The travel times along the southbound approach
are compared in Figure 6. Figure 6 shows that all three field travel
times fall within the distribution of simulation results generated with
the GA-optimized parameter set. The two other models (i.e., the
default and best-guess parameter sets) generated travel times much
less than those observed in the field. A t-test was conducted to compare the simulation results obtained with the GA-optimized parameter
set with those obtained with the two other parameter sets. The result
shows that the GA-optimized parameter set generated statistically significant results, unlike the two other parameter sets. Animations of
each parameter set were viewed to determine whether the animations
216
Transportation Research Record 1934
TABLE 2
Three Parameter Sets for Site 15 with VISSIM
Site 15—VISSIM
Default
Best-Guess
GA-Based
Simulation resolution
Number of observed preceding vehicles
Maximum look-ahead distance (m)
Average standstill distance (m)
Additive part of desired safety distance
Multiple part of desired safety distance
Priority rules—minimum headway (m)
Priority rules—minimum gap time (s)
Desired speed distribution (mph)
5
2
250.00
2.00
3.00
3.00
5.0
3.0
Car: 40–50
HV: 30–40
23.40
10
4
300.00
2.00
5.00
2.85
15.0
3.5
Car: 40–50
HV: 30–40
29.53
6
4
215.15
3.85
5.0
5.3
20.0
4.0
Car: 27.5–42.5
HV: 15.5–18.6
50.33
Average travel time (s)
HV = heavy vehicle.
calibration is manifested. It is recommended that a simulation model
be calibrated before any further simulation-based traffic analyses are
conducted.
Feasibility testing was found to be useful in identifying the reasonable and appropriate ranges of calibration parameters. It was performed as an initial check of whether or not the selected parameters
and their ranges were able to achieve the field condition. In addition,
field data and analytical tools could be used to help verify the initial
parameter ranges. The impact of key parameters cannot be overstated,
and appropriate values need to be assigned. Scatter plots and statistical analysis, such as analysis by the ANOVA test, were effective in
identifying key parameters.
Because this study used only one MOE, travel time, for model calibration, the performance of other MOEs is uncertain. Further research
is recommended to include additional MOEs in the calibration
process. However, for some MOEs, such as delay and queue length, it
should be noted that different simulation models provide different calculation methods. Furthermore, to validate the model, it is desirable to
collect another set of data under conditions different from those used
for the calibration. Different conditions could be those that have been
were realistic or unrealistic. For the GA-optimized calibrated parameter set, the animations at several travel time percentiles were found
to be acceptable. For the default and the best-guess parameter sets, the
majority of vehicles passed the intersection without waiting, which
was not realistic.
CONCLUSIONS AND RECOMMENDATIONS
This paper has proposed and evaluated a general calibration procedure for microscopic simulation models. The validity of the proposed
procedure was demonstrated by use of a case study of an actuated
signalized intersection with VISSIM. The performance of the GAoptimized calibrated model was evaluated by comparing the distribution of the simulation output with multiple days of field data. The
result shows that use of the GA-optimized calibrated parameters
yielded a distribution of simulation outputs that matched the field
travel times for multiple days, while use of the default parameters and
best-guess parameters resulted in significant discrepancies between
the simulated and the field data. Therefore, the importance of model
45
40
53.32 sec
Frequency
35
46.51 sec
30
25
20
15
10
70.43 sec
5
0
20
30
40
50
Travel Time (sec)
Default Parameters
FIGURE 6
Best-Guess
60
70
GA-based
Comparison of travel times by different parameter sets.
80
Park and Qi
untried, such as after the implementation of a new signal timing plan,
or the use of different days on similar transportation facilities.
To account for day-to-day variability, multiple days of field data
should be used. The values of the data from only a single day could
be higher or lower than the true mean of the field value and may not
represent the general field conditions. A comprehensive range of data
collected by using advanced equipment as well as intelligent transportation system technologies should be obtained for the full calibration and validation of a simulation model. Although a network
with simple geometric alignments was tested in this study, a more
complex network should be considered to verify the performance of
the proposed procedure. Doing so will provide more insight into the
feasibility of the proposed procedure.
Finally, with the increased trend of the use of simulation applications for large and complex network-based problems, it is necessary
to calibrate not only driver behavior parameters but also dynamic origin–destination demand and route choice parameters. The proposed
procedure for simulation model calibration could be enhanced to
include more functions and evaluated for use with these scenarios.
217
5.
6.
7.
8.
9.
10.
11.
12.
REFERENCES
1. Highway Capacity Manual. TRB, National Research Council, Washington, D.C., 2000.
2. White, T. General Overview of Simulation Models. 49th Annual Meeting, Southern District Institute of Transportation Engineers, Williamsburg, Va., Jan. 2002.
3. Park, B., and J. D. Schneeberger. Microscopic Simulation Model Calibration and Validation: A Case Study of VISSIM for a Coordinated
Actuated Signal System. In Transportation Research Record: Journal of
the Transportation Research Board, No. 1856, Transportation Research
Board of the National Academies, Washington, D.C., 2003, pp. 185–192.
4. Sacks, J., N. Rouphail, B. Park, and P. Thakuriah. Statistically Based Validation of Computer Simulation Models in Traffic Operations and Man-
13.
14.
15.
16.
17.
agement. Journal of Transportation and Statistics, Vol. 5, No. 1, 2002,
pp. 1–24.
Hellinga, B. R. Requirement for the Calibration of Traffic Simulation
Models. Department of Civil Engineering, University of Waterloo.
http://www.civil.uwaterloo.ca/b.PDF. Accessed March 2003.
Milam, R. T. Recommended Guidelines for the Calibration and Validation of Traffic Simulation Models. Fehr & Peers Associates, Inc.
http://www.fehrandpeers.com/publications/traff_simulation_m.html.
Accessed March 2003.
Cohen, S. L. An Approach to Calibration and Validation of Traffic Simulation Models. Presented at 83rd Annual Meeting of the Transportation
Research Board, Washington, D.C., 2004.
Dowling, R., A. Skabardonis, J. Halkias, G. M. Hale, and G. Zammit.
Guidelines for Calibration of Microsimulation Models: Framework and
Applications. Presented at 83rd Annual Meeting of the Transportation
Research Board, Washington, D.C., 2004.
Zhang, Y., and L. E. Owen. Systematic Validation of a Microscopic
Traffic Simulation Program. Presented at 83rd Annual Meeting of the
Transportation Research Board, Washington, D.C., 2004.
VISSIM Version 3.7 Manual. PTV Planug Transport Verkehr AG, Innovative Transportation Concepts, Inc., Jan. 2003.
Wiedemann, R. Modeling of RTI-Elements on Multi-Lane Roads. In
Advanced Telematics in Road Transport, DG XIII, Commission of the
European Community, Brussels, Belgium, 1991.
McKay, M. D., and R. J. Beckman. A Comparison of Three Methods for
Selecting Values of Input Variables in the Analysis of Output from a
Computer Code. Technometrics, Vol. 21, No. 2, 1979, pp. 239–245.
MATLAB Users’ Manual. The Mathworks, Inc., Natick, Mass., 1999.
Milton, J. S., and J. C. Arnold. Introduction to Probability and Statistics.
McGraw-Hill, New York, 2003.
SPSS for Windows Release 10.1.0. SPSS Inc., Chicago, Ill., 2001.
Goldberg, D. E. Genetic Algorithms in Search, Optimization, and Machine
Learning. Addison-Wesley Publishing Co., Inc., Reading, Mass., 1989.
Beasley, D., D. R. Bull, and R. R. Martin. An Overview of Genetic Algorithms. Part I. Fundamentals. University Computing, Vol. 15, No. 2,
1993, pp. 58–69.
The Traffic Flow Theory and Characteristics Committee sponsored publication of
this paper.
Download