SPATIAL INTERACTION MODELS – GEOGRAPHICAL IDENTIFICATION
AND QUANTIFICATION OF MODEL BIASES GENERATED FROM GRAVITY
AND LOCATION SPECIFIC PREFERENCE FUNCTION MODELS
Dr Charles CHEUNG
Steer Davies Gleave, 28-32 Upper Ground, London SE1 9PD, United Kingdom
E-mail: charles.cheung@sdgworld.net
Dr John BLACK
Professor, Planning Research Centre, Faculty of Architecture, Wilkinson Building,
University of Sydney, Australia
E-mail: john.black@arch.usyd.edu.au
ABSTRACT
Current spatial interaction modeling techniques that are available and supported in software
packages mainly incorporate the gravity model, but previous research has found that
systematic biases often arise when the gravity model is applied to large metropolitan
regions. These biases may result in inappropriate decisions about infrastructure location and
capacity. This paper presents the results of a detailed investigation of the residual errors
identified in spatial interaction models. Various statistical and Geographical Information
System-Transportation (GIS-T) techniques are applied to identify geographically and to
quantify the residual errors. Travel datasets from the Sydney metropolitan area
are used. Model performance was improved by introducing a
probabilistic model based on Stouffer’s intervening opportunities model. The model
incorporates location specific preference functions to represent a relationship between
opportunities available and opportunities taken. The research presented in this paper
highlights the areas of deficiency in current spatial interaction modeling and points towards
the continual developments in preference functions and model evaluation techniques to
eliminate systematic model biases.
Keywords: spatial interaction, trip distribution, intervening opportunities, preference
functions, model biases
1. INTRODUCTION
One important transport challenge that a metropolitan city faces today is to develop and
maintain a balanced approach in achieving social, economic and environmental objectives.
The formulation of land use and transport plans and policies as well as the decision making
process in infrastructure investment often rely upon a robust and comprehensive analysis of
spatial interactions between residential locations and major activity centers. Models are
simplifications of the real world designed for a specific purpose. One of the principles of
model building for forecasting purposes is to keep the formulation as simple as possible,
with the minimum number of calibration parameters and input variables required to
meet the model’s objectives and accuracy. The conventional approach to spatial interaction
modeling for the journey to work uses aggregate zonal data, often without stratification by
employment category. This lack of stratification in the estimation of trip
origin-destination (O-D) matrices can be seen in the current operation of the
Sydney Strategic Travel Model (STM), whose approach is derived from the Sydney Area
Transportation Study (SATS) (NSW Department of Transport, 1974). Contemporary
developments in computing technologies equip model builders with an array of
tools to easily and efficiently evaluate different spatial interaction models and their
functional variations, tools that were not available when most aggregate models that form the
basis of today’s practice were first formulated. These advancements also allow
investigations into other forms of stratified spatial interaction models that can be practically
processed and calibrated in a reasonable timeframe.
Somewhat different from the gravity model, the intervening opportunities model starts from
the first principle that the spatial interaction of trip making behavior is related to the
accessibility of opportunities. It introduces probability theory as the theoretical
foundation of trip distribution. Stouffer (1940) applied the model to the context of
migration and the location of services and residences before Schneider (1959) developed its
mathematical procedure for the Chicago Area Transportation Study to simulate an
origin-destination pattern of trips. The intervening opportunities model has attracted far less
interest from researchers, mainly because of its additional complexity and apparent
computational difficulty. In fact, Stopher and Meyburg (1975) argued that, compared with
the gravity model, the intervening opportunities model has a stronger conceptual base and
attempts to address the problem of individual behavior. The Chicago Area Transportation
Study in the 1950s compared the performance of both the gravity model and the intervening
opportunities model and concluded that there was little difference in model accuracies.
A major contribution to the understanding, calibration and application of the intervening
opportunities model was presented by Ruiter (1967 and 1969). The L parameter of the
intervening opportunities model, as presented by Ruiter (1967), represents the constant
probability of a possible destination being accepted if it is considered. It acts as the
calibration parameter in the present form of the intervening opportunities model, where the
probability that a trip will terminate at a destination implies a negative exponential
relationship with the total number of possible destinations. A recent enhancement to the
understanding of the probability of trip making behavior was through the development of
preference functions in the 1990s. Masuya and Black (1992) presented the theoretical
concept of the preference function and discussed its estimation process using
journey-to-work data from Sapporo, Japan. Conceptually, a preference function is the inverse of the
intervening opportunities concept and represents a zonal aggregate of the travel behavioral
response given a particular opportunity surface surrounding those travelers (Paez et al.,
2001), which can be related to the probability of a destination being accepted in the
intervening opportunities model. A preference function can be estimated using
various functional forms that have a decaying relationship. These include the logarithm
or natural logarithm and quadratic functions (Masuya et al., 2002). The power function
is another alternative that has a decaying relationship. Cheung (2006) presented the
mathematical and computerized procedure for developing a calibrated preference
function model using various types of preference function.
The common spatial interaction model statistical evaluation measures include a comparison
of the average trip length; comparison of trip length or travel time frequency distributions;
chi-square; root-mean-square error (RMSE) or percent root-mean-square-error (%RMSE);
number of interchanges most accurately estimated; and comparison of the differences in
intra-zonal trips (for example, see Edens, 1970; Evans, 1971; Batty, 1976; Gray, 1980;
Smith and Hutchinson, 1981; Easa, 1984; de la Barra, 1989; Hunt, 1994; Sen and Smith,
1995; Zhao et al., 2001; Mao and Demetsky, 2002; and Baltimore Metropolitan Council,
2004). The best model is usually selected based on the goodness-of-fit statistics with
limited interpretation from 3-D graphs and GIS maps.
This paper presents the evaluation of the gravity and preference function models with the
use of travel datasets from a large metropolitan area, Sydney. Both statistical and GIS-T
techniques were applied to quantify and geographically identify model biases, an analysis
that was rarely undertaken in the past. Section 2 presents the underpinning theories of the
gravity model and the intervening opportunities model. Section 3 describes the data source used to
compare and evaluate both models. Section 4 gives the results of model calibration, and
section 5 compares the performance of each model using well-established statistical
methods (Black and Salter, 1975).
2. THEORIES
The framework of the gravity model is based on Newton’s law of gravitation, and its
purpose is to find an equation that reproduces the intra-zonal and inter-zonal trip
interchanges of travel survey data. The theories and mathematical concepts of the gravity
model are well covered in Wilson (1970), Bruton (1970), Hutchinson (1974), Stopher and
Meyburg (1975), Black (1981), Erlander and Stewart (1990), Sen and Smith (1995) and
Ortuzar and Willumsen (2001). For the purpose of this research investigation, the doubly
constrained gravity model, which represents standard practice, is used. The general notation
of this gravity model for the estimation of intra-zonal and inter-zonal trips is sketched out
below (see Black, 1981 and Ortuzar and Willumsen, 2001):
For a doubly constrained model:
$$ T_{ij} = a_i \, O_i \, b_j \, D_j \, f(C_{ij}) \tag{2.1} $$
where
Tij = estimate of the number of trips from zone i to zone j
Oi = survey number of workers living in zone i
Dj = survey number of jobs located in zone j
Cij = spatial impedance from zone i to zone j (distance, time or cost)
f(Cij) = the impedance or deterrence function
ai, bj = zonal balancing factors
The doubly constrained gravity model contains a constant for each production zone (ai) and
a constant for each attraction zone (bj) to ensure that the estimated zonal productions
equal the survey zonal productions and that the estimated zonal attractions equal
the survey zonal attractions. Hence, the doubly constrained model satisfies the following
conditions:

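$$ \sum_{j} T_{ij} = O_i \quad \text{and} \quad \sum_{i} T_{ij} = D_j \tag{2.2} $$
and
$$ a_i = \Big[ \sum_{j} D_j \, b_j \, f(C_{ij}) \Big]^{-1} \quad \text{and} \quad b_j = \Big[ \sum_{i} O_i \, a_i \, f(C_{ij}) \Big]^{-1} \tag{2.3} $$
where the sums run over all destination zones j and all origin zones i respectively.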
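The balancing factors in Equation (2.3) depend on each other, so in practice they are usually found by iteration. The following Python sketch illustrates one way this could be done; it is not the implementation used in this research. The negative exponential deterrence function, the value of `beta`, the function name and the three-zone data are all assumptions made for illustration.

```python
import numpy as np

def doubly_constrained_gravity(O, D, C, beta=0.1, tol=1e-9, max_iter=1000):
    """Iteratively balance T_ij = a_i O_i b_j D_j f(C_ij) (Equations 2.1 to 2.3)."""
    O = np.asarray(O, dtype=float)
    D = np.asarray(D, dtype=float)
    # Assumed deterrence function f(C_ij) = exp(-beta * C_ij); other forms are possible.
    F = np.exp(-beta * np.asarray(C, dtype=float))
    b = np.ones_like(D)
    for _ in range(max_iter):
        a = 1.0 / (F @ (b * D))        # a_i = [sum_j D_j b_j f(C_ij)]^-1
        b_new = 1.0 / (F.T @ (a * O))  # b_j = [sum_i O_i a_i f(C_ij)]^-1
        converged = np.max(np.abs(b_new - b)) < tol
        b = b_new
        if converged:
            break
    a = 1.0 / (F @ (b * D))            # final a_i consistent with the final b_j
    return (a * O)[:, None] * (b * D)[None, :] * F

# Hypothetical three-zone example; total workers must equal total jobs.
O = np.array([100.0, 200.0, 300.0])
D = np.array([250.0, 150.0, 200.0])
C = np.array([[2.0, 8.0, 12.0],
              [8.0, 3.0, 6.0],
              [12.0, 6.0, 2.5]])
T = doubly_constrained_gravity(O, D, C)
```

After convergence the row sums of `T` reproduce `O` and the column sums reproduce `D`, which are the two conditions in Equation (2.2).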
The mainstream form of the intervening opportunities model is presented in the major
textbooks including, Hutchinson (1974); Stopher and Meyburg (1975); Kanafani (1983);
Sheppard (1995); and Ortuzar and Willumsen (2001). It is sketched out as follows:
$$ T_{ij} = O_i \, \frac{\exp(-L V_j) - \exp(-L V_{j+1})}{1 - \exp(-L V)} \tag{2.4} $$
where
Tij = estimate of the number of trips from zone i to zone j
Oi = survey number of workers living in zone i
L = calibration parameter
V = subtended volume of residential destinations
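As an illustration of Equation (2.4), the sketch below distributes the trips from a single origin zone. It assumes that destinations are ranked by increasing impedance from the origin, that Vj is the cumulative number of opportunities ranked ahead of zone j, and that V is the total number of opportunities; the function name, the value of L and the data are hypothetical.

```python
import numpy as np

def intervening_opportunities_row(O_i, opportunities, impedance, L):
    """Trips from one origin to every destination, following Equation (2.4)."""
    opportunities = np.asarray(opportunities, dtype=float)
    order = np.argsort(impedance, kind="stable")  # nearest destination first
    ranked = opportunities[order]
    V_next = np.cumsum(ranked)                    # V_{j+1}: opportunities up to and including j
    V_prev = V_next - ranked                      # V_j: opportunities ranked ahead of j
    V_total = V_next[-1]                          # V: all opportunities
    share = (np.exp(-L * V_prev) - np.exp(-L * V_next)) / (1.0 - np.exp(-L * V_total))
    trips = np.empty_like(share)
    trips[order] = O_i * share                    # back to the original zone order
    return trips

# Hypothetical data: 1000 workers, three destination zones.
trips = intervening_opportunities_row(
    O_i=1000.0,
    opportunities=[500, 2000, 800],
    impedance=[12.0, 3.0, 7.5],
    L=0.001,
)
```

Because the numerator telescopes over the ranked destinations, the denominator in Equation (2.4) guarantees that the estimated trips sum to Oi.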
A preference function represents the relationship between the cumulative proportion of jobs
taken and the cumulative proportion of jobs reached in a defined opportunities surface. It
relates to the fundamental concept in the derivation of the intervening opportunities model
as sketched out in Equation (2.5).
$$ T_{ij} = O_i \, [P(V_{j+1}) - P(V_j)] \tag{2.5} $$
P(Vj+1) – P(Vj) is the probability of a trip terminating in zone j, which is expressed as the
total probability that a trip will terminate by the time j+1 possible destinations are
considered minus the total probability that a trip will terminate by the time j possible
destinations are considered. Relating this to the theory of preference functions, then, P(Vj)
is a function of the ratio of the cumulative number of jobs reached (Xj) for zone j to the
total job opportunities available (XT) in a defined opportunities surface. Similarly, P(Vj+1)
is a function of the ratio of the cumulative number of jobs reached (Xj+1) for zone j+1 to
the total job opportunities available (XT) in a defined opportunities surface.
Hence, the estimation of Tij becomes:
$$ T_{ij} = O_i \, [f(X_{j+1}/X_T) - f(X_j/X_T)] \tag{2.6} $$
where
Tij = expected interchange from zone i to zone j
Oi = the volume of trip origins at zone i
f(Xj/XT) = a (preference) function of the ratio of the cumulative number of jobs reached
(Xj) for zone j to the total job opportunities available (XT) in a defined opportunities surface
f(Xj+1/XT) = (preference) function of the ratio of the cumulative number of jobs reached
(Xj+1) for zone j+1 to the total job opportunities available (XT) in a defined opportunities
surface.
Three forms of preference functions can be examined (see Cheung, 2006):
Logarithm function: $$ f(X/X_T) = a \ln(X/X_T) + b \tag{2.7} $$
Quadratic function: $$ f(X/X_T) = a \, (X/X_T)^2 + b \, (X/X_T) + c \tag{2.8} $$
Power function: $$ f(X/X_T) = a \, (X/X_T)^b \tag{2.9} $$
where
a, b and/or c are parameter values of the slope and/or coefficients of the preference
functions
A system containing n zones will have a vector of n preference functions. However, in
order to apply the preference functions to estimate every trip interchange in a system, a
standard set, or in terms of a stratified model, standard sets of parameter values for a, b
and/or c are required to be estimated for each functional form.
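A minimal sketch of how Equation (2.6) could be applied with the power function of Equation (2.9) is given below. It assumes that destinations are ranked by increasing impedance from the origin and that Xj counts the jobs ranked ahead of zone j; the function names, the parameter values a = 1.0 and b = 0.5 and the data are hypothetical and are not the calibrated values reported in this paper.

```python
import numpy as np

def power_preference(x, a, b):
    """Power preference function f(X/X_T) = a * (X/X_T)**b (Equation 2.9)."""
    return a * np.power(x, b)

def preference_trips_row(O_i, jobs, impedance, a, b):
    """Trips from one origin to every destination, following Equation (2.6)."""
    jobs = np.asarray(jobs, dtype=float)
    order = np.argsort(impedance, kind="stable")  # nearest destination first
    ranked = jobs[order]
    X_next = np.cumsum(ranked)                    # X_{j+1}: jobs reached including zone j
    X_prev = X_next - ranked                      # X_j: jobs reached before zone j
    X_T = X_next[-1]                              # total jobs in the opportunity surface
    share = power_preference(X_next / X_T, a, b) - power_preference(X_prev / X_T, a, b)
    trips = np.empty_like(share)
    trips[order] = O_i * share                    # back to the original zone order
    return trips

# Hypothetical data and parameter values.
pf_trips = preference_trips_row(
    O_i=1000.0,
    jobs=[500, 2000, 800],
    impedance=[12.0, 3.0, 7.5],
    a=1.0,
    b=0.5,
)
```

The differences in Equation (2.6) telescope, so the row total equals Oi multiplied by f(1) minus f(0). For the power form this is a times Oi, which means the origin total is reproduced exactly only when a equals 1.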
3. DATA SOURCE
This section describes the data sources used in this research investigation. The estimation of
a mathematical relationship, or a series of mathematical relationships that represent the
zone-to-zone trip interchanges in a spatial interaction model, requires surveyed data that
can derive zone-to-zone trip matrices and some forms of zone-to-zone travel impedance,
often in the form of distance, time, cost or a combination of these. Travel datasets and
impedance data were obtained for the Sydney metropolitan region. Sydney is the most
populous city in Australia and was the host of the 2000 Olympic Games. With Sydney’s
population reaching 4.3 million in 2005 (Australian Bureau of Statistics, 2006), a rapid
population growth in its outer centers (such as Blacktown, Baulkham Hills and Liverpool),
a diversified industry employment base as well as a comprehensive motorway network, it
represents a good study area for the examination of spatial interaction models by
geographical locations.
The Australian Bureau of Statistics conducts the Census of Population and Housing survey
every 5 years in Australia, where it records the place of enumeration (the origin) and the
place of employment (the destination) for every individual on the Census day. Using the
Census data, the Transport and Population Data Centre of the New South Wales
Government’s Department of Planning processes the origin-destination information to
produce the Journey-to-Work (JTW) dataset. Hence, the JTW dataset provides information
on the trip to work on Census day undertaken by all employed people aged 15 years and
over and represents the most comprehensive source of home-to-work trip information
released by the Government. Through collaborative activities that were undertaken over a
period of time between the transport research team within the School of Civil and
Environmental Engineering, University of New South Wales, and the Transport and
Population Data Centre, Department of Planning, the 1996 JTW data was obtained from the
Centre. The 1996 JTW dataset allowed the compilation of stratified origin-destination
home-to-work trip matrices. The other input data required was the zone-to-zone travel
impedance, mostly in the form of travel distance for the purpose of this research. The use of
travel distance instead of travel time or cost, especially at a strategic level, has the advantage
that travel distances among locations are traceable and can be accurately calculated from
maps or a travel network model. Through collaboration with Computing in Transportation,
an advisory consultancy to the Government, Sydney’s travel network model was
obtained. With the aid of TransCAD, a geographical information system-transportation
(GIS-T) software package provided by the School of Civil and Environmental Engineering,
zone-to-zone travel distances were extracted from the travel network model.
4. DEVELOPMENT OF MODELS
The 1996 JTW origin-destination trip matrix for the Sydney metropolitan area was used to
calibrate the preference function models with the three functional forms as defined earlier
(logarithm, quadratic and power). Calibration was achieved by iteratively
adjusting the calibration parameters (a, b, and/or c; see Equations 2.7 to 2.9) until they best
replicated the frequency distribution curve of the observed origin-destination survey data and
the mean trip length.
The iterative process of determining the optimum frequency distribution curve was
undertaken by using the coincidence ratio, which compares two distributions by measuring
the area they have in common as a proportion of the total area they cover. Mathematically,
the sum over all increments of the lower value of the two distributions is divided by the sum
over all increments of the higher value of the two distributions. In other words, the
coincidence ratio measures the share of area that “coincides” for the two curves. The
coincidence ratio lies between zero and one, where zero indicates two disjoint distributions
and one indicates identical distributions. A preference function model was considered
successfully calibrated when the difference between the observed and modeled mean trip
lengths was small (<3%) and the coincidence ratio was equal to or greater than 0.7.
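The coincidence ratio described above can be computed in a few lines. The sketch below assumes that both trip length frequency distributions are tabulated over the same distance increments and are first normalized to proportions; the function name and the example frequencies are hypothetical.

```python
import numpy as np

def coincidence_ratio(observed, modeled):
    """Sum of the lower values divided by the sum of the higher values, per increment."""
    obs = np.asarray(observed, dtype=float)
    mod = np.asarray(modeled, dtype=float)
    obs = obs / obs.sum()  # assumption: compare proportions rather than raw trip counts
    mod = mod / mod.sum()
    return np.minimum(obs, mod).sum() / np.maximum(obs, mod).sum()

# Hypothetical trip counts in four distance bands.
observed_tlfd = [10, 30, 40, 20]
modeled_tlfd = [20, 30, 30, 20]
cr = coincidence_ratio(observed_tlfd, modeled_tlfd)  # 0.9 / 1.1, about 0.82
```

In this example the ratio is about 0.82, which would satisfy the 0.7 threshold used as a calibration criterion.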
Aggregate preference function models using the various functional types: logarithmic,
quadratic and power functions have been successfully calibrated. These models were
developed in a spreadsheet (Microsoft Excel) environment where the calibration was
undertaken using an iterative process to estimate parameter values that satisfy the
calibration criteria. The values of the calibration parameters for the various models are
shown in Table 1, where the power function provides the best model fit when co-incidence
ratios are compared.
Table 1: Calibration Results for the Aggregate Preference Functions Model
In the development of stratified, or residential location-specific, preference function models,
the aggregate 1996 JTW origin-destination trip matrix for the Sydney metropolitan area
(46 statistical local areas, or SLAs, by 46 SLAs) was stratified into three portions:
1. An inner ring trip matrix: 14 SLAs by 46 SLAs;
2. A middle ring trip matrix: 15 SLAs by 46 SLAs; and
3. An outer ring trip matrix: 17 SLAs by 46 SLAs.
In fact, the raw preference curves for the inner, middle and outer Sydney SLAs do appear to
have some distinct differences as shown in Figure 1, Figure 2 and Figure 3 respectively.
Figure 1: Raw Preference Curves for the Inner Sydney SLAs
(Note: Keys for SLA codes – Ashfield (150), Botany Bay (1100), Drummoyne (2550),
Lane Cove (4700), Leichhardt (4800), Marrickville (5200), Mosman (5350), North Sydney
(5950), Randwick (6550), South Sydney (7070), Sydney Inner (7201), Sydney Remainder
(7202), Waverley (8050) and Woollahra (8500))
Figure 2: Raw Preference Curves for the Middle Sydney SLAs
(Note: Keys for SLA codes – Auburn (200), Bankstown (350), Burwood (1300),
Canterbury (1550), Concord (1900), Hunters Hill (4100), Hurstville (4150), Kogarah
(4450), Ku-ring-gai (4500), Manly (5150), Parramatta (6250), Rockdale (6650), Ryde
(6700), Strathfield (7100) and Willoughby (8250))
Figure 3: Raw Preference Curves for the Outer Sydney SLAs
(Note: Keys for SLA codes – Baulkham Hills (500), Blacktown (750), Blue Mountains
(900), Camden (1450), Campbelltown (1500), Fairfield (2850), Gosford (3100),
Hawkesbury (3800), Holroyd (3950), Hornsby (4000), Liverpool (4900), Penrith (6350),
Pittwater (6370), Sutherland (7150), Warringah (8000), Wollondilly (8400) and Wyong
(8550))
The estimated mean trip lengths and co-incidence ratios, as well as the values of the
calibration parameters for the various preference function models, calibrated for inner,
middle and outer areas are shown in Table 2, Table 3 and Table 4, respectively.
Table 2: Calibration Results for the Inner Area Preference Functions Model
Table 3: Calibration Results for the Middle Area Preference Functions Model
Table 4: Calibration Results for the Outer Area Preference Functions Model
The three location-stratified trip matrices were then re-aggregated to produce the full 46
SLAs by 46 SLAs origin-destination trip matrix. The mean trip lengths and co-incidence
ratios for this stratified model are shown in Table 5.
Table 5: Calibration Results for the Stratified Preference Functions Model
The calibration results shown above, in a similar manner to the aggregate preference
models, suggest that the power function provides the best model fit, regardless of the
degree of stratification. For the power function models, stratification into inner,
middle and outer areas improves the match between the observed and modeled trip length
frequency distribution curves, as shown by an increase in the co-incidence ratio from 0.89
to 0.93. Improvements are also evident with the use of logarithm
and quadratic functions. Other goodness-of-fit statistics and a detailed residual analysis are
shown in the next section.
5. MODEL EVALUATION
Several model performance evaluation techniques are presented in this section to further
assess the performance of the calibrated preference function models. Firstly, macro
performance statistics are used to compare observed data against modeled estimates. These
include:
• The co-incidence ratio;
• The coefficient of determination (R2);
• The percent root mean square error (%RMSE);
• The percentage of intra-zonal trips; and
• Plots of observed and modeled trip length frequency distribution curves.
Secondly, residual analysis was undertaken by the use of graphical plots and
GIS-transportation techniques. These include:
• Scatter plots of observed against modeled trip interchanges;
• Tables showing the residuals (both over- and under-estimations) grouped by suburban
areas;
• Graphical 3D plots of the areas of over- and under-estimations, grouped by suburban
areas; and
• Assignment of residuals onto the transport network using GIS-transportation techniques
(the TransCAD software package).
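The macro performance statistics listed above can be sketched in Python as follows. The paper does not state the exact formulas, so the definitions used here are assumptions: R2 is taken as the squared Pearson correlation between observed and modeled cells, and %RMSE as the root mean square error expressed as a percentage of the mean observed cell value. The function name and the two-zone matrices are hypothetical.

```python
import numpy as np

def macro_statistics(observed, modeled):
    """Goodness-of-fit statistics for two origin-destination trip matrices."""
    obs = np.asarray(observed, dtype=float)
    mod = np.asarray(modeled, dtype=float)
    r = np.corrcoef(obs.ravel(), mod.ravel())[0, 1]  # Pearson correlation over all cells
    rmse = np.sqrt(np.mean((mod - obs) ** 2))
    return {
        "r_squared": r ** 2,                              # assumed definition of R2
        "pct_rmse": 100.0 * rmse / obs.mean(),            # assumed definition of %RMSE
        "intra_obs_pct": 100.0 * np.trace(obs) / obs.sum(),  # diagonal = intra-zonal trips
        "intra_mod_pct": 100.0 * np.trace(mod) / mod.sum(),
    }

# Hypothetical two-zone matrices.
observed = np.array([[50.0, 10.0], [20.0, 40.0]])
modeled = np.array([[45.0, 15.0], [25.0, 35.0]])
stats = macro_statistics(observed, modeled)
```

Comparing `intra_obs_pct` with `intra_mod_pct` gives the over- or under-estimation of intra-zonal trips in percentage points.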
Table 6 compares the macro performance statistics for the various functional types of the
aggregate preference function models.
Table 6: Statistical Performances of Aggregate Preference Functions Models
At this point, it is appropriate to compare the performance of the aggregate preference
functions models with the aggregate gravity models (Table 7).
Table 7: Statistical Performances of Aggregate Gravity Models
Table 8 then compares the macro performance statistics for the various functional types of
the stratified preference function models.
Table 8: Statistical Performances of Stratified Preference Functions Models
Figure 4 and Figure 5 below illustrate the observed and modeled trip length frequency
distribution curves for the aggregate and stratified preference function models respectively.
Figure 4: Comparison of Trip Length Frequency Distribution Curves for Various
Functional Forms (Aggregate Preference Functions Models)
Figure 5: Comparison of Trip Length Frequency Distribution Curves for Various
Functional Forms (Stratified Preference Functions Models)
As shown in the above performance statistics, and in a comparison of the trip length
frequency distribution curves, stratification by using three preference functions to represent the
trip pattern characteristics for the inner, middle and outer locations in Sydney did increase
the goodness-of-fit measures of the model by a considerable margin. This is evident from:
• a better co-incidence ratio (from 0.89 to 0.93) when the power function is used;
• an increase in the coefficient of determination (R2) (from 0.89 to 0.91);
• a smaller value of the percent root-mean-square error (%RMSE) (from 28.4 to 26.4);
• a better estimate of intra-zonal trips (from an over-estimation of 4% to an over-estimation of
1%); and
• visually, a clearly better fit of the stratified models than of the aggregate models.
Furthermore, the preference function models achieved a better model fit compared with the
gravity models from an assessment of the macro performance statistics.
Figure 6 below is the plot of observed against modeled trip interchanges from the aggregate
model. The power function model is illustrated as it has the best performance. While a
slope value of 1 and an R2 value of 1 indicate a perfect model fit, a slope value of greater
than 1 indicates a tendency for over-estimation and a slope value of less than 1 indicates the
opposite, i.e. under-estimation.
Figure 6: Comparison of Trips Interchanges (Aggregate Power, Preference Functions
Model)
As shown in Figure 6, the power model performs with an R2 of 0.89 and a slope of 0.99.
Figure 7 below is the plot of observed against modeled trip interchanges from the stratified
models. Again, the power function model is shown.
Figure 7: Comparison of Trips Interchanges (Stratified Power, Preference Functions
Model)
The research found that for all functions, the stratified models perform better than their
corresponding aggregate models. As shown in Figure 6 and Figure 7, the stratified power
model has an R2 of 0.90 compared with an R2 of 0.85 in the aggregate power model.
The figures shown below are analyses of residuals (both over- and under-estimations) using
3D plots for the various functional forms and for both the aggregate and the stratified models.
Interpretations follow the illustrations.
Figure 8: 3D-Plot Showing Over Estimated Residual (Aggregate Power, Preference
Functions Model)
Figure 8 above shows that the aggregate power preference function model over-estimates
a high number of localized trips in the South East and Inner/Central West regions.
Figure 9: 3D-Plot Showing Under Estimated Residual (Aggregate Power, Preference
Functions Model)
Figure 9 above shows that the aggregate power preference function model under-estimates
a high number of localized trips in the Outer West and Central Coast regions and trips
terminating in the Inner/East region from the North East and South East regions.
Figure 10: 3D-Plot Showing Over Estimated Residual (Stratified Power, Preference
Functions Model)
Figure 10 above shows that the stratified power preference function model over-estimates
a high number of localized trips in the Inner/Central West and North West regions.
Compared to the aggregate model, the extent of the over-estimation has been reduced in
most areas.
Figure 11: 3D-Plot Showing Under Estimated Residual (Stratified Power, Preference
Functions Model)
Figure 11 above shows that the stratified power preference function model under-estimates
a high number of localized trips in the Central Coast region and trips terminating in the
Inner/East region from the Inner/East and South East regions. Compared to the aggregate
model, the extent of the under-estimation has been reduced in most areas.
The following network plots of residuals for the aggregate preference functions models and
for the location specific preference functions models indicate where over- and
under-estimations (spatial residual biases) might occur in different geographic areas, and their
magnitudes on the transport network of Sydney. For illustration purposes, an all-or-nothing
assignment was performed using TransCAD to assign the residuals (a cell by cell difference
of trip interchanges between observed data and modeled estimates) for each of the models.
Over- and under-estimations are illustrated on separate figures.
Interpretations of the results are presented after the illustrations.
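The residual matrices that feed such an assignment can be sketched as follows. This is an illustrative Python fragment rather than the TransCAD procedure used in the research. The sign convention (modeled minus observed, so that positive cells are over-estimations) is an assumption, and the function name and matrices are hypothetical.

```python
import numpy as np

def split_residuals(observed, modeled):
    """Cell-by-cell residuals split into over- and under-estimation matrices."""
    obs = np.asarray(observed, dtype=float)
    mod = np.asarray(modeled, dtype=float)
    residual = mod - obs                 # assumed convention: positive = over-estimation
    over = np.clip(residual, 0.0, None)  # trips the model adds beyond the observed data
    under = np.clip(-residual, 0.0, None)  # observed trips the model fails to reproduce
    return over, under

# Hypothetical two-zone matrices.
observed = np.array([[50.0, 10.0], [20.0, 40.0]])
modeled = np.array([[45.0, 15.0], [25.0, 35.0]])
over, under = split_residuals(observed, modeled)
```

Each of the two non-negative matrices can then be treated as a separate demand matrix and assigned to the network, which is why over- and under-estimations appear on separate plots.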
Figure 12: Network Plot Showing Over Estimation of Spatial Biases (Aggregate Power,
Preference Functions Model)
Figure 13: Network Plot Showing Under Estimation of Spatial Biases (Aggregate Power,
Preference Functions Model)
Figure 14: Network Plot Showing Over Estimation of Spatial Biases (Stratified Power,
Preference Functions Model)
Figure 15: Network Plot Showing Under Estimation of Spatial Biases (Stratified Power,
Preference Functions Model)
Drawing on the findings shown in the above figures, over-estimation was found for
trips heading towards the inner and central west suburbs and under-estimation for trips
heading towards the inner/eastern suburbs. Although the stratified preference function models were
found to produce spatial biases in similar geographical areas and transport corridors to
the aggregate models, they are shown to have merit for practical application because of an
improvement in accuracy in the range of 7 to 23% (based on macro performance statistics)
for the estimation of trip interchanges in Sydney. This improvement in accuracy in relative
terms was more pronounced when compared with the gravity model.
6. FURTHER EXTENSIONS TO MODEL
The analyses and results presented have shown that there are merits in developing stratified
models based on geographic locations (i.e. inner, middle and outer areas in the case of
Sydney) in a metropolitan area. This finding may apply more generally to other large
metropolitan regions, which needs to be investigated.
It is worthwhile to note that the grouping of locations or zones does not necessarily follow
the geographical jurisdiction of the study area. Further appropriate stratification may lead to
a further reduction in spatial residual bias. For example, from the preference function plots,
differences in trip pattern existed within the inner, middle, and outer areas, as shown in
Figure 16, Figure 17 and Figure 18. This shows that there is a potential for further
stratification.
Figure 16: Differential Preference Patterns among Inner SLAs
Figure 17: Differential Preference Patterns among Middle SLAs
Figure 18: Differential Preference Patterns among Outer SLAs
As a result, a stratification to produce six separate preference functions to represent the
distinct trip characteristics for each group of geographic location may be worthwhile for
further investigation in producing improved model accuracy.
Furthermore, the preference function model used is a residential location based model, i.e.
destination zones are ranked from the origin zone. In other words, the preference function
model presented is production constrained. A potential improvement to the model
performance is to investigate an attraction constrained model with the use of employment
location-specific preference functions. The process involved in the
development of such a model is the reverse of the one presented in this paper.
Apart from the aforementioned extensions, another potential way to improve
model accuracy is to explore a more detailed zone system, for example, using travel
analysis zones instead of statistical local areas. Currently, the area covered by the 46 SLAs
of the Sydney metropolitan area contains approximately 700 to 800 travel analysis zones (TAZs). The
methods described in this paper are equally applicable at the more detailed spatial
resolution. It would be especially important to investigate the variance in the
preference functions among zones within an SLA, as well as the variance between SLAs. The use of a more
detailed zoning system is expected to improve the accuracy of the model. On the other hand,
it also places far greater demands on computer hardware and software,
which should become manageable in the near future with the continuous advancement in
computing technologies.
7. CONCLUSIONS
The spatial interaction models presented in this paper were evaluated using a number of
statistical and graphical techniques. The results showed that the aggregate preference
function model produced less systematic bias than the gravity model. The
stratified model, based on residential location-specific functions, produced even less bias than
the aggregate preference function model, regardless of the functional form used. A
preference function model with a power function form generally produced greater accuracy
and less systematic spatial bias.
Preference function models, in both aggregate and geographically stratified forms, with
probability decay functions of logarithmic, quadratic and power form were successfully
calibrated on a spreadsheet-based modeling platform. Different functional parameters were
determined for the inner, middle and outer suburbs in the geographically stratified
preference function models. To compare the performance and accuracy of the various models,
goodness-of-fit statistics including the coefficient of determination (R2), percentage
root mean square error (%RMSE), co-incidence ratio and the percentage of intra-zonal trips
were used. Among the aggregate models, the preference function model with a power function
produced the best overall match between modeled estimates and observed data, with R2 and
co-incidence ratio values both equal to 0.89 and a %RMSE value of 28.4 (although it
over-estimated the share of intra-zonal trips by 4 percentage points, 36% against an
observed 32%). On these statistics, the aggregate preference function models were
generally more accurate than the aggregate gravity models.
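The four goodness-of-fit statistics named above can be sketched as follows. The formulas are commonly used definitions and are an assumption here, since this section does not restate the exact expressions used in the study: R2 is taken as 1 - SS_res / SS_tot, %RMSE as the root mean square error divided by the mean observed interchange, and the co-incidence ratio as the overlap between the observed and modeled trip length frequency distributions. The trip matrices and frequencies are illustrative only.

```python
import math


def r_squared(observed, modelled):
    """Coefficient of determination, taken here as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - m) ** 2 for o, m in zip(observed, modelled))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def pct_rmse(observed, modelled):
    """Root mean square error as a percentage of the mean observed value."""
    n = len(observed)
    rmse = math.sqrt(sum((o - m) ** 2 for o, m in zip(observed, modelled)) / n)
    return 100.0 * rmse / (sum(observed) / n)


def coincidence_ratio(obs_freq, mod_freq):
    """Overlap of two trip length frequency distributions (1.0 = identical)."""
    obs_share = [f / sum(obs_freq) for f in obs_freq]
    mod_share = [f / sum(mod_freq) for f in mod_freq]
    overlap = sum(min(o, m) for o, m in zip(obs_share, mod_share))
    union = sum(max(o, m) for o, m in zip(obs_share, mod_share))
    return overlap / union


def intrazonal_pct(matrix):
    """Percentage of all trips that start and end in the same zone."""
    total = sum(sum(row) for row in matrix)
    return 100.0 * sum(matrix[i][i] for i in range(len(matrix))) / total


# Illustrative 3-zone trip matrices (rows = origins, columns = destinations).
observed = [[50, 30, 20], [25, 60, 15], [10, 20, 70]]
modelled = [[55, 25, 20], [20, 65, 15], [15, 15, 70]]
obs_flat = [t for row in observed for t in row]
mod_flat = [t for row in modelled for t in row]

# Illustrative trip frequencies per distance interval.
obs_tlfd = [40, 30, 20, 10]
mod_tlfd = [35, 35, 20, 10]

print(f"R2                 = {r_squared(obs_flat, mod_flat):.3f}")
print(f"%RMSE              = {pct_rmse(obs_flat, mod_flat):.1f}")
print(f"co-incidence ratio = {coincidence_ratio(obs_tlfd, mod_tlfd):.3f}")
print(f"intra-zonal trips  = {intrazonal_pct(modelled):.1f}% modeled, "
      f"{intrazonal_pct(observed):.1f}% observed")
```

R2 and %RMSE are computed over individual interchanges, whereas the co-incidence ratio compares distributions, so a model can score well on one and poorly on the other; this is one reason for reporting them together.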
With the use of location-specific preference functions, geographical stratification
improved model performance. The location-specific power function model performed better
than the aggregate power function model, with an R2 of 0.91, a co-incidence ratio of 0.93,
a %RMSE of 26.4, and an intra-zonal trip percentage of 33% (1 percentage point more than
in the observed data). The comparison of the trip length frequency distribution curves
from observed data and modeled estimates showed a closer match to the observed trip
frequencies for the stratified models, and the power function models seemed to provide the
best fit. A similar conclusion was reached when the plots of observed and modeled trip
interchanges were examined.
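The two graphical comparisons described above can be reproduced numerically. The sketch below sums trips into 5 km distance intervals to form a trip length frequency distribution, and fits a least-squares line through the origin to the observed and modeled interchanges (the interchange plots carry a fitted line of the form y = kx). The function names, distances and trip numbers are illustrative assumptions.

```python
def trip_length_frequency(distances, trips, width=5.0, n_bins=12):
    """Sum trips into distance intervals of `width` km.

    distances[i][j] is the distance from zone i to zone j and trips[i][j]
    the number of trips on that interchange. Trips longer than the last
    interval are added to the last interval.
    """
    freq = [0.0] * n_bins
    for dist_row, trip_row in zip(distances, trips):
        for d, t in zip(dist_row, trip_row):
            k = min(int(d // width), n_bins - 1)
            freq[k] += t
    return freq


def slope_through_origin(observed, modelled):
    """Least-squares slope k of the line modelled = k * observed."""
    sxy = sum(o * m for o, m in zip(observed, modelled))
    sxx = sum(o * o for o in observed)
    return sxy / sxx


# Illustrative 3-zone inputs (rows = origins, columns = destinations).
distances = [[2, 8, 14], [8, 3, 7], [14, 7, 4]]
observed = [[50, 30, 20], [25, 60, 15], [10, 20, 70]]
modelled = [[55, 25, 20], [20, 65, 15], [15, 15, 70]]

obs_tlfd = trip_length_frequency(distances, observed, width=5.0, n_bins=3)
mod_tlfd = trip_length_frequency(distances, modelled, width=5.0, n_bins=3)

obs_flat = [t for row in observed for t in row]
mod_flat = [t for row in modelled for t in row]
slope = slope_through_origin(obs_flat, mod_flat)

print("observed trips per 5 km interval:", obs_tlfd)
print("modeled trips per 5 km interval: ", mod_tlfd)
print(f"fitted line: y = {slope:.3f}x")
```

A slope close to 1 indicates no overall scaling bias, but it says nothing about where individual interchanges are over- or under-estimated, which is why the residuals are also examined spatially.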
Graphical and GIS-transportation analytical techniques were applied to further explain the
global statistics (R2, %RMSE, co-incidence ratio and intra-zonal trip proportion) and to
pinpoint geographically where spatial residual errors occur and where these residuals are
likely to be distributed onto the transport network. The improvement in accuracy relative
to the gravity model was more pronounced when stratified preference function models were
used.
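A minimal sketch of how the residual errors discussed above could be separated into over-estimated and under-estimated interchanges before they are mapped. The sign convention (residual = modeled minus observed), the zone names and the trip numbers are illustrative assumptions, not values from this study.

```python
def split_residuals(observed, modelled, zones):
    """Return (over, under) lists of (origin, destination, size) tuples.

    over  holds interchanges where the model estimates more trips than
          were observed; under holds those where it estimates fewer.
    Both lists are sorted with the largest residual first.
    """
    over, under = [], []
    for i, origin in enumerate(zones):
        for j, destination in enumerate(zones):
            residual = modelled[i][j] - observed[i][j]
            if residual > 0:
                over.append((origin, destination, residual))
            elif residual < 0:
                under.append((origin, destination, -residual))
    over.sort(key=lambda item: item[2], reverse=True)
    under.sort(key=lambda item: item[2], reverse=True)
    return over, under


# Illustrative 3-zone inputs (rows = origins, columns = destinations).
zones = ["Inner", "Middle", "Outer"]
observed = [[50, 30, 20], [25, 60, 15], [10, 20, 70]]
modelled = [[58, 24, 18], [20, 65, 15], [13, 17, 70]]

over, under = split_residuals(observed, modelled, zones)
for origin, destination, size in over:
    print(f"over-estimated  {origin} -> {destination}: {size} trips")
for origin, destination, size in under:
    print(f"under-estimated {origin} -> {destination}: {size} trips")
```

Keeping the two lists separate mirrors the paired over-estimation and under-estimation plots: summing signed residuals would let errors of opposite sign cancel and hide the spatial pattern.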
The analyses and results presented in this paper indicate that preference function models
were more accurate than gravity models in estimating journey-to-work trips in the Sydney
metropolitan area, and that geographically stratified models (location-specific preference
function models), which have separate functions for inner, middle and outer areas,
achieved a better fit between observed and modeled data. The implications of these
differences for land use and spatial planning have been articulated, particularly for
cases where future land use and transport investment decisions are largely based on the
findings of spatial interaction models, because inaccurate model results could propagate
into the subsequent stages of the transport modeling process (for example, the mode split
and trip assignment stages). It is also worth noting that there are some interesting
policy contrasts: the theory of the gravity model is more geared towards transportation
policy (i.e. the policy handle is the cost, time or distance of travel in a system),
whereas the preference functions are more closely related to land use policy (the policy
handle being the distribution of opportunities in space).
Transport modeling practitioners are encouraged to use the research findings presented in
this paper as an indicative guide for future performance appraisal of spatial interaction
models. Although many commercial software packages support the gravity model framework, it
is recommended that the mathematical procedure for developing calibrated location-specific
preference function models also be incorporated into computer software platforms.
REFERENCES
Australian Bureau of Statistics (2006) Regional Population Growth, Australia, 2004-2005,
ABS Catalogue No. 3218.0, Australian Bureau of Statistics, Australian Government,
Canberra.
Baltimore Metropolitan Council (2004) Baltimore Region Travel Demand Model for Base
Year 2000, Task Report 04-01, Baltimore, Maryland.
Batty, M. (1976) Urban Modelling: Algorithms, Calibrations, Predictions, Cambridge
University Press, Cambridge.
Black, J. A. (1981) Urban Transport Planning: Theory and Practice. Croom Helm, London.
27
Black, J.A. and R.J. Salter, (1975) ‘A Review of the Modelling Achievements of British
Urban Land-Use Transportation Studies Outside the Conurbations’, Journal of the
Institution of Municipal Engineers, Vol. 102, pp. 100-105.
Bruton, M. J. (1970) Introduction to Transportation Planning, Hutchinson, London.
Cheung, C. (2006) Development and Evaluation of Stratified Spatial Interaction Models
and Models Based on Location Specific Preference Functions for Sustainable Transport
and Traffic Assessment, Ph.D. Thesis, School of Civil and Environmental Engineering,
University of New South Wales, Sydney.
de la Barra, T., Perez, B. and Vera, N. (1984) ‘TRANUS-J: Putting Large Models into
Small Computers’, Environment and Planning B, Vol. 11, pp. 87-101.
Easa, R. (1984) ‘Working Paper Number 84-7: Development of a Doubly Constrained
Intervening Opportunities Models for Trip Distribution’, Chicago Area Transportation
Study, Chicago.
Edens, H. J. (1970) ‘Analysis of a Modified Gravity Model’, Transportation Research, Vol.
4, pp. 51-62.
Erlander, S. and Stewart, N. F. (1990) The Gravity Model in Transportation Analysis:
Theory and Extensions, VSP BV, Utrecht.
28
Evans, A. W. (1971) ‘The Calibration of Trip Distribution Models with Exponential or
Similar Cost Functions’, Transportation Research, Vol. 5, pp. 15-38.
Gray, R. H. (1980) Gravity Models: A Conceptually and Computationally Simple Approach,
Thesis submitted for the Graduate College of the University of Illinois at Chicago Circle,
Chicago, Illinois.
Hunt, J. D. (1994) ‘Calibrating the Naples Land Use and Transport Model’, Environment
and Planning B: Planning and Design, Vol. 21, pp. 569-90
Hutchinson, B. G. (1974) Principles of Urban Transport Systems Planning, McGraw-Hill,
Washington DC.
Kanafani, A. K. (1983) Transportation Demand Analysis, McGraw-Hill, New York.
Mao, S. and Demetsky, M. J. (2002) Calibration of the Gravity Model for Truck Freight
Flow
Distribution, Centre
for
Transportation
Studies,
University
of
Virginia,
Charlottesville, United States.
Masuya, Y. and Black, J. (1992) ‘Transport Infrastructure Development and Journey to
Work Preference Functions in Sapporo’, Infrastructure Planning Review: Japan Society of
Civil Engineers, Vol. 10, pp. 127-134.
29
Masuya, Y., Shitamura, M., Saito, K., and Black, J. (2002) ‘Urban Spatial Re-structuring
and Journey-to-work Trip Lengths: A Case Study of Sapporo from 1972 to 1994’, in Traffic
and Transportation Studies 2002, American Society of Civil Engineers, United States of
America.
NSW Department of Transport (1974) Sydney Area Transportation Study 1971, Sydney.
Ortuzar, J. and Willumsen, L. (2001) Modelling Transport, 3rd Edition, John Wiley & Sons,
West Sussex, England.
Páez, A., Suthanaya, P. and Black, J. (2001) ‘A Spatial Analysis of Transportation ModeSpecific Journey-to-Work Commuting Preferences: Implications for Sustainable Transport
Policies’, Proceedings of the 9th World Conference on Transport Research, Seoul, 22-27
July, CD-rom.
Ruiter, E. R. (1967) ‘Toward a Better Understanding of the Intervening Opportunities
Model’, Transportation Research, Vol. 1, pp. 47-56.
Ruiter, E. R. (1969) ‘Improvements in Understanding, Calibrating and Applying the
Opportunity Model’, Highway Research Record, Vol. 165, pp.1-21.
30
Sen, A. and Smith, T. E. (1995) Gravity Models of Spatial Interaction Behaviour, SpringerVerlag Berlin, Heidelberg.
Schneider, M. (1959) ‘Gravity Models and Trip Distribution Theory’, Journal of Regional
Science, Vol. 5, pp. 51-56.
Sheppard, E. (1995) ‘Modelling and Predicting Aggregate Flows’, in The Geography of
Urban Transportation, Hanson, S. (ed), The Guilford Press, New York.
Smith, D. P. and Hutchinson, B. G. (1981) “Goodness of Fit Statistics for Trip Distribution
Models”, Transportation Research A, Vol. 15, pp. 295-303.
Stopher, P. R. and Meyburg, A. H. (1975) Urban Transportation Modelling and Planning,
Lexington Books, D.C. Heath and Company, Toronto/London.
Stouffer, S. (1940) ‘Intervening Opportunities: A Theory Relating Mobility and Distance’,
American Sociological Review, Vol. 5, pp. 845-867.
Wilson, A. G. (1970) ‘Advances and Problems in Distribution Modelling’, Transportation
Research, Vol. 4, pp. 1-18.
Zhao, F., Chow, L.F, Li, M. T. and Shen, D. L. (2001) ‘Refinement of FSUTMS Trip
Distribution Methodology’, Technical Memorandum No. 3: Calibration of an Intervening
31
Opportunity Model For Palm Beach County, prepared for Florida Department of
Transportation, September 2001.
32
Table 1: Calibration Results for the Aggregate Preference Functions Model
Functional Forms   Mean Trip Length (km)   Co-incidence Ratio   Value of Parameters
Logarithmic        18.1                    0.81                 a = -0.157
Quadratic          18.1                    0.86                 a = -1.077, b = 2.254, c = 0.269
Power              18.1                    0.89                 a = 1.116, b = 0.296
Note: The mean trip length observed in the Census JTW data for Sydney is 18.1 km.
Table 2: Calibration Results for the Inner Area Preference Functions Model
Functional Forms   Mean Trip Length (km)   Co-incidence Ratio   Value of Parameters
Logarithmic        8.5                     0.62                 a = -0.146
Quadratic          8.5                     0.84                 a = -1.407, b = 2.184, c = 0.204
Power              8.5                     0.87                 a = 1.230, b = 0.422
Note: The mean trip length observed in the Census JTW data for inner Sydney is 8.5 km.
Table 3: Calibration Results for the Middle Area Preference Functions Model
Functional Forms   Mean Trip Length (km)   Co-incidence Ratio   Value of Parameters
Logarithmic        12.4                    0.78                 a = -0.195
Quadratic          12.4                    0.85                 a = -1.044, b = 1.774, c = 0.231
Power              12.4                    0.90                 a = 1.115, b = 0.404
Note: The mean trip length observed in the Census JTW data for middle distance Sydney is
12.4 km.
Table 4: Calibration Results for the Outer Area Preference Functions Model
Functional Forms   Mean Trip Length (km)   Co-incidence Ratio   Value of Parameters
Logarithmic        24.7                    0.80                 a = -0.148
Quadratic          24.7                    0.84                 a = -1.453, b = 1.736, c = 0.322
Power              24.7                    0.86                 a = 1.119, b = 0.263
Note: The mean trip length observed in the Census JTW data for outer Sydney is 24.7 km.
Table 5: Calibration Results for the Stratified Preference Functions Model
Functional Forms   Mean Trip Length (km)   Co-incidence Ratio
Logarithmic        18.1                    0.84
Quadratic          18.1                    0.89
Power              18.1                    0.93
Table 6: Statistical Performances of Aggregate Preference Functions Models
Functional Forms   Co-incidence Ratio   R2     %RMSE   Intra-zonal Trips (%)
Logarithmic        0.81                 0.87   35.8    39
Quadratic          0.86                 0.85   32.4    32
Power              0.89                 0.89   28.4    36
Note: The percentage of intra-zonal trips in the observed Census JTW data is 32%.
Table 7: Statistical Performances of Aggregate Gravity Models
Functional Forms   Co-incidence Ratio   R2     %RMSE   Intra-zonal Trips (%)
Exponential        0.81                 0.70   44.7    18
Power              0.82                 0.79   38.4    28
Note: The percentage of intra-zonal trips in the observed Census JTW data is 32%. The
gravity model with a gamma deterrence function was not successfully calibrated within the
maximum number of iterations; its performance was therefore considered unsatisfactory and
its performance measures are not reported.
Table 8: Statistical Performances of Stratified Preference Functions Models
Functional Forms   Co-incidence Ratio   R2     %RMSE   Intra-zonal Trips (%)
Logarithmic        0.84                 0.87   35.9    38
Quadratic          0.89                 0.90   28.4    35
Power              0.93                 0.91   26.4    33
Note: The percentage of intra-zonal trips in the observed Census JTW data is 32%.
Figure 1: Raw Preference Curves for the Inner Sydney SLAs
Figure 2: Raw Preference Curves for the Middle Sydney SLAs
Figure 3: Raw Preference Curves for the Outer Sydney SLAs
Figure 4: Comparison of Trip Length Frequency Distribution Curves for Various
Functional Forms (Aggregate Preference Functions Models)
Figure 5: Comparison of Trip Length Frequency Distribution Curves for Various
Functional Forms (Stratified Preference Functions Models)
Figure 6: Comparison of Trip Interchanges (Aggregate Power, Preference Functions
Model)
Figure 7: Comparison of Trip Interchanges (Stratified Power, Preference Functions Model)
Figure 8: 3D-Plot Showing Over Estimated Residual (Aggregate Power, Preference
Functions Model)
Figure 9: 3D-Plot Showing Under Estimated Residual (Aggregate Power, Preference
Functions Model)
Figure 10: 3D-Plot Showing Over Estimated Residual (Stratified Power, Preference
Functions Model)
Figure 11: 3D-Plot Showing Under Estimated Residual (Stratified Power, Preference
Functions Model)
Figure 12: Network Plot Showing Over Estimation of Spatial Biases (Aggregate Power,
Preference Functions Model)
Figure 13: Network Plot Showing Under Estimation of Spatial Biases (Aggregate Power,
Preference Functions Model)
Figure 14: Network Plot Showing Over Estimation of Spatial Biases (Stratified Power,
Preference Functions Model)
Figure 15: Network Plot Showing Under Estimation of Spatial Biases (Stratified Power,
Preference Functions Model)
Figure 16: Differential Preference Patterns among Inner SLAs
Figure 17: Differential Preference Patterns among Middle SLAs
Figure 18: Differential Preference Patterns among Outer SLAs
[Figure 1 plot: Y (Cumulative Proportion of Jobs Taken) against X (Cumulative Proportion of Jobs Reached), both axes from 0.0 to 1.0; legend entries: 150, 1100, 2550, 4700, 4800, 5200, 5350, 5950, 6550, 7070, 7201, 7202, 8050, 8500.]
[Figure 2 plot: Y (Cumulative Proportion of Jobs Taken) against X (Cumulative Proportion of Jobs Reached), both axes from 0.0 to 1.0; legend entries: 200, 350, 1300, 1550, 1900, 4100, 4150, 4450, 4500, 5150, 6250, 6650, 6700, 7100, 8250.]
[Figure 3 plot: Y (Cumulative Proportion of Jobs Taken) against X (Cumulative Proportion of Jobs Reached), both axes from 0.0 to 1.0; legend entries: 500, 750, 900, 1450, 1500, 2850, 3100, 3800, 3950, 4000, 4900, 6350, 6370, 7150, 8000, 8400, 8550.]
[Figure 4 plot: Frequency of Trips in 5km Intervals (0 to 500,000) against Distance (km) (5 to 60); series: Observed, Linear-Log, Quadratic, Power.]
[Figure 5 plot: Frequency of Trips in 5km Intervals (0 to 500,000) against Distance (km) (5 to 60); series: Observed, Linear-Log, Quadratic, Power.]
[Figure 6 plot: Modelled No. of Trips in Interchanges against Observed No. of Trips in Interchanges, both axes from 0 to 50,000; fitted line y = 0.993x, R2 = 0.892.]
[Figure 7 plot: Modelled No. of Trips in Interchanges against Observed No. of Trips in Interchanges, both axes from 0 to 50,000; fitted line y = 1.007x, R2 = 0.908.]
[Figure 8 plot: 3D chart of Number of JTW Trips (0 to 50,000) by Origin and Destination; regions on both axes: Inner/East, North East, South East, Inner/Central West, North West, South West, Outer West, Central Coast.]
[Figure 9 plot: 3D chart of Number of JTW Trips (0 to 50,000) by Origin and Destination; regions on both axes: Inner/East, North East, South East, Inner/Central West, North West, South West, Outer West, Central Coast.]
[Figure 10 plot: 3D chart of Number of JTW Trips (0 to 50,000) by Origin and Destination; regions on both axes: Inner/East, North East, South East, Inner/Central West, North West, South West, Outer West, Central Coast.]
[Figure 11 plot: 3D chart of Number of JTW Trips (0 to 50,000) by Origin and Destination; regions on both axes: Inner/East, North East, South East, Inner/Central West, North West, South West, Outer West, Central Coast.]
[Figure 12 map: Residual Errors (No. of Trips), legend classes 10,000, 5,000 and 1,000; two map panels with scale bars of 20 km and 10 km; labeled places: Wyong, Gosford, Hawkesbury, Hornsby, Pittwater, Blue Mountains, Penrith, Blacktown, Chatswood, Parramatta, CBD, Liverpool, Camden, Campbelltown, Sutherland, Pacific Ocean.]
[Figure 13 map: Residual Errors (No. of Trips), legend classes 10,000, 5,000 and 1,000; two map panels with scale bars of 20 km and 10 km; labeled places: Wyong, Gosford, Hawkesbury, Hornsby, Pittwater, Blue Mountains, Penrith, Blacktown, Chatswood, Parramatta, CBD, Liverpool, Camden, Campbelltown, Sutherland, Pacific Ocean.]
[Figure 14 map: Residual Errors (No. of Trips), legend classes 10,000, 5,000 and 1,000; two map panels with scale bars of 20 km and 10 km; labeled places: Wyong, Gosford, Hawkesbury, Hornsby, Pittwater, Blue Mountains, Penrith, Blacktown, Chatswood, Parramatta, CBD, Liverpool, Camden, Campbelltown, Sutherland, Pacific Ocean.]
[Figure 15 map: Residual Errors (No. of Trips), legend classes 10,000, 5,000 and 1,000; two map panels with scale bars of 20 km and 10 km; labeled places: Wyong, Gosford, Hawkesbury, Hornsby, Pittwater, Blue Mountains, Penrith, Blacktown, Chatswood, Parramatta, CBD, Liverpool, Camden, Campbelltown, Sutherland, Pacific Ocean.]
[Figure 16 plot: Y (Cumulative Proportion of Jobs Taken) against X (Cumulative Proportion of Jobs Reached), both axes from 0.0 to 1.0; curve groups annotated "Inner SLAs with distance minimisation" and "Inner SLAs with distance maximisation"; legend entries: 150, 1100, 2550, 4700, 4800, 5200, 5350, 5950, 6550, 7070, 7201, 7202, 8050, 8500.]
[Figure 17 plot: Y (Cumulative Proportion of Jobs Taken) against X (Cumulative Proportion of Jobs Reached), both axes from 0.0 to 1.0; curve groups annotated "Middle SLAs with distance minimisation" and "Middle SLAs with distance maximisation"; legend entries: 200, 350, 1300, 1550, 1900, 4100, 4150, 4450, 4500, 5150, 6250, 6650, 6700, 7100, 8250.]
[Figure 18 plot: Y (Cumulative Proportion of Jobs Taken) against X (Cumulative Proportion of Jobs Reached), both axes from 0.0 to 1.0; curve groups annotated "Outer SLAs with distance maximisation" and "Outer SLAs with distance minimisation"; legend entries: 500, 750, 900, 1450, 1500, 2850, 3100, 3800, 3950, 4000, 4900, 6350, 6370, 7150, 8000, 8400, 8550.]