SPATIAL INTERACTION MODELS – GEOGRAPHICAL IDENTIFICATION AND QUANTIFICATION OF MODEL BIASES GENERATED FROM GRAVITY AND LOCATION SPECIFIC PREFERENCE FUNCTION MODELS Dr Charles CHEUNG Steer Davies Gleave, 28-32 Upper Ground, London SE1 9PD, United Kingdom E-mail: charles.cheung@sdgworld.net Dr John BLACK Professor, Planning Research Centre, Faculty of Architecture, Wilkinson Building, University of Sydney, Australia E-mail: john.black@arch.usyd.edu.au 1 ABSTRACT Current spatial interaction modeling techniques that are available and supported in software packages mainly incorporate the gravity model, but previous research has identified systematic biases are often found when the gravity model is applied to large metropolitan regions. These biases may result in inappropriate decisions about infrastructure location and capacity. This paper presents the result from a detailed investigation of the residual errors identified in spatial interaction models. Applying various statistical and Geographical Information System-Transportation (GIS-T) techniques, the residual errors are geographically identified and quantified. Travel datasets from the Sydney metropolitan area are used. Improvement to model performance was made with the introduction of a probabilistic model based on the Stouffer’s intervening opportunities model. The model incorporates location specific preference functions to represent a relationship between opportunities available and opportunities taken. The research presented in this paper highlights the areas of deficiency in current spatial interaction modeling and points towards the continual developments in preference functions and model evaluation techniques to eliminate systematic model biases. Keywords: spatial interaction, trip distribution, intervening opportunities, preference functions, model biases 2 1. INTRODUCTION One important transport challenge that a metropolitan city faces today is to develop and maintain a balanced approach in achieving social, economic and environmental objectives. The formulation of land use and transport plans and policies as well as the decision making process in infrastructure investment often rely upon a robust and comprehensive analysis of spatial interactions between residential locations and major activity centers. Models are simplification of the real world designed for a specific purpose. One of the principles of model building for forecasting purposes is to keep the formulation as simple as possible and with a minimal number of calibration parameters and input variables that requires to meet the model’s objectives and accuracy. The conventional approach to spatial interaction modeling for the journey to work uses aggregate zonal data, often without stratification by employment category. The lack of the degree of stratification for the estimation of trip origin-destination (O-D) matrices can be seen from the current operation of the current Sydney Strategic Travel Model (STM) where the approach is derived from the Sydney Area Transportation Study (SATS) (NSW Department of Transport, 1974). Contemporary advanced developments in computing technologies equip model builders with an array of tools to easily and efficiently evaluate different spatial interaction models and their functional variations that were not available when most aggregate models that form the basis of today’s practice were first formulated. These advancements also allow investigations into other form of stratified spatial interaction models that can be practically processed and calibrated in a reasonable timeframe. 3 Somewhat different to the gravity model, the intervening opportunities model starts from the first principle that the spatial interaction of trip making behavior is related to the accessibility of opportunities. It introduces the probability theory as the theoretical foundation of trip distribution. Stouffer (1940) applied the model to the context of migration and the location of services and residences before Schneider (1959) developed its mathematical procedure for the Chicago Area Transportation Study, to simulate an origindestination pattern of trips. The intervening opportunities model has attracted far less interest from researchers and mainly due to its additional complexity and apparent computational difficulty. In fact, compared to the gravity model, Stopher and Meyburg (1975) argued that the intervening opportunities model has a stronger conceptual base and attempts to address the problem of individual behavior. The Chicago Area Transportation Study in the 1950s compared the performance of both the gravity model and the intervening opportunities model and concluded that there was little difference in model accuracies. A major contribution to the understanding, calibration and application of the intervening opportunities model was presented by Ruiter (1967 and 1969). The L parameter of the intervening opportunities model, as presented by Ruiter (1967), represents the constant probability of a possible destination being accepted if it is considered. It acts as the calibration parameter in the present form of the intervening opportunities model, where the probability that a trip will terminate at a destination implies a negative exponential relationship with the total number of possible destinations. A recent enhancement to the understanding of the probability of trip making behavior was through the development of preference functions in the 1990s. Masuya and Black (1992) presented the theoretical 4 concept of the preference function and discussed its estimation process by using journey-towork data from Sapporo, Japan. Conceptually, a preference function is the inverse of the intervening opportunities concept and represents a zonal aggregate of the travel behavioral response given a particular opportunity surface surrounding those travelers (Paez et al., 2001), which can be related to the probability that a destination being accepted in the intervening opportunities model. A preference function can be generated and estimated by various functional forms which have a decaying relationship. These include the logarithm or natural logarithm and quadratic functions (Masuya et al., 2002). The power function represents another alternative that had a decaying relationship. Cheung (2006) presented the mathematical and computerized procedure in the development of a calibrated preference function model using various types of preference function. The common spatial interaction model statistical evaluation measures include a comparison of the average trip length; comparison of trip length or travel time frequency distributions; chi-square; root-mean-square error (RMSE) or percent root-mean-square-error (%RMSE); number of interchanges most accurately estimated; and comparison of the differences in intra-zonal trips (for example, see Edens, 1970; Evans, 1971; Batty, 1976; Gray, 1980; Smith and Hutchinson, 1981; Easa, 1984; de la Barra, 1989; Hunt, 1994; Sen and Smith, 1995; Zhao et al., 2001; Mao and Demetsky, 2002; and Baltimore Metropolitan Council, 2004). The best model is usually selected based on the goodness-of-fit statistics with limited interpretation from 3-D graphs and GIS maps. 5 This paper presents the evaluation of the gravity and preference function models with the use of travel datasets from a large metropolitan area, Sydney. Both statistical and GIS-T techniques were applied to quantify and geographically identify model biases, which were rarely undertaken in the past. In section 2, the underpinning theory of the gravity model and intervening opportunities model are presented. Section 3 describes the data source used to compare and evaluate both models. Section 4 gives the results of model calibration, and section 5 compares the performance of each model using well-established statistical methods (Black and Salter, 1975). 2. THEORIES The framework of the gravity model is based on the Newton’s Law of Gravity and its purpose is to find an equation that reproduces the intra-zonal and inter-zonal trip interchanges of travel survey data. The theories and mathematical concepts of the gravity model are well covered in Wilson (1970), Bruton (1970), Hutchinson (1974), Stopher and Meyburg (1975), Black (1981), Erlander and Stewart (1990), Sen and Smith (1995) and Ortuzar and Willumsen (2001). For the purpose of this research investigation, the doubly constrained gravity model, which represents the standard practice is used. General notations of this gravity model for the estimation of intra-zonal and inter-zonal trips are sketched out below (see, Black, 1981 and Ortuzar and Willumen, 2001): For a doubly constrained model: Tij = ai.Oi.bj.Dj. f(Cij) 6 (2.1) where Tij = estimate of the number of from zone i to j Oi = survey number of workers living in zone i Dj = survey number of jobs located in zone j Cij = spatial impedance from zone i to j (distance, time or cost) f(Cij) = the impedance or deterrence function ai, bi = zonal balancing factors The doubly constrained gravity model contains a constant for each production zone (ai) and a constant for each attraction zone (bi) to ensure that the estimates of zonal production are equal to the survey zonal productions as well as the estimates of zonal attraction are equal to the survey zonal attraction. Hence, the doubly constrained model has the following conditions. Tij = Oi and j Tij = Dj (2.2) i and ai = [ D. bj. f(Cij)]-1 and bj = [ O. ai. f(Cij)]-1 all (2.3) all The mainstream form of the intervening opportunities model is presented in the major textbooks including, Hutchinson (1974); Stopher and Meyburg (1975); Kanafani (1983); Sheppard (1995); and Ortuzar and Willumsen (2001). It is sketched out as follows: 7 Tij = Oi [exp(-LVj) – exp(-LVj+1)] (2.4) [1 – exp(-LV)] where Tij = estimate of the number of from zone i to j Oi = survey number of workers living in zone i L = calibration parameter V = subtended volume of residential destinations A preference function represents the relationship between the cumulative proportion of jobs taken and the cumulative proportion of jobs reached in a defined opportunities surface. It relates to the fundamental concept in the derivation of the intervening opportunities model as sketched out in Equation (2.5). Tij = Oi [P(Vj+1) – P(Vj)] (2.5) P(Vj+1) – P(Vj) is the probability of a trip terminating in zone j, which is expressed as the total probability that a trip will terminate by the time j+1 possible destinations are considered minus the total probability that a trip will terminate by the time j possible destinations are considered. Relating this to the theory of preference functions, then, P(Vj) is a function of the ratio of the cumulative number of jobs reached (Xj) for zone j to the total job opportunities available (XT) in a defined opportunities surface. Similarly, P(Vj+1) 8 is a function of the ratio of the cumulative number of jobs reached (Xj+1) for zone j+1 to the total job opportunities available (XT) in a defined opportunities surface. Hence, the estimation of Tij becomes: Tij = Oi [f(Xj+1/ XT) – f(Xj/ XT)] (2.6) where Tij = expected interchange from zone i to zone j Oi = the volume of trip origins at zone i f(Xj/XT) = a (preference) function of the ratio of the cumulative number of jobs reached (Xj) for zone j to the total job opportunities available (XT) in a defined opportunities surface f(Xj+1/XT) = (preference) function of the ratio of the cumulative number of jobs reached (Xj+1) for zone j+1 to the total job opportunities available (XT) in a defined opportunities surface. Three forms of preference functions can be examined (see Cheung, 2006): Logarithm function: f(X/XT) = a.ln(X/XT) + b (2.7) Quadratic function: f(X/XT) = a.(X/XT)2 + b.(X/XT) + c (2.8) Power function: f(X/XT) = a.(X/XT)b (2.9) where 9 a, b and/or c are parameter values of the slope and/or coefficients of the preference functions A system containing n zones will have a vector of n preference functions. However, in order to apply the preference functions to estimate every trip interchange in a system, a standard set, or in terms of a stratified model, standard sets of parameter values for a, b and/or c are required to be estimated for each functional form. 3. DATA SOURCE This section describes the data sources used in this research investigation. The estimation of a mathematical relationship, or a series of mathematical relationships that represent the zone-to-zone trip interchanges in a spatial interaction model, requires surveyed data that can derive zone-to-zone trip matrices and some forms of zone-to-zone travel impedance, often in the form of distance, time, cost or a combination of these. Travel datasets and impedance data were obtained for the Sydney metropolitan region. Sydney is the most populated city in Australia and the host of the 2000 Olympics Games. With Sydney’s population reaching 4.3 million in 2005 (Australian Bureau of Statistics, 2006), a rapid population growth in its outer centers (such as Blacktown, Baulkham Hills and Liverpool), a diversified industry employment base as well as a comprehensive motorway network, it represents a good study area for the examination of spatial interaction models by geographical locations. 10 The Australian Bureau of Statistics conducts the Census of Population and Housing survey every 5 years in Australia, where it records the place of enumeration (the origin) and the place of employment (the destination) for every individual on the Census day. Using the Census data, the Transport and Population Data Centre of the New South Wales Government’s Department of Planning processes the origin-destination information to produce the Journey-to-Work (JTW) dataset. Hence, the JTW dataset provides information on the trip to work on Census day undertaken by all employed people aged 15 years and over and represents the most comprehensive source of home-to-work trip information released by the Government. Through collaborative activities that were undertaken over a period of time between the transport research team within the School of Civil and Environmental Engineering, University of New South Wales, and the Transport and Population Data Centre, Department of Planning, the 1996 JTW data was obtained from the Centre. The 1996 JTW dataset allowed the compilation of stratified origin-destination home-to-work trip matrices. The other input data required was the zone-to-zone travel impedance, mostly in the form of travel distance for the purpose of this research. The use of travel distance instead of travel time or cost, especially at a strategic level, has an advantage that travel distances among locations were back-traceable and accurately calculated from maps or a travel network model. Through collaboration with Computing in Transportation, an advisory consultancy to the Government, the Sydney’s travel network model was obtained. With the aid of TransCAD, a geographical information system-transportation (GIS-T) software package provided by the School of Civil and Environmental Engineering, zone-to-zone travel distances were extracted from the travel network model. 11 4. DEVELOPMENT OF MODELS The 1996 JTW origin-destination trip matrix for the Sydney metropolitan area, was used to calibrate the preference function models with the three functional forms as defined earlier (logarithm, quadratic and power). Calibration was achieved by an iterative process of adjusting the calibration parameters (a, b, and/or c – see Equations 2.7 to 2.9) that best replicate the frequency distribution curve of the observed origin-destination survey data and the mean trip length. The iterative process of determining the optimum frequency distribution curve was undertaken by using the coincidence ratio. The coincidence ratio is used to compare two distributions. In using the coincidence ratio, the ratio in common between two distributions is measured as a percentage of the total area of those distributions. Mathematically, the sum of the lower value of the two distributions at each increment is divided by the sum of the higher value of the two distributions at that corresponding increment. In other words, the coincidence ratio measures the percent of area that “coincides” for the two curves. The coincidence ratio lies between zero and one, where zero indicates two disjoint distributions and one indicates identical distributions. The calibration criteria for successful preference models had a small difference in the comparison of observed and modeled mean trip length (<3%) and a co-incidence ratio of equal or greater than 0.7. 12 Aggregate preference function models using the various functional types: logarithmic, quadratic and power functions have been successfully calibrated. These models were developed in a spreadsheet (Microsoft Excel) environment where the calibration was undertaken using an iterative process to estimate parameter values that satisfy the calibration criteria. The values of the calibration parameters for the various models are shown in Table 1, where the power function provides the best model fit when co-incidence ratios are compared. Table 1: Calibration Results for the Aggregate Preference Functions Model In the development of stratified, or residential location-specific, preference function models, the aggregate (46 SLAs by 46 SLAs) 1996 JTW origin-destination trip matrix for the Sydney metropolitan area was stratified into three portions: 1. An inner ring trip matrix: 14 SLAs by 46 SLAs; 2. A middle ring trip matrix: 15 SLAs by 46 SLAs; and 3. An outer ring trip matrix: 17 SLAs by 46 SLAs In fact, the raw preference curves for the inner, middle and outer Sydney SLAs do appear to have some distinct differences as shown in Figure 1, Figure 2 and Figure 3 respectively. 13 Figure 1: Raw Preference Curves for the Inner Sydney SLAs (Note: Keys for SLA codes – Ashfield (150), Botany Bay (1100), Drummoyne (2550), Lane Cove (4700), Leichhardt (4800), Marrickville (5200), Mosman (5350), North Sydney (5950), Randwick (6550), South Sydney (7070), Sydney Inner (7201), Sydney Remainder (7202), Waverley (8050) and Woollahra (8500)) Figure 2: Raw Preference Curves for the Middle Sydney SLAs (Note: Keys for SLA codes – Auburn (200), Bankstown (350), Burwood (1300), Canterbury (1550), Concord (1900), Hunters Hill (4100), Hurstville (4150), Kogarah (4450), Ku-ring-gai (4500), Manly (5150), Parramatta (6250), Rockdale (6650), Ryde (6700), Strathfield (7100) and Willoughby (8250)) Figure 3: Raw Preference Curves for the Outer Sydney SLAs (Note: Keys for SLA codes – Baulkham Hills (500), Blacktown (750), Blue Mountains (900), Camden (1450), Campbelltown (1500), Fairfield (2850), Gosford (3100), Hawkesbury (3800), Holroyd (3950), Hornsby (4000), Liverpool (4900), Penrith (6350), Pittwater (6370), Sutherland (7150), Warringah (8000), Wollondilly (8400) and Wyong (8550)) The estimated mean trip lengths and co-incidence ratios, as well as the values of the calibration parameters for the various preference function models, calibrated for inner, middle and outer areas are shown in Table 2, Table 3 and Table 4, respectively. 14 Table 2: Calibration Results for the Inner Area Preference Functions Model Table 3: Calibration Results for the Middle Area Preference Functions Model Table 4: Calibration Results for the Outer Area Preference Functions Model The three location stratified trip matrices were then re-aggregated to produce the full 46 SLAs by 46 SLAs origin-destination trip matrix. The mean trip lengths and co-incidence ratios for this stratified model is shown in Table 5. Table 5: Calibration Results for the Stratified Preference Functions Model The calibration results shown above, in a similar manner to the aggregate preference models, suggest that the power function provides the best model fit, regardless of the degree of stratification. Considering the power function models, the stratification of inner, middle and outer areas provide an improvement in the match between the observed and modeled curves of trip length frequency distribution, evident from an increase of the coincidence ratio from 0.89 to 0.93. Improvements are also evident with the use of logarithm and quadratic functions. Other goodness-of-fit statistics and a detailed residual analysis are shown in the next section. 15 5. MODEL EVALUATION Several model performance evaluation techniques are presented in this section to further assess the performance of the calibrated preference function models. Firstly, macro performances statistics are used to compare observed data against modeled estimates. These include: The co-incidence ratio; The coefficient of determination (R2); The percent root mean square error (%RMSE); and The percentage of intra-zonal trips. Plots of observed and modeled trip length frequency distribution curves; Secondly, residual analysis was undertaken by the use of graphical plots and GIStransportation techniques. These include: Scatter plots of observed against modeled trip interchanges; Tables showing the residuals (both over- and under-estimations) grouped by suburban areas; Graphical 3D plots of the areas of over- and under-estimations, grouped by suburban areas; and 16 Assignment of residuals onto the transport network using GIS-transportation techniques (the TransCAD software package). Table 6 compares the macro performance statistics for the various functional types of the aggregate preference function models. Table 6: Statistical Performances of Aggregate Preference Functions Models At this point, it is appropriate to compare the performance of the aggregate preference functions models with the aggregate gravity models (Table 7). Table 7: Statistical Performances of Aggregate Gravity Models Table 8 then compares the macro performance statistics for the various functional types of the stratified preference function models. Table 8: Statistical Performances of Stratified Preference Functions Models Figure 4 and Figure 5 below illustrate the observed and modeled trip length frequency distribution curves for the aggregate and stratified preference function models respectively. Figure 4: Comparison of Trip Length Frequency Distribution Curves for Various Functional Forms (Aggregate Preference Functions Models) 17 Figure 5: Comparison of Trip Length Frequency Distribution Curves for Various Functional Forms (Stratified Preference Functions Models) As shown in the above performance statistics, and in a comparison of the trip length frequency distribution curves, stratification by using 3 preference functions to represent the trip pattern characteristics for the inner, middle and other locations in Sydney did increase the goodness-of-fit measures of the model by a considerable margin. This is evident from: a better co-incidence ratio (from 0.89 to 0.93) when the power function is used; an increase in the coefficient of determination (R2) (from 0.89 to 0.91); a smaller value of the root-mean-square-error (RMSE) (from 28.4 to 26.4); a better estimate of intra-zonal trips (from an over-estimation of 4% to an over-estimation of 1%); and visually, the stratified models obtain a clear better fit than the aggregate models. Furthermore, the preference function models achieved a better model fit compared with the gravity models from an assessment of the macro performance statistics. Figure 6 below is the plot of observed against modeled trip interchanges from the aggregate model. The power function model is illustrated as it has the best performance. While a 18 slope value of 1 and an R2 value of 1 indicate a perfect model fit, a slope value of greater than 1 indicates a tendency for over-estimation and a slope value of less than 1 indicates the opposite, i.e. under-estimation. Figure 6: Comparison of Trips Interchanges (Aggregate Power, Preference Functions Model) As shown in Figure 6, the power model performs with an R2 of 0.89 and a slope of 0.99. Figure 7 below is the plot of observed against modeled trip interchanges from the stratified models. Again, the power function model is shown. Figure 7: Comparison of Trips Interchanges (Stratified Power, Preference Functions Model) The research found that for all functions, the stratified models perform better than their corresponding aggregate models. As shown in Figure 6 and Figure 7, the stratified power model has a R2 of 0.90 comparing with a R2 of 0.85 in the aggregate power model. The figures shown below are analyses of residuals (both over- and under-estimations) using 3D plots for the various functional forms and for both the aggregate the stratified models. Interpretations are followed after the illustrations. 19 Figure 8: 3D-Plot Showing Over Estimated Residual (Aggregate Power, Preference Functions Model) Figure 8 above shows that the aggregate power preference function model over-estimates high number of localized trips in the South East and Inner/Central West regions. Figure 9: 3D-Plot Showing Under Estimated Residual (Aggregate Power, Preference Functions Model) Figure 9 above shows that the aggregate power preference function model under-estimates high number of localized trips in the Outer West and Central Coast regions and trips terminating in the Inner/East region from North East and South East regions. Figure 10: 3D-Plot Showing Over Estimated Residual (Stratified Power, Preference Functions Model) Figure 10 above shows that the stratified power preference function model over-estimates high number of localized trips in the Inner/Central West and North West regions. Compared to the aggregate model, the extent of the over-estimation has been reduced in most areas. Figure 11: 3D-Plot Showing Under Estimated Residual (Stratified Power, Preference Functions Model) 20 Figure 11 above shows that the aggregate power preference function model under-estimates high number of localized trips in the Central Coast region and trips terminating in the Inner/East region from Inner/East and South East regions. Compared to the aggregate model, the extent of the over-estimation has been reduced in most areas. The following network plots of residuals for the aggregate preference functions models and for the location specific preference functions models, indicate where the over- and underestimations of spatial residual biases might occur in different geographic areas and their magnitudes on the transport network of Sydney. For illustration purposes, an all-or-nothing assignment was performed using TransCAD to assign the residuals (a cell by cell difference of trip interchanges between observed data and modeled estimates) for each of the models. Over- and under-estimations of residual biases are illustrated on separate figures. Interpretations of the results are presented at after the illustrations. Figure 12: Network Plot Showing Over Estimation of Spatial Biases (Aggregate Power, Preference Functions Model) Figure 13: Network Plot Showing Under Estimation of Spatial Biases (Aggregate Power, Preference Functions Model) Figure 14: Network Plot Showing Over Estimation of Spatial Biases (Stratified Power, Preference Functions Model) 21 Figure 15: Network Plot Showing Under Estimation of Spatial Biases (Stratified Power, Preference Functions Model) Drawing on the findings shown in the above figures, over-estimation of trips was found for trips heading towards the inner and central west suburbs and over-estimation of trips towards the inner/ eastern suburbs. Despite stratified preference function models were found to produce spatial biases in similar geographical areas and transport corridors as in the aggregate models, they are shown to have merit for practical application because of an improvement in accuracy in the range of 7 to 23% (based on macro performance statistics) for the estimation of trip interchanges in Sydney. This improvement in accuracy in relative terms was more pronounced when compared with the gravity model. 6. FURTHER EXTENSIONS TO MODEL The analyses and results presented have shown that there are merits in developing stratified models based on geographic locations (i.e. inner, middle and outer areas in the case of Sydney) in a metropolitan area. This may represent a more general finding to other large metropolitan regions and needs to be investigated. It is worthwhile to note that the grouping of locations or zones does not necessary follow the geographical jurisdiction of the study area. Further appropriate stratification may lead to 22 a future reduction in spatial residual bias. For example, from the preference function plots, differences in trip pattern existed within the inner, middle, and outer areas, as shown in Figure 16, Figure 17 and Figure 18. This shows that there is a potential for further stratification. Figure 16: Differential Preference Patterns among Inner SLAs Figure 17: Differential Preference Patterns among Middle SLAs Figure 18: Differential Preference Patterns among Outer SLAs As a result, a stratification to produce six separate preference functions to represent the distinct trip characteristics for each group of geographic location may be worthwhile for further investigation in producing improved model accuracy. Furthermore, the preference function model used is a residential location based model, i.e. destination zones are ranked from the origin zone. In other words, the preference function models presented is production constrained. A potential improvement to the model performance is to investigate an attraction constrained model with the use of employment location-specific preference functions. In other words, the process involved in the development of this model is a reverse of the one presented in this paper. 23 Apart from the aforementioned extensions, another potential extension to improve the model accuracy is by exploring a more detailed zone system, for example, using travel analysis zone instead of statistical local areas. Currently, equivalent to the 46 SLAs, the Sydney metropolitan area has approximately 700 to 800 travel analysis zones (TAZs). The methods described in this paper are equally applicable at the more detailed spatial resolution. It would be especially important to investigate the zonal within variance in the preference functions, as well as the zonal between SLA variance. The use of a more detailed zoning system is expected to improve the accuracy of the model. On the other hand, it also consumes a far greater need for computer hardware and software requirements, which is considered applicable in the near future with the continuous advancement in computing technologies. 7. CONCLUSIONS The spatial interaction models presented in this paper were evaluated against a number of statistical and graphical techniques. The results showed that the aggregate preference function model produced less systematic bias compared with the gravity model. The stratified model, based on residential specific functions, produced even less bias compared with the aggregate preference function model, regardless of the functional form used. A preference function model with a power function form generally produced more accuracy and less systematic spatial bias. 24 Preference function models, in both aggregate and geographically stratified forms, with probability decaying functions in the forms of logarithm, quadratic and power were successfully calibrated under a spreadsheet based modeling platform. Different functional parameters for inner, middle and outer suburbs were determined for geographically stratified preference function models. Comparing the performance and accuracy of the various models, goodness-of-fit statistics including the coefficient of determination (R2), percentage root mean square error (%RMSE), co-incidence ratio and the percentage of intra-zonal trips were used. Among the aggregate models, the preference function model with a power function produced the best overall match between modeled estimates and observed data with both R2 and co-incidence ratio values of 0.89 and a %RMSE value of 28.4 (although it over-estimated intra-zonal trips by 4%). When the above statistics were considered, the aggregate preference models generally provided more accuracy than the aggregate gravity models. With the use of location-specific preference functions, geographical stratification showed improvement in model performance. The location-specific power function model performed better than the aggregate power function model, with a R2 of 0.91, a coincidence ratio of 0.93, a %RMSE of 26.4, and an intra-zonal trip percentage of 33% (1% more than the observed data). The comparison of the trip length frequency distribution curves from observed and modeled estimates illustrated a closer match of observed trip frequency for stratified models and the power function models seemed to provide the best fit. A similar conclusion was reached when the plots of observed and modeled trip interchanges were examined. 25 Graphical and GIS-transportation analytical techniques were applied to further explain the global statistics (R2, %RMSE, co-incidence ratio and intra-zonal trip proportion) and to pinpoint geographically where spatial residual errors occur and where these residuals are likely to be distributed onto the transport network. With the use of stratified preference function models, the improvement in accuracy in relative terms was more pronounced compared with the gravity model. The analyses and results presented in the paper conclude that preference function models were more accurate in the estimation of journey-to-work trips in the Sydney metropolitan area and geographically stratified models (location-specific preference functions models), which have separate functions for inner, middle and outer areas were concluded to achieve a better fit between observed and modeled data. The implications of these differences to land use and spatial planning are articulated, in particular, where future land use and transport investment decisions are largely based on findings in these spatial interaction models, where inaccurate model results could potentially propagate onto the subsequent stages of the transport modeling process (for example, mode split and trip assignment stages). It is also worth-noting that there are some interesting policy contrasts, with the theory of the gravity model more geared towards transportation policy (i.e. the policy handle is cost/time/distance of travel in a system), and the preference functions more closely related to land use policy (the policy handle being the distribution of opportunities in space). 26 Transport modeling practitioners are encouraged to use the research findings presented in this paper as an indicative guide for future performance appraisal in spatial interaction models. Despite the support of the use of the gravity model framework by many commercial software packages, incorporation of the mathematical procedure for the development of calibrated location-specific function models to computer software platforms is recommended. REFERENCES Australian Bureau of Statistics (2006) Regional Population Growth, Australia, 2004-2005, ABS Catalogue No. 3218.0, Australian Bureau of Statistics, Australian Government, Canberra. Baltimore Metropolitan Council (2004) Baltimore Region Travel Demand Model for Base Year 2000, Task Report 04-01, Baltimore, Maryland. Batty, M. (1976) Urban Modelling: Algorithms, Calibrations, Predictions, Cambridge University Press, Cambridge. Black, J. A. (1981) Urban Transport Planning: Theory and Practice. Croom Helm, London. 27 Black, J.A. and R.J. Salter, (1975) ‘A Review of the Modelling Achievements of British Urban Land-Use Transportation Studies Outside the Conurbations’, Journal of the Institution of Municipal Engineers, Vol. 102, pp. 100-105. Bruton, M. J. (1970) Introduction to Transportation Planning, Hutchinson, London. Cheung, C. (2006) Development and Evaluation of Stratified Spatial Interaction Models and Models Based on Location Specific Preference Functions for Sustainable Transport and Traffic Assessment, Ph.D. Thesis, School of Civil and Environmental Engineering, University of New South Wales, Sydney. de la Barra, T., Perez, B. and Vera, N. (1984) ‘TRANUS-J: Putting Large Models into Small Computers’, Environment and Planning B, Vol. 11, pp. 87-101. Easa, R. (1984) ‘Working Paper Number 84-7: Development of a Doubly Constrained Intervening Opportunities Models for Trip Distribution’, Chicago Area Transportation Study, Chicago. Edens, H. J. (1970) ‘Analysis of a Modified Gravity Model’, Transportation Research, Vol. 4, pp. 51-62. Erlander, S. and Stewart, N. F. (1990) The Gravity Model in Transportation Analysis: Theory and Extensions, VSP BV, Utrecht. 28 Evans, A. W. (1971) ‘The Calibration of Trip Distribution Models with Exponential or Similar Cost Functions’, Transportation Research, Vol. 5, pp. 15-38. Gray, R. H. (1980) Gravity Models: A Conceptually and Computationally Simple Approach, Thesis submitted for the Graduate College of the University of Illinois at Chicago Circle, Chicago, Illinois. Hunt, J. D. (1994) ‘Calibrating the Naples Land Use and Transport Model’, Environment and Planning B: Planning and Design, Vol. 21, pp. 569-90 Hutchinson, B. G. (1974) Principles of Urban Transport Systems Planning, McGraw-Hill, Washington DC. Kanafani, A. K. (1983) Transportation Demand Analysis, McGraw-Hill, New York. Mao, S. and Demetsky, M. J. (2002) Calibration of the Gravity Model for Truck Freight Flow Distribution, Centre for Transportation Studies, University of Virginia, Charlottesville, United States. Masuya, Y. and Black, J. (1992) ‘Transport Infrastructure Development and Journey to Work Preference Functions in Sapporo’, Infrastructure Planning Review: Japan Society of Civil Engineers, Vol. 10, pp. 127-134. 29 Masuya, Y., Shitamura, M., Saito, K., and Black, J. (2002) ‘Urban Spatial Re-structuring and Journey-to-work Trip Lengths: A Case Study of Sapporo from 1972 to 1994’, in Traffic and Transportation Studies 2002, American Society of Civil Engineers, United States of America. NSW Department of Transport (1974) Sydney Area Transportation Study 1971, Sydney. Ortuzar, J. and Willumsen, L. (2001) Modelling Transport, 3rd Edition, John Wiley & Sons, West Sussex, England. Páez, A., Suthanaya, P. and Black, J. (2001) ‘A Spatial Analysis of Transportation ModeSpecific Journey-to-Work Commuting Preferences: Implications for Sustainable Transport Policies’, Proceedings of the 9th World Conference on Transport Research, Seoul, 22-27 July, CD-rom. Ruiter, E. R. (1967) ‘Toward a Better Understanding of the Intervening Opportunities Model’, Transportation Research, Vol. 1, pp. 47-56. Ruiter, E. R. (1969) ‘Improvements in Understanding, Calibrating and Applying the Opportunity Model’, Highway Research Record, Vol. 165, pp.1-21. 30 Sen, A. and Smith, T. E. (1995) Gravity Models of Spatial Interaction Behaviour, SpringerVerlag Berlin, Heidelberg. Schneider, M. (1959) ‘Gravity Models and Trip Distribution Theory’, Journal of Regional Science, Vol. 5, pp. 51-56. Sheppard, E. (1995) ‘Modelling and Predicting Aggregate Flows’, in The Geography of Urban Transportation, Hanson, S. (ed), The Guilford Press, New York. Smith, D. P. and Hutchinson, B. G. (1981) “Goodness of Fit Statistics for Trip Distribution Models”, Transportation Research A, Vol. 15, pp. 295-303. Stopher, P. R. and Meyburg, A. H. (1975) Urban Transportation Modelling and Planning, Lexington Books, D.C. Heath and Company, Toronto/London. Stouffer, S. (1940) ‘Intervening Opportunities: A Theory Relating Mobility and Distance’, American Sociological Review, Vol. 5, pp. 845-867. Wilson, A. G. (1970) ‘Advances and Problems in Distribution Modelling’, Transportation Research, Vol. 4, pp. 1-18. Zhao, F., Chow, L.F, Li, M. T. and Shen, D. L. (2001) ‘Refinement of FSUTMS Trip Distribution Methodology’, Technical Memorandum No. 3: Calibration of an Intervening 31 Opportunity Model For Palm Beach County, prepared for Florida Department of Transportation, September 2001. 32 Table 1: Calibration Results for the Aggregate Preference Functions Model Mean Trip Co-incidence Functional Forms Value of Parameters Length (km) Ratio Logarithmic 18.1 0.81 a -0.157 Quadratic 18.1 0.86 a -1.077 b 2.254 c 0.269 a 1.116 b 0.296 Power 18.1 0.89 Parameters Note: The mean trip length observed in the Census JTW data for Sydney is 18.1km. 33 Table 2: Calibration Results for the Inner Area Preference Functions Model Mean Trip Co-incidence Functional Forms Value of Parameters Length (km) Ratio Logarithmic 8.5 0.62 a -0.146 Quadratic 8.5 0.84 a -1.407 b 2.184 c 0.204 a 1.230 b 0.422 Power 8.5 0.87 Parameters Note: The mean trip length observed in the Census JTW data for inner Sydney is 8.5km. 34 Table 3: Calibration Results for the Middle Area Preference Functions Model Mean Trip Co-incidence Functional Forms Value of Parameters Length (km) Ratio Logarithmic 12.4 0.78 a -0.195 Quadratic 12.4 0.85 a -1.044 b 1.774 c 0.231 a 1.115 b 0.404 Power 12.4 0.90 Parameters Note: The mean trip length observed in the Census JTW data for middle distance Sydney is 12.4km. 35 Table 4: Calibration Results for the Outer Area Preference Functions Model Mean Trip Co-incidence Functional Forms Value of Parameters Length (km) Ratio Logarithmic 24.7 0.80 a -0.148 Quadratic 24.7 0.84 a -1.453 b 1.736 c 0.322 a 1.119 b 0.263 Power 24.7 0.86 Parameters Note: The mean trip length observed in the Census JTW data for outer Sydney is 24.7km. 36 Table 5: Calibration Results for the Stratified Preference Functions Model Mean Trip Co-incidence Length (km) Ratio Logarithmic 18.1 0.84 Quadratic 18.1 0.89 Power 18.1 0.93 Functional Forms 37 Table 6: Statistical Performances of Aggregate Preference Functions Models Co-incidence Intra-zonal R2 Functional Forms %RMSE Ratio Trip (%) Logarithmic 0.81 0.87 35.8 39% Quadratic 0.86 0.85 32.4 32% Power 0.89 0.89 28.4 36% Note: The percentage of intra-zonal trip in observed Census JTW data is 32%. 38 Table 7: Statistical Performances of Aggregate Gravity Models Co-incidence Intra-zonal R2 Functional Forms %RMSE Ratio Trip (%) Exponential 0.81 0.70 44.7 18% Power 0.82 0.79 38.4 28% Note: The percentage of intra-zonal trip in the observed Census JTW data is 32%. The gravity model with the use of a gamma deterrence function was not successfully calibrated with the maximum number of iterations and therefore its performance has been considered as unsatisfactory and the performance measures are not reported. 39 Table 8: Statistical Performances of Stratified Preference Functions Models Co-incidence Intra-zonal R2 Functional Forms %RMSE Ratio Trip (%) Logarithmic 0.84 0.87 35.9 38% Quadratic 0.89 0.90 28.4 35% Power 0.93 0.91 26.4 33% Note: The percentage of intra-zonal trip in observed Census JTW data is 32%. 40 Figure 1: Raw Preference Curves for the Inner Sydney SLAs Figure 2: Raw Preference Curves for the Middle Sydney SLAs Figure 3: Raw Preference Curves for the Outer Sydney SLAs Figure 4: Comparison of Trip Length Frequency Distribution Curves for Various Functional Forms (Aggregate Preference Functions Models) Figure 5: Comparison of Trip Length Frequency Distribution Curves for Various Functional Forms (Stratified Preference Functions Models) Figure 6: Comparison of Trips Interchanges (Aggregate Power, Preference Functions Model) Figure 7: Comparison of Trips Interchanges (Stratified Power, Preference Functions Model) Figure 8: 3D-Plot Showing Over Estimated Residual (Aggregate Power, Preference Functions Model) Figure 9: 3D-Plot Showing Under Estimated Residual (Aggregate Power, Preference Functions Model) 41 Figure 10: 3D-Plot Showing Over Estimated Residual (Stratified Power, Preference Functions Model) Figure 11: 3D-Plot Showing Under Estimated Residual (Stratified Power, Preference Functions Model) Figure 12: Network Plot Showing Over Estimation of Spatial Biases (Aggregate Power, Preference Functions Model) Figure 13: Network Plot Showing Under Estimation of Spatial Biases (Aggregate Power, Preference Functions Model) Figure 14: Network Plot Showing Over Estimation of Spatial Biases (Stratified Power, Preference Functions Model) Figure 15: Network Plot Showing Under Estimation of Spatial Biases (Stratified Power, Preference Functions Model) Figure 16: Differential Preference Patterns among Inner SLAs Figure 17: Differential Preference Patterns among Middle SLAs Figure 18: Differential Preference Patterns among Outer SLAs 42 Y (Cum. Prop. of Jobs Taken) 1.0 0.8 0.6 0.4 0.2 150 1100 2550 4700 4800 5200 5350 5950 6550 7070 7201 7202 8050 8500 0.0 0.0 0.2 0.4 0.6 0.8 X (Cumulative Proportion of Jobs Reached) 43 1.0 Y (Cum. Prop. of Jobs Taken) 1.0 0.8 0.6 0.4 0.2 200 350 1300 1550 1900 4100 4150 4450 4500 5150 6250 6650 6700 7100 8250 0.0 0.0 0.2 0.4 0.6 0.8 X (Cumulative Proportion of Jobs Reached) 44 1.0 Y (Cum. Prop. of Jobs Taken) 1.0 0.8 0.6 0.4 0.2 500 750 900 1450 1500 2850 3100 3800 3950 4000 4900 6350 6370 7150 8000 8400 8550 0.0 0.0 0.2 0.4 0.6 0.8 X (Cumulative Proportion of Jobs Reached) 45 1.0 Frequency of Trips in 5km Intervals 500,000 Observed Linear-Log Quadratic Power 400,000 300,000 200,000 100,000 0 5 10 15 20 25 30 35 Distance (km) 46 40 45 50 55 60 Frequency of Trips in 5km Intervals 500,000 Observed Linear-Log Quadratic Power 400,000 300,000 200,000 100,000 0 5 10 15 20 25 30 35 Distance (km) 47 40 45 50 55 60 50,000 Modelled No. of Trips in Interchanges 45,000 y = 0.993x R2 = 0.892 40,000 35,000 30,000 25,000 20,000 15,000 10,000 5,000 0 0 5,000 10,000 15,000 20,000 25,000 30,000 35,000 Observed No. of Trips in Interchanges 48 40,000 45,000 50,000 50,000 Modelled No. of Trips in Interchanges 45,000 y = 1.007x R2 = 0.908 40,000 35,000 30,000 25,000 20,000 15,000 10,000 5,000 0 0 5,000 10,000 15,000 20,000 25,000 30,000 35,000 Observed No. of Trips in Interchanges 49 40,000 45,000 50,000 50000 40000 35000 30000 25000 t as ast st t /E E r e es est st Ea h t n t h W t e In Nor l W u t es tra th th W os W So r n r C o e u l te N /C ra So Ou er nt n e In C Origin 50 Inner/East North East South East Inner/Central West North West South West Outer West Central Cost inatio n 20000 15000 10000 5000 0 Dest Number of JTW Trips 45000 50000 40000 35000 30000 25000 Origin 51 Inner/East North East South East Inner/Central West North West South West Outer West Central Cost inatio n er / C ut h en E tra ast lW N es or th t W So es ut t h W O e ut st er C W en es tra t lC os t So In n N or th er /E as t Ea st 15000 10000 5000 0 Dest 20000 In n Number of JTW Trips 45000 50000 40000 35000 30000 25000 t as ast st t /E E r e es est st Ea h t n t h W t e In Nor l W u t es tra th th W os W So r n r C o e u l te N /C ra So Ou er nt n e In C Origin 52 Inner/East North East South East Inner/Central West North West South West Outer West Central Cost inatio n 20000 15000 10000 5000 0 Dest Number of JTW Trips 45000 50000 40000 35000 30000 25000 Origin 53 Inner/East North East South East Inner/Central West North West South West Outer West Central Cost inatio n er / C ut h en E tra ast lW N es or th t W So es ut t h W O e ut st er C W en es tra t lC os t So In n N or th er /E as t Ea st 15000 10000 5000 0 Dest 20000 In n Number of JTW Trips 45000 N Wyong $ 0 20 km Residual Errors (No. of Trips) $ Gosford 10,000 5,000 1,000 Hawkesbury $ Hornsby $ $ $ $ $ Blue Mountains $ $ $ Pacific Ocean $ $ $ $ $ $ $ $ $ $ $ $ $ $$ $ $ $ $$ $ $ $ $ $ $ $ $ Camden $ $ $ $ $ $ Sutherland $ $ Pittwater $ Hornsby $ Blacktown $ $ $ $ Penrith Chatswood $ $ $ $ Parramatta $ $ $ $ $ $ $ $ $ $ $ $ $ $ CBD $ $ $ $ $ Liverpool $ $ $ $ $ N $ $ $ 0 10 km $ $ Campbelltown Sutherland $ 54 Residual Errors (No. of Trips) 10,000 5,000 1,000 N Wyong $ 0 20 km Residual Errors (No. of Trips) $ Gosford 10,000 5,000 1,000 Hawkesbury $ Hornsby $ $ $ $ $ Blue Mountains $ $ $ Pacific Ocean $ $ $ $ $ $ $ $ $ $$ $ $ $ $$ $ $ $ $ $ $ $ $ $ $ $ $ Camden $ $ $ $ $ $ Sutherland $ $ Pittwater $ Hornsby $ Blacktown $ $ $ $ Penrith Chatswood $ $ $ $ Parramatta $ $ $ $ $ $ $ $ $ $ $ $ $ $ CBD $ $ $ $ $ Liverpool $ $ $ N $ $ $ $ $ 0 10 km Residual Errors (No. of Trips) $ $ 10,000 Campbelltown Sutherland $ 55 5,000 1,000 N Wyong $ 0 20 km Residual Errors (No. of Trips) $ Gosford 10,000 5,000 1,000 Hawkesbury $ Hornsby $ $ $ $ $ Blue Mountains $ $ $ Pacific Ocean $ $ $ $ $ $ $ $ $ $ $ $$ $ $ $ $ $ $$ $ $ $ $ $ $ $ $ Camden $ $ $ $ $ $ Sutherland $ $ Pittwater $ Hornsby $ Blacktown $ $ $ $ Chatswood $ Penrith $ $ $ $ Parramatta $ $ $ $ $ $ $ $ $ $ $ $ $ CBD $ $ $ $ $ Liverpool $ $ $ $ $ N $ $ $ 0 10 km $ $ Campbelltown Sutherland $ 56 Residual Errors (No. of Trips) 10,000 5,000 1,000 N Wyong $ 0 20 km Residual Errors (No. of Trips) $ Gosford 10,000 5,000 1,000 Hawkesbury $ Hornsby $ $ $ $ $ Blue Mountains $ $ $ Pacific Ocean $ $ $ $ $ $ $ $ $ $ $ $ $ $$ $ $ $ $$ $ $ $ $ $ $ $ $ Camden $ $ $ $ $ $ Sutherland $ $ Pittwater $ Hornsby $ Blacktown $ $ $ $ Penrith Chatswood $ $ $ $ Parramatta $ $ $ $ $ $ $ $ $ $ $ $ $ $ CBD $ $ $ $ $ Liverpool $ $ $ N $ $ $ $ $ 0 10 km Residual Errors (No. of Trips) $ $ Campbelltown Sutherland $ 57 10,000 5,000 1,000 Y (Cum. Prop. of Jobs Taken) 1.0 Inner SLAs with distance minimisation 0.8 0.6 Inner SLAs with distance maximisation 0.4 0.2 150 1100 2550 4700 4800 5200 5350 5950 6550 7070 7201 7202 8050 8500 0.0 0.0 0.2 0.4 0.6 0.8 X (Cumulative Proportion of Jobs Reached) 58 1.0 Y (Cum. Prop. of Jobs Taken) 1.0 Middle SLAs with distance minimisation 0.8 0.6 Middle SLAs with distance maximisation 200 350 1300 1550 1900 4100 4150 4450 4500 5150 6250 6650 6700 7100 8250 0.4 0.2 0.0 0.0 0.2 0.4 0.6 0.8 X (Cumulative Proportion of Jobs Reached) 59 1.0 Y (Cum. Prop. of Jobs Taken) 1.0 0.8 0.6 Outer SLAs with distance maximisation 0.4 Outer SLAs with distance minimisation 0.2 500 750 900 1450 1500 2850 3100 3800 3950 4000 4900 6350 6370 7150 8000 8400 8550 0.0 0.0 0.2 0.4 0.6 0.8 X (Cumulative Proportion of Jobs Reached) 60 1.0