TOWARDS A NEURAL NETWORK BASED MODEL FOR TRAVEL DEMAND FORECAST: GIS AND REMOTE SENSING APPROACH André Dantas1, Marcus V. Lamar1, Yaeko Yamashita2, Koshi Yamamoto1, Eizo Hideshima1 1 2 Nagoya Institute of Technology, Nagoya, Japan Department of Civil Engineering, University of Brasilia - Brazil E-mail: andre@keik1.ace.nitech.ac.jp A formulation of travel forecast model based on geographical–spatial data of urban areas is presented. It focus on land use – transportation system interaction in order to represent displacements (trips) needs of human activities within the urban area. The geographical–spatial data mainly originated from satellite and aerial photograph images, are obtained exploring Geographical Information Systems and Remote Sensing. Neural Networks are used to establish an “artificial intelligent” comprehension of the urban dynamic that affect travel demand process. A case study in Boston Metropolitan Area proved the methodology’s efficiency, given that the consideration of geographical spatial variables resulted in a good result of trip forecast. 1. INTRODUCTION In the process of urban transportation planning, travel demand forecast is one of the most important analysis instruments to evaluate the capacity to attend the present and future needs. These forecasts integrate into a planning process aiming to generate the necessary information to decide where, when and how transportation system has to be improved. They intend to measure how urban activities are converted in travels, by assuming that those interactions generate the movement of people. In this sense, the number of trips from origin to destination zones are estimated by correlating socioeconomic data (population, income, etc) [Khisty, 1990]. Forecast models have failed in providing an efficient representation of spatial and temporal interaction between the land use and transportation system. Models were developed processing and correlating representative variables using traditional statistic techniques without considering geographical – spatial reality. They do not incorporate the complex and uncertain nature of land use and transportation system interaction generating static and deterministic models that require too much data to be applicable in real situations [Harris, 1991] [Rodrigue, 1997]. To overcome these limitations, this paper proposes a Remote Sensing (RS) and Geographical Information Systems (GIS) approach to develop a model based on Neural Network (NN). This approach intends to explore RS data as the main source to generate information related to land use patterns and transportation system interactions using GIS. This information is processed by NN that will establish an artificial intelligent modelling considering the interactions in a spatial-temporal perspective. In the sequence, it is established an RS-GIS-NN integration that provides the incorporation and analysis of the urban dynamics in the model. Moreover, according to Morain and Baros (1996), this integration data leads to time reduction and cost-saving benefits against field surveys that are hardly updated by using traditional methods. This fact is essential for the urban transportation planning process since the proposed approach intends to reduce and/or eliminate extensive data collection as usually processed. In this paper, we concentrate on the description of this integration, the model’s formulation for spatial dimension (without temporal variations) and the case study for Boston Metropolitan Area. 2. TRAVEL FORECAST MODEL: RS&GIS& INTEGRATION APPROACH RS and GIS integration has been successfully applied to solve urban problems generating information for decision-making activities. Many applications have being conducted exploring this integration IN the recent years (Morain and Baros, 1996). However, Foresman and Millete (1997) emphasize that GIS capabilities have been concentrated in Boolean operations and conventional spatial statistics when processing multilayered map-based information that has lead to a basic level of “display analysis”. In order to reach more powerful spatial analytic and spatial process modelling tools NN has been applied. Openshaw (1991) and Openshaw et al (1995) reported successful applications and NN efficiency, showing perspectives to elaborate a new modelling conception. In this direction, it is introduced here the integration framework specifically to the travel forecast modelling. It can be applied to process data from different urban areas, which would be the ideal situation providing more generalization for the model. Its description is concentrated on the integration process and the role of its components. Five modules as shown in Figure 1 compose the framework. The integration process starts with data obtainment. In the sequence, satellite images and aerial photographs are stored in RS data Module (1) and Origin / destination matrix are elaborated in Trips data Module (2) and finally maps containing transportation system information (road, subway, train, bus, etc) are transferred to Maps data Module (3). In this obtainment process, it is fundamental to define a common aggregation level once this definition will direct affect all modelling process. After this step, data from Module 1 is used in GIS Module (4) in order to process multispectral analysis and aerial photograph interpretation generating the land use patterns. After, they are incorporated to GIS database that previously stored data from Module 2 and 3. Spatial-temporal queries are processed using this database and the results are conducted to NN Module (5). In this module, query results are pre-processed and then divided in training and test data sets. NN software performs the training process defining a function that is tested and revised until the expected level of forecast is reached. Finally, Module 5 returns to Module 4 (GIS), where travel forecast function is applied to a future scenario and then visualized. 1 RS data 2 t=0 Trips data Maps data 3 t=0 t=0 t =1 t=1 t=1 t=n t=n t=n 4 CLASS Classe 3 3 Multispectral Analysis & Aerial Photograph interpretation Land use Patterns GIS database Visualization & Analysis GIS f X t n , Y t n ,.....,Z t n Trips t n Travel Forecast Spatial-temporal Queries f X , Y ,....., Z Trips Test data set Pre–Processing Data Set NN Forecast Function 5 Training Process Training data set Figure 1: RS and GIS and NN integration for Travel Forecast Problem 3. NN DEFINITION FOR MODELLING IN SPATIAL DIMENSION An urban area is divided in Homogenous Aggregated Sector (HAS) as defined by Taco et al [1999]. Trips (Tij) between a pair of HAS pairs are verified, where i and j denote the origin and destination, respectively. Land use is represented by occupied area of each pattern in each HAS, which is assigned to RLUHAS (Residential Land use), CLUHAS (Commercial Land use) and SLUHAS (Service Land use). Transportation system is characterized by the extension of each mode in each HAS, which is associated to RTSHAS (Road Transportation System), BTSHAS (Bus Transportation System), STSHAS (Subway Transportation System) and TTSHAS (Train Transportation System). In addition, it is defined the distance Dij between a pair of HAS that intends to represent location relations. The model aims to forecast the levels of urban movements as main element to be used in strategic planning activities. It is expected that the model’s final output is an intensity indicator classifying the forecasted movements in categories such as high, mediumhigh, medium, medium-low and low. In this way, Tij is classified z categories according to range variation (v) between them and the maximum and minimum trips observed in the urban area (or study area). In the NN modelling, these characteristics are transformed in input (X) and output (Y) vectors that are processed as a feedforward Multilayer Perceptron (MLP) (see Wasserman, 1989). Here, the input vector is defined such as X = (RLUi, CLUi, SLUi, RTSi, BTSi, STSi, TTSi, RLUj, CLUj, SLUj, RTSj, BTSj, STSj, TTSj, Dij). The Y vector is obtained from the classification of generating (y1, y2, ….yz) where y values are defined by equation 1. 1.0 , k Ti j k where k (1, 2, …, z) yk = 0.0 , (1) otherwise where k and k are the minimum and maximum value of category k, respectively. Physically, it means that trips between two HAS (i and j) are expressed in NN by the combination of land use, transportation systems and distances indicators for each HAS (origin and destination). As a typical NN, input, hidden and output layers compose it. Hidden layers (H) are composed by are hidden neuron units H1, H2, … Hm connect input and output, expressed by X and Y, respectively, as shown by Figure 2. Hidden Layer Input Layer x1 W11 H1 V11 y1 y2 H2 x2 W1m xn Output Layer Wnm Hm Vmz Figure 2: Typical NN structure V1z Yz The output of the neurons are calculated by: n H r f Wrs X s s 1 (2) m y g f Vu H ug u 1 (3) where f is the activation function assumed in this study a sigmoid function. W and V are weight matrixes that are obtained by applying a backpropagation based training algorithm [Wasserman, 1989]. 4. CASE STUDY This case study intended to evaluate the integration process and the proposed NN model. Tests were conducted in Boston Metropolitan Area (Massachusetts State – USA), which covers about 1400 square miles (3580 square kilometers) with nearly three million people live. It was selected a study area in Boston South area, near to Boston Medical Center that involves seventeen Traffic Zones (TZ) as shown in Figure 2. It was assumed that the TZ`s were equivalent to HAS definition (see item 3). The study area is mainly occupied by residential land use and it is located very near to downtown. TZ limits Figure 3: Study area – selected TZ RS and Transportation system data were obtained from MassGIS database (http://www.magnet.state.ma.us/mgis/). In this database, data was projected to Massachusetts State plane Mainland Zone (FIPSZONE 2001) coordinate system, Datum NAD83, units meters. It was used black and white digital orthophotos produced in 1992 in 1:5000 scale. Additionally, bus route maps from Metropolitan Boston Transportation Authority (MBTA) were incorporated to Transportation data. Finally, Central Transportation Planning Staff (CTPS) provided access to Travel data related to 1990 survey that involves all purposes trips as well as HAS definition. In the sequence, GIS database construction activities are described. Next, NN experiments and results are introduced. 4.1 GIS database Geoconcept GIS software was used to develop these tests. First, Land Use patterns were obtained following United States Geological Service (USGS) classification system [Avery and Berlin, 1990] and Taco’s methodology [Taco et al, 1999]. It was analyzed up to Level II (Residential, Commercial and Services) according to model’s requirements (see item 3). Figure 3 presents the Land Use patterns for the study area. In the sequence, Transportation System was digitalized in the GIS database. Figure 4 shows Road, Bus, Subway and Train systems for study area. 4.2 NN experiments and results Once GIS database was elaborated, the X and Y vectors were established from this database using spatial queries to conduct NN experiments. The input vector X was normalized using equation 4. xncTZ 1 2 xcTZ xcmin / xcmax xcmin (4) where xcTZ is the value from X vector for characteristic c (land use or transportation system or distance) in TZ and xncZT is the normalized value. xcmin and xcmax are the minimum and maximum value for the related vector. In the sequence, Y vectors were generated through the classification of Tij matrix according to equation 1. Considering that the maximum and minimum trips were 67 and 0, respectively. Then, z (categories) and v (range) were defined 5 and 14, respectively. In order to create training and test data, X and Y vectors were divided following a random distribution. Test data were selected according to a standard criterion, which establishes that it has to be 25% of the total. So, 216 and 73 vectors were assigned to training and test sets, respectively. Five NN structure types were simulated in T-learn software. From the basic structure described in item 3, it was established variations in hidden layer. Structures A, B and C were defined with 15, 30 and 8 nodes in hidden layer, respectively. In structure D and E, it was inserted one and two hidden layers, receptively. The NN were trained until a minimum MSE error was reached in the test data through the application of equation 5. 1 p MSE Tw Yw p w1 2 (5) where MSE is the mean square error considering forecasted (Y) and expected (T) classified trips for p testing data vectors. Residential Services Commercial Figure 3: Land Use patterns Road Bus Train Subway Figure 4: Transportation System The Table 1 summarizes the results for these five structures. It can be observed that MSE variation was very small indicating that the NN structure did not affect the final results. It can be verified that structure E presented the worst result comparing to the others that are on the same level considering the number of interactions to reach the minimum. However, this difference does not compromise the time execution since just few seconds are necessary to perform the complete optimization process. Finally, the recognition rates show that all the structures reached the same level of optimization and indicating that the network structure is not a fundamental parameter to obtain a good modelling for this formulation. Analyzing in detail the forecasted results, it is verified that the errors occurred in five testing vectors. These vectors were related to categories 2, 3 and 4. On the other hand, all the test vectors related to category 5 (0 Tij 14) were correctly forecasted. This NN performance can be explained by the composition of X and Y vectors that were obtained from study area. It is observed that there is a great concentration of data on category 5 (252 vectors – 87% of the total). Meanwhile, categories 1, 2, 3 and 4 have only 1, 3, 5 and 28 vectors, respectively. This composition led to the construction of training and testing data sets that resulted in NN efficiency just for those vectors in category 5. Table 1 –Simulation results Interactions Training Medium Error Test Recognition Rate (%) MSE A 52920 B 55944 Structures C 52920 D 64584 E 138456 0.099626 0.099721 0.099698 0.099692 0.099267 94.5 94.5 94.5 94.5 945 0.150688 0.15633 0.15993 0.157894 0.155705 Despite the high recognition rate, the final results were compromised by data composition. In this way, it is suggested that a special treatment should be done to provide a better NN modelling. In this sense, it is proposed the creation of a “balanced” data set generating training and test vectors smoothly distributed. It was necessary to “create” data for those categories with few vectors (1,2,3,4) and “eliminate” some from category 5. It was defined that 40 and 10 vectors of each category will compose the training and testing data, respectively. First, it was “created” data for those vectors related to categories 1, 2, 3 and 4 by adding a gaussian noise with average zero and variance 0.001 to those existing vectors, generating new “noisy” version of the original ones. This operation was performed in order to generate to supply the NN with vectors highly correlated with original ones. Next, it was reduced the number of vectors from category 5 applying the LBG algorithm [Linde et al., 1980]. The LBG aims to design quasioptimum codebooks in vector quantization based on coding systems. It was applied to create 50 template vectors of category 5, which minimize the MSE of the representation of all vectors of category 5 by that 50 templates. Using the “balanced” data set, X and Y vectors were defined for training and testing. Again, it was simulated the structures A, B, C, D and E as previously defined. In Table 2, it is verified that structure E reached the best result for testing data generating almost the same recognition rate obtained before. Table 2 – Simulation results for “balanced” data Interactions Training Medium Error Test Recognition Rate (%) MSE A 79600 B 1031000 Structures C 4554800 0.024952 0.000195 0.015074 0.024986 0.024920 90.0 90.0 88.0 90.0 94.0 0.191919 0.189542 0.194383 0.180819 0.101601 D 96400 E 144400 As it was expected, using “balanced” data the recognition rates are worse than the previous experiment. However, the “balanced” data provided a better modelling for those categories that have only few samples in training set, improving the modelling capability of the NN. In all categories, the NN was able to forecast correctly the most part of testing vectors. For instance, in structure E only three vectors presented an incorrect forecast. In these vectors, it is noticed that one or more characteristics of the input vector (X) is zero. Specifically, one of the errors occurred for trips from TZ 48 to 58, which do not contain train and subway transportation systems. Similarly, it is observed the same behavior for TZ 56. Physically, these errors express how difficult it is to the NN to reach the expected result without the information about the transportation system. 5. CONCLUSIONS This paper presented a model that provides the representation of urban dynamic in travel forecast process, which the traditional modelling did not contemplate. We can not criticize the traditional modelling, since it was developed in the generation where the tools for simulation were related to that time. In this way, the traditional modelling has to be upgraded to the new era where new techniques and technology such as GIS, RS, NN become to fulfill the deficiency of traditional modelling. In order to establish the representation of the urban dynamic, it was explored the integration of RS, GIS and NN that was described and discussed. This integration was essential to concept a framework that combines raster data from RS, GIS capabilities and a non-linear mathematical formulation (NN) to express complex interactions between the land use and transportation systems. After the definition of NN formulation for modelling in spatial dimension, the case study for Boston Metropolitan Area proved model’s efficiency. The proposed formulation was capable to process the travel forecast modelling reaching a considerable recognition rate. It was applied a special treatment on the obtained data in order to generate a better generalization for NN modelling. Such treatment proved to be efficient to forecast trips in all the categories. The observed errors were related to cases where there was not sufficient information for NN establish the suitable forecast. One possible direction to solve this limitation is related to the aggregation level of the data. In these experiments, TZ definition was used considering that there was not any other suitable definition of HAS. In Boston case, TZ definition was designed for microanalysis generating units with a very limited area and specific characteristics of land use and transportation system. It is expected that using a macro aggregation level these kind of errors would be eliminated. Acknowledgments: The authors would like to express gratitude by the scholarship supported from Japanese Ministry of Education (Monbushou) for the development of this research. We also would like to thank Central Transportation Planning Staff (CTPS) of Boston for all support and travel data used in this research. REFERENCES Avery, T. E., Berlin, G. L. Fundamental of Remote Sensing and Airphoto Intrepretation. Maxwell Macmillan International, New York, USA [1990]. Fischer, M.M. From Conventional to Knowledge-based Geographical Information Systems. Computer, Environmental and Urban Systems. Vol. 18, no. 4, pp-233-242. USA [1994]. Fisher, M.M Spatial analysis: retrospect and prospect. In Geographical Information Systems: Principles and Technical issues. Eds Longely, P.A., Goodchild, M.F, Maguirre, D.J., Rhind, D.W. v. 1, Second Edition, USA [1999]. Foresman , T.W., Millete, T.L.; Integration of remote sensing and GIS technologies for planning. In Integration of Geographical Information Systems and Remote Sensing [1997]. Harris, B.; Land use Models in Transportation Planning: a review of past developments and current best practice; Delawae Valley Regional Planning Commision. (http://www.bts.gov/tmip/papers/landuse/compendium/dvrpe-appb.htm) [1996]. Khisty, C. J.; Transportation Engineering: an introduction; pp. 388; Prentice-Hall [1990]. Linde, Y., Buzo, A, Gray, R.M., “An Algorithm for vector Quantizers design”, IEEE Trans. On Communications, v.com-28, pp.84-95, January. [1980]. Morain, S.; Baros, S. L., Raster imagery in Geographical Information systems. On Word Press, p. 495; USA[1996] . Openshaw, S.; A concepts-rich approach to spatial analysis, theory generation, and scientific discorvery in GIS using massively parallel computing. In Innovations in GIS 1 [1991] Openshaw S; Blake, M; Wymer, C ; Using neurocomputing methods to classify Britain’s residential areas. In Innovations in GIS 2 [1995] Rodrigue, J. P. Parallel modelling and Neural Networks: an overview for transportation / land use systems. Transportation Research C, v. 5, n. 5, p. 259-271 [1997]. Taco, P.W., Yamashita, Y, Moreira, Souza, N, Dantas, A. Trip Generation Model: A New Conception Using Remote Sensing and Geographic Information Systems. Photogrammetrie Fernerkundung Geoinformation, Germany (to be published) [1999]. Wasserman, P. D. (1989) Neural Computing: theory and practice, p. 210; Van Nostrand Reinhold, New York, USA [1989].