Making the invisible - visible
Examples of Smart-card Data Analysis
Chen Zhong, Research Associate, CASA, UCL
GIS Seminar @ CEGE, UCL, LONDON, 22.10.2014

Contents
Background | Data | Urban Studies |
Examples of data analysis | Singapore | London | … Cities |
Conclusion | Insights | Future Work |

Background – Smart cards
https://en.wikipedia.org/wiki/List_of_smart_cards

Background – Automated fare collection (AFC) system
Boarding + alighting
[1] Pelletier, M. P., Trépanier, M., & Morency, C. (2011). Smart card data use in public transit: A literature review. Transportation Research Part C: Emerging Technologies, 19(4), 557-568.
[2] Google images

Background – Smart card data
Real-time, big volume, unstructured
Journey id | Card id | Card type | Mode | Boarding stop id | Alighting stop id | Trip_start | Trip_distance | Trip_time | Fare | Transit count
1 | 9****1 | Adult | train | STN Sengkang | STN Hougang | 8:30 | 2.4 | 6.417 | 0.23 | 0
2 | 9****2 | senior | bus | 64041 | 67009 or ? | 13:30 | 4.6 | 16.667 | 0.91 | 0 or ?

Background – Smart card data coverage
Note:
1. Number of stations is the number of stations with smart-card records generated.
2. The area of Beijing only counts the area enclosed by the 6th ring road for a fair comparison.
3. Total population refers to the world population review, http://worldpopulationreview.com/world-cities/, accessed in July 2015

Background – The value of ‘Big’ smart-card data
“big data is about three major shifts of mindset that are interlinked and hence reinforce one another. The first is the ability to analyse vast amounts of data about a topic rather than be forced to settle for smaller sets. The second is a willingness to embrace data’s real-world messiness rather than privilege exactitude. The third is a growing respect for correlations rather than a continuing quest for elusive causality.”
- Mayer-Schönberger V and Cukier K. (2013)
The reuse of the data – meaningless data can be used for untapped purposes.
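As a minimal sketch of how records like the two sample rows in the smart card data slide might be handled, the Python below models one journey as a small data class, treats a "?" entry as a missing value, and assigns each trip start to a 30-minute bin (the resolution used for the trip counts later in the talk). The names `Journey`, `parse_row`, and `half_hour_bin` are hypothetical and only mirror the column headers on the slide. The second row uses the "?" variant of the alighting stop and transit count. Units of distance, time, and fare are not stated in the source and are left uninterpreted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Journey:
    """One smart-card journey; field names mirror the slide's columns (assumed)."""
    journey_id: int
    card_id: str
    card_type: str
    mode: str
    boarding_stop: str
    alighting_stop: Optional[str]   # None when the alighting record is missing
    trip_start: str                 # "H:MM", as written on the slide
    trip_distance: float            # unit not stated in the source
    trip_time: float                # unit not stated in the source
    fare: float
    transit_count: Optional[int]    # None when unknown


def _optional(value: str) -> Optional[str]:
    """Map the placeholder '?' to None, otherwise return the trimmed string."""
    value = value.strip()
    return None if value == "?" else value


def parse_row(row: List[str]) -> Journey:
    """Convert one row of strings (in slide column order) into a Journey."""
    transit = _optional(row[10])
    return Journey(
        journey_id=int(row[0]),
        card_id=row[1],
        card_type=row[2],
        mode=row[3],
        boarding_stop=row[4],
        alighting_stop=_optional(row[5]),
        trip_start=row[6],
        trip_distance=float(row[7]),
        trip_time=float(row[8]),
        fare=float(row[9]),
        transit_count=None if transit is None else int(transit),
    )


def half_hour_bin(journey: Journey) -> int:
    """Index of the 30-minute bin (0..47) that contains the trip start."""
    hours, minutes = journey.trip_start.split(":")
    return (int(hours) * 60 + int(minutes)) // 30


# The two sample rows from the slide; row 2 uses the "?" (missing) variant.
ROWS = [
    ["1", "9****1", "Adult", "train", "STN Sengkang", "STN Hougang",
     "8:30", "2.4", "6.417", "0.23", "0"],
    ["2", "9****2", "senior", "bus", "64041", "?",
     "13:30", "4.6", "16.667", "0.91", "?"],
]

journeys = [parse_row(r) for r in ROWS]
# Journeys usable for origin-destination analysis need both ends of the trip.
complete = [j for j in journeys if j.alighting_stop is not None]
```

Keeping missing alighting stops as `None` rather than dropping the row lets boarding-only analyses (such as trip start counts) still use the record, while O-D analyses can filter on `complete`.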
Background – For urban studies
Social physics – identifying regularities and scaling laws
Transport research – understanding mobility behaviors and transit planning
Spatial data mining – inferring trip purpose, urban activities, and urban functions, ……
Urban geography – understanding urban dynamics and spatial interactions
Urban planning – linking urban features to mobility patterns

Background – For urban studies
Transportation data / spatiotemporal data
Built environment: urban functions, spatial structure
People: urban activities & movements

Examples – Singapore
Singapore is an island city-state in Southeast Asia with an area of 710.2 km². The current population of Singapore, including non-residents, is approximately 5.4 million.

Examples – About variability
Variability: random, diversity, variance, disorder, unpredictable, ...
Regularity: pattern, uniform principle, similarity, order, predictable, ……
Using variability to understand the diversity and dynamics of a city
- transit
- social
- urban

Singapore - One-week Smart-card Data
1 work day in 2011
More than 50% of the population uses the public transportation system
Travel from 117 MRT stations and 4599 bus stops
Generates more than 5,000,000 trips
Journey id | Card id | Card type | Mode | Boarding stop id | Alighting stop id | Trip_start | Trip_distance | Trip_time | Fare | Transit count
1 | 9****1 | Adult | train | STN Sengkang | STN Hougang | 8:30 | 2.4 | 6.417 | 0.23 | 0
2 | 9****2 | senior | bus | 64041 | 67009 | 13:30 | 4.6 | 16.667 | 0.91 | 0

Singapore - Variability in temporal distribution of trips (starting time)
Data: number of trips per 30 minutes
Usage: understand travel behavior; understand lifestyles in Singapore

Singapore - Variability in temporal patterns of (boarding) stations
Data: number of trips per 30 minutes at four stops
Usage: understand the travel purpose and the urban functions around the station
Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday

Singapore - Variability in clusters of stations by temporal patterns
Data: number of trips
per minute at one station
Method: correlation matrix, K-means clustering
Usage: inferring urban functions

Singapore - Variability in clusters of stations by flows
Data: O-D matrix of bus stops and metro stations
Method: complex network analysis, community detection method
Usage: partition of urban space by collective intra-city movement patterns; infer spatial structure
Note: nodes – each stop/station; edges – trips between two stops/stations; weight – the number of trips

Singapore - Variability in clusters of stations by flows
$W_i(x, y) = \frac{1}{d_{ij}(x, y)^{\lambda}}$

Travel Records
   | Sa | Sb | Sc | …
Sa | 0  | 10 | 2  | …
Sb | 20 | 0  | 3  | …
Sc | 8  | 6  | 0  | …
…

Network Properties
Index | Number of nodes | Number of edges | Average degree | Average strength | Average shortest path | Clustering centrality | Closeness centrality | …
2010 | 4599 + 107 | 621730 | 131.8342 | 645.5789 | 2.229015 | 0.2116035 | 1.170022e-06 | …
2011 | 4599 + 107 | 702052 | 148.866 | 788.577 | 2.196655 | 0.2238426 | 1.085218e-06 | …
2012 | 4599 + 117 | 725046 | 153.4164 | 801.2078 | 2.185142 | 0.2268748 | 1.161199e-06 | …

Spatial Properties
Point | X | Y | Cluster | …
Na | 387790 | 153759.4 | 1 | …
Na | 387852.4 | 153753.4 | 2 | …
Na | 387334.2 | 153485.6 | 3 | …
Na | 387339 | 153546.1 | 2 | …
… | … | … | … | …

Singapore - Variability in clusters of stations by flows (flow diagram)
Partition/neighborhood/region: a group of stations
Data: number of trips per minute at one station
Method: complex network analysis, community detection method
Usage: partition of urban space by collective intra-city movement patterns; infer spatial structure

Singapore - Variability in clusters of stations by flows (color-coded map)
Variability changes across spatial scales
Monday, Sunday

Singapore - Variability in clusters of stations by flows over years

Examples – London
Mapping – UG + BUS locations (not the updated one, some stops are missing)
Tube network + National Railway
Bus stop locations

London – Variability and regularity
Variability: random, diversity, variance, disorder, unpredictable, ...
Regularity: pattern, uniform principle, similarity, order, predictable, ……
Using variability and regularity to understand the diversity and dynamics of the city
- tube disruption
- travel behavior
- cities

London – Tube disruption (by Dr. Ed Manley)
Circle and District line part closure
From Edgware Road to Aldgate / Aldgate East
19th July 2012, 07:49 to 12:04
1234022 Oyster cards with a regular pattern during the disrupted time period travelled

No Change: Increased Travel Time
Greater than 2 SD above the mean increase on usual travel time for that Oyster card
Size equal to the proportion of users that regularly travel from the station during the time period, and travelled during the disruption

Origin Changes
Locations from where individuals changed from their usual origin station

Partial Switch to Bus
Locations from where users replaced a part of their usual journey with a bus journey

Total Proportion Disrupted
Proportion of all 79103 users identified as being disrupted from usual patterns
Upton Park, Dagenham East, Dagenham Heathway
78.9%, 76.7%, 76.4%

London – Tube disruption
Station | Line | Travel time | Origin | Destination | Mode switch | TOTAL
Tower Hill | Affected line | 51.6% | 4.2% | 6.7% | 16.3% | 78.9%
Upton Park | Affected line | 15.3% | 16.2% | 7.8% | 12.4% | 59.6%
Turnpike Lane | Unaffected line | 14.4% | 6.5% | 4.6% | 5.9% | 31.4%
West Ham | Affected line | 2.2% | 3.1% | 4.5% | 7.5% | 17.3%

Cities – Variability and Regularity
Variability: random, diversity, variance, disorder, unpredictable, ...
Regularity: pattern, uniform principle, similarity, order, predictable, ……
Using variability and regularity to understand the diversity and dynamics of cities

Cities – Variability and Regularity
Variability | Variability in regularity | Regularity | Less variability in regularity
Regularity equals a pattern, which could be a uniform principle, arrangement, or order that repeats and is reproducible, and can therefore be used as a basis for simulation or prediction. On the contrary … variability relates to diversity, disorder, variance, unpredictability…
Q1: is one city more regular than the others (in terms of mobility patterns)?
- maybe, still searching…
Q2: if there is a “variability in regularity” in all cities, is there a regularity in variability?
- maybe, still searching…

Cities – Temporal distribution of trip starting time

Cities – Temporal distribution of trip starting time

Insights and Future work
Variability (/regularity) exists in mobility patterns
• Over multiple days
• Between different locations
• Between passenger groups (in progress)
• Between different cities
Variability changes following rules (sub-linear function)
• Across spatial scales (to be verified)
• Across temporal scales
Future work
• Data quality and data coverage problems
• Integrated method for spatiotemporal data analysis
• Integrating the variability factors into transport models
• Exploring the potential of smart-card data … urban mobility data

Thanks
Contact: Chen ZHONG, c.zhong@ucl.ac.uk
Center for Advanced Spatial Analysis, University College London
22.10.2015
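A supplementary sketch for the O-D network slides in the Singapore example: the Python below treats the 3×3 travel-record matrix shown there (stations Sa, Sb, Sc) as a weighted directed network and computes a few of the listed indices. The function names and the definitions used here (degree as the number of incoming plus outgoing edges, strength as the sum of incoming plus outgoing trip counts) are assumptions for illustration; the slides do not state which definitions were applied.

```python
from typing import List

# O-D matrix from the "Travel Records" slide: OD[i][j] = trips from i to j.
STATIONS = ["Sa", "Sb", "Sc"]
OD = [
    [0, 10, 2],
    [20, 0, 3],
    [8, 6, 0],
]


def edge_count(od: List[List[int]]) -> int:
    """Number of directed edges, i.e. station pairs with at least one trip."""
    return sum(
        1
        for i, row in enumerate(od)
        for j, trips in enumerate(row)
        if i != j and trips > 0
    )


def degree(od: List[List[int]], k: int) -> int:
    """Assumed definition: outgoing plus incoming edges of station k."""
    outgoing = sum(1 for j, trips in enumerate(od[k]) if j != k and trips > 0)
    incoming = sum(1 for i, row in enumerate(od) if i != k and row[k] > 0)
    return outgoing + incoming


def strength(od: List[List[int]], k: int) -> int:
    """Assumed definition: outgoing plus incoming trip counts of station k."""
    outgoing = sum(trips for j, trips in enumerate(od[k]) if j != k)
    incoming = sum(row[k] for i, row in enumerate(od) if i != k)
    return outgoing + incoming


n = len(STATIONS)
degrees = {name: degree(OD, k) for k, name in enumerate(STATIONS)}
strengths = {name: strength(OD, k) for k, name in enumerate(STATIONS)}
average_degree = sum(degrees.values()) / n
average_strength = sum(strengths.values()) / n

for name in STATIONS:
    print(name, "degree:", degrees[name], "strength:", strengths[name])
print("edges:", edge_count(OD))
print("average degree:", average_degree)
print("average strength:", average_strength)
```

On a full-size network the same quantities would normally come from a graph library; the plain-list version is used here only to keep the definitions visible.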