"Hidden geometry of urban areas and interpretation of highly inhomogeneous, incomplete databases" Dmitry Volchenkov Project FP7 – ICT-318723 MATHEMACS The city is a spatial network, providing alternatives to our movements … ..and thus converting a space pattern into a pattern of relationships Street maps of London, showing poverty and wealth by color coding, Charles Booth (1840-1916), London, UK Royal Saltworks of Chaux Arc-et-Senans, France Claude-Nicolas Ledoux (1736 –1806) Plan for the Ideal City of Chaux City of Karlsruhe: A network of large avenues A modernization program of Paris commissioned by Napoléon III and led by the Seine prefect, Baron GeorgesEugène Haussmann, between 1852 and 1870. The more isolated is a place, the worse is the situation in that How to spot isolation? We used to live in Euclidean space Volchenkov, D., Ph. Blanchard, Mathematical Analysis of Urban Spatial Networks, © Springer ISBN 978-3540-87828-5, [3564 downloads since January, 2009] In order to quantify isolation, we have to use such the structural characteristics that fit the Euclidean space structure! Euclidean space structure of a graph First –passage time of RW Commute time First – passage time of RW Random Walks: What is that? Physical model Mathematical meaning P1 P2 P3 P4 1 , a permutation matrix Symmetry of route choice: the equivalent paths are equiprobable if , 0, then Aut T, 0, T ij 1, a stochastic matrix j RW is a stochastic automorphism expressing structural symmetries: Equivalent walks are equiprobable A “path integral” graph distance All possible paths are taken into account, some paths are more preferable (for RW) then others. Geometry of Data & Graphs • Path integral sums over all RWs to compute a propagator. • Propagator is the Green’s function of the diffusion operator: T, 0, G Tn n 1 " L1" 1 T • The Drazin generalized inverse (the group inverse w.r.t. 
matrix multiplication) preserves symmetries of the Laplace operator: LGL L, GLG G, G, L 0 • Given two distributions x,y, their scalar product: x, y T x, G y • The (squared) norm of a distribution: x 2 T x, Gx • The Euclidean distance: xy 2 T x 2 T y 2 T 2x, y T Probabilistic geometry of graphs Graph A T D1 A, D diag deg( 1),deg( N ) ˆ D1 2 AD 1 2 , T ˆ , N , T l l l l 1 s ,i s , j s 2 1 s 1,i 1, j N Gij 2 ,i 1,i 1 2 N ,i 1,i 1 N 1 1 N , 12,i i 2, j 1, j 1 2 , N, j 1, j 1 N First-passage time: i 2 T 2 N 1 k ,i i H ij 2 k 2 1 k 1,i i 1 N 1 Commute time: Kij i j 2 T k, j k ,i 1, j 1 k k 2 1,i 1 k N 2 2,i 2, j , 3, j deg i 2E i , j T ei , Ge j PR N 1 j , 3,i i Can we see the first-passage times? Tax assessment value of land ($) Manhattan, 2005 (Mean) First passage time (Mean) first-passage times in the city graph of Manhattan SoHo Federal Hall 10 East Village 100 1,000 Bowery East Harlem 5,000 10,000 Can we see the first-passage times? (Mean) first-passage times in the city graph of Manhattan Federal Hall SoHo East Village Bowery East Harlem 10 100 1,000 5,000 10,000 Log of the mean annual household income (×$1,000, 2003) Federal Hall SoHo East Village Bowery East Harlem 300 100 60 40 20 Log of the annual prison expenditures ( ×$1,000, 2003) Federal Hall 100 SoHo 250 East Village 1,000 2,500 Bowery 10,000 East Harlem 50,000 Why are mosques located close to railways? NEUBECKUM: IsolationM oschee 10 log first - passage time (M oschee) 12 dB M in first - passage time first - passage time (Kirche) IsolationKirche 10 log 3 dB M in first - passage time Social isolation vs. structural isolation Can we hear first-passage times? F. Liszt Consolation-No1 P. Tchaikovsky, Danse Napolitaine V.A. Mozart, Eine Kleine Nachtmusik Bach_Prelude_BWV999 R. Wagner, Das Rheingold (Entrance of the Gods) Can we hear first-passage times? 
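The first-passage and commute times from the "Probabilistic geometry of graphs" slide can be sketched numerically as follows. This is a minimal illustration, assuming an undirected, connected graph given by a symmetric adjacency matrix and using NumPy; the function name `random_walk_geometry` is hypothetical and not taken from the talk. It follows the spectral formulas for the symmetrized transition matrix, with the first eigenvector entering only through the stationary distribution deg(i)/2E.

```python
import numpy as np

def random_walk_geometry(A):
    """Spectral sketch of the random-walk geometry of an undirected graph.

    Assumes A is a symmetric adjacency matrix of a connected graph.
    Returns (first_passage, commute, coords):
      first_passage[i] : squared norm of node i (mean first-passage time to i
                         for walkers started from the stationary distribution)
      commute[i, j]    : squared distance between nodes i and j (commute time)
      coords[i]        : coordinates of node i in the (N-1)-dimensional space
    """
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    pi = deg / deg.sum()                      # stationary distribution deg(i) / 2E

    # Symmetrized transition matrix D^{-1/2} A D^{-1/2}
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    T_hat = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

    # Eigen-decomposition, sorted so that lambda_1 = 1 comes first
    lam, psi = np.linalg.eigh(T_hat)
    order = np.argsort(lam)[::-1]
    lam, psi = lam[order], psi[:, order]

    # Coordinates psi_{k,i} / (psi_{1,i} * sqrt(1 - lambda_k)), k = 2..N,
    # using psi_{1,i} = sqrt(pi_i) to avoid eigenvector sign ambiguity
    scale = 1.0 / np.sqrt(1.0 - lam[1:])
    coords = psi[:, 1:] * scale[None, :] / np.sqrt(pi)[:, None]

    first_passage = (coords ** 2).sum(axis=1)
    diff = coords[:, None, :] - coords[None, :, :]
    commute = (diff ** 2).sum(axis=2)
    return first_passage, commute, coords

# Usage: a path graph 0 - 1 - 2
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
fp, K, X = random_walk_geometry(A)
print(fp)        # approximately [2.5, 0.5, 2.5]: the end nodes are more isolated
print(K[0, 2])   # approximately 8.0: commute time between the two end nodes
```

In this toy example the central node has the smallest first-passage time and the end nodes the largest, which is the sense in which the talk uses first-passage time as a measure of structural isolation.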
[Figure: first-passage time vs. recurrence time]

Tonality: the hierarchy of harmonic intervals

Tonality of music
The basic pitches for the E minor scale are "E", "F#", "G", "A", "B".
[Figure: The recurrence time vs. the first-passage time over 804 compositions of 29 Western composers]

Principal components by random walks
Representations of graphs & databases in the probabilistic geometric space are essentially multidimensional!
A 1000 × 1000 data table (or a connected graph of 1000 nodes) is embedded into 999-dimensional space!
Dimensions are unequal: the $k$-th dimension scales as $\sim \frac{1}{1 - \lambda_k}$, $k = 2, \ldots, N$.
Kernel principal component analysis (KPCA) with the kernel $G = \sum_{n} T^{n} = \text{“}L^{-1}\text{”} = \frac{1}{1 - T}$

Nonlinear principal components by random walks
Example: MILCH (CH → K) = MILK
• Matrix of lexical distances, $A$: $D(\text{milch}, \text{milk}) = 2/5$
• Stochastic normalization, $T$: $T(l_1, l_2) = \frac{D(l_1, l_2)}{\sum_{l \in \text{list of words}} D(l_1, l)}$
• Kernel PCA: $G = \sum_{n} T^{n} = \text{“}L^{-1}\text{”} = \frac{1}{1 - T}$
In contrast to the covariance matrix, which best explains the variance in the data with respect to the mean, the kernel $G$ traces out all higher-order dependencies among data entries.

Integration of databases for forecasting future trends
• Real-world databases are inhomogeneous & incomplete:
• The major statistics come after WWII;
• The number of polities is ever growing.
[Figure: Database A along the time axis]
[Figure: relevant databases (Database A, Database B, Database C) along the time axis; transitions between states; a graph of states]

How can we save Europe?
"Crisis for Europe as trust hits record low"
Is there a common trend for European countries?

No common trends for EU
Maddison historical database: GDP per capita
Kalman filter based on GDP data
[Figure: Kalman filter scheme; labels: database, training sequence, hypothesis (fitting parameters), present, forecasting (?), time]
+ Average over many evolution scenarios

No common trends for EU … if we play the previous history
[Figure: Scenario #1 and Scenario #2, each showing a high trend and a low trend]
Economic recovery after WWII came at different rates in different parts of Europe.

Maddison’s database retells the story of the recovery after WWII
• Industrial countries have an edge on competitors if there is no war (GDP variations are limited to ± $500/year)
• Traditional capital shelters thrive for larger variations

Maddison’s database predicts bankruptcy for the countries that remained uninvolved in the global recovery process.
[Figure label: IRAQ]

To catch up with new tendencies, we have to add more databases
• Evolution of political regimes: Democracy/Autocracy indices
• Inequality: top income shares; the largest historical database available concerning the evolution of income inequality

Polity IV tells us that
• Six criteria are enough to fully describe a governing regime;
• These criteria describe a political state, no matter whether this state is presently occupied or not;
• The historical data on governing policies are well documented (no interruptions, almost no “noise”);
• It is possible to quantify the difference between political regimes.

[Diagram: the six criteria and their categories]
• Regulation of chief executive recruitment: Unregulated; Transitional; Regulated
• Openness of executive recruitment: Closed; Dual executive designation; Dual executive election; Open
• Competitiveness of executive recruitment: Selection; Dual hereditary/competitive
• Executive constraints: Unlimited authority; Intermediate; Slight to moderate limitations; Intermediate; Substantial limitations; Intermediate; Executive parity
• Regulation of participation: Unregulated; Multiple identity; Sectarian; Restricted; Regulated
• Competitiveness of participation: Unregulated; Repressed; Suppressed; Factional; Transitional; Competitive
+ Interruption (foreign occupation) + Interregnum (anarchy) + Transitional = 7,566 “states”

Polity IV
tells us that
“Political distance” is the minimal number of political changes (reforms) required to convert the political system of one country into that of another.
[Figure: Trends in Governance in 1810]
[Figure: Trends in Governance in 2012]
The world is always in transience.

Polity IV tells us that
• There should be a positive feedback, reinforcing the multiplication of polities: $\frac{dN}{dt} \propto N$
• We witness the very beginning of a chain reaction process (of atomization of the polity landscape).
The number of polities is ever growing.

The World Top Income database tells us that
If the GDP gain substantially exceeds or lags behind the mean (red line), it apparently comes at the cost of increasing inequality.

Global synchronization of inequality dynamics
Parabolic fit(!)
Rapidly rising inequality marks wars, conflicts and instabilities, and instabilities multiply polities.

232 configurations have been observed since 1800
[Figure labels: "Tajikistan", 2013; "Nepal", 1945; "Korea North", 2013; "Libya", 2010; Foreign interruption; "Cuba", 2005; "Thailand", 2013; "Korea South", 2013; "United States", 2013; "Czech Republic", 2013; "Estonia", 2013]
New configurations arise from time to time.

Random walks on the graph of political regimes
• Transition matrix between types of governance (17,000 historical transitions)
• Each political regime has its own dynamics for GDP and IPLC
• Process starts from the actual data (GDPPC & IPLC) for 2013
+ Averaging over all collected histories
Most transitions happen within the groups of authoritarian states and presidential republics, while liberal democracies and dictatorships are quite “sticky”.

A common state insists on a common economic and political destiny for its citizens. However, the actual trends of different economic groups might be statistically inconsistent.

Polities proliferation score
A possible splitting of a country is visible as statistically inconsistent trends.
[Figure: Greece vs. Russia; expected number of countries]
Main factors resulting in multiplying scores:
1. Inequality (stretches bandwidth of boxes);
2. Authoritarian regimes are short-lived, quickly transforming to other modes of authoritarianism, provoking instability.

There can be a common European trend
• if polities are allowed to split within the EU without wars;
• if the workforce is allowed to migrate freely.
[Figure: Germany vs. Greece]

Back to the City-States?
Strong inequality worsens prospects; authoritarian governance worsens prospects.
[Figure: USA vs. China]

IPLC ~ O(GDPpc^2)
“In slowly growing economies, past wealth naturally takes on disproportionate importance, because it takes only a small flow of new savings to increase the stock of wealth steadily and substantially.” (Thomas Piketty, Capital in the Twenty-First Century (2014))

Battle in Asia, concord in Europe
[Figure: China (red) vs. Indonesia (blue)]
[Figure: Germany (dark) vs. Austria (light)]

Conclusions
• The city converts a space pattern into a pattern of relationships.
• RWs represent stochastic automorphisms of a structure; summing up all RWs → probabilistic geometry.
• RWs can be used in order to combine different (incomplete) databases.
• Kernel Principal Component Analysis handles high-order dependencies in data.

Some references
• D.V., Ph. Blanchard, Mathematical Analysis of Urban Spatial Networks, © Springer Series Understanding Complex Systems, Berlin / Heidelberg, ISBN 978-3-540-87828-5, 181 pages (2009).
• D.V., Ph. Blanchard, “Introduction to Random Walks on Graphs and Databases”, © Springer Series in Synergetics, Vol. 10, Berlin / Heidelberg, ISBN 978-3-642-19591-4 (2011).
• Volchenkov, D., “Markov Chain Scaffolding of Real World Data”, Discontinuity, Nonlinearity, and Complexity 2(3), 289–299 (2013). DOI: 10.5890/DNC.2013.08.005.
• Volchenkov, D., Jean-René Dawin, “Musical Markov Chains”, International Journal of Modern Physics: Conference Series 16(1), 116–135 (2012). DOI: 10.1142/S2010194512007829.
• Volchenkov, D., Ph. Blanchard, J.-R. Dawin, “Markov Chains or the Game of Structure and Chance. From Complex Networks, to Language Evolution, to Musical Compositions”, The European Physical Journal Special Topics 184, 1–82, © Springer Berlin / Heidelberg (2010).
• Volchenkov, D., “Random Walks and Flights over Connected Graphs and Complex Networks”, Communications in Nonlinear Science and Numerical Simulation 16 (2011) 21–55, http://dx.doi.org/10.1016/j.cnsns.2010.02.016 (2010).