Modeling City Size Data with a Double-Asymptotic Model (Tsallis q-entropy)

Deriving the two asymptotic coefficients (q, Y0) and the crossover parameter (kappa: κ) for 24 historical periods, 900-1970, from Chandler's data on the largest world cities in each period
checking that variations in the parameters for adjacent periods entail real urban system variation
and that these variations characterize historical periods
then testing hypotheses about how these variations tie in to what is known about world system interaction dynamics
Good lord, man, why would you want to do all this? That will be the story.

Why Tsallis q-entropy?

That part of the story comes out of network analysis: there is a new kid on the block beside scale-free and small-world models of networks, which are not very realistic (they are toy models).
Tsallis q-entropy is realistic (more later), but does it apply to social phenomena as a general probabilistic model?
The bet was, with Tsallis, that a generalized social circles network model would not only fit but help to explain q-entropy in terms of multiplicative effects that occur in networks when you have feedback.
That's the history of the paper in Physical Review E by DW, C. Tsallis, N. Kejzar, et al., and we won the bet.

So what is Tsallis q-entropy?

It is a physical theory and mathematical model of how physical phenomena depart from randomness (entropy) but also fall back toward entropy at sufficiently small scale.
But that's only one side of the story, played out between q = 1 (entropy) and q > 1 (multiplicative effects, as observed in power-law tendencies).
Breaking out of entropy toward power-law tails with slope 1/(1-q).
That story is in Physical Review E 2006 by DW, C. Tsallis, N. Kejzar, et al., for simulated feedback networks.

So what's the other side of the story?
Breaking out of entropy (q = 2, q = 4, etcetera) toward power-law tails with slope 1/(1-q)
In the first part we had breakout from q = 1, with increases in q that make the power-law slope less steep.

Ok, now you have figured out that as q → 1, toward an infinite slope, the q-entropy function converges to pure entropy, as measured by Boltzmann-Gibbs.
But that's not all, because there is another ordered state on the other side of entropy, where q (always ≥ 0) is less than 1!
While q > 1 tends to power-law and q = 1 converges to exponential (appropriate for BG entropy), q < 1 as it goes to 0 tends toward a simple linear function.

That story is told in the Tsallis q-entropy equation $Y_q \equiv Y_0 \, [1-(1-q)\, x/\kappa]^{1/(1-q)}$

Ok, so given x, the variable sizes of cities, then $Y_q$ is the q-exponential fitted to real data Y(x) by parameters $Y_0$, $\kappa$, and q.
And the q-exponential is simply the $e_q^{-x'} \equiv [1-(1-q)\, x']^{1/(1-q)}$ part of the function, with $x' = x/\kappa$, where it can be proven that $e_{q=1}^{-x'} \equiv e^{-x'}$, the ordinary exponential that goes with Boltzmann-Gibbs entropy.
Then q is the metric measure of departure from entropy, in our two directions, above or below 1.

Ok, so now we know what q means, but what about the parameters $Y_0$ and $\kappa$?
Well, remember: there are two asymptotes here, not just the asymptote to the power-law tail, but also the asymptote to the smallness of scale at which the phenomenon, such as "city of size x", no longer interacts with multiplier effects and may even cease to exist (are there cities with 10 people?)
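The three regimes described above (power-law tail for q > 1, exponential at q = 1, linear at q = 0) can be checked numerically. What follows is a minimal sketch, not the authors' fitting code: the function names `q_exponential_decay` and `log_log_slope` are hypothetical, and the cutoff to zero once the bracket turns non-positive (which can only happen for q < 1) is the usual q-exponential convention, assumed here rather than taken from the slides.

```python
import math


def q_exponential_decay(x, q, kappa=1.0, y0=1.0):
    """Evaluate Y_q(x) = y0 * [1 - (1 - q) * x / kappa] ** (1 / (1 - q)).

    Assumption: where the bracket is <= 0 (possible only for q < 1),
    the function is cut off at 0, the usual q-exponential convention.
    """
    if kappa <= 0:
        raise ValueError("kappa must be positive")
    if abs(q - 1.0) < 1e-12:
        # q = 1 is the Boltzmann-Gibbs limit: an ordinary exponential.
        return y0 * math.exp(-x / kappa)
    base = 1.0 - (1.0 - q) * x / kappa
    if base <= 0.0:
        return 0.0
    return y0 * base ** (1.0 / (1.0 - q))


def log_log_slope(q, x_low, x_high, kappa=1.0):
    """Slope of log Y_q against log x between two points on the curve."""
    y_low = q_exponential_decay(x_low, q, kappa)
    y_high = q_exponential_decay(x_high, q, kappa)
    return (math.log(y_high) - math.log(y_low)) / (
        math.log(x_high) - math.log(x_low)
    )


# q slightly above 1 is already close to exp(-x / kappa).
near_bg = q_exponential_decay(2.0, q=1.0001)
# q = 0 reduces to the straight line 1 - x / kappa.
linear = q_exponential_decay(0.25, q=0.0)
# For q = 2 the far tail has log-log slope close to 1 / (1 - q) = -1.
tail = log_log_slope(q=2.0, x_low=1e6, x_high=1e7)

print(near_bg, math.exp(-2.0))
print(linear)
print(tail, 1.0 / (1.0 - 2.0))
```

The tail slope is measured far from the crossover (x much larger than κ) because that is where the power-law asymptote applies; near x ≈ κ the curve is still bending between its two asymptotes.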
So, now let's look at the two asymptotes in the context of a cumulative distribution:
Y0 is the limit of all people in cities

Here is a curve that fits these two asymptotes:
[Figure: fitted curve; Y0 is the limit of all people in cities]

Here are three curves with the same Y0 and q but different κ
[Figure: three curves; axis ticks 100, 1000, 10000 and 100, 1000; Y0 is the limit of all people in cities]

So now you get the idea of how the curves are fit by the three parameters

[Figure: Cumulative City Populations, one curve per period from v900 to v1970; x axis: binlogged City Size Bins; marked levels: 24MIL, 3MIL, 420K, 55K]

One amazing feature in these fits is the estimate of Y0
[Figure: Cumulative City Populations, shown again]
China log population, log estimate Y0: urban population, and estimated % urban
(the estimates of Y0 are in exactly the right ratios to total population and %ages)
[Figure: log scale from 7 to 14 by year, 900 to 1970; Total population (labeled values 30M, 80M, 170M, .83B), Y0 estimates (labeled values 4M, 44M), and Percentages (2% to 7%)]

q runs test: 8 Q-periods (p=.06)

Parameter Estimates
Parameter   Estimate    Asymptotic Std. Error   95% CI Lower Bound   95% CI Upper Bound
q           .795        .448                    -1.133               2.724
k           229.307     95.434                  -181.314             639.928
Y           2471.785    493.159                 349.891              4593.679

Table 1: Example of bootstrapped parameter estimates for 1650
Parameter   Method           Estimate     Std. Error     95% CI Lower Bound   95% CI Upper Bound   95% Trimmed Range (Lower, Upper)
q           Asymptotic       1.953        .953           -2.146               6.052
q           Bootstrap(a,b)   .795         .094           .608                 .983                 .795, .795
k           Asymptotic       5.000        170.491        -728.564             738.564
k           Bootstrap(a,b)   229.307      6.854          215.592              243.022              229.307, 229.307
Y           Asymptotic       41800.846    1338728.666    -5718283.703         5801885.394
Y           Bootstrap(a,b)   2471.785     3.307          2465.167             2478.403             2471.785, 2471.785
a. Based on 60 samples.
b. Loss function value equals 4161.644.

[Figure: power-law and q-entropy fits, Pop (k) against Bin Size (k), axis ticks 1 to 100000; data and fitted curves for 900, 1000, 1300, 1350, 1400, 1450, 1500, 1800, 1900, 1950 and 1970, with power-law trend lines for 1350, 1400, 1450, 1500, 1800, 1900, 1950 and 1970. The ten power-law trend lines shown have exponents from -0.3933 to -1.8002 and R2 from 0.8639 to 0.9981. Average R2: power law fits .93, q entropy fits .984]

Figure 4: Variation in R2 fit for q to the q-entropy model – China 900-1970
Key: Mean value for runs test shown by dotted line.

commensurability & lowest bin convergence to Y0

Table 2: Correlations among the commensurate-ordering variables in Table 3
                            Pop      Y0       31.6K    Communalities
Total Chinese Population                               .88
Y0 Estimate                 .75**                      .95
Bin Estimate at 31.6K       .81**    .96**             .97
κ                           .70**    .81**    .90**    .91
* p < .05   ** p < .01

Uniformly converge to 10±2 thousand as smallest city sizes for all periods

City Systems China – Middle Asia – Europe
World system interaction dynamics

The basic idea of this series is to look at rise and fall of cities embedded in networks of exchange in different regions over the last millennium… and
How innovation or decline in one region affects the other
How cityrise and cityfall periods relate to the cycles of population and sociopolitical instability described by Turchin (endogenous dynamics in periods of relative closure)
How to expand models of historical dynamics from closed-period endogenous dynamics to
economic relationships and conflict between regions or polities, i.e., world system interaction dynamics

Turchin's secular cycle dynamic: China
[Figure: panels (a) Han China and (b) Tang China; Population, mln and Instability against Year]
Figure 8: Turchin secular cycles graphs for China up to 1100
Note: (a) and (b) are from Turchin (2005), with population numbers between the Han and Tang Dynasties filled in. Sociopolitical instability in the gap between Turchin's Han and Tang graphs has not been measured.

Example: Kohler on Chaco

Kohler, et al. (2006) have replicated such cycles for pre-state Southwestern Colorado for the pre-Chacoan, Chacoan, and post-Chacoan, CE 600–1300, for which they have "one of the most accurate and precise demographic datasets for any prehistoric society in the world." Secular oscillation correctly models those periods "when this area is a more or less closed system," but, just as Turchin would have it, not in the "open-systems" period, where it "fits poorly during the time [a 200 year period] when this area is heavily influenced first by the spread of the Chacoan system, and then by its collapse and the local political reorganization that follows." Relative regional closure is a precondition of the applicability of the model of endogenous oscillation. Kohler et al. note that their findings support Turchin's model in terms of being "helpful in isolating periods in which the relationship between violence and population size is not as expected."

Table 6: Total Chinese population oscillations and q ranges
Columns (endogenous secular population cycle): 'Early' pop. rise; 'Late' pop. rise; Population Maximum; Crash
Rows (q ranges): q~3 'abnormal'; q~1.7 'rigid'; q~1.5 Zipfian; q~1 'random'; q~.5 - .8 'chaotic'; q~0 'flee the cities'
Entries (year q; cell placement not recoverable from the source): 1000 1.37; 1450 1.50; 1500 1.34; 1925 1.39; 1300 0.85; 1350 0.85; 1400 1.24; 1700 1.00; 1750 1.29; 1900 1.14; 1100 1.72; 1850 1.85; 1575 1.35; 1600 1.48; 1970 1.49; 1150 1.4; 1550 1.04; 1950 1.06; 1200 0.54
Exceptions (labels: Economy, Captured, deurbanized): 1800 2.77; 1825 2.99; 1650 0.8; 1875 <1?; 1250 0.02

Sufficient statistics to include population and q parameters plus spatial distribution and network configurations of transport links among cities of different sizes and functions.
[Diagram labels: Population P; Rural and Urban Y0]

China – Middle Asia – Europe

The basic idea of the next series will be to measure the time lag correlation between variations of q in China and those in the Middle East/India, and Europe. This should provide evidence that q measures a city topology that relates to city function and to city growth, and evidence of diffusion from regions of innovation to regions of borrowing.

Figure 5: Chinese Cities, fitted q-lines
[Figure: natural-log transformed values (axis ticks 4 to 10) against bin and actual population size data, bins from 31.6 to 6310; one series per period, c900 to c1970]
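The time lag correlation proposed for the next series can be sketched as follows. This is an illustration under stated assumptions, not the planned analysis: the function names are hypothetical, both series are synthetic toy numbers (the follower is just the leader delayed by two periods), and the lag is counted in periods rather than years, so unevenly spaced periods such as Chandler's would first need to be put on a common time grid.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_a * var_b)


def lagged_correlation(leader, follower, lag):
    """Correlate leader[t] with follower[t + lag], for lag >= 0 periods."""
    if len(leader) != len(follower):
        raise ValueError("series must have the same length")
    if lag < 0 or lag >= len(leader) - 1:
        raise ValueError("lag out of range")
    early = leader[: len(leader) - lag]
    late = follower[lag:]
    return pearson(early, late)


def best_lag(leader, follower, max_lag):
    """Lag (in periods) at which the lagged correlation is highest."""
    return max(
        range(max_lag + 1),
        key=lambda lag: lagged_correlation(leader, follower, lag),
    )


# Synthetic toy data, NOT Chandler's or the fitted q values:
# a made-up q series for a "leading" region ...
leader_q = [1.0, 1.3, 1.6, 1.2, 0.7, 0.4, 0.9, 1.4, 1.8, 1.5, 1.1, 0.8]
# ... and a "following" region that repeats it two periods later.
follower_q = [0.9, 0.9] + leader_q[:-2]

for lag in range(5):
    print(lag, round(lagged_correlation(leader_q, follower_q, lag), 3))
print("best lag:", best_lag(leader_q, follower_q, max_lag=4))
```

A peak at a positive lag is only suggestive of diffusion from the leading region; with 24 or so periods per region, the correlations at neighboring lags would need confidence intervals before reading much into the ordering.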