Modeling City Size Data with a Double-Asymptotic Model
(Tsallis q-entropy)
Deriving the two Asymptotic Coefficients (q,Y0)
and the crossover parameter (kappa: κ)
for 24 historical periods, 900-1970
from Chandler’s data in the largest world cities in each
checking that variations in the parameters
for adjacent periods entail real urban system variation
and that these variations characterize historical periods
then testing hypotheses about how these variations
tie in to what is known about
World system interaction dynamics
good lord, man, why would you want to do all this?
That will be the story
[Figure: example curve with Y0=3228 and largest=530; labels: Angle u, ü=-Ou]
Why Tsallis q-entropy?
That part of the story comes out of network analysis
there is a new kid on the block
beside scale-free and small-world models of networks
which are not very realistic
Tsallis q-entropy is realistic (more later)
but does it apply to social phenomena
as a general probabilistic model?
The bet was, with Tsallis,
that a generalized social circles network model would
not only fit but help to explain q-entropy
in terms of multiplicative effects
that occur in networks
when you have feedback
That’s the history
of the paper in Physical Review E by DW, C. Tsallis, N. Kejzar, et al.
and we won the bet
So what is Tsallis q-entropy?
It is a physical theory and mathematical model of how
physical phenomena depart from randomness (entropy)
but also fall back toward entropy at sufficiently small scale
but that’s only one side of the story, played out between:
q=1 (entropy) and q>1, multiplicative effects
as observed in power-law tendencies
Breaking out of entropy (exponential)
toward power-law tails with slope 1/(1-q)
That story is in Physical Review E 2006 by DW, C. Tsallis, N. Kejzar, et al.
for simulated feedback networks
So what’s the other side of the story?
[Figure: curves for q=2, q=4, etcetera, breaking out of entropy (exponential) toward power-law tails with slope 1/(1-q)]
In the first part we had breakout from q=1, with increases in q that lower the slope.
Ok, now you have figured out that as q → 1, toward an infinite slope, the q-entropy function converges to pure entropy, as measured by Boltzmann-Gibbs.
But that’s not all, because there is another ordered state on the other side of
entropy, where q (always ≥ 0) is less than 1! While q > 1 tends to power-law and
q=1 converges to exponential (appropriate for BG entropy), q < 1 as it goes to 0
tends toward a simple linear function.
That story
is told in the Tsallis q-entropy equation
$Y_q \equiv Y_0\,[1-(1-q)\,x/\kappa]^{1/(1-q)}$
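As a worked check of the limits just described, here is a short sketch that uses only the equation above. The shorthand $u = x/\kappa$ is introduced here for convenience and is not part of the original slides.

```latex
\begin{align*}
Y_q &= Y_0\,[1-(1-q)\,u]^{1/(1-q)}, \qquad u = x/\kappa \\
\lim_{q\to 1} Y_q &= Y_0 \lim_{q\to 1}
   \exp\!\left(\frac{\ln[1-(1-q)\,u]}{1-q}\right) = Y_0\,e^{-u}
   && \text{(exponential, BG entropy)} \\
Y_{q=0} &= Y_0\,(1-u)
   && \text{(simple linear function)} \\
Y_q &\approx Y_0\,[(q-1)\,u]^{1/(1-q)} \quad (q>1,\ u \gg 1/(q-1))
   && \text{(power-law tail, log-log slope } 1/(1-q)\text{)}
\end{align*}
```

The second line uses $\ln(1-\varepsilon u)/\varepsilon \to -u$ as $\varepsilon = 1-q \to 0$.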
Ok, so given x, the variable sizes of cities, then Yq ≡ the q-exponential fitted to real data Y(x) by parameters Y0, κ,
and q. And the q-exponential is simply the $e_q^{-x'} \equiv [1-(1-q)\,x']^{1/(1-q)}$ part of the function, with $x' = x/\kappa$, where it can be proven
that $e_{q=1}^{-x'} \equiv e^{-x'}$, the exponential that goes with entropy. Then q is the
metric measure of departure from entropy, in our two
directions, above or below 1.
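A minimal Python sketch of the function just described may help make the three regimes concrete. The function names `q_exponential` and `y_q` are illustrative, not from the original slides. The cutoff to zero for q < 1 is the usual convention for the q-exponential and is assumed here rather than stated in the text.

```python
import math


def q_exponential(u, q):
    """Return [1 - (1 - q) * u] ** (1 / (1 - q)) for u >= 0.

    At q == 1 the expression is replaced by its limit exp(-u).
    For q < 1 the bracket turns negative at large u; the value is then
    taken to be 0 (assumed cutoff convention).
    """
    if abs(q - 1.0) < 1e-12:
        return math.exp(-u)
    base = 1.0 - (1.0 - q) * u
    if base <= 0.0:
        if q < 1.0:
            return 0.0
        raise ValueError("bracket is non-positive; u must be >= 0 when q > 1")
    return base ** (1.0 / (1.0 - q))


def y_q(x, y0, kappa, q):
    """Fitted curve Y_q(x) = Y0 * q-exponential of x / kappa."""
    return y0 * q_exponential(x / kappa, q)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to show the shapes.
    for q in (0.0, 0.5, 1.0, 1.5, 2.0):
        row = [round(y_q(x, 1000.0, 100.0, q), 2) for x in (0, 50, 100, 400)]
        print(q, row)
```

At q = 0 the curve is the straight line Y0 (1 - x/κ), at q = 1 it is the exponential, and for q > 1 it decays as a power law.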
Ok, so now we know what q means, but what about the
parameters Y0 and κ? Well, remember: there are two
asymptotes here, not just the asymptote to the power-law tail, but the asymptote to the smallness of scale at
which the phenomenon, such as “city of size x”, no longer
interacts with multiplier effects and may even cease to
exist (are there cities with 10 people?)
So, now let’s look at the two asymptotes in the context of a
cumulative distribution:
Y0 is the limit of all people in cities
Here is a curve that fits these two asymptotes:
Y0 is the limit of all people in cities
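The two asymptotes can be checked numerically. The sketch below assumes q > 1 and uses hypothetical parameter values. It shows the curve approaching Y0 as x gets small, and the log-log slope approaching 1/(1-q) as x gets large. The helper names are illustrative.

```python
import math


def y_q(x, y0, kappa, q):
    """Y0 * [1 - (1 - q) * x / kappa] ** (1 / (1 - q)), for q > 1 and x >= 0."""
    return y0 * (1.0 + (q - 1.0) * x / kappa) ** (1.0 / (1.0 - q))


def loglog_slope(x, y0, kappa, q, step=1.01):
    """Finite-difference slope of log Y against log x near x."""
    y_lo = y_q(x, y0, kappa, q)
    y_hi = y_q(x * step, y0, kappa, q)
    return (math.log(y_hi) - math.log(y_lo)) / math.log(step)


# Hypothetical values, not fitted to any period in the data.
y0, kappa, q = 3000.0, 200.0, 1.5

small_x_value = y_q(0.001, y0, kappa, q)        # approaches Y0
tail_slope = loglog_slope(1e7, y0, kappa, q)    # approaches 1 / (1 - q)

print(small_x_value, y0)
print(tail_slope, 1.0 / (1.0 - q))
```

The crossover between the two regimes sits around x ≈ κ/(q-1), which is why κ is called the crossover parameter.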
Here are three curves with the same Y0 and q but different κ
[Figure: three fitted curves on log axes (ticks 100 to 10000), all converging to Y0, the limit of all people in cities]
So now you get the idea of how the
curves are fit by the three parameters
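To illustrate how three parameters can be fit to binned data, here is a minimal sketch under stated assumptions: plain least squares on untransformed values, and a coarse grid over q and κ. For a fixed (q, κ) the best Y0 has a closed form because the model is linear in Y0. The slides do not state the loss function or optimizer actually used, so this is not the authors' procedure. The data below are synthetic.

```python
import math


def q_exp(u, q):
    """Decaying q-exponential [1 - (1 - q) * u] ** (1 / (1 - q)) for u >= 0."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(-u)  # limit at q = 1
    base = 1.0 - (1.0 - q) * u
    return base ** (1.0 / (1.0 - q)) if base > 0.0 else 0.0


def fit_grid(xs, ys, q_grid, kappa_grid):
    """Return (sse, q, kappa, y0) minimising squared error over the grid."""
    best = None
    for q in q_grid:
        for kappa in kappa_grid:
            shape = [q_exp(x / kappa, q) for x in xs]
            denom = sum(s * s for s in shape)
            if denom == 0.0:
                continue
            # Least-squares Y0 for this (q, kappa).
            y0 = sum(y * s for y, s in zip(ys, shape)) / denom
            sse = sum((y - y0 * s) ** 2 for y, s in zip(ys, shape))
            if best is None or sse < best[0]:
                best = (sse, q, kappa, y0)
    return best


# Synthetic data generated from known (hypothetical) parameters.
xs = [31.6, 50.1, 79.4, 126.0, 200.0, 316.0, 501.0, 794.0]
ys = [3000.0 * q_exp(x / 200.0, 1.5) for x in xs]

q_grid = [i / 10 for i in range(5, 31)]          # 0.5, 0.6, ..., 3.0
kappa_grid = [50.0 * j for j in range(1, 11)]    # 50, 100, ..., 500

sse, q, kappa, y0 = fit_grid(xs, ys, q_grid, kappa_grid)
print(q, kappa, y0)
```

A grid search is slow and coarse, but it makes the roles of the parameters visible: q sets the tail, κ the crossover, and Y0 the overall level.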
[Figure: Cumulative City Populations, one curve per period from 900 to 1970 (v900 to v1970); x-axis: City Size Bins (binlogged); curve annotations: 24MIL, 3MIL, 420K, 55K]
One feature in these fits is the estimate of Y0 (total urban populations)
China log population, log estimate Y0: urban population, and estimated % urban
(the estimates of Y0 are in exactly the right ratios to total population and %ages)
[Figure: series Total population, Y0 estimates, and Percentages (2% to 7%) by year, 900 to 1970; y-axis: log population, ticks 7 to 14; annotated values: .83B, 170M, 80M, 30M, 44M, 4M]
q runs test: 8 Q-periods (p=.06)

Table 1: Example of bootstrapped parameter estimates for 1650

Parameter Estimates
                                       95% Confidence Interval    95% Trimmed Range
Parameter     Estimate    Std. Error   Lower Bound  Upper Bound   Lower Bound  Upper Bound
Asymptotic
  q               .795          .448        -1.133        2.724
  κ            229.307        95.434      -181.314      639.928
  Y0          2471.785       493.159       349.891     4593.679
Bootstrap (a, b)
  q               .795          .094          .608         .983          .795         .795
  κ            229.307         6.854       215.592      243.022       229.307      229.307
  Y0          2471.785         3.307      2465.167     2478.403      2471.785     2471.785
a. Based on 60 samples.
b. Loss function value equals 4161.644.

[A second set of values is overlaid on this table in the source; its row labels are not recoverable.]
  q              1.953          .953        -2.146        6.052
  κ              5.000       170.491      -728.564      738.564
  Y0         41800.846   1338728.666  -5718283.703  5801885.394
[Figure: city-size data and fitted curves on log-log axes; x-axis: Bin Size (k); y-axis: Pop (k); series: data and fitted q-entropy curves for 900, 1000, 1300, 1350, 1400, 1450, 1500, 1800, 1900, 1950, and 1970, plus power-law trend lines for 1350, 1400, 1450, 1500, 1800, 1900, 1950, and 1970. The power-law trend lines have exponents between -0.39 and -1.80 and R² between 0.86 and 0.998.]
Average R²: power-law fits .93, q-entropy fits .984
Figure 4: Variation in R² fit for q to the q-entropy model – China 900-1970
Key: Mean value for runs test shown by dotted line.
[Figure 4 axes: year, 900 to 2000; R², 0.94 to 1]
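The slides mention a runs test with the mean as the dividing line but do not say which variant was used. As a sketch only, here is the common Wald-Wolfowitz runs test about the mean, with a normal approximation for the p-value. The function name and the example series are hypothetical. The slides' own result (8 periods, p=.06) is not reproduced here.

```python
import math


def runs_test(values):
    """Wald-Wolfowitz runs test about the mean.

    Returns (runs, z, p) where p is the two-sided normal-approximation
    p-value. Values exactly equal to the mean are dropped.
    """
    mean = sum(values) / len(values)
    above = [v > mean for v in values if v != mean]
    n1 = sum(above)              # count above the mean
    n2 = len(above) - n1         # count below the mean
    if n1 == 0 or n2 == 0:
        raise ValueError("need values on both sides of the mean")
    # A new run starts whenever the side of the mean changes.
    runs = 1 + sum(1 for a, b in zip(above, above[1:]) if a != b)
    n = n1 + n2
    expected = 2.0 * n1 * n2 / n + 1.0
    variance = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n * n * (n - 1.0))
    z = (runs - expected) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return runs, z, p


if __name__ == "__main__":
    # Hypothetical series: long stretches on one side of the mean.
    print(runs_test([1.4, 1.5, 1.3, 0.6, 0.2, 0.8, 0.9, 1.5, 1.6, 2.8, 3.0]))
```

Fewer runs than expected (negative z) means neighboring periods tend to sit on the same side of the mean, which is what distinct historical periods would look like. With only two dozen periods, exact tables are preferable to the normal approximation.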
commensurability & lowest bin convergence to Y0
Table 2: Correlations among the commensurate-ordering variables in Table 3
                            Pop      Y0       31.6K    Communalities
Total Chinese Population                               .88
Y0 Estimate                 .75**                      .95
Bin Estimate at 31.6K       .81**    .96**             .97
κ                           .70**    .81**    .90**    .91
* p < .05  ** p < .01
At city bin sizes of 10±2 thousand or greater, 95% of
the city distribution (as a fraction of Y0) is present:
this is the effective (smallest) city size for all
periods
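The 95% statement can be illustrated by inverting the fitted curve: solve Y_q(x)/Y0 = 0.95 for x. The sketch below uses the q and κ that appear in Table 1 for 1650. It assumes κ is in thousands like the city-size bins, which the slides do not state explicitly. The function name is illustrative.

```python
import math


def size_at_fraction(fraction, kappa, q):
    """Solve fraction = [1 - (1 - q) * x / kappa] ** (1 / (1 - q)) for x."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    if abs(q - 1.0) < 1e-12:
        return -kappa * math.log(fraction)   # exponential limit
    return kappa * (1.0 - fraction ** (1.0 - q)) / (1.0 - q)


# q and kappa as they appear in Table 1 (1650); units assumed to be thousands.
x95 = size_at_fraction(0.95, kappa=229.307, q=0.795)
print(round(x95, 2))
```

For these values the result is about 11.7, which falls inside the 10±2 thousand range quoted above.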
[Figure: panels Population and Dynamics; Y0=3228, largest=530; labels: Angle u, ü=-Ou]
Table 6: Total Chinese population oscillations and q
Columns: q ranges; Endogenous secular population cycle (‘Early’ pop. rise, ‘Late’ pop. rise, Population Maximum, Crash); Exceptions (Economy Exception, Captured deurbanized)
Rows (q ranges): q~3 ‘abnormal’; q~1.7 ‘rigid’; q~1.5 Zipfian; q~1 ‘random’; q~.5 - .8 ‘chaotic’; q~0 ‘flee the cities’
[Cell placement is not recoverable from the flattened layout; the entries are listed by period with their q values.]
1000 1.37    1100 1.72    1150 1.4     1200 0.54
1250 0.02    1300 0.85    1350 0.85    1400 1.24
1450 1.50    1500 1.34    1550 1.04    1575 1.35
1600 1.48    1650 0.8     1700 1.00    1750 1.29
1800 2.77    1825 2.99    1850 1.85    1875 <1?
1900 1.14    1925 1.39    1950 1.06    1970 1.49
Turchin’s secular cycle dynamic: China
[Figure: panels (a) Han China and (b) Tang China; x-axis: Year; y-axes: Population, mln and Instability]
Figure 8: Turchin secular cycles graphs for China up to 1100
Note: (a) and (b) are from Turchin (2005), with population numbers between the Han
and Tang Dynasties filled in. Sociopolitical instability in the gap between Turchin’s
Han and Tang graphs has not been measured.
Example: Kohler on Chaco
Kohler, et al. (2006) have replicated such cycles for pre-state
Southwestern Colorado for the pre-Chacoan, Chacoan, and post-Chacoan periods, CE 600–1300, for which they have “one of the most
accurate and precise demographic datasets for any prehistoric
society in the world.” Secular oscillation correctly models those
periods “when this area is a more or less closed system,” but, just as
Turchin would have it, not in the “open-systems” period, where it “fits
poorly during the time [a 200 year period] when this area is heavily
influenced first by the spread of the Chacoan system, and then by its
collapse and the local political reorganization that follows.”
Relative regional closure is a precondition of the applicability of the
model of endogenous oscillation.
Kohler et al. note that their findings support Turchin’s model in terms of
being “helpful in isolating periods in which the relationship between
violence and population size is not as expected.”
City Systems
China – Middle Asia – Europe
World system interaction dynamics
The basic idea of this series is to look at rise and fall of cities
embedded in networks of exchange in different regions over the last
millennium… and
How innovation or decline in one region affects the other
How cityrise and cityfall periods relate to the cycles of population and
sociopolitical instability described by Turchin (endogenous dynamics
in periods of relative closure)
How to expand models of historical dynamics from closed-period
endogenous dynamics to economic relationships and conflict
between regions or polities, i.e., world system interaction dynamics
Sufficient statistics to include population and q parameters plus spatial distribution and
network configurations of transport links among cities of different sizes and functions.
[Diagram: Population P, Rural and Urban; Y0]
China – Middle Asia - Europe
The basic idea of the next series will be to
measure the time lag correlation between
variations of q in China and those in the
Middle East/India, and Europe.
This will provide evidence that q provides a
measure of city topology that relates to city
function and to city growth, and that
diffusions from regions of innovation to
regions of borrowing
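A time-lag correlation of this kind can be sketched as follows. It assumes both regions have q estimates for the same sequence of periods, and it counts the lag in periods rather than years. The periods in this data are unevenly spaced (25 to 100 years), so a real analysis would need to handle that. The function names and the example series are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_a * var_b)


def lagged_correlation(leader, follower, lag):
    """Correlate leader[t] with follower[t + lag], lag >= 0 in periods."""
    if lag < 0 or lag > len(leader) - 2:
        raise ValueError("lag out of range")
    if lag == 0:
        return pearson(leader, follower)
    return pearson(leader[:-lag], follower[lag:])


if __name__ == "__main__":
    # Hypothetical q series: region B repeats region A two periods later.
    region_a = [1.0, 1.5, 0.8, 2.0, 1.2, 0.9, 1.7]
    region_b = [1.1, 1.3] + region_a[:-2]
    for lag in range(0, 4):
        print(lag, round(lagged_correlation(region_a, region_b, lag), 3))
```

Scanning over lags and looking for the peak gives a rough estimate of how many periods one region trails the other.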
[Figure: fitted q-lines, one per period from 900 to 1970 (c900 to c1970); x-axis: bin and actual population size data, bins 31.6 to 6310; transforms: natural log]
Figure 5: Chinese Cities, fitted q-lines
China – Middle Asia - Europe
The basic idea of this series of
[Diagram: Population P, Rural and Urban; Y0; q and κ]