Inter-regional migration in Europe: a spatial
interaction modelling perspective
Adam Dennett*, Kimberley Claydon†, Pablo Mateos†
*Centre for Advanced Spatial Analysis
†Department of Geography
University College London
Presentation to the British Society for Population Studies –
Annual Conference, 9th September 2011
Presentation Outline
• Introduction to the ENFOLD-ing project and motivation for
this work
• Modelling migration - spatial interaction models
• A comparison of modelling methodology – entropy
maximising models vs. statistical models
• A spatial interaction modelling perspective on inter-regional
migration in Europe – work in progress…
ENFOLD-ing
• Explaining, Modelling and Forecasting Global Dynamics - 5
year EPSRC project
• Understand global dynamics through a model-based analysis
and to develop an associated forecasting capability
• Four substantive areas as key ‘understanding’ and modelling
challenges in the context of globalisation:
• trade, and economic development
• migration, and global demography
• security
• development aid
Migration and global demography stream
• Challenge – year 1: to assemble the data and tools to enable
us to build a dynamic model of global migration flows –
principally international, country-to-country flows but also
interested in city/regional scales…
• Data:
• ‘Slow dynamics’ – population counts, migrant stocks,
other demographic data
• ‘Fast dynamics’ – migration flows
• Supplementary data – distance matrices, language
associations, economic data, trade flows, flight data,
surname associations, currency, colonial ties etc…
• Tools:
• Theories of migration and associated models
Modelling migration
• Two main camps: “Probabilistic” and “Deterministic” – Plane
(1982); Stillwell (1978)
• Probabilistic: rate-based Markov-style demographic
migration models – e.g. DEMIFER project, ONS SNPP
• Yr 2000, Pop = 10,000, In-migrants = 500, Rate = 0.05
• Yr 2001, Pop = 10,500, Rate = 0.05, Est In-migrants = 525
• Reliable, historical time-series a prerequisite
• Deterministic: examine causal relationships between
predictor variables and migration moves – e.g. increasing
population = increasing numbers of migrants; increasing
distance between origin and destination = decreasing
numbers of migrants.
• Global data are patchy, so our initial focus is on deterministic
migration models…
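The rate-based example above amounts to carrying a base-year in-migration rate forward to the next year's population. A minimal Python sketch of that arithmetic, using only the figures on the slide (the variable names are illustrative, not from the talk):

```python
# Rate-based (probabilistic) estimate using the slide's figures.
pop_2000 = 10_000
in_migrants_2000 = 500

# Observed in-migration rate in the base year.
rate = in_migrants_2000 / pop_2000

# Population given on the slide for the following year.
pop_2001 = 10_500

# Apply the historical rate to the new population.
est_in_migrants_2001 = rate * pop_2001

print(rate, est_in_migrants_2001)
```

The estimate is only as good as the assumption that the rate is stable, which is why a reliable historical time-series is a prerequisite.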
Spatial Interaction Models
• Most common deterministic models used in migration analysis
• Based on gravity models (Zipf, 1946):
$$M_{ij} = \frac{P_i P_j}{d_{ij}} \tag{1}$$
• Estimated migration ($M_{ij}$) between origin $i$ and destination $j$ is
proportional to the product of the populations at the origin ($P_i$) and
destination ($P_j$) and inversely proportional to the distance between
them ($d_{ij}$)
• In a migration model, the populations can be replaced by total
out-migrants from each origin ($O_i$) and total in-migrants to each
destination ($D_j$), and the frictional effect of distance decays
exponentially
So:
$$M_{ij} = K O_i D_j e^{-\beta d_{ij}} \tag{2}$$
where:
$$K = \frac{\sum_i \sum_j M_{ij}}{\sum_i \sum_j O_i D_j e^{-\beta d_{ij}}} \tag{3}$$
• The distance decay parameter ($\beta$) is calibrated within the model…
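Equations 2 and 3 can be sketched in a few lines of Python. The three-zone flows and distances and the value of beta below are made up for illustration, and the sketch assumes intra-zonal (diagonal) cells are excluded from the sums, which the equations themselves do not state:

```python
import math

# Hypothetical 3-zone system (illustrative numbers, not from the talk).
flows = [[0, 120, 30],
         [100, 0, 60],
         [40, 70, 0]]
dist = [[0, 10, 25],
        [10, 0, 18],
        [25, 18, 0]]
beta = 0.05  # assumed value; in practice this is calibrated

n = len(flows)
O = [sum(row) for row in flows]                              # out-migrants per origin
D = [sum(flows[i][j] for i in range(n)) for j in range(n)]   # in-migrants per destination
T = sum(O)                                                   # total observed flow

# Assumption: only inter-zonal pairs are modelled.
pairs = [(i, j) for i in range(n) for j in range(n) if i != j]

# Equation 3: K rescales the model so predicted flows sum to the observed total.
K = T / sum(O[i] * D[j] * math.exp(-beta * dist[i][j]) for i, j in pairs)

# Equation 2 for every origin-destination pair.
M = {(i, j): K * O[i] * D[j] * math.exp(-beta * dist[i][j]) for i, j in pairs}

print(sum(M.values()), T)
```

Only the overall total is constrained here; individual row and column totals are free to differ from the observed ones.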
Calibrating spatial interaction models
• Different techniques can be used to calibrate the parameters of
SIMs and this has led to a noticeable bifurcation in approaches…
1. Entropy Maximising spatial interaction models
• Developed after pioneering work by Wilson (1971)
• Used in migration models by Stillwell (1978), Pooler (1994),
Plane (1982), Fotheringham (1983)
• Use mathematical programs (usually bespoke – coded from
scratch in Fortran, VB, Java etc.) to calibrate parameters
through computational search algorithms
2. Statistical spatial interaction models
• OLS regression, Poisson Regression and log-linear models
• Used in migration modelling by Willekens (1983), Flowerdew
(2010), Boyle et al. (1998), Cohen et al. (2008), Mayda (2010),
Abel (2010), Raymer (2007)
• Calibrated using regression algorithms available in most
off-the-shelf statistical software packages – R, SPSS, Stata etc.
Spatial interaction models – best approach?
• Advocates from both camps of spatial interaction modelling
frequently justify their method as preferable
• For anyone new to migration modelling the choice of which
route to take is unclear
Question:
• Is one approach preferable to the other in terms of:
a) model predictions?
b) other model outputs (e.g. parameter information)?
c) ease of application?
• To answer these questions we turn to an empirical
example…
Spatial Interaction Models – an empirical example
Inter-regional (NUTS2) migration – Austria, 2006

Flows (rows = origin, columns = destination)

Origin   AT11   AT12   AT13   AT21   AT22   AT31   AT32   AT33   AT34     Oi
AT11        0   1131   1887     69    738     98     31     43     19   4016
AT12     1633      0  14055    416   1276   1850    388    303    159  20080
AT13     2301  20164      0   1080   1831   1943    742    674    407  29142
AT21       85    379   1597      0   1608    328    317    469    114   4897
AT22      762   1110   2973   1252      0   1081    622    425    262   8487
AT31      196   2027   3498    346   1332      0   2144    821    274  10638
AT32       49    378   1349    310    851   2117      0    630    106   5790
AT33       87    424    978    490    670    577    546      0    569   4341
AT34       33    128    643    154    328    199    112    587      0   2184
Dj       5146  25741  26980   4117   8634   8193   4902   3952   1910  89575

Distance (rows = origin, columns = destination)

Origin   AT11   AT12   AT13   AT21   AT22   AT31   AT32   AT33   AT34
AT11        0    103     84    221    132    215    247    391    505
AT12      103      0     46    217    130    141    201    344    454
AT13       84     46      0    250    159    186    244    388    498
AT21      221    217    250      0     92    152     93    195    306
AT22      132    130    159     92      0    125    122    262    376
AT31      215    141    186    152    125      0     82    208    315
AT32      247    201    244     93    122     82      0    145    259
AT33      391    344    388    195    262    208    145      0    114
AT34      505    454    498    306    376    315    259    114      0

The slide also shows the figure 16001 beneath the distance matrix.
Spatial Interaction Models – an empirical example
• Data in the flow matrix can be modelled by using the data in
the distance matrix and re-scaling this information subject to
marginal constraints in the flow matrix
• Using the statistical modelling approach, an additive Poisson
regression model which does this would take the form:
$$\ln M_{ij} = \lambda + \lambda_i^{O} + \lambda_j^{D} + \beta d_{ij} \tag{4}$$
where:
• $\lambda + \lambda_i^{O} + \lambda_j^{D}$: unsaturated log-linear model –
overall effect + origin & destination main effect parameters; these are
categorical predictors, equivalent to dummy variables, and constrain
the estimates
• $\beta d_{ij}$: continuous predictor variable multiplied by the $\beta$
parameter
Spatial Interaction Models – an empirical example
• The Poisson model can be run in R using the glm() function
(part of the base stats package) with a command similar to:
    # Poisson regression with a log link: Origin and Destination enter
    # as factors (dummy variables), Dij as a continuous covariate
    AustriaExp <- glm(Data ~ Origin + Destination + Dij + offset(log(Offset)),
                      family = poisson(link = "log"), data = Austria)
The offset in the model is a matrix with 0s in the
diagonal cells and 1s in all other cells to force the
modelled diagonals to = 0
• glm() will calibrate a series of parameters –
dummy parameters for the origin and destination main
effects, an overall main effect and a slope parameter
associated with the continuous distance variable…
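The glm command above expects one row per origin-destination pair. A minimal Python sketch of how square flow and distance matrices might be reshaped into that long format; the column names follow the R command, but the three regions, their values and the exact layout of the `Austria` data frame are assumptions:

```python
# Hypothetical 3-region example; labels and numbers are illustrative.
regions = ["A", "B", "C"]
flows = [[0, 120, 30],
         [100, 0, 60],
         [40, 70, 0]]
dist = [[0, 10, 25],
        [10, 0, 18],
        [25, 18, 0]]

rows = []
for i, origin in enumerate(regions):
    for j, destination in enumerate(regions):
        rows.append({
            "Origin": origin,              # categorical predictor
            "Destination": destination,    # categorical predictor
            "Data": flows[i][j],           # observed flow (response)
            "Dij": dist[i][j],             # continuous distance predictor
            "Offset": 0 if i == j else 1,  # 0 on the diagonal, 1 elsewhere
        })

for row in rows:
    print(row)
```

Written out as a CSV, a table like this could be read into R and passed to glm() with Origin and Destination treated as factors.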
Spatial Interaction Models – an empirical example
Poisson model estimates (rows = origin, columns = destination), with the
origin main effect ($\lambda_i^O$) in the last column and the destination
main effect ($\lambda_j^D$) in the last row. AT11 is the reference
category for both factors and has no listed value.

Origin     AT11   AT12   AT13   AT21   AT22   AT31   AT32   AT33   AT34  λ_i^O
AT11          0    979   2261    104    338    191     87     44     14  (ref)
AT12       1027      0  14365    503   1614   1602    584    299     96  1.545
AT13       2921  17686      0    949   3149   2733   1020    516    164  2.441
AT21        174    799   1225      0    932    630    591    416    132  0.6992
AT22        450   2044   3239    743      0   1003    601    314     97  0.9487
AT31        329   2639   3657    653   1304      0   1166    674    222  1.29
AT32        147    946   1340    602    767   1145      0    644    201  0.7427
AT33         74    482    677    422    400    660    642      0    986  1.195
AT34         24    164    229    142    132    232    213   1049      0  0.9887
λ_j^D     (ref)  1.497  2.185 0.1878 0.6643 0.7426 0.2133 0.6677 0.3999

Overall effect ($\lambda$) = 6.2050
R2 = 0.975
$\beta$ = -0.007915
• $M_{12} = 979 = \exp(6.2050 + 0 + 1.497 + (-0.007915 \times 103) + …)$
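The worked example can be checked directly. The sketch below uses the parameter values listed on the slide and assumes that AT11 is the reference category (0) for both factors and that 1.545 is the origin parameter for AT12. The terms that are legible already reproduce 979 to rounding, so whatever follows the trailing "+" on the slide is presumably zero.

```python
import math

intercept = 6.2050   # overall effect
beta = -0.007915     # distance decay parameter
d_11_12 = 103        # distance between AT11 and AT12

# AT11 -> AT12: origin AT11 is the reference (0), destination AT12 = 1.497
m_12 = math.exp(intercept + 0 + 1.497 + beta * d_11_12)

# AT12 -> AT11: origin AT12 = 1.545 (assumed), destination AT11 is the reference (0)
m_21 = math.exp(intercept + 1.545 + 0 + beta * d_11_12)

print(round(m_12), round(m_21))  # prints 979 1027
```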
Spatial Interaction Models – an empirical example
• The multiplicative form of the Poisson model above is very similar
to the gravity model in Equation 2 (with the negative sign now carried
by $\beta$ itself):
• $M_{ij} = K O_i D_j e^{\beta d_{ij}}$ – in log form the ‘main effects’ of
this model are
$$\ln M_{ij} = \ln K + \ln O_i + \ln D_j \tag{6}$$
• But in log form the main effects of the Poisson model are
$$\ln M_{ij} = \ln T + \ln\frac{O_i}{T} + \ln\frac{D_j}{T} \tag{7}$$
where:
$$T = \sum_i \sum_j M_{ij} \tag{8}$$
• These two main effects models produce identical results
• When we incorporate space back into the model, the Poisson
model ≠ the gravity model as each main effect parameter is a
constraint (gravity only has an overall K constraint).
• Including space, the entropy maximising equivalent of the Poisson
model is a doubly constrained spatial interaction model…
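Why the two main effects models coincide can be shown in two lines. The sketch assumes K is defined as in Equation 3 with the distance term switched off (the exponential equal to 1) and the sums taken over all i and j, so that the row totals and the column totals each add up to T:

```latex
\begin{align*}
K &= \frac{\sum_i \sum_j M_{ij}}{\sum_i \sum_j O_i D_j}
   = \frac{T}{\left(\sum_i O_i\right)\left(\sum_j D_j\right)}
   = \frac{T}{T^2} = \frac{1}{T}, \\
\ln K + \ln O_i + \ln D_j
  &= \ln\frac{O_i D_j}{T}
   = \ln T + \ln\frac{O_i}{T} + \ln\frac{D_j}{T}.
\end{align*}
```

Both forms therefore predict $O_i D_j / T$ for every cell, the familiar independence estimate.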
Spatial Interaction Models – an empirical example
• The doubly constrained entropy maximising spatial
interaction model equivalent to the Poisson model takes the
form:
$$M_{ij} = A_i O_i B_j D_j e^{\beta c_{ij}} \tag{8}$$
where
$$A_i = \frac{1}{\sum_j B_j D_j e^{\beta c_{ij}}} \tag{9}$$
and
$$B_j = \frac{1}{\sum_i A_i O_i e^{\beta c_{ij}}} \tag{10}$$
• Model programmed in VBA – constraints (equivalent to main
effects ‘dummy’ parameters in Poisson model) calculated
using iterative procedure (Senior, 1979)
• 𝛽 parameter calibrated using Newton-Raphson routine
(other routines available – see Batty, 1972 for thorough
comparison…)
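The iterative procedure for the balancing factors and a search for beta can be sketched as follows. This is not the VBA program described above: bisection is swapped in for the Newton-Raphson routine, and the calibration criterion (modelled total travel cost equals observed total travel cost) is an assumption. The four-zone data are synthetic, generated from a known beta so the search has something to recover.

```python
import math

def balance(O, D, cost, beta, iterations=500):
    """Doubly constrained model: iterate A_i and B_j (Equations 9 and 10)."""
    n = len(O)
    # Deterrence term; diagonal set to 0 so no intra-zonal flow is modelled.
    f = [[0.0 if i == j else math.exp(beta * cost[i][j]) for j in range(n)]
         for i in range(n)]
    A = [1.0] * n
    B = [1.0] * n
    for _ in range(iterations):
        A = [1.0 / sum(B[j] * D[j] * f[i][j] for j in range(n)) for i in range(n)]
        B = [1.0 / sum(A[i] * O[i] * f[i][j] for i in range(n)) for j in range(n)]
    return [[A[i] * O[i] * B[j] * D[j] * f[i][j] for j in range(n)]
            for i in range(n)]

def total_cost(M, cost):
    """Sum of flow times cost over all cells."""
    n = len(M)
    return sum(M[i][j] * cost[i][j] for i in range(n) for j in range(n))

def calibrate_beta(O, D, cost, observed_cost, lo=-2.0, hi=2.0, steps=60):
    """Bisection on beta until modelled total cost matches the observed one."""
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if total_cost(balance(O, D, cost, mid), cost) < observed_cost:
            lo = mid   # modelled moves too short: beta must be less negative
        else:
            hi = mid
    return (lo + hi) / 2.0

# Synthetic 4-zone system (zones on a line; not the Austrian data).
x = [0.0, 1.0, 3.0, 7.0]
cost = [[abs(p - q) for q in x] for p in x]
true_beta = -0.3
a = [2.0, 5.0, 3.0, 4.0]
b = [1.0, 2.0, 4.0, 3.0]
observed = [[0.0 if i == j else 100 * a[i] * b[j] * math.exp(true_beta * cost[i][j])
             for j in range(4)] for i in range(4)]
O = [sum(row) for row in observed]
D = [sum(observed[i][j] for i in range(4)) for j in range(4)]

beta_hat = calibrate_beta(O, D, cost, total_cost(observed, cost))
M = balance(O, D, cost, beta_hat)
print(beta_hat)
```

Bisection is slower than Newton-Raphson but needs no derivative, which keeps the sketch short; it relies on the modelled total cost rising steadily as beta increases.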
Spatial Interaction Models – an empirical example
Entropy maximising model results (rows = origin, columns = destination)
R2 = 0.977
$\beta$ = -9.51213

Origin   AT11   AT12   AT13   AT21   AT22   AT31   AT32   AT33   AT34     Oi
AT11        0    924   2386     92    336    169     70     30      8   4016
AT12      967      0  14980    415   1495   1490    474    206     55  20080
AT13     3068  18400      0    806   3019   2559    836    359     95  29142
AT21      159    687   1085      0   1079    677    670    427    112   4897
AT22      457   1940   3189    846      0   1082    624    278     71   8487
AT31      301   2525   3531    694   1412      0   1325    667    184  10638
AT32      125    805   1156    689    817   1329      0    691    178   5790
AT33       55    358    508    449    373    685    707      0   1207   4341
AT34       15    102    144    127    102    203    196   1295      0   2184
Dj       5146  25741  26980   4117   8634   8193   4902   3952   1910  89575

Poisson model results (rows = origin, columns = destination)

Origin   AT11   AT12   AT13   AT21   AT22   AT31   AT32   AT33   AT34     Oi
AT11        0    979   2261    104    338    191     87     44     14   4016
AT12     1027      0  14365    503   1614   1602    584    299     96  20080
AT13     2921  17686      0    949   3149   2733   1020    516    164  29142
AT21      174    799   1225      0    932    630    591    416    132   4897
AT22      450   2044   3239    743      0   1003    601    314     97   8487
AT31      329   2639   3657    653   1304      0   1166    674    222  10638
AT32      147    946   1340    602    767   1145      0    644    201   5790
AT33       74    482    677    422    400    660    642      0    986   4341
AT34       24    164    229    142    132    232    213   1049      0   2184
Dj       5146  25741  26980   4117   8634   8193   4902   3952   1910  89575
Spatial Interaction Models – an empirical example
• Reasons for the slight difference between the results of the
entropy maximising model and the Poisson model?
• It’s all in the 𝛽 parameter !
• Entropy model calibrates parameter as 𝛽 = -9.51213
• Poisson model calibrates parameter as 𝛽 = -0.00791
• But quirk of entropy program means all distances needed to
be divided by 1000… so, multiply Poisson parameter by 1000
and 𝛽 = -7.91533 – similar to the Entropy value
• If we plug this value into the entropy model (something
which is much easier to do with a home-made bespoke
model – the opposite can’t be done in R with the Poisson
model) – hey presto, identical results!
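The "plug the Poisson beta into the entropy model" step can be tried on the Austrian data from the flow and distance matrices. The sketch below is a plain Python re-implementation under assumptions, not the authors' program: it uses an exponential deterrence function on the unscaled distances, a zero diagonal, a fixed number of balancing iterations, and the Poisson estimate -7.91533 / 1000.

```python
import math

regions = ["AT11", "AT12", "AT13", "AT21", "AT22", "AT31", "AT32", "AT33", "AT34"]
flows = [
    [0, 1131, 1887, 69, 738, 98, 31, 43, 19],
    [1633, 0, 14055, 416, 1276, 1850, 388, 303, 159],
    [2301, 20164, 0, 1080, 1831, 1943, 742, 674, 407],
    [85, 379, 1597, 0, 1608, 328, 317, 469, 114],
    [762, 1110, 2973, 1252, 0, 1081, 622, 425, 262],
    [196, 2027, 3498, 346, 1332, 0, 2144, 821, 274],
    [49, 378, 1349, 310, 851, 2117, 0, 630, 106],
    [87, 424, 978, 490, 670, 577, 546, 0, 569],
    [33, 128, 643, 154, 328, 199, 112, 587, 0],
]
dist = [
    [0, 103, 84, 221, 132, 215, 247, 391, 505],
    [103, 0, 46, 217, 130, 141, 201, 344, 454],
    [84, 46, 0, 250, 159, 186, 244, 388, 498],
    [221, 217, 250, 0, 92, 152, 93, 195, 306],
    [132, 130, 159, 92, 0, 125, 122, 262, 376],
    [215, 141, 186, 152, 125, 0, 82, 208, 315],
    [247, 201, 244, 93, 122, 82, 0, 145, 259],
    [391, 344, 388, 195, 262, 208, 145, 0, 114],
    [505, 454, 498, 306, 376, 315, 259, 114, 0],
]
beta = -0.00791533  # Poisson estimate quoted on the slide, divided by 1000

n = len(regions)
O = [sum(row) for row in flows]                              # row totals
D = [sum(flows[i][j] for i in range(n)) for j in range(n)]   # column totals

# Exponential deterrence; zero on the diagonal so diagonals are modelled as 0.
f = [[0.0 if i == j else math.exp(beta * dist[i][j]) for j in range(n)]
     for i in range(n)]

# Iterate the balancing factors (Equations 9 and 10).
A = [1.0] * n
B = [1.0] * n
for _ in range(2000):
    A = [1.0 / sum(B[j] * D[j] * f[i][j] for j in range(n)) for i in range(n)]
    B = [1.0 / sum(A[i] * O[i] * f[i][j] for i in range(n)) for j in range(n)]

M = [[A[i] * O[i] * B[j] * D[j] * f[i][j] for j in range(n)] for i in range(n)]

# Compare with the Poisson table: AT11 -> AT12 is 979, AT12 -> AT11 is 1027.
print(round(M[0][1]), round(M[1][0]))
```

If the equivalence holds, the printed values should sit very close to the Poisson estimates, differing only through rounding of the quoted beta.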
Interim conclusions
• Poisson regression and entropy maximising spatial
interaction models produce identical results when the
distance decay parameter is the same
• The only difference between the two approaches is in the
calibration of the distance decay parameter
• Statistical packages such as R and SPSS will calibrate these
parameters automatically using methods which are not fully
documented.
• R uses the ‘Iteratively Reweighted Least Squares’ algorithm,
which produces maximum likelihood estimates comparable to the
Newton-Raphson routine (Green, 1984), and will provide
parameter estimates for as many dummy variables and
covariates as specified
Interim conclusions
• Interpretation of parameters produced in R can be difficult –
whilst different coding schemes can be chosen, an intuitive
scheme such as Raymer’s (2007) ‘total reference category’
cannot be chosen
• Bespoke entropy maximising program offers more flexibility
but with the drawback of computer programming knowledge
required
• Calibration of more than 1 parameter in the entropy model is
considerably more complicated than just 1 (for non-expert
programmers / mathematicians!)
Modelling inter-regional migration in Europe
• Experimentation with migration models led us to exploring
inter-regional migration in Europe – reliable data for model
calibration
• Could our experimentation offer any new perspectives or fill
gaps in data?
Question:
• Can we use inter-regional, intra-country data to effectively
model inter-regional, inter-country flows in a post-Schengen
open-border Europe?
• This is still very much work in progress, but…
European NUTS2 regional system
Data collected for countries in 2006 –
collated for the DEMIFER project
Doubly constrained models – national results
Country code  Country          R2     Generalised β (power function)
FI            Finland          0.996  -0.754
SE            Sweden           0.974  -0.771
AT            Austria          0.972  -0.747
HU            Hungary          0.963  -0.567
SK            Slovakia         0.948  -0.773
NL            Netherlands      0.936  -1.279
DK            Denmark          0.930  -0.969
NO            Norway           0.919  -0.814
BG            Bulgaria         0.901  -0.825
CZ            Czech Republic   0.889  -0.807
UK            United Kingdom   0.884  -0.927
PL            Poland           0.877  -1.068
CH            Switzerland      0.788  -0.867
BE            Belgium          0.772  -1.049
RO            Romania          0.745  -0.763
DE            Germany          0.715  -0.760
IT            Italy            0.699  -0.718
ES            Spain            0.621   0.154
FR            France           0.549   1.093
• Can national parameters be
used and applied to regions?
• 𝛽 parameter closer to positive
= migrants less deterred by
distance
• Noticeable variation, e.g.
Netherlands – highest
negative 𝛽
• Spain and France 𝛽 values
unreliable – they are positive
(which would indicate
propensity to migrate
increases with distance) but
poor fits mean in reality this is
unlikely to be the situation
• Regional parameters may
provide better model inputs…
Origin/destination specific
• We can decompose the national 𝛽 parameters to regional
parameters by calibrating them separately for origins and
destinations…
$$M_{Ij} = A_I B_j O_I D_j d_{ij}^{\beta_I} \tag{11}$$
$$M_{iJ} = A_i B_J O_i D_J d_{ij}^{\beta_J} \tag{12}$$
after Stillwell (1978)
• Equivalent Poisson models are
$$\ln M_{ij} = \lambda + \lambda_i^{O} + \lambda_j^{D} + \lambda_i^{O} \ast \ln \beta d_{ij} \tag{13}$$
$$\ln M_{ij} = \lambda + \lambda_i^{O} + \lambda_j^{D} + \lambda_j^{D} \ast \ln \beta d_{ij} \tag{14}$$
• Whilst model flow estimates are similar (not identical due to
algorithm issues discussed), parameters from Poisson model are
difficult to interpret as 𝛽 interacting with dummy variables –
consequently we chose entropy models as our vehicle...
[Figure: Variation in distance decay parameters calibrated on
intra-country (internal), inter-regional flows, 2006. Panel
annotations: R2 = 0.941 and R2 = 0.944]
Multi-level spatial interaction models
Notation                                                  Description
$M^{IJ}$                                                  Country level migration matrix
$M_{ij}$                                                  NUTS2 region level migration matrix
$M_{ij}^{II}$                                             Intra-country, inter-NUTS2 regional matrix
$O^I = \sum_{J \neq I} M^{IJ} = M^{I+}$                   Origin/row totals at inter-country level
$D^J = \sum_{I \neq J} M^{IJ} = M^{+J}$                   Destination/column totals at inter-country level
$O_i = \sum_{j \neq i} M_{ij}^{II} = M_{i+}^{II}$         Origin/row totals at NUTS2 level within country
$D_j = \sum_{i \neq j} M_{ij}^{JJ} = M_{+j}^{JJ}$         Destination/column totals at NUTS2 level within country
$O_i^I = \sum_{j \notin J} M_{ij}^{IJ} = M_{i+}^{IJ}$     Origin/row totals at NUTS2 level where NUTS2 region not a member of country
$D_j^J = \sum_{i \notin I} M_{ij}^{IJ} = M_{+j}^{IJ}$     Destination/column totals at NUTS2 level where NUTS2 region not a member of country

• Specifying multi-level constraints in an entropy maximising model
means we can model unknown flows incorporating the maximum
amount of known information…
• These constraints can be estimated for a crude version of the model
using $O_i$ and $D_j$ distributions applied to $O^I$ and $D^J$ totals
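One reading of the crude estimate is that each region's share of its country's internal out-migration (or in-migration) is applied to the country's international total. A minimal Python sketch under that assumption; the region names, numbers and the `apportion` helper are hypothetical:

```python
# Hypothetical country with three NUTS2 regions (illustrative numbers only).
internal_out = {"R1": 400, "R2": 100, "R3": 500}   # O_i: internal out-migrants
internal_in = {"R1": 300, "R2": 200, "R3": 500}    # D_j: internal in-migrants
country_out_total = 2000   # O^I: moves from this country to other countries
country_in_total = 1500    # D^J: moves into this country from other countries

def apportion(distribution, total):
    """Split a country-level total across regions by their internal shares."""
    share_base = sum(distribution.values())
    return {region: total * value / share_base
            for region, value in distribution.items()}

crude_out = apportion(internal_out, country_out_total)  # crude regional origin totals
crude_in = apportion(internal_in, country_in_total)     # crude regional destination totals

print(crude_out)
print(crude_in)
```

The apportioned totals sum back to the country totals by construction, so they can serve as row and column constraints in the multi-level model.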
Multi-level model results
• Full matrix of flows for
2006 modelled (Eq 11 &
12) using O/D specific 𝛽
parameter estimates from
internal flows and crude 𝑂𝑖
𝐷𝑗 estimates
• In most cases the model under-predicts internal flows,
suggesting too many internal migrants are
distributed internationally:
country border effects
under-estimated
• UK, DE, PL, RO – model
over-predicts: border
effects over-estimated
Country code  Country          Avg gross error           Avg gross error
                               (origin-specific beta)    (destination-specific beta)
AT            Austria              -957.670                  -896.175
BE            Belgium              -625.557                  -662.539
BG            Bulgaria            -1073.268                 -1220.228
CH            Switzerland          -839.091                 -1060.700
CZ            Czech Republic      -1017.628                 -1128.391
DE            Germany               101.323                   114.369
DK            Denmark             -3058.586                 -2268.791
ES            Spain                -375.229                  -494.283
FI            Finland             -3280.723                 -3421.611
FR            France               -780.847                  -747.982
HU            Hungary             -2538.885                 -2536.447
IE            Ireland             (no values shown)
IT            Italy                -595.699                  -593.918
NL            Netherlands          -890.721                  -981.864
NO            Norway              -1666.206                 -1766.215
PL            Poland                185.229                   136.879
RO            Romania               329.574                    97.110
SE            Sweden              -2212.665                 -2306.671
SI            Slovenia            -2018.675                 -2017.708
SK            Slovakia            -1177.098                 -1097.057
UK            United Kingdom        424.536                   434.086
(unlabelled final row)           -13455.477                -13429.681
Conclusions
• Poisson migration models and entropy maximising migration
models using distance to distribute flows produce identical
migration predictions given identical distance decay parameters
• Differences in outputs from our experiments are down to
algorithms used to calibrate 𝛽 parameters – trade off between
proprietary software ‘black box’ and additional complexity
involved and knowledge required to build bespoke software
• Entropy maximising models calibrated on internal migration data
can produce estimates of international (intra-EU), inter-regional
flows – a multi-level SIM framework enables known information to
be built into constraints
• In our trial model, despite open EU borders, internal migration
under-predictions in the model suggest border effects are
under-estimated for most countries. Model predicts too many migrants
flowing between countries (with notable exceptions) – internal
inter-regional migration is a poor predictor of international
inter-regional migration
Future work
• Develop multi-level SIM framework fully so internal
migration flows within EU SIM constrained to known
country level information on internal migration (which
should improve inter-country estimates)
• Distance decay parameter has large effect on model outcome
– where this cannot be calibrated directly (due to insufficient
data) can we model the parameter using other covariates?
• Developing a hierarchy of model constraints – if total
migrant flows not available, what are the next best
constraints? Total populations? Migrant stocks? GDP?
References
• Abel, G.J. (2010), 'Estimation of international migration flow tables in Europe', Journal of the Royal Statistical Society: Series A (Statistics in Society).
• Batty, M. and Mackie, S. (1972), 'The calibration of gravity, entropy, and related models of spatial interaction', Environment and Planning, 4 (2), 205-33.
• Boyle, P.J., Flowerdew, R., and Shen, J. (1998), 'Modelling inter-ward migration in Hereford and Worcester: The importance of housing growth and tenure', Regional Studies, 32 (2), 113-32.
• Cohen, J., Roig, M., Reuman, D., and GoGwilt, C. (2008), 'International migration beyond gravity: a statistical model for use in population projections', Proceedings of the National Academy of Sciences, 105 (40), 15269-74.
• Flowerdew, R. (2010), 'Modelling migration with poisson regression', in J. Stillwell, O. Duke-Williams, and A. Dennett (eds.), Technologies for Migration and Commuting Analysis: Spatial Interaction Data Applications: IGI Global.
• Fotheringham, A. S. (1983), 'A new set of spatial-interaction models: the theory of competing destinations', Environment and Planning A, 15 (1), 15-36.
• Stillwell, J. (1978), 'Interzonal migration: some historical tests of spatial-interaction models', Environment and Planning A, 10, 1187-200.
• Plane, D. A. (1982), 'An information theoretic approach to the estimation of migration flows', Journal of Regional Science, 22 (4), 441-56.
• Pooler, J. (1994), 'An extended family of spatial interaction models', Progress in Human Geography, 18 (1), 17-39.
• Mayda, A. (2010), 'International migration: a panel data analysis of the determinants of bilateral flows', Journal of Population Economics, 23 (4), 1249-74.
• Raymer, J. (2007), 'The estimation of international migration flows: a general technique focused on the origin-destination association structure', Environment and Planning A, 39, 985-95.
• Willekens, F. (1983), 'Log-linear modelling of spatial interaction', Papers in Regional Science, 52 (1), 187-205.
• Wilson, A. (1971), 'A family of spatial interaction models, and associated developments', Environment and Planning A, 3, 1-32.
• Zipf, G.K. (1946), 'The P1 P2 / D hypothesis: on the intercity movement of persons', American Sociological Review, 11 (6), 677-86.
Thank you
Adam Dennett
a.dennett@ucl.ac.uk
http://adamdennett.co.uk