2008 Integrated Census - Israel
Pnina Zadka
Yael Feinstein
Central Bureau of Statistics Israel
international seminar 2010 Beyond
2012korea
7/23/2016
Content
• The census principles
• Integrated census methodology
• Administrative sources
• Under coverage and over coverage samples
• Census weights
• Lessons learned
• Future Plans
Israel
Population 7,800,000 (2011)
Area 22,072 km2
1191 Localities:
239 Urban localities
952 Rural Localities
About half of the urban localities are subdivided into Statistical Areas (census tracts)
Census principles
• The census aim: to provide reliable demographic, social and economic information about the entire population in the country, with maximum geographic resolution up to Statistical Areas (SA).
• The integrated census is designed to make use of administrative data sources (registers) and Direct Data Collection (DDC).
• Augment and correct administrative data sources.
• Target population: usual residents in the country.
Integrated census
• Combines administrative and traditional censuses
• The Central Population Register (CPR) is the main component of the administrative part; additional administrative sources are used for data completion and updates
• Two census surveys compose the traditional part:
  • Area sample to estimate under coverage
  • CPR sample to estimate over coverage
• Applies the Dual System Estimation approach to generate census estimates
Administrative sources: Central Population Register
• The Central Population Register (CPR) is the backbone of the integrated census
• A unique PIN is granted upon birth or immigration (universally used in the country)
• The CPR includes demographic variables
• The CPR carries inherent errors and incompatibilities with census definitions:
  • Omission of residents (foreigners)
  • Inclusion of non-residents (emigrants)
  • Purposely incorrect address registration
Improved Administrative File (IAF)
The IAF augments the CPR using the following processes:
• CPR freeze to the census reference date
• Emigrant tagging using:
  • Border control registrations
  • Eligibility for health care services and social security benefits
• Formation of administrative families
• Geo-coding of addresses
IAF - Over and Under Coverage
• Global (national) under-coverage: individuals who are not registered in the IAF but belong to the usual resident population, e.g. foreign workers.
• Local (statistical area / municipality) under-coverage: individuals who live in the enumerated area but are registered at a different address in the IAF.
• Global over-coverage: individuals who live abroad but are registered with an address in the country.
• Local over-coverage: individuals who are registered in one SA but live in a different SA.
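The four coverage definitions can be sketched as a classification rule seen from a single statistical area. This is only an illustration: the `Coverage` enum, the `classify` function and the convention that `None` means "not registered in the IAF" or "lives abroad" are assumptions made for the example, not part of the census system.

```python
from enum import Enum
from typing import Optional


class Coverage(Enum):
    MATCH = "registered in the IAF and living in this SA"
    GLOBAL_UNDER = "lives in this SA, not registered in the IAF"
    LOCAL_UNDER = "lives in this SA, registered in a different SA"
    GLOBAL_OVER = "registered in this SA, lives abroad"
    LOCAL_OVER = "registered in this SA, lives in a different SA"
    NOT_IN_SCOPE = "neither lives nor is registered in this SA"


def classify(sa: str, registered_sa: Optional[str], actual_sa: Optional[str]) -> Coverage:
    """Classify one individual from the point of view of statistical area `sa`.

    registered_sa: SA of the IAF address, or None if not in the IAF (assumed convention).
    actual_sa: SA where the person actually lives, or None if abroad (assumed convention).
    """
    lives_here = actual_sa == sa
    registered_here = registered_sa == sa
    if lives_here and registered_here:
        return Coverage.MATCH
    if lives_here:
        # Lives here but the IAF does not list the person here.
        return Coverage.GLOBAL_UNDER if registered_sa is None else Coverage.LOCAL_UNDER
    if registered_here:
        # Listed here in the IAF but lives somewhere else.
        return Coverage.GLOBAL_OVER if actual_sa is None else Coverage.LOCAL_OVER
    return Coverage.NOT_IN_SCOPE
```

Under these conventions, a person registered in SA "A" but living in SA "B" counts as local over-coverage for A and as local under-coverage for B.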
Dual System Estimation approach
• Modified capture-recapture methodology
• Area sample to estimate under coverage
  • The total area of the country is divided into 40K enumeration cells (EC) nested in SAs within municipalities
  • Each EC contains approximately 50 households on average (ranging between 30 and 70)
  • 17% sample of cells
• CPR sample to estimate over coverage
  • All individuals with an address registered in a sampled EC
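The design figures above imply a rough size for the area sample. The arithmetic below is a back-of-the-envelope sketch that treats "40K", "17%" and "approximately 50 households" as exact round numbers; the resulting totals are implied values, not figures reported in the presentation.

```python
# Figures quoted on the slide, treated here as round numbers (an assumption).
TOTAL_CELLS = 40_000            # "40K" enumeration cells
SAMPLING_PERCENT = 17           # 17% sample of cells
AVG_HOUSEHOLDS_PER_CELL = 50    # approximate average; cells range from 30 to 70

# Integer arithmetic avoids floating-point rounding in the percentage.
sampled_cells = TOTAL_CELLS * SAMPLING_PERCENT // 100
approx_households = sampled_cells * AVG_HOUSEHOLDS_PER_CELL

print(sampled_cells)       # 6800
print(approx_households)   # 340000
```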
Under coverage sample - U
A field survey in which the population in the
sampled EC was fully enumerated. From this
sample the estimate of under-coverage in the
IAF is calculated.
Under-coverage estimate: the proportion of enumerated residents in each estimation group in the SA who are registered in a different SA.
Over coverage sample -O
The over-coverage sample O: a sample from the IAF (the integrated register), consisting of all households/individuals who are registered in the IAF in an EC sampled for the U sample.
Over-coverage estimate: the proportion of enumerated residents in each estimation group in the SA who are registered in the SA and live in another SA.
U and O sample
Two main reasons for preferring the
integrated (dependent) sample
1. Efficiency and reduced response burden
2. Smaller population size variances
Coverage samples -illustration
[Figure: individuals numbered 1 to 7 shown both in the IAF list of residents in an EC within an SA (O sample) and in the area (EC) itself (U sample).]
Statistical formulation
• To estimate the under- and over-coverage of the IAF in homogeneous groups according to coverage patterns.
• To compute the census weight for every individual in the IAF in order to obtain systematic demographic estimates of the usual population in the country by selected demographic characteristics.
The Integrated Census methodology
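                 DDC: IN    DDC: OUT   Total
IAF: IN          U11        U12        U1+
IAF: OUT         U21        U22        U2+
Total            U+1        U+2        N

N = U11 + U12 + U21 + U22
X = individuals in the IAF who do not belong to this SA
Individuals listed in the IAF in this area = U1+ + X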
The Integrated Census parameters
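The superscript s denotes counts from the sampled cells.

\[ \hat{p}_1 = \frac{U^s_{11}}{U^s_{+1}} = \frac{\text{enumerated and registered in this SA}}{\text{enumerated in this SA}} \]

\[ \hat{\lambda} = \frac{X^s}{\hat{N}^s} = \frac{\text{in IAF but do not belong to this SA}}{\text{estimator of sampled cells size}} \]

\[ \hat{N}^s = \frac{U^s_{1+}\, U^s_{+1}}{U^s_{11}} \]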
The Integrated Census parameters
Note that

\[ p_1 + \lambda = \frac{U_{1+} + X}{N} = \frac{\text{number of individuals in IAF in this area}}{N} \]

The actual parameters are not known and are estimated from the sample. The estimator for the population size will be:

\[ \hat{N} = \frac{\text{number of individuals listed in IAF in this area}}{\hat{p}_1 + \hat{\lambda}} \]
The Integrated Census Weights
The census weight:

\[ \hat{w} = \frac{1}{\hat{p}_1 + \hat{\lambda}} \]
Every record (individual) in the IAF gets a census
weight according to the appropriate estimation group.
Estimator for a group is the sum of the census weights
of the individuals in the group.
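The weight computation can be sketched in a few lines of code. The sketch reads the estimators as follows: the under-coverage parameter is the share of enumerated individuals who are registered in the area, the over-coverage parameter is the number of erroneous IAF listings divided by the dual system estimate of the sampled cells' size, and the weight is the reciprocal of their sum. The class name, field names and counts below are invented for illustration and are not taken from the census.

```python
from dataclasses import dataclass


@dataclass
class GroupCounts:
    """Sample counts for one estimation group (field names are hypothetical)."""
    u11: int         # enumerated in the area and registered there in the IAF
    enumerated: int  # all individuals enumerated in the area
    registered: int  # registered in the IAF in the area and belonging to it
    x: int           # registered in the IAF in the area but not belonging to it


def census_weight(c: GroupCounts) -> float:
    """Return 1 / (p1_hat + lambda_hat) for one estimation group."""
    p1_hat = c.u11 / c.enumerated                  # share of enumerated who are registered here
    n_hat_s = c.registered * c.enumerated / c.u11  # dual system estimate of the sampled cells' size
    lambda_hat = c.x / n_hat_s                     # erroneous IAF listings per person
    return 1.0 / (p1_hat + lambda_hat)


# Invented counts, for illustration only.
counts = GroupCounts(u11=80, enumerated=100, registered=90, x=10)
w = census_weight(counts)
iaf_records = counts.registered + counts.x   # individuals listed in the IAF in the area

print(w)                 # census weight given to every IAF record in the group
print(iaf_records * w)   # group estimate: the sum of the census weights
```

With these invented counts the group estimate equals the dual system estimate of the group size, which is the consistency the weight is meant to deliver.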
Collective housing (Institutional population)
• Register of institutions based on lists obtained from the headquarters of organizations
• List of residents in each institution (manual/digital)
• Full enumeration with a CAPI demographic questionnaire
• 20% sample: self-administered paper questionnaire
• Census weight=1
Socio-economic data sources
• Administrative sources
  • Income tax files
  • Social security allowances
  • Heavy vision impairment
• Direct data collection – U sample survey
  • CAPI questionnaire
  • Sample of households – all individuals residing
  • Questionnaire: http://www.cbs.gov.il/mifkad/gues_2008_e.pdf
Lessons learned
• Applying conceptual definitions in administrative files
• Updated addresses in the CPR
• Sampling cells
• Updated GIS layer
Quality assessment of the administrative families formation
• AF: administrative family
• EH: enumerated household in either the U-sample survey or the O-sample survey.
• The assumption is that data collected in either of the two integrated census surveys is the “actual true household type and size”
• Discrepancies stem from the CPR and from the algorithm procedure
Assessment of AF results
• 86.5%: AF = EH
• 13.5%: discrepancies between AF and EH
  • 7.8% - AF contains members who were not enumerated in either of the two surveys
  • 4.3% - AF member living at a different address on the census date
  • 0.8% - AF erroneously merged more than one household (HH)
  • 0.6% - Living at the same address but in a separate dwelling
Summary of AF assessment
• In almost 8% of AFs, the discrepancy between the sources was due to under-coverage in the field work. For these cases the methodology cannot be assessed.
• In over 4% of AFs, the discrepancy was due to CPR addresses that had not been updated.
• In less than 2% of AFs, the discrepancy was caused by flaws in the methodology.
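The percentages reported in the AF assessment can be cross-checked with simple arithmetic. Attributing the "less than 2%" figure to the two smallest categories (erroneously merged households and separate dwellings at the same address) is an assumption of this sketch; the slides do not state the mapping explicitly.

```python
# Shares of administrative families (AF), in tenths of a percent
# so that the sums are exact integers.
not_enumerated = 78      # 7.8%: AF members not enumerated in either survey
different_address = 43   # 4.3%: AF member living at a different address
merged_households = 8    # 0.8%: AF erroneously merged more than one household
separate_dwelling = 6    # 0.6%: same address but a separate dwelling

total_discrepancies = (
    not_enumerated + different_address + merged_households + separate_dwelling
)
# Assumed mapping: the last two categories are the "flaws in the methodology".
methodology_flaws = merged_households + separate_dwelling

print(total_discrepancies / 10)   # 13.5
print(methodology_flaws / 10)     # 1.4
```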
Weaknesses of the CPR as backbone
• Internal migration is updated late: about 20% of addresses are not updated in a timely manner
• Newly married couples
• Only formal family relations are recorded, so childless cohabiting couples are missing
• Localities with no street names
• No apartment identification in multi-dwelling houses
Sampling cells
• The methodology for constructing sampling cells was appropriate for urban areas but produced too many empty cells in rural areas (buildings used for agricultural purposes were erroneously defined as dwellings)
GIS layers
• Layer update lag of 18 to 30 months
• Newly built areas not represented
• Misclassification of dwellings
• Improperly merged buildings in densely built areas
GIS
• About 5% of the buildings were missing in
the original layer and were added during the
pre-enumeration process by a crew of GIS
specialists
• Problems in defining cell borders in large localities without address lists
Future Plans
• Rolling integrated census
• Methodological principles
• Cost
• Organizational benefits
Thank You