2008 Integrated Census - Israel
Pnina Zadka, Yael Feinstein
Central Bureau of Statistics, Israel
International seminar 2010, Beyond 2012, Korea (7/23/2016)

Content
• The census principles
• Integrated census methodology
• Administrative sources
• Under coverage and over coverage samples
• Census weights
• Lessons learned
• Future plans

Israel
• Population: 7,800,000 (2011)
• Area: 22,072 km²
• 1,191 localities: 239 urban localities, 952 rural localities
• About half of the urban localities are subdivided into Statistical Areas (census tracts)

Census principles
• The census aim: to provide reliable demographic, social and economic information about the entire population in the country, with maximum geographic resolution down to Statistical Areas (SA).
• The integrated census is designed to make use of administrative data sources (registers) and Direct Data Collection (DDC).
• Augment and correct the administrative data sources.
• Target population: usual residents in the country.

Integrated census
• Combines administrative and traditional censuses
• The Central Population Register (CPR) is the main component of the administrative part
• Additional administrative sources are used for data completion and updates
• Two census surveys compose the traditional part:
   • Area sample to estimate under coverage
   • CPR sample to estimate over coverage
• Applies the Dual System Estimation approach to generate census estimates

Administrative sources: Central Population Register
• The Central Population Register (CPR) is the backbone of the integrated census
• Unique PIN granted upon birth or immigration (universally used in the country)
• The CPR includes demographic variables
• The CPR carries inherent errors and is not fully compatible with census definitions:
   • Omission of residents (foreigners)
   • Inclusion of non-residents (emigrants)
   • Purposely incorrect address registration

Improved Administrative File (IAF)
The IAF augments the CPR using the following processes:
• CPR freeze to the census reference date
• Emigrant tagging using:
   • Border control registrations
   • Eligibility for health care services and social security benefits
• Formation of administrative families
• Geo-coding of addresses

IAF - Over and Under Coverage
• Global (national) under-coverage: individuals who are not registered in the IAF but belong to the usual resident population, e.g. foreign workers.
• Local (statistical area / municipality) under-coverage: individuals who live in the enumerated area but are registered at a different address in the IAF.
• Global over-coverage: individuals who live abroad but are registered with an address in the country.
• Local over-coverage: individuals who are registered in one SA but live in a different SA.

Dual System Estimation approach
• Modified capture-recapture methodology
• Area sample to estimate under coverage
   • The total area of the country is divided into 40K enumeration cells (EC), nested in SAs within municipalities
   • Each EC contains approximately 50 households on average (ranging between 30 and 70)
   • 17% sample of cells
• CPR sample to estimate over coverage
   • All individuals with an address registered in a sampled EC

Under coverage sample - U
• A field survey in which the population in the sampled ECs was fully enumerated.
• From this sample the estimate of under-coverage in the IAF is calculated.
• Under coverage estimate: the proportion of enumerated residents in each estimation group in the SA who are registered in a different SA.
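
The under coverage estimate defined above is a per-group proportion, and a minimal Python sketch may make that concrete. The record layout (estimation group, SA of enumeration, SA of registration), the group labels and the SA codes are all hypothetical and are not taken from the census files.

```python
from collections import defaultdict


def under_coverage_by_group(records):
    """Share of enumerated residents registered in a different SA, per group.

    `records` is an iterable of (estimation_group, enumerated_sa, registered_sa)
    tuples. This layout is a hypothetical illustration only.
    """
    enumerated = defaultdict(int)   # residents enumerated, per estimation group
    elsewhere = defaultdict(int)    # of those, registered in a different SA
    for group, enumerated_sa, registered_sa in records:
        enumerated[group] += 1
        if registered_sa != enumerated_sa:
            elsewhere[group] += 1
    return {group: elsewhere[group] / enumerated[group] for group in enumerated}


# Made-up U-sample records for one SA.
records = [
    ("group_a", "SA-1", "SA-1"),
    ("group_a", "SA-1", "SA-2"),
    ("group_a", "SA-1", "SA-1"),
    ("group_a", "SA-1", "SA-1"),
    ("group_b", "SA-1", "SA-3"),
    ("group_b", "SA-1", "SA-1"),
]
rates = under_coverage_by_group(records)
```

With these made-up records, one of four residents in `group_a` and one of two in `group_b` are registered elsewhere, so the two groups get different under coverage estimates.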

Over coverage sample - O
• The over coverage sample O is a sample from the IAF (the integrated register): all households/individuals who are registered in the IAF in an EC sampled for the U sample.
• Over coverage estimate: the proportion of enumerated residents in each estimation group in the SA who are registered in the SA and live in another SA.

U and O sample
Two main reasons for preferring the integrated (dependent) sample:
1. Efficiency and reduced response burden
2. Smaller population size variances

Coverage samples - illustration
[Figure: the IAF list of residents in an EC within an SA (O sample) shown alongside the area of the same EC (U sample).]

Statistical formulation
• To estimate the under- and over-coverage of the IAF in homogeneous groups according to coverage patterns.
• To compute the census weight for every individual in the IAF in order to obtain systematic demographic estimates of the usual population in the country by selected demographic characteristics.

The Integrated Census methodology
Cross-classification of individuals by IAF and DDC status:

                 DDC IN    DDC OUT    Total
   IAF IN        U11       U12        U1+
   IAF OUT       U21       U22        U2+
   Total         U+1       U+2        N

N = U11 + U12 + U21 + U22
The IAF list for the area contains U1+ + X individuals, where X are listed in the IAF but do not belong to the area.

The Integrated Census parameters
(superscript s denotes counts in the sampled cells; the over-coverage parameter is written here as $\lambda$)
• $\hat{p}_1 = U^s_{11} / U^s_{+1}$: enumerated and registered in this SA, divided by enumerated in this SA
• $\hat{\lambda} = X^s / (U^s_{1+} + X^s)$: in the IAF but do not belong to this SA, as a proportion of those listed in the IAF
• Estimator of the sampled cells' size: $\hat{N}^s = U^s_{1+} U^s_{+1} / U^s_{11}$

The Integrated Census parameters (continued)
Note that the number of individuals in the IAF in this area, $U_{1+} + X$, satisfies $U_{1+} + X = N p_1 / (1 - \lambda)$.
The actual parameters are not known and are estimated from the sample.
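
A minimal Python sketch of the sample-based quantities named on the parameters slide. The function name, argument names and counts are hypothetical. The over-coverage share is assumed to be taken relative to the full IAF listing in the sampled cells (U1+ + X), which is one reading of the slide and not a confirmed formula.

```python
def coverage_parameters(u11, u_plus1, u1_plus, x):
    """Estimate coverage parameters from sampled-cell counts.

    u11     -- enumerated in this SA and registered in this SA
    u_plus1 -- enumerated in this SA (U sample total)
    u1_plus -- registered in this SA in the IAF and belonging to it
    x       -- listed in the IAF in this SA but not belonging to it
    All argument names are illustrative assumptions.
    """
    # Share of enumerated residents who are also registered in this SA.
    p1_hat = u11 / u_plus1
    # Assumed over-coverage share: erroneous listings over the whole IAF list.
    over_hat = x / (u1_plus + x)
    # Capture-recapture estimate of the size of the sampled cells.
    n_hat_s = u1_plus * u_plus1 / u11
    return p1_hat, over_hat, n_hat_s


# Made-up counts for one estimation group.
p1_hat, over_hat, n_hat_s = coverage_parameters(
    u11=900, u_plus1=1000, u1_plus=950, x=50
)
```

In this made-up case 90% of the enumerated residents are registered in the SA and 5% of the IAF listings do not belong to it, so the estimated size of the sampled cells (about 1,056) exceeds both the 1,000 persons enumerated and the 1,000 persons listed.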
The estimator for the population size will be:
$\hat{N} = (\text{number of individuals listed in the IAF in this area}) \times (1 - \hat{\lambda}) / \hat{p}_1$
with $\hat{\lambda}$ denoting the estimated over-coverage proportion.

The Integrated Census Weights
The census weight:
$\hat{w} = (1 - \hat{\lambda}) / \hat{p}_1$
• Every record (individual) in the IAF gets a census weight according to the appropriate estimation group.
• The estimator for a group is the sum of the census weights of the individuals in the group.

Collective housing (Institutional population)
• Register of institutions based on lists obtained from the headquarters of organizations
• List of residents in each institution (manual/digital)
• Full enumeration: CAPI demographic questionnaire
• 20% sample: self-administered paper questionnaire
• Census weight = 1

Socio-economic data sources
• Administrative sources
   • Income tax files
   • Social security allowances
   • Heavy vision impairment
• Direct data collection - U sample survey
   • CAPI questionnaire
   • Sample of households - all individuals residing
   • Questionnaire: http://www.cbs.gov.il/mifkad/gues_2008_e.pdf

Lessons learned
• Applying conceptual definitions in administrative files
• Updated addresses in the CPR
• Sampling cells
• Updated GIS layer

Quality assessment of the Administrative families formation
• AF - administrative family
• EH - enumerated household in either the U-sample survey or the O-sample survey
• The assumption is that data collected in either of the two integrated census surveys is the "actual true household type and size"
• Discrepancies stem from the CPR and from the algorithm procedure

Assessment of AF results
• 86.5%: AF = EH
• 13.5%: discrepancies between AF and EH
   • 7.8% - AF contains members who were not enumerated in either of the two surveys
   • 4.3% - AF member living at a different address on the census date
   • 0.8% - AF erroneously merged more than one household (HH)
   • 0.6% - living at the same address but in a separate dwelling

Summary of AF assessment
• In almost 8% of cases, the discrepancy between the sources was due to under-coverage of the field work. For these cases the methodology cannot be assessed.
• In over 4% of cases, the discrepancy was due to CPR addresses that had not been updated.
• Less than 2% were caused by flaws in the methodology.

Weaknesses of the CPR as backbone
• Internal migration is updated late: about 20% of addresses are not updated in a timely manner
• Newly married couples
• Only formal family relations are recorded: childless cohabiting couples are missing
• Localities with no street names
• No apartment identification in multi-dwelling houses

Sampling cells
• The methodology for constructing sampling cells was appropriate for urban areas but produced too many empty cells in rural areas (buildings used for agricultural purposes were erroneously defined as dwellings)

GIS layers
• Layer update lag of 18 to 30 months
• Newly built areas not represented
• Misclassification of dwellings
• Improperly merged buildings in densely built areas

GIS
• About 5% of the buildings were missing in the original layer and were added during the pre-enumeration process by a crew of GIS specialists
• Problems in defining cell borders in large localities without address lists

Future Plans
• Rolling integrated census
• Methodological principles
• Cost
• Organizational benefits

Thank You