National Employment Survey Unit
Methodology Division, CSO
Project Team: Kevin McCormack, Dr. Mary Smyth, Sinead Phelan, Ann O’Dwyer

Overview
Structure of Earnings Survey
• EU Regulation – 4 years → met by National Employment Survey (NES)
• Microdata: 60,000 employees - Annual & Hourly earnings; Hours worked
• Characteristics: Age, Gender, Education, Occupation, NACE, Full/part-time, Nationality, Length of Service
• EU Annual Earnings, GPG (gender pay gap)
• National Earnings Statistics
• RMFs
• NES Publication

Example of Tables
Mean hourly earnings (€) in October 2007 by educational attainment, full/part-time status and sex

Level of Educational Attainment         Male               Female               Total
                                 Full-time Part-time Full-time Part-time Full-time Part-time
Primary or Lower Secondary           17.62     13.15     14.78     13.06     16.88     13.08
Higher Secondary                     18.68     12.44     16.36     14.56     17.78     14.15
Post Leaving Cert                    20.00     13.31     15.91     15.11     18.89     14.80
Third level non-degree               23.06     14.75     19.37     17.20     21.02     16.90
Degree or higher                     31.44     20.06     27.20     23.07     29.18     22.47
Total                                21.69     14.11     20.42     15.69     21.17     15.40

SES - ADP
Structure of Earnings Survey - Administrative Data Project
Project Goal:
• 2011 & 2012 Annual Earnings Data required - EU & Nationally
• Administrative Data: Response Burden, Cost Effective, Quality, Representative
• NES Annual Publication
• Roll-out Infrastructure: 2013 SES 2014

5 Modules
1) Research & Identify Potential Sources – ADS
2) Linking Data Sources
3) Modelling non-available characteristics
4) Construction of the SESADS
5) Publish Results

(M1) Research & Identify ADS
7 Administrative Data Sources
• 2 External: Revenue P35L, Dept. Social Protection (DSP)
• 5 CSO: Census, EHECS, SILC, CBR, QNHS
Fig. 1: SESADS primary data sources
[Figure: DSP, P35L, CBR, QNHS, EHECS, SILC and COP shown as the sources of the SESADS]

(M2) Linking Data Sources
An analysis was undertaken of the data fields contained within the SESADS sources.
Unique Identifiers:
• Per_IdNo. (PPS No. anonymised) - employees
• Ent_nbr (unique Enterprise Number) - employers
These are the most suitable unique identifiers (UIs) for linking the CSO’s data sources with the DSP and Revenue Commissioners P35L data files.

Fig. 2: Construction of the SESADS
• COP/QNHS/SILC: Per_IdNo. & ICA
• DSP: Per_IdNo., Demographics
• CBR/EHECS: Per_IdNo., Ent_nbr, Enterprise location, Size, and NACE
• P35L: Per_IdNo., Ent_nbr, Gross annual earnings, Weeks worked
• Occupation, NACE, Demographics, Education, Earnings

Identity Correlation Approach (1)
Census: No Unique Identifier
• Linking social data sources (Census) is a greater challenge for the CSO, because they hold no Unique Identifiers (UIs), such as a PPS No.
• UIs were developed by following an identity correlation approach (ICA), e.g. combining date of birth, gender, county of residence and NACE.
  E.g. 29101990|F|CORK|85|
• This identity correlation approach enabled the social data sources to be linked.
• SESADS currently contains 1 million of the approx. 1.3 million full/part-time employees in the State.
• Quality checked 800,000 records; representative of the NACE sectors.

Identity Correlation Approach (2)
• Annual Births (YoB) = 63,000
• DoB: 63,000 / 365 days = 173
• Gender: ÷ 2 = 86
• NACE: ÷ 14 = 6 (17)
• County: ÷ 26 (3) = 1 (5)
E.g. 29101990|F|85|CORK|

On completion of Module 2, the SESADS will contain all employees in the State, with Gross Annual/Weekly Earnings classified by:
Variables (Sources)
• NACE, Gender, Enterprise Size group, Public/Private sector, Weeks worked (P35L, CBR, EHECS, DSP)
• Occupation, Area of residence, Education, Age, Nationality (COP, QNHS, SILC)

Module 3: Modelling of non-available characteristics
Employee characteristics to be modelled are:
(1) Hours worked
(2) Annual bonuses
(3) BIK (benefit in kind)
(4) Full/part-time employment status for employees
A multiple imputation methodology will be employed to carry out this stage of the Project. The EHECS, QNHS and SILC data sources will be leveraged to provide the base information.
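The slides do not specify the imputation model, so the following is only a minimal sketch of one way multiple imputation could work for hours worked. It assumes hot-deck donors grouped by NACE section and sex, an approximate Bayesian bootstrap of each donor pool, and pooling by averaging the per-imputation estimates. The record layout and the function names impute_once and multiply_impute are hypothetical.

```python
import random
from statistics import mean, variance


def impute_once(records, rng):
    """Return one completed copy of records with missing 'hours' filled in."""
    # Collect observed hours per imputation cell (assumed: NACE section x sex).
    donors = {}
    for rec in records:
        if rec["hours"] is not None:
            donors.setdefault((rec["nace"], rec["sex"]), []).append(rec["hours"])
    # Approximate Bayesian bootstrap: resample every donor pool once, with
    # replacement, so each imputation reflects uncertainty about the pool.
    boot = {cell: [rng.choice(pool) for _ in pool] for cell, pool in donors.items()}
    # Fill each missing value with a random draw from its resampled pool.
    # Assumes every cell with a missing value has at least one donor.
    return [
        {**rec, "hours": rng.choice(boot[(rec["nace"], rec["sex"])])}
        if rec["hours"] is None else rec
        for rec in records
    ]


def multiply_impute(records, m=5, seed=0):
    """Create m completed datasets and pool the mean of 'hours' across them."""
    rng = random.Random(seed)
    estimates = [
        mean([rec["hours"] for rec in impute_once(records, rng)])
        for _ in range(m)
    ]
    pooled = mean(estimates)  # pooled point estimate
    between = variance(estimates) if m > 1 else 0.0  # between-imputation variance
    return pooled, between
```

In practice the donor records would come from survey sources such as EHECS, QNHS and SILC, and the between-imputation variance would be combined with the within-imputation variance before reporting precision.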
Once this model is completed, the SESADS will fulfil both the annual and the 4-yearly Eurostat SES earnings requirements.

Module 4: Construction of the SESADS
• The SESADS will be constructed in the CSO’s Administrative Data Centre (ADC).
• Structures (known as layers) will be consistent with those outlined in the ESSnet on microdata linking and data warehousing in statistical production.
• SES – EU microdata format

Module 5: Publication of Results
• The first set of SES statistics for 2011 and 2012 (gender pay gap and average earnings) was submitted to Eurostat in November 2013.
• Finalised datasets with more detail will be available mid-2015.

NES Publication Timetable
• SESADP – signed off 2015
• SES 2011 & 2012 Data
• NES Publication
• Roll out Project infrastructure for 2013 & 2014 data
• Assess by end 2015
• SES 2014 Microdata - submitted to Eurostat mid-2016
• Ends

Cost Benefit Analysis
Business Survey vs SESADP

              Business Survey        SESADP
Staff         15 persons             1 FTE (3.5)
Cost          € 1.5 million          € 0.1 m (€ 0.2 m)
Timeliness    T+18 months            T+10 months
Quality       Data edits             Revenue data
Sample        70K                    1 million
Burden        10K Ents, 70K ees      None

Thanks to:
CSO Divisions:
• Cork
• Dublin
• STS – cross division support
• EHECS
• ADC
• CENSUS
• Earnings Analysis
• CBR
• QNHS
• SILC
• IT
• Etc.
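As an illustration of the identity correlation approach described under Module 2, the sketch below builds a key shaped like the slide's example (29101990|F|CORK|85|) and links two sources on it. The field order follows the first example key. The function names ica_key, link_on_key and expected_people_per_key, the record layout, and the rule of linking only keys that are unique in both sources are all assumptions for illustration, not the CSO's actual procedure.

```python
from collections import Counter
from datetime import date


def ica_key(dob, gender, county, nace):
    """Build a pseudo-identifier shaped like '29101990|F|CORK|85|'."""
    return f"{dob:%d%m%Y}|{gender.strip().upper()}|{county.strip().upper()}|{nace}|"


def link_on_key(left, right):
    """Pair records whose key occurs exactly once in each source."""
    left_counts = Counter(rec["key"] for rec in left)
    right_counts = Counter(rec["key"] for rec in right)
    # Keep only right-hand records whose key is unambiguous.
    right_by_key = {
        rec["key"]: rec for rec in right if right_counts[rec["key"]] == 1
    }
    return [
        (rec, right_by_key[rec["key"]])
        for rec in left
        if left_counts[rec["key"]] == 1 and rec["key"] in right_by_key
    ]


def expected_people_per_key(annual_births=63_000, days=365, genders=2,
                            nace_groups=14, counties=26):
    """Average number of people sharing one key, assuming a uniform spread."""
    return annual_births / days / genders / nace_groups / counties


# Example: one person's key, built from the four characteristics.
key = ica_key(date(1990, 10, 29), "F", "Cork", 85)
```

Under the uniform-spread assumption, the unrounded average from the slide's chain of divisions falls below one person per key, which is why such a combination can act as a near-unique identifier. Keys that still occur more than once are set aside here rather than guessed at.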