Linking data across government
How Statistics New Zealand maintains privacy in large scale data integration projects
Integrated Data Infrastructure (IDI) Project manager – Guido Stark
June 2012

History
[Timeline diagram]
• 2004 Student Loans (SL)
• 2005 Linked Employer-Employee Data (LEED)
• 2007 Student Loans and Allowances (SLA)
• 2007 Prototype Longitudinal Business Database
• 2008 LEED – MSD Benefit data
• 2009 Employment Outcomes of Tertiary Education (EOTE)
• 2010 EOTE + secondary education data
• 2011 LEED – Household Labour Force Survey (HLFS)
Data sources shown in the diagram: Person to Business Link, EMS, Self-employed (IR); Student Loans, Student Loans & Allowances (IR, MSD (StudyLink), SLAM); Benefits (MSD (BDD)); Education Tertiary, Education Secondary (MoE); LBD; HLFS (Stats NZ)

Statistics 2020 – Te Kāpehu Whetū
Achieving the statistical system of the future
Lead the Official Statistics System
Obtain more value from official statistics
Transform the way we deliver statistics
Create a responsive, customer-focused, influential, and sustainable organisation
• Responding to customer needs
• Maximise the use of administrative data
• Increase use of data integration
Data integration raises real and perceived privacy risks

Integrated Data Infrastructure (IDI)

Data Integration Policy
Four key principles
1. The public benefits of integration outweigh both:
   • privacy concerns about use, and
   • risks to the integrity of the Official Statistics System or other activities of government
2. The integrated data will only be used for statistical or research purposes
3. The data integration will be conducted in an open and transparent manner
4. Data will not be integrated where an explicit commitment has been given to respondents that would preclude such action.

The triangle
• Privacy
• Security
• Confidentiality
• Safe Data

IDI security
• Secure workspaces
• Limited access
• Access control
• Regular audits of access and use
• Output control

IDI confidentiality
Encryption / transformation
IDI rules
• Aggregating
• Rounding
• Suppression
Output checking
Statistical vs operational

IDI privacy
Privacy Impact Assessment
• www.stats.govt.nz/IDI
Risks
• Information used in a way that is detrimental to their personal circumstances
• Information might be released that identifies them and aspects of their personal circumstances
• Unrelated information might be collected about them in an ever-growing database for non-specific purposes (i.e. ‘Big Brother’).
Benefits
• Potential for new official statistics
• Provides an evidence base for research, evaluation, and policy formulation
• Meets Statistics NZ’s strategic priorities
Data Integration Policy
The Privacy Act 1993
• Principle 1: Purpose of collection of personal information
• Principle 2: Source of personal information
• Principle 12: Unique identifiers
The Statistics Act 1975
• Section 3: Official statistics and coordination
• Section 15: Independence of the Government Statistician
• Section 37: Security of Information
• Section 21: Declaration of secrecy

Information flows
[Flow diagram]
• Source data: load source data; unique identifiers are encrypted
• Linking: clean source data; create link tables; names, addresses, date of birth removed
• Core tables: clean source data; create core tables; SNZ derived unique identifier available
• Outputs: confidentiality checks
• Restricted access in accordance with Statistics NZ’s Security Framework
• Access for bona-fide statistical purposes to required data sources only

The triangle
• Privacy: Safe projects, Safe researchers
• Security: Safe access
• Confidentiality: Safe output
• Safe Data

Use of the IDI
• Developing regular measures of immigrant outcomes
• Developing tertiary education outcome indicators
• How successful is NZ in retaining qualifications
• Intellectual property and productivity
• Mapping post-compulsory school pathways and outcomes
• The impact of immigration on the labour market
• The influence of education on outcomes
• What is the impact of gaining qualifications for beneficiaries
• Who doesn’t participate in tertiary education
• Benefit to work transitions
• The effect of wage subsidies on individual and firm employment

Access
Making the most of the IDI
Secondment
Statistics NZ datalab
• Legal compliance
• Bona fide research
• Non-regulatory
• Proven researcher
• Confidentiality
• Suitable data source

Integrated Data Infrastructure (IDI)
[Diagram]
Central Linking Concordance (CLC)
Data sources
• Migration data (DoL)
• Education secondary & tertiary (MoE)
• Student Loans & Allowances (Inland Revenue, MSD (StudyLink), SLAM)
• LEED: EMS, Self-employed, Person to business link (Inland Revenue)
• LBD: Business data
• Benefits (MSD (BDD))
• HLFS / NZIS (Survey)
• ? ? ? ?
Outputs
• Relevant releases
• Dynamic datasets
• Cutting edge cubes
• Powerful research

Questions?
IDI Project manager – Guido Stark
guido.stark@stats.govt.nz
June 2012
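
Illustration for the "Information flows" and "IDI confidentiality" slides. The following is a minimal Python sketch of the steps those slides name: encrypting unique identifiers, removing names, addresses, and date of birth, and applying suppression and rounding to aggregated output. It is not Statistics NZ's implementation. The key, field names, suppression threshold, rounding base, and function names are all hypothetical, and deterministic rounding to the nearest multiple is used only for simplicity; the slides do not say which rounding scheme the IDI applies.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical key for illustration only; a real key would be stored securely.
SECRET_KEY = b"example-key-not-for-real-use"

# Assumed parameters; the slides do not give actual values.
SUPPRESS_BELOW = 6  # counts below this are suppressed
ROUND_BASE = 3      # remaining counts are rounded to a multiple of this


def encrypt_identifier(raw_id: str) -> str:
    """Replace a source identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(SECRET_KEY, raw_id.encode("utf-8"), hashlib.sha256).hexdigest()


def to_core_record(record: dict) -> dict:
    """Drop direct identifiers and keep only a derived identifier."""
    direct = {"source_id", "name", "address", "date_of_birth"}
    core = {k: v for k, v in record.items() if k not in direct}
    core["derived_id"] = encrypt_identifier(record["source_id"])
    return core


def confidentialise(counts: dict) -> dict:
    """Suppress small cells (as None), then round the rest to ROUND_BASE."""
    out = {}
    for cell, n in counts.items():
        if n < SUPPRESS_BELOW:
            out[cell] = None
        else:
            # Integer arithmetic: round to the nearest multiple of ROUND_BASE.
            out[cell] = ROUND_BASE * ((n + ROUND_BASE // 2) // ROUND_BASE)
    return out


# Made-up source records: 7 in "North", 2 in "South".
records = [
    {
        "source_id": f"A{i}",
        "name": f"Person {i}",
        "address": f"{i} Example St",
        "date_of_birth": "1980-01-01",
        "region": "North" if i < 7 else "South",
    }
    for i in range(9)
]

core = [to_core_record(r) for r in records]               # identifiers removed
table = confidentialise(Counter(r["region"] for r in core))  # aggregated output
print(table)  # {'North': 6, 'South': None}
```

In this sketch the aggregation step is the `Counter` over regions: the "South" cell is suppressed because it falls below the assumed threshold, and the "North" count of 7 is rounded to 6.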