The Use of Administrative Sources for Statistical Purposes
Steven Vale
United Nations Economic Commission for Europe

Day 1
09.00 – 10.00 Introductions and overview of the course
10.00 – 12.30 Introduction to administrative sources - definitions, benefits and quality considerations
14.00 – 15.30 Frameworks for the access and use of administrative data
15.45 – 17.30 Frameworks in Finland and the Medstat countries

Day 2
09.00 – 10.30 Common problems and solutions
10.45 – 12.30 Common problems and solutions
14.00 – 15.30 Presentations from participating countries
15.45 – 17.30 Presentations from participating countries and discussion

Day 3
09.00 – 10.45 Introduction to Matching
11.00 – 12.30 Presentations from participating countries
14.00 – 15.45 Administrative sources in statistical registers
16.00 – 17.30 Presentations from participating countries and discussion

Day 4
09.00 – 10.30 Case Study – The use of administrative sources in Finland
10.45 – 12.30 Case Study – The use of administrative sources in Finland
14.00 – 15.00 International work in the field of administrative sources
15.00 – 15.30 Questions and answers
15.45 – 17.00 Closing session, feedback and evaluation

Format of the Course
• Presentations and case studies
  – Course leaders
  – Participants
• Group exercises and discussions
  – Watch out for the orange screen!
• Time for questions
  – but please ask if anything is unclear

Group Exercise
What are Administrative Sources?
In less than 20 words!

Gordon Brackstone, Statistics Canada (1987)
Four distinguishing features of administrative data
1. The agent that supplies the data to the statistical agency and the unit to which the data relate are different (in contrast to most statistical surveys);
2. The data were originally collected for a definite non-statistical purpose that might affect the treatment of the source unit;
3. Complete coverage of the target population is the aim;
4. Control of the methods by which the administrative data are collected and processed rests with the administrative agency.

Eurostat ‘CODED’ Glossary:
An administrative source is the organisational unit responsible for implementing an administrative regulation (or group of regulations) for which the corresponding register of units and the transactions are viewed as a source of statistical data.
Source: OECD and others, “Measuring the Non-Observed Economy: A Handbook”

A Wider Definition?
First introduced in the Final Report of the Eurostat internal Task Force on Administrative Sources, 1997

Types of Data Source
[Diagram: Data Sources – Primary (Statistical), Secondary (Non-statistical), Public Sector, Private Sector]

Narrow Definition
[Diagram repeated]

Wider Definition
[Diagram repeated]
Administrative sources are sources containing information which is not primarily collected for statistical purposes.

Reasons for this definition
• Increasing privatisation of government functions
• Growth of private sector data and “value-added re-sellers”
• User interest in new types of data

Examples of Administrative Sources
• Tax data
  – Personal income tax
  – Value Added Tax (VAT)
  – Business / profits tax
• Social security data
• Health / education records
• Registration systems for persons / businesses / property / vehicles
• Identity cards / passports / driving licenses
• Electoral register
• Register of farms
• Local council registers
• Building permits
• Licensing systems e.g. television, sale of restricted goods, import / export
• Published business accounts
• Internal accounting data
• Data held by private businesses:
  – credit agencies
  – business analysts
  – utility companies
  – telephone directories
  – retailers with store cards etc.

Store Cards
In return for a few benefits, users give the stores a lot of data:
• Name, address, sex, age
• Family circumstances (e.g. baby products, toys, pet food)
• Indicators of work status and income (time of shopping and type of goods)
• Car ownership (petrol purchases)

The Benefits of Using Administrative Sources

Cost
• Surveys are expensive, a census is even more so; data from administrative sources are often “free”
• Fewer staff are needed to process administrative data - no need for response chasing

Population census costs 2000-2001
• Austria, €56m, €6.90 per person
• UK, €367m, €6.20 per person
• Finland, €0.8m, €0.20 per person
Source: Eurostat – Documentation of the 2000 round of population and housing censuses in the EU, EFTA and Candidate Countries; Table 22

Response Burden
• Using administrative sources:
  – Reduces the burden on data suppliers
  – Allows statistics to be compiled more frequently with no extra burden
• Data suppliers complain if they are asked to provide the same information many times by different government departments

Coverage
• Administrative sources usually offer better coverage of target populations, and can make statistics more accurate:
  – No survey errors
  – No (or low) non-response
• Better coverage gives:
  – Better small-area data
  – More detailed information

Timeliness
• Producing statistics from administrative sources can sometimes be quicker than using surveys
• No need for:
  – forms design;
  – pilot surveys;
  – sample design etc.

Public Image
• Making more use of existing data can enhance the prestige of a statistical organisation by making it seem more efficient
• The concept of “Joined-up government” is politically appealing

Group Discussion
What are the actual and potential benefits of using administrative sources in your countries?

Quality and Administrative Sources

Quality
• Are data from administrative sources as good as data from surveys?
• Who should judge this?
• How can we measure quality?
• How should we report and communicate quality?

Definition of Quality
International Standard ISO 9000:2005 defines quality as:
“The degree to which a set of inherent characteristics fulfils requirements.”

What does this mean?
• Whose requirements?
  – The user of the goods or services
• A set of inherent characteristics?
  – Users judge quality against a set of criteria concerning different characteristics of the goods or services
• Therefore, quality is all about providing goods and services that meet the needs of users (customers)

Quality Components
• Different statistical organisations use different lists of components - but all lists are quite similar
• UNECE list:
  – Relevance
  – Accuracy
  – Timeliness
  – Punctuality
  – Accessibility
  – Clarity
  – Comparability

Relevance
• Are the statistics that are produced needed?
• Are the statistics that are needed produced?
• Do the concepts, definitions and classifications meet user needs?

Accuracy
• The closeness of statistical estimates to true values
• In the past: quality = accuracy
• Administrative sources can help to improve accuracy by removing survey errors

Timeliness
• The length of time between data being made available and the event or phenomenon they describe

Punctuality
• The time lag between the actual delivery date and the promised delivery date

Accessibility
• The ways in which users can obtain or benefit from statistical services (pricing, format, location, language etc.)

Clarity
• The availability of additional material (e.g. metadata, charts etc.) to allow users to understand outputs better

Comparability
• The extent to which differences are real, or due to methodological or measurement differences
  – Comparability over time
  – Comparability through space (e.g. between countries / regions)
  – Comparability between statistical domains (sometimes referred to as coherence)

Other Considerations
• Cost / efficiency
• Integrity / trust
• Professionalism
  – Adherence to international standards (e.g. UN Fundamental Principles of Official Statistics)

Quality Measurement
• How can we measure the quality of data from administrative sources?
• There are established methods for measuring the quality of survey data, but these are not always relevant for administrative data

Three Aspects of Quality
• To understand the quality of administrative sources we need to consider:
  – Quality of incoming data
  – Quality of processing
  – Quality of outputs

Incoming Data
• Timeliness
• Completeness – are there any missing units or variables?
• Comparability with other sources
• Quality check survey?
• Knowledge of the source is vital!

Processing
• Quality of matching / linking
• Outlier detection and treatment
• Quality of data editing
• Quality of imputation
• Keep raw data / metadata to refer back to if necessary

Outputs
• Are the users satisfied?
• Are the outputs comparable with data from other sources?
• What is the impact on time series?
• Are the outputs cost-effective?
• Quality reports to measure and communicate differences?

Quality Reports
• Formats proposed by Eurostat Quality Working Group for data from:
  – A single source
  – Combined sources
See Eurostat paper: “Quality assessment of administrative data for statistical purposes”

Metadata
• Knowledge and documentation of the source is vital to help us to understand quality:
  – How the data are collected
  – Why they are collected
  – How they are processed
  – Concepts and definitions used
  – etc…

Group Discussion
Experiences of quality measurement in practice?
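
Illustration: simple checks on incoming data
As a starting point for the discussion, the Python sketch below shows one possible way to turn the completeness check from the Incoming Data slide, and the timeliness and punctuality definitions from the quality components, into simple numbers. It is only a sketch under stated assumptions: the field names (unit_id, turnover, employees), the sample records, the reference list of units and all dates are invented for illustration and do not come from any real administrative source.

```python
from datetime import date


def completeness_report(records, reference_ids, required_vars):
    """Summarise missing units and missing variables in an incoming file.

    records       -- list of dicts, one per unit (hypothetical layout)
    reference_ids -- unit identifiers expected from another source,
                     e.g. a statistical register (assumption)
    required_vars -- variable names that should be filled in
    """
    received_ids = {r["unit_id"] for r in records}
    expected_ids = set(reference_ids)

    # Units expected from the reference list but absent from the file.
    missing_units = sorted(expected_ids - received_ids)

    # Share of expected units that were actually received.
    if expected_ids:
        unit_coverage = len(expected_ids & received_ids) / len(expected_ids)
    else:
        unit_coverage = 0.0

    # Share of received records with an empty value, per variable.
    missing_rate = {}
    for var in required_vars:
        n_missing = sum(1 for r in records if r.get(var) in (None, ""))
        missing_rate[var] = n_missing / len(records) if records else 0.0

    return {
        "missing_units": missing_units,
        "unit_coverage": unit_coverage,
        "missing_rate": missing_rate,
    }


def timeliness_days(reference_period_end, available_on):
    """Days between the period the data describe and their availability."""
    return (available_on - reference_period_end).days


def punctuality_days(promised_on, delivered_on):
    """Days between the promised and the actual delivery date."""
    return (delivered_on - promised_on).days


# Invented sample data, for illustration only.
records = [
    {"unit_id": "A1", "turnover": 120.0, "employees": 4},
    {"unit_id": "A2", "turnover": None, "employees": 10},
    {"unit_id": "A4", "turnover": 75.5, "employees": None},
]
reference_ids = ["A1", "A2", "A3", "A4"]

report = completeness_report(records, reference_ids, ["turnover", "employees"])
print(report)
print(timeliness_days(date(2023, 12, 31), date(2024, 3, 15)))
print(punctuality_days(date(2024, 3, 1), date(2024, 3, 15)))
```

Indicators like these only describe the incoming data; they say nothing about why units or values are missing, which is where knowledge of the source and its metadata comes in.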