Regional Workshop for African Countries on Compilation of Basic Economic Statistics
Pretoria, 23-26 July 2007

Administrative Data and their Use in Economic Statistics
Vladimir Markhonko
United Nations Statistics Division

Contents
- Definitions
- Advantages of using administrative data
- Common problems
- Quality of administrative data
- Using administrative data in practice
- Conclusions

Vladimir Markhonko 12/7/2007 2

Definition of administrative data
- Narrow definition
- Wider definition

Narrow Definition of Administrative Data
- Data available in records of governmental agencies administering various governmental programmes
- Examples: tax records, customs declarations, social security records

Wider Definition of Administrative Data
Data available in records of:
- Governmental agencies
- plus organizations operating in the private sector

Administrative sources of data are sources containing information which is not primarily collected for statistical purposes.

Reasons for a wider definition
- Privatisation of some governmental functions
- Growth of private sector organizations collecting statistically significant data
- User interest in new types of data which might not be collected by NSOs
- Cost efficiency

Benefits of Administrative Data
- Cost
  - Surveys / censuses are expensive, administrative data are often “free”
- Response burden
  - Reduced burden on data suppliers
  - Statistics can be compiled more frequently with no extra burden
- Coverage
  - Full coverage of target population
  - No survey errors and lower non-response
  - Better small-area data
- Timeliness
  - Can be improved for some types of data but not for all
- Public image
  - National statistical office is perceived as more efficient, both in cutting costs and in providing better data

An example: population census costs for some European countries in 2000-2001
- UK: €6.2 per person
- Austria: €6.9 per person
- Finland: €0.2 per person, due to extensive use of administrative data
Source: Eurostat, Documentation of the 2000 round of population and Housing censuses in the EU, EFTA and Candidate Countries; Table 22

Typical problems
- Administrative units do not always coincide with statistical units
  - Need to perform data conversion
  - Difficulty in profiling the more complex cases
    - Gives a better understanding of complex business structures
    - Expensive and needs trained staff
- Different definitions and classifications
  - Conversion tables needed for different classifications
  - Administrative and statistical priorities are often different
- Timeliness
  - Data may arrive too late
  - Data relate to a different time period
- Change
  - Risk of changes in government policy, thresholds, definitions, coverage, etc.
  - Need contingency plans
- Data management from multiple sources
  - Matching / linking issues
  - Data conflicts: priority rules

Quality of Administrative Data
- Administrative data will be better than survey data in some aspects but not in others
- It is important to look at overall quality
- Does the data quality meet the needs of users?

Three Aspects of Quality
- Quality of incoming data
- Quality of processing (matching, merging, ...)
- Quality of outputs: likely to be different from survey-based outputs, but are they better?

Quality Measurement
How to measure the quality of data from administrative sources?
- Comparing sources
- Quality check surveys
- Knowledge of source (metadata)
- Quality reports

Using Administrative Data
- Conversion to statistical concepts and definitions
- Linking / Matching
  - Exact matching: linking records from two or more sources, often using a common unique identifier of units
  - Probabilistic matching: determining the probability that records from different sources should match, using a combination of variables

Shift in paradigm
- Some statistical offices prefer first to create a database populated with administrative data and use statistical surveys only to fill the gaps
- This implies a high degree of trust in the quality of administrative data

Conclusions
- Administrative sources should be defined in the widest sense
- There are many benefits in using administrative data, particularly the reduced cost of data and better coverage
- There are problems when using administrative data, but an acceptable solution can usually be found
- Most problems can be solved by effective planning and detailed knowledge of the source
- The benefits are greater than the problems encountered
- UNSD plans to prepare a Handbook on Use of Administrative Data and put good country practices on its website

Thank you for your attention.
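
Illustration: exact and probabilistic matching
The two linking approaches named under "Using Administrative Data" can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not a method given in the presentation: the records, field names, function names, the m/u agreement probabilities and the 0.5 prior are all invented for the example, and the probabilistic part is a simplified Fellegi-Sunter-style score.

```python
import math

# Hypothetical records from two administrative sources (all values invented).
tax_register = [
    {"tax_id": "T001", "name": "Acme Mining Ltd", "city": "Pretoria", "postcode": "0002"},
    {"tax_id": "T002", "name": "Blue Crane Foods", "city": "Durban", "postcode": "4001"},
]
customs_records = [
    {"tax_id": "T001", "name": "ACME MINING LIMITED", "city": "Pretoria", "postcode": "0002"},
    {"tax_id": None, "name": "Blue Crane Foods", "city": "Durban", "postcode": "4001"},
    {"tax_id": None, "name": "Karoo Wool Traders", "city": "Durban", "postcode": "6280"},
]


def exact_match(left, right, key):
    """Exact matching: pair records that share the same non-missing identifier."""
    index = {rec[key]: rec for rec in left if rec[key] is not None}
    return [(index[rec[key]], rec) for rec in right if rec[key] in index]


# Assumed agreement probabilities per field (illustrative values only):
# m = P(field agrees | same unit), u = P(field agrees | different units).
FIELD_PARAMS = {"name": (0.90, 0.01), "city": (0.95, 0.20), "postcode": (0.90, 0.05)}


def normalise(value):
    """Lower-case and collapse whitespace before comparing."""
    return " ".join(str(value).lower().split())


def match_weight(a, b):
    """Sum log2 likelihood ratios over the comparison fields."""
    total = 0.0
    for field, (m, u) in FIELD_PARAMS.items():
        if normalise(a[field]) == normalise(b[field]):
            total += math.log2(m / u)              # agreement adds evidence
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts it
    return total


def match_probability(weight, prior=0.5):
    """Turn a weight into a match probability, given an assumed prior."""
    odds = (prior / (1 - prior)) * 2 ** weight
    return odds / (1 + odds)


if __name__ == "__main__":
    for left, right in exact_match(tax_register, customs_records, "tax_id"):
        print("exact:", left["name"], "<->", right["name"])
    for right in customs_records:
        if right["tax_id"] is None:  # no identifier, fall back to probabilistic
            for left in tax_register:
                p = match_probability(match_weight(left, right))
                print(f"probabilistic: {left['name']} / {right['name']}: {p:.3f}")
```

In practice the m and u values would be estimated from the data rather than assumed, and pairs with intermediate probabilities would typically be sent for clerical review.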