Administrative Data and their Use in Economic Statistics

Regional Workshop for African Countries on Compilation of Basic Economic Statistics
Pretoria, 23-26 July 2007
Vladimir Markhonko
United Nations Statistics Division
Contents
• Definitions
• Advantages of using administrative data
• Common problems
• Quality of administrative data
• Using administrative data in practice
• Conclusions
Vladimir Markhonko 12/7/2007
Definition of administrative data
narrow definition
wider definition
Narrow Definition of Administrative Data
Data available in records of governmental agencies administering various governmental programmes
Examples: tax records, customs declarations, social security records
Wider Definition of Administrative Data
Data available in records of governmental agencies and of organizations operating in the private sector
Administrative sources of data are sources containing information which is not primarily collected for statistical purposes.
Reasons for a wider definition
• Privatisation of some governmental functions
• Growth of private sector organizations collecting statistically significant data
• User interest in new types of data which might not be collected by NSOs
• Cost efficiency
Benefits of Administrative Data
• Cost
  - Surveys / censuses are expensive, administrative data are often “free”
• Response burden
  - Reduced burden on data suppliers
  - Statistics can be compiled more frequently with no extra burden
• Coverage
  - Full coverage of target population
  - No survey errors and lower non-response
  - Better small-area data
• Timeliness
  - Can be improved for some types of data but not for all
• Public image
  - National statistical office is perceived as more efficient, both in cutting costs and in providing better data
An example: population census costs for some European countries in 2000-2001
• UK - €6.2 per person
• Austria - €6.9 per person
• Finland - €0.2 per person
  - due to extensive use of administrative data
Source: Eurostat, Documentation of the 2000 round of population and housing censuses in the EU, EFTA and Candidate Countries; Table 22
Typical problems
• Administrative units do not always coincide with statistical units
• Need to perform data conversion
• Difficulty in profiling the more complex cases
  - Gives a better understanding of complex business structures
  - Expensive and needs trained staff
• Different definitions and classifications
  - Conversion tables needed for different classifications
• Administrative and statistical priorities are often different
• Timeliness
  - Data may arrive too late
  - Data relate to a different time period
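
The conversion tables mentioned above could, in the simplest case, be a lookup from administrative codes to statistical classes. The sketch below is illustrative only: the codes (`TX-0110`, class letters `A` and `G`), the table, and the function name `convert_codes` are all invented and do not correspond to any real classification. Records whose code is missing from the table are set aside for manual review rather than guessed.

```python
# Hypothetical conversion table: administrative activity code -> statistical class.
# Every code here is invented for illustration only.
CONVERSION_TABLE = {
    "TX-0110": "A",
    "TX-0120": "A",
    "TX-4510": "G",
    "TX-4520": "G",
}


def convert_codes(records, table):
    """Split records into converted ones and ones needing manual review."""
    converted, unmapped = [], []
    for record in records:
        stat_code = table.get(record["admin_code"])
        if stat_code is None:
            # No entry in the conversion table: do not guess a class.
            unmapped.append(record)
        else:
            converted.append({**record, "stat_code": stat_code})
    return converted, unmapped


records = [
    {"unit_id": 1, "admin_code": "TX-0110"},
    {"unit_id": 2, "admin_code": "TX-4520"},
    {"unit_id": 3, "admin_code": "TX-9999"},
]
converted, unmapped = convert_codes(records, CONVERSION_TABLE)
```

Keeping the unmapped records separate makes gaps in the conversion table visible, which matters when the administrative classification changes without notice.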
• Change
  - Risk of changes in government policy, thresholds, definitions, coverage etc.
  - Need contingency plans
• Data management from multiple sources
  - Matching / linking issues
  - Data conflicts: priority rules
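
One possible form of a priority rule for data conflicts is a fixed ranking of sources per variable: the value from the highest-ranked source that actually reports one is used. The source names and the ranking below are assumptions made for this sketch, not a recommended ordering.

```python
# Hypothetical source names, ordered from most to least trusted for one variable.
SOURCE_PRIORITY = ["tax_register", "social_security", "trade_association"]


def resolve_conflict(values_by_source, priority):
    """Return (value, source) from the highest-priority source reporting a value."""
    for source in priority:
        value = values_by_source.get(source)
        if value is not None:
            return value, source
    # No source reported the variable for this unit.
    return None, None


# Turnover of one unit as reported by three sources (invented figures).
turnover = {"tax_register": None, "social_security": 120, "trade_association": 150}
value, source = resolve_conflict(turnover, SOURCE_PRIORITY)
```

Returning the source alongside the value keeps a record of where each published figure came from, which helps when a source later changes its definitions.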
Quality of Administrative Data
• Administrative data will be better than survey data in some aspects but not in others
• It is important to look at overall quality
• Does the data quality meet the needs of users?
Three Aspects of Quality
• Quality of incoming data
• Quality of processing (matching, merging, ...)
• Quality of outputs: likely to differ from survey-based outputs, but are they better?
Quality Measurement
• How to measure the quality of data from administrative sources?
  - Comparing sources
  - Quality check surveys
  - Knowledge of source (metadata)
  - Quality reports
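
As a rough illustration of "comparing sources", the sketch below checks an administrative source against a survey for the same variable: which units appear in only one source, and which matched units differ by more than a tolerance. The function name, the 5% tolerance, and the figures are assumptions for this example only.

```python
def compare_sources(admin, survey, tolerance=0.05):
    """Compare two {unit_id: value} mappings for the same variable."""
    common = admin.keys() & survey.keys()
    # Matched units whose values differ by more than the relative tolerance.
    discrepant = sorted(
        uid for uid in common
        if abs(admin[uid] - survey[uid]) > tolerance * abs(survey[uid])
    )
    return {
        "matched": len(common),
        "only_admin": sorted(admin.keys() - survey.keys()),
        "only_survey": sorted(survey.keys() - admin.keys()),
        "discrepant": discrepant,
    }


# Invented figures for three administrative and three survey units.
admin = {"A1": 100.0, "A2": 250.0, "A3": 80.0}
survey = {"A1": 102.0, "A2": 200.0, "A4": 40.0}
report = compare_sources(admin, survey)
```

A report of this kind says where the sources disagree, not which one is right; deciding that still needs knowledge of the source (metadata).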
Using Administrative Data
• Conversion to statistical concepts and definitions
• Linking / Matching
  - Exact Matching: linking records from two or more sources, often using a common unique identifier of units
  - Probabilistic Matching: determining the probability that records from different sources should match, using a combination of variables
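
The two matching approaches can be sketched as follows. Exact matching joins on a shared identifier. The probabilistic part is a much simplified agreement-weight score in the style of Fellegi-Sunter record linkage: each compared field adds a positive weight when it agrees and a negative one when it does not. The field names, the m/u probabilities, the threshold, and the sample records are all assumptions chosen for illustration, not estimated values.

```python
import math


def exact_match(source_a, source_b, key="unit_id"):
    """Link records that share the same identifier value."""
    index_b = {rec[key]: rec for rec in source_b}
    return [(rec, index_b[rec[key]]) for rec in source_a if rec[key] in index_b]


# Assumed probabilities per field (illustrative, not estimated from data):
# m = P(field agrees | same unit), u = P(field agrees | different units).
FIELD_PARAMS = {
    "name": (0.95, 0.01),
    "city": (0.90, 0.20),
    "activity": (0.85, 0.10),
}


def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum log2 agreement or disagreement weights over the compared fields."""
    total = 0.0
    for field, (m, u) in params.items():
        # Compare after trimming whitespace and ignoring case.
        agree = str(rec_a[field]).strip().lower() == str(rec_b[field]).strip().lower()
        total += math.log2(m / u) if agree else math.log2((1 - m) / (1 - u))
    return total


def probabilistic_match(source_a, source_b, threshold=5.0):
    """Pair each record in source_a with its best candidate at or above threshold."""
    links = []
    for rec_a in source_a:
        best = max(source_b, key=lambda rec_b: match_weight(rec_a, rec_b), default=None)
        if best is not None and match_weight(rec_a, best) >= threshold:
            links.append((rec_a, best))
    return links


# Invented records from two hypothetical sources.
tax = [
    {"unit_id": "001", "name": "Acme Ltd", "city": "Pretoria", "activity": "retail"},
    {"unit_id": "003", "name": "Cedar Mining", "city": "Durban", "activity": "mining"},
]
registry = [
    {"unit_id": "001", "name": "ACME LTD ", "city": "Pretoria", "activity": "retail"},
    {"unit_id": "002", "name": "Baobab Foods", "city": "Pretoria", "activity": "food"},
]
exact_links = exact_match(tax, registry)
prob_links = probabilistic_match(tax, registry)
```

In practice the threshold is a judgment call: a higher value gives fewer false links but more missed matches, and borderline pairs are often sent for clerical review.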
• Shift in paradigm:
  - Some statistical offices prefer first to create a database populated with administrative data and use statistical surveys only to fill the gaps
  - Implies a high degree of trust in the quality of administrative data
Conclusions
• Administrative sources should be defined in the widest sense
• There are many benefits in using administrative data, particularly the reduced cost of data and better coverage
• There are problems when using administrative data, but an acceptable solution can usually be found
• Most problems can be solved by effective planning and detailed knowledge of the source
• The benefits are greater than the problems encountered
• UNSD plans to prepare a Handbook on Use of Administrative Data and put good country practices on its website.
Thank you for your attention.