Administrative Data and their Use in Economic Statistics Vladimir Markhonko

United Nations Statistics Division
Vladimir Markhonko 12/7/2007
Contents
• Definitions
• Advantages of using administrative data
• Common problems
• Quality of administrative data
• Using administrative data in practice
• Conclusions
Narrow Definition
[Diagram labels: Data Sources; Primary (Statistical); Secondary (Non-statistical); Public Sector; Private Sector]
Wider Definition
[Diagram labels: Data Sources; Primary (Statistical); Secondary (Non-statistical); Public Sector; Private Sector]
Administrative sources are sources containing information which is not primarily collected for statistical purposes.
Reasons for this Definition
• Privatisation of some government functions
• Growth of private sector “value-added re-sellers”
• User interest in new types of data
Benefits of Administrative Data
• Cost
  – Surveys / censuses are expensive, administrative data are often “free”
• Response burden
  – Reduced burden on data suppliers
  – Statistics can be compiled more frequently with no extra burden
Benefits of Administrative Data
• Coverage
  – Full coverage of target population
  – No survey errors and lower non-response
  – Better small-area data
• Timeliness (sometimes!)
• Public image
  – Making use of existing data can enhance the prestige of a statistical organisation by making it seem more efficient
Population Census Costs 2000-2001
• UK, €367m, €6.2 per person
• Austria, €56m, €6.9 per person
• Finland, €0.8m, €0.2 per person
Source: Eurostat – Documentation of the 2000 round of Population and Housing Censuses in the EU, EFTA and Candidate Countries; Table 22
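To give a rough sense of scale, the per-person figures quoted above can be compared directly. This is only a sketch: the variable names are illustrative, and because the quoted figures are rounded, the resulting ratios are indicative rather than exact.

```python
# Per-person census costs (EUR) as quoted from the Eurostat table above.
cost_per_person = {"UK": 6.2, "Austria": 6.9, "Finland": 0.2}

# Express each country's cost as a multiple of the cheapest one.
cheapest = min(cost_per_person, key=cost_per_person.get)
relative_cost = {
    country: round(cost / cost_per_person[cheapest], 1)
    for country, cost in cost_per_person.items()
}

for country, ratio in relative_cost.items():
    print(f"{country}: {ratio}x the per-person cost of {cheapest}")
```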
Common Problems
• Administrative units do not always coincide with statistical units
• Conversion via automatic rules for simple cases
• Profiling for more complex cases
  – Gives a better understanding of complex business structures
  – Expensive and needs trained staff
Common Problems
• Different definitions and classifications
  – Administrative and statistical priorities are often different
  – Conversion matrices needed for different classifications
• Timeliness
  – Data arrive too late
  – Data relate to a different time period
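One way to picture a conversion matrix is as a table of weights that redistributes totals reported under an administrative classification into statistical classes. The sketch below is illustrative only: the codes (`A1`, `S10`, and so on), the weights, and the `convert` helper are hypothetical and not taken from any real classification.

```python
# Hypothetical conversion matrix: each administrative code maps to one or
# more statistical codes, with weights that sum to 1 for each source code.
CONVERSION = {
    "A1": {"S10": 1.0},               # one-to-one mapping
    "A2": {"S20": 0.7, "S30": 0.3},   # one-to-many split
    "A3": {"S30": 1.0},               # many-to-one (A2 and A3 both feed S30)
}

def convert(admin_totals):
    """Redistribute totals by administrative code into statistical codes."""
    stat_totals = {}
    for admin_code, value in admin_totals.items():
        for stat_code, weight in CONVERSION[admin_code].items():
            stat_totals[stat_code] = stat_totals.get(stat_code, 0.0) + value * weight
    return stat_totals

# Made-up totals reported under the administrative classification.
admin_totals = {"A1": 100.0, "A2": 200.0, "A3": 50.0}
stat_totals = convert(admin_totals)
print(stat_totals)
```

Because every row of weights sums to 1, the grand total is preserved by the conversion; in practice the weights themselves would have to be estimated, for example from units classified under both systems.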
VAT Birth Lags
[Figure: VAT birth lags. Axes: lag in days; frequency (thousands)]
VAT Birth Lags
• 2/3 of businesses are on the register within 2 months of start-up
• Mean lag = 4 months due to “outliers”
• Median = Approx. 40 days
• Some pre-register - negative lags
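The gap between the mean and the median lag can be illustrated with a small made-up sample. The values below are hypothetical, not real VAT data; they are chosen only to show how a single very late registration pulls the mean far above the median, and how pre-registration appears as a negative lag.

```python
from statistics import mean, median

# Hypothetical registration lags in days (not real VAT data). Negative
# values stand for businesses that registered before start-up.
lags_days = [-10, 5, 20, 30, 35, 40, 45, 55, 90, 900]

mean_lag = mean(lags_days)
median_lag = median(lags_days)
print(f"mean   = {mean_lag:.1f} days")
print(f"median = {median_lag:.1f} days")

# Share of businesses registered within roughly two months (60 days).
share_within_60 = sum(lag <= 60 for lag in lags_days) / len(lags_days)
print(f"registered within 60 days: {share_within_60:.0%}")
```

For a skewed distribution like this, the median is usually the more informative summary of a typical lag.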
Common Problems
• Change
  – Risk of changes in government policy, thresholds, definitions, coverage etc.
  – Need contingency plans
• Data management from multiple sources
  – Matching / linking issues
  – Data conflicts – priority rules
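A priority rule for data conflicts can be as simple as an ordered list of sources, where the first source holding a value wins. The sketch below assumes a hypothetical ordering and hypothetical records; the `resolve` helper and the field names are illustrative, and a real register would set priorities per variable based on knowledge of each source.

```python
# Hypothetical priority order: earlier sources win when values conflict.
SOURCE_PRIORITY = ["survey", "vat", "paye", "company_registration"]

def resolve(field, values_by_source):
    """Return (value, source) from the highest-priority source holding the field."""
    for source in SOURCE_PRIORITY:
        record = values_by_source.get(source, {})
        value = record.get(field)
        if value is not None:
            return value, source
    return None, None

# Conflicting (made-up) records for one business held in three sources.
unit = {
    "vat": {"name": "ACME LTD", "employees": None},
    "paye": {"name": "Acme Limited", "employees": 12},
    "company_registration": {"name": "ACME LIMITED"},
}

print(resolve("name", unit))
print(resolve("employees", unit))
```

Returning the source alongside the value keeps a record of where each stored value came from, which helps when a source later changes.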
Quality of Administrative Data
• There are many aspects to quality
• Administrative data will be better than survey data in some aspects but not in others
• It is important to look at overall quality
• Do the data meet the needs of users?
Three Aspects of Quality
• Quality of incoming data
• Quality of processing (matching, merging, ...)
• Quality of outputs - likely to be different to survey based outputs, but are they better?
Quality Measurement
• How to measure the quality of data from administrative sources?
  – Comparing sources
  – Quality check surveys
  – Knowledge of source (metadata)
  – Quality reports / templates
Quality Templates
Companies House Data
• Framework: Contract
• Frequency: Quarterly updates, continuous on-line access
• Timeliness: Good
• Quality: Good
• Delivery: CD-ROM / Internet
• Key content: Legal name, company number
Using Administrative Data
• Conversion to statistical concepts and definitions
• Linking / Matching
  – Exact Matching - linking records from two or more sources, often using common identifiers
  – Probabilistic Matching - determining the probability that records from different sources should match, using a combination of variables
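The two matching approaches can be contrasted in a short sketch. The probabilistic part follows the general style of Fellegi-Sunter record linkage, which is one common way to score candidate pairs and is not necessarily the method used for any particular register. All records, field names, and the m- and u-probabilities in `PARAMS` are hypothetical.

```python
import math

# Hypothetical m- and u-probabilities for each comparison variable:
# m = P(values agree | records truly match), u = P(values agree | non-match).
PARAMS = {"name": (0.95, 0.01), "postcode": (0.90, 0.05), "activity": (0.80, 0.20)}

def exact_match(a, b, key="company_number"):
    """Exact matching: link only when both records share a common identifier."""
    return a.get(key) is not None and a.get(key) == b.get(key)

def match_weight(a, b):
    """Sum of log-likelihood-ratio weights over the comparison variables."""
    total = 0.0
    for field, (m, u) in PARAMS.items():
        if a.get(field) == b.get(field):
            total += math.log2(m / u)              # agreement raises the score
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total

# Made-up records: only the administrative record carries an identifier.
admin = {"company_number": "01234567", "name": "ACME LTD", "postcode": "AB1 2CD", "activity": "47"}
survey = {"company_number": None, "name": "ACME LTD", "postcode": "AB1 2CD", "activity": "46"}
other = {"company_number": None, "name": "ZETA PLC", "postcode": "XY9 8ZW", "activity": "46"}

print(exact_match(admin, survey))
print(round(match_weight(admin, survey), 2))
print(round(match_weight(admin, other), 2))
```

Here exact matching fails because the survey record has no identifier, while the weight for the `admin`/`survey` pair is positive and the weight for the `admin`/`other` pair is negative. In practice, pairs above an upper threshold would be linked, pairs below a lower threshold rejected, and the rest reviewed clerically.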
UK Business Register
[Diagram: Business Register, connected to VAT, PAYE, company registrations, Dun and Bradstreet, survey inputs, satellite registers, and geographic information systems]
Satellite Registers
Examples of Satellite Registers
• Tourism - hotel register (category, number of beds)
• Transport - vehicle or ship register (type, capacity)
• Distributive trades - buildings register (building size, sales area)
Conclusions
• Administrative sources should be defined in the widest sense
• There are many benefits in using administrative data, particularly reduced costs
• There are problems when using administrative data, but usually someone has found a solution
Conclusions
• Most problems can be reduced by effective planning and detailed knowledge of the source
• The benefits are often greater than the costs
Thank you for your attention.