Administrative Data Seminar - an added dimension to official statistics

Realizing the statistical potential
of administrative data
John Dunne, John Hayes
Central Statistics Office, Ireland
Paper presented at the U.N.E.C.E. Seminar on New Frontiers for Statistical Data Collection,
31 October-02 November, 2012, Geneva
Introduction
• This paper describes the progression towards an Irish
Statistical System, a holistic system based on the exploitation
of administrative data, comprehending linkages to survey data
and other administrative data.
• The paper focuses on the role of the CSO’s Administrative
Data Centre, which has the dual purpose of acting as clearing
house for administrative data and promoting the development
of the Irish Statistical System.
The National Statistics Board
• In 2009, the National Statistics Board (NSB) laid out
a strategy¹ for achieving an Irish Statistical System.
Amongst the implementation priorities identified is:
➢ Developing systems to ensure that the statistical value of existing
survey and administrative data is maximized.
• The NSB paper also identified three critical
infrastructural requirements in developing the Irish
Statistical System:
➢ A unique business identifier and a central business register;
➢ A unique personal identifier;
➢ Spatial and geographic data capture.
¹ Strategy for Statistics, 2009-2014, http://www.nsb.ie/media/nsbie/pdfdocs/StrategyforStatistics20092014.pdf
Policy progression
• In 2011, an NSB position paper¹ elaborated on some of the core
objectives of the earlier document, advocating, in particular:
➢ The development of the infrastructure to maximise the use of data
sources, including the compilation of registers of persons, businesses,
and buildings, with linkage between each such register – “joined-up” data.
• The government's Public Service Reform Plan², published in 2011,
further supports the development of the Irish Statistical System
with the following stated objectives:
➢ Improved sharing of data on businesses across the Public Service,
including the development of business registers linkable to that of the
Revenue Commissioners;
➢ Developing a code of practice for data gathering and its use for statistical
purposes across the Public Service, including promoting consistent
approaches to identifiers, classifications, and geo-spatial/postcode data.
¹ Double paper: The Irish Statistical System: The Way Forward and Joined Up Government Needs Joined Up Data,
http://www.nsb.ie/media/nsbie/pdfdocs/NSB%20ISS%20Position%20Papers.pdf
² http://per.gov.ie/wp-content/uploads/Public-Service-Reform-pdf3.pdf
The Statistics Act, 1993
• The CSO was established statutorily under the Statistics
Act, 1993¹. This legislation assigns certain powers to the
Director General of the CSO with respect to data held by
public authorities:
➢ The Director General may require a public body to provide
copies of any records in its charge for statistical purposes;
➢ The Director General may require a public body to co-operate
with him on assessing the statistical potential of its records
and in developing its recording methods and systems for
statistical purposes;
➢ A public body shall consult with the Director General, and
accept his reasonable recommendations, if it proposes to
introduce or revise any system for the storage and retrieval of
information or to make a statistical survey.
¹ http://www.irishstatutebook.ie/1993/en/act/pub/0021/print.html
A joined-up data system (after Thygesen¹)
¹ Thygesen, Lars, 2011, The importance of the archive statistical idea for the development of social statistics and population and housing censuses in
Denmark, http://ww4.dst.dk/upload/nordbotten_and_denmark_final_draft_4.pdf
Joined-up data and the CSO
• The CSO’s Business Register is fully aligned with administrative
sources from the Revenue Commissioners.
• Linkage between persons and businesses is available to the CSO from
employer tax returns to the Revenue Commissioners.
• Ireland has a comprehensive buildings database for the state,
the Geodirectory¹, available on a commercial basis.
• Ireland does not yet have a postcode system, but one is planned for
2013.
• The Department of Social Protection maintains the master list of official
Personal Public Service Numbers (PPSN) in the state. This list is the
basis of the CSO’s Person Activity Register, which identifies each
person’s engagement with key administrative systems.
¹ http://www.geodirectory.ie/
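The idea of a register that records each person's engagement with administrative systems can be sketched roughly as follows. This is a minimal illustration only, not the CSO's implementation: the identifiers, the system names, and the function `build_activity_register` are all hypothetical, and the real register is not described in enough detail here to reproduce.

```python
# Hypothetical master list of identifiers (stand-ins for PPSNs).
master_list = ["P001", "P002", "P003"]

# Hypothetical (identifier, administrative system) sightings.
engagements = [
    ("P001", "employment"),
    ("P001", "social_welfare"),
    ("P003", "education"),
    ("P999", "employment"),  # not on the master list, so it is skipped
]


def build_activity_register(master_list, engagements):
    """Map every master-list identifier to the set of systems it appears in."""
    register = {pid: set() for pid in master_list}
    for pid, system in engagements:
        if pid in register:
            register[pid].add(system)
    return register


register = build_activity_register(master_list, engagements)
```

Starting from the master list rather than from the sightings means that a person with no recorded activity still appears in the register, with an empty set.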
The CSO’s Administrative Data Centre
• The Administrative Data Centre (ADC) is the CSO unit designated
as the conduit for data transfers from other government bodies and
is the central repository for received data from those bodies.
• This unit currently maintains over fifty different administrative data
flows serving the statistical production systems in the CSO.
• ADC controls access to the data in accordance with confidentiality
obligations under national and EU legislation.
• Subject to these obligations, ADC may also make anonymized data
available as Research Micro Files (RMFs) to external researchers.
Following the setting-up of the ADC...
ADC interaction with other public bodies
• ADC policy is to implement institutional-level Memorandums of
Understanding (MoUs) to underpin the flow of administrative data to
the CSO, as distinct from having data flow-specific MoUs.
• In the case of the Office of the Revenue Commissioners, the MoU¹
has led to a relationship which has allowed the CSO to adopt a
business register that is based on the Revenue Commissioners’
registration system and to use the Revenue Customer Number as a
common business identifier between the two bodies.
• The government has charged the CSO with developing a statistical
code of practice for the Irish public service. The ADC is progressing
this objective through its chairing of the Statistician Liaison Group, a
forum of statistical units across the public service.
¹ http://www.cso.ie/en/aboutus/descriptionsandfunctions/memorandumofunderstandingbetweenthecsoandrevenue/
ADC – technical aspects
• Data received from other government bodies are converted to SAS
datasets and held in a warehouse environment having Source,
Analysis, and External Researcher tiers.
• In the case of person-based administrative data, ADC anonymizes
such files before making them available to CSO users, as Analysis tier
data flows.
• All CSO staff have access, via a data portal, to core metadata and
summary statistics on all administrative data held.
• The data model for the administrative data held in the ADC domain is a
hierarchical model:

Data flow → Data flow instance → Instance version → Datasets
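One way to picture this four-level hierarchy is as nested record types. The sketch below is an assumption-laden illustration, not the ADC's actual schema: the class and field names are invented, and it assumes that an instance corresponds to one delivery period and that a version corresponds to a re-delivery of that instance.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dataset:
    name: str  # hypothetical dataset name


@dataclass
class InstanceVersion:
    version: int
    datasets: List[Dataset] = field(default_factory=list)


@dataclass
class DataFlowInstance:
    period: str  # assumed: one instance per delivery period
    versions: List[InstanceVersion] = field(default_factory=list)

    def latest(self) -> InstanceVersion:
        """Return the highest-numbered version of this instance."""
        return max(self.versions, key=lambda v: v.version)


@dataclass
class DataFlow:
    name: str
    instances: List[DataFlowInstance] = field(default_factory=list)


# Illustrative content only; the dataset names are made up.
flow = DataFlow(
    name="P35",
    instances=[
        DataFlowInstance(
            period="2011",
            versions=[
                InstanceVersion(1, [Dataset("p35_employee")]),
                InstanceVersion(2, [Dataset("p35_employee"), Dataset("p35_employer")]),
            ],
        )
    ],
)
current = flow.instances[0].latest()
```

Keeping versions beneath instances lets a corrected delivery replace an earlier one without losing the earlier datasets.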
ADC – technical aspects
• An example of an Analysis tier data flow is the P35 (employee)
dataset, which links person- and business-based registers as
illustrated here:
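As a rough stand-in for how such an Analysis tier flow might link the two registers, the sketch below replaces the person identifier with a keyed hash and then joins each row to a business register on the employer number. Everything in it is hypothetical: the field names, the sample rows, and the choice of HMAC-SHA256, which is swapped in for illustration because the method ADC uses to anonymize files is not stated here.

```python
import hashlib
import hmac

# Illustrative key only; a real key would be held securely and never hard-coded.
SECRET_KEY = b"example-key-not-a-real-secret"


def pseudonymise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier with a keyed hash (assumed technique)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical Source tier rows from an employer return.
p35_source = [
    {"ppsn": "P001", "employer_no": "E10", "pay": 30000},
    {"ppsn": "P002", "employer_no": "E10", "pay": 42000},
    {"ppsn": "P001", "employer_no": "E20", "pay": 5000},
]

# Hypothetical business register keyed on the employer number.
business_register = {
    "E10": {"sector": "Retail"},
    "E20": {"sector": "Construction"},
}


def to_analysis_tier(rows):
    """Drop the raw person identifier and keep a consistent person key."""
    return [
        {
            "person_key": pseudonymise(r["ppsn"]),
            "employer_no": r["employer_no"],
            "pay": r["pay"],
        }
        for r in rows
    ]


def link_to_businesses(rows, businesses):
    """Attach business attributes to each row with a matching employer."""
    linked = []
    for r in rows:
        business = businesses.get(r["employer_no"])
        if business is not None:
            linked.append({**r, **business})
    return linked


analysis = to_analysis_tier(p35_source)
linked = link_to_businesses(analysis, business_register)
```

Because the same identifier always maps to the same key, a person's jobs with different employers can still be brought together without the raw identifier appearing in the Analysis tier.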
The future – concrete objectives
• The key challenge for the CSO will be to avail of the
increasing opportunities for joining up available administrative
data sources. Steps to complete a fully joined-up data system
in Ireland might include:
➢ The implementation in public administration systems of a link
between a person and a residence, where the residence is itself
identified by a location or (x,y)-based identifier;
➢ The mandatory use of the PPSN in the engagement of persons
with the state through the different life stages;
➢ The implementation of a unique business identifier for
businesses interacting with the state, and the linking of this
identifier with a building identification number.
The future – critical success factors
• Statistical code of practice for the Irish public sector
• Partnership approach to development of joined-up data
• Delivery of projects that provide value for policy purposes
Conclusion
The Irish Statistical System continues to face significant
challenges in the years ahead; however, in the words of W.
Edwards Deming, “It is not necessary to change. Survival is
not mandatory.”