Overseas Access

Access to Microdata
The Australian Bureau of Statistics
Approach
Teresa Dickinson
teresa.dickinson@abs.gov.au
This talk...
Legislation and policy
Access modes
–Confidentialised unit record files (CURFs)
–Other
Overseas access to ABS microdata
[Figure: access modes arranged by level of protection (high to low) against level of detail (low to high). Labels: ABS Outputs; Published tables; Specialised tables; CD-ROM access; Remote access; ABS On-site Lab; ABS analysis/Consultancy; Section 16A (assist Statistician in carrying out functions); Regulation 7A (assist performance of statistical functions); Outside Census and Statistics Act]
Australian Legislation
A number of legislative provisions, either directly or
indirectly, can facilitate access to microdata
Our legislation allows release of microdata but only
“in a manner that is not likely to enable the identification of the particular
person or organisation to which it relates”
We can release information about businesses (not
individuals) 'to assist the statistician perform statistical
functions'; this involves collaborations to support the ABS
work program
We can second certain individuals to the ABS to 'assist the
Statistician perform statistical functions'
Why provide deeper access to
microdata? The Benefits
Valuable (and high quality) data is under-utilised.
Researchers may try to collect substitute data sets in order
to obtain microdata, which is a waste of public resources (to
obtain what is probably lower quality data).
Government agencies may look to use alternative data
providers to obtain survey data for research and analysis
purposes, resulting in lower quality data (which may not be
as widely accessible)
Risks of providing access
Misuse - deliberate and inadvertent
Respondents may come to believe that researchers can
identify their data, and possibly even use it
against them
Loss of trust in processes and work of national statistical
offices, leading to reduced response rates
A shift in emphasis...
From risk avoidance to risk management
Production of microdata files from household collections is
now routine
–well developed policies and processes exist
Beginning to explore ways of making business microdata
more accessible, given that it is rare to be able to produce a
confidentialised file
Communication with respondents?
Engaging with requests for overseas access on a case-by-case basis
Policy response - where ABS is
heading
Four layers of protection
–Protection in the data
–Access method
–User education / partnership
–Audit and sanctions
Increased variety of access channels
–CD-ROM, Remote Access Datalab, ABS Datalab,
collaborations
–different combinations but giving the required protection
Policy - who gets access, and how
Researchers - government or academic - with a particular
statistical purpose
Undertakings - legally enforceable within Australia
–won't attempt to identify or match
–won't share access etc.
–will abide by rules in a manual
Undertakings made by the institution and individuals who will
work with the data
Organisational level undertakings approved by a Deputy
Australian Statistician
Pricing
Australian Government agencies must charge for some
information products according to a set of guidelines
There is recovery of the marginal costs for development and
dissemination of CURFs
Access to a microdata file costs $A1,200 (+10% GST for
Australian users)
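As a worked example of the pricing arithmetic, the short sketch below computes the access price with and without GST. The function name `curf_price` is hypothetical, and the sketch assumes (as the slide implies) that GST applies to Australian users only.

```python
from decimal import Decimal

BASE_PRICE_AUD = Decimal("1200")  # access price per microdata file, from the slide
GST_RATE = Decimal("0.10")        # 10% GST; assumed to apply to Australian users only


def curf_price(is_australian_user: bool) -> Decimal:
    """Return the price in $A for access to one microdata file (hypothetical helper)."""
    if is_australian_user:
        return BASE_PRICE_AUD * (1 + GST_RATE)
    return BASE_PRICE_AUD


print(curf_price(True))   # Australian user: base price plus GST
print(curf_price(False))  # overseas user: base price only
```

On these figures an Australian user pays $A1,320 per file and an overseas user $A1,200.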
Policy - creation of files
Subject area creates files using a set of rules devised by the
methodology area (e.g. standard categories for some
variables)
Methodologists vet the files, making changes as necessary
to 'ensure' confidentiality, and 'declare' that the risks of
spontaneous identification are acceptably low
The Australian Statistician gives in-principle approval for
release of the microdata file
What the client sees...
One stop shop - all the information about how to access
microdata is on our website
One client contact point - the CURF Management Unit
(CMU). Clients submit undertakings through this channel, and the
CMU provides access once they have been approved
Internally, however, many areas are involved
–CMU
–Subject areas
–Methodology (assurance of confidentiality and auditing
of output)
–Policy area
ABS CURFs
BASIC: Less detailed data available for analysis
EXPANDED: Generally more detailed data available for analysis
SPECIALIST: May provide high level of detail for analysis; may include data for collections where previously CURFs could not be produced; may allow for integration with other datasets in a way that does not identify individuals
ACCESS MODE
CD-ROM
Yes
Remote Access Data Lab (RADL)
ABS On-site data lab (ABSDL)
Yes
Yes
Yes
Which CURFs?
CURFs are available from a range of ABS surveys (68 in total), including:
Aboriginal and Torres Strait Islander Social Survey
Aspects of Literacy
Australians' Employment and Unemployment Patterns
Business Longitudinal Survey
Census of Population & Housing
Child Care Survey
Disability, Ageing and Carers Survey
General Social Survey
Household Expenditure Survey
Income and Housing Costs Survey
Labour Mobility Survey
National Health Survey
Mental Health and Wellbeing of Adults Survey
Time Use Survey
Women's Safety Survey
How Researchers use CURFs
University Sector
- Ph.D. students - increasing use
- Undergraduate students - increasing use. With the remote access system, lecturers set course work because students can access the CURF online with
their individual passwords, which is less of a security risk than CD-ROM
Government Departments use CURFs as a basis to understand the
population to develop public policy
Recent increase in Government Departments using consultants to
do CURF analysis for their purposes.
Commercial Research Centres use CURFs to develop models for
policy analysis.
Examples of work arising from CURFs
Ellis, R.P. and Savage, E. (2004) Where do you run after you run for cover?
A model of the demand for private health insurance in Australia,
Australian Health Economics Conference, Melbourne, November 2004.
Cumpston, J. (2004) Models of the Future of Australia,
2004 Australian Population Association Conference.
Kok-Wee Ong, The Effect of Literacy on Earnings in Australia,
UNSW School of Economics Honours Thesis
Richardson, S. Society's Investment in Children,
National Institute of Labour Studies working paper WP151, Flinders University.
Remote Access Data Laboratory
(RADL)
A remote system that allows users to undertake analyses
in SAS, SPSS, or STATA on ABS CURFs
Instead of a CD-ROM users get a username and
password
There are various rules about printing records and
detailed tables - but looking at a few records is
permitted
Output is (electronically) audited. 94% of jobs are
returned within 2 minutes
- Remaining jobs are manually audited and most are
returned within 1 day
A random sample of all jobs is audited
Audit
Audit is critical to monitor user behaviour
All code and output stored
Cumulative file of all unit data viewed
All jobs have a chance of being inspected
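The audit flow described on these slides can be sketched roughly as follows. This is an illustration only, not the ABS implementation: the `Job` fields, the record threshold, and the sampling rate are all hypothetical, since the slides do not state the actual rules.

```python
import random
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    records_printed: int  # hypothetical measure of unit records listed in the output


# Hypothetical threshold: the slides say only that looking at "a few records" is permitted.
MAX_RECORDS_PRINTED = 10
# Hypothetical sampling rate for the random audit of all jobs.
SAMPLE_RATE = 0.05


def route(job: Job) -> str:
    """Return 'auto' if the job clears the electronic check, else 'manual'."""
    return "auto" if job.records_printed <= MAX_RECORDS_PRINTED else "manual"


def audit_sample(jobs, rng):
    """Pick a random sample from all jobs, whichever route they took."""
    return [job for job in jobs if rng.random() < SAMPLE_RATE]


jobs = [Job(1, 3), Job(2, 500)]
print([route(j) for j in jobs])
```

Sampling from every job, rather than only from those flagged for manual review, is what gives each job a chance of being inspected.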
Emerging issues
Clients require more functionality
–e.g. output to spreadsheet format, not text
–Ideally clients would like an interactive system
Clients want more detailed data
Clients want more business data
Clients want longitudinal data
Clients continue to be price sensitive
ABS On-site data lab (ABSDL)
Secure room and desktop
Locked down computer
Automatic logging of client activity
No data transmitting devices
No data or output to enter or leave the room with
the client.
ABSDL (cont.)
Specialist or interactive access to Expanded CURFs
–More detailed and/or sensitive data
–Potential future economic survey data
Interactive system
–SAS, SPSS, STATA, Excel
Available at all 8 State & Territory ABS Offices on an on-demand basis
Collaborations
A way to broaden the ABS work program by bringing in expertise
to 'assist the Statistician with statistical functions'
A way of providing access, for selected partners, to business
microdata that can't be produced as a CURF
Designed to be of use to both ABS and researcher
Access is akin to the on-site data lab, but data may be close to
recognisable (e.g. with only identifiers removed)
Still working out processes etc., but they are proving time
consuming (and therefore expensive) to establish and run
The ABS will never be in a position to undertake a large number of
collaborations
Overseas Access - ABS data to other
organisations
Have a policy
Undertakings not legally valid overseas - but we can apply
sanctions
Access on a project-by-project basis under these conditions
–project is of genuine benefit to Australian policy making
–organisation is known to us and trusted
–access is through RADL (almost always)
Processes to apply, pricing etc. are identical to Australian
access
Overseas access - international data
repositories (e.g. LIS)
Challenging!
Requires establishment of a genuinely collaborative
relationship
Processes etc. worked out on a case-by-case basis, but are
congruent with our overall policies
Detail of data to be released must be less than in our CURFs