Presentation - Felix Ritchie

UK Data Access Practices
Felix Ritchie
Overview
• The legislative model
• The data model
• The security model
• Developments
• Current key concerns
The legislative model (1)
• Mixture of statutes and common law until…
• Statistics and Registration Service Act 2007
– Didn’t abolish existing gateways for research
– Created a new gateway: ‘Approved Researchers’
– Allowed for cross-govt data sharing…
– …but not for research purposes unless specifically agreed
– Clarified limits of European data sharing
– ONS given a statutory duty to support research
The legislative model (2)
• No theoretical limits on who can have access to
enormous range of govt data
– both within govt and in academia
• …but not a free-for-all
• ONS has a duty to protect confidentiality
– even for Approved Researchers
– data release has to be consistent with need
→ the data model
The data model (1)
• ‘Spectrum’ of access points balancing
– value of data
– ease of use
– disclosure risk
• for a given level of confidentiality, maximise
data use and convenience
• no ‘one-size-fits-all’ solution
– no absolute prohibitions
– trade-off is made explicit
– users determine appropriate level of access
Use of confidential data: the access spectrum
[Figure: the access spectrum. Axes: type of access; SDC of inputs; restrictions on users; SDC of outputs (scale labels: None, Little, Many, Complete). Access points shown: ONS sites, VML, govt sites, RDCs, secure data service, special licences, licensed data archive, internet. Examples: census data and enterprise data, from original data, identified data for ONS linking and identifiable data for analysis (ONS contractor, govt. users only) through anonymisation (anon. CD-ROM) to web tables.]
The data model (2)
• Options should cover most cases
– Can’t be perfect in every case
– But the jump from one solution to another reflects
data utility and patterns of research use
• Pretty efficient
– Fairly transparent
– Users balance their own costs/benefits
– Economies of scale delivering mass solutions
– e.g. UKDA, VML
• How do we define/describe access points?
→ the security model
The security model (VML version)
• valid statistical purpose → safe projects
• + trusted researchers → safe people
• + anonymisation of data → safe data
• + technical controls around data → safe setting
• + disclosure control of results → safe outputs
⇒ safe use
Use of confidential data: the access spectrum for ONS data at present
Safety criterion | VML | SDS (provisional) | One-off cases | “Special Licence” | UK Data Archive | Internet
People* | ARs/Civil Servants | ARs | ? | ARs | UK academics | Anyone
Projects | Scrutiny by MRP | Scrutiny by MRP | Scrutiny by MRP | Scrutiny by MRP | Academic projects | None
Data (in theory) | Any | Unidentified | N/A | Anonymised, low risk of identification | Anonymised, almost no risk of identification | Anonymised, no risk of identification
Data (in practice) | Unidentified | Unlinkable? | | Anonymised, low risk of identification | Anonymised, almost no risk of identification | Anonymised, no risk of identification
Settings | Secure thin client | Secure thin client | ? | Use on restricted IT systems | Use by academics only | None
Outputs | ONS staff checked | SDS staff checked, ONS guidelines | ? | Researchers agree to follow ONS guidelines | No checking | No checking
*AR = Approved Researcher
Access: a summary
• No theoretical restrictions
• wide ranging and flexible legal basis
Remote access in the UK: the VML (1)
• Probably the most important research data
resource in the UK after the UK Data Archive
(and the internet)
• Expanding access from other govt depts.
• Data acquisitions:
– internal ONS versions of social datasets
– Other government dept data
– Administrative data
– Census 2011 detailed microdata?
Remote access in the UK: the VML (2)
• Highly theorised
– Particularly in disclosure control
• Strong researcher relationship
– compulsory training gives initial investment in
researcher buy-in
• Next stage: full cost-benefit analysis
– Planning model in context of new alternatives
– CBA to include purpose of RDC
Developments in remote access
• VML clones being set up in academia
– Possibly elsewhere in govt too
– No possibility of VML being accessible over
internet in near future
– Likely to develop into a two-tier system
• VML practices and models adopted
– for increasing range of data
– across wider range of operations
Current key concerns
• IT
– lack of resource
– still some basic operational issues unresolved
• Delays in increasing access points
– partly money, partly IT, partly culture
• Demand growth
– 30%-50% each year 2003-2008
– Likely to be higher 2009-10
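To give a feel for what 30%-50% annual growth implies when compounded, here is a minimal sketch. It assumes five annual growth steps between 2003 and 2008 and a constant rate at each end of the range; neither assumption is stated on the slide, and the function name is purely illustrative.

```python
def cumulative_growth(annual_rate: float, years: int) -> float:
    """Return the multiple of the starting level after compounding."""
    return (1.0 + annual_rate) ** years


# Assumption (not from the slides): five annual steps, 2003 -> 2008.
YEARS = 5

low = cumulative_growth(0.30, YEARS)   # lower end of the quoted range
high = cumulative_growth(0.50, YEARS)  # upper end of the quoted range

print(f"Demand multiple over {YEARS} years: {low:.1f}x to {high:.1f}x")
```

Under those assumptions, demand at the end of the period would be roughly 3.7 to 7.6 times its starting level.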
Current potential concerns
• Potential in Statistics Act
– possibility for ONS’ policies to be challenged
– surprising (unwelcome) demands for information?
• social data in VML partially a pre-emptive response
• New data types bringing new rules
• Fragmentation of RDC practice in UK
Background concern: fear of the new
• Relative risk still poorly understood
– Example
• VML temporarily closed for potential security flaw
• One data area returned to old non-VML solution: letting
external visitors log on using ONS staff usernames
• VML was re-opened after a week for ONS staff and only three
weeks later for external visitors
• But the flaw could only be exploited by ONS staff…
• Resistance to virtual solutions in favour of familiar
– remote access always seen as a limitation despite
much better data quality
– ‘distributed access’ no substitute for ‘distributed data’
Not current key concerns
• Staff resources
– Fast training time
– Supportive researcher base
• researcher buy-in => very lean processes
• Methodological issues
– RDC-specific SDC methods proving robust
• Legal issues
– Statistics law so far proving flexible enough to provide
reasonable responses to all needs
• “reasonable”=ONS and researchers happy that balance
between access and confidentiality is fair
Summary
• Clear legislative model and strong theoretical
basis
– policy decisions relatively easy
• Main difficulty for ONS is managing
expansion of demand
– meeting ONS internal needs (just, for now)
– long way off meeting external demands
Contact
• Felix Ritchie
felix.ritchie@ons.gsi.gov.uk
• Microdata Analysis and User Support
maus@ons.gsi.gov.uk
VML resources
[Figure: VML staffing chart comparing three scenarios (Target June 09, Current, Minimum) by grade (G6, G7, SEO, HEO/RO, EO, AO/AA), grouped into strategic management, operational management, operations support, and operations and analysis, with strategic resources marked.]