Privacy Enhancing Technologies

ICT in Health Research
Challenges and Opportunities
for Privacy Protection
From Obstruction to Construction
Speakers
Filip De Meyer
Department of Medical Informatics & Statistics
University Hospital
Ghent – Belgium
Filip.DeMeyer@UGent.Be
Frank Robben
General manager
Crossroads Bank for Social Security & eHealth platform
Brussels - Belgium
Frank.Robben@mail.fgov.be
“The Modern World is a Data Driven World”
risks & challenges
benefits & opportunities
Setting a knowledge claim means that researchers
start a project with certain assumptions
about how they will learn and what they will learn during their inquiry.
These claims might be called paradigms
(Lincoln and Guba, 2000; Martens, 1998)
Research hypothesis generation
basic research
...
observational
epidemiological studies
A priori defined associations: a fraction of possible relations
paradigm shift
deductive → inductive
Data trawling in search of associations with
statistical significance
Lancet 1996; 348:1152-53
Changing research models
• data trawling/fishing
• genome wide association studies (bio-identity !)
• data mining of association studies (basic, family history,
genetics, epigenetics, transcriptomics,...)
• translational medicine (“bench to bedside”)
• personalised medicine
• (bidirectional) integration of EHR & clinical research
• world wide service provision (e.g. genetic testing)
• preservation of samples (regeneration of bio-identity)
• PHR & patient empowerment
Informational Privacy awareness
“People don’t react to reality;
they react to their perceptions of reality”
Different perceptions
• health research: “Perform research”
• regulatory authorities: “Enforce protection of personal privacy”
• data privacy protection services: “Provide protective solutions that are effective”
Specificity of a privacy protection context
Specific privacy context, shaped by:
• European level (DPD)
• national legislation
• other regulations (e.g. CGP)
• local ethics committees
• data subject
Importance of Privacy Policy !
Data categories
• anonymous data
– data that cannot be related to an identified or
identifiable person by anyone
– are not personal data => privacy protection
regulation does not apply
Data categories
• coded data
– data that cannot be related to an identified or
identifiable person by the controller of the data
processing, but that can be related to an
identified or identifiable person by someone else
(e.g. an intermediary organization)
– are personal data => privacy protection law
applies
Data categories
• non-coded personal data
– data that can be related to an identified or
identifiable person by the controller of the data
processing
– are personal data => privacy protection law
applies
Evaluation of identifiability
• an identifiable person is one who can be identified,
directly or indirectly, in particular by reference to an
identification number or to one or more factors
specific to his physical, physiological, mental,
economic, cultural or social identity
• to determine whether a person is identifiable,
account should be taken of all the means likely
reasonably to be used either by the controller of the
data processing or by any other person to identify
the said person
Basic principles of privacy protection law
• fair and lawful processing
• purpose limitation
– personal data have to be collected for specified, explicit
and legitimate purposes
– and must not be further processed in a way incompatible
with those purposes
• proportionality
– personal data have to be adequate, relevant and not
excessive in relation to the purposes for which they are
collected or further processed
– avoid identification longer than necessary
Basic principles of privacy protection law
• transparency for the data subject, including
– about the purposes of data processing
– about the identity of the controller of the data processing
• obligations of the controller of the data processing,
including
– informing the data subject
– keeping processed data accurate and up to date
– guaranteeing sufficient information security
• rights of the data subject
Privacy principles are challenged
• large amounts of data (proportionality)
• undefined research hypotheses (purpose limitation)
• genetic data: neither identifiability nor information content
is fully defined
• bio-identities challenge de-identification schemes
• distributed data repositories (who is the controller ?)
• cloud computing & international service provision
• etc.
A common misunderstanding
• Research usage of data collected from a patient for diagnostic or treatment
purposes: secondary use of data
• Research usage of data collected from a patient enrolled in a study (within
the defined study): primary use of data
Privacy protection = risk management
• balance between research benefits and privacy risks
• privacy legislation is a reference
• determine the privacy risks (research models)
• effective data privacy protection framework
– organisational and physical protection
– protect against unauthorised access to the data
– apply privacy enhancing technologies
• complementary restrictions on the use of data (e.g.
non-discrimination legislation on data use)
• define and come to terms over residual risk
Good practice
• if possible, secondary use of data for research
purposes should be conducted on anonymous data
• if research is not possible based on anonymous data,
secondary use of data for research purposes should
be conducted on coded data, with appropriate
guarantees
• only if research is not possible based on anonymous
or coded data, secondary use of data for research
purposes can be conducted on non-coded personal
data, with appropriate guarantees
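The preference order above (anonymous, then coded, then non-coded personal data) can be written as a small selection rule. This is only an illustrative sketch: the names `DataCategory` and `choose_category` are hypothetical, and deciding on which categories a study is actually feasible remains a human and legal judgment.

```python
from enum import Enum
from typing import Iterable


class DataCategory(Enum):
    # Ordered from least to most identifying, as in the data categories above.
    ANONYMOUS = 1
    CODED = 2
    NON_CODED = 3


def choose_category(feasible: Iterable[DataCategory]) -> DataCategory:
    """Return the least identifying category on which the research is feasible."""
    candidates = list(feasible)
    if not candidates:
        raise ValueError("the research must be feasible on at least one category")
    return min(candidates, key=lambda category: category.value)
```

For example, a study that cannot be done on anonymous data but can be done on coded data would be steered to coded data, even if non-coded data are also available.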
Example: Belgian regulation
• secondary use of coded data for research purposes
– notification to the Privacy Commission prior to further
processing for research purposes
• specific motivation of the need for coded data
• complementary information in case of need for processing of
coded sensitive or health data
– coding prior to further processing for research purposes
• by the controller of the original data or an intermediary
organization when data originate from one controller
• by an intermediary organization when the data originate from
several controllers
• the intermediary organization needs to be independent from the
controller of the further processing for research purposes
Example: Belgian regulation
• secondary use of coded data for research purposes
– coded data may only be disclosed to the controller of the
further processing for research purposes after receipt of
the proof of the notification to the Privacy Commission
– information duty of the controller of the original data or
the intermediary organization towards the data subjects,
unless
• impossibility to inform the data subjects
• information duty involves a disproportionate effort
• data are coded by an intermediary organization being an
administrative authority having the explicit legal task to act as an
intermediary organization (e.g. the eHealth platform)
Example: Belgian regulation
• secondary use of non-coded personal data for
research purposes
– notification to the Privacy Commission prior to further
processing for research purposes
• specific motivation of the need for non-coded personal data
– explicit informed consent of the data subjects prior to
further processing for research purposes, unless
• data are public
• information duty involves a disproportionate effort (notification
duty to the Privacy Commission in case of sensitive or health data)
Example: Belgian regulation
• secondary use of non-coded personal data for
research purposes
– non-coded personal data may only be disclosed to
the controller of the further processing for
research purposes after receipt of the proof of
the notification to the Privacy Commission
Example: Belgian regulation
• authorization of exchange of health data
– every exchange of non-coded personal data has to be
authorized by the data subject, by the law, or by a
specialized sectoral committee of the Privacy
Commission
– every coding of data by the eHealth platform has to be
authorized by the sectoral committee, indicating whether
the encoding should be reversible or irreversible
– every anonymizing of data by the eHealth platform has to
be authorized by the sectoral committee
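The difference between reversible and irreversible coding can be illustrated with a minimal sketch. It does not describe how the eHealth platform implements coding; `ReversibleCoder` and `irreversible_code` are hypothetical names. The reversible variant assumes the intermediary keeps a lookup table, and the irreversible variant assumes a keyed hash (HMAC-SHA256) whose key stays with the intermediary.

```python
import hashlib
import hmac
import secrets


class ReversibleCoder:
    """Reversible coding: the intermediary keeps a table from code back to identifier."""

    def __init__(self) -> None:
        self._code_of: dict[str, str] = {}
        self._identifier_of: dict[str, str] = {}

    def code(self, identifier: str) -> str:
        # The same identifier always receives the same random code.
        if identifier not in self._code_of:
            nym = secrets.token_hex(8)
            self._code_of[identifier] = nym
            self._identifier_of[nym] = identifier
        return self._code_of[identifier]

    def decode(self, nym: str) -> str:
        # Controlled re-identification: only the holder of the table can do this.
        return self._identifier_of[nym]


def irreversible_code(identifier: str, key: bytes) -> str:
    """Irreversible coding: a keyed one-way hash, with no table to look up."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
```

Note that the keyed hash is only as irreversible as the key is secret: anyone who holds the key and a list of candidate identifiers can recompute the codes.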
Example: Belgian Sectoral Committee
• established within the Privacy Commission
• consists of
– 2 members of the Privacy Commission
– 4 medical doctors appointed by Parliament
• tasks
– to provide authorizations for (electronic) exchange of personal health
data, in situations not regulated by law
– to determine information security policies with regard to the
processing of personal health data
– to give advice and recommendations with regard to information
security related to the processing of personal health data
– to handle complaints with regard to the violation of information
security policies during the processing of personal health data
Example: Belgian implementation
• creation of the eHealth platform, having as a mission
– to optimize healthcare quality and continuity
– to optimize safety
– to simplify administrative formalities for all healthcare
actors
– to reliably support healthcare policy and research
through
– a well-organised, mutual electronic service and
information exchange between all healthcare actors
– with the necessary guarantees in the area of information
security, privacy protection and professional secrecy
Belgian eHealth platform: board of
directors
• 7 representatives of the health care providers and
institutions
• 7 representatives of the sickness funds and patient
organizations
• 7 representatives of the public services with
competences in health care
• representatives of the Ministers of Health, Social
Affairs, Computerization and Budget
• representatives of the Order of Physicians and the
Order of Pharmacists with advisory vote
Belgian eHealth platform: basic architecture
[Figure: layered architecture. Users (patients, healthcare providers and institutions) reach VAS through the Health Portal, the RIZIV-INAMI site, care provider software, healthcare institution software, the eHealth platform portal and MyCareNet. The basic services of the eHealth platform and the network connect these to the ADS of the suppliers.]
Belgian eHealth platform: basic services
• coordination of electronic processes
• web portal (https://www.ehealth.fgov.be)
• integrated user and access management
• logging management
• system for end-to-end encryption
• personal electronic mailbox for each healthcare
supplier
• electronic time stamping
• coding and anonymizing
• reference directory
Belgian eHealth platform: coding and
anonymizing
Privacy by design
• start from privacy risk analysis (privacy impact
analysis)
• attack models (observational data) /residual risk
definition
• obtain and document authorisations
• involve research project key actors
• verify ethical/privacy constraints for secondary use
• record privacy related metadata for data assets
• use Privacy Enhancing Technologies (PET)
• protect research data from re-identification
• aim: automated enforcement of privacy policy rules
“Information security is a journey,
not a destination”
[Figure: data security vulnerabilities, threats and impacts, addressed by PET, policy enforcing, access control and physical protection]
Breach and Incident Reporting ?
Studies conducted on behalf of the European Network and
Information Security Agency (ENISA) recommend that the EU
should introduce a comprehensive security-breach
notification law.
Complementary building blocks
• “traditional” data security
– encryption, authentication, authorisation, audit
trails, signatures,...
– physical protection of assets
• privacy/security policies and procedures
– IRBs in research organisations
– enforcement/training/awareness
• Privacy Enhancing Technology
“Traditional” data security
• control access to systems, data assets
• based upon authorisation for roles
• attributed to individuals
• trustworthy sources to support security
decisions (identities, roles, authorisations)
• awareness/enforcement of security policies
• integrate into protected research
environments (circles of trust)
• increased interoperability (standards !)
Privacy Enhancing Technology
• complementary to access control of data assets
• based on identity management and de-identification
• various identity domains/realms
• set of privacy enhancing functions and methods
• combination of third party service provision,
software agents and tools
• privacy violation detection
• requires trusted service provision “TTP”
• linkage functionality otherwise not allowed !
• use of cryptographic techniques
Pfitzmann-Hansen terminology
• anonymity
• unlinkability
• unobservability
• pseudonymity
• etc.
http://dud.inf.tu-dresden.de/Anon_Terminology.shtml
The role of PETs ?
PETs can help to design information and communication systems and
services in a way that minimises the collection and use of personal
data and facilitate compliance with data protection rules. The use of
PETs should result in making breaches of certain data protection rules
more difficult and/or helping to detect them.
Memo/07/159 of the EU-Commission
Examples of PET functions
• de-identification of personal data
• “coding” (pseudonymisation) of personal data
• linking and aggregating de-identified or
personal data
• controlled re-identifications
• etc.
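As an illustration of the coding and linking functions listed above, the following sketch shows how a trusted third party might link records from several sources under one pseudonym, so that the researcher never receives the original identifier. The function names, the record layout and the choice of HMAC-SHA256 are assumptions made for this example, not a description of any particular PET service.

```python
import hashlib
import hmac


def pseudonym(identifier: str, key: bytes) -> str:
    """Derive a stable pseudonym from an identifier with a secret key held by the TTP."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def link_sources(sources, key: bytes) -> dict:
    """Merge records from several sources under the same pseudonym.

    Each source is a list of (identifier, payload) pairs, where payload is a
    dict of non-identifying data items. The result maps pseudonym to the
    merged payload and no longer contains the identifiers.
    """
    linked: dict = {}
    for records in sources:
        for identifier, payload in records:
            linked.setdefault(pseudonym(identifier, key), {}).update(payload)
    return linked
```

Because every source is coded with the same key, records about the same person end up under the same pseudonym, which is the linkage functionality that would otherwise require exchanging identifying data.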
Example of a PET application
(cervical cancer research)
[Figure. Inputs: PAP smear/clinical data; questionnaire; follow-up. Flow labels: one-shot extraction of personal data; pseudonymised data; live updates with personal data. Components: Privacy Protection Services; case repository (de-identified data).]
Reduction of identifying information
personal data → de-identified data, driven by the privacy policy:
• delete identifier
• transform date
• produce nym
• delete data items
• encrypt data items
• …
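The operations named above (delete identifier, transform date, produce nym, delete data items) could be driven by a declarative policy, roughly as in the sketch below. The action names, the field names and the `apply_policy` function are hypothetical. Encryption of data items is left out because it needs a cryptographic library beyond the Python standard library, and the nym function is passed in by the caller. Fields that the policy does not mention are dropped, which is a deliberately cautious default chosen for this example.

```python
from datetime import date
from typing import Callable


def apply_policy(record: dict, policy: dict, make_nym: Callable[[str], str]) -> dict:
    """Apply a per-field privacy policy to one record of personal data.

    Supported actions: "keep", "delete", "year_only" (transform a date to
    its year) and "produce_nym" (replace an identifier by a pseudonym).
    """
    result = {}
    for field, value in record.items():
        action = policy.get(field, "delete")  # unlisted fields are dropped
        if action == "delete":
            continue
        if action == "keep":
            result[field] = value
        elif action == "year_only":
            if not isinstance(value, date):
                raise TypeError(f"field {field!r} is not a date")
            result[field] = value.year
        elif action == "produce_nym":
            result["nym"] = make_nym(str(value))
        else:
            raise ValueError(f"unknown policy action {action!r}")
    return result
```

The policy is written once and can then be applied to every record of an extraction, or again to later follow-up data, with the same result for the same input.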
Tools for PET application
(DICOM example)
[Figure: example DICOM data items shown as replaced by nym, original, or cleared]
Make a “data protection” configuration
once… run it several times…
XML example
The concept of identification
[Figure: associations between a set of data subjects and a set of characteristics (elements labelled a to h)]
A data subject is identified (within a set of data subjects) if it can be singled out among
other data subjects.
Some associations between characteristics and data subjects are more persistent in
time (e.g. a national security number, date of birth) than others (e.g. an e-mail
address).
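The notion of being singled out can be made concrete with a small sketch: a data subject is singled out by a set of characteristics if no other subject in the set shares the same combination of values. The function name `singled_out` and the data layout are assumptions made for illustration.

```python
from collections import defaultdict


def singled_out(subjects: dict, characteristics: list) -> list:
    """Return the subjects that are unique on the given characteristics.

    `subjects` maps a subject label to a dict of characteristic values.
    """
    groups = defaultdict(list)
    for label, values in subjects.items():
        signature = tuple(values.get(name) for name in characteristics)
        groups[signature].append(label)
    # A group of size one means its member can be told apart from all others.
    return sorted(members[0] for members in groups.values() if len(members) == 1)
```

With three subjects who share a postcode, the postcode alone singles out nobody, whereas postcode combined with year of birth may single out the one subject born in a different year.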
Determining identifiability
“To determine whether a person is identifiable, account should be
taken of all the means likely reasonably to be used either by
the controller or by any other person to identify the said
person; whereas the principles of protection shall not apply to
data rendered anonymous in such a way that the data subject is
no longer identifiable; whereas codes of conduct within the
meaning of Article 27 may be a useful instrument for providing
guidance as to the ways in which data may be rendered
anonymous and retained in a form in which identification of the
data subject is no longer possible”. (Recital 26 of the DPD)
• refine the concept of identifiability/anonymity
• take into account “means likely” and “any other
person” through re-identification risk analysis
Levels of de-identification ?
(ISO/IEC DTS 25237)
• Level 1: removal of clearly identifying data
(“rules of thumb”)
• Level 2: static, model based re-identification
risk analysis (include “attacker models”)
• Level 3: continuous re-identification risk
analysis of live databases (e.g. outlier issues)
Targets for de-identification can be set
and liabilities better defined in risk
analysis and policies
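A static, model based risk analysis (Level 2) with an explicit target could look roughly like the sketch below. It assumes one specific attacker model, often called the prosecutor model in the de-identification literature: the attacker knows the target is in the data set and knows the values of a chosen set of quasi-identifiers. Under that assumption the highest re-identification risk is one divided by the size of the smallest group of records sharing the same quasi-identifier values. The function names and the threshold are illustrative and are not taken from ISO/IEC DTS 25237.

```python
from collections import Counter


def max_reidentification_risk(records: list, quasi_identifiers: list) -> float:
    """Highest per-record risk: 1 / size of the smallest equivalence class."""
    if not records:
        raise ValueError("cannot assess an empty data set")
    classes = Counter(
        tuple(record[name] for name in quasi_identifiers) for record in records
    )
    return 1 / min(classes.values())


def meets_target(records: list, quasi_identifiers: list, target_risk: float) -> bool:
    """Check the data set against a de-identification target set by policy."""
    return max_reidentification_risk(records, quasi_identifiers) <= target_risk
```

For a live database (Level 3) the same check would have to be repeated as records are added, since a single new outlier can create a group of size one.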
Requirements for PET-TTPs
• legal status of provider must be clear and
transparent
• independent of the (data sources) and destinations
• using state-of-art ICT and cryptographic technologies
• transparent service level agreements
• internal security procedures documented and
verifiable (technical and organisational/procedural)
• no “security through obscurity”
• standards for service provision/interfacing
• …
PET- issues to be addressed
• differences in perception on basic concepts of identifiability
• controlled re-identification part of legislation ?
• de-identification is not “processing” in DPD sense
• trustworthy operation of PET-TTPs
• incident reporting: when, how ?
• genomic data and bio-identity
• requirements for incidental findings reporting in research
• re-identification risk analysis
• attack models
• ID management in data governance
• ...
We thank you for your attention
Any questions ?