ICT in Health Research
Challenges and Opportunities for Privacy Protection
From Obstruction to Construction

Speakers
Filip De Meyer
Department of Medical Informatics & Statistics
University Hospital Ghent – Belgium
Filip.DeMeyer@UGent.Be

Frank Robben
General manager, Crossroads Bank for Social Security & eHealth platform
Brussels - Belgium
Frank.Robben@mail.fgov.be

“The Modern World is a Data Driven World”
risks & challenges
benefits & opportunities

Setting a knowledge claim means that researchers start a project with certain assumptions about how they will learn and what they will learn during their inquiry. These claims might be called paradigms (Lincoln and Guba, 2000; Mertens, 1998)

Research hypothesis generation
basic research ... observational epidemiological studies
A priori defined associations: a fraction of possible relations

paradigm shift: deductive → inductive
Data trawling in search of associations with statistical significance
Lancet 1996; 348:1152-53

Changing research models
• data trawling/fishing
• genome wide association studies (bio-identity!)
• data mining of association studies (basic, family history, genetics, epigenetics, transcriptomics, ...)
• translational medicine (“bench to bedside”)
• personalised medicine
• (bidirectional) integration of EHR & clinical research
• world wide service provision (e.g. genetic testing)
• preservation of samples (regeneration of bio-identity)
• PHR & patient empowerment

Informational Privacy awareness
“People don’t react to reality; they react to their perceptions of reality”

Different perceptions
• health research: “Perform research”
• regulatory authorities: “Enforce Protection of personal privacy”
• data privacy protection services: “Provide protective solutions that are effective”

Specificity of a privacy protection context
European Level (DPD)
national legislation
other Regulations (e.g. CGP)
Specific privacy context
local ethics committees
Importance of Privacy Policy!
data subject

Data categories
• anonymous data
  – data that cannot be related to an identified or identifiable person by anyone
  – are not personal data => privacy protection regulation does not apply

Data categories
• coded data
  – data that cannot be related to an identified or identifiable person by the controller of the data processing, but that can be related to an identified or identifiable person by someone else (e.g. an intermediary organization)
  – are personal data => privacy protection law applies

Data categories
• non-coded personal data
  – data that can be related to an identified or identifiable person by the controller of the data processing
  – are personal data => privacy protection law applies

Evaluation of identifiability
• an identifiable person is one who can be identified, directly or indirectly, in particular by reference to an identification number or to one or more factors specific to his physical, physiological, mental, economic, cultural or social identity
• to determine whether a person is identifiable, account should be taken of all the means likely reasonably to be used either by the controller of the data processing or by any other person to identify the said person

Basic principles of privacy protection law
• fair and lawful processing
• purpose limitation
  – personal data have to be collected for specified, explicit and legitimate purposes
  – and must not be further processed in a way incompatible with those purposes
• proportionality
  – personal data have to be adequate, relevant and not excessive in relation to the purposes for which they are collected or further processed
  – avoid identification longer than necessary

Basic principles of privacy protection law
• transparency for the data subject, among others
  – about the purposes of data processing
  – about the identity of the controller of the data processing
• obligations of the controller of the data processing, among others
  – informing the data subject
  – keeping processed data accurate and up to date
  – guaranteeing sufficient information security
• rights of the data subject

Privacy principles are challenged
• large amounts of data (proportionality)
• undefined research hypotheses (purpose limitation)
• genetic data: neither identifiability nor information content is fully defined
• bio-identities challenge de-identification schemes
• distributed data repositories (who is the controller?)
• cloud computing & international service provision
• etc.

A common misunderstanding
Research usage of data collected from a patient for diagnostic or treatment purposes => secondary use of data
Research usage of data collected from a patient enrolled in a study (within the defined study) => primary use of data

Privacy protection = risk management
• balance between research benefits and privacy risks
• privacy legislation is a reference
• determine the privacy risks (research models)
• effective data privacy protection framework
  – organisational and physical protection
  – protect the data against unauthorised access
  – apply privacy enhancing technologies
• complementary restrictions on the use of data (e.g.
non-discrimination legislation on data use)
• define and come to terms with the residual risk

Good practice
• if possible, secondary use of data for research purposes should be conducted on anonymous data
• if research is not possible based on anonymous data, secondary use of data for research purposes should be conducted on coded data, with appropriate guarantees
• only if research is not possible based on anonymous or coded data, secondary use of data for research purposes can be conducted on non-coded personal data, with appropriate guarantees

Example: Belgian regulation
• secondary use of coded data for research purposes
  – notification to the Privacy Commission prior to further processing for research purposes
    • specific motivation of the need for coded data
    • complementary information in case of need for processing of coded sensitive or health data
  – coding prior to further processing for research purposes
    • by the controller of the original data or an intermediary organization when data originate from one controller
    • by an intermediary organization when the data originate from several controllers
    • the intermediary organization needs to be independent from the controller of the further processing for research purposes

Example: Belgian regulation
• secondary use of coded data for research purposes
  – coded data may only be disclosed to the controller of the further processing for research purposes after receipt of the proof of the notification to the Privacy Commission
  – information duty of the controller of the original data or the intermediary organization towards the data subjects, unless
    • impossibility to inform the data subjects
    • information duty involves a disproportionate effort
    • data are coded by an intermediary organization being an administrative authority having the explicit legal task to act as an intermediary organization (e.g.
the eHealth platform)

Example: Belgian regulation
• secondary use of non-coded personal data for research purposes
  – notification to the Privacy Commission prior to further processing for research purposes
    • specific motivation of the need for non-coded personal data
  – explicit informed consent of the data subjects prior to further processing for research purposes, unless
    • data are public
    • information duty involves a disproportionate effort (notification duty to the Privacy Commission in case of sensitive or health data)

Example: Belgian regulation
• secondary use of non-coded personal data for research purposes
  – non-coded personal data may only be disclosed to the controller of the further processing for research purposes after receipt of the proof of the notification to the Privacy Commission

Example: Belgian regulation
• authorization of exchange of health data
  – every exchange of non-coded personal data has to be authorized by the data subject, by the law, or by a specialized sectoral committee of the Privacy Commission
  – every coding of data by the eHealth platform has to be authorized by the sectoral committee, indicating whether the encoding should be reversible or irreversible
  – every anonymizing of data by the eHealth platform has to be authorized by the sectoral committee

Example: Belgian Sectoral Committee
• established within the Privacy Commission
• consists of
  – 2 members of the Privacy Commission
  – 4 medical doctors appointed by Parliament
• tasks
  – to provide authorizations for (electronic) exchange of personal health data, in situations not regulated by law
  – to determine information security policies with regard to the processing of personal health data
  – to give advice and recommendations with regard to information security related to the processing of personal health data
  – to handle complaints with regard to the violation of information security policies during the processing of personal health data

Example: Belgian
implementation
• creation of the eHealth platform, having as a mission
  – to optimize healthcare quality and continuity
  – to optimize safety
  – to simplify administrative formalities for all healthcare actors
  – to reliably support healthcare policy and research
  through
  – a well-organised, mutual electronic service and information exchange between all healthcare actors
  – with the necessary guarantees in the area of information security, privacy protection and professional secrecy

Belgian eHealth platform: board of directors
• 7 representatives of the health care providers and institutions
• 7 representatives of the sickness funds and patient organizations
• 7 representatives of the public services with competences in health care
• representatives of the Ministers of Health, Social Affairs, Computerization and Budget
• representatives of the Order of Physicians and the Order of Pharmacists with advisory vote

Belgian eHealth platform: basic architecture
[Diagram: Users (patients, healthcare providers and institutions) reach VAS through the Health Portal, the RIZIV-INAMI site, care provider software, healthcare institution software, the eHealth platform Portal and MyCareNet; below these sit the Basic services of the eHealth platform, connected over a Network to the ADS of Suppliers.]

Belgian eHealth platform: basic services
• coordination of electronic processes
• web portal (https://www.ehealth.fgov.be)
• integrated user and access management
• logging management
• system for end-to-end encryption
• personal electronic mailbox for each healthcare supplier
• electronic time stamping
• coding and anonymizing
• reference directory

Belgian eHealth platform: coding and anonymizing

Privacy by design
• start from privacy risk analysis (privacy impact analysis)
• attack models (observational data) / residual risk definition
• obtain and document authorisations
• involve research project key actors
• verify ethical/privacy constraints for
secondary use
• record privacy related metadata for data assets
• use Privacy Enhancing Technologies (PET)
• protect research data from re-identification
• aim: automated enforcing of privacy policy rules

“Information security is a journey, not a destination”
[Diagram labels: data security, vulnerabilities, impacts, threats, PET, policy enforcing, access control, physical protection]

Breach and Incident Reporting?
Studies conducted on behalf of the European Network and Information Security Agency (ENISA) recommend that the EU should introduce a comprehensive security-breach notification law.

Complementary building blocks
• “traditional” data security
  – encryption, authentication, authorisation, audit trails, signatures, ...
  – physical protection of assets
• privacy/security policies and procedures
  – IRBs in research organisations
  – enforcing/training/awareness
• Privacy Enhancing Technology

“Traditional” data security
• control access to systems, data assets
• based upon authorisation for roles attributed to individuals
• trustworthy sources to support security decisions (identities, roles, authorisations)
• awareness/enforcement of security policies
• integrate into protected research environments (circles of trust)
• increased interoperability (standards!)

Privacy Enhancing Technology
• complementary to access control of data assets
• based on identity management and de-identification
• various identity domains/realms
• set of privacy enhancing functions and methods
• combination of third party service provision, software agents and tools
• privacy violation detection
• requires trusted service provision (“TTP”)
• linkage functionality otherwise not allowed!
• use of cryptographic techniques

Pfitzmann-Hansen terminology
• anonymity
• unlinkability
• unobservability
• pseudonymity
• etc.
http://dud.inf.tu-dresden.de/Anon_Terminology.shtml

The role of PETs?
PETs can help to design information and communication systems and services in a way that minimises the collection and use of personal data and facilitate compliance with data protection rules. The use of PETs should result in making breaches of certain data protection rules more difficult and/or helping to detect them.
Memo/07/159 of the EU-Commission

Examples of PET functions
• de-identification of personal data
• “coding” (pseudonymisation) of personal data
• linking and aggregating de-identified or personal data
• controlled re-identifications
• etc.

Example of a PET application (cervical cancer research)
[Diagram labels: PAP smear/clinical data; questionnaire; one-shot extraction of personal data; follow-up live updates with personal data; Privacy Protection Services; pseudonymised data; case repository (de-id. data)]

Reduction of identifying information
[Diagram: personal data pass through a privacy policy (delete identifier, transform date, produce nym, delete data items, encrypt data items, …) and come out as de-identified data]

Tools for PET application (DICOM example)
[Screenshot of examples; labels: replaced by nym, original, cleared]
Make a “data protection” configuration once… run it several times…
XML example

The concept of identification
[Diagram: a set of data subjects and a set of characteristics, with elements labelled a to h]
A data subject is identified (within a set of data subjects) if it can be singled out among other data subjects.
Some associations between characteristics and data subjects are more persistent in time (e.g. a national security number, date of birth) than others (e.g. an e-mail address).
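
A minimal sketch, in Python, of how the policy steps named above (produce nym, transform date, delete data items) and the notion of singling out fit together. The key, the field names and the sample records are all hypothetical; it is assumed here that a keyed hash (HMAC) is an acceptable way to produce a nym and that the key stays with the trusted third party.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret; in practice it would be held only by the TTP.
TTP_KEY = b"example-key-held-by-the-ttp"


def produce_nym(identifier: str, key: bytes = TTP_KEY) -> str:
    """Derive a stable pseudonym from an identifier with HMAC-SHA256."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def de_identify(record: dict) -> dict:
    """Toy privacy policy: produce nym, delete identifiers, transform date."""
    return {
        "nym": produce_nym(record["national_id"]),  # produce nym
        "birth_year": record["birth_date"][:4],     # transform date: year only
        "postcode": record["postcode"],
        "result": record["result"],
        # "national_id" and "name" are deleted by not copying them
    }


def singled_out(records: list, characteristics: tuple) -> list:
    """Return the records whose combination of characteristics is unique."""
    def combo(record: dict) -> tuple:
        return tuple(record[c] for c in characteristics)

    counts = Counter(combo(r) for r in records)
    return [r for r in records if counts[combo(r)] == 1]


# Hypothetical sample data
patients = [
    {"national_id": "A1", "name": "P1", "birth_date": "1970-03-12",
     "postcode": "9000", "result": "normal"},
    {"national_id": "B2", "name": "P2", "birth_date": "1970-11-02",
     "postcode": "9000", "result": "abnormal"},
    {"national_id": "C3", "name": "P3", "birth_date": "1985-06-30",
     "postcode": "1000", "result": "normal"},
]

coded = [de_identify(p) for p in patients]
unique = singled_out(coded, ("birth_year", "postcode"))
print(len(unique))  # only the 1985 / 1000 record is unique in this set
```

Even after the identifiers are removed, the third record can still be singled out by birth year and postcode alone, which is why de-identification targets are better set through re-identification risk analysis than through removal of obvious identifiers only.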
Determining identifiability
“To determine whether a person is identifiable, account should be taken of all the means likely reasonably to be used either by the controller or by any other person to identify the said person; whereas the principles of protection shall not apply to data rendered anonymous in such a way that the data subject is no longer identifiable; whereas codes of conduct within the meaning of Article 27 may be a useful instrument for providing guidance as to the ways in which data may be rendered anonymous and retained in a form in which identification of the data subject is no longer possible”. (Recital 26 of the DPD)
Refine the concept of identifiability/anonymity: take into account “means likely reasonably to be used” and “any other person” through re-identification risk analysis.

Levels of de-identification? (ISO/IEC DTS 25237)
• Level 1: removal of clearly identifying data (“rules of thumb”)
• Level 2: static, model based re-identification risk analysis (include “attacker models”)
• Level 3: continuous re-identification risk analysis of live databases (e.g. outlier issues)
Targets for de-identification can be set and liabilities better defined in risk analysis and policies

Requirements for PET-TTPs
• legal status of provider must be clear and transparent
• independent of the data sources and destinations
• using state-of-the-art ICT and cryptographic technologies
• transparent service level agreements
• internal security procedures documented and verifiable (technical and organisational/procedural)
• no “security through obscurity”
• standards for service provision/interfacing
• …

PET issues to be addressed
• differences in perception on basic concepts of identifiability
• controlled re-identification part of legislation?
• de-identification is not “processing” in DPD sense
• trustworthy operation of PET-TTPs
• incident reporting: when, how?
• genomic data and bio-identity
• requirements for incidental findings reporting in research
• re-identification risk analysis
• attack models
• ID management in data governance
• ...

We thank you for your attention
Any questions?