Agenda
• Problem
• Existing Approaches
• The e-Lab
• Is DRM the solution?

Climate Change

Problem
• Potentially identifiable data required for effective research
• Individuals have a right to confidentiality and privacy
• Potentially identifiable data should not be:
  – Redistributed
    • Release under defined conditions
  – Linked to other data
    • Risk of deductive disclosure
• Potentially identifiable data should be:
  – Stored securely
  – Destroyed after use

Potentially Identifiable Information
• Individual records, even if they do not include variables such as names, full postcodes, and dates of birth which would make them obviously identifiable
• Tabular data, based on small geographic areas, with cell counts of fewer than five cases/events (or where counts of less than five can be inferred by simple arithmetic), hereafter referred to as “sparse cells”
• Tabular data containing cells that have underlying population denominators of less than approximately 1,000
– Source: UKACR

Existing approaches
• Locked rooms, locked down machines
  – Used by many national statistical services
• Does not scale

Existing approaches
• Policy
  – User bound by terms and conditions or contract of employment or professional governance bodies

UKACR Policy
• the intended use(s) of the data should be stated clearly
• the use(s) of the data should be justified and the data should not be used for any other purpose(s)
• the data should not be passed on to other third parties or released into the public domain
• the data should be kept securely for the period of time that can be justified by the stated purpose, and then destroyed
• no attempt should be made to identify information pertaining to particular individuals or to contact individuals
• no attempt should be made to link the data to other data sets, unless agreed with the data providers

Existing approaches
• Policing
  – Doesn’t scale

North West e-Health
• Joint project: SRFT, SPCT, UoM
• Founded on UoM/Salford NHS experience and expertise
• Based on the establishment of an e-Lab federation: “that will allow the partners to pool and develop their expertise and resources, acting together for mutual benefit and for the benefit of other stakeholders and clients”
• NWDA core-funding
• Potential for self-sustaining entity

What is an e-Lab?
...an information system bringing together data, analytical methods and people for timely, high-quality decision-making

Information Governance
• Designed for minimal disclosure
• Only release items that user “Needs to know”
• Only release items that user “Has the right to know”
• Determined by the “e-Lab Governance Board”

Information Governance
• Technical safeguards
  – Audit trails & monitoring
  – Anonymisation and inference control
• Operational procedures
  – Users sign up to terms and conditions of use; bound by employment contracts
  – Spot checks
• Governance Board + NREC Research Database Approval

[Diagram: NHS Trust with EHR, Users, Governance, E-Lab, Data Store]

[Diagram: Trust e-Lab data integration. Trust Systems (Clinical Data, Non-clinical Data), Integrated EHR, E-Lab Repository]
1. Integration of primary and secondary care records
2. Pseudonymisation, classification and integration

[Diagram: Trust e-Lab query flow. E-Lab Repository, Access Control, User Data Store, e-Lab Tools]
1. User logs on and submits query
2. Access control module authorizes request
3. Perform data query
4. Anonymisation and inference control
8. Storage
9. Data analysis and visualization

[Diagram: several NHS Trusts (each with EHR, Users, Governance, E-Lab, Data Store), NWeH Broker, Federated E-Lab Governance]

NWeH – e-Lab Federation
[Diagram: NHS Trust e-Labs with E-Lab Repositories, e-Lab Broker, Access Control, User Data Store, e-Lab Tools]
1. User logs on and submits query
2. Access control module authorizes request
3. Broker performs distributed query; generates pseudonym keys
5. Per-request keyed pseudonymisation (shown at each E-Lab Repository)
6. Data integration
7. Anonymisation and inference control
8. Storage
9. Data analysis and visualization

[Diagram: e-Labs, pseudonymised data flows, e-Lab Broker, secondary pseudonymised data flows, data users]

DRM Solution?
• DRM used to prevent re-distribution
• DRM used to prevent modification
• DRM used to prevent linking to other data

DRM problems
• Not fail safe?
• Better than just stopping the “casual attacker”?
• Perceived as easy to crack or bypass
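
The UKACR criteria on the "Potentially Identifiable Information" slide (cell counts of fewer than five, denominators of less than approximately 1,000) can be illustrated with a small check. This is a minimal sketch, not the e-Lab's inference control: the function name, the table layout as nested lists, and the example figures are all hypothetical, and it does not cover the harder case the slide mentions, where a small count can be inferred by simple arithmetic from other cells or totals.

```python
from typing import List, Tuple

# Thresholds taken from the UKACR criteria quoted on the slide.
MIN_CELL_COUNT = 5       # cells with fewer than five cases/events are "sparse"
MIN_DENOMINATOR = 1000   # the slide says "approximately 1,000"


def find_sparse_cells(
    counts: List[List[int]],
    denominators: List[List[int]],
) -> List[Tuple[int, int, str]]:
    """Return (row, column, reason) for every cell that breaches a threshold.

    Assumption: a zero count is treated as sparse, since zero is fewer than five.
    """
    flagged = []
    for r, row in enumerate(counts):
        for c, count in enumerate(row):
            if count < MIN_CELL_COUNT:
                flagged.append((r, c, "count"))
            elif denominators[r][c] < MIN_DENOMINATOR:
                flagged.append((r, c, "denominator"))
    return flagged


# Hypothetical table: rows are small geographic areas, columns are categories.
counts = [[12, 3], [40, 7]]
denominators = [[5000, 2500], [800, 1200]]

flagged = find_sparse_cells(counts, denominators)
print(flagged)  # [(0, 1, 'count'), (1, 0, 'denominator')]
```

Flagged cells would then need suppression or aggregation before release; which of those applies is a governance decision the slides do not specify.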
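
Steps 3, 5 and 6 of the "NWeH – e-Lab Federation" flow (the broker generates pseudonym keys, each repository applies per-request keyed pseudonymisation, the broker integrates the results) could look roughly like the sketch below. The slides do not say which keyed function is used; HMAC-SHA256 is an assumption chosen for illustration, and every function name, field name, and record here is hypothetical.

```python
import hashlib
import hmac
import secrets


def new_request_key() -> bytes:
    """Broker side (step 3): generate a fresh random key for one request."""
    return secrets.token_bytes(32)


def pseudonymise(identifier: str, request_key: bytes) -> str:
    """Keyed pseudonym for an identifier (assumed here to be HMAC-SHA256)."""
    return hmac.new(request_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymise_records(records, request_key, id_field="patient_id"):
    """Repository side (step 5): replace the identifier with a pseudonym."""
    out = []
    for rec in records:
        cleaned = {k: v for k, v in rec.items() if k != id_field}
        cleaned["pseudonym"] = pseudonymise(rec[id_field], request_key)
        out.append(cleaned)
    return out


def integrate(*record_sets):
    """Broker side (step 6): join records from several trusts on the pseudonym."""
    merged = {}
    for records in record_sets:
        for rec in records:
            merged.setdefault(rec["pseudonym"], {}).update(rec)
    return merged


# Hypothetical records held by two trusts about the same person.
trust_a = [{"patient_id": "A-001", "hba1c": 48}]
trust_b = [{"patient_id": "A-001", "admissions": 2}]

key1 = new_request_key()
merged = integrate(
    pseudonymise_records(trust_a, key1),
    pseudonymise_records(trust_b, key1),
)
```

Because the key is generated per request, the same person receives the same pseudonym across trusts within one request, so integration works, but a different pseudonym in the next request, which limits linking of data released at different times.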