Mining electronic health records: towards better research applications and clinical care

Standardising the representation of clinical information: for patient care and for research
Dipak Kalra
Professor of Health Informatics
University College London

EHR trends (Georges De Moor)
• Patient-centered (gatekeeper?), life long records
• Multi-disciplinary / multi-professional
• Transmural, distributed and virtual
• Structured and coded (cf. semantic interoperability)
• More metadata and coding at a granular level!
• Intelligent (cf. decision support), clinical pathways…
• Predictive (e.g. genetic data, physiological models)
• More sensitive content (privacy protection)
• Personalised
• Pervasive: bio-sensors, wearables...

[Figure: a citizen's healthcare record (example: Whittington Hospital, John Smith, DoB 12.5.46) with four themes: capturing and combining diverse sources of information; integrating information; creating and using knowledge; centering services on citizens. Other labels: clinical trials, functional genomics; population health registries; decision support, knowledge management and analysis components; environmental data; medical devices, bio-sensors; mobile devices; clinical applications; social computing: forums, wikis and blogs.]

The rich re-use of Electronic Health Records
[Figure labels: citizen in the community; wellness, fitness, complementary health; social care, occupational health, school health; point of care delivery; continuing care (within the institution); long-term shared care (regional, national, global); teaching, research, clinical trials; disease registries, screening recall systems; education, research, epidemiology, data mining; public health, health care management, clinical audit; rapid bench to bed translation; real-time knowledge directed care. Consent labels: explicit consent; implied consent; de-identified; +/- consent.]

Requirements the EHR must meet: ISO 18308
The EHR shall preserve any explicitly defined relationships between different parts of the record, such as links between treatments and subsequent complications and
outcomes.
The EHR shall preserve the original data values within an EHR entry including code systems and measurement units used at the time the data were originally committed to an EHR system.
The EHR shall be able to include the values of reference ranges used to interpret particular data values.
The EHR shall be able to represent or reference the calculations, and/or formula(e) by which data have been derived.
The EHR architecture shall enable the retrieval of part or all of the information in the EHR that was present at any particular historic date and time.
The EHR shall enable the maintenance of an audit trail of the creation of, amendment of, and access to health record entries.

Interoperability standards relevant to the EHR
Business requirements
- ISO 18308 EHR Architecture Requirements
- HL7 EHR Functional Model
- ISO EN 13940 Systems for Continuity of Care
- ISO EN 12967-1 HISA Enterprise Viewpoint
Information models
- EHR system reference model: openEHR
- EHR interoperability Reference Model: ISO/EN 13606-1, HL7 Clinical Document Architecture
- Clinical content model representation: openEHR ISO/EN 13606-2 archetypes
- ISO 21090 Healthcare Datatypes
- ISO EN 12967-2 HISA Information Viewpoint
Computational services
- EHR Communication Interface Specification ISO/EN 13606-5
- ISO EN 12967-3 HISA Computational Viewpoint
- HL7 SOA Retrieve, Locate, and Update Service DSTU
Security
- EHR Communication Security ISO/EN 13606-4
- ISO 22600 Privilege Management and Access Control
- ISO 14265 Classification of Purposes of Use of Personal Health Information
Clinical knowledge
- Terminologies: SNOMED CT, etc.
- Clinical data structures: Archetypes etc.

ISO EN 13606-1 Reference Model

In a generated medical summary
List of diagnoses and procedures
1993  Procedure  Appendicectomy
1996  Diagnosis  Meningococcal meningitis
1997  Procedure  Termination of pregnancy
2003  Diagnosis  Acute psychosis
2006  Diagnosis  Schizophrenia
Can we safely interpret a diagnosis without its context?
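To make the question above concrete, here is a minimal sketch in Python of a summary generator that can carry interpretation context alongside each entry. The `Entry` class, its `certainty`, `author_role` and `setting` fields, and the `summary_lines` function are hypothetical names chosen for illustration; they are not taken from ISO 13606, openEHR or any real EHR product. The context attached to the last entry is borrowed from the emergency department example in this deck and is assumed, for illustration only, to belong to the 2006 entry.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    """One line of a medical summary plus optional interpretation context."""
    year: int
    kind: str                          # "Procedure" or "Diagnosis"
    name: str
    certainty: Optional[str] = None    # hypothetical field, e.g. "working hypothesis"
    author_role: Optional[str] = None  # hypothetical field, e.g. "junior doctor"
    setting: Optional[str] = None      # hypothetical field, e.g. "Emergency Department"


def summary_lines(entries: List[Entry], with_context: bool = True) -> List[str]:
    """Render entries in year order, optionally appending their context."""
    lines = []
    for e in sorted(entries, key=lambda entry: entry.year):
        line = f"{e.year} {e.kind}: {e.name}"
        if with_context:
            context = [c for c in (e.certainty, e.author_role, e.setting) if c]
            if context:
                line += " [" + "; ".join(context) + "]"
        lines.append(line)
    return lines


# The summary list from the slide; only the last entry has context attached.
entries = [
    Entry(1993, "Procedure", "Appendicectomy"),
    Entry(1996, "Diagnosis", "Meningococcal meningitis"),
    Entry(1997, "Procedure", "Termination of pregnancy"),
    Entry(2003, "Diagnosis", "Acute psychosis"),
    Entry(2006, "Diagnosis", "Schizophrenia",
          certainty="working hypothesis",
          author_role="junior doctor",
          setting="Emergency Department"),
]

for line in summary_lines(entries):
    print(line)
```

With `with_context=False` the last line reads as an established diagnosis; with the context kept, a reader (or a decision support rule) can see that it was only a working hypothesis.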

Clinical interpretation context
Emergency Department, seen by junior doctor
Reason for encounter: Brought to ED by family
Symptoms: “They are trying to kill me”
Mental state exam: Hallucinations; Delusions of persecution; Disordered thoughts
Diagnosis: Schizophrenia
Certainty: Working hypothesis
Management plan: Admission etc.....
Junior doctor, emergency situation, a working hypothesis, so schizophrenia is not a reliable diagnosis.

Examples of clinical interpretation context
• within the overall clinical story
  - past, present
  - intended treatments, planned procedures
• clinical circumstances of an observation
  - e.g. standing, fasting
• presence / absence / certainty of the finding
• hypotheses, concerns
• a diagnosis for a relative
  - but not the patient!
• confidence and evidence
  - seniority of the author
  - justification, clinical reasoning, guideline references

Examples of medico-legal context
• Authorship, responsibilities, signatories
• Dates and times
  - occurrence, clinical encounter, recording, schedules, intentions
• Information subjects
  - whose record is this? (who is the patient?)
  - about whom is this observation? (e.g. family history)
  - who provided this information
• Version management
• Access privileges
  - which need to be defined in ways that can be interpreted across organisational and national boundaries
• Consents

Clinical information standards
• Formally model clinical domain concepts - e.g.
“smoking history”, “discharge summary”, “fundoscopy”
• Encapsulate evidence and professional consensus on how clinical data should be represented
  - published and shared within a clinical community, or globally
  - imported by vendors into EHR system data dictionaries
• Support consistent data capture, adherence to guidelines
• Enable use of longitudinal EHRs for individuals and populations
• Define a systematic EHR target for queries: for decision support and for research
Archetypes (openEHR and ISO 13606-2)

Example archetype for adverse reaction

openEHR Clinical Knowledge Manager

Using archetypes for querying EHR repositories

Example clinical questions
• Find the age and gender of patients who have been diagnosed with Hodgkin's disease, where the initial diagnosis occurred between the ages 50 and 70 inclusive
• What is the percentage of patients diagnosed with primary breast cancer in the age range 30 to 70 who were surgically treated and had post operative haematoma/seroma?
• What percentage of patients with primary breast cancer who relapsed had the relapse within 5 years of surgery?
• What is the average survival of patients with Chronic Myeloid Leukaemia (CML) and both with and without splenomegaly at diagnosis?

Semantic interoperability
• New generation personalised medicine underpinned by ‘-omics sciences’ and translational research needs to integrate data from multiple EHR systems with data from fundamental biomedical research, clinical and public health research and clinical trials
• Clinical data that are shared, exchanged and linked to new knowledge need to be formally represented to become machine processable.
• This is more than just adopting existing standards or profiles: it is “mapping clinical content to a commonly understood meaning”
• One can exchange completely meaningless information in a perfectly standardised message, hence the importance of content-related quality criteria (clinically meaningful) and of true semantic interoperability

EHR and knowledge integration
[Figure labels: Medical Knowledge (bio-sciences, pathological processes, diseases and treatments); Research, Epidemiology; evidence on treatment effectiveness; clinical outcomes; clinical audit; care plans; prompts, reminders; Health Records (descriptions, findings, intentions; professionalism and accountability).]
These areas need to be represented consistently to deliver meaningful and safe interoperability

Rich EHR interoperability
Consistent representation, access and interpretation
• clinical terminology systems
  - terminology sub-sets
  - value sets and micro-vocabularies
  - term selection constraints
  - post-co-ordination
  - terminology binding to archetypes
  - semantic context model
  - categorial structures
  - terminology systems architecture
• privacy
  - identifiers for people
  - policy models
  - structural roles
  - functional roles
  - purposes of use
  - care settings
  - pseudonymisation
• record structure and context
  - EHR reference model
  - data types
  - near-patient device interoperability
  - archetypes
  - templates
  - workflow
  - guidelines
  - care pathways
  - continuity of care

Semantic interoperability resource priorities
• Widespread and dependable access to maintained collections of coherent and quality-assured semantic resources
  - clinical models, such as archetypes and templates
  - rules for decision making and monitoring
  - workflow logic
• which are
  - mapped to EHR interoperability standards
  - bound to well specified multi-lingual terminology value sets
  - indexed and correlated with each other via ontologies
  - referenced from modular (re-usable) care pathway components
• SemanticHealthNet will establish good practices in developing such resources using
practical exemplars in heart failure and coronary prevention, involving major global SDOs, industry and patients

Accelerating and leveraging knowledge discovery
• We need to accelerate the discovery of new knowledge from large populations of existing health records
• EHRs can provide population prevalence data and fine grained co-morbidity data to optimise a research protocol, and help identify candidates to recruit
  - almost half of all pharma Phase III trial delays are due to recruitment problems

Electronic Health Records for Clinical Research
• The IMI EHR4CR project runs over 4 years (2011-2014) with a budget of +16 million €
  – 10 Pharmaceutical Companies (members of EFPIA)
  – 22 Public Partners (Academia, Hospitals and SMEs)
  – 5 Subcontractors
  – One of the largest public-private partnerships
• Providing adaptable, reusable and scalable solutions (tools and services) for reusing data from EHR systems for Clinical Research
• EHRs offer significant opportunity for the advancement of medical research, the improvement of healthcare, and the enhancement of patient safety

The EHR4CR Scenarios
• Protocol feasibility
• Patient identification and recruitment
• Clinical trial execution
• Serious Adverse Event reporting
• across different therapeutic areas (oncology, inflammatory diseases, neuroscience, diabetes, cardiovascular diseases etc.)
• across several countries (under different legal frameworks)

EHR4CR will deliver
• Requirements specification
  – for EHR systems to support clinical research
  – for integrating information across hospitals and countries
• Innovative Business Model
  – for sustainability
  – to stimulate the marketplace
• Technical Platform (tools and services)
• Pilots for validating the solutions:
  – different scenarios
  – different therapeutic areas
  – several countries

CHAPTER
Centre for Health service and Academic Partnership in Translational E-Health Research
Co-ordinator: Prof Harry Hemingway
[Figure labels. Translational cycle: T1: Omics and phenotyping; T2: Novel trial delivery; T3: Patient journey quality and outcomes; T4: Supporting decision making for health gain (clinician, patient, organisation). Informatics cycle: data quality and acquisition; consent & access; biostatistics; visualisation; curation & sharing; computational / semi-automated analysis; integration; linkage. Clinical research programmes: Cardiovascular (UCLH BRC, QMUL BRU); Maternal & Child health (GOSH BRC); Infection (BRC, HPA); Neurodegeneration (UCLH, BRU); Eyes (Moorfields, BRC).]

The IMI is a unique Public-Private Partnership (PPP) between the pharmaceutical industry represented by the European Federation of Pharmaceutical Industries and Associations (EFPIA) and the European Union represented by the European Commission

EMIF Project Vision
To enable and conduct novel research into human health by utilising human health data at an unprecedented scale
‘Think Big’
• Access to information on > 40 million patients
• AD research on 10-times more subjects than ADNI
• Metabolics research on > 20,000 obese & T2DM subjects
• Linkage of clinical and omics data
• Development of a secure (privacy, legal) modular platform
• Continue to build a network of data sources and relevant research

Think Big
Co-ordinator: Janssen – Bart Vannieuwenhuyse
60 partners (3 consortia + EFPIA)
170 individuals involved
14 European countries represented
48 MM € worth of
resources (in-kind / in-cash)
“3 projects in one”

Project objectives
EMIF: one project – three topics
1. EMIF-Platform: Develop a framework for evaluating, enhancing and providing access to human health data across Europe, to support the two specific topics below as well as research using human health data in general – Lead: Prof. Johan van der Lei, Erasmus University Rotterdam
2. EMIF-Metabolic: Identify predictors of metabolic complications in obesity, with the support of EMIF-Platform – Lead: Prof. Ulf Smith, University of Gothenburg
3. EMIF-AD: Identify predictors of Alzheimer’s Disease (AD) in the preclinical and prodromal phase, with the support of EMIF-Platform – Lead: Prof. Simon Lovestone, King’s College London

EMIF – platform for modular extension
[Figure labels: EMIF governance; research topics: EMIF - AD, EMIF - Metabolic, CNS, Metabolic, Call 5 (TBD); prevention algorithms; risk factor analysis; predictive screening; risk stratification; patient generated data; EMIF - Platform: data privacy, analytical tools, semantic integration, information standards, data access / mgmt; IMI Structure and Network.]

[Figure labels: researcher; browsing through directory of “data fingerprints”; controlled data access based on usage rights (Private Remote Research Environments); AD; Metabolics; Common Data Model; cohorts.]
Principle: EMIF will offer a platform to integrate available data allowing pooled analysis
Principle: EHR data enables the search for patients with specific characteristics to form new cohorts.
[Further figure labels: data enrichment; patient selection; EHR datasets; historic patient data allowing “roll-back” to study trajectories; source of new epidemiology insights for patient subsegments; cross validation; analytical tools / methods; AD.]

Long-term view
Clinical Care
- incident monitoring & detection
- outcome analysis
- retrieval of similar patient history
- care management
- patients at risk
- re-admission prevention
- diagnosis & treatment assistance
Clinical Research
- System biology
- Lead identification
- Biomarker definition
- Clinical trial execution
- Market access
- Ongoing safety tracking
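
The first of the example clinical questions earlier in the deck (age and gender of patients with Hodgkin's disease whose initial diagnosis occurred between the ages of 50 and 70 inclusive) is a small instance of searching EHR data for patients with specific characteristics. A minimal sketch in Python follows. It assumes a flat, hypothetical record layout (`Patient`, `Diagnosis`) and made-up sample data; it does not use a real archetype-based repository or query language, and it reports the age at initial diagnosis, which is one possible reading of the question.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Patient:          # hypothetical, simplified demographics record
    patient_id: str
    gender: str
    birth_date: date


@dataclass(frozen=True)
class Diagnosis:        # hypothetical, simplified diagnosis entry
    patient_id: str
    condition: str
    diagnosed_on: date


def age_on(birth_date: date, on: date) -> int:
    """Age in completed years on a given date."""
    years = on.year - birth_date.year
    if (on.month, on.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def cohort(patients: List[Patient], diagnoses: List[Diagnosis],
           condition: str, min_age: int, max_age: int) -> List[Tuple[str, int, str]]:
    """Return (patient_id, age at initial diagnosis, gender) for matching patients."""
    # Find the earliest recorded diagnosis of the condition for each patient.
    first: Dict[str, date] = {}
    for d in diagnoses:
        if d.condition != condition:
            continue
        if d.patient_id not in first or d.diagnosed_on < first[d.patient_id]:
            first[d.patient_id] = d.diagnosed_on
    result = []
    for p in patients:
        if p.patient_id in first:
            age = age_on(p.birth_date, first[p.patient_id])
            if min_age <= age <= max_age:   # inclusive bounds, as in the question
                result.append((p.patient_id, age, p.gender))
    return result


# Made-up illustrative data, not drawn from any real record.
patients = [
    Patient("p1", "F", date(1950, 3, 1)),
    Patient("p2", "M", date(1980, 7, 15)),
    Patient("p3", "M", date(1940, 1, 10)),
]
diagnoses = [
    Diagnosis("p1", "Hodgkin's disease", date(2005, 6, 1)),
    Diagnosis("p1", "Hodgkin's disease", date(2010, 1, 1)),   # later entry, not the initial one
    Diagnosis("p2", "Hodgkin's disease", date(2010, 1, 1)),
    Diagnosis("p3", "Hodgkin's disease", date(2010, 1, 10)),  # diagnosed on 70th birthday
    Diagnosis("p3", "Asthma", date(1990, 1, 1)),
]

matches = cohort(patients, diagnoses, "Hodgkin's disease", 50, 70)
print(matches)
```

In a real setting the condition would be matched against a terminology value set rather than a free-text string, and the query would target archetype-defined structures, which is the point the deck makes about a systematic EHR target for queries.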