Pharmacogenomics Ontology (PHONT) Network Resource
Webinar
April 6, 2012

Outline
• Why PHONT
  • Background, Goals, Relevance
• What PHONT
  • Early process, course correction
  • Standards selection, discussion
  • Status of work
• Whither PHONT

Background
• Biology and Medicine have become big science
  • Requires interoperability
• Clinical and research standards are emerging
  • Meaningful Use driving clinical operations
  • De facto genomic databases, ontologies
• Advantages and requirements for interoperability
  • Meta-analyses and comparisons
  • EMR data integration, harvesting, outcomes

Meaningful Use
13 January, 2010
• Reporting Requirements: 169pp
• Interoperability Standards: 35pp

Goals - Present
• Collaboratively evolve framework for representing PGRN data in comparable and consistent form
• Facilitate interoperability within and without PGRN community
  • Clinical EMR data exchange
  • Larger-scale genomics and biology communities
• Ultimately add value to PGRN sites
  • Library of standard models and value sets
  • Services to map extant data to common form

Means to Achieve Goals
• Identification of standard data models, in alignment with national requirements
  • Informed by data dictionary harmonization
  • Adaptation of models when needed
• Infrastructure for centralized curation activities
  • PGRN specific terminology services
• Standardization of value sets
  • Derived from national standards and norms
    • LOINC, SNOMED, RxNorm, …
    • GO, dbGaP, NCBI, CDISC, …

PHONT Activities
• Data Dictionary Analysis
  • Semantic and syntactic analyses of 4483 PGRN variables
  • Impact: survey nature and diversity of PGx data elements (DEs)
• Data Element Semantic Annotation and Harmonization
  • Normalization, semantic annotation, categorization
  • Impact: Identifies overlap and focus for standardization
• Data Element Standardization
  • Mapping PGx DEs to existing data standards
  • Impact: Identification of standards gaps

PHONT: Infrastructure
• Infrastructure Development
  • LexEVS (SNOMED, LOINC, RxNorm, …)
  • CEM repository, OpenMDR
  • Semantic annotation pipeline, CEM mapping, Curation
  • Impact: Supports harmonization and standardization activities
• Educational Resource Development
  • Information about standards and data representation
  • Rendering of PGRN site specific dictionaries
  • Impact: Lower the barrier of adoption

PHONT: PGRN Collaborations
• Translational PGx Project (TPP)
  • CPIC guidelines in clinical environments
  • Role: Semantically consistent EMR integration
• PGx discovery in patient Populations (PG-Pop)
  • Patient cohorts through electronic phenotyping
  • Role: Adopt SHARPn phenotyping algorithms using emerge PGRN data standards
• Clinical PGx Implementation Consortium (CPIC)
  • Therapeutic guidelines to implement PGx data
  • Role: Identifying existing standards and gaps

PHONT: Developing Collaborations
• PGx of Anticancer Agents Research (PAAR)
  • Service role for terminology standards
  • 5306 SNOMED codes for 2557 conditions
• PGRN Statistical Analysis Resource (P-STAR)
  • Standards-grounded data dictionaries
• PharmGKB
  • Supporting terminology, phenotype standards
  • Providing RxNorm and NDF-RT codes
• International SSRI PGx Consortium (ISPC)
  • Reviewing data dictionary

PHONT: Standards Liaisons
• Impact current development activities, ensure PGRN requirements are addressed
• Inform PGRN research sites about relevant activities

Standards Development Organization (SDO): Relevance to PHONT
• Clinical Information Modeling Initiative (CIMI); Clinical Data Interchange Standards Consortium (CDISC): Developing and adapting Clinical Element Model (CEM) standardized models and value sets for clinical and research data representation
• NCI Case Report Form (CRF) Work Group: Developing a library of harmonized and standardized data elements for NCI-funded clinical trials
• HL7 Clinical Genomics Work Group: Developing message standards for genomics data used in clinical settings
• W3C Health Care and Life Sciences (HCLS) Work Group: Extending the Translational Medicine Ontology to include PGx terms

Large-Scale Informatics Consortia
• NCBO
• SHARPn
• eMERGE
• CTSA – Clinical Translational Sciences Awards
• i2b2 – Informatics for Integrating Biology and the Bedside
• SNOMED
• WHO ICD11
[Diagram label: Clinical phenotyping]

PGRN Realities - Motivation
• PGRN is multi-disciplinary and data-intensive
  • Clinical phenotypes, Drug administration
  • Laboratory data, Genomics data
• Data is often represented inconsistently
  • Difficult to compare across studies or institutions
  • Difficult to aggregate and integrate data
• Standards are required to make data consistent and comparable
  • Increased semantic meaning (data and methods)
  • Enables accurate data transformations

Course Correction
• Initial focus
  • Engagement of PIs, designees in standardization
  • Web-tools for local curation of dictionaries
  • Expectation of meta-analyses, interoperability
• Current realities
  • Marginal overlap of PGRN domain, meta-analyses
  • Understanding and buy-in underwhelming
  • Emphasis on centralized curation
  • Goal of adding value by demonstration
    • EHR integration, exchange, harvesting

Clinical Element Models
Higher-Order Structured Representations
[Figure: Stan Huff, IHC]

Pre- and Post-Coordination
[Figure: Stan Huff, IHC]

Data Element Harmonization
http://informatics.mayo.edu/CIMI/
• Stan Huff – Intermountain Healthcare, HL7, LOINC
• Clinical Information Model Initiative
  • NHS Clinical Statement
  • CEN TC251/OpenEHR Archetypes
  • HL7 Templates
  • ISO TC215 Detailed Clinical Models
  • CDISC Common Clinical Elements
  • Intermountain/GE CEMs

Person Model
Person
  PatientExternalId (0-M)
    data (II)
  PersonName (1-M)
    GivenName (0-1)
      data (ST)
    …
  Birthdate (0-1)
    data (TS)
  AdministrativeGender (0-1)
    data (CD)
  AdministrativeRace (0-1)
  AdministrativeEthnicGroup (0-1)
  …
[Diagram labels: Terminology; Value Set (three occurrences)]

Person Model
Person
  PatientExternalId (0-M)
    data (II)
  PersonName (1-M)
    GivenName (0-1)
      data (ST)
    …
  Birthdate (0-1)
    data (TS)
  AdministrativeGender (0-1)
    data (CD)
  AdministrativeRace (0-1)
  AdministrativeEthnicGroup (0-1)
  …
Examples of Variables
• Medical Record Number
• SSN
• Study ID
• First Name
• Last Name
• Date of Birth
• Year of Birth
• Patient Gender
• Patient Race
• Self-Reported Ethnicity

Lab Observation Model
StandardLabObs
  Code
    data
  PerformingLaboratory
    LaboratoryId
      data (ST)
    …
  LabInterpretation
    data (CD)
  Method
    data (CD)
  SpecimenCollected
  Subject
  …
Examples of Variables
• Alkaline phosphate
• Potassium
• Creatinine
• Analysis site
• Are liver function tests abnormal?
• Type of assay
• Specimen collection time
• Has blood been collected?

Disease & Disorder Model
DiseaseDisorder
  Code
    data
  BodyLocation
    BodyLaterality
      data (CD)
    …
  Severity
    data (CD)
  StartTime
    data (TS)
  RelativeTemporalContext
  Subject
  …
Examples of Variables
• Atrial fibrillation
• Pulmonary embolism
• Are episodes of paroxysmal atrial fibrillation associated with eating?
• Duration of longest symptomatic episode
• Age of first angina
• Was the chest pain in the central or left chest?
• Chest pain or pressure in the past 4 weeks?

Drug Administration Model
NotedDrug
  Code
    data (CD)
  StartTimeUnconstrained
    data (TS/CD/ST)
  EstimatedInd
    data (CO)
  TakenDoseLowerLimit
    data (PQ)
  RouteMethodDevice
    data (CD)
  StatusChange
  Subject
  …
Examples of Variables
• Is the patient taking a diuretic?
• Has the subject started any new medications?
• Date of last antihypertensives
• Medication start date
• Dose
• Have you taken digoxin in the past?
• Time on tamoxifen
• If potassium supplementation added, specify daily dose

Relationships
[Diagram: Person, Person, Person]
• Semantic Link: Physician-Patient
  • Example: Primary care physician
• Semantic Link: Parent-Child
  • Example: Race of maternal grandfather

Relationships
[Diagram: Person, Disease & Disorder, Drug Admin.]
• Semantic Link: Treatment-for-Disease
  • Example: ALL treated by mercaptopurine

Data Dictionary Analysis
Site        Patient     Lab Observ'n  Disease/Disorder  Drug Admin.  Total
Mayo        11 (10%)    8 (7%)        24 (22%)          32 (30%)     75 (70%)
Paar4Kids   3 (5%)      16 (27%)      1 (2%)            1 (2%)       21 (36%)
PAPI        38 (12%)    170 (52%)     11 (3%)           0 (0%)       219 (68%)
PAT         162 (10%)   123 (8%)      424 (26%)         179 (11%)    888 (55%)
Pear        26 (3%)     72 (9%)       21 (2%)           242 (29%)    361 (43%)
PGBD        10 (1%)     70 (7%)       53 (5%)           20 (2%)      153 (16%)
phRAT       3 (16%)     3 (16%)       3 (16%)           8 (42%)      17 (89%)
PNAT        45 (10%)    22 (5%)       64 (14%)          69 (15%)     200 (43%)
XGEN        2 (3%)      0 (0%)        0 (0%)            0 (0%)       2 (3%)

Portion of Variables Mapped to CEMs
• Patient: 301 (7%)
• Lab Observation: 751 (18%)
• Other: 2000 (46%)
• Disease/Disorder: 601 (14%)
• Drug Administration: 629 (15%)

Categories of Variables Not Yet Mapped to CEMs
• Procedures: 5%
• Adverse Events: 6%
• Other: 9%
• Genomics: 6%
• Clinical Findings: 41%
• QOL/Cognitive Assessment: 33%

Data Harmonization
Unmapped Variables
• Some variables are not currently represented by PHONT CEMs
  • Computed data (e.g., pharmacokinetics)
  • Genomic results
• Work with SDOs to address these gaps
  • CIMI community on extant or new CEMs
  • HL7 and CDISC for clinical genomics data
  • W3C, NLM, & SNOMED pharmacogenomic ontologies
• Collaborating PGRN groups
  • TPP, CPIC, P-STAR

Future Plans
• Impact of standardization
• Integration into EMR systems
  • Phenotyping algorithms
  • Clinical decision support interfaces
  • Cohort selection
• Future meta-analyses
  • Cross PGRN?
  • Among related large-scale collaborations
    • Query Health – ONC
    • Sentinel – FDA

PHONT Activity Plan
[Timeline chart: activities plotted against Year 1 through Year 5]
• Develop Standardized Element Models
• Develop Harmonized Standards
• Engage External Standards Groups
• Study Use of Terms and DEs
• Develop Plug-ins to Expose Data
• Develop Curation Tooling
• Develop Infrastructure
• Education & Training
• Develop Network Collaborations

PHONT Personnel
CG Chute, MD, DrPH (PI)
Jyotishman Pathak, PhD (Co-I)
Robert R. Freimuth, PhD (Co-I)
Matthew J. Durski (Project Manager)
Qian Zhu, PhD (Research Associate)
Guoqian Jiang, PhD (Research Associate)
Deepak K. Sharma (Sr. Analyst Programmer)
Zonghui Lian (Analyst Programmer)
Scott S. Bauer (Analyst Programmer)
Donna Ihrke (Nosologist)

Discussion
• Appropriateness of proposed standards
  • Patient
  • Diseases and Disorders
  • Drug Administration
  • Lab Observations
• Feasibility of prospective definition of data dictionaries and value sets
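
As an aid to the discussion of prospectively defined data dictionaries and value sets, the Person model from the earlier slide can be sketched in code. This is only an illustration under stated assumptions, not the CEM tooling itself: the class names, the `validate` helper, and the contents of the gender value set are hypothetical, and the HL7-style data types (II, ST, TS, CD) are reduced to plain strings. Only the field names and cardinalities (0-M, 1-M, 0-1) come from the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical value set for AdministrativeGender (CD); not taken from the slides.
ADMINISTRATIVE_GENDER_VALUE_SET = {"female", "male", "unknown"}


@dataclass
class PersonName:
    given_name: Optional[str] = None  # GivenName (0-1), data (ST)


@dataclass
class Person:
    person_names: List[PersonName]  # PersonName (1-M)
    patient_external_ids: List[str] = field(default_factory=list)  # PatientExternalId (0-M), data (II)
    birthdate: Optional[str] = None  # Birthdate (0-1), data (TS), kept as text here
    administrative_gender: Optional[str] = None  # AdministrativeGender (0-1), data (CD)

    def validate(self) -> List[str]:
        """Return cardinality and value-set violations (empty list if none)."""
        problems = []
        if not self.person_names:
            problems.append("PersonName requires at least one entry (1-M)")
        gender = self.administrative_gender
        if gender is not None and gender not in ADMINISTRATIVE_GENDER_VALUE_SET:
            problems.append(f"AdministrativeGender {gender!r} is not in the value set")
        return problems


# A record that satisfies the model, and one that breaks both rules.
ok = Person(person_names=[PersonName("Ann")],
            patient_external_ids=["STUDY-001"],
            administrative_gender="female")
bad = Person(person_names=[], administrative_gender="F")
print(ok.validate())
print(bad.validate())
```

A site variable such as "Patient Gender" coded locally as "F" would fail the value-set check until it is mapped to a standard code, which is the kind of harmonization step the centralized curation activities are meant to support.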