The Art of the Possible: Using CPCSSN Data for Primary Care Research
Family Medicine Forum, Nov 16, 2012
Karim Keshavjee - EMR Consultant & Research Data Architect
Ken Martin - Information and Technology Manager

Outline
• Introduction to CPCSSN
• CPCSSN Data Holdings
• A Tour of CPCSSN Data Tables
• Current Research Projects at CPCSSN
• The Art of the Possible
• How to use CPCSSN data for your research
• Goodies for Today

10 PC-PBRNs
329 physicians in 8 provinces using 10 EMRs
• British Columbia
  - BCPCReN (Wolf)
• Alberta
  - SaPCReN, Calgary (Med Access, Wolf)
  - AFRPN, Edmonton (Med Access)
• Manitoba
  - MaPCReN, Winnipeg (Jonoke)
• Ontario
  - DELPHI, London (Healthscreen, Optimed, OSCAR)
  - NorTReN, Toronto (Nightingale, xwave, Practice Solutions)
  - CSPC, Kingston (P&P, OSCAR, xwave)
• Quebec
  - Q-Net, Montréal (Da Vinci, Purkinje)
• Nova Scotia / New Brunswick
  - MarNet, Halifax (Nightingale, Purkinje)
• Newfoundland
  - APBRN, St. John’s (Wolf, Nightingale)

CPCSSN Population
Data extracted on all patients in the practice, including children
Studying patients with the following chronic diseases
• Chronic Obstructive Lung Disease
• Depression
• Diabetes
• Hypertension
• Osteoarthritis
Chronic Neurological Disease
• Dementia
• Epilepsy
• Parkinson's Disease

Data Holdings Q2 2012

Database Schema - ERD

Data Cleaning/Recoding
• We clean and recode the following fields
  – Billing, Encounter and Problem List Diagnoses (ICD9)
  – Medications (ATC)
  – Lab results (LOINC)
  – Referrals (SNOMED CT)
  – Physical signs (Wt, Ht, BP, unit conversion, calculate BMI)
  – Vaccines (ATC)
  – Risk factors (smoking, alcohol, diet; text)

Patient Demographics
368,000 Records
(Screenshot annotations: two bracketed groups of fields, each marked "< 5%")

Provider Information

Billing
6.8 Million Records
(Screenshot annotations)
• Dates of Encounter
• Original diagnosis sent for billing
• Text from Code
• Recoded by CPCSSN
• Original Diagnosis Code sent for billing
• Recoded by CPCSSN

Research Discussion
• Useful for case finding
• Useful for understanding deficiencies of using billing
information for clinical research
• There is some inconsistency in use of billing codes across the country
• CPCSSN recodes all billing diagnosis codes to a standard version

Encounters
5.1 Million Records
(Screenshot annotations)
• Dates of Encounter
• Data inconsistent across the Country
• CPCSSN Cleaning Not Started
• Active area of Cleaning
• E.g., Office Visit, Phone, E-mail, etc.

Research Discussion
• Can we segment patients by pattern of visits?
• Does pattern of visits predict other things?
  – Control of disease
  – Frequency of prescriptions
  – Multiple comorbidities
• Does visit type affect quality of care?
• Reason for Encounter is poorly captured in most EMRs

Problem List Diagnoses
1.8 Million Records
(Screenshot annotations)
• Original Diagnosis Written by User, e.g., DMT2
• Recoded by CPCSSN, e.g., Diabetes Mellitus, Type 2
• Bracketed fields: Not well populated
• Active = Problem List
• Inactive = Past Medical History

Problem List Diagnoses
List of cleaned up diagnoses (more being added soon)
• Chronic airway obstruction, not elsewhere classified (496)
• Bronchitis, not specified as acute or chronic (490)
• Chronic bronchitis (491)
• Emphysema (492)
• Diabetes mellitus (250)
• Depressive disorder, not elsewhere classified (311)
• Suicide and self-inflicted poisoning by solid or liquid substances (E950)
• Suicidal ideation (V62.84)
• Adjustment reaction (309)
• Post traumatic stress disorder (309.81)
• Major depressive disorder, recurrent episode (296.3)
• Bipolar I disorder, most recent episode (or current) (296.7)
• Mental disorders complicating pregnancy, childbirth, or the puerperium (648.4)
• Essential hypertension (401)
• Osteoarthrosis and allied disorders (715)
• Spondylosis and allied disorders (721)
• Total knee replacement (81.54)
• Total hip replacement (81.51)
• Polycystic ovarian syndrome (256.4)
• Abnormal glucose tolerance of mother complicating pregnancy, childbirth, or the puerperium (648.8)
• Secondary diabetes mellitus (249)
• Other abnormal glucose (790.29)
• Migraine (346)
• Heart failure (428)
• Acute myocardial infarction (410)
• Old myocardial infarction (412)
• Other forms of chronic ischemic heart disease (414)
• Cardiac dysrhythmias (427)
• Essential and other specified forms of tremor (333.1)
• Esophageal varices with bleeding (456.0)
• Esophageal varices without bleeding (456.1)
• Angina pectoris (413)
• Other acute and subacute forms of ischemic heart disease (411)
• Calculus of kidney and ureter (592)
• Portal hypertension (572.3)
• Asthma (493)
• Dementias (290)
• Alzheimer's disease (331.0)
• Dementia with Lewy bodies (331.82)
• Parkinson's disease (332)
• Epilepsy and recurrent seizures (345)
• Epileptic convulsions, fits, or seizures NOS (345.9)

Research Discussion
• Sensitivity and specificity of problem list diagnoses are not currently known, so incidence and prevalence of disease cannot be determined from the problem list alone
• Need to develop case finding criteria for diseases (includes diagnosis, meds, labs, etc.)
• Need to identify sensitivity and specificity of having a diagnosis in the problem list
• Currently in the process of validating 8 case finding criteria across the country

Vital Signs
5 Million Records
(Screenshot annotations)
• Name of exam (e.g., sBP)
• Cleaned up result (e.g., lbs -> kg, inch -> cm)
• Cleaned up unit of measure (e.g., unit is kg, but result was lb)

Research Discussion
• Currently have access to
  – sBP/dBP
  – Ht
  – Wt
  – BMI
  – Waist circumference

Allergies
155K Records
(Screenshot annotations)
• Name of allergen
• Cleaned up name
• Data will be coded as ATC

Research Discussion
• Not yet cleaned; cleaning will start soon
• Focus of cleaning will be on medication allergies
  – All other allergies will be retained as original text
• Useful when assessing why patients are not receiving medications for a particular disease

Risk Factors
588K Records
(Screenshot annotations)
• Name of Risk Factor (e.g., smoking)
• Cleaned up version of Risk Factors.
• Working on cleaning up Current Exposures & Cumulative Exposures

Research Discussion
• Risk factors are actively being cleaned
• Getting the status of the risk factor (i.e., smoker/non-smoker) is difficult, but easier than
  – Current levels of exposure (e.g., # of cig/day)
  – Cumulative exposure (e.g., pack years)
• Alcohol use is also being cleaned up

Laboratory Results
3 Million Records
(Screenshot annotations)
• Original Lab Result Name (e.g., Hb A1c, HGbA1c, etc.)
• Recoded by CPCSSN: 100% LOINC (e.g., HBA1C)

Research Discussion
• Currently only capturing the following
  – HDL
  – TRIGLYCERIDES
  – LDL
  – TOTAL CHOLESTEROL
  – FASTING GLUCOSE
  – HBA1C
  – URINE ALBUMIN CREATININE RATIO
  – MICROALBUMIN
  – GLUCOSE TOLERANCE
• One site does not capture labs yet

Encounter Diagnoses
6.3 Million Records
(Screenshot annotations)
• Original Diagnosis Recorded in Encounter (e.g., axniety)
• 83% Recoded by CPCSSN (Anxiety ICD-9 300)
• 63% Originally coded by Doctor

Research Discussion
• Not all EMRs capture Encounter Diagnoses in a structured manner
• This table is not ready for prime time across all sites, but may be useful for projects where data from just a few sites is acceptable

Medications
4.9 Million Records
(Screenshot annotations)
• What the doctor ordered, e.g., HCTZ 25 mg bid
• 56% Coded as DIN
• 72% Coded by doctor (DIN + other)
• 91% Recoded by CPCSSN (ATC), e.g., Hydrochlorothiazide
• Bracketed fields: Strength 56%, Dose 70%, Unit of Measure 84%, Frequency 95%, Duration 52%, Dispensed 86%

Research Discussion
• Medication name data is relatively clean
• Medications coded as ATC
  – Allows easy grouping by class
• Don’t have daily dose and months supply for many records; working on clean up

Referrals
600K Records
(Screenshot annotations)
• Original Text of Referral
• 80% Recoded by CPCSSN (SNOMED-CT)

Procedures
1.3 Million Records
(Screenshot annotations)
• Original Text of Procedure
• Not Currently Coded by CPCSSN

Vaccines
960K Records
(Screenshot annotations)
• What the doctor typed
• 93% Recoded by CPCSSN (ATC)
• 46% Coded by Doctor (DIN)

Disease Cases
Case Definitions are developed by CPCSSN and are in the process of being validated through
chart reviews
173,000 Records
(Screenshot annotations)
• How a Case is identified is recorded in this table
• Allows full traceability for each case

Current Research Projects at CPCSSN
N=46
(Pie chart of project types. Category labels and percentage labels are listed in the order they were extracted.)
Categories: Association Study; Attitudes; Audit and feedback; Case control study; Case Finding; Clinical Quality Improvement; Continuity of Care; Data Quality; De-identification; Denominator; Descriptive Study; EMR Adoption; Feasibility; Intervention Assessment; Medication; Practice Profile; Prevalence; Prevalence, Case finding; Resource Use; SES Study; Treatment pattern; Validation
Percentage labels: 9%, 2%, 2%, 7%, 9%, 2%, 2%, 20%, 2%, 2%, 2%, 2%, 2%, 2%, 2%, 4%, 7%, 2%, 7%, 4%, 4%, 4%

Research Opportunities
• Population Health and Epidemiological Studies
  – Incidence/Prevalence of disease
  – Impact of SES on health
  – Rates of treatment for diseases
  – Rates of disease control
  – Burden of illness and multi-morbidity
• Clinical: database studies
  – Comparative effectiveness
  – Case-Control
  – Exposure-Outcome
  – Quality Improvement
  – Associations
  – Intervention-Outcome
  – Guideline effectiveness
• Clinical: prospective, interventional studies
  – Conduct pragmatic RCTs (data is already collected)
  – Conduct in-clinic interventions
  – Not ready for these yet
• Health Services
  – EMR adoption
  – Resource Utilization (consults, labs, procedures)
  – Policy Intervention (cross-province comparisons)
  – Patient behaviors (frequency of visits)
  – Medical errors and patient safety
• Health informatics
  – Natural language processing
  – Machine learning
  – De-identification algorithms
  – Predictive Analytics
• eHealth and mHealth
  – Develop and test apps using CPCSSN data
  – Patient education apps with their own data
  – Apps for healthcare providers to educate patients about their disease with nice visualizations

Research Using CPCSSN Data
(Flow chart with two lanes: Researcher and CPCSSN Research Committee)
• CPCSSN Research Committee: Reviews Letter of Intent
• Decision: Approved (Yes / No)
• Yes: Letter of Acceptance
• Researcher: Writes Letter of Intent. Writes 1 page, includes: Researchers, Organization, Research
Title, Objective, Methodology, Data Required
• 1. Protocol 2. Data Access Request Form 3. Data Sharing Agreement
• Invoice
• 1. Resubmit 2. Not Feasible 3. Outside Mandate
• CPCSSN Data, delivered to the Researcher

Goodies For Today
• Copy of the presentation: The Art of the Possible: Using CPCSSN Data for Primary Care Research
• Sample of CPCSSN data for 200 patients
  – Anonymized and scrambled to protect patient privacy
  – (MS Access file format)
• CPCSSN database entity relationship diagram (ERD)
• CPCSSN database data dictionary
• CPCSSN central repository data holdings summary
• CPCSSN Data Access Request Form
• Central Repository Process for Requesting Access to CPCSSN Data

Next Steps
• Sign a License Agreement today to get your copy of the CPCSSN Data Product
• Evaluate the data CPCSSN has
• Plan your next grant application around CPCSSN data
• Add CPCSSN Data as a budget item into your next grant application
  – You can contact us to get a quote

Contact
Tyler Williamson, Senior Epidemiologist
Canadian Primary Care Sentinel Surveillance Network
Centre for Studies in Primary Care
Queen’s University
Kingston ON K7L 5E9
Tel: (613) 533-9300, Ext. 73838
Fax: (613) 533-9302
e-mail: tylerw@cpcssn.org

Thanks to all Funders, Stakeholders, Partners, AND sentinel Physicians
Funding for this publication was provided by the Public Health Agency of Canada. The views expressed herein do not necessarily represent the views of the Public Health Agency of Canada.
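
Appendix sketch: vital sign cleaning
The Vital Signs and Data Cleaning/Recoding slides mention unit conversion (lbs -> kg, inch -> cm) and calculating BMI. The Python below is only a minimal sketch of that kind of cleaning step. The function names and the accepted unit spellings are assumptions made for illustration; this is not CPCSSN's actual cleaning code, and real EMR data would need many more unit variants and range checks.

```python
# Conversion factors (both exact by definition).
LB_TO_KG = 0.45359237
IN_TO_CM = 2.54

# Hypothetical unit spellings; real EMR extracts will contain more variants.
WEIGHT_UNITS = {"kg": 1.0, "kgs": 1.0, "lb": LB_TO_KG, "lbs": LB_TO_KG}
HEIGHT_UNITS = {"cm": 1.0, "in": IN_TO_CM, "inch": IN_TO_CM, "inches": IN_TO_CM}


def _convert(value, unit, factors):
    """Multiply value by the factor for a recognized unit, else raise."""
    key = unit.strip().lower()
    if key not in factors:
        raise ValueError(f"unrecognized unit: {unit!r}")
    return value * factors[key]


def to_kg(value, unit):
    """Return a weight in kilograms, converting from pounds if needed."""
    return _convert(value, unit, WEIGHT_UNITS)


def to_cm(value, unit):
    """Return a height in centimetres, converting from inches if needed."""
    return _convert(value, unit, HEIGHT_UNITS)


def bmi(weight_kg, height_cm):
    """Body mass index: weight (kg) divided by height (m) squared."""
    if height_cm <= 0:
        raise ValueError("height must be positive")
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


if __name__ == "__main__":
    weight = to_kg(180, "lbs")
    height = to_cm(70, "inch")
    print(round(weight, 1), round(height, 1), round(bmi(weight, height), 1))
```

Unrecognized units raise an error rather than passing through, so records such as "unit is kg, but result was lb" can be flagged for review instead of being silently converted.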