HIPAA and its Implications on Epidemiological Research Using Large Databases

K. Arnold Chan, MD, ScD
Harvard School of Public Health
Channing Laboratory, Brigham & Women’s Hospital and Harvard Medical School

Brief outline of this presentation
● Using large linked automated data for public health research
● Data development processes to ensure HIPAA compliance
● Examples
● Some thoughts

Two types of data for public health research
● Primary data
  – Prospectively collected
  – Well-designed data collection tool
  – Informed consent
● Secondary data
  – Data originally collected for other purposes
  – May be proprietary
  – Privacy and confidentiality (particularly important if no prior authorization)
  – Different data systems

Large linked healthcare databases
● Health insurance claims data
  – Medicaid
  – Medicare
  – Managed Care Organizations (MCO)
● Automated medical records
● Hospital / Clinic IT systems
● Availability of written records
● Need to contact patients / individuals?

Public health research within MCOs
● Harvard Community Health Plan (subsequently became Harvard Pilgrim HealthCare)
● Kaiser Permanente (several states)
● Group Health Cooperative (Seattle area)
● Others
● HMO Research Network
  – 10+ MCOs across the U.S.

Public health research within MCOs
● Different types of MCOs
  – Group model
  – Staff model
  – Different relationship with hospitals
  – Implications on data access
● MCOs with research programs
  – Separate research departments
  – Full-time investigators and support staff

Data elements in the MCO data
● Demographic information
● Membership
  – Start date, termination date, benefit plan, ...
● Office visits
  – Type of visit, diagnosis(es), special procedures
● Special examinations
  – Radiology, Laboratory examinations
● Hospitalizations
● Drug dispensings
● Linkable by a unique ID

HIPAA and Research with Databases
● Authorization from individual research subjects not feasible
● Individual authorization may be waived by Institutional Review Board or Privacy Board
  – Minimal Risk
  – Data reported in aggregate fashion
    ● No single-case report
  – “Minimum necessary” principle
  – De-identification

HIPAA and Research with Databases
● Single MCO studies
  – Investigators and research staff are MCO employees
● Multiple-MCO studies
  – May involve transfer of data across MCOs or to a Data Center
● Other types of studies not covered in this presentation
  – e.g. Generate a de-identified dataset for public or commercial use

HIPAA and data development
● Do not move individual level data unless absolutely necessary
  – Generate summary tables at each study site
  – Combine the tables for final report
  – Smalley et al. Contraindicated use of cisapride: the impact of an FDA regulatory action. JAMA 2000; 284: 3036-9.

HIPAA and data development
● Randomly generated Study ID to replace True ID
  – Crosswalk between the two stored at secured location
  – Destroy the crosswalk after successful linkage of data and quality check
  – Implications for storage and back-up

HIPAA and data development
● Roll-up / transform variables
  – Age --> Age groups
  – National Drug Code --> Drug or Group of drugs
  – ICD-9 diagnosis code --> Disease
● e.g. A man born on Dec 10, 1934 with diagnosis code xxx.yy received drug 55555-333-22 -->
  – 65-70 y/o m with Heart Failure received Digoxin

HIPAA and data development
● Preserve temporal sequence of events but disguise the real dates
● e.g. Drug use during pregnancy study
  – 29 year-old received 55555-333-22 on Nov 25, 1999 and delivered a baby on Dec 10, 1999 -->
  – 26-30 year-old mother delivered in 1999, baby exposed to amoxicillin at -15 days

HIPAA and data development
● Only extract information relevant to the study
  – e.g. A study of osteoporosis does not require information on subjects' mental health status
● Co-morbid conditions may be relevant
  – Use proxy measures to describe level of comorbidity
    ● Charlson's Index (based on concomitant diagnoses)
    ● Chronic Disease Score (based on co-medications)

HIPAA and data development
● Geocoding
  – Describe socioeconomic status of study subjects based on census tract data
  – Send out (Study ID, address) to a geocoding firm
  – (Study ID, X1, X2, X3) returned
    ● X1 : education level
    ● X2 : income level
    ● X3 : race/ethnicity information

An example
Finkelstein et al. Decreasing Antibiotic Use Among US Children: The Impact of Changing Diagnosis Patterns. Pediatrics 2003; 112: 620-7.
● Data elements involved
  – Date of birth, gender
  – Membership
  – Drug dispensings
  – Diagnoses in close proximity to antibiotic dispensings
● Data from nine MCOs

Finkelstein et al. Pediatric antibiotic use study
● Data development at each MCO
  – Extract antibiotic use information
  – Extract diagnoses of interest (infections)
  – Use date of birth, gender, and membership data to calculate person-time of interest
● Refined, aggregate data forwarded to the Data Center
  – Rate of antibiotic use = number of antibiotic dispensings per 1,000 person-years, for each age-gender group

HIPAA and data development
● Individual identification is needed for certain types of research
  – Obtain medical records
  – Contact patient to conduct interview and/or request specimen
  – Linkage with external data
    ● Cancer registry
    ● National Death Index

HIPAA and data development
● The process
  – Data extraction, transformation, reduction, and de-identification carried out at each MCO
  – Governed by State laws and local HIPAA-compliant Standard Operating Procedures
  – Principle of Limited Dataset / Minimum necessary
● The goal
  – Highly processed and de-identified data available for concatenation across study sites and complex analyses

k-anonymity and large datasets
● The goal
  – A de-identified dataset at a certain level of individual anonymity
● A 43 year-old man with hypertension, diabetes, and anxiety, taking atenolol, rosiglitazone, and lorazepam
  vs.
  A man 40-45 taking a beta-blocker and a thiazolidinedione

HIPAA, Data Storage and Access
● Implications on Data Backup Plans
  – Data need to be destroyed after the report is published
● Data only used to support pre-defined analyses
● Ancillary analyses are possible after IRB review and approval

Epidemiology studies using large databases
● In the old days ...
  – Give me all the data, do what I say ...
  – What if the investigator / reviewer wants to do THIS analysis?
  – Use existing datasets to test new hypotheses
● Good research practice
  – Define necessary data elements according to research protocol
  – Pre-defined analytic plan

Epidemiology studies using large databases
● Keys to protection of human subjects
  – Competent, responsible investigators and staff
  – IRB review and oversight
  – Data development guidelines
    ● e.g. Good Epidemiology Practice
  – Information technology
● Some reasonable rules/guidelines are better than no guideline
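
Illustration: de-identifying one record

The data development slides describe three transformations applied before data leave a site: replacing the True ID with a random Study ID, rolling variables up (age to age group, National Drug Code to drug name), and keeping the sequence of events while hiding real dates. The Python sketch below applies them to the pregnancy example from the slides. It is a minimal illustration under stated assumptions, not the procedure used in any of the cited studies: the field names, the `NDC_TO_DRUG` lookup, the 5-year age bands, and the in-memory `crosswalk` dictionary are all hypothetical.

```python
import secrets
from datetime import date

# Hypothetical lookup; a real study would use a full NDC dictionary.
# "55555-333-22" is the placeholder code used on the slides.
NDC_TO_DRUG = {"55555-333-22": "amoxicillin"}


def study_id_for(crosswalk, true_id):
    """Return a random Study ID for true_id, recording the pair in the crosswalk.

    The crosswalk is the only link back to the True ID, so it must stay at a
    secured location and be destroyed after linkage and quality checks.
    """
    if true_id not in crosswalk:
        crosswalk[true_id] = secrets.token_hex(8)
    return crosswalk[true_id]


def age_group(age, width=5):
    """Roll an exact age up to a band such as '26-30' (assumed banding)."""
    if age < 1:
        return "<1"
    low = ((age - 1) // width) * width + 1
    return f"{low}-{low + width - 1}"


def deidentify(record, crosswalk):
    """Keep only rolled-up values and days relative to delivery."""
    delivery = record["delivery_date"]
    return {
        "study_id": study_id_for(crosswalk, record["true_id"]),
        "age_group": age_group(record["age"]),
        "delivery_year": delivery.year,
        "drug": NDC_TO_DRUG[record["ndc"]],
        # Negative values mean the dispensing happened before delivery.
        "exposure_day": (record["dispensing_date"] - delivery).days,
    }


crosswalk = {}
record = {
    "true_id": "MEMBER-0001",  # hypothetical identifier
    "age": 29,
    "ndc": "55555-333-22",
    "dispensing_date": date(1999, 11, 25),
    "delivery_date": date(1999, 12, 10),
}
out = deidentify(record, crosswalk)
print(out["age_group"], out["delivery_year"], out["drug"], out["exposure_day"])
# prints: 26-30 1999 amoxicillin -15
```

Storing the exposure as an offset from an index event (here, delivery) preserves the temporal sequence without carrying any calendar date other than the year.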
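
Illustration: site-level summary tables combined at the Data Center

The slides recommend generating summary tables at each study site and combining them for the final report, with the antibiotic study reporting use per 1,000 person-years for each age-gender group. The sketch below shows one way this could look. The row layout, group labels, and numbers are invented for illustration and do not come from the cited papers.

```python
from collections import defaultdict


def summarize_site(rows):
    """Aggregate individual-level rows into one summary table for a site.

    rows: iterable of (age_gender_group, n_dispensings, person_years),
    one row per member. Only the returned table leaves the site.
    """
    table = defaultdict(lambda: [0, 0.0])
    for group, n_dispensings, person_years in rows:
        table[group][0] += n_dispensings
        table[group][1] += person_years
    return {group: tuple(totals) for group, totals in table.items()}


def combine_sites(tables):
    """At the Data Center: pool the site tables, then compute the rate of
    dispensings per 1,000 person-years for each group."""
    pooled = defaultdict(lambda: [0, 0.0])
    for table in tables:
        for group, (n_dispensings, person_years) in table.items():
            pooled[group][0] += n_dispensings
            pooled[group][1] += person_years
    return {group: 1000 * n / py for group, (n, py) in pooled.items()}


# Invented example data for two sites.
site_a = summarize_site([("F 3-5", 2, 1.0), ("F 3-5", 1, 0.5)])
site_b = summarize_site([("F 3-5", 3, 2.5)])
rates = combine_sites([site_a, site_b])
print(rates)  # {'F 3-5': 1500.0}
```

Pooling counts and person-years before dividing gives the overall rate. Averaging the site-specific rates would weight small and large sites equally, which is usually not what is wanted.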
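
Illustration: checking k-anonymity

A dataset is k-anonymous with respect to a set of quasi-identifiers when every combination of their values is shared by at least k records. The sketch below counts the smallest such group before and after a roll-up like the one on the k-anonymity slide. The records, field names, and choice of quasi-identifiers are assumptions made for this example.

```python
from collections import Counter


def smallest_group(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same
    quasi-identifier values (0 for an empty dataset)."""
    counts = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return min(counts.values(), default=0)


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs at least k times."""
    return smallest_group(records, quasi_identifiers) >= k


# Invented records: exact ages and drug names make each man unique.
detailed = [
    {"sex": "M", "age": 41, "drug": "atenolol"},
    {"sex": "M", "age": 43, "drug": "atenolol"},
    {"sex": "M", "age": 44, "drug": "metoprolol"},
]
# The same records after rolling up to an age band and a drug class.
rolled_up = [
    {"sex": "M", "age": "40-45", "drug": "beta-blocker"} for _ in detailed
]

fields = ["sex", "age", "drug"]
print(smallest_group(detailed, fields))   # 1
print(smallest_group(rolled_up, fields))  # 3
```

The check only covers the fields listed as quasi-identifiers. Deciding which fields belong in that list is a judgment about what an outside party could plausibly link to, and it is not something the code can decide.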
