A brief introduction to epidemiology IMED Medicine for Innovators and Entrepreneurs Rita Popat, PhD Health Research and Policy Division of Epidemiology April 21, 2009 Outline What is epidemiology? Basic concepts Study designs Measures of associations Causality Applications using diabetes as an example 2 Definition of Epidemiology “Epidemiology is the study of the distribution and determinants of health-related states or events in populations and the application of this study to control health problems” - Last JM 3 Aims of Epidemiology 1. Determine risk factors of various diseases 2. Identify segments of the population with highest risk to target prevention and intervention opportunities 3. Evaluate the effectiveness of health programs and services in improving health of the population 4 Clinical epidemiology Epidemiology/clinical research informs all of the following: What diseases should you be looking for in patients? Which patients should you screen for disease? How are diseases are diagnosed? What does a disease diagnosis mean? What conditions cause the disease (risk factors)? How to prevent disease in your patients? How to treat diseases in your patients? What is disease prognosis? Public health policy/standard of practice. 5 IMED: Project topics(subset) Design an improvement of device for continuous glucose monitoring Design a device to predict and prevent nocturnal and post-exercise hypoglycemia. Design an improvement on nutritional management of diabetes. Identify a specific complication of diabetes (ie, cardiovascular, renal, neurological, eye, etc) and design an innovative approach to diagnosis or treatment. Evaluate the current state of information on the genetics of Type 2 diabetes and design an innovative experimental or technical approach in this area. 6 IMED: Project topics(subset) Design a novel or improved diagnostic test for insulin resistance that can be used to screen for undiagnosed type 2 diabetes. Propose an approach to identifying prognostic markers for type 2 diabetes mellitus for a target population of your choice. Design an approach to identifying genes involved in diabetes susceptibility (type 1 or type 2) in a population group with a high incidence of diabetes. Develop a project related to the issue of depression in people with diabetes. 7 Disease classification Case definition or phenotype General Phenotypic classification Presence or absence of clinical or pathologic manifestations Pathogenetic classification Underlying pathophysiologic process Specific Etiologic classification Underlying causal mechanisms 8 Why is disease classification important? Disease heterogeneity - Clinical heterogeneity influences diagnosis, management, prognosis - Etiologic heterogeneity (risk factors may vary) Diabetes TYPE 1 Diabetes Mellitus (T1DM) TYPE 2 Diabetes Mellitus (T2DM) 9 THE LANCET • Vol 359 • January 5, 2002 10 Descriptive epidemiology Frequency of disease Prevalence: number of people in the population that have the disease at a given point in time Incidence: Rate at which new cases of disease appear in the population (# of new cases / persons at risk) Note: Prevalence = Incidence x duration Who gets the disease? Frequency of disease by location, ethnicity, age, gender, and time period. 11 Descriptive epidemiology of T1DM One of the most common childhood illnesses World-wide estimated 50,000 new cases annually In Caucasion populations: 1-3 per 1000 children by the age of 20 years Prevalence in age group 1-15 yrs: .05%-0.3% Approximately 123,000 persons aged 0-19 years in the U.S. who currently have T1DM. 12 40 35 30 25 20 are about 1.5 times more likely to develop TIDM as AfricanAmericans or Hispanics Asian populations have the lowest rates 15 10 5 0 Ja pa G er n m an y Fr an ce E st on ia H ol la nd N e w Sp a Ze i n al a A nd us tr al W i A a ,U S A E ng la S nd w ed en S ar di n Fi ia nl an d IDDM incidence per 100,000 Incidence of Type 1 diabetes (ages 0-14 years) in the area American non-Hispanic whites Why the geographic and ethnic variation? a. Difference in susceptibility genes? b. Differences in prevalence of causative environmental factors? c. Combination of a and b? 13 Descriptive epidemiology of T2DM (or NIDDM) A pandemic? Affects more than 16 million Americans and 135 million people worldwide Prevalence varies in different countries and among different racial and ethnic groups In 2025, estimated worldwide prevalence among adults 5.4%; 300 million adults are estimated to have T2DM The majority of cases of T2DM in the future will occur in developing countries (e.g., India and China) 14 National Diabetes Surveillance System Incidence of diabetes Prevalence of diabetes Source: http://apps.nccd.cdc.gov/ddtstrs/ 15 http://apps.nccd.cdc.gov/d 16 Risk factors Factors that influence the risk or occurrence of disease are called risk factors (a) demographics (age, sex, race) (b) genetic factors (c) physical and biologic agents (drugs, chemical agents) (d) life-style factors (smoking, diet,exercise) 17 Some important contributions of epidemiology Risk factor Smoking DES Asbestos OCs Pesticides Disease …………………… …………………… …………………… …………………… …………………… Protective agents Folate Low dose Aspirin Lung cancer Vaginal adenocarcinoma Mesothelioma Thromboembolic stroke Parkinson’s disease Disease …………………… Neural tube defects …………………… Stroke 18 THE LANCET • Vol 359 • January 5, 2002 19 Cohort Studies Disease Exposed Target population Disease-free cohort Disease-free Disease Not Exposed Disease-free TIME 20 Cohort Studies Exposed Disease T2DM (IE+) Obese Target population Disease-free cohort Not Exposed Normal wt. Disease-free Disease T2DM (IE-) Disease-free TIME 21 Measures of association in a cohort study Is exposure associated with disease risk? Relative Risk/Relative Rate (Risk Ratio) RR = IE+ / IEUsual application: Search for causes Absolute difference : IE+ - IEUsual application: Primary prevention impact 22 Relative Risk (RR) Exposure + Exposure - Diseased + a c Nondiseased b d a I E a b RR c I E cd a+b c+d Incidence (probability) of disease among exposed Incidence (probability)of disease among unexposed Relationship of physical activity vs BMI with T2DM in women. Design, Setting, and Participants: Prospective cohort study of 37,878 women free of cardiovascular disease, cancer, and diabetes with 6.9 years of mean follow-up. Weight, height, and recreational activities were reported at study entry. Normal weight was defined as a BMI of less than 25; overweight, 25 to less than 30; and obese, 30 or higher. Active was defined as expending more than 1000 kcal on recreational activities per week. Main Outcome Measure: Incident type 2 diabetes, defined as a new self-reported diagnosis of diabetes. Weinstein et al. JAMA 292:1188-94 24 Weinstein et al. JAMA 292:1188-94 Example: Body Mass index and risk of clinical Type 2 diabetes in women T2DM + Obese[E+] Normal wt. [E-] T2DM- 762 5786 6548 178 19452 19630 0.116 762 / 6548 RR 12.6 178 /19452 0.0092 Interpretation: Obese women (BMI > 30 kg/m2) have an ~13-fold increased risk of developing T2DM compared to women with normal weight (BMI<25.0 kg/m2) 25 Advantages/Limitations: Cohort Studies Advantages: Allows you to measure true rates and risks of disease for the exposed and the unexposed groups. Temporality is correct (easier to infer cause and effect). Can be used to study multiple outcomes. Prevents bias in the ascertainment of exposure that may occur after a person develops a disease. Disadvantages: Can be lengthy and costly! More than 50 years for Framingham. Loss to follow-up is a problem (especially if nonrandom). 26 Case-Control Studies Exposed Disease (Cases) Not exposed Target population Exposed No Disease (Controls) Not Exposed 27 Measure of Association in case-control studies: Odds Ratio (OR) Exposure (E+) Disease (D+) Cases a No disease(D-) Controls b c d No exposure(E-) a+c OR = P ( E|D ) P ( E | D ) Odds of exposure among cases Odds of exposure among controls b+d = P ( E| D ) P ( E | D ) = a a c c a c b b d d b d = ad bc 28 Rare disease assumption: Odds ratio approximates the Relative Risk 2X2 Contingency table set up from a cohort study Diseased Exposure + Exposure - Non-diseased a b a+b c d c+d Rare disease assumption a a a P( D | E ) ad ( a b) b a b b RR c c c P ( D | E ) bc cd (c d ) d d 29 History of t2dm in a first degree relative and disease risk T2DM Cases Controls Family history + 346 165 Family history - 391 620 737 785 (346)(620) OR 3.325 (391)(165) Interpretation: History of T2DM in a first degree relative is associated with a 3.3-fold increased risk of T2DM 30 Meigs J et.al. JAMA. 2004;291:1978-1986. Nested case-control design a case-control study nested within a cohort study Nurses’ Health Study (cohort) to: 1976 total n=121,700 bld samples n=32,826 in 1989-1990 Cases n=737 incident T2DM Controls n=785 (matched on age, race, fasting status) 31 Endothelial Dysfunction and Risk of Type 2 Diabetes Do these data suggest that risk of T2DM increases as levels of Eselectin increase? Yes, the risk of T2DM increases as E-selectin levels increase (aka dose-response effect). Meigs, J. B. et al. JAMA 2004;291:1978-1986. 32 Advantages and Limitations: Case-Control Studies Advantages: Cheap and fast Great for rare diseases Disadvantages: Exposure estimates are subject to recall bias (those with the disease are searching for reasons why they got sick and may be more likely to report an exposure) and interviewer bias (interviewer may prompt a positive response in cases). Temporality is a problem (did exposure cause disease or disease cause exposure?) 33 Putative Risk factors for T2DM: Genetic factors T2DM risk (i) Family history…………………… 2-6 fold risk if a (genetic+nongenetic factors) parent/sibling has T2DM (ii) Twin studies…………………… concordance in MZ twins (iii) Genes …………………… ? (T2DM is a complex genetic disorder) (iv) Race/ethnicity Prevalence of T2DM (%) 40 35 30 25 20 15 10 5 0 Non-Indian Half-Indian Extent of Pima Indian heritage Full-Indian Fat-mass and obesityassociated (FTO) gene: OR for t2DM ~1.25 35 Putative Risk factors for T2DM: Non-Genetic factors T2DM risk (i) Obesity Body Mass Index Waist-to-hip ratio …………………… …………………… (ii) Physical Activity …………………… (iii) Gestational diabetes …………………… …………………… (v) Inflammation …………………… (vi) Dietary factors …………………… (iv) Low birth weight (reduced beta-cell mass) difficult to summarize!! 36 . N Engl J Med 2008;359:2220-32 Background Type 2 diabetes mellitus is thought to develop from an interaction between environmental and genetic factors. We examined whether clinical or genetic factors or both could predict progression to diabetes in two prospective cohorts. Genetic factors: Clinical risk factors: Variants in 16 genes: TCF7L2, Family history of diabetes KCNJ11, PPARG , CDKAL1, IGF2BP2, DKN2A/CDKN2B FTO, BMI, Age, Gender, Impaired HHEX, SLC30A8, WFS1, JAZF1, insulin secretion and action CDC123/CAMK1D, SPAN8/LGR5, and others…. THADA, ADAMTS9 and NOTCH2 37 . N Engl J Med 2008;359:2220-32 Conclusions As compared with clinical risk factors alone, common genetic variants associated with the risk of diabetes had a small effect on the ability to predict the future development of type 2 diabetes. The value of genetic factors increased with an increasing duration of follow-up. 38 Modifiable factors and T2DM Percentage of type 2 diabetes that is potentially preventable by life-style modifications. T2DM: Low-risk definition •body mass index <25 kg/m2 •physical activity equivalent to >30 min per day of brisk walking •Good diet (e.g., low in saturated and trans-fat, fiber) •moderate alcoholic consumption •Nonsmoking Source: Willet WC. Science, 2002 39 THE LANCET • Vol 359 • January 5, 2002 40 Intervention Studies: evaluating therapies for treatment or prevention Disease Intervention group Target population Disease-free cohort No intervention group Disease-free Disease Disease-free TIME 41 Stages in testing new therapies Phase I Unblinded, uncontrolled studies in a few volunteers to test safety Phase II Relatively small controlled clinical studies to evaluate drug effectiveness, short term side effects Phase III Relatively large, randomized, controlled, blinded trials to test effect of therapy on clinical outcomes Phase IV Large trials or observational studies after drug is approved by FDA to assess rate of serious side effects and other uses 42 Intervention Studies/Clinical trials Advantages: Allows randomization (controls for confounding) Allows double-blind assessment (controls bias) Disadvantages: • Can be lengthy and expensive • Loss to follow-up is a problem (especially if nonrandom) • Addresses a narrow clinical question • Ethical limitations 43 (N Engl J Med 2002; 346:393-403.) 44 (N Engl J Med 2002; 346:393-403.) 45 (N Engl J Med 2002; 346:393-403.) 46 Best Experiment/RCT Prospective cohort Cost and ease Retrospective cohort Case-control Correlation/x-section Proof of cause Case series Case report Best 47 Causation in Observational Studies Association does not prove causation If a putative risk factor and the occurrence of an outcome are strongly associated with each other it does not provide evidence that the risk factor causes the disease, only implies that it is correlated with outcome Non-causal explanations may cause a spurious association – study biases (measurement error, selection bias, confounding) 48 Confounding Confounding is the defined as a distortion of an exposure-outcome association brought about by the association of another factor with both the outcome and exposure Exposure (e.g., depression) Outcome (e.g.,T2DM) Confounding factor (e.g., physical activity ) 49 Depression and risk of T2DM JAMA. 2008;299(23):2751-2759 50 Depression and risk of T2DM JAMA. 2008;299(23):2751-2759 51 Selection bias A form of sampling bias due to systematic differences between those who are selected for study (or agree to participate) and those who are not selected (or refuse to participate). Information bias (measurement error) Erroneous classification of disease status or exposure into a category other than that to which it should be assigned. 52 Study designs and biases Threats to internal validity Case-control Cohort RCT Confounding Generally present Generally present Not likely (due to randomization) Selection bias Likely (e.g., when May occur due to May occur due to low response differential loss to differential loss to rates) follow up follow-up Misclassification More likely to be differential of exposure Generally nondifferential Most likely nondifferential Most likely nondifferential Misclassification of outcome Not likely; if exists then nondifferential (e.g., drop-in/drop out) Most likely nondifferential Diagnostic tests The purpose of a diagnostic test is to move the estimated probability of the presence of a disease toward either end of the probability scale. Rule out 0 Rule in 100 Almost all approaches to gathering clinical information can be considered as tests (e.g., history, physical exam) 54 Accuracy of Diagnostic tests True classification of disease status (gold standard) Present Absent Positive (present) Diagnostic Test result Negative (absent) A True positives C False negatives A+C B False positives D A+B C+D True negatives B+D Sensitivity and specificity are terms used to describe the validity (accuracy) of the diagnostic test relative to the gold standard. 55 Accuracy of Diagnostic tests Sensitivity: proportion of persons with the disease of interest who have a positive test result Sensitivity = TP/(TP+FN)=A/(A+C) Specificity: proportion of persons without the disease of interest who have a negative test result Specificity = TN/(FP+TN)= D/(B+D) 56 Accuracy of Diagnostic tests 57 Background—The BD Logic® (Becton, Dickinson and Co.; Franklin Lakes, NJ) and FreeStyle® (Abbott Diabetes Care; Alameda, CA) meters are used to transmit data directly to insulin pumps for calculation of insulin doses and to calibrate continuous glucose sensors as well as to monitor blood glucose levels. Methods—The accuracy of the two meters was evaluated in two inpatient studies conducted by the Diabetes Research in Children Network (DirecNet). Meter glucose measurements made with either venous or capillary blood were compared with reference glucose measurements made by the DirecNet Central Laboratory. 58 59 Summary Measures of disease occurrence (prevalence and incidence) Study designs for evaluating role of risk factors in disease etiology: genetic and non-genetic factors Cohort Case-control Study designs for testing interventions/therapy RCTs Evaluating utility of diagnostic tests, devices, or biomarkers for diagnosis or prognosis Accuracy – sensitivity, specificity 60 Contact information Rita Popat, PhD Clinical Assistant Professor Office: HRP Redwood Building, Rm T209 Tel: 650-498-5206 E-mail: rpopat@stanford.edu 61