EPIDEMIOLOGY CLASS NOTES

EPIDEMIOLOGY NOTES
Chester Jones PhD.

I. Epidemiology: the study of the occurrence, distribution, & determinants of health & illness in a population.

   A. Types of epidemiology:
      1. Classical epidemiology: general analytic & descriptive study of health & disease.
      2. Clinical epidemiology: diagnosis & management of illness & the critical review of the literature.

   B. Types of epidemiology:
      1. Descriptive: Uses existing data to compare how mortality or morbidity varies among certain groups.
      2. Analytic: More focused on the determinants of disease.

   C. Types of studies:
      1. Retrospective: Looking back at previous morbidity or mortality & a disease. These are less expensive & quicker to perform, but are subject to confounding & bias.
      2. Case control: Compares the odds of past exposure to a suspected risk factor between cases (CA: diseased individuals) & controls (CO: non-diseased people). Results in an Odds Ratio (OR). The most frequent method. Better for rare diseases & those with long induction times. Case-control retrospective studies are the easiest & cheapest to do, but they are prone to recall & other forms of bias.
      3. Prospective:
         a. Longitudinal: Study of a population over a (usually) long period of time. This is the only way to measure incidence. Temporal relationship is clear. Bias & confounding are easier to control. Expensive. Not efficient for rare diseases.
         b. Cohort: A group of healthy people is identified & followed for a specific time. Exposed & unexposed participants are compared in relation to disease incidence. This type of study is time consuming, subjects are often lost to follow-up, & it is not appropriate for rare diseases. Bias & confounding are easier to control.
      4. Cross-sectional: A sample of the population (or the total population) is examined at a given point in time. Takes a snapshot of the population.
      5. Experimental: to establish cause & effect relationships through control of variables.

   D. Problems:
      1. Research on humans is expensive, in some cases unethical, & difficult to control.
      2. Diseases (especially chronic ones) are often insidious & difficult to link to a behavior or an exposure.
      3. Chronic diseases often require years of exposure before signs or symptoms surface (latency period).
      4. Humans are exposed to multiple risks over the course of their lives; sorting out the one(s) responsible for a disease is difficult. This is "confounding." Confounding may be controlled by:
         a. Prevention (randomization, matching, & restriction).
         b. Analysis (stratification & multivariate techniques).
      5. The number of people with the disease may be small.
      6. Bias can be reduced by standardization of forms & procedures, double-blind techniques, & larger random samples.

   E. Epidemiological Models:
      1. Traditional Model: Agent, Host, Environment.
      2. Health Field Concept: Biology/Heredity, Lifestyle, Health Care System, Environment.

   F. Terms & concepts:
      1. Attributable risk: the number or proportion of cases of illness (or deaths) in the population attributable to an agent.
      2. Ecological Fallacy: generalizing group-level (aggregate) data to individuals.
      3. Environment: physical (air pollution, sun exposure) & social/psychological (stress).
      4. Etiological Fraction: how much of the disease in the population is attributable to an exposure.
      5. False Association: rates in the sample erroneously projected to the population.
      6. Incidence: new cases of a disease that occur during a study period. Incidence studies start with healthy subjects.
      7. Lifestyle: activities, behaviors, consumption patterns.
      8. Morbidity: Illness rates (incidence of non-lethal flu).
      9. Mortality: Death rates (number of deaths from heart attacks).
         a. Infant mortality rate (the age at which a child is no longer considered an infant varies from country to country). In this country, we usually consider a child an "infant" until one year of age.
         b. Specific mortality rate: the rate within a specific group (race, gender, SES).
      10. Prevalence: all cases of a certain disease that have occurred (may be retrospective).
         a. Point prevalence: number of cases at a specific time.
         b. Period prevalence: number of cases during a specific period of time.
         c. Cumulative prevalence: cases at any time in the past (over a lifetime you will average X number of colds).
      11. Prevention: the goal is the prevention of morbidity and/or mortality.
         a. Primary prevention: prevent the disease from occurring (eradication of smallpox through vaccination).
         b. Secondary: early detection & treatment of the disease (mammography).
         c. Tertiary: rehabilitation and/or restoration of effective functioning after occurrence of a disease (post-stroke rehab).
      12. Protective Characteristic: a characteristic or behavior that prevents or protects a person from developing a disease or condition.
         a. BMI<40
         b. Regular exercise
      13. Risk Factor: a characteristic or behavior that places a person at risk for developing a disease or condition.
         a. Smoking
         b. High cholesterol diet
      14. Standardization: Direct standardization is used in vital statistics.
      15. Standard population: a stable population (demographics may be derived from census records).

   G. Causality (clearer in single cause/single effect cases):
      1. Temporal relationship: A causes B (A comes first). Example: rhinoviruses cause colds.
      2. Specificity: A cause leads to a single effect.
      3. Strength or intensity: there is a strong relationship between findings. The correlation coefficient between eating under-cooked eggs & the development of salmonella toxemia is .85.
      4. Consistency: Study after study arrives at the same findings. Schairer, et al (200), Newcomb, et al (1995), Stanford, et al (1995), & Schuurman, et al (1995) have determined that women taking hormone replacement therapy are at slightly increased risk for the development of breast cancer.
      5. Coherence: does the relationship make sense? Is there a true relationship between A & B?

II. CRUDE MORTALITY RATE (CMR):

   A. This rate expresses the actual observed mortality rate in a population & is considered the starting point for the adjustment of rates. The Crude Mortality Rate (CMR) is determined by the following formula:

      $\text{Crude Mortality Rate} = \dfrac{\text{Total Deaths}}{\text{Total Population}} \times 1{,}000$

   B. Age-adjusted standardization (Direct & Indirect standardization)
      1. Common set of weights = population. Standardization is used to adjust rates to account for differences in population composition (i.e. age, gender, ethnicity).
      2. Direct: each community's composition-specific rates are applied to a standard population to determine the expected number of events. Here Community A & Community B are added together to form the base (a common denominator).
      3. Indirect (standardized mortality ratio): a common set of rates, used in analytical epidemiology. Standard rates are applied to the study population to calculate the expected number of events (what we would expect), which is then compared to the observed number of events (e.g. compare Community A to the national standard).

   C. Example: (Direct Calculation of Crude Mortality Rate)

                        COMMUNITY A                           COMMUNITY B
      AGE      POPULATION   DEATHS   DEATH RATE      POPULATION   DEATHS   DEATH RATE
                                     PER 1,000                             PER 1,000
      <1            1,000       15      15                5,000      100      20
      1-14          3,000        3       1               20,000       10        .5
      15-34         6,000        6       1               35,000       35       1
      35-54        13,000       52       4               17,000       85       5
      55-64         7,000      105      15                8,000      160      20
      >64          20,000    1,600      80               15,000    1,350      90
      TOTAL        50,000    1,781   CMR_A=35.6         100,000    1,740   CMR_B=17.4

      Crude Mortality Rate:

      Community A: $\dfrac{1{,}781}{50{,}000} \times 1{,}000 = 35.6 / 1{,}000$
      Community B: $\dfrac{1{,}740}{100{,}000} \times 1{,}000 = 17.4 / 1{,}000$

      Calculation of Expected Deaths in Age Groups & Age-Adjusted Death Rates

      $\text{Expected Deaths} = \dfrac{\text{Death Rate}}{1{,}000} \times \text{Standard Population (A+B) in the age group}$, e.g. $\dfrac{15}{1{,}000} \times 6{,}000 = 90$

      $\text{Age-Adjusted Death Rate} = \dfrac{\text{Total Expected Deaths}}{\text{Total Standard Population}} \times 1{,}000$

      AGE      STANDARD       DEATH RATE   EXPECTED     DEATH RATE   EXPECTED
               POPULATION     IN A         DEATHS AT    IN B         DEATHS AT
               (A+B)          PER 1,000    A's RATE     PER 1,000    B's RATE
      <1            6,000        15            90          20           120
      1-14         23,000         1            23            .5          11.5
      15-34        41,000         1            41           1            41
      35-54        30,000         4           120           5           150
      55-64        15,000        15           225          20           300
      >64          35,000        80         2,800          90         3,150
      TOTAL       150,000     CMR_A=35.6    3,299       CMR_B=17.4    3,772.5

      Age-Adjusted Death Rates:

      Community A: $\dfrac{3{,}299}{150{,}000} \times 1{,}000 = 21.99 / 1{,}000$
      Community B: $\dfrac{3{,}772.5}{150{,}000} \times 1{,}000 = 25.15 / 1{,}000$

      (Indirect Calculation of Expected Death Rate in Community A)

      Calculated Expected Deaths:

      AGE      POPULATION     STANDARD DEATH     EXPECTED DEATHS IN A
               OF A           RATE PER 1,000     AT STANDARD RATE
      <1            1,000         20                   20
      1-14          3,000           .5                  1.5
      15-34         6,000          1                    6
      35-54        13,000          5                   65
      55-64         7,000         20                  140
      >64          20,000         90                1,800
      TOTAL        50,000        136.5              2,032.5

III. ATTACK RATE: Incidence rate used to describe the occurrence of a disease.

   A. Formula:

      $\text{Attack Rate} = \dfrac{\text{Total Number Ill}}{\text{Total Number Ill} + \text{Well}} \times 100\ (\%)$

   B. Example: A local elementary school reported that there were 68 confirmed cases of head lice among its 493 students. Determine the attack rate.

      $\text{Attack Rate} = \dfrac{68}{493} \times 100 = 13.79$

      1. The attack rate of head lice is therefore 13.79% of the student population.
      2. “Head lice has been diagnosed in 13.79% of the student population.”

IV. INCIDENCE: Measures the rate at which people without a disease develop the disease during a specific period of time. (Prospective)

   A. Formula:

      $\text{Incidence Rate} = \dfrac{\text{Number of new cases over a specific time}}{\text{Total population at risk of the disease in the same time period}}$

   B. Example: In a fictitious study, 3,567 adolescent male subjects (ages 14-17) who use smokeless tobacco products were followed for five years. During the study period, 283 of the participants developed oral squamous cell carcinomas.

      $\text{Incidence Rate} = \dfrac{283}{3{,}567} = .07934 \times 100{,}000 = 7{,}933.84$ per 100,000 per 5-year exposure

      or $\dfrac{7{,}933.84}{5} = 1{,}586.77$ per 100,000 per year of exposure

   C. Incidence Density: Used to compensate for variations in observation periods among the study subjects. The denominator becomes person-time of observation.

      $\text{Incidence Density} = \dfrac{\text{Number of new cases during the time period}}{\text{Total person-time of observation}}$

      1. Once a person contracts the disease, they leave the study & are no longer observed.

      SUBJECT    NUMBER OF MONTHS    TIME IN STUDY    PERSON-YEARS
                 IN STUDY            (MONTHS/12)
      A                5                 5/12             .417
      B                6                 6/12             .500
      C                2                 2/12             .167
      D                4                 4/12             .333
      E                9                 9/12             .750
      F               12                12/12            1.000
      TOTAL           38                                 3.167

      There were a total of three events.

      $\dfrac{3\ \text{(events)}}{3.167\ \text{(person-years)}} = .95$ per person-year $= 95$ per 100 person-years

V. PREVALENCE: The number of people with a disease at a given point in time.

   A. Formula:

      $\text{Prevalence Rate} = \dfrac{\text{Total number of cases of a disease at a specific time}}{\text{Total population at a given time}}$

VI. RISK:

   A. Problems (the risks of risks):
      1. Not everyone who is exposed will develop the illness; exposure only carries a probability of doing so. For example, not all of those who smoke will develop lung cancer.
      2. Some people who are not exposed to the disease/risk factor will develop the disease. For example, a very few cases of lung cancer are reported among those who have never smoked nor been exposed to second-hand smoke.

   B. Measures of Risk (probability statements):
      1. Absolute Risk: Synonymous with incidence; the rate of occurrence of the disease (prospective incidence studies).
      2. Relative Risk (Risk Ratio) & Odds Ratio: epidemiological measures of the association between exposure to a particular factor & risk of a certain outcome.
         a. Formula:

            $\text{Relative Risk} = \dfrac{\text{Incidence rate among those exposed}}{\text{Incidence rate among non-exposed}}$

         b. Example: You are trying to determine the odds of developing Hepatitis A after eating at Bubba Burger in Slapout, Alabama. In a sample of 120 controls from the town population who have not eaten at Bubba Burger, you find that 8 have positive Hepatitis A titers. Of the 56 interviewed patrons of Bubba Burger, you find that 12 have positive Hepatitis A titers.
         c. $\text{Relative Risk} = \dfrac{I(\text{exposed})}{I(\text{non-exposed})} = \dfrac{12/56}{8/120} \approx 3.2$
         d. Relative Risk & attributable risk show an association between exposure to a factor & risk of outcome.
      3. Attributable Risk: Number or proportion of cases of illness or cause of death attributable to an agent.

         $\text{Attributable Risk} = \dfrac{\text{Incidence of exposed} - \text{Incidence of non-exposed}}{\text{Incidence of exposed}} \times 1{,}000$

VII. CASE CONTROL STUDIES:

                     CA      CO      TOTALS
      EXPOSED        a       b       M1
      NOT EXPOSED    c       d       M0
      TOTALS         N1      N0      T

   A. Case Exposure Rate (CAE): $\text{CAE} = \dfrac{a}{N_1}$

   B. Control Exposure Rate (COE): $\text{COE} = \dfrac{b}{N_0}$

   C. Null Hypotheses in Case Control Studies:
      1. The Null Hypothesis is retained if CAE & COE are very "close" (little difference in exposure rate).
      2. The Null Hypothesis is rejected if CAE & COE are very different.
      3. No hard & fast guidelines exist for demarking "close" (it is defined by the researcher).
         a. Note: "Close" in epidemiology is defined the same way you define "close" in horseshoes and hand grenades.
         b. “The purpose of statistics is to serve the interests of The Party.” V. I. Lenin
         c. To be confident, you will need to calculate the chi significance at the 95% CI.

   D. In consideration of Odds Ratios: if the number 1 is included within the spread of the confidence interval, then there is no significant difference in the Odds Ratio.
      1. At the number 1, the risk of contracting the disease is equal for the exposed & the non-exposed (equal risk).
      2. If the Odds Ratio falls below 1, the variable under consideration is considered to have protective qualities.
      3. Special attention should be paid to Odds Ratios that are close to the number 1.
      4. When you state the odds, begin by stating the behavioral or environmental risk. For example, “Among people who eat fish, the relative incidence of stomach cancer is 2.5 times greater than among those who do not eat fish.”

      [Figure: confidence intervals for Studies A, B, C, & D plotted against an ODDS RATIO axis marked .05, .1, .25, .5, 1, 1.5, 2, 2.5, 3]

         a. Study A: activity found to be protective. Wide Confidence Interval hints at low power (i.e. small sample).
         b. Study B: activity found to be a risk factor for the disease. Narrow Confidence Interval hints at high power (i.e. large sample).
         c. Study C: activity found to be non-significant. Confidence Interval falls across the number 1 (equal risk).
         d. Study D: activity found to be a risk factor. Confidence Interval falls very close to the number 1 (indicating that there may be only a weak association).

   E. Attributable Risk: (PAR: Population Attributable Risk).

VIII. CHI-TEST:

   A. The Chi ($\chi$) significance test is approximately equal to the Z-score.
      1. Remember that with an $\alpha$=.05 the Z-score = 1.96, & this is used to calculate the 95% Confidence Interval.
      2. The formula for $\chi$:

         $\chi = \dfrac{a - \dfrac{M_1 N_1}{T}}{\sqrt{\dfrac{N_1 N_0 M_1 M_0}{T^2 (T - 1)}}}$

   B. Calculations of Case-Control (Odds Ratios & Chi Significance Tests):

      Working through a problem: Linking a diet high in fish to stomach cancer. You begin your study. You have 132 people who have stomach cancer. You interview them & find that 45 of them have diets high in fish. You then gather a control group sample of 146 people from the population. After interviewing them, you find that 25 of them eat diets high in fish.
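As a rough cross-check on the arithmetic for this problem, the sketch below recomputes the exposure rates, the relative incidence, the chi statistic, & the test-based confidence interval $RI^{1 \pm Z/\chi}$ from the four interview counts given above. The function name `case_control_summary` & the dictionary keys are illustrative choices, not part of the notes, & the sketch assumes a single 2x2 table with no empty cells.

```python
import math


def case_control_summary(a, b, c, d, z=1.96):
    """Summarize a 2x2 case-control table (hypothetical helper).

    a = exposed cases,   b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    n1, n0 = a + c, b + d          # N1 = all cases, N0 = all controls
    m1, m0 = a + b, c + d          # M1 = all exposed, M0 = all unexposed
    t = n1 + n0                    # T = everyone in the study

    cae = a / n1                   # case exposure rate
    coe = b / n0                   # control exposure rate
    ri = (a * d) / (b * c)         # odds ratio / relative incidence

    # Chi statistic as defined in the notes: observed minus expected
    # exposed cases, divided by the standard deviation of that count.
    chi = (a - m1 * n1 / t) / math.sqrt(n1 * n0 * m1 * m0 / (t ** 2 * (t - 1)))

    # Test-based limits RI^(1 -/+ z/chi); the low/high ordering below
    # assumes RI > 1 and chi > 0, as in the fish example.
    ci_low = ri ** (1 - z / chi)
    ci_high = ri ** (1 + z / chi)
    return {"CAE": cae, "COE": coe, "RI": ri, "chi": chi,
            "CI_low": ci_low, "CI_high": ci_high}


# Fish example: 45 of 132 cases & 25 of 146 controls eat diets high in fish.
fish = case_control_summary(a=45, b=25, c=132 - 45, d=146 - 25)
for name, value in fish.items():
    print(f"{name}: {value:.2f}")
```

Because the sketch carries unrounded intermediate values, its confidence limits can differ in the second decimal place from a hand calculation that rounds at each step.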
      Now for the plot:

                                      CA          CO          TOTALS
         EXPOSED (HIGH-FISH DIET)     45 (a)      25 (b)      M1=70
         NOT EXPOSED                  87 (c)     121 (d)      M0=208
         TOTALS                       N1=132      N0=146      T=278

      1. Case Exposure Rate (CAE): $\text{CAE} = \dfrac{a}{N_1} = \dfrac{45}{132} = .34$ or 34%
      2. Control Exposure Rate (COE): $\text{COE} = \dfrac{b}{N_0} = \dfrac{25}{146} = .17$ or 17%
      3. Null:
         a. The Null Hypothesis is retained if CAE & COE are close.
         b. The Null Hypothesis is rejected if CAE & COE are different.
      4. Odds Ratio (OR): $\text{OR} = \text{Relative Incidence (RI)} = \dfrac{ad}{bc}$
         a. Or in our case: $\text{RI} = \dfrac{a \times d}{b \times c} = \dfrac{5{,}445}{2{,}175} = 2.5$
         b. So we say: “People who eat fish have a relative incidence of stomach cancer 2.5 times greater than those who do not.”

   C. Chi-Test (used to determine if there is a significant difference): calculating for the fish diet issue.

      1. $\chi = \dfrac{a - \dfrac{M_1 N_1}{T}}{\sqrt{\dfrac{N_1 N_0 M_1 M_0}{T^2 (T - 1)}}} = \dfrac{45 - \dfrac{70}{278}(132)}{\sqrt{\dfrac{(132)(146)(70)(208)}{278^2 (278 - 1)}}} = \dfrac{45 - (.25)(132)}{\sqrt{\dfrac{280{,}600{,}320}{21{,}407{,}668}}} = \dfrac{11.76}{3.62} = 3.25$
      2. An alpha level of .05 equals a Z-score of 1.96.
      3. If $\chi$ > 1.96, there is a statistically significant difference at $\alpha$=.05.
      4. $\chi$ = 3.25 is significant at the 95% confidence level because it is > 1.96.

   D. Confidence Interval calculations (for relative incidence):

      1. $\text{RI}^{\,1 \pm Z/\chi} = 2.5^{\,1 \pm 1.96/3.25}$   Note: 2.5 is our Odds Ratio (RI) Point Estimate.
         a. $\dfrac{1.96}{3.25} = .603$, then…
         b. $1 \pm .603 = .4$ & $1.6$, then…
         c. $2.5^{0.4} = 1.4$ (Input "2.5" into the calculator & then hit the "y^x" key. Input 0.4 & then push the "=" key. This will give you "1.442699906".)
         d. $2.5^{1.6} = 4.3$ (Input "2.5" into the calculator & then hit the "y^x" key. Input 1.6 & then push the "=" key. This will give you "4.33215527".)
      2. Or, with a Confidence Interval of 95%, the score is somewhere between 1.4 & 4.3. Because the number 1 does not fall within this spread, it is significant.
      3. We say: “Among fish eaters, the best estimate is that they have a 2.5 times greater risk of stomach cancer than non-fish eaters, & we are 95% confident that the true value falls between 1.4 & 4.3.”

   E. Attributable Incident Rate (AI)
      1. Attributable Incident Rate Among Exposed (AIE):
         a. $\text{AI}_E\% = \dfrac{\text{OR} - 1}{\text{OR}} = \dfrac{2.5 - 1}{2.5} = \dfrac{1.5}{2.5} = .60$ or 60%
         b. We would then say, “Among people who eat fish, 60% of their stomach cancer is attributable to their diet.”
      2. Attributable Incident Rate Population (AIT):
         a. $\text{AI}_T = (\text{AI}_E\%)(\text{CAE}) = (60\%)(34\%) = (.6)(.34) = .204 = 20\%$
         b. We can then say, “If fish were eliminated from the diet, the incidence of stomach cancer would be reduced by 20% in the population.”

IX. INTERACTION (CONFOUNDING):

   A. Two or more variables may interact in a synergistic fashion to compound the risk of a disease.
      1. The joint effect of Cause 1 & Cause 2 is much greater than the simple sum of each.

         [Diagram: CAUSE #1 (OR=2.2) & CAUSE #2 (OR=3.5) acting together lead to DISEASE with OR=12.3]

      2. For example: The odds ratio of contracting Black-Lung disease among coal miners may be 2.2, but among miners who smoke the OR increases to 12.25. The joint effect is more than the sum of the risks simply added together.

   B. We will be using the following example: “What is the risk of mouth cancer among alcoholics who smoke?”
      1. Among the cases, 390 subjects are alcoholics (ETOH +) & 110 are non-drinkers. Controls consist of 275 alcoholics & 225 non-drinkers.

         FULL MODEL
         EXPOSURE     CA           CO           TOTALS
         ETOH-YES     390 (a)      275 (b)        665 (M1)
         ETOH-NO      110 (c)      225 (d)        335 (M0)
         TOTALS       500 (N1)     500 (N0)     1,000 (T)

         a. $\text{CAE} = \dfrac{390}{500} = .78$ (78%)
         b. $\text{COE} = \dfrac{275}{500} = .55$ (55%)
         c. Relative Incidence: $\text{RI} = \dfrac{a \times d}{b \times c} = \dfrac{87{,}750}{30{,}250} = 2.9$
         d. Chi-Test:

            $\chi = \dfrac{a - \dfrac{M_1 N_1}{T}}{\sqrt{\dfrac{N_1 N_0 M_1 M_0}{T^2 (T - 1)}}} = \dfrac{390 - \dfrac{(665)(500)}{1{,}000}}{\sqrt{\dfrac{(500)(500)(665)(335)}{1{,}000^2 (1{,}000 - 1)}}} = \dfrac{57.5}{\sqrt{\dfrac{55{,}693{,}750{,}000}{999{,}000{,}000}}} = \dfrac{57.5}{\sqrt{55.75}} = \dfrac{57.5}{7.47} = 7.7$

         e. $\chi$ = 7.7 is significant at the 95% confidence level because it is > 1.96.
         f. $\text{RI}^{\,1 \pm Z/\chi} = 2.9^{\,1 \pm 1.96/7.7}$   Note: 2.9 is our Odds Ratio. $\dfrac{1.96}{7.7} = .25$, so $1 \pm .25 = .75$ & $1.25$; $2.9^{.75} = 2.2$ & $2.9^{1.25} = 3.8$
         g. We can say that, “People who drink alcohol have a 2.9 times greater risk of mouth cancer, with a 95% confidence interval between 2.2 & 3.8.”

      2. Now add smoking to the equation:

         NON-SMOKERS
         EXPOSURE     CA           CO           TOTALS
         ETOH-YES      75 (a)      150 (b)      225 (M1)
         ETOH-NO       50 (c)      150 (d)      200 (M0)
         TOTALS       125 (N1)     300 (N0)     425 (T)

         $\text{RI} = \dfrac{a \times d}{b \times c} = \dfrac{11{,}250}{7{,}500} = 1.5$

         SMOKERS
         EXPOSURE     CA           CO           TOTALS
         ETOH-YES     315 (a)      125 (b)      440 (M1)
         ETOH-NO       60 (c)       75 (d)      135 (M0)
         TOTALS       375 (N1)     200 (N0)     575 (T)

         $\text{RI} = \dfrac{a \times d}{b \times c} = \dfrac{23{,}625}{7{,}500} = 3.15$ (round to 3.2)

         a. Compare the ratios: $\text{RI}_{NS} = 1.5$, $\text{RI}_{S} = 3.2$
         b. Subtract 1 from both RIs, because 1 stands for equal risk. The ratios become .5 (non-smokers) & 2.2 (smokers).
         c. $\dfrac{2.2}{.5} = 4.4$: among drinkers, smokers have about a 4 times greater excess risk of mouth cancer than non-smokers.

         ALCOHOL   SMOKE   SUBJECTS                     CA               CO                RI
            -        -     Don't smoke & don't drink     50 (b group)    150 (d group)     1.0
            +        -     Drink but don't smoke         75 (a1)         150 (c1)          1.5
            -        +     Smoke but don't drink         60 (a2)          75 (c2)          2.4
            +        +     Drink & smoke                315 (a3)         125 (c3)          7.6

         Those who smoke and drink are 7.6 times more likely to develop mouth cancer than those who neither drink nor smoke.

ACRONYMS

   AIE%: Attributable Incident Rate Among Exposed
   AIT%: Attributable Incident Rate Population (Etiological Fraction)
   CA: Cases
   CAE: Case Exposure Rate
   CI: Confidence Interval
   COE: Control Exposure Rate
   OR: Odds Ratio
   RI: Relative Incidence
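
The stratified alcohol & smoking example in section IX can be recomputed with a few lines of code. This is only a sketch under stated assumptions: the helper name `relative_incidence` & the dictionary layouts are illustrative, & each joint-effect RI is taken as a plain odds ratio against the group that neither drinks nor smokes, which is how the RI column of the final table appears to be built.

```python
def relative_incidence(a, b, c, d):
    """Odds ratio ad/bc for a 2x2 table (hypothetical helper).

    a = exposed cases,   b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    return (a * d) / (b * c)


# Alcohol exposure within each smoking stratum: (a, b, c, d).
strata = {
    "non-smokers": (75, 150, 50, 150),
    "smokers": (315, 125, 60, 75),
}
stratum_ri = {name: relative_incidence(*cells) for name, cells in strata.items()}

# Joint effects: every group is compared with the reference group of
# subjects who neither drink nor smoke (50 cases, 150 controls).
ref_cases, ref_controls = 50, 150
groups = {
    "drink but don't smoke": (75, 150),
    "smoke but don't drink": (60, 75),
    "drink & smoke": (315, 125),
}
joint_ri = {
    name: relative_incidence(cases, controls, ref_cases, ref_controls)
    for name, (cases, controls) in groups.items()
}

# Ratio of excess risks (RI - 1) between strata. Using the unrounded
# smoker RI of 3.15 gives a slightly smaller value than rounding to 3.2 first.
excess_ratio = (stratum_ri["smokers"] - 1) / (stratum_ri["non-smokers"] - 1)

for name, value in {**stratum_ri, **joint_ri}.items():
    print(f"{name}: RI = {value:.2f}")
print(f"excess-risk ratio: {excess_ratio:.1f}")
```

Keeping the reference group explicit makes it easy to see that the joint RI for drinking & smoking (about 7.6) exceeds what the two single-exposure RIs (1.5 & 2.4) would suggest if their excess risks were simply added.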