FUNDAMENTALS OF MEDICAL RESEARCH
Ed Gracely, Ph.D.
Family, Community, and Preventive Medicine
July 15, 2014

Descriptive statistics and standard error of the mean

Purpose and Goals

Attendees should be able to:
Define and interpret the most common descriptive and epidemiologic statistics, including
1. Averages (mean, median)
2. Measures of variability (SD, range, interquartile range)
3. The standard error of the mean
4. Relative risks and odds ratios
5. Absolute risk reductions and number needed to treat
6. Apply some basic rules for reading/creating graphs and tables.

A) Basic descriptive statistics

The sample is the set of subjects actually included in the study. The population is the larger body of individuals, generally too large to study, to whom the results are intended to generalize.

1) Averages:

The mean is the ordinary average. Add up the N values and divide by N. The symbols X̄ (sample mean) and μ (population mean) are commonly encountered.

The median is the number closest to the middle of a rank-ordered set: the 50th percentile. The median is sometimes symbolized as X̃.

Decisions: The mean better uses all of the data, and is generally preferred. Its big fault is that a few very high (or very low) values can pull it quite far from the center of the data. In those cases, the median is often preferred.

Q: The mean salary of major league baseball players in 1993 was $1.16 million. The median was $490,000. (Source lost). What do the mean and median tell you about this group? [a]

Q: In a study of bacterial endocarditis, time till diagnosis had median = 60 days, mean = 82, range 0 - 490. What can you conclude about this data set? (Source lost). [b]

2) Measures of variability

Standard deviation: Perhaps the most common measure of the variability of the data around the mean. Roughly indicates how far the "typical" value is from the mean.
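As a rough illustration of the mean, median, and SD described above, here is a minimal Python sketch. The data values are hypothetical (they do not come from any study cited in this handout), and the variable names are chosen only for this example. It uses the standard library's statistics module.

```python
from statistics import mean, median, stdev

# Hypothetical values (not from the handout): nine modest values and one extreme one.
values = [2, 3, 3, 4, 4, 5, 5, 6, 7, 61]

sample_mean = mean(values)      # add up the N values and divide by N
sample_median = median(values)  # middle of the rank-ordered values
sample_sd = stdev(values)       # sample standard deviation (divides by N - 1)

# The single extreme value pulls the mean well above the median.
print("mean:", sample_mean)
print("median:", sample_median)
print("SD:", round(sample_sd, 1))
```

With data like these, the median stays near the bulk of the values while the mean and SD are both inflated by the one extreme value.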
In a symmetrical distribution (in particular, a bell-curve "normal" distribution), about 2/3 of the values will be within 1 SD of the mean and about 95% within 2 SDs. This statistic is often used with the mean.

Ex: Mean ± SD for heart rate was 140 ± 35 in a group of children admitted for an attack of asthma. (modified St Chris data).

Q: If heart rate is normally distributed, within what interval did about 2/3 of the heart rates fall? How about 95%? [c]

A z-score is the number of standard deviations a value is above or below the mean, with values below the mean having negative z scores.

DEXA scan results are reported as z scores, calculated from the mean and SD for the patient's own age and sex. A DEXA "T" score is also a z-score, but using the mean and SD of a healthy young population rather than the patient's own age group.

Q: An older woman has a DEXA z score of -0.8 and a T score of -2. What does this pattern mean? [d]

3) Other measures of variability

Both are often used with the median.

Range: The difference between the largest and smallest values (often presented as just those two values rather than as the difference).

Interquartile range: The difference between the 75th and 25th percentiles, also often given as just those two values rather than as the difference.

B) Standard error of the mean (SEM)

The SEM indicates how closely the sample mean approximates the population mean it estimates. You can be approximately 95% sure that the true mean (that is, the population mean) for the parameter of interest is within 2 SEMs of the sample mean.

The standard error of the mean (SEM or just SE) is calculated as

SEM = SD / √(number of subjects)

Ex: "After 48 hours on oral rehydration therapy, a group of 36 children had a hematocrit of 37, SD = 4". The "37" is the sample mean. So SEM = 4/√36 = 4/6 = 0.67.
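The SEM arithmetic for the hematocrit example can be sketched in Python as follows. The function names sem and approx_95_interval are illustrative only (they are not part of the handout), and the interval uses the "within 2 SEMs" rule of thumb stated above rather than an exact confidence-interval method.

```python
import math

def sem(sd, n):
    """Standard error of the mean: SD divided by the square root of N."""
    return sd / math.sqrt(n)

def approx_95_interval(sample_mean, sd, n):
    """Approximate 95% interval for the population mean: mean +/- 2 SEM."""
    half_width = 2 * sem(sd, n)
    return sample_mean - half_width, sample_mean + half_width

# Hematocrit example from the text: N = 36, sample mean = 37, SD = 4.
hematocrit_sem = sem(4, 36)
low, high = approx_95_interval(37, 4, 36)

print("SEM:", round(hematocrit_sem, 2))
print("Approximate 95% interval:", round(low, 2), "to", round(high, 2))
```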
Thus you can be 95% sure that the population mean hematocrit for subjects like those in the example is between 37 - 2(.67) and 37 + 2(.67) = 35.66 to 38.34.

Ex: 36 children with hyaline membrane disease had a mean ± SE birth weight of 1400 ± 50 gm (modified from Pediatrics, August 1976).

Q: Within what interval are you 95% sure the population mean birth weight of children like those in the study falls? [e]

Key points/distinctions:

The SD tells the variability of individual subjects/values in either the sample or the population.

The SEM tells how closely the sample mean approximates the population mean.

In trying to determine the variability in the data in an article, always check to see if you are being given the SD or the SEM. Since the SEM divides the SD by the square root of the sample size, it is always smaller than the SD (for any N above 1), and usually much smaller!

Standard errors for other parameters (percents, regression statistics) are sometimes encountered. They are used in the same way as the SEM, but there is no SD involved, and the calculation is different. Normally they will be given to you in the text.

C) Some simple graphs and tables guidelines

Ex: Consider the following data:

Note: Subjects were asked if they had any limitations (such as physical issues) on activity.

Presence or absence of personal activity limits in adults surveyed, by educational attainment

Education                 No activity limit   Any limit     Total
                          N (%)               N (%)
Less than High School     546 (73%)           204 (27%)       750
High School or Higher     2,581 (89%)         311 (11%)     2,892

Rule: Whenever you have two or more groups to be compared on a yes/no (or other two-level) dependent variable, the best approach is to find the percentage in the more interesting level of the dependent (outcome) variable for each of the groups to be compared. Then the comparison of interest is "Of those in group 1, x% had this characteristic. Of those in group 2, y% had that same characteristic."

Q: Take my example above and put it into this form.
[f]

Ex: Consider the following data:

           Age
           Young   Old   Total
Fail          30     6      36
Succeed       70    14      84
Total        100    20     120

A researcher argues: "70% of the Young group succeeded (70/100), whereas only 30% of them failed (30/100). Furthermore, a full 83.3% of the successful subjects were in the young group (70/84). These results clearly show the association between age group and success".

Q: This argument is not valid. Why? Use my suggested approach above to make more appropriate comparisons. [g]

Remember: In almost all applications in medicine that involve comparisons of groups on a yes/no characteristic or outcome, we are comparing rates or percentages with the characteristic. Be sure you are comparing apples to apples. And note that it is rarely appropriate to use raw numbers. There are exceptions, but they represent special cases.

Rule: All graphs should be clearly labeled with a title and axis labels. Show units, where appropriate. Ideally, a graph in a publication should stand alone, with minimal need to refer to text or other graphs just to know what it shows. Avoid undefined abbreviations or acronyms, unique undefined terms, and so on.

Similar rules apply to tables. As with graphs, tables must be given a descriptive title. They should also have clear row and column headers. It is common to give both numbers and percentages. Also, indicate the units or contents of the table cells, like "N (%)".

D) Relative risks and odds ratios

Relative risk: the incidence in one (exposed) group divided by the incidence in the unexposed. When greater than 1, it is common to interpret a relative risk as how many times "more likely" something (like a disease) is to happen in one group than in another. When the relative risk is less than 1, it is often simpler to subtract it from 1 and interpret the result as a relative risk reduction.

Ex: The relative risk of lung cancer in smokers may be 10 compared to non-smokers. This would mean that smokers were 10 times as likely to develop it as non-smokers.
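The relative risk definition above can be sketched in a few lines of Python. The incidence figures below are hypothetical, picked only so that one relative risk comes out above 1 and one below 1, and the function names are illustrative rather than taken from any library.

```python
def relative_risk(incidence_exposed, incidence_unexposed):
    """Incidence in the exposed group divided by incidence in the unexposed group."""
    return incidence_exposed / incidence_unexposed

def relative_risk_reduction(rr):
    """For a relative risk below 1: the proportion by which risk is reduced."""
    return 1 - rr

# Hypothetical incidences (assumptions for illustration, not study data).
rr_above_one = relative_risk(0.20, 0.02)  # exposed group is about 10 times as likely
rr_below_one = relative_risk(0.05, 0.10)  # exposed group has about half the risk

print("RR above 1:", round(rr_above_one, 2))
print("RR below 1:", round(rr_below_one, 2))
print("Relative risk reduction:", round(relative_risk_reduction(rr_below_one), 2))
```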
Ex: The relative risk of some disorder in smokers who quit years ago may be 0.3 compared to current smokers. The incidence in that group is 0.3 times that in current smokers. And we can say that quitting reduced their risk by 1 - 0.3 = 0.7 (or 70%).

Q: A large group of healthy patients is followed for 4 years, with symptomatic diverticular disease as an outcome of interest. NSAID users were found to have a relative risk of 2.24 compared to non-users for this outcome. What does this mean? (Aldoori et al, Arch Fam Med, May-Jun 1998). [h]

Ex: Kearney et al, Am. J. Epidemiology May 1, 1996, followed a group forward after assessing dietary variables. They were interested in colon cancer. The relative risk for that cancer was about 0.6 in those with the highest calcium compared to those with the lowest.

Q: This suggests that people with high calcium had (pick all that apply): [i]
a. A risk 0.6 times that of people in the low calcium group of getting colon cancer.
b. A 0.6% probability of getting colon cancer.
c. A risk of colon cancer that was 1.6 times that in the low calcium group.
d. A risk that was 40% lower than those in the low calcium group.

Q: What relative risk would indicate no difference? [j]

After adjusting for some confounding variables, the relative risk was much larger and no longer convincingly different from 1. We'll discuss this kind of thing further in a later class in this series.

Odds ratio: How many times the "odds" of something happening are increased in one group compared to another.

Rule: when the outcomes are uncommon (say, < 10% in either underlying population), the odds ratio approximates the relative risk. It is generally interpreted in that way.

EBM note: For clinical purposes, absolute risk reductions are more useful than relative ones. The absolute risk reduction is the difference between the two percentages.

Ex: 2% of cases of a certain infection recur after cure. With a new antibiotic, only 1% do. This is a relative risk reduction of 50%.

Ex: 40% recur, but only 20% on a new antibiotic.
This is also a 50% relative risk reduction.

Q: Which represents a bigger real impact (or are they the same)?

The absolute risk reductions are 1% and 20%. These tell you more about the value of the new treatment.

The number needed to treat is the inverse of the absolute risk reduction (as a decimal): 1/.01 = 100. 1/.20 = 5.

So, using the new antibiotic would prevent one patient in 100 from recurring with the first disease, but fully 1 in 5 with the second.

QUIZ

1. You are reporting on a sample of stroke patients, giving the number of years since their stroke. The results are: 0.6, 1.2, 1.5, 2, 2.6, 3, 3.2, 5, 8, 25, 45
What single statistic best summarizes the average response?
a. The standard deviation
b. The standard error of the mean
c. The median
d. The mean
e. The relative risk

2. In your practice of hypertensive patients, you have a database that records their most recent SBP and DBP. Your assistant calculates the mean ± SD for the SBP, and finds 120 ± 10.
a. Within what interval (assuming rough normality) would you expect about 95% of the SBP values to fall?
b. About what % of your patients had a last SBP value of 140 or higher?

3. Wintemute et al. studied the criminal activities of handgun purchasers who had, or did not have, a prior misdemeanor conviction as of the date of their purchase. Over the next 15 years, 10% of those without an initial conviction committed a subsequent crime. For those with a prior conviction, the relative risk of a second crime (compared to those initially crime-free) was 5. [JAMA, Dec 23/30, 1998]
What % of those with a prior conviction went on to commit a second crime?
5%   10%   20%   25%   50%   75%   All of them

4. You read papers on multiple sclerosis in two different major US cities. The paper for Philadelphia presents an annual incidence rate. The one for Chicago presents a prevalence rate. You don't notice that difference and think both are incidence rates. You are surprised at the data!
Which city would have the higher reported rate, and why?

5. The normal range for WBC is 5-10 (thousand per microliter). In a sample of patients with a certain low-level infection, the range of WBC is given as 5 to 40, median 25, IQR 10 to 35. What % of this sample is in the normal range for WBC? What % is above 25?

6. To help determine the distribution of individual subjects, one should use the (SD // SEM). To get a margin of error for estimating the population mean, one should use the (SD // SEM).

7. You see that the mean on a certain blood parameter was 20, SEM = 1. Does this indicate that the researcher has very little variability in the data?
a. Yes
b. No, because the SEM was 5% of the mean.
c. No, because the SEM depends on N and therefore does not directly indicate data variability.
d. No, because we don't have the relative risk.

8. In #7, where are you 95% sure that the true (population) mean lies? ____ to ____.

9. 10% of patients given the standard treatment recur. 8% of patients given the new treatment recur. How many patients would have to get the new treatment rather than the old in order for 1 to benefit?

10. A researcher reports that 20 of the older children had attained a skill (like self-injection) after training, but only 10 of the younger group did so. Your comment:
a. Interesting result. If this is beyond chance it would show the importance of age on learning this skill.
b. I need the SD of the yes/no skill outcome in order to interpret this result.
c. I need the number of old and young kids and then the % successful in each age group to interpret this result.

FMR.1_of_4.descriptives and SEM.doc

Quiz answers
1: c, since the data has high extremes
2: a. 100-140. b. About 2.5% would be 140 and above.
3: 50% (RR = 5, so 5 × 10% = 50%)
4: Chicago: since people live many years with MS, the prevalence will be many times the annual incidence.
5: 25% in the normal range, 50% above 25.
6: SD, then SEM
7: c
8: 18 to 22 (2 SEMs on either side of the mean)
9: 50 (Absolute risk reduction is 2%, or 0.02. 1/0.02 = 50).
10: c

[a] The few players who make very high salaries pull up the mean but don't affect the median.
[b] There are some very long times till diagnosis. The mean is > median, and neither is mid-range.
[c] 105-175 for 2/3. 70-210 for 95%.
[d] She is moderately low for her age group (0.8 SD below the mean) but more seriously low compared to a young population (2 SDs below their mean). A T score < -1 is osteopenia.
[e] 1300-1500
[f] Of those with less than high school, 27% had any limit. Of those with HS or higher, 11% did.
[g] In the young group, 70% succeeded. In the old group, 70% (14/20) succeeded. The success percentages are the same in the two age groups.
[h] NSAID users were 2.24 times as likely to develop the disease.
[i] (a) and (d).
[j] 1.0
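For readers who want to check the absolute risk reduction and number needed to treat arithmetic from the EBM note in section D, here is a minimal Python sketch. It uses the two antibiotic examples from that note (2% vs 1% recurrence, and 40% vs 20%); the function names are illustrative assumptions, not an established API.

```python
def absolute_risk_reduction(risk_standard, risk_new):
    """Difference between the two risks, expressed as decimals."""
    return risk_standard - risk_new

def number_needed_to_treat(risk_standard, risk_new):
    """Inverse of the absolute risk reduction (as a decimal)."""
    return 1 / absolute_risk_reduction(risk_standard, risk_new)

# The two antibiotic examples from the EBM note, as decimals.
nnt_first = number_needed_to_treat(0.02, 0.01)   # 2% recur vs 1% recur
nnt_second = number_needed_to_treat(0.40, 0.20)  # 40% recur vs 20% recur

print("NNT, first infection:", round(nnt_first))
print("NNT, second infection:", round(nnt_second))
```

Both examples have the same 50% relative risk reduction, but the number needed to treat differs by a factor of about 20, which is why the absolute figures say more about clinical value.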