Medical Statistics: Hypothesis Testing
Nimrod Lavi, MD
Adhir Shroff, MD, MPH
Clinical Decision Making: Introduction to Hypothesis Testing

Agenda
– Types of variables
– Descriptive statistics
– What is a hypothesis
– Definition of a p-value
– Sample vs. universe
– Comparative statistics: t-tests, chi-square

Continuous variable
– One in which research participants differ in degree or amount
– “Susceptible to infinite gradations” (p. 176, Pedhazur & Schmelkin, 1991)
– Examples: height, weight, age

Categorical variable
– Participants belong to, or are assigned to, mutually exclusive groups
– Nominal
  – Used to group subjects
  – Numbers are arbitrary
  – Examples: sex, race, dead/alive, marital status
– Ordinal (rank)
  – Participants are given a numerical value in accordance with their rank on the variable
  – The numerical values tell nothing of the distance between participants
  – Examples: class rank, finishers in a race

Independent vs. Dependent Variable
– Independent: the “predictor” variable, usually on the “x” axis
– Dependent: the “outcome” variable, usually on the “y” axis
– The independent variable (a treatment) leads to the dependent variable (outcome)
– Ultimately, we are interested in differences between dependent variables

Agenda: Types of variables, Descriptive statistics, What is a hypothesis, Definition of a p-value, Sample vs.
universe, Comparative statistics (t-tests, chi-square)

Descriptive Statistics
– These are measures or variables that summarize a data set
– 2 main questions
  – Index of central tendency (e.g., mean)
  – Index of dispersion (e.g., standard deviation)

Descriptive Statistics: Categorical data
– Data set: ICD complications in 2005, 14 patients
– Sex: F, F, M, M, F, F, F, M, F, M, M, F, F, F
– Make: G, S, G, G, G, M, S, S, G, G, M, S
– Central tendency is summarized by proportion or frequency
– Sex
  – M: 5/14 = .36 or 36%
  – F: 9/14 = .64 or 64%
– Make
  – G: 6/12 = .5 or 50%
  – S: 4/12 = .33 or 33%
  – M: 2/12 = .17 or 17%
– Dispersion is not really used for categorical data

Descriptive Statistics: Continuous variable
– Data set: SBP among a group of CHF pts in VA clinic, 13 patients
– 100, 95, 98, 172, 74, 103, 97, 106, 100, 110, 118, 91, 108
– Central tendency
  – Mean: mathematical average of all the values, $\bar{x} = (x_1 + x_2 + \dots + x_n)/n$
  – Median: value that occupies the middle rank when values are ordered from least to greatest
  – Mode: most commonly observed value(s)

Central Tendency: Mean
– (100 + 95 + 98 + 172 + 74 + 103 + 97 + 106 + 100 + 110 + 118 + 91 + 108)/13 = 1372/13 = 105.5

Central Tendency: Median
– 74, 91, 95, 97, 98, 100, 100, 103, 106, 108, 110, 118, 172
– The middle (7th of 13) value is 100
– Useful if data are skewed or there
are outliers

Descriptive Statistics: Index of dispersion
– Data set: SBP among a group of CHF pts in VA clinic (the same 13 values)
– Continuous variable
– Standard deviation
  – Measure of spread around the mean
  – Calculated by measuring the distance of each value from the mean, squaring these results (so that negative and positive distances do not cancel), adding them up, dividing by n − 1, and taking the square root

Descriptive Statistics: “Normal”
(figure)

Descriptive Statistics: Confidence Intervals
– “Range of values which we can be confident includes the true value”
– Defines the “inner zone” about the central index (mean, proportion, or ratio)
– Describes variability in the sample from the mean or center
– CIs are also used to describe the difference between means or proportions when comparing groups
– Altman DG. Practical Statistics for Medical Research; 1999

Descriptive Statistics: Confidence Intervals
– For example, a “95% CI” indicates that we are 95% confident that the population mean will fall within the range described
– Can be used similarly to a p-value to determine significant differences
– Like SD, the CI conveys spread, but it describes the precision of the estimate rather than the scatter of individual values
– As sample size increases or variability in the measurement decreases, the CI becomes narrower

Descriptive Statistics: Confidence Intervals
– Lancet 1999; 354: 708–15
– Prospective, randomized, multicenter trial of different management strategies for ACS
– 2500 pts enrolled in Europe with 6-month follow-up
– Primary endpoint: composite of death and myocardial infarction after 6 months

Descriptive Statistics: Confidence Intervals
Lancet 1999; 354:
708–15

Descriptive Statistics: Confidence Intervals
– Lancet 1999; 354: 708–15
– *Risk ratio = Risk(invasive) / Risk(noninvasive)
– When the CI crosses 1 (or whatever value designates equivalence), the p-value will not be significant

Descriptive Statistics: Confidence Intervals
– Lancet 1999; 354: 708–15
– Review: calculate RRR, ARR, NNT
  – RRR = (12.1 − 9.4) / 12.1 = 22%
  – ARR = 12.1 − 9.4 = 2.7%
  – NNT = 100 / ARR = 100 / 2.7 = 37

Hypothesis
– Statement about a population, where a certain parameter takes a particular numerical value or falls in a certain range of values
– Examples:
  – A director of an HMO hypothesizes that LOS after AMI is longer than for CHF exacerbation
  – An investigator states that a new therapy is 10% better than the current therapy
  – Bivalirudin is non-inferior to heparin/eptifibatide for coronary PCI

Null Hypothesis (Ho)
– “Innocent until proven guilty”
– The null hypothesis (Ho) usually states that no difference between test groups really exists
– A fundamental concept in research is that of either “rejecting” or “conceding” the Ho
– State the Ho:
  – A director of an HMO hypothesizes that LOS after AMI is longer than for CHF exacerbation
  – An investigator states that a new therapy is 10% better than the current therapy
  – Bivalirudin is non-inferior to heparin/eptifibatide for PCI

Null Hypothesis (Ho): Courtroom Analogy
– The null hypothesis is that the defendant is innocent.
– The alternative is that the defendant is guilty.
– If the jury acquits the defendant, this does not mean that it accepts the defendant’s claim of innocence. It merely means that innocence is plausible because guilt has not been established beyond a reasonable doubt.
Graduate Workshop in Statistics, Session 4. Hamidieh K. 2006, Univ of Michigan

Extrapolation of Research Findings
– Sample population vs. the world
– If your study shows that treatment A is better than treatment B
  – You cannot conclude that treatment A is ALWAYS better than treatment B
  – You only sampled a small portion of the entire population, so there is always a possibility that your observation was a chance event

Extrapolation of Research Findings
– At what point are we comfortable concluding that there is a difference between the groups in our sample?
– In other words, what is the false-positive rate that we are willing to accept?
– What is this called in statistical terms?

Agenda: Types of variables, Descriptive statistics, What is a hypothesis, Definition of a p-value, Sample vs.
universe, Comparative statistics (t-tests, chi-square)

Definition of p-value
– With any research study, there is a possibility that the observed differences were a chance event
– To know with certainty that a difference is really present, the entire population would need to be studied
– The research community and statisticians had to pick a level of uncertainty with which they could live
– This level of uncertainty is called type 1 error or the false-positive rate

Two Types of Errors
Truth                          Decision made    Result
H0 true (trt has no effect)    Reject H0        Type I error
H0 true (trt has no effect)    Not reject H0    Correct decision
H1 true (trt has an effect)    Reject H0        Correct decision
H1 true (trt has an effect)    Not reject H0    Type II error
“Power”: stay tuned…
Graduate Workshop in Statistics, Session 4. Hamidieh K. 2006, Univ of Michigan

Definition of p-value
– This level of uncertainty is called type 1 error or the false-positive rate (α)
– Study results are reported as a p-value, which is compared against α
– Statistical significance will be recognized if p ≤ 0.05 (the threshold can be set lower if one wishes)

Trade-Off in Probability for Two Errors
– There is an inverse relationship between the probabilities of the two types of errors
– Increase in probability of a type I error → decrease in probability of a type II error
– (figure: .01 vs. .05)
Graduate Workshop in Statistics, Session 4. Hamidieh K.
2006, Univ of Michigan

Definition of p-value
– In general, p ≤ 0.05 is the agreed-upon level
– In other words, if there were truly no difference, the probability of observing by chance a difference at least as large as the one in our sample is less than 5%
  – Therefore we can reject the Ho

Definition of p-value: Stating the Conclusions of our Results
– When the p-value is small, we reject the null hypothesis or, equivalently, we accept the alternative hypothesis.
  – “Small” is defined as a p-value ≤ α, where α = acceptable false (+) rate (usually 0.05).
– When the p-value is not small, we conclude that we cannot reject the null hypothesis or, equivalently, there is not enough evidence to reject the null hypothesis.
  – “Not small” is defined as a p-value > α, where α = acceptable false (+) rate (usually 0.05).
Graduate Workshop in Statistics, Session 4. Hamidieh K. 2006, Univ of Michigan

Agenda: Types of variables, Descriptive statistics, What is a hypothesis, Definition of a p-value, Sample vs.
universe, Comparative statistics (t-tests, chi-square)

                          Continuous                     Categorical
One variable              Mean, SD; one-sample t-test    Frequency
Two variables             t-test                         Chi-square
Three or more variables   ANOVA                          Chi-square

Two Sample Tests: Continuous Variable
– t-test
– Comparing two groups, statistical significance is determined by:
  – Magnitude of the observed difference: bigger differences are more likely to be significant
  – Spread, or variability, of the data: larger spread makes the difference less likely to be significant
– Key is to compare the difference between groups with the variability within each group

Two Sample Tests: Continuous Variable
– Types of t-tests
– Student t-test or two-sample t-test
  – Used if the two samples are unpaired
  – Example: a randomized trial of high-dose statin versus placebo post AMI
– Paired t-test
  – Used if the observations are paired
  – Each person is measured twice under different conditions, or
  – Similar individuals are paired prior to an experiment; each receives a different trt and the same response is measured
  – Example: a study of ejection fraction in patients before and after Bi-V pacing

Two Sample Tests: Continuous Variable
– t-test tails
– “Two-tailed”: most commonly used in clinical research studies
– Means that the treatment group can be better or worse than
the control group
– “One-tailed”: used only if the groups can differ in only one direction

Example: t-test
– What type of test should be run?
– How are the data related, or are they?
– Data entered into a statistical program…
– p-value = 0.2329, not significant

Two Sample Tests: Categorical Variables
– Chi-square (χ2) analysis
– For data that are organized into frequencies, which generate proportions
– Based on comparing the values expected under the null hypothesis with what is actually observed
– The greater the difference between the observed and expected values, the more likely the result will be significant

Chi-square (χ2) analysis
Therapy                  Outcome +   Outcome −   Totals
Group A (“Control”)      a           b           a+b
Group B (“Treatment”)    c           d           c+d
Totals                   a+c         b+d         a+b+c+d
– The null hypothesis states that the outcomes of therapy A and B are equally successful
– This is how the expected outcomes are determined
– Next, the actual observed values are recorded
– With this information the χ2 value can be calculated and a p-value generated

Example: χ2 analysis
– Arrange the data into a 2x2 table
– Treatment groups along the vertical axis, outcomes along the horizontal axis
– Data entered into a statistical program
– P-value 0.6392
– Not a significant difference
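
The observed-versus-expected comparison for a 2x2 table can be sketched in a few lines of Python. This is a minimal illustration, not the statistical program used for the example above: the counts are hypothetical (the example's data are not reproduced in the text), the function name `chi_square_2x2` is made up for this sketch, and no continuity correction is applied, so a statistics package that applies one may report a slightly different p-value.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    Rows are the therapy groups, columns are outcome +/-,
    matching the a, b, c, d layout of the table.
    """
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under the null hypothesis of no difference
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, for illustration only (not from the lecture)
chi2, p_value = chi_square_2x2(20, 80, 30, 70)
print(f"chi-square = {chi2:.3f}, p = {p_value:.4f}")
```

The closed-form tail probability is used only to keep the sketch free of third-party dependencies; it is valid for a 2x2 table (1 degree of freedom) and not for larger tables.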
Example: Ear Infections and Xylitol
– Experiment: n = 533 children randomized to 3 groups
– Group 1: Placebo Gum; Group 2: Xylitol Gum; Group 3: Xylitol Lozenge
– Response = Did child have an ear infection?

     Group     Infection   Count
1    placebo   Y           49
2    gum       N           150
3    lozenge   Y           39
4    placebo   N           129
5    gum       Y           29
6    lozenge   N           137
Graduate Workshop in Statistics, Session 5. Hamidieh K. 2006, Univ of Michigan

Two Sample Tests: Categorical Variables
Therapy                  Outcome +   Outcome −
Group A (“Control”)      a           b
Group B (“Treatment”)    c           d
$\chi^2 = \sum_{\text{all cells}} \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$

Example: Ear Infections and Xylitol
Count (expected count) by group and infection status
Group             Infection: Yes   Infection: No   Total
Placebo Gum       49 (39.1)        129 (138.9)     178 (178.0)
Xylitol Gum       29 (39.3)        150 (139.7)     179 (179.0)
Xylitol Lozenge   39 (38.6)        137 (137.4)     176 (176.0)
Total             117 (117.0)      416 (416.0)     533 (533.0)
– Compute the expected count for each cell: Expected count = (Row total) × (Column total) / Total n
– Example: 39.1 = (178 × 117) / 533
– Or intuitively, calculate the overall infection rate = total number infected / total number = 117/533 = .2195
– Now, assuming no difference between treatments, the infection rate will be the same in each group, so the expected count = .2195 × total for each group, e.g. .2195 × 178 = 39.1
Graduate Workshop in Statistics, Session 5. Hamidieh K. 2006, Univ of Michigan

Example: Ear Infections and Xylitol
– Summing (Observed − Expected)² / Expected over all six cells gives χ2 ≈ 6.69 with 2 degrees of freedom
– → From a table, p = 0.035
Graduate Workshop in Statistics, Session 5. Hamidieh K.
2006, Univ of Michigan

Conclusion
– There are many ways to describe one’s data
– The p-value is compared against α, the maximum acceptable false-positive rate
– Remember the Courtroom Analogy when it comes to the null hypothesis
– Choice of statistical test depends on the type of variable and the number of comparison groups

References
– Neely JG, et al.
  – Laryngoscope, 112:1249–1255, 2002
  – Laryngoscope, 113:1534–1540, 2003
  – Laryngoscope, 113:1719–1724, 2003
– Guyatt G, et al. Basic Statistics for Clinicians. CMAJ. 1/1/95
– http://www-personal.umich.edu/~khamidie/?M=A
– Altman DG. Practical Statistics for Medical Research. 1999.

Thank you
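
As a supplementary sketch, the worked numbers from the talk (the SBP summary statistics, the RRR/ARR/NNT review, and the xylitol chi-square test) can be reproduced with the Python standard library. The data values are taken from the slides; the variable and function names are illustrative assumptions, and the closed-form p-value used here holds only for 2 degrees of freedom, which is the case for the 3 × 2 xylitol table.

```python
import math
import statistics

# SBP values from the continuous-variable example (13 patients)
sbp = [100, 95, 98, 172, 74, 103, 97, 106, 100, 110, 118, 91, 108]
sbp_mean = statistics.mean(sbp)      # arithmetic average
sbp_median = statistics.median(sbp)  # middle-ranked value
sbp_sd = statistics.stdev(sbp)       # sample SD (divides by n - 1)

# Event rates (%) from the confidence-interval review slide
risk_control, risk_treated = 12.1, 9.4
arr = risk_control - risk_treated    # absolute risk reduction, in %
rrr = arr / risk_control             # relative risk reduction, as a fraction
nnt = 100 / arr                      # number needed to treat


def chi_square(observed):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count = (row total) x (column total) / total n
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (count - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df


# Rows: placebo gum, xylitol gum, xylitol lozenge; columns: infection yes, no
xylitol = [[49, 129], [29, 150], [39, 137]]
chi2, df = chi_square(xylitol)

# For exactly 2 degrees of freedom, the chi-square upper-tail
# probability has the closed form exp(-chi2 / 2).
p_value = math.exp(-chi2 / 2)

print(f"SBP: mean {sbp_mean:.1f}, median {sbp_median}, SD {sbp_sd:.1f}")
print(f"RRR {rrr:.0%}, ARR {arr:.1f}%, NNT {nnt:.0f}")
print(f"Xylitol: chi-square {chi2:.2f}, df {df}, p {p_value:.3f}")
```

In practice a statistics package would normally be used for these tests; the hand-rolled version is shown only to make each arithmetic step in the examples visible.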