21. Introduction to Biostatistics – Part Two

[Start of recorded material]

Estimation

Suppose our estimate of the quality of life of testicular cancer survivors is 70, on a questionnaire with a range from zero to 100, with a 95% confidence interval from 40 to 100. This is not very precise. How can we make the confidence interval smaller? Increasing the sample size increases the precision of the estimate. We become more confident in our results as the precision increases, so the 95% confidence interval becomes narrower. If the 95% confidence interval were 65 to 75, we would be reasonably sure that the true quality of life lies between 65 and 75.

Comparing Groups: Randomised Controlled Trials

The last example, on the average quality of life in testicular cancer patients, was an example of a cross-sectional study. Suppose instead we have a randomised controlled trial, so that our interest is in comparing groups, sometimes called trial arms. Suppose we would like to answer the question: does physical activity result in less fatigue in lung cancer patients?

Comparing Groups: Observational Study

We also compare groups in observational studies. For example, if we want to answer the question of whether the consumption of red meat is associated with the occurrence of colorectal cancer, we could perform either a case-control study or a cohort study. For more on epidemiological or observational study design, see the Study Design module.

Null Hypothesis

In an RCT, the null hypothesis is usually that there is no difference between the two (sometimes more) groups; i.e. that there is no effect of the intervention. For example, the mean difference in self-reported fatigue scores is zero. In an epidemiological study, the null hypothesis is that there is no association between exposure and disease. For example, the odds of colorectal cancer (the disease) among people who eat red meat (the exposure) are the same as the odds for people who don't eat red meat. The odds ratio is one. This brings us to hypothesis testing.
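Returning briefly to the Estimation example, the effect of sample size on the width of a confidence interval can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the normal approximation for a mean, and the standard deviation of 25, the sample sizes, and the function name `ci_for_mean` are all made up for the sketch and do not come from the lecture.

```python
import math
from statistics import NormalDist


def ci_for_mean(mean, sd, n, level=0.95):
    """Normal-approximation confidence interval for a mean."""
    # Critical value of the standard normal (about 1.96 for a 95% interval).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical values: mean quality-of-life score 70, standard deviation 25.
for n in (10, 40, 160):
    low, high = ci_for_mean(70, 25, n)
    print(f"n={n:4d}  95% CI: {low:.1f} to {high:.1f}")
```

Under this approximation the half-width is proportional to one over the square root of the sample size, so each fourfold increase in sample size halves the width of the interval.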
We start with the null hypothesis: in general, no difference between groups, no association between exposure and disease, or the status quo.

Research Hypothesis

For every null hypothesis there is an alternative, or research, hypothesis. For our examples: there is a difference in fatigue between the physical activity group and the no physical activity group; there is an association between eating red meat and developing colorectal cancer.

The Steps of Hypothesis Testing

The steps of hypothesis testing are:

Step one, set up a null and an alternative hypothesis, and set the significance level, alpha, for the test. This defines what a small P-value is, and is conventionally 0.05.
Step two, assuming the null hypothesis is true, calculate the P-value.
Step three, compare the P-value with the significance level. If P is less than alpha, reject the null hypothesis in favour of the alternative. If P is greater than alpha, do not reject the null hypothesis.

Significance

If P is less than 0.05, we say the difference is statistically significant at the 5% level. If P is greater than 0.05, we say the difference is not statistically significant at the 5% level.

Possible Outcomes

The possible outcomes of a hypothesis test are as follows. A statistically significant difference could mean either that there is a true difference, or that there is no true difference and this study just found unusual results (a type one error). No statistically significant difference could mean either that there really is no true difference, or that there is a true difference but we did not detect it with this study (a type two error).

Error Types

In a study we may or may not get the right answer. There may be a true difference that we did not find; this is a type two error. Or there might be no true difference but, because of random variation or other reasons, we found one; this is a type one error. This is summarised in the table.
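The three steps of hypothesis testing described earlier can be sketched in Python for the fatigue example. This is a minimal sketch, not the analysis used in any real trial: the fatigue scores are invented, the function names `two_sample_z_test` and `decide` are hypothetical, and the P-value comes from a large-sample z-test, whereas samples this small would normally be analysed with a t-test.

```python
import math
from statistics import NormalDist, mean, stdev


def two_sample_z_test(group_a, group_b):
    """Step two: two-sided P-value for a difference in means,
    calculated assuming the null hypothesis (no difference) is true."""
    diff = mean(group_a) - mean(group_b)
    se = math.sqrt(stdev(group_a) ** 2 / len(group_a)
                   + stdev(group_b) ** 2 / len(group_b))
    z = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, p_value


def decide(p_value, alpha=0.05):
    """Step three: compare the P-value with the significance level."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "do not reject the null hypothesis"


# Step one: null hypothesis = mean difference in fatigue is zero; alpha = 0.05.
# Invented fatigue scores for the two trial arms (illustration only).
activity = [42, 38, 45, 40, 36, 44, 39, 41, 37, 43]
no_activity = [48, 52, 45, 50, 47, 53, 49, 46, 51, 44]

diff, p_value = two_sample_z_test(activity, no_activity)
print(f"mean difference = {diff:.1f}, P = {p_value:.2g}: {decide(p_value)}")
```

Note that the decision only says whether the data are unusual under the null hypothesis; as the Possible Outcomes section explains, either decision can still be a type one or a type two error.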
The power of a study is our ability to find statistically significant results, if they exist. A small study has low power because estimation is not very precise.

Error Type Quiz

Power

Power is the probability of finding a statistically significant difference of a given size, if it exists. Mathematically, power is equal to one minus the probability of a type two error.

Power: High or Low?

Suppose you had the same sample size for the two scenarios pictured. The top scenario, where you're looking for a large difference, has high power. It is easier to find a large difference than a small difference. The lower scenario, where there isn't much difference between the two distributions, has low power. If you want to look for smaller differences, you need a larger sample size.

Power for the Fatigue Study

Consider the study of physical activity in lung cancer. Suppose the study aims to have 100 patients in each of the two arms, and suppose that the outcome is the percentage reporting low fatigue. Although there would be high power to detect an 80% decrease in fatigue, it is highly unlikely that the intervention would actually cause such a large decrease. At the other extreme, a decrease in fatigue of just 1% is probably not worthwhile, and in order to find it, if that were the true effect of the intervention, you would need a massive study.

The Ethics of Power and Sample Size

In order for a study to be ethical, we must have a large enough sample size, and therefore enough power, to be able to detect a reasonable and meaningful difference. Reasonable means we have a hope of finding such a difference. Meaningful means the difference would be useful to patients.

P-values and Confidence Intervals

There is a connection between P-values and confidence intervals. When the confidence interval contains the null value (zero for differences, one for ratios), the hypothesis test is not statistically significant, and vice versa. The example below did not find a statistically significant difference.
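As a minimal sketch of this connection, with made-up numbers rather than the ones from the slide, the Python below computes a 95% confidence interval and a two-sided P-value from the same estimate and standard error. The function name `ci_and_p_value` and the inputs are hypothetical, and both quantities rely on a normal approximation.

```python
from statistics import NormalDist


def ci_and_p_value(estimate, se, null_value=0.0, level=0.95):
    """Confidence interval and two-sided P-value from one estimate
    and its standard error (normal approximation)."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    interval = (estimate - z_crit * se, estimate + z_crit * se)
    z = (estimate - null_value) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return interval, p_value


# Two invented results: mean difference in fatigue score and its standard error.
for estimate, se in [(-3.0, 2.0), (-5.0, 2.0)]:
    (low, high), p = ci_and_p_value(estimate, se)
    contains_zero = low <= 0.0 <= high
    print(f"difference {estimate}: 95% CI {low:.2f} to {high:.2f}, "
          f"P = {p:.3f}, interval contains zero: {contains_zero}")
```

In the first invented result the interval contains zero and P is greater than 0.05; in the second the interval excludes zero and P is less than 0.05. For a ratio such as an odds ratio, the same calculation is usually done on the log scale, where the null value of one becomes zero.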
P-values and Confidence Intervals 2

If a 95% confidence interval for a ratio contains one, the P-value will be greater than 0.05. The example below also shows a result that is not statistically significant.

RCT Quiz

Match the results with the error types.

Grilled Meat Quiz

[End of recorded material]