FILENAME: - Consumer Learning

21. Introduction to Biostatistics – Part Two
[Start of recorded material]
Estimation
Suppose our estimate of quality of life of testicular cancer survivors is 70 on a
questionnaire with a range from zero to 100 with a 95% confidence interval from 40 to
100. This is not very precise. How can we make the confidence interval smaller?
Increasing the sample size increases the precision of the estimate. We become more
confident in our results as the precision increases; thus the 95% confidence interval
becomes narrower. If the 95% confidence interval were 65 to 75, we would be reasonably
sure that the true quality of life lies between 65 and 75.
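As a rough illustration of how the interval narrows with sample size, here is a minimal
Python sketch using the normal approximation for a mean. The standard deviation of 25
and the sample sizes are assumptions chosen for illustration; they are not taken from
any real study.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(mean, sd, n, level=0.95):
    """Normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # roughly 1.96 for a 95% interval
    half_width = z * sd / sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical values: observed mean of 70, assumed standard deviation of 25.
for n in (10, 40, 160):
    low, high = mean_confidence_interval(70, 25, n)
    print(f"n={n}: 95% CI {low:.1f} to {high:.1f} (width {high - low:.1f})")
```

Under these assumptions the width is proportional to one over the square root of the
sample size, so quadrupling the sample size halves the width of the interval.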
Comparing Groups : Randomised Controlled Trials
The last example on the average quality of life in testicular cancer patients was an
example of a cross-sectional study. Suppose we have a randomised controlled trial so
that our interest is in comparing groups, sometimes called trial arms. Suppose we
would like to answer the question, does physical activity result in less fatigue in lung
cancer patients?
Comparing Groups : Observational Study
We also compare groups in observational studies. For example, if we want to answer
the question of whether the consumption of red meat is associated with the occurrence
of colorectal cancer, we could perform either a case-control study or a cohort study.
For more on epidemiological or observational study design, see the Study Design
module.
Null Hypothesis
In an RCT, the null hypothesis is usually that there is no difference between the two
(sometimes more) groups; i.e. that there is no effect of the intervention. For example,
the mean difference in self-reported fatigue scores is zero. In an epidemiological
study, the null hypothesis is that there is no association between exposure and disease.
For example, the odds of colorectal cancer (the disease) among people who eat red meat
(the exposure) are the same as the odds for people who don’t eat red meat. The odds
ratio is one. This brings us to hypothesis testing. We start with the null hypothesis:
in general, no difference between groups, no association between exposure and disease,
or the status quo.
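To make the odds ratio concrete, here is a minimal Python sketch that computes it from a
two-by-two table. The function name and all of the counts are hypothetical and are used
only to show the arithmetic.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds of disease among the exposed divided by odds among the unexposed."""
    odds_exposed = exposed_cases / exposed_noncases
    odds_unexposed = unexposed_cases / unexposed_noncases
    return odds_exposed / odds_unexposed


# Hypothetical counts. Same odds in both groups: this is what the null hypothesis says.
print(odds_ratio(30, 70, 30, 70))

# Hypothetical counts. Higher odds of disease among people who eat red meat.
print(odds_ratio(40, 60, 25, 75))
```

An odds ratio of one corresponds to no association; values above one mean the odds of
disease are higher in the exposed group.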
Research Hypothesis
For every null hypothesis, there is an alternative, or research, hypothesis. For our
examples: there is a difference in fatigue between the physical activity group and the
no physical activity group; there is an association between eating red meat and
developing colorectal cancer.
The Steps of Hypothesis Testing
The steps of hypothesis testing are as follows. Step one, set up a null and an
alternative hypothesis, and set the significance level, alpha, for the test. This
defines what a small P-value is, and is conventionally 0.05. Step two, assuming the null
hypothesis is true, calculate the P-value: the probability of seeing a result at least
as extreme as the one observed. Step three, compare the P-value with the significance
level. If P is less than alpha, reject the null hypothesis in favour of the alternative.
If P is greater than alpha, do not reject the null hypothesis.
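The three steps can be sketched in Python as follows. This is only an illustration: the
fatigue scores are invented, and the test is a large-sample z-test chosen so that the
sketch needs nothing beyond the standard library. With samples this small a t-test would
be the more usual choice.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def two_sample_z_test(group_a, group_b):
    """Two-sided normal-approximation test for a difference in means."""
    difference = mean(group_a) - mean(group_b)
    standard_error = sqrt(
        stdev(group_a) ** 2 / len(group_a) + stdev(group_b) ** 2 / len(group_b)
    )
    z = difference / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return difference, p_value


# Step one: null hypothesis "mean difference is zero", significance level alpha.
alpha = 0.05

# Hypothetical fatigue scores for the two trial arms (invented for illustration).
activity = [32, 28, 35, 30, 27, 33, 29, 31, 26, 34, 30, 28]
control = [36, 33, 38, 35, 31, 37, 34, 39, 32, 36, 35, 33]

# Step two: calculate the P-value assuming the null hypothesis is true.
difference, p_value = two_sample_z_test(activity, control)

# Step three: compare the P-value with alpha.
decision = "reject" if p_value < alpha else "do not reject"
print(f"difference={difference:.2f}, P={p_value:.4f}: {decision} the null hypothesis")
```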
Significance
If P is less than 0.05, we say the difference is statistically significant at the
5% level. If P is greater than 0.05, we say the difference is not statistically
significant at the 5% level.
Possible Outcomes
A hypothesis test has four possible outcomes. A statistically significant difference
could mean either that there is a true difference, or that there is no true difference
and this study just found unusual results (a type one error). No statistically
significant difference could mean either that there really is no true difference, or
that there is a true difference but we did not detect it with this study (a type two
error).
Error Types
In a study we may or may not get the right answer. For example, there may be a true
difference that we did not find. This is a type two error. Or there might be no true
difference, but because of random variation or other reasons we found one. This is a
type one error.
This is summarised in the table. The power of a study is our ability to find statistically
significant results, if they exist. A small test study has low power because estimation is
not very precise.
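A small simulation can make the two error types and power concrete. The sketch below
repeats an imaginary two-group study many times and counts how often the result is
statistically significant. Everything in it is an assumption made for illustration:
normally distributed outcomes with a known standard deviation of 1, the group sizes, the
true difference of 0.5, the number of repetitions and the random seed.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def rejection_rate(true_difference, n_per_group, n_trials=1000, alpha=0.05, seed=1):
    """Share of simulated studies whose two-sided z-test gives P < alpha.

    Outcomes are drawn from normal distributions with a known standard
    deviation of 1 (an assumption made purely to keep the test simple).
    """
    rng = random.Random(seed)
    norm = NormalDist()
    standard_error = sqrt(2 / n_per_group)
    significant = 0
    for _ in range(n_trials):
        group_a = [rng.gauss(true_difference, 1.0) for _ in range(n_per_group)]
        group_b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        z = (fmean(group_a) - fmean(group_b)) / standard_error
        p_value = 2 * (1 - norm.cdf(abs(z)))
        if p_value < alpha:
            significant += 1
    return significant / n_trials


# No true difference: every significant result here is a type one error.
type_one_rate = rejection_rate(true_difference=0.0, n_per_group=50)

# A true difference exists: the share of significant results estimates the power,
# and the remaining studies made a type two error.
power_large_study = rejection_rate(true_difference=0.5, n_per_group=50)
power_small_study = rejection_rate(true_difference=0.5, n_per_group=10)

print(f"type one error rate: {type_one_rate:.3f}")
print(f"power with 50 per group: {power_large_study:.3f}")
print(f"power with 10 per group: {power_small_study:.3f}")
```

With no true difference, the share of significant results should sit close to alpha.
With a true difference, the smaller study should detect it far less often than the
larger one.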
Power
Power is the probability of finding a statistically significant difference of a given size, if
it exists. Mathematically power is equal to one minus the probability of a type two
error.
Power : High or Low?
Suppose you had the same sample size for the two scenarios pictured. The top
scenario, where you’re looking for a large difference, has high power. It is easier to
find a large difference than a small difference. The lower scenario, where there isn’t
much difference between the two distributions, has low power. If you want to look for
smaller differences, you need to have a larger sample size.
Power for the Fatigue Study
Consider the physical activity in lung cancer study. Suppose the study aims to have 100
patients in each of the two arms, and suppose that the outcome is the percent reporting
low fatigue. Although there would be high power to detect an 80% decrease in fatigue,
it is highly unlikely that the intervention would actually cause such a large decrease.
At the other extreme, a decrease in fatigue of just 1% is probably not worthwhile, and
in order to find it, if that were the true effect of the intervention, you would have to
have a massive study.
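The trade-off between the size of the difference and the size of the study can be
sketched with an approximate power calculation for comparing two proportions. This is a
normal-approximation sketch, not the calculation used for any actual trial. The function
name is hypothetical, and so is the assumption that 40% of control patients report low
fatigue.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control, p_intervention, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions."""
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)
    standard_error = sqrt(
        p_control * (1 - p_control) / n_per_arm
        + p_intervention * (1 - p_intervention) / n_per_arm
    )
    shift = abs(p_intervention - p_control) / standard_error
    # Probability that the test statistic falls in either rejection region.
    return norm.cdf(shift - z_alpha) + norm.cdf(-shift - z_alpha)


# Hypothetical baseline: 40% of control patients report low fatigue.
baseline = 0.40
for p_intervention in (0.41, 0.50, 0.60):
    power = power_two_proportions(baseline, p_intervention, n_per_arm=100)
    print(f"{baseline:.0%} vs {p_intervention:.0%}, 100 per arm: power {power:.2f}")

# The one-point difference only becomes detectable with a very large study.
for n in (100, 10_000, 100_000):
    power = power_two_proportions(baseline, 0.41, n_per_arm=n)
    print(f"{baseline:.0%} vs 41%, {n} per arm: power {power:.2f}")
```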
The Ethics of Power and Sample Size
In order for a study to be ethical, it must have a large enough sample size, and hence
enough power, to be able to detect a reasonable and meaningful difference. Reasonable
means we have a hope of finding such a difference. Meaningful means the difference would
be useful to patients.
P-values and Confidence Intervals
There is a connection between P-values and confidence intervals. When the confidence
interval contains the null value (zero for differences, one for ratios), the result of
the hypothesis test is not statistically significant. And vice versa: when the interval
excludes the null value, the result is statistically significant. The example below did
not find a statistically significant difference.
P-values and Confidence Intervals 2
If a 95% confidence interval for a ratio contains one, the P-value will be greater than
0.05. The example below also shows a result that is not statistically significant.
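The link between the two can be checked with a short Python sketch. It assumes the
confidence interval and the P-value are both built from the same normal approximation,
which is when the correspondence is exact. The estimates and standard errors are
invented, and the ratio is handled on the log scale, where its null value of one becomes
zero.

```python
from math import exp, log
from statistics import NormalDist


def ci_and_p_value(estimate, standard_error, null_value=0.0, level=0.95):
    """Normal-approximation confidence interval and two-sided P-value."""
    norm = NormalDist()
    z_critical = norm.inv_cdf(0.5 + level / 2)
    low = estimate - z_critical * standard_error
    high = estimate + z_critical * standard_error
    z = (estimate - null_value) / standard_error
    p_value = 2 * (1 - norm.cdf(abs(z)))
    return low, high, p_value


# Hypothetical difference in mean fatigue scores: null value is zero.
low, high, p = ci_and_p_value(estimate=-3.0, standard_error=2.0)
print(f"difference: 95% CI {low:.2f} to {high:.2f}, P={p:.3f}")

# Hypothetical odds ratio of 1.5, analysed on the log scale: null value is log(1) = 0.
log_low, log_high, p_ratio = ci_and_p_value(estimate=log(1.5), standard_error=0.25)
print(f"odds ratio: 95% CI {exp(log_low):.2f} to {exp(log_high):.2f}, P={p_ratio:.3f}")
```

In both invented examples the interval contains the null value, so the P-value comes out
above 0.05.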
[End of recorded material]