Basic Statistics

Test result        Disease present    Disease absent
Positive           A                  B
Negative           C                  D
Sensitivity: the proportion of subjects with the disease that have a positive test result;
Sensitivity=A/(A+C)
Specificity: the proportion of subjects without the disease that have a negative test result;
Specificity=D/(B+D)
**Sensitivity and specificity are not a function of disease prevalence or pre-test probability, but positive
and negative predictive values are a function of disease prevalence and pre-test probability
**If your pre-test probability is low or the disease is rare, a negative result is more reliable (NPV is high,
PPV is low); if it is a common disease or your pre-test probability is high, a positive result is more
reliable (PPV is high, NPV is lower)
Positive predictive value (PPV): proportion of subjects with a positive test result who have the disease;
PPV=A/(A+B)
Negative predictive value (NPV): proportion of subjects with a negative test result who do not have the
disease; NPV=D/(C+D)
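The four formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a validated tool: the class name TwoByTwo and the counts are hypothetical, chosen so that the same test (90% sensitive, 90% specific) is applied to 1000 subjects at a prevalence of 50% and of 1%.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from the 2x2 table, using the cell labels A-D from the sheet."""
    a: int  # test positive, disease present (true positives)
    b: int  # test positive, disease absent (false positives)
    c: int  # test negative, disease present (false negatives)
    d: int  # test negative, disease absent (true negatives)

    def sensitivity(self) -> float:
        return self.a / (self.a + self.c)

    def specificity(self) -> float:
        return self.d / (self.b + self.d)

    def ppv(self) -> float:
        return self.a / (self.a + self.b)

    def npv(self) -> float:
        return self.d / (self.c + self.d)


# Hypothetical counts (assumed for illustration only).
common = TwoByTwo(a=450, b=50, c=50, d=450)  # prevalence 50%
rare = TwoByTwo(a=9, b=99, c=1, d=891)       # prevalence 1%

for label, table in (("common", common), ("rare", rare)):
    print(label,
          "sens", round(table.sensitivity(), 3),
          "spec", round(table.specificity(), 3),
          "PPV", round(table.ppv(), 3),
          "NPV", round(table.npv(), 3))
```

Sensitivity and specificity are the same in both tables, while PPV falls sharply and NPV rises when the disease is rare, which is the prevalence dependence described in the notes above.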
Confidence interval: a range of values that is likely to contain the true population value; expressed
with a confidence level (usually a 95% confidence interval, meaning you are 95% confident that the true
value, such as the true mean, lies within this range)
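As a rough sketch of how a 95% confidence interval for a mean might be computed, the example below uses the normal approximation (mean plus or minus about 1.96 standard errors). The sample values are hypothetical, and the normal approximation is an assumption; for small samples a critical value from the t distribution is usually preferred.

```python
import math
import statistics

# Hypothetical sample of measurements (illustrative values only).
sample = [4.8, 5.1, 5.4, 4.9, 5.0, 5.3, 5.2, 4.7, 5.1, 5.0]

mean = statistics.mean(sample)
sd = statistics.stdev(sample)        # sample standard deviation
se = sd / math.sqrt(len(sample))     # standard error of the mean

# Two-sided 95% critical value of the standard normal distribution (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)

ci_low, ci_high = mean - z * se, mean + z * se
print(f"mean = {mean:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```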
Relative risk: probability of the disease if the risk factor is present divided by probability of disease if
the risk factor is absent
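Relative risk of 1: no association between the risk factor and the disease
Relative risk >1: the risk factor is associated with an increased risk of disease
Relative risk <1: the risk factor is associated with a decreased risk of disease (protective)
Relative risk is usually presented with a confidence interval. If the confidence interval includes 1 (that
is, it contains values both less than one and greater than one), the result is not statistically significant.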
Relative risk reduction: the percentage by which the disease or outcome was reduced by the study
treatment. This equals: [1-(relative risk)]x100.
Relative risk reduction of 0: no effect
Relative risk reduction of >0: beneficial effect
Relative risk reduction of <0: harmful effect
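The relative risk and relative risk reduction can be sketched as below. The function names and the trial counts are hypothetical. The confidence interval uses the common log-transform approximation, which is an assumption of this sketch and is not described in the sheet itself.

```python
import math


def relative_risk(events_treated, n_treated, events_control, n_control):
    """Risk in the treated (or exposed) group divided by risk in the control group."""
    return (events_treated / n_treated) / (events_control / n_control)


def relative_risk_ci(events_treated, n_treated, events_control, n_control, z=1.96):
    """Approximate 95% CI for the relative risk, computed on the log scale."""
    rr = relative_risk(events_treated, n_treated, events_control, n_control)
    se_log = math.sqrt(1 / events_treated - 1 / n_treated
                       + 1 / events_control - 1 / n_control)
    return rr * math.exp(-z * se_log), rr * math.exp(z * se_log)


def relative_risk_reduction(rr):
    """Relative risk reduction in percent: [1 - (relative risk)] x 100."""
    return (1 - rr) * 100


# Hypothetical trial: 15 of 100 treated subjects and 30 of 100 controls had the outcome.
rr = relative_risk(15, 100, 30, 100)
low, high = relative_risk_ci(15, 100, 30, 100)
rrr = relative_risk_reduction(rr)

print(f"RR = {rr:.2f}, 95% CI = ({low:.2f}, {high:.2f}), RRR = {rrr:.0f}%")
print("statistically significant:", not (low <= 1 <= high))
```

In this made-up example the whole interval lies below 1, so the reduction would be read as statistically significant.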
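For a p value threshold of 0.05 and a 95% confidence interval: if the confidence interval of a difference
measure (such as relative risk reduction) includes values both above and below zero, the results are not
statistically significant. For a ratio measure (such as relative risk or odds ratio), the corresponding
no-effect value is 1.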
Odds ratio: the odds of having the risk factor if the disease is present divided by the odds of having the
risk factor if the disease is not present; similar to relative risk, used for case control studies.
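A minimal sketch of the odds ratio for a case control study follows. The function name and the counts are hypothetical; the arguments follow the wording of the definition (subjects with and without the risk factor among cases and among controls).

```python
def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of having the risk factor among cases divided by the odds among controls."""
    odds_in_cases = cases_exposed / cases_unexposed
    odds_in_controls = controls_exposed / controls_unexposed
    return odds_in_cases / odds_in_controls


# Hypothetical case control data: 100 cases and 100 controls.
# 40 cases and 20 controls had the risk factor.
ratio = odds_ratio(cases_exposed=40, cases_unexposed=60,
                   controls_exposed=20, controls_unexposed=80)
print(f"odds ratio = {ratio:.2f}")
```

An odds ratio of 1 indicates no association, in the same way as a relative risk of 1.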
Definitions
Group matching: assigning individuals to study and control groups such that there is a similar
distribution of a specific variable
Incidence: the number of new cases in a specified time period
Intention to treat: analysis in which subjects are analyzed according to the groups to which they were
assigned regardless of whether the therapy or evaluation is successful. Important principle since the
effectiveness of an intervention must be assessed in light of the frequency of non-compliance, lack of
follow-up, side-effects.
Mean: the average
Median: the data point for which half of the values are larger and half are smaller
P value: the probability of obtaining results at least as extreme as those observed if the null hypothesis
were true (if no true effect exists)
Power: probability of detecting a difference of a specified size if one truly exists (1-Beta)
Prevalence: the percentage of the specified population with a disease at a certain point in time
Standard deviation: measure of the extent to which individual values vary from the mean. In a normal
distribution, 68% of the data lies within one standard deviation on either side of the mean; 95% within 2
standard deviations, 99.7% within 3 standard deviations.
Standard error: an expression of the precision of the sample mean as an estimate of the mean of the larger
population; it is a function of standard deviation and sample size, equal to
[standard deviation/square root of (sample size)].
Variance: expression of the degree of dispersion of the data (the lower the variance, the more uniform
the data); equal to (standard deviation)^2. The quantity [(standard deviation)^2/sample size] is the
variance of the sample mean, which is the square of the standard error.
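The descriptive measures defined above can be computed with Python's standard statistics module. This is a sketch on hypothetical values; it uses the sample (n-1) versions of variance and standard deviation, which is an assumption, since the sheet does not say which version is meant.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
sd = statistics.stdev(values)           # square root of the sample variance
se = sd / math.sqrt(len(values))        # standard error of the mean

print(f"mean={mean}, median={median}, variance={variance:.3f}, sd={sd:.3f}, se={se:.3f}")

# Share of a normal distribution lying within 1, 2 and 3 standard deviations of the mean.
normal = statistics.NormalDist()
within = {k: normal.cdf(k) - normal.cdf(-k) for k in (1, 2, 3)}
for k, share in within.items():
    print(f"within {k} SD: {share:.1%}")
```

The last loop reproduces the 68%, 95% and 99.7% figures quoted for the normal distribution.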
Types of studies:
Case control study: retrospective observational study which compares subjects who have a disease
process to others who do not
Cohort study: observational study, usually prospective, which compares subjects with a risk factor to
subjects without the risk factor and observes each group for the development of the disease
Cross-over study: study in which the same subject undergoes the study therapy and control separately
and the results are compared
Cross-sectional study: observational study which compares subjects with a common risk factor to
subjects without the risk factor in terms of the presence of a disease at a single point in time (may be a
questionnaire or survey)
Double-blind study: study in which both the subjects and the investigators are unaware of the
treatment group to which each subject is assigned
Observational study: study in which no intervention is made
Randomized clinical trial: prospective study in which subjects are randomly assigned to study and
control groups prior to the study intervention or evaluation
Errors, variables & biases
Confounding variables: characteristics that differ between the control and study groups and affect the
results
Inter-observer error: difference in measurement between different investigators
Intra-observer error: difference in measurement obtained by the same investigator when repeated
Recall bias: bias in which the ability of one group to recall past events differs from that of another group
Reporting bias: bias in which the tendency of one group to report events differs from that of
another group
Selection bias: bias in the assignment of study and control groups, or in how the study group is drawn
from the larger population, such that differences between the groups occur and affect results.
Type I error: demonstrating a difference between groups when no true difference exists in the larger
population; a “false positive”, risk = Alpha
Type II error: detecting no significant difference between groups when one truly exists; a “false
negative”, risk=Beta
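The relationship between Alpha, Beta and power can be illustrated with a small calculation. This sketch assumes a two-sided one-sample z-test with a known standard deviation, which is a simplification chosen for illustration; the function name z_test_power and all numbers are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def z_test_power(true_difference, sd, n, alpha=0.05):
    """Probability that a two-sided z-test rejects the null hypothesis.

    Assumes a known standard deviation `sd` and a sample of size `n`.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the true difference is present.
    shift = true_difference * math.sqrt(n) / sd
    reject_upper = 1 - STD_NORMAL.cdf(z_crit - shift)
    reject_lower = STD_NORMAL.cdf(-z_crit - shift)
    return reject_upper + reject_lower


alpha = 0.05
power = z_test_power(true_difference=0.5, sd=1.0, n=32, alpha=alpha)
beta = 1 - power  # risk of a Type II error ("false negative")

# With no true difference, the rejection probability is the Type I error risk.
false_positive_rate = z_test_power(true_difference=0.0, sd=1.0, n=32, alpha=alpha)

print(f"power = {power:.3f}, beta = {beta:.3f}, type I risk = {false_positive_rate:.3f}")
```

When the true difference is zero the rejection probability equals Alpha, and when a true difference exists the probability of missing it is Beta, so power is 1-Beta.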