Glossary of Terms

Appendix 2. Glossary of Terms
*Retrieved from: Windish DM, Diener-West M. A clinician-educator’s roadmap to choosing and interpreting statistical
tests. J Gen Intern Med. 2006 June; 21(6): 656-660. PMCID: PMC1924630
Alpha—probability of a Type I error.
Analysis of Covariance (ANCOVA)—test used to compare means of a continuous outcome
variable among 2 or more groups of categorical variables after controlling for potential
confounding factors.
Analysis of Variance (ANOVA)—test used to compare means of a continuous outcome variable
of 3 or more groups.
Beta—probability of a Type II error.
Bias—unintentional systematic error in the design or conduct of a study that produces results
varying from the truth.
Binomial distribution—probability distribution used to describe dichotomous outcomes in a
population.
Bivariate analysis—statistical analysis in which there is one dependent variable and one
independent variable.
Box-and-whisker plot—type of exploratory data analysis that displays and summarizes the data
distribution using the median value (middle of the box) and the 25th and 75th
percentile values obtained from the data (lower and upper ends of the box).
Chi-square test (Pearson)—statistical test used to compare two unpaired (independent) samples
where the outcome is dichotomous or nominal and the sample size is large (>30).
Confidence interval (CI)—an interval computed from the sample data that has a specified
probability of containing the unknown true population value.
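As an illustration, a 95% confidence interval for a mean can be sketched in Python as follows. This is a minimal sketch that assumes the normal approximation (mean plus or minus z times the standard error); for small samples a t-based interval is usually preferred. The function name and the blood pressure readings are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(values, level=0.95):
    """Normal-approximation confidence interval for a mean (assumption, not a t-interval)."""
    n = len(values)
    centre = mean(values)
    standard_error = stdev(values) / sqrt(n)    # sample SD divided by sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # roughly 1.96 when level is 0.95
    return centre - z * standard_error, centre + z * standard_error

# Hypothetical systolic blood pressure readings (mmHg)
readings = [118, 122, 125, 130, 121, 127, 119, 124, 128, 126]
low, high = mean_confidence_interval(readings)
print(f"mean = {mean(readings):.1f}, 95% CI = ({low:.1f}, {high:.1f})")
```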
Confounding variable—variable related to both the outcome of interest and to another factor
(independent variable) of interest.
Continuous variable—variable with no gaps in values; e.g., age.
Correlation—describes a relationship between 2 or more variables.
Correlation (Pearson’s)—statistical test used to quantify the association between two
continuous variables.
Covariate—an independent variable in a study.
Cox proportional hazard regression—statistical test for assessing time to a dichotomous event
where the independent variable(s) are nominal or continuous.
Dependent variable (response variable)—the outcome variable of interest.
Dichotomous variable—variable that only has two possible outcomes.
Discrete variable—variable with gaps in values; e.g., the number of study participants.
Descriptive statistics (exploratory data analysis)—methods of organizing, summarizing and
displaying data; includes calculating measures of central tendency and measures of dispersion.
Effect modification (interaction)—a situation in which two or more independent factors
modify the effect of each other with regard to an outcome; thus the outcome differs
depending on whether or not an effect modifier is present.
Effect size—a measure of the magnitude of the difference between two groups.
Experimental study—study that examines groups where an intervention has been allocated.
Fisher’s exact test—statistical test used to compare two unpaired (independent) samples where
the outcome is dichotomous or nominal and the sample size is small; alternative to the
chi-square test.
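A small-sample comparison of this kind can be sketched in Python. The sketch below assumes a 2x2 table and the common two-sided convention of summing the hypergeometric probabilities of every table that is no more likely than the observed one; the function name and the counts are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def table_probability(x):
        # Hypergeometric probability that the top-left cell equals x
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over all tables that are at least as unlikely as the observed one
    return sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )

# Hypothetical counts: rows are two groups, columns are outcome yes / no
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p = {p_value:.4f}")
```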
Friedman’s test—statistical test used to compare three or more paired (dependent) samples
when the outcome is either ordinal or continuous with a skewed distribution.
Generalizable—the extent to which findings from a study of a sample population can be
applied to the entire population.
Independent variable—explanatory or predictor variable in a study.
Inferential statistics (confirmatory data analysis)—uses estimation and hypothesis testing to
assess the strength of the evidence, make predictions and draw conclusions about a
population based on sample data.
Interquartile range (IQR)—measure of spread or dispersion in the data calculated as the
difference between the 25th and 75th percentile values.
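For example, the quartiles and the IQR can be computed with Python's standard library. This is a sketch on hypothetical data; note that several percentile conventions exist, and `statistics.quantiles` with its default "exclusive" method is only one of them, so other software may report slightly different values.

```python
from statistics import quantiles

# Hypothetical lengths of hospital stay, in days
lengths_of_stay = [2, 4, 4, 5, 7, 8, 9, 11, 12, 15, 18]

# n=4 returns the three cut points: 25th, 50th (median) and 75th percentiles
q1, q2, q3 = quantiles(lengths_of_stay, n=4)
iqr = q3 - q1
print(f"25th = {q1}, median = {q2}, 75th = {q3}, IQR = {iqr}")
```

The same three cut points are the lower end, middle line and upper end of the box in a box-and-whisker plot.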
Interrater reliability—the extent to which measurements or observations are the same when
repeated by two or more individuals.
Intrarater reliability—the extent to which measurements or observations are the same when
repeated by the same individual.
Kappa statistic—statistical test used to quantify the agreement between two observers.
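The calculation can be sketched in Python. The glossary does not name a specific variant, so this sketch assumes Cohen's kappa for two raters, computed as (observed agreement minus chance-expected agreement) divided by (1 minus chance-expected agreement). The function name and ratings are hypothetical.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each classified the same items."""
    n = len(rater_a)
    # Proportion of items on which the two raters agree
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance, from each rater's marginal frequencies
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / n**2
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of ten radiographs by two observers
observer_1 = ["yes"] * 5 + ["no"] * 5
observer_2 = ["yes"] * 4 + ["no"] * 5 + ["yes"]
kappa = cohen_kappa(observer_1, observer_2)
print(f"kappa = {kappa:.2f}")
```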
Kendall’s coefficient of concordance—statistical test used to quantify the association between
two variables when the outcome is ordinal.
Kruskal-Wallis test—nonparametric statistical test used to compare three or more unpaired
(independent) samples where the outcome is either ordinal or continuous with a skewed
distribution.
Linear regression—regression analysis used to quantify the association between one
independent variable and a continuous outcome that is normally distributed.
Logistic regression—regression analysis used to quantify the association between one
independent variable and a dichotomous outcome.
McNemar’s test—statistical test used to compare two paired (dependent) samples when the
outcome of interest is dichotomous (or nominal with only two outcomes).
Mean—measure of central tendency; the sum of the measurements divided by the number of
measurements being added (the average).
Median—measure of central tendency; the middle (midpoint) observation.
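A short Python sketch on hypothetical data shows how the two measures of central tendency can differ when one extreme value is present: the mean is pulled toward the extreme value while the median is not.

```python
from statistics import mean, median

# Hypothetical annual incomes in thousands, with one extreme value
incomes = [30, 32, 35, 38, 40, 42, 250]

average = mean(incomes)     # sum of the values divided by how many there are
midpoint = median(incomes)  # middle observation after sorting
print(f"mean = {average:.1f}, median = {midpoint}")
```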
Multinomial logistic regression—logistic regression used to quantify the association between
one or more independent variables and a nominal outcome having more than 2 levels.
Multiple linear regression—linear regression used to quantify the association between
more than one independent variable and a continuous outcome that is normally distributed.
Multiple logistic regression—logistic regression used to quantify the association between
more than one independent variable and a dichotomous outcome.
Multivariable analysis—statistical analysis in which there is one dependent variable and more
than one independent variable.
Nominal variable—variable having descriptive categories with no inherent order.
Nonparametric regression—a type of regression used to quantify the association between one
or more independent variables and a continuous outcome having a skewed distribution.
Nonparametric test—statistical test that does not assume that the shape of the distribution is known;
results are based on rankings of the outcome variable and not on the actual values obtained.
Normal (bell-shaped) distribution—probability distribution used to describe continuous
outcomes in a population. (Symmetrical bell-shaped curve in which the mean, median
and mode are the same).
Null hypothesis—statement of no effect or no association in a study.
Observational study—study that examines groups at one or more points in time without
allocation of an intervention.
Ordinal logistic regression—logistic regression used to quantify the association between one
or more independent variables and an ordinal outcome.
Ordinal variable—variable having categories with an implicit ranking.
P-value—probability of obtaining an outcome as extreme or more extreme than the observed
result under the assumption that the null hypothesis is true.
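As a worked illustration, an exact two-sided p-value for a dichotomous outcome can be sketched in Python using the binomial distribution. The sketch assumes that "as extreme or more extreme" means every outcome whose probability under the null hypothesis is no larger than that of the observed outcome; the function name and the counts are hypothetical.

```python
from math import comb

def binomial_two_sided_p(successes, trials, p_null=0.5):
    """Exact two-sided binomial p-value under the null proportion p_null."""
    def pmf(k):
        # Probability of exactly k successes if the null hypothesis is true
        return comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)

    observed = pmf(successes)
    # Add up every outcome that is at least as unlikely as the observed one
    return sum(
        pmf(k) for k in range(trials + 1) if pmf(k) <= observed * (1 + 1e-9)
    )

# Hypothetical result: 16 of 20 patients improved; null hypothesis is 50%
p_value = binomial_two_sided_p(16, 20)
print(f"p = {p_value:.4f}")
```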
Paired (dependent) sample—study design in which each study individual is matched with a
control group individual based on some characteristic(s) and their outcomes are
compared in a matched way.
Paired t-test—statistical test used to compare two paired (dependent) samples where the
outcome is continuous and normally distributed.
Parametric test—statistical test in which assumptions are made about the underlying
probability distribution of observed data.
Power—the ability of a study to detect a difference when one exists; probability of rejecting the
null hypothesis when it is false.
Probability distribution—a description of the probability associated with all possible observed
outcomes.
Proportion—fraction in which the numerator consists of a subset of individuals represented in
the denominator.
Qualitative variable—variable that describes attributes (ordinal or nominal).
Quantitative variable—variable that describes an amount or quantity (continuous or discrete).
Random sample—a sample of subjects selected from a population such that each subject has the
same chance of being selected.
Regression analysis—statistical method used to describe the association between one dependent
variable and one or more independent variables; used to adjust for confounding variables.
Relative frequency—the ratio of the number of observations having a certain characteristic or
value divided by the total number of observations.
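A minimal Python sketch of this ratio, using hypothetical blood type observations:

```python
from collections import Counter

# Hypothetical blood types recorded for eight participants
blood_types = ["A", "O", "O", "B", "A", "O", "AB", "O"]

counts = Counter(blood_types)
total = len(blood_types)
# Relative frequency = count for each value / total number of observations
relative_frequency = {value: count / total for value, count in counts.items()}
print(relative_frequency)
```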
Repeated measures analysis of variance—analysis of variance used when measurements are made
repeatedly in each subject (e.g., before, during and after an intervention).
Sample size—the number of subjects in a study.
Simple linear regression—regression analysis used to quantify the association between
one independent variable and a continuous outcome that is normally distributed.
Simple logistic regression—regression analysis used to quantify the association between one
independent variable and a dichotomous outcome.
Skewed distribution—a distribution of values that is not symmetric (i.e., not bell-shaped).
Positively skewed: data are distributed such that a greater proportion of the observations
have values less than or equal to the mean (i.e., more observations with lower values).
Negatively skewed: data are distributed such that a greater proportion of the observations
have values greater than or equal to the mean (i.e., more observations with higher values).
Spearman rank correlation—nonparametric statistical test used to quantify the association
between two variables where one or both variables are not normally distributed.
Standard deviation (SD)—descriptive statistic that measures the dispersion of individual data
around the mean value.
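The calculation can be sketched in Python. This sketch assumes the sample standard deviation, which divides the sum of squared deviations by n - 1, and checks the manual result against `statistics.stdev`; the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical pain scores for five patients
scores = [4, 8, 6, 5, 7]

centre = mean(scores)
# Sum of squared deviations from the mean, divided by n - 1
variance = sum((x - centre) ** 2 for x in scores) / (len(scores) - 1)
manual_sd = sqrt(variance)

library_sd = stdev(scores)  # the standard library uses the same n - 1 formula
print(f"mean = {centre}, SD = {manual_sd:.3f}")
```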
Statistical significance—when the p-value is less than the probability of a Type I error (usually
set at 0.05).
Stem-and-leaf plot—type of exploratory data analysis that orders and organizes data to display
its shape and distribution. The “stem” contains the first digit or digits of each observation
and the “leaf” contains the remaining digit or digits of each observation.
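A stem-and-leaf plot can be built with a few lines of Python. This sketch assumes non-negative integers, with the tens (and higher) digits as the stem and the last digit as the leaf; the function name and the ages are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the lines of a stem-and-leaf plot for non-negative integers."""
    stems = defaultdict(list)
    for value in sorted(values):
        stems[value // 10].append(value % 10)  # stem = leading digits, leaf = last digit
    # Include every stem between the smallest and largest, even if it has no leaves
    return [
        f"{stem:>2} | {' '.join(str(leaf) for leaf in stems[stem])}"
        for stem in range(min(stems), max(stems) + 1)
    ]

# Hypothetical ages of study participants
ages = [23, 25, 31, 34, 34, 38, 42, 47, 51]
lines = stem_and_leaf(ages)
print("\n".join(lines))
```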
Student t-test—statistical test used to compare two unpaired (independent) samples having a
normally distributed continuous outcome.
Type I error—the error that results when the null hypothesis is rejected when it is really true;
stating there is a difference in outcome when none exists (false positive).
Type II error—the error that results when the null hypothesis is not rejected when it is really false;
stating there is no difference in outcome when one actually exists (false negative).
Unpaired (independent) sample—study design that compares the outcome of two groups
where the groups are not matched on any characteristic.
Wilcoxon rank-sum test (Mann-Whitney U test)—nonparametric statistical test used to
compare two unpaired (independent) samples where the outcome of interest is ordinal or
continuous and not normally distributed.
Wilcoxon signed-rank test—nonparametric statistical test used to compare two paired
(dependent) samples where the outcome of interest is ordinal or continuous with a
skewed distribution.