Appendix 2. Glossary of Terms

Retrieved from: Windish DM, Diener-West M. A clinician-educator's roadmap to choosing and interpreting statistical tests. J Gen Intern Med. 2006 Jun;21(6):656-660. PMCID: PMC1924630.

Alpha—probability of a Type I error.
Analysis of Covariance (ANCOVA)—test used to compare means of a continuous outcome variable among 2 or more groups defined by a categorical variable, after controlling for potential confounding factors.
Analysis of Variance (ANOVA)—test used to compare means of a continuous outcome variable across 3 or more groups.
Beta—probability of a Type II error.
Bias—unintentional systematic error in the design or conduct of a study that produces results varying from the truth.
Binomial distribution—probability distribution used to describe dichotomous outcomes in a population.
Bivariate analysis—statistical analysis in which there is one dependent variable and one independent variable.
Box-and-whisker plot—type of exploratory data analysis that displays and summarizes the data distribution using the median value (middle of the box) and the 25th and 75th percentile values obtained from the data (lower and upper ends of the box).
Chi-square test (Pearson)—statistical test used to compare two unpaired (independent) samples where the outcome is dichotomous or nominal and the sample size is large (>30).
Confidence interval (CI)—an interval computed from the sample data that, with a specified probability, contains the unknown true population value.
Confounding variable—variable related to both the outcome of interest and to another factor (independent variable) of interest.
Continuous variable—variable with no gaps in values; e.g., age.
Correlation—describes a relationship between 2 or more variables.
Correlation (Pearson's)—statistical test used to quantify the association between two continuous variables.
Covariate—an independent variable in a study.
Cox proportional hazards regression—statistical test for assessing time to a dichotomous event where the independent variable(s) are nominal or continuous.
Dependent variable (response variable)—the outcome variable of interest.
Dichotomous variable—variable that has only two possible outcomes.
Discrete variable—variable with gaps in values; e.g., the number of study participants.
Descriptive statistics (exploratory data analysis)—methods of organizing, summarizing, and displaying data; includes calculating measures of central tendency and measures of dispersion.
Effect modification (interaction)—a situation in which two or more independent factors modify the effect of each other with regard to an outcome; thus, the outcome differs depending on whether or not an effect modifier is present.
Effect size—a measure of the difference between two groups.
Experimental study—study that examines groups to which an intervention has been allocated.
Fisher's exact test—statistical test used to compare two unpaired (independent) samples where the outcome is dichotomous or nominal and the sample size is small; alternative to the chi-square test.
Friedman's test—statistical test used to compare three or more paired (dependent) samples when the outcome is either ordinal or continuous with a skewed distribution.
Generalizable—the extent to which findings from a study sample can be applied to the entire population.
Independent variable—explanatory or predictor variable in a study.
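The chi-square and Fisher's exact test entries above describe the same comparison (two unpaired samples with a dichotomous outcome) at different sample sizes. A minimal sketch of how that comparison might look in Python, assuming SciPy is available; the 2x2 counts and exposure framing are hypothetical and purely illustrative, not taken from the source article:

```python
# Minimal sketch (assumes SciPy): Pearson chi-square vs. Fisher's exact test on a
# hypothetical 2x2 table of exposure vs. dichotomous outcome. Counts are made up
# purely for illustration.
from scipy import stats

table = [[12, 8],    # exposed:   outcome present / absent
         [5, 15]]    # unexposed: outcome present / absent

chi2, p_chi2, dof, expected = stats.chi2_contingency(table)   # large-sample test
odds_ratio, p_fisher = stats.fisher_exact(table)              # exact test for small samples

print(f"Chi-square test: chi2={chi2:.2f}, df={dof}, p={p_chi2:.3f}")
print(f"Fisher's exact test: OR={odds_ratio:.2f}, p={p_fisher:.3f}")
```

When expected cell counts are small, the exact p-value is generally preferred over the chi-square approximation, which is the distinction the two glossary entries draw.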
Inferential statistics (confirmatory data analysis)—uses estimation and hypothesis testing to assess the strength of the evidence, make predictions, and draw conclusions about a population based on sample data.
Interquartile range (IQR)—measure of spread or dispersion in the data, calculated as the difference between the 25th and 75th percentile values.
Interrater reliability—the extent to which measurements or observations are the same when repeated by two or more individuals.
Intrarater reliability—the extent to which measurements or observations are the same when repeated by the same individual.
Kappa statistic—statistical test used to quantify the agreement between two observers.
Kendall's coefficient of concordance—statistical test used to quantify the association between two variables when the outcome is ordinal.
Kruskal-Wallis test—nonparametric statistical test used to compare three or more unpaired (independent) samples where the outcome is either ordinal or continuous with a skewed distribution.
Linear regression—regression analysis used to quantify the association between one independent variable and a continuous outcome that is normally distributed.
Logistic regression—regression analysis used to quantify the association between one independent variable and a dichotomous outcome.
McNemar's test—statistical test used to compare two paired (dependent) samples when the outcome of interest is dichotomous (or nominal with only two outcomes).
Mean—measure of central tendency; the sum of the measurements divided by the number of measurements being added (the average).
Median—measure of central tendency; the middle (midpoint) observation.
Multinomial logistic regression—logistic regression used to quantify the association between one or more independent variables and a nominal outcome having more than 2 levels.
Multiple linear regression—linear regression used to quantify the association between more than one independent variable and a continuous outcome that is normally distributed.
Multiple logistic regression—logistic regression used to quantify the association between more than one independent variable and a dichotomous outcome.
Multivariable analysis—statistical analysis in which there is one dependent variable and more than one independent variable.
Nominal variable—variable having descriptive categories with no inherent order.
Nonparametric regression—a type of regression used to quantify the association between one or more independent variables and a continuous outcome having a skewed distribution.
Nonparametric test—statistical test that does not assume that the shape of the distribution is known; results are based on rankings of the outcome variable rather than on the actual values obtained.
Normal (bell-shaped) distribution—probability distribution used to describe continuous outcomes in a population; a symmetrical bell-shaped curve in which the mean, median, and mode are the same.
Null hypothesis—statement of no effect or no association in a study.
Observational study—study that examines groups at one or more points in time without allocation of an intervention.
Ordinal logistic regression—logistic regression used to quantify the association between one or more independent variables and an ordinal outcome.
Ordinal variable—variable having categories with an implicit ranking.
P-value—probability of obtaining an outcome as extreme as or more extreme than the observed result, under the assumption that the null hypothesis is true.
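Several of the entries above (mean, median, interquartile range, Kruskal-Wallis test) can be illustrated in a few lines of Python. A minimal sketch, assuming NumPy and SciPy are available; the three skewed samples below are simulated solely for illustration and do not come from the source article:

```python
# Minimal sketch (assumes NumPy and SciPy): descriptive statistics (mean, median,
# IQR) for three independent groups with a skewed outcome, followed by a
# Kruskal-Wallis test comparing them. Data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
groups = [rng.exponential(scale=s, size=40) for s in (1.0, 1.5, 2.0)]  # skewed samples

for i, g in enumerate(groups, start=1):
    q25, q75 = np.percentile(g, [25, 75])
    print(f"Group {i}: mean={g.mean():.2f}, median={np.median(g):.2f}, IQR={q75 - q25:.2f}")

# Kruskal-Wallis: nonparametric comparison of 3 or more unpaired samples
h_stat, p_value = stats.kruskal(*groups)
print(f"Kruskal-Wallis: H={h_stat:.2f}, p={p_value:.3f}")
```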
Paired (dependent) sample—study design in which each study individual is matched with a control-group individual based on some characteristic(s) and their outcomes are compared in a matched way.
Paired t-test—statistical test used to compare two paired (dependent) samples where the outcome is continuous and normally distributed.
Parametric test—statistical test in which assumptions are made about the underlying probability distribution of the observed data.
Power—the ability of a study to detect a difference when one exists; the probability of rejecting the null hypothesis when it is false.
Probability distribution—a description of the probability associated with all possible observed outcomes.
Proportion—fraction in which the numerator consists of a subset of the individuals represented in the denominator.
Qualitative variable—variable that describes attributes (ordinal or nominal).
Quantitative variable—variable that describes an amount or quantity (continuous or discrete).
Random sample—a sample of subjects selected from a population such that each subject has the same chance of being selected.
Regression analysis—statistical method used to describe the association between one dependent variable and one or more independent variables; used to adjust for confounding variables.
Relative frequency—the number of observations having a certain characteristic or value divided by the total number of observations.
Repeated measures analysis of variance—analysis of variance used when measurements are made repeatedly in each subject (e.g., before, during, and after an intervention).
Sample size—the number of subjects in a study.
Simple linear regression—regression analysis used to quantify the association between one independent variable and a continuous outcome that is normally distributed.
Simple logistic regression—regression analysis used to quantify the association between one independent variable and a dichotomous outcome.
Skewed distribution—a distribution of values that is not symmetric (i.e., not bell-shaped). Positively skewed: data are distributed such that a greater proportion of the observations have values less than or equal to the mean (i.e., more observations with lower values). Negatively skewed: data are distributed such that a greater proportion of the observations have values greater than or equal to the mean (i.e., more observations with higher values).
Spearman rank correlation—nonparametric statistical test used to quantify the association between two variables where one or both variables are not normally distributed.
Standard deviation (SD)—descriptive statistic that measures the dispersion of individual data around the mean value.
Statistical significance—achieved when the p-value is less than the probability of a Type I error (usually set at 0.05).
Stem-and-leaf plot—type of exploratory data analysis that orders and organizes data to display its shape and distribution. The “stem” contains the first digit or digits of each observation, and the “leaf” contains the remaining digit or digits of each observation.
Student t-test—statistical test used to compare two unpaired (independent) samples having a normally distributed continuous outcome.
Type I error—the error that results when the null hypothesis is rejected when it is in fact true; stating there is a difference in outcome when none exists (false positive).
Type II error—the error that results when the null hypothesis is accepted when it is in fact false; stating there is no difference in outcome when one actually exists (false negative).
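As an illustration of the Student t-test, paired t-test, and the conventional 0.05 Type I error threshold described above, a minimal Python sketch assuming NumPy and SciPy; all measurements are simulated and the blood-pressure framing is a made-up example, not data from the source article:

```python
# Minimal sketch (assumes NumPy and SciPy): unpaired (Student) t-test for two
# independent groups, and paired t-test for before/after measurements on the
# same subjects. All data are simulated for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
group_a = rng.normal(loc=120, scale=10, size=30)   # e.g., systolic BP, group A
group_b = rng.normal(loc=125, scale=10, size=30)   # group B (unpaired with A)

before = rng.normal(loc=120, scale=10, size=30)
after = before - rng.normal(loc=3, scale=5, size=30)  # paired change in the same subjects

t_ind, p_ind = stats.ttest_ind(group_a, group_b)   # unpaired samples
t_rel, p_rel = stats.ttest_rel(before, after)      # paired samples

alpha = 0.05  # conventional Type I error threshold
print(f"Unpaired t-test: t={t_ind:.2f}, p={p_ind:.3f}, significant={p_ind < alpha}")
print(f"Paired t-test:   t={t_rel:.2f}, p={p_rel:.3f}, significant={p_rel < alpha}")
```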
Unpaired (independent) sample—study design that compares the outcomes of two groups where the groups are not matched on any characteristic.
Wilcoxon rank-sum test (Mann-Whitney U test)—nonparametric statistical test used to compare two unpaired (independent) samples where the outcome of interest is ordinal or continuous and not normally distributed.
Wilcoxon signed-rank test—nonparametric statistical test used to compare two paired (dependent) samples where the outcome of interest is ordinal or continuous with a skewed distribution.
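Finally, a minimal sketch of the two Wilcoxon tests defined above, the nonparametric counterparts of the unpaired and paired t-tests, assuming NumPy and SciPy; the skewed samples are simulated purely for illustration:

```python
# Minimal sketch (assumes NumPy and SciPy): Mann-Whitney U (Wilcoxon rank-sum)
# compares two unpaired samples; Wilcoxon signed-rank compares two paired
# samples. Data are simulated, skewed outcomes.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
group_a = rng.exponential(scale=1.0, size=30)    # skewed, unpaired samples
group_b = rng.exponential(scale=1.5, size=30)

before = rng.exponential(scale=2.0, size=30)     # skewed, paired measurements
after = before * rng.uniform(0.6, 1.1, size=30)

u_stat, p_u = stats.mannwhitneyu(group_a, group_b)   # unpaired, not normally distributed
w_stat, p_w = stats.wilcoxon(before, after)          # paired, skewed outcome

print(f"Wilcoxon rank-sum / Mann-Whitney U: U={u_stat:.1f}, p={p_u:.3f}")
print(f"Wilcoxon signed-rank: W={w_stat:.1f}, p={p_w:.3f}")
```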