Quantitative Research Glossary Absolute risk reduction (ARR): the difference in the absolute risk (rates of adverse events) between study and control populations. Bias (systematic error): Deviation of results or inferences from the truth, or processes leading to such deviation. Sources of bias to look for are: Selection bias: systematic differences in the groups being compared caused by incomplete randomisation (allocation stage). Performance bias: systematic differences in the care provided apart from the intervention being evaluated (intervention stage). Exclusion bias: systematic differences in withdrawals / exclusions of people from the trial (follow up stage). Detection bias: systematic differences in the ways outcomes are assessed (outcomes stage). Blinding: A study in which observer(s) and/or subjects are kept ignorant of the group to which the subjects are assigned, as in an experimental study, or of the population from which the subjects come, as in a nonexperimental or observational study. The purpose of "blinding" is to eliminate sources of bias. The terms single blind, double blind and triple blind are in common use, but are not used consistently and so are ambiguous unless the specific people who are blinded are listed. Concealment of allocation: The process used to ensure that the person deciding to enter a participant into a randomised controlled trial does not know the group into which that individual will be allocated. Inadequate concealment of allocation may lead to selection bias. Confidence interval (CI): Any result obtained in a sample of patients can only give an estimate of the result which would be obtained in the whole population of patients from whom the study patients were selected. The real value will not be known, but the confidence interval can show the size of the likely variation from the true figure. A 95% confidence interval means that there is a 95% chance that the true result lies within the range specified. Wide confidence intervals indicate less precise estimates of effect. Considered to be superior to P values in determining significances, as they provide more information. Confounding variable (confounder): A characteristic that may be distributed differently between the study and control groups and that can effect the outcome being assessed. Confounding may be due to chance or bias. Confounding is a major concern in non-randomised studies Controls : The group that acts as a comparator for one or more experimental interventions in a controlled trial. The group without the disease or outcome of interest in a case control study. Effect size: This is the standardised effect observed. The effect size then becomes a generic term for the estimate of effect for a study and is usually defined as the difference in means between the intervention and control groups divided by the standard deviation of the control or both groups. Estimate of effect: The observed relationship between an intervention and an outcome expressed as, for example, a number needed to treat, odds ratio, relative risk reduction, risk ratio, mean difference. Fixed-effect model : A statistical model that assumes homogeneity among the results of the included studies in a meta-analysis. Only within-study variation is taken to influence the uncertainty of results. Heterogeneity: In systematic reviews heterogeneity refers to variability or differences between studies. A distinction is sometimes made between: statistical heterogeneity - differences in the effect estimates methodological heterogeneity - differences in study design clinical heterogeneity - differences in participants, interventions or outcome measures Homogeneity: In systematic reviews homogeneity refers to the degree to which the results of studies included in a review are similar. Clinical homogeneity means that, in trials included in a review, the participants, interventions and outcome measures are similar or comparable. Intention to treat analysis: Individual outcomes in a clinical trial are analyzed according to the group to which they have been randomized, regardless of whether they dropped out, fully complied with the intervention or crossed over to the other treatment. By simulating practical experience intention to treat analysis provides a better measure of effectiveness In true intention to treat analysis all participants should be included irrespective of whether outcomes were collected. There is no clear consensus on this as it involves including participants in the analyses whose outcomes are unknown, and therefore requires imputation of data. Inter-quartile range: extent of spread of middle 50% of ranked data. Commonly reported with median for skewed data. Interim analysis: Analysis comparing intervention groups at any time before the formal completion of a trial, usually before recruitment is complete. Often used with stopping rules so that a trial can be stopped if participants are being put at risk unnecessarily. Timing and frequency of interim analyses should be specified in the protocol. Line of no effect: the vertical line in the middle of an odds ratio diagram, which represents the point where treatment and control have the same effect – there is no difference between the two. Median: The value of the observation that comes half way when the observations are ranked in order. Medians are often used when data are skewed, meaning that the distribution is uneven. Mean difference: A standard statistic that measures the absolute difference between the mean value of two groups, estimating the amount by which on average the intervention changes the outcome compared to the control. Normal distribution : A statistical distribution with known properties commonly used as the basis of models to analyse continuous data. Key assumptions in such analyses are that the data are symmetrically distributed about a mean value, and the shape of the distribution can be described using the mean and standard deviation. If these assumptions do not hold, then non-parametric analyses should be used. Null hypothesis: The statistical hypothesis that one variable (e.g. whether or not a study participant was allocated to receive an intervention) has no association with another variable or set of variables (e.g. whether or not a study participant died), or that two or more population distributions do not differ from one another. Number needed to treat (NNT) is one measure of a treatment’s clinical effectiveness. It is the number of people you would need to treat with a specific intervention (e.g. aspirin for people having a heart attack) to see one additional occurrence of a specific outcome (e.g. prevention of death). Odds ratio (OR) is one measure of a treatment’s clinical effectiveness. The ratio of the odds of an event in one group (e.g. the experimental (intervention) group) to the odds of an event in another (e.g. the control group). If it is equal to 1, then the effects of the treatment are no different from those of the control treatment. In studies of treatment effect, the odds in the treatment group are usually divided by the odds in the control group. If the OR is greater (or less) than 1, then the effects of the treatment are more (or less) than those of the control treatment. P value: The possibility that any particular outcome would have occurred by chance. P values can range from 0 (impossible for the event to happen by chance) to 1 (the event will certainly happen). Statistical significance is usually p<0.05. Considered to be inferior to confidence intervals in determining significance of studies. Permuted block randomization : A method of randomisation that ensures that, at any point in a trial, roughly equal numbers of participants have been allocated to all the comparison groups. Permuted blocks should be used in trials using stratified randomisation Power: The ability of a study to demonstrate an association or causal relationship between two variables, given that an association exists. For example, 80% power in a clinical trial means that the study has a 80% chance of showing a statistically significant treatment effect if there really was an important difference between outcomes. If the statistical power of a study is low, the study results will be questionable (the study might have been too small to detect any differences). By convention, 80% is an acceptable level of power. Random effects model: A statistical model in which both within-study sampling error (variance) and between-studies variation are included in the assessment of the uncertainty (confidence interval) of the results of a meta-analysis. When there is heterogeneity among the results of the included studies beyond chance, random-effects models will give wider confidence intervals than fixed-effect models. Randomisation : The process of allocating participants to one of the groups of a randomised controlled trial using (i) a means of generating a random sequence and (ii) a means of concealing the sequence, such that those entering participants to a trial are unaware of which intervention a participant will receive. Random allocation implies that each individual or unit being entered into a trial has the same chance of receiving each of the possible interventions. This should ensure that intervention groups are balanced for both known and unknown factors. Phrases such as ‘quasi-randomised’ indicate that the process was not truly random, for example, allocation by date of birth, day of the week, medical record number, month of the year, or the order in which participants are included in the study (alternation). Relative risk (or risk ratio): the proportion of the group at risk in one group divided by the proportion at risk in the second group. A risk ratio of one indicates no difference between comparison groups. Relative risk reduction: The difference in the risk of the event between the control and experimental groups, relative to the control group. Sample size calculation: A sample size calculation designed to highlight the likelihood of detecting a true difference between outcomes in the control and intervention groups. They allow researchers to work out how large a sample they will need in order to have a moderate, high or very high chance of detecting a true difference between the groups. See also Power. Standard deviation (SD) – a statistical measure that describes the extent of data spread or scatter of a set of observations, calculated as the average difference from the mean value in the sample. Statistical significance – A result that is unlikely to have happened by chance. The usual threshold for this judgement is that the results, or more extreme results, would occur by chance with a probability of less than 0.05 if the null hypothesis was true. Stratification: The process by which groups are separated into mutually exclusive sub-groups of the population that share a characteristic: e.g. age group, sex, or socioeconomic status. In any randomised trial it is desirable that the comparison groups should be as similar as possible as regards those characteristics that might influence the response to the intervention. Stratified randomisation is used to ensure that equal numbers of participants with a characteristic thought to affect prognosis or response to the intervention will be allocated to each comparison group. Stratified randomisation is performed by performing separate randomisation (often using random permuted blocks) for each strata. Two-sided (or two-tailed) test: A hypothesis test in which the values for which we can reject the null hypothesis are located entirely in both tails of the probability distribution. Testing whether one treatment is either better or worse than another (rather than testing whether one treatment is only better than another) would be a two-tailed test. Type I error : A conclusion that a treatment works, when it actually does not work. The risk of a Type I error is often called alpha. In a statistical test, it describes the chance of rejecting the null hypothesis when it is in fact true. (Also called false positive.) Type II error: A conclusion that there is no evidence that a treatment works, when it actually does work. The risk of a Type II error is often called beta. In a statistical test, it describes the chance of not rejecting the null hypothesis when it is in fact false. The risk of a Type II error decreases as the number of participants in a study increases. (Also called false negative.)