Validity and Bias

A. Definitions:
   1. Bias – Systematic error that distorts the estimated effect.
   2. Validity – Absence of systematic error; the degree to which a study reaches a correct conclusion.
      a) Internal Validity – the extent to which the study correctly reflects the true situation within the study population.
      b) External Validity – the extent to which the study results are applicable to other populations (i.e., generalizability).

B. Bias
   1. Types of Bias:
      a) Selection Bias: errors in the way subjects are identified and selected for the study.
      b) Information Bias: errors in the measurements done in selected subjects.
   2. Each type of bias comes in two forms:
      a) Differential Misclassification: errors in selection or measurement on one axis (exposure or outcome) are related to the subjects’ status on the other axis (outcome or exposure, respectively).
      b) Non-Differential Misclassification: errors in selection or measurement occur at random, i.e., the errors on one axis (exposure or outcome) are not related to the subjects’ status on the other axis.
   3. Differential Selection Bias
      a) Definition: selection bias occurs when the selection of subjects on one axis (exposure for cohort studies, disease for case-control studies) is somehow related to the other axis (outcome for cohort and exposure for case-control).
      b) “Stacks the deck” in favor of or against an association between the exposure and outcome.
      c) Always a concern in cross-sectional, case-control, and retrospective cohort studies; less so in prospective cohort studies.
      d) In cross-sectional studies, differential selection bias occurs when the sample of subjects selected does not appropriately represent the general population from which they were selected, producing unpredictable biases in the prevalence estimates for exposures and/or outcomes.
      e) In cohort studies, differential selection bias occurs when the selection of exposed and unexposed is related in some way to the development of the disease or outcome of interest.
         - Differential bias in the selection process is uncommon in prospective cohort studies, where the outcome has not occurred at the time exposure status is determined, but differential loss to follow-up can produce a bias if the loss to follow-up is related to exposure status. [1]
         - It can occur in a retrospective cohort study, where the outcome or event has already occurred and can thus potentially affect the selection of exposed and/or unexposed cohort subjects.
         [1] Note: Loss to follow-up can also be thought of as an information bias; regardless of whether one calls it a selection or information bias, differential loss to follow-up can produce biases in the estimate of association.
      f) In case-control studies, differential selection bias occurs whenever the selection of cases and controls is related in some way to exposure status:
         - If the selection of cases and controls is based on different criteria and these criteria are related to exposure status.
         - If the control group doesn’t represent the exposure experience in the population that gave rise to the cases.
      g) Examples of Selection Bias (case-control studies):
         - Oral Contraceptive (OC) use and Pulmonary Embolism (PE)
           Physicians who suspected an OC-PE association were more likely to admit women with PE symptoms who were taking OC; this makes these women more likely to be cases than women with PE symptoms and no OC use.
           No similar tendency among those admitted for other conditions (i.e., controls).
           Overestimates the frequency of OC use among cases, thus overestimating the disease-exposure association.
         - Alcohol Consumption and Coronary Heart Disease, using trauma patients as controls
           Controls are more likely than the general population (from which cases were drawn) to drink alcohol.
           Underestimates the disease-exposure association.
         - Papilloma virus infection and Sexual Behavior, with differential participation between cases (95%) and controls (71%)
           Controls with multiple partners may account for the excess non-participation.
           Underestimates the frequency of having multiple partners among controls, thus overestimating the disease-exposure association.
      h) Evaluating Differential Selection Bias in Case-Control Studies
         (i) Is it likely that any of the control subjects would have been included in the study as a case if he/she had been diagnosed? If not, then selection bias is likely.
         (ii) In hospital-based studies, are the diseases/conditions the controls have potentially related in any way to the exposure of interest? If so, then selection bias is likely.
         (iii) Are the proportions of cases and controls participating relatively high and similar to one another? If not, then selection bias is likely.
         (iv) Is there any information on non-responders to suggest that they were similar to responders? Selection bias is less likely if responders and non-responders are similar.
         (v) Are there any population data (from the source population) to suggest the prevalence of exposure in the controls is about what would be expected? If so, selection bias is somewhat less likely.
   4. Differential Information Bias
      a) Definition: information bias occurs when there are systematic errors in the evaluation/measurement of selected study subjects on one axis (outcome for cohort studies, exposure for case-control studies) that are somehow related to the other axis (exposure for cohort and outcome for case-control).
      b) Differential Recall Bias
         (i) This can occur in case-control studies any time there are differences in how cases and controls recall their exposures due to:
             - Differences in ability or motivation to recall and report exposures, and/or
             - Use of different methods of collecting exposure information for cases and controls.
         (ii) Examples:
             - Risk factors for Brazilian Purpuric Fever: mothers of children who died are more likely to remember and report potential risk factors than mothers of healthy children.
             - Study of pancreatic cancer: information obtained from a proxy (spouse) for cases (many of whom had already died) but directly from controls.
      c) Differential Interviewer (case-control) or Observer (cohort) Biases
         (i) In case-control studies this can occur when interviews are done differently for cases and controls (different settings, interviewers, or methods) or when interviewers are aware of case/control status.
             Example: Infant feeding and severe cholera in Bangladesh. Case caretakers interviewed at diarrhea treatment centers are likely to remember or report differently than controls who were interviewed at home.
         (ii) In cohort studies this can occur any time the outcome of interest is assessed in a different way in exposed and unexposed subjects. When exposure status is known and the outcome cannot be assessed objectively, observers will tend to look more closely for (or measure) the outcome among the exposed than among the unexposed.
   5. Non-Differential (“Random”) Forms of Misclassification
      a) Subjects are classified into the wrong study groups because of random (“non-differential”) errors in classification of: Disease/Non-disease or Exposure/Non-exposure.
         - Non-differential or random errors can occur with either selection or information biases.
         - For two-category (yes/no) exposure and outcome classifications, these bias the relative measure of association (relative risk estimates) toward the null hypothesis of no difference (i.e., they dilute the differences between the groups, producing an underestimate of the association).
      b) Non-differential Selection Bias: occurs when the cases and controls are misclassified randomly (i.e., without respect to exposure status) or, in a cohort study, when exposed and unexposed are misclassified randomly.
         This form is especially common when the definition or identification of disease/outcome or exposure is ambiguous.
         Non-differential Information Bias: occurs when the misclassification of exposure status in a case-control study is not related to case/control status, or when misclassification of disease/non-disease status in a cohort study is not related to exposure/non-exposure.
      c) Example 1: Non-differential information bias (exposure misclassification) – case-control

         Truth:
                          Disease +   Disease -
            Exposure +       600         300
            Exposure -       400         700
            Total           1000        1000

            OR = (600*700) / (400*300) = 3.50

         Observed (10% of exposed from both groups under-report exposure):
                          Disease +   Disease -
            Exposure +       540         270
            Exposure -       460         730
            Total           1000        1000

            OR = (540*730) / (460*270) = 3.17

         Example 2: Non-differential selection bias (cases misclassified as controls) – case-control [2]

         Truth:
                          Disease +   Disease -
            Exposure +       600         300
            Exposure -       400         700
            Total           1000        1000

            OR = (600*700) / (400*300) = 3.50

         Observed (10% of controls are actually cases, i.e., misclassified cases):
                          Disease +   Disease -
            Exposure +       600         330
            Exposure -       400         670
            Total           1000        1000

            OR = (600*670) / (400*330) = 3.05

         [2] Misclassified cases bring their higher exposure experience to the controls. 10% misclassification means 100 controls are actually cases (60 exposed, 40 unexposed) and 900 are true controls (270 exposed, 630 unexposed).
   6. Evaluation of Bias – requires a carefully considered subjective judgment about:
      a) Presence of a potential bias – Does a particular bias or the potential for a particular bias exist in the study?
      b) Expected direction and magnitude of the bias –
         (i) What effect will the potential bias have on the results/measures?
         (ii) How much will the measures of effect or association be altered by the presence of bias?
         (iii) Will they be increased or decreased?
      c) What is the expected combined effect of all potential biases on the study results?
   7. Control of Bias
      a) Careful study design
      b) Objective and well-defined Exposure and Disease definitions
      c) Accurate, detailed and complete records and information
      d) Objective, detailed and accurate information from subjects
      e) Source and manner by which data is obtained are the same for both groups
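
The arithmetic in the two non-differential misclassification examples (Example 1 and Example 2 under section 5) can be checked numerically. The following is a minimal Python sketch and is not part of the original notes. The function names (odds_ratio, underreport_exposure, cases_among_controls) are hypothetical helpers introduced only for illustration. The sketch assumes the "true" 2x2 table from the examples and the stated 10% misclassification.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a case-control 2x2 table.

    a = exposed cases,   b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    return (a * d) / (b * c)


def underreport_exposure(a, b, c, d, fraction):
    """Example 1 (assumed mechanism): the same fraction of exposed subjects
    in BOTH groups is recorded as unexposed (non-differential)."""
    moved_cases = round(a * fraction)
    moved_controls = round(b * fraction)
    return (a - moved_cases, b - moved_controls,
            c + moved_cases, d + moved_controls)


def cases_among_controls(a, b, c, d, fraction):
    """Example 2 (assumed mechanism): a fraction of the control group is
    really made up of cases, who carry the cases' exposure distribution."""
    n_cases = a + c
    n_controls = b + d
    n_misclassified = fraction * n_controls
    n_true_controls = n_controls - n_misclassified
    exposed = n_misclassified * a / n_cases + n_true_controls * b / n_controls
    unexposed = n_misclassified * c / n_cases + n_true_controls * d / n_controls
    return (a, exposed, c, unexposed)


# "True" table used in both examples of the notes.
TRUTH = (600, 300, 400, 700)

if __name__ == "__main__":
    print(f"Truth:     OR = {odds_ratio(*TRUTH):.2f}")   # 3.50
    ex1 = underreport_exposure(*TRUTH, fraction=0.10)
    print(f"Example 1: OR = {odds_ratio(*ex1):.2f}")     # 3.17
    ex2 = cases_among_controls(*TRUTH, fraction=0.10)
    print(f"Example 2: OR = {odds_ratio(*ex2):.2f}")     # 3.05
```

In both scenarios the observed odds ratio falls below the true value of 3.50, which matches the statement in the notes that non-differential misclassification dilutes the difference between groups. Increasing the fraction argument should pull the odds ratio further toward 1.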