OBSERVATIONAL STUDIES
Instructor: Fabrizio D’Ascenzo
fabrizio.dascenzo@gmail.com
www.emounito.org
www.metcardio.org

Role: MD
CONFLICT OF INTEREST
None

AIM OF THE COURSE
A critical appraisal of observational studies:
- Theoretical
- Practical

TODAY’S PROGRAM: FIRST PART
1) Literature: clinical general concepts
2) Literature: clinical methodological concepts
3) Quick assessment of an observational study
4) Complete assessment of an observational study

HOW TO READ AND WRITE A STUDY
Two points of view:
- Clinical
- Methodological

CLINICAL
- Strength of association
- Temporality
- Consistency
- Theoretical plausibility
- Coherence
- Specificity in the cause
- Dose-response
- Experimental evidence
- Analogy

STRENGTH OF ASSOCIATION
Size of the association as measured by appropriate statistical tests.
Example: odds ratio, relative risk.
But the strength of association depends on the prevalence of other potential confounding factors.

TEMPORALITY
Exposure should always precede the outcome.

CONSISTENCY
The association is consistent when results are replicated in studies in different settings using different methods.
If a relationship is causal, we would expect to find it consistently in different studies and among different populations.

THEORETICAL PLAUSIBILITY AND COHERENCE
The association agrees with the currently accepted understanding of pathological processes.
The likelihood of a causal association is increased if a biological gradient or dose-response curve can be demonstrated.
The association should be compatible with existing theory and knowledge.

IS THIS ENOUGH?
RELIABLE EVIDENCE?

METHODOLOGICAL
GRADING THE EVIDENCE

WHY PERFORM AND READ NON-RANDOMIZED EVIDENCE?
• To save economic resources
• To generate hypotheses, especially for patients who cannot be randomized
• To shed light on the generalizability of results from existing randomized experiments

HOW TO EVALUATE NON-RANDOMIZED EVIDENCE?
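
A SMALL WORKED EXAMPLE: ODDS RATIO AND RELATIVE RISK
The two measures named under strength of association can be computed from a 2x2 table. The sketch below is only illustrative: the counts are hypothetical, and the function names are assumptions made for this example, not taken from any study in the course.

```python
# Minimal sketch: odds ratio and relative risk from a 2x2 table.
# All counts are HYPOTHETICAL, chosen only for illustration.

def relative_risk(a, b, c, d):
    """a: exposed with event, b: exposed without event,
    c: unexposed with event, d: unexposed without event."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed

def odds_ratio(a, b, c, d):
    """Cross-product ratio of the same 2x2 table."""
    return (a * d) / (b * c)

# Hypothetical cohort: 100 exposed (30 events), 100 unexposed (10 events)
a, b, c, d = 30, 70, 10, 90
rr = relative_risk(a, b, c, d)
or_ = odds_ratio(a, b, c, d)
print(f"Relative risk: {rr:.2f}")
print(f"Odds ratio:    {or_:.2f}")
```

With these counts the odds ratio is larger than the relative risk. The two measures are close only when the outcome is rare, so they should not be read interchangeably.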
QUICK ASSESSMENT OF AN OBSERVATIONAL STUDY

3 CRUCIAL CONCEPTS
- Design of the study
- Bias
- Multivariate analysis

THREE DIFFERENT DESIGNS
COHORT
Advantages: chance to appraise different outcomes
Disadvantages: if events/outcomes are infrequent, a large number of patients is needed
CASE-CONTROL
Advantages: suited to infrequent outcomes
Disadvantages: control patients need to be selected from the whole population
CROSS-SECTIONAL
Advantages: easy to perform
Disadvantages: limited function

OR EASIER
• Retrospective: testing a hypothesis on datasets that are either already present, or built for that hypothesis but not at the time of the patients' assessment
• Prospective: testing a hypothesis on datasets built for it, so that patients are evaluated and studied, and their data entered, at the moment of their hospitalization/drug intake/intervention

REASON FOR ASSOCIATIONS
• Bias
• Confounding
• Chance
• Cause

BIAS
The measure of association between exposure and outcome is systematically wrong.
Two directions:
- bias away from the null
- bias towards the null

SELECTION BIAS
Unintended systematic difference between two or more groups, which is associated with the exposure.
FOR EXAMPLE
Inclusion of overly selected patients: patients with more severe disease presentation are often excluded to obtain larger benefits.

ATTRITION BIAS
If reported: how many patients attain a complete follow-up? If a patient is lost to follow-up, he/she may be dead (more probably) or alive.

1192 consecutive patients undergoing PCI in our center between January 2009 and January 2011
- 1116 patients with follow-up data derived from the Piedmont Region dedicated registry (AURA). Medical records of each patient, including rehospitalizations, were re-analyzed by a physician
- 76 patients not recorded in the Piedmont Region dedicated registry:
  - 39 recovered through phone call
  - 37 not traceable (30 not European….)
1155 at follow-up of 787 days (median; 474-1027)
Figure 1.
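
CHECKING THE FOLLOW-UP FLOW
The numbers in the flow above can be checked with simple arithmetic. The sketch below uses the counts reported for the PCI cohort. The number of observed events and the helper `event_rate_bounds` are hypothetical, added only to show how patients lost to follow-up widen the plausible range of an event rate.

```python
# Counts taken from the patient flow reported above (Figure 1).
enrolled = 1192
in_registry = 1116
not_in_registry = 76
recovered_by_phone = 39
lost = 37

# Internal consistency of the flow
assert in_registry + not_in_registry == enrolled
assert recovered_by_phone + lost == not_in_registry

followed = in_registry + recovered_by_phone   # 1155, as reported
completeness = followed / enrolled
print(f"Followed up: {followed} of {enrolled} ({completeness:.1%})")

def event_rate_bounds(observed_events, followed, lost):
    """Best case: no lost patient had the event.
    Worst case: every lost patient had the event."""
    total = followed + lost
    best = observed_events / total
    worst = (observed_events + lost) / total
    return best, worst

# HYPOTHETICAL number of events, NOT reported in the source.
hypothetical_events = 100
best, worst = event_rate_bounds(hypothetical_events, followed, lost)
print(f"Event rate between {best:.1%} and {worst:.1%}")
```

The fewer the patients lost, the narrower the gap between the best and worst case. This is why the number of patients with complete follow-up should be reported.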
ADJUDICATION BIAS
If reported: who adjudicated the events?
- A blinded central committee
- Non-blinded researchers

ANALYTICAL/INFORMATION BIAS
An error in measuring exposure or outcome may cause information bias; the risk is lower if the study is multicenter.
IF REPORTED….

CHANCE
The precision of an estimate of the association between exposure and outcome is usually expressed as a confidence interval (usually a 95% confidence interval).
The width of the confidence interval is determined by the number of subjects with the outcome of interest, which in turn is determined by the sample size.

With 200 pts
Variables in the Equation
                                                         95.0% CI for Exp(B)
Variable   B      SE     Wald     df   Sig.   Exp(B)   Lower    Upper
DIABETE    .069   .582   .014     1    .906   1.071    .342     3.351
PREGRESS   .488   .567   .739     1    .390   1.629    .536     4.950
RICOVERO   .769   .565   1.855    1    .173   2.158    .713     6.527
V21        .010   .747   .000     1    .990   1.010    .233     4.368
GSP_POSI   2.111  .547   14.886   1    .000   8.256    2.825    24.126

With 1000 pts
Variables in the Equation
                                                         95.0% CI for Exp(B)
Variable   B      SE     Wald     df   Sig.   Exp(B)   Lower    Upper
DIABETE    .069   .238   .084     1    .773   1.071    .672     1.706
PREGRESS   .488   .232   4.436    1    .035   1.629    1.034    2.564
V21        .010   .305   .001     1    .975   1.010    .555     1.836
RICOVERO   .769   .231   11.131   1    .001   2.158    1.373    3.390
GSP_POSI   2.111  .223   89.317   1    .000   8.256    5.329    12.791

CONFOUNDING
The aim of an observational study is to examine the effect of the exposure, but sometimes the apparent effect of the exposure is actually the effect of another characteristic which is associated with both the exposure and the outcome.

MULTIVARIATE ANALYSIS
Multivariable analysis aims to explore the relationship between a dependent variable and two or more independent variables appraised simultaneously.

ARE ALL MULTIVARIATE ANALYSES THE SAME?
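
A SIDE NOTE ON THE CHANCE TABLES
Going back to the two "Variables in the Equation" tables: the confidence limits of Exp(B) can be recovered from B and SE alone. The sketch below assumes the usual Wald interval, exp(B ± 1.96 × SE), and uses the GSP_POSI row. The function name is an assumption made for this example.

```python
import math

def odds_ratio_ci(b, se, z=1.96):
    """Wald 95% CI for an odds ratio from a logistic coefficient.
    Returns (Exp(B), lower, upper)."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

# GSP_POSI row: same coefficient B, different standard errors
or_200, lo_200, hi_200 = odds_ratio_ci(2.111, 0.547)     # 200 pts
or_1000, lo_1000, hi_1000 = odds_ratio_ci(2.111, 0.223)  # 1000 pts

print(f"200 pts:  OR {or_200:.3f} ({lo_200:.3f} to {hi_200:.3f})")
print(f"1000 pts: OR {or_1000:.3f} ({lo_1000:.3f} to {hi_1000:.3f})")
# Results are close to the tabulated 2.825 to 24.126 and 5.329 to 12.791;
# small differences come from rounding of B and SE in the tables.
```

The point estimate is identical in both tables. Only the standard error shrinks with more patients, and with it the width of the interval.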
• Logistic regression
• Cox multivariate adjustment
• Propensity score

HOW TO CHOOSE VARIABLES
To avoid:
- automatic algorithms with stepwise selection
To choose established associations from:
- prior well-conducted experimental or clinical studies
- strong associations (e.g. p<0.10 or p<0.05 at univariate analysis)

LOGISTIC REGRESSION: THE SIMPLEST ONE
The logit function transforms a dependent variable ranging between 0 and 1, such as the probability of an event, into a variable ranging from −∞ to +∞.
Thus, event probabilities can be appraised with a linear regression function for the logit of the probability of an event (dependent variable) given one or more independent variables.

LOGISTIC REGRESSION: THE SIMPLEST ONE: LIMITS
- Overfitting: an overfit model can be highly predictive in the dataset in which it was developed, but not in one in which it is validated or tested
- Multicollinearity, whereby covariates present in the model are unduly associated with each other
- Does not account for time

COX PROPORTIONAL HAZARD ANALYSIS: THE MOST USED ONE
• It addresses differences in follow-up duration and censored data
• It is based on the hazard function: the event rate at time t conditional on survival until time t or later

CENSORED DATA
Censored patients contribute to the computation of hazards and are assumed in the Cox model to fail at the same rate as non-censored patients, but they are not assumed to survive to the next time point.

RIGHT CENSORED DATA
The term right censored implies that the event of interest (i.e., the time-to-failure) is to the right of our data point.
In other words, if the units were to keep on operating, the failure would occur at some time after our data point (or to the right on the time scale).

INTERVAL CENSORED DATA
If we inspect a certain unit at 100 hours and find it operating, and perform another inspection at 200 hours and find that the unit is no longer operating, then the only information we have is that the unit failed at some point in the interval between 100 and 200 hours.

LEFT CENSORED DATA
A failure time is only known to be before a certain time.

PROPENSITY SCORES: THE NEW ONE
Conditional probability of receiving an exposure or treatment given a vector of measured covariates.
Courtesy of American Heart Association

[...] cases and covariates influencing exposure, and thus can be used instead of such covariates to simplify the analysis plan and increase robustness.

PROPENSITY SCORES: THE NEW ONE: HOW TO DO IT
- A logistic regression in a non-parsimonious fashion
- The results of this non-parsimonious logistic regression are then exploited to build the propensity score
THEN
- insert it in the multivariate adjustment to increase accuracy
- matching

MATCHING
Different methods:
- calipers of width 0.2 of the standard deviation of the logit of the propensity score
- Mahalanobis metric matching
- greedy matching
Calipers of width 0.2 of the standard deviation of the logit of the propensity score, and calipers of width 0.02 and 0.03, tended to have superior performance for estimating treatment effects.

PROPENSITY SCORES: THE NEW ONE: CALIBRATION
Whether the distances between the observed outcome (treatment: yes or no) and the outcome predicted by the model (propensity score) are small and unsystematic.
This is usually formally appraised with the Hosmer–Lemeshow goodness-of-fit test.

PROPENSITY SCORES: THE NEW ONE: DISCRIMINATION
How well the predicted probabilities derived from the model classify patients into their actual treatment group.
This is usually quantified with the c-statistic, which corresponds to the area under the receiver operating characteristic (ROC) curve.

IS THIS THE SAME?
It is important to keep in mind that even propensity score methods can only adjust for observed confounding covariates and not for unobserved ones.

IS EVERYTHING SO PERFECT?

ACCURATE ASSESSMENT OF AN OBSERVATIONAL STUDY

VARIABLES
Clearly define all outcomes, exposures, predictors, potential confounders, and effect modifiers. Give diagnostic criteria, if applicable.

DATA SOURCES/MEASUREMENT
For each variable of interest, give sources of data and details of methods of assessment (measurement). Describe comparability of assessment methods if there is more than one group.

STUDY SIZE
Explain how the study size was arrived at.

HOW TO DO IT?

RESULTS
• Report numbers of individuals at each stage of the study, e.g. numbers potentially eligible, examined for eligibility, confirmed eligible, included in the study, completing follow-up, and analysed
• Give reasons for non-participation at each stage
• Consider use of a flow diagram

DISCUSSION
• Summarise key results with reference to study objectives
• Discuss limitations of the study, taking into account sources of potential bias or imprecision. Discuss both direction and magnitude of any potential bias
• Give a cautious overall interpretation of results considering objectives, limitations, multiplicity of analyses, results from similar studies, and other relevant evidence
• Discuss the generalisability (external validity) of the study results

FUNDING
Give the source of funding and the role of the funders for the present study and, if applicable, for the original study on which the present article is based.

TAKE HOME MESSAGES
- Check for biological and methodological pitfalls
- Remember that multivariate analysis is multivariate analysis
- Remember that multivariate analysis is “only” multivariate analysis

THANKS A LOT!!!!