Critical appraisal of published vaccine trial papers
Some preliminary questions

Resources consulted
• Greenhalgh, Trisha. BMJ 1997;315:243-6.
  – A series of 10 publications on how to read an article
• AR Waladkhani (2008). Conducting clinical trials. A theoretical and practical guide.

What issue is being addressed?
– Concept of the NULL HYPOTHESIS
– Authors vs scientists
– Hypothetico-deductive approach

What does hypothesis testing do?
Are these different?

NULL hypothesis
The NULL hypothesis represents the conservative position of no change.
H0: A = B
H1: A > B
H2: A < B
H3: A >> B
H4: A << B
H5: A <> B

Hypothesis testing is error prone
Type I error (α): incorrectly rejecting H0
Type II error (β): incorrectly accepting (failing to reject) H0
(1 - β) denotes the power of the study.
α and (1 - β) are important determinants of the sample size for a study.
For a fixed sample size, α and β trade off against each other: lowering α raises β and so reduces the power (1 - β).
Most analyses are done with an alpha error of 5% and a beta error of 20%.
Nair PG 22 Feb 2008

Type of study
Primary: experimental / clinical trial / surveys
– Design features:
  • Parallel group comparison
  • Paired or matched comparison
  • Within subject comparison
  • Single blind
  • Double blind
  • Crossover
  • Placebo controlled
  • Factorial design: >1 independent variable

Type of study (continued)
– Secondary research: summarises / draws conclusions from primary studies
  • Overviews
    – Non-systematic
    – Systematic reviews
    – Meta-analysis
  • Guidelines
  • Decision analysis
  • Economic analysis

Is the design appropriate to the research?
• Therapy: testing the efficacy
  – Randomised controlled trials
• Diagnosis: evaluating a new test
  – Cross sectional survey
• Screening tests:
  – Cross sectional survey
• Prognosis:
  – Longitudinal cohort study
• Causation:
  – Cohort or case control study
  – Case reports?

Randomised controlled trials
• In a randomised controlled trial, participants are randomly allocated by a process equivalent to the flip of a coin to either one intervention (such as a drug) or another (such as placebo treatment or a different drug).
  Both groups are followed up for a specified period and analysed in terms of outcomes defined at the outset. Because, on average, the groups are identical apart from the intervention, any differences in outcome are, in theory, attributable to the intervention.

Advantages
o Allows rigorous evaluation of a single variable (effect of drug treatment versus placebo, for example) in a precisely defined patient group
o Prospective design (data are collected on events that happen after you decide to do the study)
o Uses hypothetico-deductive reasoning (seeks to falsify, rather than confirm, its own hypothesis)
o Potentially eradicates bias by comparing two otherwise identical groups
o Allows for meta-analysis (combining the numerical results of several similar trials at a later date)

Disadvantages
o Expensive and time consuming; hence, in practice:
  – Many randomised controlled trials are either never done, are performed on too few patients, or are undertaken for too short a period
  – Most are funded by large research bodies (university or government sponsored) or drug companies, who ultimately dictate the research agenda
  – Surrogate endpoints are often used in preference to clinical outcome measures
o May introduce "hidden bias", especially through:
  – Imperfect randomisation
  – Failure to randomise all eligible patients (the clinician only offers participation in the trial to patients he or she considers will respond well to the intervention)
  – Failure to blind assessors to randomisation status

COHORT STUDY
Two (or more) groups of people are selected on the basis of differences in their exposure to a particular agent (such as a vaccine, a drug, or an environmental toxin), and followed up to see how many in each group develop a particular disease or other outcome.
The follow up period in cohort studies is generally measured in years (and sometimes in decades, depending on the disease studied).

Case control study
• Patients with a particular disease or condition are identified and "matched" with controls (patients with some other disease, the general population, neighbours, or relatives). Data are then collected (for example, by searching back through these people's medical records or by asking them to recall their own history) on past exposure to a possible causal agent for the disease.
• Generally concerned with the aetiology of a disease, rather than its treatment.
• An important source of difficulty (and potential bias) in a case-control study is the precise definition of who counts as a "case," since one misallocated subject may substantially influence the results.

Cross sectional surveys
• A representative sample of subjects (or patients) is interviewed, examined, or otherwise studied to gain answers to a specific clinical question. In cross sectional surveys, data are collected at a single time but may refer retrospectively to experiences in the past, such as the study of casenotes.

Case reports
• A case report describes the medical history of a single patient in the form of a story.
• Although this type of research is traditionally considered to be "quick and dirty" evidence, a great deal of information can be conveyed in a case report that would be lost in a clinical trial or survey.

Hierarchy of evidence
• Systematic reviews & meta-analysis
• Randomised controlled trials with a definitive result
• Randomised controlled trials with a non-definitive result
• Cohort studies
• Case control studies
• Cross sectional surveys
• Case reports

Assessing the methodological quality
Important questions
• Is the study original, or is it bigger, longer, more rigorous, or in a different population than earlier studies?
• Whom is the study about?
  – Recruitment
  – Inclusion/exclusion criteria
  – Is it under real life circumstances?
• Is the design sensible?
  – The intervention & the comparative arm
  – Outcome: actual or surrogate
• Was systematic bias taken care of?
• Was assessment blind?
• Were preliminary statistical questions addressed?
  – Sample size
  – Duration of follow up
  – Completeness of follow up

Systematic Bias
• Systematic bias is defined as anything that erroneously influences the conclusions about groups and distorts comparisons.
• The aim should be for the groups being compared to be as similar as possible except for the particular difference being examined.

Steps to check systematic bias
• Non-RCTs:
  – Use your common sense to decide if the baseline differences between the intervention and control groups are likely to have been so great as to invalidate any differences ascribed to the effects of the intervention.
• Cohort studies:
  – The selection of a comparable control group is one of the most difficult decisions.
  – The "controlling" in cohort studies occurs at the analysis stage, where complex statistical adjustment is made for baseline differences in key variables.
• Case control studies:
  – The diagnosis of "caseness" and the decision as to when the individual became a case are most open to bias.

Sample size
• Douglas Altman: "a trial should be big enough to have a high chance of detecting, as statistically significant, a worthwhile effect if it exists, and thus to be reasonably sure that no benefit exists if it is not found in the trial."
• To calculate sample size, the clinician must decide two things:
  – What level of difference between the two groups would constitute a clinically significant effect
  – The mean and the standard deviation of the principal outcome variable
• A statistical nomogram is then used to work out how large a sample is required to have a moderate, high, or very high chance of detecting a true difference between the groups: the power of the study.
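
The sample size reasoning above can be sketched in code. The slides refer to a nomogram; the function below instead uses the standard normal-approximation formula for comparing two means, n per group = 2 (z_alpha + z_beta)^2 sd^2 / delta^2, which is an assumption of this sketch and not necessarily the method the nomogram encodes. The function name and the example values are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate participants needed per group to compare two means.

    delta: smallest difference in means regarded as clinically significant
    sd:    standard deviation of the principal outcome variable
    alpha: two-sided type I error rate
    power: 1 - beta, the chance of detecting a true difference of size delta

    Assumption: normal approximation, equal group sizes, equal variances.
    """
    if delta <= 0 or sd <= 0:
        raise ValueError("delta and sd must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")

    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = std_normal.inv_cdf(power)           # quantile matching the power
    n = 2 * ((z_alpha + z_beta) * sd / delta) ** 2
    return math.ceil(n)  # round up: a fraction of a participant is not possible


if __name__ == "__main__":
    # Hypothetical outcome: sd of 10 units, worthwhile difference of 5 units.
    print(sample_size_per_group(delta=5, sd=10))              # 80% power
    print(sample_size_per_group(delta=5, sd=10, power=0.90))  # 90% power
```

With these hypothetical inputs the sketch gives 63 per group at 80% power and 85 per group at 90% power, which illustrates how demanding a lower type II error raises the required sample.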
Type I error
• False positive error (α)
• Finding a difference when none exists.
• The significance level (the threshold for the P value) is usually fixed at 5%.
• When no true difference exists, there is a 95% chance of correctly finding none and a 5% chance of wrongly declaring one.
• Usually due to sampling error

Type II error
• False negative error
• β error
• Not picking up a difference when a difference actually exists (β).
• 1 - β is the power of the test: the ability to pick up or detect a difference when one actually exists.
• Usually power is 80-90%.
• Underpowered studies: high type II error

Statistical analysis
• Are the groups comparable & have baseline differences been adjusted for?
• Are the statistical tests parametric or non-parametric?
• If any obscure statistical method was used, has it been justified & referenced?
• Have the data been analysed as per the original protocol?
• Paired data tests, tails & analysis of outliers

Statistical significance
• Probability & confidence
  – P value calculation & interpretation
  – Confidence intervals
  – Effect of intervention in terms of likely benefit / harm to individual / population
• Relative Risk (RR)
  – Ratio of the probability of the event for an exposed person to the probability of the event for an unexposed person (Pe/Pu)
• Absolute Risk Reduction (ARR)
  – Difference in the probability of the event between exposed and unexposed persons; for a protective intervention this is Pu - Pe

The p value
The probability of getting a result at least as extreme as the one observed by chance alone, that is, if the null hypothesis is true.
p below 0.05 (5%) is conventionally accepted as statistically significant.
2-tailed p is preferable to 1-tailed p.
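
The relative risk and absolute risk reduction defined under Statistical significance can be computed from the four counts of a two-arm trial. The sketch below is illustrative only: the function name and the trial counts are hypothetical, and the confidence interval for RR uses the common log-scale approximation, which is an assumption of this sketch and not a method named in the slides.

```python
import math
from statistics import NormalDist


def risk_summary(events_exposed, n_exposed, events_unexposed, n_unexposed,
                 level=0.95):
    """Return Pe, Pu, RR, ARR and an approximate confidence interval for RR.

    "Exposed" means the group that received the intervention (e.g. vaccine).
    Assumption: the CI uses the log-scale normal approximation, which needs
    at least one event in each group.
    """
    if events_exposed <= 0 or events_unexposed <= 0:
        raise ValueError("the log method needs at least one event per group")
    if events_exposed > n_exposed or events_unexposed > n_unexposed:
        raise ValueError("events cannot exceed group size")

    pe = events_exposed / n_exposed        # risk in the exposed group
    pu = events_unexposed / n_unexposed    # risk in the unexposed group
    rr = pe / pu                           # relative risk, Pe/Pu
    arr = pu - pe                          # absolute risk reduction

    # Standard error of ln(RR) from the four counts.
    se_log_rr = math.sqrt(1 / events_exposed - 1 / n_exposed
                          + 1 / events_unexposed - 1 / n_unexposed)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return {"Pe": pe, "Pu": pu, "RR": rr, "ARR": arr, "RR_CI": (lower, upper)}


if __name__ == "__main__":
    # Hypothetical trial: 10 cases among 1000 vaccinated, 40 among 1000 controls.
    for name, value in risk_summary(10, 1000, 40, 1000).items():
        print(name, value)
```

In this hypothetical trial RR is 0.25 and ARR is 0.03. A confidence interval for RR that lies entirely below 1 would correspond to a statistically significant protective effect at the chosen level.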
Tails of a test
• A tail in statistics refers to the direction of movement of the data.
• Two tailed: weight, blood sugar, intraocular pressure (IOP), where values may move in either direction
• One tailed: non-inferiority, prevention of mother-to-child transmission (MTCT)

Summary of Statistical analysis
• An association between two variables is likely to be causal if it is strong, consistent, specific, plausible, follows a logical time sequence, and shows a dose-response gradient.
• A P value of <0.05 means that this result would have arisen by chance on less than one occasion in 20.
• The confidence interval around a result in a clinical trial indicates the limits within which the "real" difference between the treatments is likely to lie, and hence the strength of the inference that can be drawn from the result.
• A statistically significant result may not be clinically significant.
• The results of intervention trials should be expressed in terms of the likely benefit an individual could expect (for example, the absolute risk reduction).

Drug / vaccine trials
• Clinical endpoints vs surrogate endpoints
• Surrogate endpoint: a variable which is relatively easily measured and which predicts a rare or distant outcome of either a toxic stimulus (such as a pollutant) or a therapeutic intervention (a drug, surgical procedure, piece of advice, etc) but which is not itself a direct measure of either harm or clinical benefit.
• Advantages:
  – Can reduce sample size, duration & cost of clinical trials
  – Can allow treatments to be assessed where primary outcomes are invasive / unethical
• Disadvantages:
  – Does not answer the objective of treatment / the best treatment in a patient
  – May not be valid or reliable
  – Over reliance reflects a narrow clinical perspective
  – As developed in animal models, extrapolation to humans may be invalid

Ideal surrogate end point
• Reliable, reproducible, clinically available, easily quantifiable, affordable, and shows a "dose-response" effect
• True predictor of disease or the risk of disease
• Sensitive & specific
• Precise cut off between normal and abnormal values
• Acceptable positive predictive value and negative predictive value
• Amenable to quality control monitoring
• Changes in the surrogate end point should rapidly and accurately reflect the response to treatment.

Systematic reviews & meta-analysis
• Systematic review
  – An overview of primary studies that used explicit & reproducible methods
  – Needs a thorough search of appropriate databases

Meta-analysis
• Is a mathematical synthesis of the results of two or more primary studies that addressed the same hypothesis in the same way
• Easier to assimilate than a bunch of studies
• Driven by software ("metaview"), with results obtained in a pictorial format
• The results are homogeneous if the result of each individual trial is mathematically compatible with the results of any of the others. Otherwise they are heterogeneous and need further statistical tests such as chi square.
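
The "mathematical synthesis" and the chi square check for heterogeneity can be illustrated with a small sketch. The slides do not say which pooling method the software uses, so the code below assumes inverse-variance fixed-effect pooling and Cochran's Q statistic, a common choice. The function name and the study values are hypothetical.

```python
import math


def fixed_effect_meta(effects, std_errors):
    """Pool study effects by inverse-variance weighting and compute Cochran's Q.

    effects:    one effect estimate per study (e.g. log relative risks)
    std_errors: the standard error of each estimate

    Returns (pooled_effect, pooled_se, q, df). Under homogeneity, Q is
    compared against a chi square distribution with df degrees of freedom.
    """
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need matching lists with at least two studies")
    if any(se <= 0 for se in std_errors):
        raise ValueError("standard errors must be positive")

    # More precise studies (smaller standard error) get larger weights.
    weights = [1 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1 / total_weight)

    # Cochran's Q: weighted squared deviations of each study from the pooled value.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    return pooled, pooled_se, q, df


if __name__ == "__main__":
    # Hypothetical log relative risks and standard errors from three trials.
    pooled, pooled_se, q, df = fixed_effect_meta([-0.5, -0.3, -0.4],
                                                 [0.2, 0.25, 0.15])
    print("pooled log RR:", pooled, "SE:", pooled_se)
    print("Cochran's Q:", q, "on", df, "degrees of freedom")
```

A Q value that is large relative to its degrees of freedom suggests the trials are heterogeneous, in which case a single pooled estimate should be interpreted with caution.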
Quantitative vs qualitative research
• Qualitative methods aim to make sense of, or interpret, phenomena in terms of the meanings people bring to them.
• Qualitative research defines preliminary questions which can then be addressed in quantitative studies.
• Addresses a clinical problem through a clearly formulated question & using more than one research method (triangulation)
• Analysis is done using explicit, systematic and reproducible methods.

ITT vs Per-protocol analysis
• Failure to include all participants in the analysis may bias the trial results.
• Reasons for preferring intention-to-treat (ITT) analysis: preservation of the randomized trial population and treatment assignment, control of type I error in superiority trials, and assessment of public health policy
• Per-protocol analysis (PPA): biological efficacy
• FDA & EMEA prefer ITT analysis.
• Both are important, answer different scientific questions, and have merits & demerits.

THANK YOU