Evaluating published vaccine trial papers

Critical appraisal of published
vaccine trial papers
Some preliminary questions
Resources consulted
• Greenhalgh, Trisha, BMJ 1997;315:243-6.
– A series of 10 publications on how to read an
article
• AR Waladkhani. (2008). Conducting clinical
trials. A theoretical and practical guide.
What issue is being addressed?
– Concept of NULL HYPOTHESIS
– Authors vs scientists
– Hypothetico-deductive approach
What does hypothesis testing do?
Are these different?
NULL hypothesis
NULL hypothesis represents the conservative
position of no change.
H0 : A = B
H1 : A > B
H2 : A < B
H3 : A >> B
H4 : A << B
H5 : A <> B
Hypothesis testing is error prone
Type I error (α) - incorrectly rejecting H0
Type II error (β) - incorrectly accepting H0 (failing to reject a false H0)
(1 – β) denotes the power of the study.
α and (1 – β) are important determinants of sample size for a study.
For a fixed sample size, α and β trade off: lowering α raises β and so lowers the power (1 – β).
Most analyses are done with an alpha error of 5% & a beta error of 20%.
Nair PG 22 Feb 2008
Type of study
Primary: experimental/clinical trial/ surveys
– Design features:
• Parallel group comparison
• Paired or matched comparison
• Within subject comparison
• Single blind
• Double blind
• Crossover
• Placebo controlled
• Factorial design: >1 independent variable
Type of study (continued)
– Secondary research: summarises/draws
conclusions from primary studies
• Overviews
– Non-systematic
– Systematic reviews
– Meta-analysis
• Guidelines
• Decision analysis
• Economic analysis
Is the design appropriate to the research?
• Therapy: testing the efficacy
– Randomised controlled trials
• Diagnosis: evaluating a new test
– Cross sectional survey
• Screening tests:
– Cross sectional survey
• Prognosis:
– Longitudinal cohort study
• Causation:
– Cohort or case control study
– Case reports?
Randomised controlled trials
• In a randomised controlled trial, participants are
randomly allocated by a process equivalent to the
flip of a coin to either one intervention (such as a
drug) or another (such as placebo treatment or a
different drug). Both groups are followed up for a
specified period and analysed in terms of
outcomes defined at the outset. Because, on
average, the groups are identical apart from the
intervention, any differences in outcome are, in
theory, attributable to the intervention.
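The coin-flip allocation described above can be sketched in a few lines. This is only an illustration of simple randomisation: the function name `allocate`, the arm labels and the seed are assumptions, not taken from any particular trial, and real trials usually use block or stratified schemes with concealed allocation.

```python
import random

def allocate(participant_ids, seed=None):
    """Simple randomisation: each participant independently gets one arm,
    with equal probability, like the flip of a coin."""
    rng = random.Random(seed)  # seeded so the allocation list is reproducible
    return {pid: rng.choice(["intervention", "control"]) for pid in participant_ids}

# Hypothetical trial with 20 participants numbered 1..20
allocation = allocate(range(1, 21), seed=2008)
counts = {arm: list(allocation.values()).count(arm)
          for arm in ("intervention", "control")}
print(counts)
```

With simple randomisation the two arms are rarely exactly equal in size, which is one reason the groups are only identical "on average".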
Advantages
o Allows rigorous evaluation of a single variable (effect of drug
treatment versus placebo, for example) in a precisely defined
patient group. Also a prospective design (data are collected on
events that happen after you decide to do the study)
o Uses hypothetico-deductive reasoning (seeks to falsify, rather
than confirm, its own hypothesis)
o Potentially eradicates bias by comparing two otherwise identical
groups
o Allows for meta-analysis (combining the numerical results of
several similar trials at a later date)
Disadvantages
o Expensive and time consuming; hence, in practice:
– Many randomised controlled trials are either never done, are
performed on too few patients, or are undertaken for too short
a period
– Most are funded by large research bodies (university or govt.
sponsored) or drug companies, who ultimately dictate the
research agenda
– Surrogate endpoints are often used in preference to clinical
outcome measures
o May introduce "hidden bias", especially through:
– Imperfect randomisation
– Failure to randomise all eligible patients (clinician only
offers participation in the trial to patients he or she
considers will respond well to the intervention)
– Failure to blind assessors to randomisation status
COHORT STUDY
Two (or more) groups of people are selected on the
basis of differences in their exposure to a particular
agent (such as a vaccine, a drug, or an environmental
toxin), and followed up to see how many in each group
develop a particular disease or other outcome. The
follow up period in cohort studies is generally
measured in years (and sometimes in decades
depending on the disease studied)
Case control study
• Patients with a particular disease or condition are identified and
"matched" with controls (patients with some other disease, the
general population, neighbours, or relatives). Data are then
collected (for example, by searching back through these people's
medical records or by asking them to recall their own history) on
past exposure to a possible causal agent for the disease.
• Generally concerned with the aetiology of a disease, rather than
its treatment.
• An important source of difficulty (and potential bias) in a
case-control study is the precise definition of who counts as a "case,"
since one misallocated subject may substantially influence the
results.
Cross sectional surveys
• A representative sample of
subjects (or patients) is
interviewed, examined, or
otherwise studied to gain
answers to a specific clinical
question. In cross sectional
surveys, data are collected at a
single time but may refer
retrospectively to experiences
in the past, such as the study
of casenotes.
Case reports
• A case report describes the
medical history of a single
patient in the form of a
story.
• Although this type of
research is traditionally
considered to be "quick and
dirty" evidence, a great deal
of information can be
conveyed in a case report
that would be lost in a
clinical trial or survey.
Hierarchy of evidence
• Systematic reviews & meta-analysis
• Randomised controlled trials with a definitive
result
• Randomised controlled trials with a non-definitive result
• Cohort studies
• Case control studies
• Cross sectional surveys
• Case reports
Assessing the methodological
quality
Important questions
• Is the study original, or does it add to earlier work by being bigger,
longer, more rigorous, or in a different population?
• Whom is the study about?
– Recruitment
– Inclusion/exclusion criteria
– Is it under real life circumstances?
• Is the design sensible?
– The intervention & the comparative arm
– Outcome: actual or surrogate
• Was systematic bias taken care of?
• Was assessment blind?
• Were preliminary statistical questions addressed?
– Sample size
– Duration of follow up
– Completeness of follow-up
Systematic Bias
• Systematic bias is defined as anything that
erroneously influences the conclusions about
groups and distorts comparisons.
• The aim should be for the groups being
compared to be as similar as possible except
for the particular difference being examined.
Steps to check systematic bias
• Non-RCTs:
– Use your common sense to decide if the baseline differences
between the intervention and control groups are likely to have
been so great as to invalidate any differences ascribed to the
effects of the intervention.
• Cohort studies:
– The selection of a comparable control group is one of the most
difficult decisions
– the "controlling" in cohort studies occurs at the analysis stage,
where complex statistical adjustment is made for baseline
differences in key variables.
• Case control studies:
– diagnosis of "caseness" and the decision as to when the
individual became a case is most open to bias.
Sample size
• Douglas Altman: "a trial should be big enough to have a high
chance of detecting, as statistically significant, a worthwhile
effect if it exists, and thus to be reasonably sure that no benefit
exists if it is not found in the trial."
• To calculate sample size, the clinician must decide two things:
– What level of difference between the two groups would
constitute a clinically significant effect.
– The mean and the standard deviation of the principal outcome
variable.
• A statistical nomogram is then used to work out how large a sample
is required to have a moderate, high, or very high chance of
detecting a true difference between the groups: the power of
the study.
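As a rough illustration of what the nomogram does, the standard normal-approximation formula for comparing two means can be coded directly: n per group = 2 × ((z for α + z for power) × SD / difference)². This is a sketch under stated assumptions (two equal groups, continuous outcome, two-sided test); the function name and the example numbers are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect a true
    difference `delta` in means, given standard deviation `sd`."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)

# Hypothetical: clinically significant difference of 5 units, SD of 10 units
n = sample_size_per_group(delta=5.0, sd=10.0)
print(n)  # 63 per group at alpha 5% and power 80%
```

Asking for 90% power with the same inputs raises the requirement to 85 per group, which shows why the chosen power matters for trial size.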
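Type I error
• False positive error (α error)
• Finding a difference when none exists.
• The significance level (α) is usually fixed at 5%; a result is
called significant when p < 0.05.
• If there is truly no difference, there is a 5% chance of wrongly
finding one and a 95% chance of correctly finding none.
• Usually due to sampling error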
Type II error
• False negative error
• β error
• Not picking up a difference when a
difference actually exists (β).
• 1 – β is the power of the test; the
ability to pick up or detect a difference when
one actually exists.
• Usually power is 80-90%.
• Underpowered studies – high type II error
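To see how an underpowered study ends up with a high type II error, the approximate power of a two-group comparison of means can be computed for different sample sizes. This is a sketch using the normal approximation and ignoring the negligible opposite tail; the function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_two_groups(n_per_group, delta, sd, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided test comparing two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    se = sd * sqrt(2 / n_per_group)  # standard error of the difference in means
    return NormalDist().cdf(delta / se - z_alpha)

# Hypothetical true difference of 5 units with SD 10 units
for n in (20, 63, 100):
    p = power_two_groups(n, delta=5.0, sd=10.0)
    print(f"n={n:3d} per group: power={p:.2f}, type II error={1 - p:.2f}")
```

With 20 per group the power is well under 50%, so a true difference would be missed more often than it is found.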
Statistical analysis
• Are the groups comparable & have baseline
differences been adjusted for?
• Were the statistical tests parametric or non-parametric?
• If any obscure statistical method was used, has it been
justified & referenced?
• Has the data been analysed as per the original
protocol?
• Paired data tests, tails & outlier analysis
Statistical significance
• Probability & confidence
– P value calculation & interpretation
– Confidence intervals
– Effect of intervention in terms of likely benefit/ harm to
individual/ population
• Relative Risk (RR)
– Ratio of probability of event for exposed person to probability of
event for unexposed person (Pe/Pu)
• Absolute Risk Reduction (ARR)
– Difference in probability of event for exposed person and
unexposed person (Pe-Pu)
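A small worked example may make the two measures concrete. The counts below are invented for illustration, "exposed" is taken to mean vaccinated, and vaccine efficacy (1 − RR) is added here as a commonly reported companion figure rather than something defined on the slide. Since Pe − Pu is negative for a protective vaccine, the sketch reports its magnitude as the ARR.

```python
def risk_measures(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Relative risk, absolute risk reduction and vaccine efficacy
    from the event counts in each group."""
    pe = events_exposed / n_exposed        # risk in exposed (vaccinated) group
    pu = events_unexposed / n_unexposed    # risk in unexposed (control) group
    rr = pe / pu                           # relative risk, Pe/Pu
    arr = abs(pe - pu)                     # magnitude of Pe - Pu
    efficacy = 1 - rr                      # assumption: exposure is a protective vaccine
    return rr, arr, efficacy

# Hypothetical trial: 10 cases in 1000 vaccinated, 40 cases in 1000 controls
rr, arr, efficacy = risk_measures(10, 1000, 40, 1000)
print(f"RR={rr:.2f}  ARR={arr:.3f}  efficacy={efficacy:.0%}")
```

Here the relative risk of 0.25 sounds large, while the absolute risk reduction is 3 percentage points; the two figures answer different questions.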
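The p value
• The probability of getting an outcome at least as extreme as
the one observed by chance alone, i.e. if the null hypothesis is true.
• p below 0.05 (5%) is conventionally accepted as
statistically significant.
• 2-tailed p is preferable
to 1-tailed p.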
Tails of a test
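• A tail in statistics refers to the direction in which
the data can move away from the null value
• Two tailed: the value may move in either direction,
e.g. weight, blood sugar, IOP
• One tailed: only one direction is of interest,
e.g. non-inferiority, prevention of MTCT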
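The practical difference between the two kinds of test shows up when the same test statistic is converted to a p value both ways. The sketch below assumes a z statistic from a normal-approximation test; the function name and the value 1.8 are hypothetical.

```python
from statistics import NormalDist

def p_values(z):
    """One-tailed (upper tail) and two-tailed p values for a z statistic."""
    one_tailed = 1 - NormalDist().cdf(z)             # only A > B counts as extreme
    two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))  # either direction counts
    return one_tailed, two_tailed

one, two = p_values(1.8)
print(f"one-tailed p={one:.3f}, two-tailed p={two:.3f}")
```

For this statistic the one-tailed p falls below 0.05 while the two-tailed p does not, so a paper that reports a one-tailed test should justify why only one direction was of interest.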
Summary of Statistical analysis
• An association between two variables is likely to be causal if it is
strong, consistent, specific, plausible, follows a logical time
sequence, and shows a dose-response gradient
• A P value of <0.05 means that this result would have arisen by
chance on less than one occasion in 20
• The confidence interval around a result in a clinical trial indicates
the limits within which the "real" difference between the
treatments is likely to lie, and hence the strength of the inference
that can be drawn from the result
• A statistically significant result may not be clinically significant.
• The results of intervention trials should be expressed in terms of
the likely benefit an individual could expect (for example, the
absolute risk reduction)
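As an illustration of the confidence interval point above, a 95% interval around an absolute risk reduction can be sketched with the simple normal (Wald) approximation. The counts are invented and the function name is hypothetical; other interval methods are preferable when events are rare.

```python
from math import sqrt
from statistics import NormalDist

def risk_difference_ci(events_e, n_e, events_u, n_u, level=0.95):
    """Absolute risk reduction (Pu - Pe) with a Wald confidence interval."""
    pe, pu = events_e / n_e, events_u / n_u
    diff = pu - pe
    se = sqrt(pe * (1 - pe) / n_e + pu * (1 - pu) / n_u)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return diff, diff - z * se, diff + z * se

# Hypothetical trial: 10/1000 events in the treated group, 40/1000 in controls
diff, lower, upper = risk_difference_ci(10, 1000, 40, 1000)
print(f"ARR={diff:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

An interval that excludes zero corresponds to a statistically significant difference; its width shows how precisely the "real" difference has been pinned down.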
Drug/ vaccine trials
• Clinical endpoints vs surrogate endpoints
• Surrogate endpoint: A surrogate end point may be defined as a variable
which is relatively easily measured and which predicts a rare or distant
outcome of either a toxic stimulus (such as a pollutant) or a therapeutic
intervention (a drug, surgical procedure, piece of advice, etc) but which is
not itself a direct measure of either harm or clinical benefit.
• Advantages:
– Can reduce sample size, duration & cost of clinical trials
– Can allow treatments to be assessed where primary outcomes are invasive/unethical
• Disadvantages:
– Does not answer the objective of treatment/best treatment in a patient
– May not be valid or reliable
– Over-reliance reflects a narrow clinical perspective
– As developed in animal models, extrapolation to humans may be invalid
Ideal surrogate end point
• reliable, reproducible, clinically available, easily
quantifiable, affordable, and show a "dose-response"
effect
• true predictor of disease or the risk of disease
• sensitive & specific
• precise cut off between normal and abnormal values
• acceptable positive predictive value and negative
predictive value
• amenable to quality control monitoring
• Changes in the surrogate end point should rapidly and
accurately reflect the response to treatment.
Systematic reviews & meta analysis
• Systematic review
– An overview of primary studies that used explicit
& reproducible methods
– Needs a thorough search of appropriate databases
Meta-analysis
• Is a mathematical synthesis of the results of two
or more primary studies that addressed the same
hypothesis in the same way
• Easier to assimilate than a collection of separate studies
• Driven by software such as “metaview”, with results presented in
a pictorial format
• Results are homogeneous if the results of each individual trial are
mathematically compatible with the results of the
others. Otherwise they are heterogeneous & need
further statistical tests, such as chi square.
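The mathematical synthesis and the chi square check for heterogeneity can be sketched with a fixed-effect, inverse-variance model, one common approach and not necessarily what any given software package does. The study estimates below (log relative risks with standard errors) are invented, and the function name is hypothetical.

```python
from math import sqrt

def fixed_effect_pool(estimates, std_errors):
    """Inverse-variance pooled estimate, its standard error, and
    Cochran's Q statistic with its degrees of freedom."""
    weights = [1 / se ** 2 for se in std_errors]  # precise studies weigh more
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = sqrt(1 / sum(weights))
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    return pooled, pooled_se, q, len(estimates) - 1

# Hypothetical log relative risks from three trials
log_rr = [-1.2, -1.0, -1.4]
se = [0.30, 0.25, 0.40]
pooled, pooled_se, q, df = fixed_effect_pool(log_rr, se)
print(f"pooled log RR={pooled:.2f} (SE {pooled_se:.2f}), Q={q:.2f} on {df} df")
```

Q is compared against a chi square distribution with k − 1 degrees of freedom for k trials; a Q much larger than its degrees of freedom suggests the trials are heterogeneous.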
Quantitative vs qualitative research
• Qualitative methods aim to make sense of, or
interpret phenomena in terms of meanings
people bring to them
• It defines preliminary questions which can then
be addressed in quantitative studies
• Addresses a clinical problem through a clearly
formulated question & using more than one
research method (triangulation)
• Analysis done using explicit, systematic and
reproducible methods
ITT vs Per-protocol analysis
• Failure to include all participants in the analysis
may bias the trial results
• Reasons for preferring intention-to-treat (ITT) analysis:
it preserves the randomised trial population and treatment
assignment, controls type I error in superiority trials,
and suits assessment of public health policy
• Per-protocol analysis (PPA): measures biological efficacy
• FDA & EMEA prefer ITT analysis
• Both are important for answering different scientific
questions & each has merits & demerits
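The difference between the two analysis sets can be shown on a toy dataset. Everything below is hypothetical: the `Participant` fields, the arm names and the counts are invented to illustrate how excluding non-adherent participants changes the estimate.

```python
from dataclasses import dataclass

@dataclass
class Participant:
    assigned: str    # arm allocated at randomisation
    adhered: bool    # completed the assigned schedule per protocol
    infected: bool   # outcome event during follow-up

def attack_rate(people):
    return sum(p.infected for p in people) / len(people)

def analyse(participants, arm):
    """Return (ITT attack rate, per-protocol attack rate) for one arm."""
    itt = [p for p in participants if p.assigned == arm]  # everyone randomised
    pp = [p for p in itt if p.adhered]                    # protocol completers only
    return attack_rate(itt), attack_rate(pp)

trial = (
    [Participant("vaccine", True, False)] * 7
    + [Participant("vaccine", True, True)]
    + [Participant("vaccine", False, True)]
    + [Participant("vaccine", False, False)]
    + [Participant("placebo", True, True)] * 4
    + [Participant("placebo", True, False)] * 6
)
print(analyse(trial, "vaccine"))  # (0.2, 0.125)
print(analyse(trial, "placebo"))  # (0.4, 0.4)
```

In this toy trial the per-protocol attack rate in the vaccine arm looks better than the ITT rate because the non-adherent participants, one of whom was infected, are dropped. ITT keeps the comparison between the groups as randomised.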
THANK YOU