
Biostat-Lecture Note-All-2019

Biostatistics and Epidemiology (Stat 4101)
Taddele Ch. (MSc.)
taddelecherinet@gmail.com /tkibr005@uottawa.ca
Addis Ababa University
Statistics Department
Winter, 2020
What is Statistics?
• The field of statistics: the study and use of theory and methods for the analysis of data arising from random processes.
• The study of how to make sense of data.
• Statistics is the science of learning from data, and of measuring, controlling, and communicating uncertainty.
• Statistics is also the art of conducting a study, analyzing the data, and deriving useful conclusions from numerical outcomes about real-life problems.
The two fields of Statistics
1. Mathematical statistics: the study and development of statistical theory and methods in the abstract; and
2. Applied statistics: the application of statistical methods to solve real problems involving randomly generated data, and the development of new statistical methodology motivated by real problems.
Classes of Statistics
• Descriptive statistics is the branch of statistics that includes methods for organizing and summarizing data.
• Inferential statistics is the branch of statistics that involves generalizing from a sample to the population from which the sample was selected and assessing the reliability of such generalizations.
Biostatistics
• Biostatistics is the branch of applied statistics directed toward applications in the health sciences and biology.
• e.g. clinical trials, epidemiology, pharmacology, medical decision making, comparative effectiveness research, etc.
• Why biostatistics?
  - Some statistical methods are more heavily used in health applications than elsewhere. Example: survival analysis, longitudinal data analysis.
  - Because examples are drawn from the health sciences.
Statistical Methods in Biostatistics
The data analysis process can be viewed as a sequence of steps that
lead from planning to data collection to making informed
conclusions based on the resulting data.
1. Understanding the nature of the problem: know the goal of
the research and what questions we hope to answer.
2. Deciding what to measure and how to measure it: what
information is needed to answer the questions of interest.
3. Data collection: decide whether an existing data source is
adequate or whether new data must be collected.
If you decide to use existing data, then understand how the data were collected and for what purpose.
4. Data summarization and preliminary analysis: a
preliminary analysis that includes summarizing the
data graphically and numerically.
5. Formal data analysis: select and apply statistical methods.
6. Interpretation of results: several questions should be addressed in this final step. Example:
- What can we learn from the data?
- What conclusions can be drawn from the analysis?
- How can our results guide future research?
Application of Biostatistics
1. Collection of vital statistics - for example, mortality rates -
used to inform about and to monitor the health status of the
population
2. Analysis of accident records - to find out the times during the
year when the greatest number of accidents occurred in a plant and
decide when the need for safety instruction is the highest
3. Clinical trials - to determine whether or not a new hypertension medication
performs better than the standard treatment for mild to moderate essential
hypertension
4. Surveys to estimate the proportion of low-income women of
child-bearing age with iron deficiency anemia
5. Studies to investigate whether or not exposure to electromagnetic fields is a risk factor for leukemia (a cancer caused by an overproduction of damaged white blood cells)
The Logic of Scientific Reasoning
• How do we go about deciding something is true?
• We have two tools at our disposal to pursue scientific inquiry:
  - We have our senses, through which we experience the world and make observations.
  - We have the ability to reason, which enables us to make logical inferences.
• In science we impose logic on those observations.
• All the logic in the world is not going to create an observation, and all the individual observations in the world won't in themselves create a theory.
In deductive inference, we hold a theory and based on it we
make a prediction of its consequences. That is, we predict what
the observations should be.
In inductive inference, we go from the specific to the general.
• We make many observations, discern a pattern, make a generalization, and infer an explanation.
• For example, it was observed in the Tikur Anbesa Hospital in the 1990s that women giving birth were dying at a high rate of puerperal fever, a generalization that provoked terror in prospective mothers.
• Induction is based on our belief that the things unobserved will be like those observed or that the future will be like the past.
• Asking which comes first, theory or observation, is like asking which comes first, the chicken or the egg.
• Theories, then, can be used to predict observations.
• But these observations will not always be exactly as we predict them, due to
  - error, and
  - the inherent variability of natural phenomena.
• If observations are widely different from our predictions, we will have to abandon or modify the theory.
• How do we test the extent of the discordance of our predictions based on theory from the reality of our observations?
  - The test is a statistical or probabilistic test.
• Statistics is a methodology with broad areas of application in
  - science and industry,
  - medicine, and
  - many other fields.
• A phenomenon may be principally based on
a) A deterministic model
  - Example: the gas laws. For a fixed volume, an increase in the temperature of a gas determines that there is an increase in pressure. (This pressure-temperature relationship is usually credited to Gay-Lussac; Boyle's law relates pressure and volume at a fixed temperature.)
b) A probabilistic model
  - which implies that various states of a phenomenon occur with certain probabilities.
“The presence of variation requires the use of statistical analysis” (Arias E, Smith BL., 2003).
• When there is little variation with respect to a phenomenon, much more weight is given to a small amount of evidence than when there is a great deal of variation.
  - If a drug cures even a few patients of pancreatic cancer (an invariably fatal disease), that evidence carries more weight.
  - To determine whether vitamin C cures colds, more patients should be involved, since there may be biological variability among patients.
Review of Probability and Statistics
Definition: The entire collection of individuals or objects about which information is desired is called the population of interest.
• In many situations, we cannot observe the full population.
• A sample is a subset of the population, selected for study.
• An individual subject/object in a population is called an experimental unit.
[Figure: a representative sample is drawn from the population, and inference runs from the sample back to the population, subject to uncertainty and reliability.]
Definition of Probability
- A probability is a number between 0 and 1 that reflects the likelihood of occurrence of some outcome.
  Intuitively: probability is relative frequency in the population.
  Formally: random experiment -> events -> probabilities.
- The probability of an outcome, denoted by P(outcome), is interpreted as the proportion of the time that the outcome occurs in the long run.
Some properties
1. The probability of any outcome is a number between 0 and 1.
2. If outcomes cannot occur simultaneously, then the probability that any one of them will occur is the sum of the outcome probabilities.
3. The probability that an outcome will not occur is equal to 1 minus the probability that the outcome will occur.
Independence
• Independent outcomes: two outcomes are said to be independent if the probability that one outcome occurs is not affected by knowledge of whether the other outcome has occurred.
  - If there are more than two outcomes under consideration, they are independent if knowledge that some of the outcomes have occurred does not change the probabilities that any of the other outcomes have occurred.
• Dependent outcomes: if the occurrence of one outcome changes the probability that the other outcome occurs, the outcomes are dependent.
Conditional Probability
• Now let us consider the case where the chance that a particular event happens is dependent on the outcome of another event.
• The probability of A, given that B has occurred, is called the conditional probability of A given B, and is written symbolically as P(A|B).
• When we speak of conditional probability,
  - the denominator becomes all the outcomes in the condition, not all possible outcomes, and
  - the numerator consists of those outcomes that are in both the condition and the conditioned event.
[Figure: Venn diagram of events A and B in a sample space of N outcomes, with the overlap A ∩ B.]
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
Bayesian Probability
• Imagine that M is the event “loss of memory,” and B is the event “brain tumor.”
• We can establish from research on brain tumor patients the probability of memory loss given a brain tumor, P(M|B).
• A clinician, however, is more interested in the probability of a brain tumor, given that a patient has memory loss, P(B|M).
• P(B|M) is difficult to obtain directly: we would have to study the vast number of persons with memory loss (which in most cases comes from other causes) and determine what proportion of them have brain tumors.
$$P(B \mid M) = \frac{P(M \mid B)\,P(B)}{P(M)} = \frac{P(M \mid B)\,P(B)}{P(M \mid B)\,P(B) + P(M \mid B^c)\,P(B^c)}$$
• “Memory loss, M” can occur either
  - among people with a brain tumor, with probability P(M|B)P(B), or
  - among people with no brain tumor, with probability P(M|B^c)P(B^c).
Example 2
• To study the proportion of smokers by sex, a random sample of 200 persons was taken from a population. The following table shows the result.

Sex       Non-smoker   Smoker   Total
Male          64          16       80
Female        42          78      120
Total        106          94      200

a) What is the probability of getting a non-smoker given that the person selected is a female?
b) What is the probability of getting a male given that the person selected is a smoker?
Solution
- P(M) = 80/200, P(F) = 120/200
- P(S) = 94/200, P(N) = 106/200
- P(M and S) = 16/200, P(F and N) = 42/200
a) P(N|F) = P(N and F)/P(F) = 42/120 = 0.35
b) P(M|S) = P(M and S)/P(S) = 16/94 = 0.17
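The two conditional probabilities in this example can be checked directly from the cell counts. The following is a minimal sketch; the dictionary layout and the helper names (`probability`, `conditional`, `is_female`, and so on) are illustrative choices, not part of the lecture notes.

```python
from fractions import Fraction

# Cell counts from the sex-by-smoking table (sample of 200 persons).
counts = {
    ("male", "non_smoker"): 64,
    ("male", "smoker"): 16,
    ("female", "non_smoker"): 42,
    ("female", "smoker"): 78,
}

def probability(event):
    """Relative frequency of the cells for which event(key) is true."""
    total = sum(counts.values())
    favourable = sum(n for key, n in counts.items() if event(key))
    return Fraction(favourable, total)

def conditional(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    joint = probability(lambda key: event(key) and given(key))
    return joint / probability(given)

def is_male(key):
    return key[0] == "male"

def is_female(key):
    return key[0] == "female"

def is_smoker(key):
    return key[1] == "smoker"

def is_non_smoker(key):
    return key[1] == "non_smoker"

p_n_given_f = conditional(is_non_smoker, is_female)  # 42/120
p_m_given_s = conditional(is_male, is_smoker)        # 16/94

print(float(p_n_given_f))
print(round(float(p_m_given_s), 2))
```

Exact fractions are used so that the intermediate division by P(F) or P(S) introduces no rounding.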
Exercise
A study investigated the effect of prolonged exposure to bright light on retina damage in premature infants.

Retinopathy   Bright light   Reduced light   Total
Yes                18              21          39
No                  3              18          21
Total              21              39          60

Find the probability of retinopathy, given that the infant was exposed to bright light.
Solution
The probability of developing retinopathy is:
$$P(\text{Retinopathy}) = \frac{\text{No. of infants with retinopathy}}{\text{Total no. of infants}} = \frac{18 + 21}{21 + 39} = 0.65$$
We want to compare the probability of retinopathy, given that the infant was exposed to bright light, with the probability given that the infant was exposed to reduced light.
Exposure to bright light and exposure to reduced light are conditioning events: events we want to take into account when calculating conditional probabilities.
The conditional probability of retinopathy, given exposure to bright light, is:
$$P(\text{Retinopathy} \mid \text{bright light}) = \frac{\text{No. of infants with retinopathy exposed to bright light}}{\text{No. of infants exposed to bright light}} = \frac{18}{21} = 0.86$$
Similarly, given exposure to reduced light:
$$P(\text{Retinopathy} \mid \text{reduced light}) = \frac{\text{No. of infants with retinopathy exposed to reduced light}}{\text{No. of infants exposed to reduced light}} = \frac{21}{39} = 0.54$$
The conditional probabilities suggest that premature infants exposed to bright light have a higher risk of retinopathy than premature infants exposed to reduced light.
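The same comparison can be written as a short computation on the exposure groups. This is only a sketch: the function name `risk` is hypothetical, and the risk ratio at the end is an extra summary that the notes themselves do not compute.

```python
# Counts per exposure group, taken from the retinopathy table.
bright = {"retinopathy": 18, "no_retinopathy": 3}
reduced = {"retinopathy": 21, "no_retinopathy": 18}

def risk(group):
    """Conditional probability of retinopathy within one exposure group."""
    return group["retinopathy"] / sum(group.values())

risk_bright = risk(bright)    # 18/21
risk_reduced = risk(reduced)  # 21/39

# Unconditional probability, pooling both groups: (18 + 21) / 60.
overall = (bright["retinopathy"] + reduced["retinopathy"]) / (
    sum(bright.values()) + sum(reduced.values())
)

# Not in the notes: ratio of the two conditional probabilities.
risk_ratio = risk_bright / risk_reduced

print(round(overall, 2), round(risk_bright, 2), round(risk_reduced, 2))
```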
Applications: Diagnostic test
• Assume that there is a disease D.
  - Let D+ be the event that a patient has the disease D, and
  - let D- be the event that a patient does not have the disease D.
• Assume that there exists a test T to diagnose this disease.
  - Let T+ be a positive test result, and
  - let T- be a negative test result.
Sensitivity, P(T+ | D+): the sensitivity of a symptom (or set of symptoms or screening test) is the probability that the symptom is present given that the person has the disease.
Specificity, P(T- | D-): the specificity of a symptom (or set of symptoms or screening test) is the probability that the symptom is not present given that the person does not have the disease.
  - Intuitively, we expect a "good" test to identify the ill persons and to discriminate the non-ill persons.
  - Therefore, a "good" test has a value close to 1 for both the sensitivity and the specificity.
• When a wrong conclusion is made,
  - P(T- | D+) is the probability of a false negative result;
  - P(T+ | D-) is the probability of a false positive result.
• When we develop a new test, we apply it to a group of patients to get a value for the sensitivity and specificity.
• However, in practice the test will be used to diagnose an individual.
• Hence, we are interested in the probability that a person is ill when he/she has a positive test result.
  - This is the Predictive Positive Value (PPV), P(D+ | T+).
  - By Bayes' rule, this is given by
$$P(D^+ \mid T^+) = \frac{P(T^+ \mid D^+)\,P(D^+)}{P(T^+ \mid D^+)\,P(D^+) + P(T^+ \mid D^-)\,P(D^-)}$$
  - This leads to
$$P(D^+ \mid T^+) = \frac{\text{sensitivity} \times P(D^+)}{\text{sensitivity} \times P(D^+) + (1 - \text{specificity}) \times (1 - P(D^+))}$$
• P(D+) is the prevalence of the disease. This is the probability that any person is ill.
• Also, we are interested in the probability that a person is not ill when he/she has a negative test result.
• This is the Predictive Negative Value (PNV), P(D- | T-).
$$P(D^- \mid T^-) = \frac{P(T^- \mid D^-)\,P(D^-)}{P(T^- \mid D^-)\,P(D^-) + P(T^- \mid D^+)\,P(D^+)}$$
Applications: Example 1
Example: In oncology, we have a test for a rare type of cancer. The sensitivity of this test is 99% and its specificity is 98%. The prevalence of this cancer is 0.005. Compute the predictive positive value.
Solution:
$$P(D^+ \mid T^+) = \frac{\text{sensitivity} \times P(D^+)}{\text{sensitivity} \times P(D^+) + (1 - \text{specificity}) \times (1 - P(D^+))} = \frac{0.99 \times 0.005}{0.99 \times 0.005 + (1 - 0.98) \times 0.995} = 0.20$$
We note that although we have a very accurate test, only 20% of the persons with a positive test result have the disease!
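The predictive values follow mechanically from sensitivity, specificity and prevalence, so they are easy to compute for other scenarios. The sketch below assumes a hypothetical helper name, `predictive_values`; it applies Bayes' rule exactly as in the formulas for P(D+ | T+) and P(D- | T-).

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, PNV) for a diagnostic test via Bayes' rule."""
    # P(T+) = P(T+|D+)P(D+) + P(T+|D-)P(D-)
    p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    ppv = sensitivity * prevalence / p_positive
    # P(T-) = P(T-|D-)P(D-) + P(T-|D+)P(D+)
    p_negative = specificity * (1 - prevalence) + (1 - sensitivity) * prevalence
    pnv = specificity * (1 - prevalence) / p_negative
    return ppv, pnv

# Values from the oncology example: sensitivity 99%, specificity 98%, prevalence 0.005.
ppv, pnv = predictive_values(0.99, 0.98, 0.005)
print(round(ppv, 2))
```

Re-running the function with a larger prevalence shows how strongly the PPV depends on how common the disease is, which is the point of the example.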
Applications: Example 2
Example: Here is a simplified version of how genes code eye color,
assuming only two colors of eyes.
- Each person has two genes for eye color.
- Each gene is either B or b.
- A child receives one gene from each of its parents.
  - The gene it receives from its father is one of its father’s two genes, each with probability 1/2; and similarly for its mother.
  - The genes received from father and mother are independent.
• If your genes are BB or Bb or bB, you have brown eyes;
• if your genes are bb, you have blue eyes.
• Suppose that John has brown eyes. So do both of John’s parents, and his sister has blue eyes. What is the probability that John’s genes are BB?
Solution
- John’s sister has genes bb, so one b must have come from each parent.
- Thus each of John’s parents is Bb or bB; we may assume Bb. So the possibilities for John are (writing the gene from his father first) BB, Bb, bB, bb, each with probability 1/4.
- John gets his father’s B gene with probability 1/2 and his mother’s B gene with probability 1/2, and these are independent, so the probability that he gets BB is 1/4.
• Let X be the event ‘John has BB genes’ and Y the event ‘John has brown eyes’. Then X = {BB} and Y = {BB, Bb, bB}. The question asks us to calculate P(X | Y). This is given by
$$P(X \mid Y) = \frac{P(X \cap Y)}{P(Y)} = \frac{1/4}{3/4} = \frac{1}{3}$$
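The answer of 1/3 can be confirmed by enumerating the four equally likely gene pairs. A minimal sketch, assuming (as in the solution) that both parents are Bb; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Both parents are Bb; each passes on either gene with probability 1/2,
# independently, so every (father gene, mother gene) pair has probability 1/4.
father, mother = "Bb", "Bb"
outcomes = {}
for f_gene, m_gene in product(father, mother):
    genotype = f_gene + m_gene  # gene from the father written first
    outcomes[genotype] = outcomes.get(genotype, 0) + Fraction(1, 4)

# Y: John has brown eyes (at least one B). X: John is BB (a subset of Y).
p_y = sum(p for genotype, p in outcomes.items() if "B" in genotype)
p_x_and_y = outcomes["BB"]
p_bb_given_brown = p_x_and_y / p_y

print(p_bb_given_brown)
```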
Odds and Probability
The odds are simply the ratio of the proportions for the two possible outcomes (success/failure).
When the odds of a particular horse losing a race are said to be 4 to 1, he has a 4/5 = 0.80 probability of losing.
To convert an odds statement to a probability, we add 4 + 1 to get our denominator of 5. The odds of the horse winning are 1 to 4, which means he has a probability of winning of 1/5 = 0.20.
If p is the proportion for one outcome, then 1 − p is the proportion for the second outcome:
$$\text{odds} = \frac{\text{proportion of successes}}{\text{proportion of failures}} = \frac{p}{1 - p}, \qquad p = \frac{\text{odds}}{1 + \text{odds}}$$
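The two conversion formulas are inverses of each other, which a pair of small functions makes concrete. The function names are hypothetical; the numbers are those of the horse-race example.

```python
def odds_from_probability(p):
    """odds = p / (1 - p), for 0 <= p < 1."""
    return p / (1 - p)

def probability_from_odds(odds):
    """p = odds / (1 + odds), for odds >= 0."""
    return odds / (1 + odds)

# Odds of losing are 4 to 1, i.e. odds = 4/1.
p_lose = probability_from_odds(4 / 1)
# Odds of winning are 1 to 4, i.e. odds = 1/4.
p_win = probability_from_odds(1 / 4)

print(p_lose, p_win)
```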
Inference
• We described statistical inference as the branch of statistics that involves generalizing from a sample to the population from which it was selected.
• Interest usually centers on the value of one or more variables.
• A variable associates a value with each individual or object in a population.
• A variable can be either categorical or numerical, depending on its possible values.
• Numerical variables can be either discrete or continuous.
  - A discrete numerical variable is one whose possible values are isolated points along the number line.
  - A continuous numerical variable is one whose possible values form an interval along the number line.
Estimation
In general terms, estimation uses a sample statistic as the basis for
estimating(approximating) the value of the corresponding
population parameter.
It is also common to use estimation in situations where a researcher
simply wants to learn about an unknown population.
Given several unbiased statistics that could be used for estimating a
population characteristic, the best choice to use is the statistic with
the smallest standard deviation.
Point estimation
• A point estimate is an estimate for a parameter that is one numerical value. An example of a point estimate is the sample mean or the sample proportion.
• A statistic whose mean value is equal to the value of the population characteristic being estimated is said to be an unbiased statistic. A statistic that is not unbiased is said to be biased.
Confidence interval
• A confidence interval (CI): an interval of values computed from sample data that is likely to cover the true parameter of interest.
• The interpretation of a CI: "We are [confidence level]% confident that the [population parameter of interest] lies between [lower bound] and [upper bound]."
• The confidence level associated with a confidence interval estimate is the success rate of the method used to construct the interval.
• The standard error of a statistic is the estimated standard deviation of the statistic.
Point & Interval Estimation
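• Point estimator: draws inference using a single number/value.
  - It does not reflect the effect of larger sample sizes.
• Interval estimator: draws inference using a range of values that is expected to contain the parameter with a stated level of confidence.
[Figure: a confidence interval on the number line, running from the lower confidence limit to the upper confidence limit; the distance between the two limits is the width of the confidence interval.]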
Properties of Good Estimators

• Unbiased: an estimator whose expected value is equal to the parameter it estimates.
• Consistent: the difference between the estimator and the parameter grows smaller as the sample size grows larger.
• Relatively efficient: of two unbiased estimators, the one whose variance is smaller.
Estimating the Population Mean when the
Population Standard Deviation is Known
• How is an interval estimator produced from a sampling distribution?
  - To estimate µ, a sample of size n is drawn from the population, and its mean x̄ is calculated.
  - Under certain conditions, x̄ is normally distributed (or approximately normally distributed), thus
$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
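• We know that
$$P\left(\mu - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \bar{x} \le \mu + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$
  - This leads to the relationship
$$P\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$
A proportion 1 − α of all the values of x̄ obtained in repeated sampling from this distribution yield an interval that includes (covers) the expected value of the population:
$$\left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$$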
[Figure: the confidence interval on the number line. It is centered at x̄, runs from the lower confidence limit x̄ − z_{α/2} σ/√n to the upper confidence limit x̄ + z_{α/2} σ/√n, has width 2 z_{α/2} σ/√n, and has confidence level 1 − α.]
See the simulation results demonstrating this point.
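• The confidence intervals are correct most, but not all, of the time.
[Figure: 100 simulated confidence intervals, each drawn from its lower confidence limit (LCL) to its upper confidence limit (UCL); horizontal axis: sample index, 0 to 100; vertical axis: 0 to 150.]
Not all the confidence intervals cover the real expected value of 100. The selected confidence level is 90%, and 10 out of 100 intervals do not cover the real µ.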
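The coverage behaviour can be reproduced with a small simulation. This is a sketch under stated assumptions: the population is taken to be normal with mean 100 (the value quoted for the figure), while σ = 20, n = 25, the number of repetitions and the seed are hypothetical choices that the notes do not specify.

```python
import random
from statistics import NormalDist, mean

def coverage(mu=100.0, sigma=20.0, n=25, level=0.90, reps=2000, seed=1):
    """Fraction of known-sigma z-intervals that cover the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # z_{alpha/2}
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        xbar = mean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps

observed_coverage = coverage()
print(observed_coverage)
```

The printed fraction should be close to, but usually not exactly, 0.90: the confidence level describes the long-run success rate of the method, not any single interval.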
Example:
• The number and the types of television programs and commercials targeted at children are affected by the amount of time children watch TV. A survey was conducted among 100 North American children, in which they were asked to record the number of hours they watched TV per week. The population standard deviation of TV watching time was known to be σ = 8.0. Suppose that the sample mean is 27.191.
• Estimate the mean watch time with a 95% confidence level.
Solution
• The parameter to be estimated is µ, the mean time of TV watching per week per child (of all American children).
• We need to compute the interval estimator for µ, with x̄ = 27.191.
• Since 1 − α = 0.95, α = 0.05. Thus α/2 = 0.025 and z_{0.025} = 1.96.
$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 27.191 \pm z_{0.025}\frac{8.0}{\sqrt{100}} = 27.191 \pm 1.96\,\frac{8.0}{\sqrt{100}} = 27.191 \pm 1.57 = [25.621,\ 28.761]$$
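The interval can also be computed programmatically. A minimal sketch using only the Python standard library; `z_interval` is a hypothetical helper name. Because it uses the unrounded half-width (about 1.568 rather than 1.57), its limits differ from the hand calculation in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, level=0.95):
    """Confidence interval for a mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # z_{alpha/2}, about 1.96 for 95%
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

# TV example: x-bar = 27.191, sigma = 8.0, n = 100.
low, high = z_interval(27.191, 8.0, 100)
print(round(low, 3), round(high, 3))
```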
Hypothesis Testing
• The purpose of hypothesis testing is to determine whether there is enough statistical evidence in favor of a certain belief about a parameter.
  - Is a new drug effective in curing a certain disease? A sample of patients is randomly selected. Half of them are given the drug while the other half are given a placebo. The improvement in the patients' conditions is then measured and compared.
• Null and alternative hypotheses refer to population values, not observed statistics.
Concept of hypothesis testing
• The critical concepts of hypothesis testing:
  - There are two hypotheses (about one or more population parameters):
    H0, the null hypothesis [for example, µ = 5]
    H1, the alternative hypothesis [µ > 5]. This is what you want to prove.
  - Assume the null hypothesis is true.
  - Build a statistic related to the parameter hypothesized.
  - Pose the question: how probable is it to obtain a statistic value at least as extreme as the one observed from the sample?
[Figure: sampling distribution centered at µ = 5, with the observed x̄ marked on the horizontal axis.]
• Make one of the following two decisions (based on the test):
  - Reject the null hypothesis in favor of the alternative hypothesis.
  - Do not reject the null hypothesis in favor of the alternative hypothesis.
Example - Efficacy Test for New drug
• A drug company has a new drug and wishes to compare it with the current standard treatment.
  - Federal regulators tell the company that it must demonstrate that the new drug is better than the current treatment to receive approval.
  - The firm runs a clinical trial where some patients receive the new drug, and others receive the standard treatment.
  - A numeric response of therapeutic effect is obtained.
  - The mean difference of the therapeutic effects is computed.
• After these steps, a decision is made.
Possible Outcomes of Statistical Decision
α = P(Type I error): an ineffective drug is deemed better.
β = P(Type II error): an effective drug is deemed to be no better.
• Goal: keep α and β reasonably small.
• The power of a test is the probability that the test will reject the null hypothesis when the treatment does have an effect.
Hypothesis Testing-One Sample
• The test statistic is converted to a conditional probability called a p-value.
  - The p-value answers the question “If the null hypothesis were true, what is the probability of observing the current data or data that are more extreme?”
  - Small p-values provide evidence against the null hypothesis because they say the observed data are unlikely when the null hypothesis is true.
  - “Significant” means “the observed difference is not likely due to chance.” It does not mean “important” or “meaningful.”
Warnings!
• Failure to reject the null hypothesis leads to its acceptance. (WRONG! Failure to reject the null hypothesis implies insufficient evidence for its rejection.)
• The p value is the probability that the null hypothesis is incorrect. (WRONG! The p value is the probability of the current data or data that are more extreme, assuming H0 is true.)
• α = .05 is a standard with an objective basis. (WRONG! α = .05 is merely a convention that has taken on unwise mechanical use.)
NB: There is no sharp distinction between “significant” and “insignificant” results, only increasingly strong evidence as the p value gets smaller.
• Small p values indicate large effects. (WRONG! p values tell you nothing about the size of an effect.)
• Data show a theory to be true or false. (WRONG! Data can at best serve to support or refute a theory or claim.)
• Statistical significance implies importance. (WRONG! WRONG! WRONG! Statistical significance says very little about the importance of a relation.)
- Determine whether prenatal alcohol exposure affects birth weight or not.
- A sample is selected from the original population and is given alcohol.
- The question is what would happen if the entire population were given alcohol.
- The treated sample provides information about the unknown treated population.
Hypothesis Testing-Two Sample
• Often we want to compare one group to another.
• What happens when we are comparing two samples?
• There is variability in both samples, and the two samples are potentially related.
• Having an accurate measure of tumor size is extremely important because it allows a physician to accurately determine if a tumor is growing, shrinking or remaining constant.
• The problem is that the measurements of tumor size often vary from physician to physician.
 Measure by RECIST method rather linear distance across the tumor
• For a portion of the study, a pair of doctors were shown the same set of tumor pictures.
• The volume of the tumor was measured by two separate physicians under similar conditions.
• Question of interest: did the measurements from the two physicians significantly differ?
• If not, then there would be no evidence that the volume measurements change based on physician.
• 20 scans were measured by each physician.
• Measurements are in cm³. What can you say about these samples?
Example-Paired
• The two measurements are made on the same person, so they are related and we must account for this.
• Measure the effect of the treatment in each person by taking the difference.
• Instead of having two samples, consider our dataset to be one sample of differences.
 Volume from Dr. 1
 Population mean:
 Sample mean
 Volume from Dr. 2
 Population mean:
 Sample mean:
Difference

Population mean:

Sample mean:
Use the t-distribution with n − 1 df, where n is the number of differences, together with the standard deviation of the differences.
Step 1: Hypothesis: there is no physician effect (the mean difference is zero).
Step 2: Level of significance: α = 0.05.
Step 3: Test statistic: t (small sample size).
Step 4: Decision: do not reject the null hypothesis.
Step 5: Conclusion: there is no evidence of a difference in tumor volume measurement based on physician.
• Confidence interval for the paired t-test: for our example, the confidence interval is (-1.01, 0.54).
• Note that the conclusions from the hypothesis test and the confidence interval agree: the interval contains 0.
• Other examples:
  - Differences between left and right eye
  - Differences between dominant and non-dominant hand
  - Matched samples
Example-two sample independent
• Often it is impractical to design a study that uses the same patients for both groups.
  Example: comparison of cholesterol in males and females.
• Since the samples are not paired, we cannot use the differences between individual observations.
• Compare the tumor volume among patients with different forms of cancer. The average tumor size is important to know so that the effect of treatment can be determined.
• The null hypothesis is that there is no difference between the mean volume of the tumor in the two forms of cancer:
  H0: µ_brain = µ_breast, or µ_brain − µ_breast = 0
• More generally, we can test whether the difference between two groups is a specific value, µ1 − µ2 = D.
• This occurs when comparing two treatment groups and we are interested in whether the two groups differ by a specific amount.
Basic form of the test statistic
• Known variance
• Unknown variance
  - Case 1) Equal but unknown variances: σ is estimated by the pooled standard deviation, and the test statistic follows a t-distribution.
  - Case 2) Unequal and unknown variances: the test statistic follows approximately a t-distribution with ν df (Satterthwaite or Welch approximation).
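For reference, the standard textbook forms of these statistics can be written as follows. The notation is assumed here rather than taken from the slides: $\bar{x}_i$, $s_i^2$ and $n_i$ are the sample mean, sample variance and size of group $i$, $D$ is the hypothesized difference, and $\nu$ is the Welch-Satterthwaite degrees of freedom.

```latex
% Known variances
z = \frac{(\bar{x}_1 - \bar{x}_2) - D}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}

% Case 1: unknown but equal variances (pooled), with n_1 + n_2 - 2 df
t = \frac{(\bar{x}_1 - \bar{x}_2) - D}{s_p\sqrt{1/n_1 + 1/n_2}},
\qquad
s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}

% Case 2: unknown and unequal variances (Welch), with approximately \nu df
t = \frac{(\bar{x}_1 - \bar{x}_2) - D}{\sqrt{s_1^2/n_1 + s_2^2/n_2}},
\qquad
\nu = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}
           {\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}
```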
Comparing More than two Means
Analysis of Variance (ANOVA)
• Idea: for two or more groups, test the difference between means, for quantitative, normally distributed variables.
• It is just an extension of the t-test (an ANOVA with only two groups is mathematically equivalent to a t-test).
• It’s like this: if there are three groups to compare,
  - we could do three pairwise t-tests, but this would increase the type I error;
  - so, instead, look at the pairwise differences “all at once.”
  - To do this, recognize that variance is a statistic that allows us to look at more than one difference at a time.
  - We will look at two measures of variation: overall variance versus individual differences.
• Use a ratio of the two: between-group variation / within-group variation.
$$F = \frac{\text{Variability between groups}}{\text{Variability within groups}}$$
  - The numerator summarizes the mean differences between all groups at once.
  - The denominator is analogous to the pooled variance from a t-test.
Example
Treatment 1   Treatment 2   Treatment 3   Treatment 4
y1,1          y2,1          y3,1          y4,1
y1,2          y2,2          y3,2          y4,2
y1,3          y2,3          y3,3          y4,3
y1,4          y2,4          y3,4          y4,4
y1,5          y2,5          y3,5          y4,5
y1,6          y2,6          y3,6          y4,6
y1,7          y2,7          y3,7          y4,7
y1,8          y2,8          y3,8          y4,8
y1,9          y2,9          y3,9          y4,9
y1,10         y2,10         y3,10         y4,10

The group means:
$$\bar{y}_{i\bullet} = \frac{1}{10}\sum_{j=1}^{10} y_{ij}, \qquad i = 1, \dots, 4$$
The (within) group variances:
$$\frac{\sum_{j=1}^{10} (y_{ij} - \bar{y}_{i\bullet})^2}{10 - 1}, \qquad i = 1, \dots, 4$$
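SSW + SSB = TSS:
$$\underbrace{\sum_{i=1}^{4}\sum_{j=1}^{10} (y_{ij} - \bar{y}_{i\bullet})^2}_{\text{SSW}} + \underbrace{10\sum_{i=1}^{4} (\bar{y}_{i\bullet} - \bar{y}_{\bullet\bullet})^2}_{\text{SSB}} = \underbrace{\sum_{i=1}^{4}\sum_{j=1}^{10} (y_{ij} - \bar{y}_{\bullet\bullet})^2}_{\text{TSS}}$$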
One Way ANOVA Example
• Assume “treatment results” from 13 patients visiting one of three doctors are given:
  - Doctor A: 24, 26, 31, 27
  - Doctor B: 29, 31, 30, 36, 33
  - Doctor C: 29, 27, 34, 26
• H0: The treatment results are from the same population of results.
• H1: They are from different populations.
• Averages within groups:
  - Doctor A: 27
  - Doctor B: 31.8
  - Doctor C: 29
• Total average:
$$\frac{4 \times 27 + 5 \times 31.8 + 4 \times 29}{4 + 5 + 4} = 29.46$$
• Sum of squares within groups:
$$SSW = (24 - 27)^2 + (26 - 27)^2 + \dots + (29 - 31.8)^2 + \dots = 94.8$$
• Compare it with the sum of squares between groups:
$$SSB = (27 - 29.46)^2 + (27 - 29.46)^2 + \dots + (31.8 - 29.46)^2 + \dots = 4(27 - 29.46)^2 + 5(31.8 - 29.46)^2 + 4(29 - 29.46)^2 = 52.43$$
• Comparing these, we also need to take into account the number of observations and the number of groups:
$$MSW = \frac{SSW}{n - K} = \frac{94.8}{13 - 3} = 9.48$$
$$MSB = \frac{SSB}{K - 1} = \frac{52.43}{3 - 1} = 26.2$$
$$F = \frac{MSB}{MSW} = \frac{26.2}{9.48} = 2.76, \qquad F_{3-1,\,13-3,\,0.05} = 4.10$$
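The sums of squares and the F ratio for the three doctors can be recomputed from the raw data. This is a sketch rather than a general-purpose routine: the function name `one_way_anova` is hypothetical, and the closed-form p-value used at the end, (1 + 2F/df2)^(-df2/2), is valid only when the numerator has exactly 2 degrees of freedom, as it does here.

```python
from statistics import mean

# Treatment results for the 13 patients, grouped by doctor.
groups = {
    "A": [24, 26, 31, 27],
    "B": [29, 31, 30, 36, 33],
    "C": [29, 27, 34, 26],
}

def one_way_anova(groups):
    """Return SSW, SSB, F and the (between, within) degrees of freedom."""
    all_values = [y for ys in groups.values() for y in ys]
    grand_mean = mean(all_values)
    n, k = len(all_values), len(groups)
    # Within-group sum of squares: deviations from each group's own mean.
    ssw = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)
    # Between-group sum of squares: group means vs. grand mean, weighted by size.
    ssb = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in groups.values())
    f_stat = (ssb / (k - 1)) / (ssw / (n - k))
    return ssw, ssb, f_stat, (k - 1, n - k)

ssw, ssb, f_stat, (df_between, df_within) = one_way_anova(groups)

# Upper-tail probability of F; this shortcut holds only for df_between == 2.
p_value = (1 + 2 * f_stat / df_within) ** (-df_within / 2)

print(round(ssw, 1), round(ssb, 2), round(f_stat, 2), round(p_value, 3))
```

Since F is below the critical value 4.10 (equivalently, the p-value is above 0.05), the null hypothesis is not rejected.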
                 Sum of Squares   df   Mean Square     F      Sig.
Between Groups        52.431       2      26.215     2.765   0.111
Within Groups         94.800      10       9.480
Total                147.231      12

In SPSS, use “Analyze => Compare Means => One-way ANOVA”.
Do NOT reject the null hypothesis.
A statistically significant ANOVA (F-test) only tells you that at least two of the groups differ, but not which ones differ.
Determining which groups differ (when it’s unclear) requires more sophisticated analyses to correct for the problem of multiple comparisons…
• Why not just do all possible pairwise t-tests?
• Answer: because, at an error rate of 5% for each test, you have an overall chance of up to 1 − (0.95)^3 = 14% of making a type I error (if all 3 comparisons were independent).
• If you wanted to compare 6 groups, you’d have to do 6C2 = 15 pairwise t-tests, which would give you a high chance of finding something significant just by chance (if all tests were independent with a type I error rate of 5% each); probability of at least one type I error = 1 − (0.95)^15 = 54%.
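The growth of the overall error rate with the number of groups is easy to tabulate. A minimal sketch, assuming (as the text does) that all pairwise tests are independent; `familywise_error_rate` is a hypothetical helper name.

```python
from math import comb

def familywise_error_rate(n_groups, alpha=0.05):
    """Number of pairwise tests and P(at least one type I error),
    assuming the tests are independent."""
    n_tests = comb(n_groups, 2)
    return n_tests, 1 - (1 - alpha) ** n_tests

for k in (3, 6):
    n_tests, fwer = familywise_error_rate(k)
    print(k, n_tests, round(fwer, 2))
```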
Correction for multiple comparisons
 If your ANOVA test identifies a difference between group means,
then you must identify which of your k groups differ.
 If you did not specify the comparisons of interest (“contrasts”)
ahead of time, then you have to pay a price for making all kC2 = k(k − 1)/2
pairwise comparisons to keep the overall type-I error rate at α.
 Bonferroni
 For example, to make a Bonferroni correction, divide your desired alpha cut-off
level (usually .05) by the number of comparisons you are making.
The correction does not require the comparisons to be independent, but it is
conservative, especially when there are many or positively correlated comparisons
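As a small illustration of the Bonferroni rule, the sketch below divides α by the number of pairwise comparisons among three groups. The p-values are hypothetical and are there only to show how the adjusted threshold is applied; they are not results from the doctor example.

```python
from math import comb

def bonferroni_threshold(n_comparisons, alpha=0.05):
    """Per-comparison significance level under a Bonferroni correction."""
    return alpha / n_comparisons

m = comb(3, 2)                         # 3 pairwise comparisons among 3 groups
threshold = bonferroni_threshold(m)    # 0.05 / 3

# Hypothetical p-values, for illustration only.
p_values = {"A vs B": 0.004, "A vs C": 0.030, "B vs C": 0.200}
significant = {pair: p < threshold for pair, p in p_values.items()}

print(f"per-comparison threshold = {threshold:.4f}")
print(significant)
```

Note that "A vs C" would count as significant at the unadjusted 0.05 level but not after the correction.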
92
Non parametric Tests
 Most of the statistical methods referred to as parametric require the
use of interval- or ratio-scaled data.
 Nonparametric methods are often the only way to analyze nominal
or ordinal data and draw statistical conclusions.
 Nonparametric methods require few, if any, assumptions about the form of the
population probability distributions.
 Non parametric methods are often called distribution-free methods
93
Example: Chi-square test
Here we have the results of a poll that asked people’s opinions about
the use of the death penalty as opposed to life in prison.
$$\chi^2 = \sum_i \sum_j \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = 20.02$$
$$\chi^2_{0.05,\,2} = 5.99$$
H0: distribution of female preferences matches distribution of male preferences
HA: female proportions do not match male proportions
Since 20.02 > 5.99, reject H0
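The poll counts themselves are not reproduced in these notes, so the following sketch uses a hypothetical 2 × 3 table purely to show how the statistic is built: expected counts come from the row and column totals, and the degrees of freedom are (rows − 1)(columns − 1). The function name and the counts are assumptions.

```python
def chi_square_statistic(observed):
    """Pearson chi-square statistic and degrees of freedom for a
    two-way table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / total   # expected count E_ij
            stat += (o - e) ** 2 / e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df

# Hypothetical counts (NOT the poll data from the slide).
hypothetical_counts = [[10, 20, 30],
                       [20, 20, 20]]
stat, df = chi_square_statistic(hypothetical_counts)
print(f"chi-square = {stat:.2f} on {df} df")
```

With these made-up counts the statistic falls below the tabulated 5.99, so H0 would not be rejected, unlike in the slide's example.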
94
Summary of Chisquare Test
Type: Goodness of fit
  Aim: One sample. Compares the expected and observed values to determine
  how well the experimenter’s predictions fit the data.
  Hypotheses:
    H0: The observed values are equal to the theoretical (expected) values.
    (The data followed the assumed distribution.)
    Ha: The observed values are not equal to the theoretical (expected) values.
    (The data did not follow the assumed distribution.)
Type: Homogeneity
  Aim: Two different populations (or sub-groups). Applied to one categorical
  variable.
  Hypotheses:
    H0: Investigated populations are homogeneous.
    Ha: Investigated populations are not homogeneous.
Type: Independence
  Aim: One population. Type of variables: nominal, dichotomous, ordinal or
  grouped interval. Each population is at least 10 times as large as its
  respective sample.
  Hypotheses:
    Research hypothesis: The two variables are dependent (or related).
    H0: There is no association between two variables. (The two variables are
    independent.)
    Ha: There is an association between two variables.
95
McNemar Test
 Suppose we have the situation where measurements are made on
the same group of people before and after some intervention, or
suppose we are interested in the agreement between two judges who
evaluate the same group of patients on some characteristics.
 In such situations, the before and after measures, or the opinions of
two judges, are not independent of each other, since they pertain to
the same individuals.
 The test statistic is:
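$$z^2 = \frac{(n_{12} - n_{21})^2}{n_{12} + n_{21}} \sim \chi^2_1$$
where n12 and n21 are the numbers of discordant pairs (subjects whose two
measurements disagree).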
96
McNemar Test-Example
 1319 schoolchildren were questioned on the prevalence of symptoms
of severe cold at the age of 12 and again at the age of 14 years. At age
12, 356 (27%) children were reported to have severe colds in the past
12 months compared to 468 (35.5%) at age 14.
H0: the prevalence is the same at 12
and 14 years of age
Ha: The prevalence is not the same
$$z^2 = \frac{(256 - 144)^2}{256 + 144} = 31.36$$
The calculated test statistic is much larger than the tabulated value
($\chi^2_{0.05,1} = 3.84$). There is a difference in the prevalence of severe
colds between ages 12 and 14.
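The calculation above uses only the two discordant counts. A minimal sketch, taking 256 and 144 from the slide and the 5% critical value for 1 degree of freedom (3.84) as a constant; the function name is illustrative:

```python
def mcnemar_statistic(n12, n21):
    """McNemar chi-square statistic (1 df) from the two discordant counts."""
    return (n12 - n21) ** 2 / (n12 + n21)

stat = mcnemar_statistic(256, 144)
critical_value = 3.84        # chi-square, 1 df, alpha = 0.05

print(f"statistic = {stat:.2f}")
print("reject H0" if stat > critical_value else "do not reject H0")
```

The concordant pairs (children with the same status at both ages) do not enter the statistic at all.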
97
Sign Tests
 Used for paired data
 Can be ordinal or continuous
 Very simple and easy to interpret
 Makes no assumptions about distribution of the data
 Not very powerful
 The null hypothesis for the sign test is
H0: the median difference is zero
98
Sign Tests
 To evaluate H0 we only need to know the signs of the differences
 If half the differences are positive and half are negative, then the
median = 0 (H0 is true).
 If the signs are more unbalanced, then that is evidence against
H0.
99
 evaluate whether these data provide
evidence that orthodontic treatment
improves children’s image of their
teeth
 The sign test looks at the signs of the
differences
 15 children felt better about their
teeth (+ difference in ratings)
 1 child felt worse (- diff.)
4
children
felt
the
same
(difference = 0)
 Looks like good evidence
 Need a p-value
100
Sign Tests
 The p-value is the probability of an outcome as or more extreme
(under H0 ) than that observed.
 We observed 15 positives and 1 negative.
 If H0 were true we’d expect an equal number of positive and
negative differences.
 Outcomes as or more extreme than the one observed would be
 15 or more positives, or 1 or fewer positives
101
Sign Tests
 P-value = P(X ≥ 15) + P(X ≤ 1)
 X is the number of positive differences
 Under H0, X is Binomial(n = 16, p = 0.5)
 n =16 because the sign test disregards the zero differences
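The binomial tail probabilities can be summed directly. The sketch below assumes the counts from the example (15 positive, 1 negative, zeros discarded) and doubles the smaller tail, which for p = 0.5 is the same as adding both tails; the function name is illustrative.

```python
from math import comb

def sign_test_p_value(n_pos, n_neg):
    """Two-sided exact sign-test p-value under X ~ Binomial(n, 0.5),
    where n excludes zero differences."""
    n = n_pos + n_neg
    k = min(n_pos, n_neg)
    # P(X <= k); by symmetry this equals P(X >= n - k).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

p_value = sign_test_p_value(15, 1)
print(f"p-value = {p_value:.5f}")
```

Each tail here is (1 + 16)/2^16, so the two-sided p-value is 34/65536, about 0.0005.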
102
Wilcoxon Signed-rank test
 Wilcoxon Signed-rank test is another non-parametric test used
for paired data.
 It uses the magnitudes of the differences
 the sign test does not
 More powerful than the sign test
 More difficult to interpret than the sign test
103
Example: Body image data
 Use the Wilcoxon signed-rank test to evaluate whether these data provide
evidence that orthodontic treatment improves children’s image of their teeth.

child   Rating before   Rating after
1       1               5
2       1               4
3       3               1
4       2               3
5       4               4
6       1               4
7       3               5
8       1               5
9       1               4
10      4               4
11      1               1
12      1               4
13      1               4
14      2               4
15      1               4
16      2               5
17      1               4
18      1               5
19      4               4
20      3               5
104
Example: Body image data
 Use the Wilcoxon signed-rank test to evaluate whether these data provide
evidence that orthodontic treatment improves children’s image of their teeth.
 Work with the differences
 Remove those with zero difference

child   Rating before   Rating after   difference
1       1               5              4
2       1               4              3
3       3               1              -2
4       2               3              1
5       4               4              0
6       1               4              3
7       3               5              2
8       1               5              4
9       1               4              3
10      4               4              0
11      1               1              0
12      1               4              3
13      1               4              3
14      2               4              2
15      1               4              3
16      2               5              3
17      1               4              3
18      1               5              4
19      4               4              0
20      3               5              2
105
Example: Body image data
To compute the test we need to

child   diff.
1       4
2       3
3       -2
4       1
6       3
7       2
8       4
9       3
12      3
13      3
14      2
15      3
16      3
17      3
18      4
20      2
106
Example: Body image data
To compute the test we need to
 note the signs of the differences

child   diff.   sign
1       4       +
2       3       +
3       -2      -
4       1       +
6       3       +
7       2       +
8       4       +
9       3       +
12      3       +
13      3       +
14      2       +
15      3       +
16      3       +
17      3       +
18      4       +
20      2       +
107
Example: Body image data
To compute the test we need to
 note the signs of the differences
 get magnitudes of the differences

child   diff.   sign   |diff.|
1       4       +      4
2       3       +      3
3       -2      -      2
4       1       +      1
6       3       +      3
7       2       +      2
8       4       +      4
9       3       +      3
12      3       +      3
13      3       +      3
14      2       +      2
15      3       +      3
16      3       +      3
17      3       +      3
18      4       +      4
20      2       +      2
108
Example: Body image data
To compute the test we need to
 note the signs of the differences
 get magnitudes of the differences
 reorder the data by magnitude

child   diff.   sign   |diff.|
4       1       +      1
3       -2      -      2
7       2       +      2
14      2       +      2
20      2       +      2
2       3       +      3
6       3       +      3
9       3       +      3
12      3       +      3
13      3       +      3
15      3       +      3
16      3       +      3
17      3       +      3
1       4       +      4
8       4       +      4
18      4       +      4
109
Example: Body image data
 To compute the test we need to
 note the signs of the differences
 get magnitudes of the differences
 reorder the data by magnitude
 assign ranks to the observations

child   diff.   sign   |diff.|   Avg. rank
4       1       +      1         1
3       -2      -      2         3.5
7       2       +      2         3.5
14      2       +      2         3.5
20      2       +      2         3.5
2       3       +      3         9.5
6       3       +      3         9.5
9       3       +      3         9.5
12      3       +      3         9.5
13      3       +      3         9.5
15      3       +      3         9.5
16      3       +      3         9.5
17      3       +      3         9.5
1       4       +      4         15
8       4       +      4         15
18      4       +      4         15
110
Example: Body image data
Note that since there are many ties in the magnitudes we had to
assign average ranks.
111
Example: Body image data
 For example, the 2nd through 5th differences all have the same magnitude, so
we give them all the average of the 2nd through 5th rank:
(2+3+4+5)/4 = 3.5
112
Example: Body image data
The statistic for the signed-rank test is the sum of the ranks of the
positive differences
113
Example: Body image data
Summing the ranks of the positive differences:
R1 = 1 + 3.5 + 3.5 + 3.5 + 9.5 + 9.5 + 9.5 + 9.5 + 9.5 + 9.5 + 9.5 + 9.5
+ 15 + 15 + 15
= 132.5
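The ranking steps walked through on the preceding slides (drop zeros, rank the magnitudes with average ranks for ties, sum the ranks of the positive differences) can be sketched as follows. The differences are the 20 after-minus-before values from the body image table; the function name is illustrative.

```python
# After-minus-before differences for children 1..20 (from the table above).
differences = [4, 3, -2, 1, 0, 3, 2, 4, 3, 0,
               0, 3, 3, 2, 3, 3, 3, 4, 0, 2]

def signed_rank_sum(diffs):
    """Return (R1, n, average-rank table): R1 is the sum of the ranks of
    the positive differences, n the number of non-zero differences."""
    nonzero = [d for d in diffs if d != 0]          # zero differences dropped
    magnitudes = sorted(abs(d) for d in nonzero)
    avg_rank = {}
    for value in set(magnitudes):
        first = magnitudes.index(value) + 1         # 1-based rank of first tie
        count = magnitudes.count(value)
        avg_rank[value] = first + (count - 1) / 2   # mean of the tied ranks
    r1 = sum(avg_rank[abs(d)] for d in nonzero if d > 0)
    return r1, len(nonzero), avg_rank

r1, n, avg_rank = signed_rank_sum(differences)
print(f"n = {n}, average ranks = {avg_rank}, R1 = {r1}")
```

The only negative difference (child 3) carries rank 3.5, so R1 equals the total rank sum 136 minus 3.5.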
114
R1: What does it mean?
 With 16 observations R1 could range from 0 (all differences are
negative) to 136 (all differences are positive).
 If H0 were true we’d expect R1 to be near the middle of the
range, in this case, 68.
 R1= 132.5 appears to be evidence against H0
 Need a p-value
115
Signed-rank test p-value
For n > 15, can use a normal approximation
$$\mu = \frac{n(n + 1)}{4}$$
$$\sigma^2 = \frac{n(n + 1)(2n + 1)}{24} - \frac{\sum_i (t_i^3 - t_i)}{48}$$
where t_i are the numbers of ties in each group of ties (note that if
t_i = 1 then the term is 0), and n is the number of non-zero
differences.
The two-sided p-value is given by
$$p\text{-value} = 2 \times P\left(N(0,1) > \frac{|R_1 - \mu| - 0.5}{\sigma}\right)$$
116
p-value for body image example
$$\mu = \frac{16(16 + 1)}{4} = 68$$
There are 4 people tied with difference 2, 8 with
difference 3, and 3 tied with difference 4. So
$$\sum_i (t_i^3 - t_i) = (4^3 - 4) + (8^3 - 8) + (3^3 - 3) = 588$$
And so,
$$\sigma^2 = \frac{16 \times 17 \times 33}{24} - \frac{588}{48} = 374 - 12.25 = 361.75$$
117
p-value for body image example
$$p\text{-value} = 2 \times P\left(N(0,1) > \frac{|132.5 - 68| - 0.5}{\sqrt{361.75}}\right)$$
$$= 2 \times P(N(0,1) > 3.36)$$
$$\approx 2 \times 0.0004 = 0.0008$$
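Evaluating the normal-approximation formulas numerically is a useful check on the hand arithmetic. This sketch assumes n = 16, R1 = 132.5 and tie groups of sizes 4, 8 and 3, as in the example, and uses the complementary error function for the normal tail. Substituting directly, the variance works out to 374 − 12.25 = 361.75.

```python
from math import erfc, sqrt

n = 16                     # number of non-zero differences
r1 = 132.5                 # sum of ranks of the positive differences
tie_sizes = [4, 8, 3]      # sizes of the groups of tied magnitudes

mu = n * (n + 1) / 4
sigma2 = n * (n + 1) * (2 * n + 1) / 24 - sum(t ** 3 - t for t in tie_sizes) / 48

# Continuity-corrected z statistic.
z = (abs(r1 - mu) - 0.5) / sqrt(sigma2)

# Two-sided p-value: 2 * P(N(0,1) > z) = erfc(z / sqrt(2)).
p_value = erfc(z / sqrt(2))

print(f"mu = {mu}, sigma^2 = {sigma2}, z = {z:.2f}, p = {p_value:.4f}")
```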
118
p-value for signed-rank test
 If n ≤ 15, the normal approximation should not be used; instead use an
“exact” p-value or critical tables.
 Or simply enumerate the possible sign assignments for the given sample size
 For a sample of size n, $2^n$ equally likely sign assignments exist under H0,
and the probabilities can be calculated from these possibilities
 In body image example, exact p-value is 0.00015.
119
Kruskal-Wallis Test
 ANOVA is based on the assumption of normality
 Non-parametric alternative to ANOVA
 One dependent variable (at least ordinal) compared across 3 or more independent groups
 Kruskal Wallis involves the analysis of the sums of ranks for each
group, as well as the mean rank for each group.
 As sample sizes get larger, the distribution of the test statistic
approaches that of $\chi^2$, with df = k – 1
120
Kruskal-Wallis Test
 The null hypothesis states that there is no difference in the
distribution of scores of the K populations from which the samples
were selected.
 The alternative hypothesis states that at least two of the K populations
from which the samples were selected differ in the distribution of
scores.
 The test statistic is:
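$$W = \frac{12}{n(n + 1)} \sum_{i=1}^{K} \frac{R_i^2}{n_i} - 3(n + 1) \sim \chi^2_{K-1}$$
where R_i is the sum of the ranks in group i, n_i is the size of group i, and
n is the total number of observations.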
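A short sketch of the statistic as written above: pool the observations, rank them (average ranks for ties), sum the ranks per group, and plug into the formula. The three groups of scores are hypothetical, and no tie correction is applied, matching the formula on the slide rather than what statistical packages typically report.

```python
def kruskal_wallis(groups):
    """Kruskal-Wallis W (without tie correction) and its degrees of freedom."""
    pooled = sorted(x for g in groups for x in g)

    def avg_rank(x):
        first = pooled.index(x) + 1                  # 1-based rank of first tie
        return first + (pooled.count(x) - 1) / 2     # mean rank of the tie group

    n = len(pooled)
    # Sum over groups of R_i^2 / n_i, with R_i the rank sum of group i.
    total = sum(sum(avg_rank(x) for x in g) ** 2 / len(g) for g in groups)
    w = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    return w, len(groups) - 1

# Hypothetical scores for three groups (illustration only).
hypothetical_groups = [[1, 3, 5], [2, 6, 8], [4, 7, 9]]
w, df = kruskal_wallis(hypothetical_groups)
print(f"W = {w:.3f} on {df} df")
```

Here the rank sums are 9, 16 and 20, giving W ≈ 2.76, below the 5% critical value of 5.99 for 2 degrees of freedom.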
121
Kruskal-Wallis Test
 Test statistic is:
$$W = \frac{12}{n(n + 1)} \sum_{i=1}^{K} \frac{R_i^2}{n_i} - 3(n + 1) \sim \chi^2_{K-1}$$
 Example:
122
Summary of Tests
123
Chapter Two
Principles and Methods of
Epidemiology
124
Introduction
 Health is ‘the state of being free from illness or injury’.
 It refers to freedom from medically defined diseases.
 Poverty, social inequalities, unemployment, and crowding are
among the main determinants of health.
 Illness can be observed :
 Subjectively
 Objectively
125
Introduction
 Subjective observations by the patient (symptoms)

Example: nausea
 Subjective observations by the examiner (signs)

Example: Weight loss

inter-observer variation (the degree of agreement among different
examiners)

intra-observer variation (the degree of agreement between different
examinations made by one examiner).
 Objective observations (tests) refer to manifestations that can be read
from an instrument and hence are less dependent on subjective
judgments by the person examined or the examiner.
126
What is Epidemiology
 The term “epidemic” is used to describe an unexpected increase in
the frequency of any disease such as myocardial infarction, obesity,
or asthma (whether the cause is genetic, behavioral or environmental).
 The term "pandemic" is used to describe the occurrence of a disease
over a wide geographic area, affecting an exceptionally high
proportion of the population
127
What is Epidemiology
 Public health epidemiology uses the “healthy” population to study
the transition from being healthy to being diseased or ill.
 Clinical epidemiology uses the population of patients to study
predictors of cure or changes in the disease state.
 A clinical epidemiologist can study how best to treat diseases without
taking an interest in how these diseases emerged.
128
What is Epidemiology
 Epidemiology comes from the Greek words
 Epi, meaning “on or upon,”
 Demos, meaning “people,” and
 Logos, meaning “the study of.”
 “Epidemiology is the study of the distribution and determinants of
health-related states or events in specified populations, and the
application of this study to the control of health problems.”
129
What is Epidemiology
 Study- includes: surveillance, observation, hypothesis testing,
analytic research and experiments
 Distribution-Refers to analysis of: times, persons, places and
classes of people affected.

Time characteristics include annual, seasonal, and daily or even hourly
occurrence during an epidemic.

Place characteristics include geographic variation, urban-rural
differences, and location of worksites or schools.

Personal characteristics include demographic factors such as age, race,
sex, marital status, and socioeconomic status, as well as behaviors and
environmental exposures.
130
What is Epidemiology
 Epidemiology is concerned with the frequency and pattern of
health events in a population.

Frequency includes not only the number of such events in a
population, but also the rate or risk of disease in the population.

This characterization of the distribution of health-related states or
events is one broad aspect of epidemiology called descriptive
epidemiology.

Descriptive epidemiology provides the What, Who, When, and Where
of health-related events.
131
What is Epidemiology
 Determinants include factors that influence health: biological,
chemical, physical, social, cultural, economic, genetic and
behavioral.

Search for causes and other factors that influence the occurrence of
health-related events.

Analytic epidemiology attempts to provide the Why and How of
such events by comparing groups with different rates of disease
occurrence and with differences in demographic characteristics,
genetic or immunologic make-up, behaviors, environmental
exposures, and other so-called potential risk factors.
132
What is Epidemiology
 Health-related states or events

Originally, epidemiology was concerned with epidemics of
communicable diseases.

Then epidemiology was extended to endemic communicable diseases
and noncommunicable infectious diseases.

More recently, epidemiologic methods have been applied to chronic
diseases, injuries, birth defects, maternal-child health, occupational
health, and environmental health.

Now, even behaviors related to health and well-being (amount of
exercise, seat-belt use, etc.) are recognized as valid subjects for
applying epidemiologic methods.
133
What is Epidemiology
 Health related events refer to: diseases, causes of death,
behaviors such as use of tobacco, positive health states,
reactions to preventive regimes and provision and use of health
services.
 Specified populations include those with identifiable
characteristics, such as occupational groups.

Clinicians are concerned with the health of an individual;

epidemiologists are concerned with the collective health of the people
in a community or other area.
134
What is Epidemiology
 Application: the study is applied to prevention and control, in line with the
aims of public health: to promote, protect, and restore health

Epidemiology is more than “the study of.”

epidemiology provides data for directing public health action.

using epidemiologic data is an art as well as a science.
135
Summary
Epidemiology is the study (scientific, systematic, data driven) of
the distribution (frequency, pattern) and determinants (causes,
risk factors) of health-related states and events (not just
diseases) in specified populations (patient is community,
individuals viewed collectively), and the application of (since
epidemiology is a discipline within public health) this study to
the control of health problems.
136
 Epidemiologists are required to have some knowledge of:
 Public health: because of the emphasis on disease prevention
 Clinical medicine: because of the emphasis on disease classification
and diagnosis (numerators)
 Pathophysiology: because of the need to understand basic biological
mechanisms in disease (natural history)
 Biostatistics: because of the need to quantify disease frequency and
its relationships to antecedents (denominators, testing hypotheses)
 Social sciences: because of the need to understand the social context
in which disease occurs and presents (social determinants of health
phenomena)
137
Uses of Epidemiology
1. To study the history of disease:
 Past trends of a disease are studied to predict future trends.
 Results are useful in planning for health services and public health.
2. Community diagnosis:
 What are the diseases, conditions, injuries, disorders, disabilities,
defects causing illness, health problems or death in a community or
region?
138
Uses of Epidemiology
3. Look at risks of individuals as they affect groups or populations:
 What are the risk factors, problems, and behaviors that affect groups?
 Health screening, medical exams, disease assessments.
4. Assessments, evaluation, research.
 How well do public health and health services meet the problems and
needs of the population or group?
139
Uses of Epidemiology
5. Completing the clinical picture:
 Identification and diagnosis process to establish that a condition exists
or that a person has a specific disease.
6. Identification of syndromes:
 Help to establish and set criteria to define syndromes.
140
Uses of Epidemiology
7. Determine the causes and sources of disease:
 Findings allow for control, prevention, and elimination of the causes
of disease, conditions, injury, disability
141
Some Epidemiologic Concepts: Mortality Rates
 Death is a unique, universal, and final event
 Mortality rates pertain to the number of deaths occurring in a
particular population subgroup and often provide one of the first
indications of a health problem.
 Midyear population: an estimate of the average population during the year.
 The population at midyear is taken as the denominator for rates or
ratios because a population may grow or shrink during the year in
question
142
Measures of Mortality
143
Age Specific Death Rates (ASDR)
144
Age Specific Death Rates (ASDR)

145
Maternal Mortality Rate (MMR)
146
Incidence and Prevalence
147
Incidence and Prevalence
148
Incidence and Prevalence
 The quality of the data is commonly described with use of four terms:
 Accuracy: the degree to which a measurement represents the true value
of the attribute being measured.
 Precision: the reproducibility of a study result, that is, the degree of
resemblance among study results, were the study to be repeated under
similar circumstances: lack of precision is referred to as "random
error".
 Reliability: a measure of how dependably an observation is exactly the
same when repeated; it refers to the measuring procedure rather than to
the attribute being measured.
149
Incidence and Prevalence
 Validity: the extent to which the study measures what it is
intended to measure; lack of validity is referred to as "bias"
or "systematic error."
150
Measures of Association
 Epidemiologic studies are often interested in knowing how much
more likely an individual is to develop a disease if he or she is
exposed to a particular factor than an individual who is not so
exposed.
151
Measures of Association
Relative Risk
 It is the ratio of two incidence rates
 the rate of development of the disease for people with
the exposure factor, divided by the rate of development
of the disease for people without the exposure factor.
 the probability of the outcome in those exposed divided by
the probability of the outcome in those not exposed (cohort
study)
$$RR = \frac{P(\text{disease} \mid \text{exposed})}{P(\text{disease} \mid \text{unexposed})} = \frac{p_1}{p_2}$$
152
Example Relative Risk
 Cross-Classification of Aspirin Use and Myocardial Infarction
$$RR = \frac{P(\text{disease} \mid \text{exposed})}{P(\text{disease} \mid \text{unexposed})} = \frac{p_1}{p_2} = \frac{189/11034}{104/11037} = \frac{0.0171}{0.0094} = 1.818$$
 The risk of developing a heart attack (myocardial infarction) for
those individuals who do not use aspirin is 1.818 times the risk for those who do
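The relative risk in this example depends only on the four numbers shown in the formula: 189 events among 11034 people in one group and 104 events among 11037 in the other. A minimal sketch with an illustrative function name:

```python
def relative_risk(events_1, total_1, events_2, total_2):
    """Ratio of the outcome probability in group 1 to that in group 2."""
    p1 = events_1 / total_1
    p2 = events_2 / total_2
    return p1 / p2

# Group 1: no aspirin (189 of 11034); group 2: aspirin (104 of 11037).
rr = relative_risk(189, 11034, 104, 11037)
print(f"RR = {rr:.3f}")
```

Swapping the two groups gives the reciprocal, about 0.55, which is the same comparison read from the aspirin users' side.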
153
Odds Ratio
Odds ratio is the odds of the outcome in one group divided
by the odds of the outcome in the other group
Let p1 refer to the probability of the outcome in group 1,
and p2 to the probability of the outcome in group 2.
$$OR = \frac{P(\text{disease} \mid \text{exposed}) \,/\, (1 - P(\text{disease} \mid \text{exposed}))}{P(\text{disease} \mid \text{unexposed}) \,/\, (1 - P(\text{disease} \mid \text{unexposed}))} = \frac{p_1 / (1 - p_1)}{p_2 / (1 - p_2)} = \frac{p_1 (1 - p_2)}{p_2 (1 - p_1)}$$
154
Example Odds Ratio
 A study on the relationship between seat-belt use (yes, no) and
outcome of an automobile crash (fatality, non fatality) for drivers
involved in accidents is given below. Calculate Odds ratio.
$$OR = \frac{p_1 (1 - p_2)}{p_2 (1 - p_1)} = \frac{160 \times 3600}{510 \times 1500} = 0.75$$
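With counts in a 2 × 2 table, the odds ratio reduces to a cross-product ratio. The crash table itself is not reproduced in these notes, so the assignment of the four numbers to cells a, b, c, d below is an assumption made only to reproduce the products in the formula; the function name is also illustrative.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for a 2x2 table:
    group 1 has a outcomes and b non-outcomes,
    group 2 has c outcomes and d non-outcomes."""
    return (a * d) / (b * c)

# Assumed cell layout (only the products 160*3600 and 510*1500 are given).
a, b, c, d = 160, 1500, 510, 3600
or_counts = odds_ratio(a, b, c, d)

# The same value from the probability form p1(1 - p2) / (p2(1 - p1)).
p1 = a / (a + b)
p2 = c / (c + d)
or_probs = (p1 * (1 - p2)) / (p2 * (1 - p1))

print(f"OR = {or_counts:.2f}")
```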
155
Relationship between RR and OR
$$RR = \frac{OR}{(1 - p_0) + (p_0 \times OR)}$$
 where p0 is the proportion of those unexposed who develop the
outcome,
 OR is the odds ratio, and
 RR is the relative risk estimated from the odds ratio.
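One way to see that this conversion is consistent is to apply it to counts where both measures can be computed directly. The sketch below reuses the aspirin example figures (189 of 11034 and 104 of 11037), treating the second group as the "unexposed" reference; the function and variable names are illustrative.

```python
def rr_from_or(odds_ratio, p0):
    """Relative risk implied by an odds ratio, given the outcome
    proportion p0 in the unexposed (reference) group."""
    return odds_ratio / ((1 - p0) + p0 * odds_ratio)

p1 = 189 / 11034            # outcome proportion in the exposed group
p0 = 104 / 11037            # outcome proportion in the reference group

odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))
rr_direct = p1 / p0
rr_converted = rr_from_or(odds_ratio, p0)

print(f"OR = {odds_ratio:.3f}, RR direct = {rr_direct:.3f}, "
      f"RR from OR = {rr_converted:.3f}")
```

Because the outcome is rare here (about 1% in the reference group), the odds ratio and the relative risk are already close to each other.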
156
Bias
 What do Epidemiologists do?
 Measure effects

It could be rate or risk
 Attempt to define a cause: just an estimate of the truth
 Implement Public health measures
 However, there might be:
 Bias?
 Chance? (can be evaluated quantitatively)
 Confounding?
157
Bias
 Despite all preventive efforts, it has to be remembered that bias:
 May mask an association or cause
 May cause over- or underestimation of the effect size
 Conclusion different from the truth
 Bias can be minimized by the design and conduct of the study
 Some types of bias cannot be reduced by increasing the sample
size
158
Bias
 Deviation of results or inferences from the truth
 Bias is defined as “any trend in the collection, analysis,
interpretation, publication, or review of data that can lead to
conclusions that are systematically different from the truth”.
 Bias can lead to an incorrect estimation of the association between
an exposure and the risk of a disease.
 Bias can be due to:
 Selection
 Measurement / misclassification
 Confounding
159
Selection Bias
 It is a distortion in the estimate of association between risk
factors and disease that results from how the subjects are
selected for the study.
 Could occur because the sampling frame differs substantially
from the target population,
 that is, the sampling frame is not a mirror image of the target
population
160
Response Bias
 Also called ascertainment bias
 Systematic error due to differences in characteristics between those
who choose or volunteer to take part in a study and those who do
not
 Avoid differential response rates
 Example: of 100 people exposed to a risk factor, 20% develop the disease, and of
100 people unexposed, 16% develop the disease, yielding a relative
risk of 0.20/0.16 = 1.25
161
Response Bias…
 Now imagine that only 60% of the exposed respond to follow-up, or
are ascertained as having or not having the disease, a 60% response
rate among the exposed. Assume further that all of the ones who
don't respond happen to be among the ones who don't develop
disease. The relative risk would be calculated as (20/60)/0.16 ≈ 2.08
 Now imagine that only 60% of the nonexposed reply, a 60%
response rate among the nonexposed, and all of the nonexposed
who don't respond happen to be among the ones who don't have the
disease. Now the relative risk estimate is 0.20/(16/60) = 0.75.
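The three scenarios can be laid side by side in a few lines. The sketch assumes the figures given in the text: 100 exposed with 20 cases, 100 unexposed with 16 cases, and, in the biased scenarios, 40 non-diseased people lost from one group. The function name is illustrative.

```python
def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk in the exposed divided by risk in the unexposed."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)

# Complete follow-up in both groups.
rr_full = relative_risk(20, 100, 16, 100)

# Only 60 of the exposed respond; all 40 lost are non-diseased.
rr_exposed_loss = relative_risk(20, 60, 16, 100)

# Only 60 of the unexposed respond; all 40 lost are non-diseased.
rr_unexposed_loss = relative_risk(20, 100, 16, 60)

print(f"full response:      RR = {rr_full:.2f}")
print(f"exposed loss:       RR = {rr_exposed_loss:.2f}")
print(f"unexposed loss:     RR = {rr_unexposed_loss:.2f}")
```

The same underlying risks produce an apparent harmful effect in one scenario and an apparent protective effect in the other, which is why differential response rates matter.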
162
Confounding
 Confounding exists when a risk factor other than the exposure
under study is associated, independently, both with the exposure
and with the outcome
[Diagram: the confounder is linked to both the exposure and the outcome]
For a variable to be a confounder, it must have three characteristics:
1) it must be associated with the exposure (causally or not);
2) it must be a cause, or a surrogate of the cause, of the health outcome;
3) it should not be in the causal pathway between the potential risk factor and outcome.
163
Matching
 Several methods are available to address confounders:
randomization, matching, multivariate analysis, etc.
 Matching refers to the selection of comparison subjects (unexposed
subjects or controls) that are identical to the index subjects in certain
important characteristics, such as a possible confounder.
 Matching addresses issues of confounding in the DESIGN stage of
a study as opposed to the analysis phase.
164
Matching…
 In a study of coffee and heart disease, subjects might be matched on their
smoking history, since smoking may be a confounder of the
relationship between coffee and heart disease.
 Whenever a coffee drinker is enrolled into the study, determine whether that
person is a smoker.
 If the patient was a smoker, the next patient who would be enrolled
who was not a coffee drinker (i.e., a member of the comparison
group), would also have to be a smoker.
 For each coffee-drinking non smoker, a non coffee-drinking non
smoker would be enrolled.
165
Chapter Three
Designing Research
166
Introduction
 Research is all about addressing an issue, asking and answering a
question, or solving a problem
 Research is what we do when we have a question or a problem we
want to resolve
 We may think we already know the answer to our question
 We may think the answer is obvious, common sense even
 But until we have subjected our problem to rigorous scientific
scrutiny, our 'knowledge' remains little more than guesswork or at
best, intuition.
167
Procedures in research
168
What is research design?
 A framework for the research plan of action.
 A master plan that specifies the methods and procedures for
collecting and analyzing the needed information
 A strategy for how the data will be collected.
The purpose of research design is:
 It provides the scheme for answering research question.
 It maintains control to avoid bias that may affect the outcomes.
 It organizes the study in a particular way, justifying the advantages of
doing so while remaining aware of and cautious about potential disadvantages
169
Categories of Research Design
 Descriptive studies:
 Examine patterns of disease
 Analytical studies:
 Studies of suspected causes of diseases
 Experimental studies:
 Compare treatment modalities
170
Observational and Experimental Studies
 Researchers are interested in comparing reading scores for students in
schools with low average family income with scores for students in
schools with high average family income. They choose a
random sample of schools in each category. This is an observational
study: the researchers do nothing to affect either family income or reading
scores.
 Researchers are interested in comparing two methods for teaching
reading. They randomly assign half the schools in their sample to one
method and the other half to the other method. At the end of the school
year, they analyze reading scores of the children in the schools. This is
an experiment: the researchers deliberately decide which students receive
each teaching method.
171
Observational Study…
 Looks at the natural history of the disease, can suggest a hypothesis
 Non-experimental, Observational studies are an alternative to
experimental studies.
 Observational because there is no individual intervention
 Treatment and exposures occur in a “non-controlled” environment
 Individuals can be observed prospectively, retrospectively, or
currently
172
Case Control Study
 It is designed to help determine if an exposure is associated with an
outcome.
 Cases: subjects in whom the disease has been diagnosed.
 Controls: a comparison group that does not have the disease.
 The two groups must be comparable except for the factor of
interest.
 If one wants to show that an associated factor is a cause it is
necessary to control for all important differences other than the
exposure factor of interest.
173
Case Control Study…
 Then, look back in time to learn which subjects in each group had
the exposure(s), comparing the frequency of the exposure in the
case group to the control group.
[Diagram: from the population, cases and controls are each divided into exposed and not exposed]
174
Case Control Study…
 Case control studies are comparatively
 quick,
 inexpensive, and
 Easy
 They are particularly appropriate for
 Investigating outbreaks, and

E.g a study of endophthalmitis following ocular surgery
 Studying rare diseases or outcomes.

study of risk factors for uveal melanoma, or corneal ulcers.
175
Case Control Study…
 Case-control studies cannot provide any information about the
incidence or prevalence of a disease because no measurements are
made in a population-based sample.
 Case-control studies may show an association but they do not
demonstrate causation.
 Not all studies which contain ‘cases’ and ‘controls’ are case-control
studies.
176
Cohort Study
 A study design where one or more samples (called cohorts) are
followed prospectively and subsequent status evaluations with
respect to a disease or outcome are conducted to determine which
of the participants’ initial exposure characteristics (risk factors) are
associated with it.
 As the study is conducted, outcome from participants in each
cohort is measured and relationships with specific characteristics
determined.
177
Cohort Study…
 A “cohort” is a group of people who have something in common.
 Can represent the source population—the population from which
cases of disease arise
 For example, to study the effect of company downsizing on the health of
office workers, the exposed group is compared to a similar group
that hasn't been exposed to the variable.
178
Cohort Study: Example
 To determine the long-term effectiveness of influenza vaccines in
elderly people, cohorts of vaccinated elderly and unvaccinated
community-dwelling elderly were studied. The results suggest that
the elderly who are vaccinated have a reduced risk of
hospitalization for pneumonia or influenza.
 A second study uses data collected from high school students to
study the differences in initiation of tobacco use between
adolescents who started working for pay and those who did not work.
The results suggest that adolescents who work for pay have a
higher risk of initiating tobacco use.
179
Case Control and Cohort Study
 The distinction between case-control studies and prospective
studies lies in the sampling.
 In the case-control study we sample from among the diseased and
nondiseased,
 whereas in a prospective study we sample from among those with the
factor and those without the factor.
180
Demonstrating Strength of Causality
 cross-sectional studies: useful in showing associations, in
providing early clues to etiology.
 case-control studies: useful for rare diseases or conditions, or
when the disease takes a very long time to become manifest
(synonymous name: retrospective studies).
 Cohort studies: useful for providing stronger evidence of
causality, and less subject to biases due to errors of recall or
measurement (synonymous names: prospective studies,
longitudinal studies).
 clinical trials: prospective, experimental studies that provide the
most rigorous evidence of causality.
181
Cross-sectional Study
 Determines prevalence at a point in time
 Measure exposure and outcome variables at one point in time.
 Immediate outcome assessment and no loss to follow-up, therefore
faster, cheaper, easier
 Useful for determining the prevalence of risk factors and the
frequency of prevalent cases of a disease for a defined population
 They are also useful for measuring current health status and
planning for selected health services
182
Cross-sectional Study: Example
 For example, in a cross-sectional study of high blood pressure and
coronary heart disease the investigators determine the blood
pressure and the presence of heart disease at the same time.
 If they find an association, they would not be able to tell which came
first.
 Does heart disease result in high blood pressure, or does high blood
pressure cause heart disease, or
 are both high blood pressure and heart disease the result of some other
common cause?
183
Choosing study design
 Does it adequately test the hypotheses?
 Hypotheses determine participants, variables measured & data
analysis methods
 Example hypotheses tested in student projects
 Discussion of requirements of proposal
 Does it identify and control extraneous factors?
 Eliminate alternative explanations for results to increase
confidence in cause-effect conclusion (internal validity)
184
Choosing study design
 Control depends on type of design
 Correlational design has less control
 Extraneous variables are measured and effects are statistically
controlled
 Are results generalizable?
 Replicate to other samples and other contexts
 Random selection of participants
 Features of field experiments enhancing external validity
 Realistic nature of setting and/or task
 Manipulation of treatment
185
Choosing study design
 Use of control group
 Nature of samples used
 Lack of control over confounding variables due to non-random
assignment or inability to match
 Can the hypothesis be rejected or retained via statistical
means? (statistical conclusion validity)
 Need reliable measures
 Need large enough sample to detect true effect and avoid Type I and
Type II errors
186
Choosing study design
 What is a null hypothesis?
 No effect proposed
 What is an alternative hypothesis?
 What is a directional hypothesis?
 Is the design efficient in using available resources?
 Optimal balance between research design, time, resources and
researcher expertise
187
Methods and Methodology
 Research methods are the tools, techniques or processes that
we use in our research. These might be
 surveys, interviews, questionnaires
 participant observation, focus group discussions.
 However, methods and how they are used are shaped by
methodology.
188
Methods and Methodology
 Methodology is the study of
 How research is done,
 How we find out about things, and
 How knowledge is gained.
 In other words, methodology is about the principles that guide
our research practices.
 Methodology therefore explains why we are using certain
methods or tools (logic, reality, values) in our research.
189
Chapter Four
Survival Data Analysis
190
Introduction
 Survival Analysis typically focuses on time to event data.
 Survival time refers to a variable which measures the time from a
particular starting time to a particular endpoint of interest.
 Survival analysis is generally defined as a set of methods for
analyzing data where the outcome variable is the time until the
occurrence of an event of interest.
191
Introduction…
 In the most general sense, it consists of techniques for positive-
valued random variables, such as
 Time to death
 Time to onset (acquiring or developing) or relapse (recurrence) of a disease
 Length of stay in a hospital
 Duration of a strike
 Money paid by health insurance
 Viral load measurements: how much HIV is in the blood.
 Time to finishing a doctoral dissertation!
192
Examples Time to Event
 Medicine:
 Time to relapse of a certain disease
 Time to death of HIV patients after antiretroviral therapy
 Time to re-occurrence of a particular symptom
 Time to cure from a certain disease
 Agriculture:
 Length of time required for a cow to conceive after calving
 Time until a farm experiences its first case of an exotic disease
193
Examples Time to Event…
 Sociology: “duration analysis”: studying behavioural changes of
individuals over time.
 Time to find a new job after a period of unemployment
 Time until re-arrest after release from prison
 Time until getting promotion
 Engineering: “reliability analysis”
 Time to the failure of a machine
 Failure of networks
194
Examples Time to Event…
 Management: time until turnover, retirement
 Demography:
 Time until births, marriages, divorces, migration and deaths
 Criminology:
 Time until committing crimes, convictions, arrests and rehabilitation
 Epidemiology:
 Time until aging, chronic diseases
195
Examples Time to Event…
 The event can be
 death, occurrence of a disease,
 marriage, divorce, etc.
 The time to event or survival time can be measured in days,
weeks, years, etc.
 For example, if the event of interest is heart attack, then the survival
time can be the time in years until a person develops a heart attack.
196
Time to Event Diagrammatically
[Figure: follow-up lines for individual subjects, each ending either in an
event or in censoring. Horizontal axis: time in a unit of measurement.]
Individuals do not all enter the study at the same time.
When the study ends, some individuals still haven't had the event yet.
Other individuals drop out or get lost in the middle of the study, and all we know
about them is the last time they were still “free" of the event.
197
The Main Concepts of Survival Analysis(Event)
 An event is ‘‘a change in state as defined by one or more qualitative
variables within some observation period and within the relevant
state space’’ (Blossfeld, Hamerle & Mayer, 1989).
 It consists of some form of change in state (Melnyk et al., 1995).
 Qualitative changes can be identified as events if there is a
‘‘relatively sharp disjunction between what precedes and what
follows’’ the change over a period of time.
198
The Main Concepts of Survival Analysis (Event)
 For instance, an event can be a store opening, a store failure, a job
termination, etc.
 Three questions can be asked:
 Question 1. Did the event happen?
 Question 2. When did it happen?
 Question 3. How do various factors affect the occurrence and the
timing of the event?
199
The Main Concepts of Survival Analysis (Measurement
Window)
 The measurement window characterises the period of time during
which the researcher makes their observation.
 The choice of the measurement window length in the different
investigations is a personal and arbitrary judgement by the
researcher.
 Indeed, there is little theoretical and empirical evidence to use as a
guide
200
The Main Concepts of Survival Analysis (Measurement
Window)…
 Because the choice of length is arbitrary, we must not forget
that results can vary considerably with the length of the
observation period.
 This variability in the results can be observed in various works
about turnover.
 In addition to this diversity in the results, studies are also very
difficult to compare.
201
The Main Concepts of Survival Analysis(Time)
 In order to define a failure time random variable, we need:
 An unambiguous time origin
 A time scale [real time: days, years]
 Definition of the event [death, disease, recurrence, response]
 Failure time random variables are always non-negative.
 Calendar time should be distinguished from survival time
(patient time).
202
The Main Concepts of Survival Analysis
(Censoring)
 We often possess data composed of duration between two events
 Individuals do not all enter the study at the same time
 When the study ends, some individuals still haven't had the event
yet
 Patient withdrawal from a clinical trial
 Death due to some cause other than the one of interest
 Migration of human population
203
The Main Concepts of Survival Analysis
(Censoring) …
 A duration may not have been completely observed.
 The possibility that some individuals are not observed for the
full time to failure is called censoring.
 Censoring refers to an incomplete survival time, such as
 the lack of start date
 the lack of ending date of the event
 the loss of a customer
 disappearance from the sample within the measurement window
204
The Main Concepts of Survival Analysis
(Censoring) …
 Censored data can occur when:
 The event of interest is death, but the patient is still alive at the time of
analysis.
 The individual was lost to follow-up without having the event of
interest.
 The event of interest is death by cancer but the patient died of an
unrelated cause, such as a car accident.
 The patient is dropped from the study without having experienced the
event of interest due to a protocol violation.
205
The Main Concepts of Survival Analysis
(Censoring) …
 Censored data are data which are incomplete.
 But in order to be complete, the information must meet three
conditions
1. The time during which the subject is exposed to a particular risk type
must be specified;
2. We must be able to identify the end of the period; and
3. The end must be due to some event under investigation.
These three conditions cannot always be satisfied.
206
The Main Concepts of Survival Analysis
(Types of Censoring) …
 Censoring (incomplete survival time) can be:
 Right Censored
 Left Censored
 Interval Censored
 Sometimes the process is terminated for reasons different from those
under investigation.
 In real situations we occasionally have both right and left censored
data.
207
Right Censoring
 Right censoring occurs when a subject leaves the study before an
event occurs, or
 The study ends before the event has occurred.
 Don’t know when the event occurred
 This unknown date can be :
 close,
 distant or
 even non-existent.
208
Right Censoring …
 Patients in a clinical trial to study the effect of treatments on stroke
occurrence. The study ends after 5 years. Those patients who have
had no strokes by the end of the fifth year are censored.
[Figure: a patient's timeline from study start to study end; the observed
survival time stops at the end of the study, while the total survival
time extends beyond it.]
209
Right Censoring ….
 The following notation is used to denote right censored data:
T = survival time (event time)
C = censoring time
 The data are usually represented as (Y, δ) where Y = min(T,C) is
the time recorded, and
 δ indicates whether we observed an event time or a censoring time,
that is
210
Right Censoring ….
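$$Y = \min(T, C) = \begin{cases} T, & \text{for an uncensored observation} \\ C, & \text{for a censored observation} \end{cases}$$
$$\delta = I(T \le C) = \begin{cases} 1, & \text{for an uncensored observation} \\ 0, & \text{for a censored observation} \end{cases}$$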
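The (Y, δ) representation can be illustrated with a minimal sketch. The event times T and censoring times C below are made-up values chosen only for illustration; in real data T is not observed for censored subjects.

```python
# Sketch: turn latent event times T and censoring times C into the
# observed right-censored pairs (Y, delta). All numbers are hypothetical.

def right_censor(event_times, censor_times):
    """Return a list of (Y, delta) with Y = min(T, C) and delta = I(T <= C)."""
    observed = []
    for t, c in zip(event_times, censor_times):
        y = min(t, c)                 # the time actually recorded
        delta = 1 if t <= c else 0    # 1 = event observed, 0 = censored
        observed.append((y, delta))
    return observed

T = [2.0, 7.5, 4.0, 9.0]    # hypothetical true event times
C = [5.0, 5.0, 5.0, 12.0]   # hypothetical censoring times
data = right_censor(T, C)
print(data)
```

The second subject has T = 7.5 > C = 5.0, so only the censoring time 5.0 is recorded, with δ = 0.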
211
Right Censoring …
Type I Right censoring
 all subjects are followed from a common starting point to a
common end point
 Censoring time is the same for all subjects
 Example: Everyone followed for 1 year
[Figure: follow-up timelines running from study start to study end.]
212
Right Censoring …
 Type II Right censoring
 all items are put on test at the same (starting) time and
 Stop observation when a set number of events have occurred
 often used in equipment testing
 replace all light bulbs when five have failed
[Figure: follow-up timelines running from study start to study end.]
213
Right Censoring ….
 Random right censoring
 "random" refers to the fact that the censoring process is unplanned
and not under the control of the investigator, so censoring occurs “randomly”;
 for each item/subject the survival time and the censoring time are
random variables, and we observe the minimum of these two
random variables.
214
Right Censoring ….
 Random right censoring, our focus, is more general than Type I
 Entry is at any time; the study itself continues until a fixed time point,
but subjects enter and leave the study at different times.
[Figure: follow-up timelines running from study start to study end.]
215
Left Censoring
 The event has occurred prior to the start of the study
 Or the true survival time is less than the person’s observed survival
time; only an upper bound for the time of the event of interest is known
 We know the event occurred, but are unsure when prior to observation
 In this kind of study, exact time would be known if it occurred after
the study started
216
Left Censoring …
 Example:
 Survey question: when did you first smoke?
 HIV: infection time
[Figure: timeline from study start to study end; the event is censored at
the start of the study and is followed by the follow-up time.]
217
Left Censoring …
 The following notation is used to denote left censored data:
T = survival time (event time)
C = censoring time
 The data are usually represented as (Y, ε) where
Y = max(T, C) is the time recorded, and
 ε indicates whether we observed an event time or a censoring time,
that is
218
Left Censoring …
$$Y = \max(T, C)$$
$$\varepsilon = I(T \ge C) = \begin{cases} 1, & \text{for an uncensored observation} \\ 0, & \text{for a censored observation} \end{cases}$$
219
Interval Censoring
Due to discrete observation times, actual times not
observed.
Example: progression-free survival
 Progression of cancer defined by change in tumor size
 Measure in 3-6 month intervals
 If increase occurs, it is known to be within interval, but not
exactly when.
220
Survival Analysis
 Survival analysis examines the hazard that a certain event occurs .
 Survival analysis possesses two main aims:
 Firstly, we want to estimate the time period during which the event can
happen.
 Second, we want to examine and describe the time distribution of the
event, and estimate quantitatively the impact of various independent
factors, called covariates, on this distribution.
 Note: Data collection essentially consists of a longitudinal record of
events which happen to study units/subjects.
221
Why Survival Analysis…
 Why not compare mean time-to-event between your groups using
a t-test or linear regression?
 ignores censoring
 If there is no censoring (everyone is followed to the outcome of
interest), then a t-test on mean or median time to event is fine.
 Why not compare proportion of events in your groups using
risk/odds ratios or logistic regression?
 ignores time
 If time at-risk was the same for everyone, could just use proportions.
222
The Survival Analysis Methodology
 Most modern methods are non-parametric (Kaplan-Meier,
1958) or
 semi-parametric (the Cox model, 1972), or
 they may be parametric.
 When no covariate is available in the data, Kaplan-Meier methods
can be used; otherwise the Cox model is the usual choice.
223
Probability Density Function
 The density of the failure time at t (loosely, the probability of
failure in a small interval starting at t, per unit of time, out of the
whole range of possible t’s) is
$$f(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t)}{\Delta t}$$
 The goal of survival analysis is to estimate and compare survival
experiences of different groups.
 Survival experience is described by the cumulative survival
function:
S (t ) = 1 − P (T ≤ t ) = 1 − F (t )
F(t) is the CDF of f(t), and is “more interesting” than f(t).
224
The survival function
 The survival function reflects the cumulative survival probabilities
throughout time.
 It is the proportion of units (individuals, organisations, etc.) that
have not yet experienced the event studied by time t.
 When events are studied, the dependent variable is frequently the
length of time until the event.
 The survival function is the unconditional probability that an event
has not yet occurred at the period of time t.
225
The Survival Function …
 A function describing the proportion of individuals surviving to or
beyond a given time or
 probability that a randomly selected individual will survive beyond
time t.
 Notation:
 T ≡ survival time of a randomly selected individual
 t ≡ a specific point in time.
 S(t) = P(T > t) ≡ Survival Function
226
The Survival Function…
Example 1: If t = 100 years, S(t = 100) is the probability of
surviving beyond 100 years.
$$\hat{S}(t) = \frac{\text{number of patients surviving longer than } t}{\text{total number of patients in the study}}$$
Example 2:
 Event = death,
 scale = months since Rx
 “S(t) = 0.3 at t = 60”
227
Survival Function
 From example 2 “The 5 year survival probability is 30%”, hence
“70% of patients die within the first 5 years”
Basic Properties of survival function:
 S(0)=1
 S(∞)=0, for example Everyone dies → S(∞) =0
 S(t) is non increasing
228
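When no observation is censored, the estimate Ŝ(t) is simply a proportion, and the basic properties of the survival function can be checked directly. The sketch below uses ten made-up survival times in years; the data and the function name are illustrative assumptions, not part of the lecture.

```python
# Sketch: empirical survival function without censoring.
# The ten survival times (in years) are hypothetical.

def empirical_survival(times, t):
    """Proportion of subjects whose survival time exceeds t."""
    return sum(1 for x in times if x > t) / len(times)

times = [0.4, 0.7, 0.9, 1.5, 2.0, 2.2, 3.1, 4.0, 5.5, 6.0]

s1 = empirical_survival(times, 1.0)   # share surviving beyond 1 year
grid = [0, 1, 2, 3, 4, 5, 6]
curve = [empirical_survival(times, t) for t in grid]

print(s1)
print(curve)
```

The curve starts at 1, never increases, and reaches 0 once every subject has had the event, in line with the properties listed above.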
Hazard Function
 The hazard function h(t) is the instantaneous rate of dying “at” time t,
given survival up to t.
 Also called the instantaneous failure rate, the age-specific failure
rate and force of mortality.
$$h(t) = \frac{f(t)}{S(t)}$$
 The risk set is the set of units (individuals) in the sample that are
at risk of the event occurring at a certain point in time.
229
Hazard Function
 For instance, at the first period of time (day, week, month, year), the
set of units of the sample is at risk.
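 The hazard rate is the conditional probability, per unit of time, that
the event occurs to a unit of the sample at a specific time, given that
the unit is still at risk.
$$\begin{aligned}
h(t) &= \lim_{\Delta t \to 0} \frac{1}{\Delta t}\, P(t \le T \le t + \Delta t \mid T \ge t) \\
     &= \lim_{\Delta t \to 0} \frac{1}{\Delta t}\, \frac{P([t \le T \le t + \Delta t] \cap [T \ge t])}{P(T \ge t)} \\
     &= \lim_{\Delta t \to 0} \frac{1}{\Delta t}\, \frac{P(t \le T \le t + \Delta t)}{P(T \ge t)} \\
     &= \frac{f(t)}{S(t)}
\end{aligned}$$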
230
Hazard Rate
 Example
 Event = death,
 scale = months since Rx
 “h(t) = 1% at t = 12 months”
 “At 1 year, patients are dying at a rate of 1% per month”
 “At 1 year the chance of dying in the following month is
1%”
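The two readings of h(t) = 1% per month (a rate, and a chance of dying in the following month) agree only approximately. A minimal sketch, assuming the hazard stays constant at 0.01 per month over the following month (an assumption made here only for illustration):

```python
import math

# Sketch: convert a hazard rate into a one-month conditional probability,
# assuming (for illustration) that the hazard is constant over that month.
h = 0.01            # hazard per month at t = 12 months, from the example
interval = 1.0      # length of the interval in months

# P(die within the next month | alive at 12 months) = 1 - exp(-h * interval)
p_next_month = 1 - math.exp(-h * interval)
print(round(p_next_month, 5))
```

For small hazards the probability is very close to the rate itself, which is why the two statements are used interchangeably.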
231
Connection Between the different Quantities
Hazard from density and survival: $h(t) = \dfrac{f(t)}{S(t)}$
Survival from density: $S(t) = \int_t^{\infty} f(u)\,du$
Density from survival: $f(t) = -\dfrac{dS(t)}{dt}$
Density from hazard: $f(t) = h(t)\exp\left(-\int_0^t h(u)\,du\right)$
Survival from hazard: $S(t) = \exp\left(-\int_0^t h(u)\,du\right)$
Hazard from survival: $h(t) = -\dfrac{d}{dt}\ln S(t)$
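These relationships can be checked numerically. The sketch below assumes a hypothetical linearly increasing hazard h(u) = 0.5u (not taken from the lecture), for which the closed form is S(t) = exp(-0.25 t²), and recovers S(t) from the hazard by numerical integration.

```python
import math

def hazard(u, a=0.5):
    """Hypothetical linearly increasing hazard h(u) = a * u."""
    return a * u

def survival_from_hazard(t, steps=10000):
    """S(t) = exp(-integral_0^t h(u) du), with a trapezoid-rule integral."""
    width = t / steps
    area = 0.0
    for i in range(steps):
        left, right = i * width, (i + 1) * width
        area += 0.5 * (hazard(left) + hazard(right)) * width
    return math.exp(-area)

t = 2.0
s_numeric = survival_from_hazard(t)
s_exact = math.exp(-0.5 * t ** 2 / 2)     # closed form for h(u) = 0.5 * u
f_from_hazard = hazard(t) * s_numeric     # density from hazard: f(t) = h(t) S(t)

print(s_numeric, s_exact, f_from_hazard)
```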
232
Estimates of Survival Probabilities
 Unlike ordinary regression models, survival methods correctly
incorporate information from both censored and uncensored
observations in estimating important model parameters.
 The dependent variable in survival analysis is composed of two
parts:
 one is the time to event and
 the other is the event status, which records if the event of interest
occurred or not.
233
Estimates of Survival Probabilities…
 It is possible to estimate two functions that are dependent on
time, the survival and hazard functions.
 The survival and hazard functions are key concepts in survival
analysis for describing the distribution of event times.
 While these are often of direct interest, many other quantities of
interest (e.g., median survival) may subsequently be estimated
from knowing either the hazard or survival function.
234
Estimates of Survival Probabilities …
 It is generally of interest in survival studies to describe the
relationship of a factor of interest (e.g. treatment) to the time to
event, in the presence of several covariates, such as age, gender,
race, etc.
 A number of models are available to analyze the relationship of a
set of predictor variables with the survival time. Methods include
parametric, nonparametric and semiparametric approaches.
235
Kaplan-Meier Estimates of Survival Probabilities
 The goal is to estimate a population survival curve from a
sample.
 If every patient is followed until death, the curve may be
estimated simply by computing the fraction surviving at each
time.
 However, in most studies patients tend to drop out, become lost
to follow up, move away, etc.
 The Kaplan-Meier method allows estimation of survival over time, even
when patients drop out or are studied for different lengths of time.
236
Kaplan-Meier Estimates of Survival Probabilities…
 Also called Product Limit
 Non-parametric estimate of the survival function
 Empirical probability of surviving past certain times in the sample
 Applicable only for right censored data
 Commonly used to compare two study populations
 It is a step function: the Kaplan–Meier estimate does not change
between events, nor at times when only censorings occur. It drops
only at times when a failure has been observed.
237
Kaplan-Meier Estimates of Survival Probabilities…
Limitations:
 Mainly descriptive
 Doesn’t control for covariates
 Requires categorical predictors
 Can’t accommodate time-dependent variables
238
Kaplan-Meier Estimate
 When there are no censored data, the KM estimator is simple
and intuitive:
 Estimated S(t)= proportion of observations with failure times > t.
 For example, if you are following 10 patients, and 3 of them die
by the end of the first year, then your best estimate of S(1 year) =
70%.
 When there are censored data, KM provides estimate of S(t)
that takes censoring into account
239
Kaplan-Meier Estimate …
 The PL method assumes that censoring is independent of the
survival times:(that is, the reason an observation is censored is
unrelated to the cause of failure).
 K-M estimates are limited to the time interval in which the
observations fall
 If the largest observation is uncensored, the PL estimate at that
time equals zero
 Median survival is the point in time at which S(t) is 0.5, that is,
when 50% of the subjects have acquired the event.
240
Kaplan-Meier Estimate …
 Note that:
 (1) for each time period the number of individuals present at the
start of the period is adjusted according to
 the number of individuals censored and
 the number of individuals who experienced the event of interest in the
previous time period, and
 (2) for ties between failures and censored observations, the
failures are assumed to occur first.
241
Kaplan-Meier Estimate
Observed event times
 Let there be K distinct event times $t_1 < t_2 < \dots < t_K$
 At each time tj, there are nj individuals at risk (where at risk
means individuals who have neither had the event nor been censored before tj)
 dj is the number who have the event at time tj
Multiply the probability of surviving event time t with the
probabilities of surviving all the previous event times:
$$\hat{S}(t) = \prod_{j:\, t_j \le t} \frac{n_j - d_j}{n_j} \quad \text{for } 0 \le t \le t^{+}$$
which represents the estimated survival probability at time t: P(T > t).
$\dfrac{d_j}{n_j}$ = proportion that failed at the time tj
$1 - \dfrac{d_j}{n_j}$ = proportion surviving the event time
242
Kaplan-Meier Estimate
Consider the following hypothetical data example:
Time  At risk  Died  Censored  Point survival probability (pj)  Cumulative survival (Sj)
0     31       2     3         (31-2)/31 = 0.9355               0.9355
1     26       1     2         (26-1)/26 = 0.9615               0.8995
2     23       1     2         (23-1)/23 = 0.9565               0.8604
3     20       1     2         (20-1)/20 = 0.95                 0.8173
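The cumulative survival column is the running product of the point survival probabilities. A minimal sketch that recomputes it from the "at risk" and "died" columns of the table above (values should agree with the table up to rounding in the last digit):

```python
# Sketch: cumulative Kaplan-Meier survival as a running product.
# Rows are (time, number at risk, number died) taken from the table above.
rows = [(0, 31, 2), (1, 26, 1), (2, 23, 1), (3, 20, 1)]

surv = 1.0
cumulative = []
for time, n_j, d_j in rows:
    p_j = (n_j - d_j) / n_j      # point survival probability at this time
    surv *= p_j                  # multiply by all previous probabilities
    cumulative.append((time, p_j, surv))

for time, p_j, s_j in cumulative:
    print(time, round(p_j, 4), round(s_j, 4))
```

Note that the number at risk in each row already reflects the deaths and censorings of the previous row (for example 31 - 2 - 3 = 26).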
243
Example: Kaplan-Meier Estimate
Time   Number left   Number failed   Prob. of Survival
0      17            0               1.0000
6*     16            0               1.000
10     16            2               0.8750
16     14            1               0.8125
16*    14            1               0.8125
16*    14            1               0.8125
17     11            1               0.7386
20     10            1               0.6648
23     9             1               0.5909
26     8             1               0.5170
26*    8             1               0.5170
29*    6             1               0.5170
30*    5             1               0.5170
31*    4             1               0.5170
38*    3             1               0.5170
* Indicates censored observations.
244
Kaplan-Meier Estimate Using R
The survival package contains many functions and data sets for survival analysis.
> library(survival)   # load the package
> data(aml)           # load the data set aml
> aml                 # see the data
 To estimate the distribution of lifetimes non-parametrically, based
on right censored observations, we use the Kaplan-Meier estimator.
 The R function to do that is survfit() (part of the survival package).
245
Kaplan-Meier Estimate Using R
> aml2 <- Surv(aml$time, aml$status)   # creates an object with censoring indicated with a +
> aml2
 [1]   9   13   13+  18   23   28+  31   34   45+  48  161+   5    5    8    8
[16]  12   16+  23   27   30   33   43   45
 This Surv() object can then be entered into the survfit() function
to obtain Kaplan-Meier estimates of the survival function. The
survfit() function works similarly to lm() or glm(): we put the
survival time data on the left-hand side of the ~ and any predictors
(groups) on the right. In this case there are no predictors, so this
can be considered a simple intercept model.
246
Kaplan-Meier Estimate Using R
> survfit(aml2~1)
Call: survfit(formula = aml2 ~ 1)
Records   n.max  n.start  events  median  0.95LCL  0.95UCL
     23      23       23      18      27       18       45
247
Kaplan-Meier Estimate Using R
> summary(survfit(aml2~1))
Call: survfit(formula = aml2 ~ 1)
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
    5     23       2   0.9130  0.0588       0.8049        1.000
    8     21       2   0.8261  0.0790       0.6848        0.996
    9     19       1   0.7826  0.0860       0.6310        0.971
   12     18       1   0.7391  0.0916       0.5798        0.942
   13     17       1   0.6957  0.0959       0.5309        0.912
   18     14       1   0.6460  0.1011       0.4753        0.878
   23     13       2   0.5466  0.1073       0.3721        0.803
   27     11       1   0.4969  0.1084       0.3240        0.762
   30      9       1   0.4417  0.1095       0.2717        0.718
   31      8       1   0.3865  0.1089       0.2225        0.671
   33      7       1   0.3313  0.1064       0.1765        0.622
   34      6       1   0.2761  0.1020       0.1338        0.569
   43      5       1   0.2208  0.0954       0.0947        0.515
   45      4       1   0.1656  0.0860       0.0598        0.458
   48      2       1   0.0828  0.0727       0.0148        0.462
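To see what survfit() computes, the product-limit estimate can be sketched in a few lines of plain Python. This is an illustrative re-implementation, not the survival package's code; the times and status values are the aml observations as printed by Surv() above (1 = event, 0 = censored, shown there with a +).

```python
def kaplan_meier(times, status):
    """Return [(t_j, n_j, d_j, S(t_j))] at each distinct event time.

    status: 1 = event observed, 0 = right-censored.
    """
    event_times = sorted({t for t, s in zip(times, status) if s == 1})
    surv = 1.0
    table = []
    for tj in event_times:
        # At risk: everyone whose recorded time is tj or later. Subjects
        # censored exactly at tj count as at risk (failures occur first).
        n_j = sum(1 for t in times if t >= tj)
        d_j = sum(1 for t, s in zip(times, status) if t == tj and s == 1)
        surv *= (n_j - d_j) / n_j
        table.append((tj, n_j, d_j, surv))
    return table

times = [9, 13, 13, 18, 23, 28, 31, 34, 45, 48, 161,
         5, 5, 8, 8, 12, 16, 23, 27, 30, 33, 43, 45]
status = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0,
          1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1]

km = kaplan_meier(times, status)
for tj, n_j, d_j, s in km:
    print(tj, n_j, d_j, round(s, 4))
```

The printed time, n.risk, n.event and survival columns can be compared line by line with the summary() output above.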
248
Kaplan-Meier Estimate Using R
> plot(survfit(aml2~1, conf.type="plain"), xlab="Time", ylab="Survival")
249
Kaplan-Meier Estimate Using R
> plot(survfit(aml2~1, conf.type="log-log"), xlab="Time", ylab="Survival")
250
Variance for Kaplan-Meier Estimate
 The Greenwood variance estimate for a K-M curve is defined as:
$$\widehat{\mathrm{Var}}[\hat{S}(t)] = \hat{S}(t)^2 \sum_{j:\, t_j \le t} \frac{d_j}{n_j (n_j - d_j)}$$
 It underestimates the true variance for small to moderate samples.
 It gives a confidence interval at each time point t (pointwise confidence
interval):
$$\hat{S}(t) \pm z_{\alpha/2}\, \hat{\sigma}(\hat{S}(t))$$
 This interval may contain points outside the [0, 1] interval
 A remedy is to use the log-log function option
251
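A small numerical sketch of the Greenwood formula and the plain interval, using the first event time of the aml output shown earlier (23 at risk, 2 events). The function name and the use of z = 1.96 for a 95% interval are choices made here for illustration.

```python
import math

def greenwood_se(steps):
    """Return (S(t), standard error of S(t)) by Greenwood's formula.

    steps: list of (n_j, d_j) for every event time t_j <= t.
    Assumes d_j < n_j at each step (otherwise the sum is undefined).
    """
    surv = 1.0
    total = 0.0
    for n_j, d_j in steps:
        surv *= (n_j - d_j) / n_j
        total += d_j / (n_j * (n_j - d_j))
    return surv, surv * math.sqrt(total)

surv, se = greenwood_se([(23, 2)])   # aml data at t = 5
z = 1.96
lower, upper = surv - z * se, surv + z * se
print(round(surv, 4), round(se, 4), round(lower, 4), round(upper, 4))
```

The standard error matches the std.err column of the R output, and the plain upper limit exceeds 1, which is exactly the problem the log-log transformation addresses.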
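Variance for Kaplan-Meier Estimate
 For the transformation g(S(t)) = log(-log(S(t))), the delta method gives:
$$\widehat{\mathrm{Var}}\left[g(\hat{S}(t))\right] = \left[g'(\hat{S}(t))\right]^2 \widehat{\mathrm{Var}}[\hat{S}(t)] = \frac{1}{\hat{S}(t)^2 \left[\log \hat{S}(t)\right]^2}\, \widehat{\mathrm{Var}}[\hat{S}(t)] = \frac{1}{\left[\log \hat{S}(t)\right]^2} \sum_{j:\, t_j \le t} \frac{d_j}{n_j (n_j - d_j)}$$
 The confidence interval for log(-log(S(t))) is:
$$\log(-\log \hat{S}(t)) \pm z_{\alpha/2} \sqrt{\widehat{\mathrm{Var}}\left[\log(-\log \hat{S}(t))\right]}$$
 The confidence interval for S(t) is obtained by back-transforming:
$$\left[\hat{S}(t)\right]^{\exp\left(\pm z_{\alpha/2} \sqrt{\widehat{\mathrm{Var}}\left[\log(-\log \hat{S}(t))\right]}\right)}$$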
252
Confidence interval for Survival Curves
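 The Greenwood variance estimate for a K-M curve is defined
as:
$$\widehat{\mathrm{Var}}[\hat{S}(t)] = \hat{S}(t)^2 \sum_{j:\, t_j \le t} \frac{d_j}{n_j (n_j - d_j)}$$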
 We can use Greenwood variance estimate to derive a confidence
interval for all time points t (point wise confidence interval).
 What might be a potential problem with using this estimate for
a confidence interval?
 Hint: 0 ≤S(t)≤1. But may contain points outside the [0; 1]
interval.
253
Confidence interval for Survival Curves
Solution:
Transform the survivor function so that the confidence interval
falls in the [0, 1] range.
The usual solution is to use the log-log function option. Let g
(S(t)) = log (-log (S(t))).
Using the delta method, we can get a variance estimate for the
transformed function:
$$\widehat{\mathrm{Var}}\left[g(\hat{S}(t))\right] = \left[g'(\hat{S}(t))\right]^2 \widehat{\mathrm{Var}}[\hat{S}(t)] = \frac{1}{\hat{S}(t)^2 \left[\log \hat{S}(t)\right]^2}\, \widehat{\mathrm{Var}}[\hat{S}(t)] = \frac{1}{\left[\log \hat{S}(t)\right]^2} \sum_{j:\, t_j \le t} \frac{d_j}{n_j (n_j - d_j)}$$
254
Confidence interval for Survival Curves
and therefore our confidence interval for log(-log(S(t))) is
$$\log(-\log \hat{S}(t)) \pm z_{\alpha/2} \sqrt{\widehat{\mathrm{Var}}\left[\log(-\log \hat{S}(t))\right]}$$
The confidence interval for S(t) is obtained by back-transforming:
$$\left[\hat{S}(t)\right]^{\exp\left(\pm z_{\alpha/2} \sqrt{\widehat{\mathrm{Var}}\left[\log(-\log \hat{S}(t))\right]}\right)}$$
255
Confidence interval for Survival Curves
 Example: Let us assume the following artificial data and compute
the log-log back-transformed confidence interval.
Time (tj)  Number at risk (nj)  Number of events (dj)  Survival (S(tj))  Standard deviation  95% CI lower  95% CI upper
6          21                   3                      0.8571            0.0764              0.6197        0.9516
7          17                   1                      0.8067            0.0870              0.5635        0.9228
10         15                   1                      0.7529            0.0964              0.5032        0.8894
13         12                   1                      0.6902            0.1068              0.4316        0.8491
16         11                   1                      0.6275            0.1141              0.3675        0.8049
22         7                    1                      0.5378            0.1282              0.2678        0.7468
23         6                    1                      0.4482            0.1346              0.1881        0.6801
The formulas used are:
$$\widehat{\mathrm{Var}}\left[g(\hat{S}(t))\right] = \left[g'(\hat{S}(t))\right]^2 \widehat{\mathrm{Var}}[\hat{S}(t)] = \frac{1}{\hat{S}(t)^2 \left[\log \hat{S}(t)\right]^2}\, \widehat{\mathrm{Var}}[\hat{S}(t)] = \frac{1}{\left[\log \hat{S}(t)\right]^2} \sum_{j:\, t_j \le t} \frac{d_j}{n_j (n_j - d_j)}$$
$$\log(-\log \hat{S}(t)) \pm z_{\alpha/2} \sqrt{\widehat{\mathrm{Var}}\left[\log(-\log \hat{S}(t))\right]}$$
$$\left[\hat{S}(t)\right]^{\exp\left(\pm z_{\alpha/2} \sqrt{\widehat{\mathrm{Var}}\left[\log(-\log \hat{S}(t))\right]}\right)}$$
256
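The log-log interval in the artificial example can be reproduced with a short sketch. It uses the first row of that example (21 at risk, 3 events at time 6); the function name and z = 1.96 are illustrative choices.

```python
import math

def loglog_ci(steps, z=1.96):
    """Kaplan-Meier estimate with a back-transformed log-log interval.

    steps: list of (n_j, d_j) for every event time t_j <= t.
    Assumes 0 < S(t) < 1 so that log(-log S(t)) is defined.
    """
    surv = 1.0
    total = 0.0
    for n_j, d_j in steps:
        surv *= (n_j - d_j) / n_j
        total += d_j / (n_j * (n_j - d_j))
    # Delta method: Var[log(-log S)] = sum / (log S)^2
    se_loglog = math.sqrt(total) / abs(math.log(surv))
    # Back-transform: S ** exp(+/- z * se); the larger exponent gives the lower limit
    lower = surv ** math.exp(z * se_loglog)
    upper = surv ** math.exp(-z * se_loglog)
    return surv, lower, upper

surv, lower, upper = loglog_ci([(21, 3)])
print(round(surv, 4), round(lower, 4), round(upper, 4))
```

Because the interval is obtained by raising Ŝ(t) to a positive power, both limits always stay inside [0, 1]. Later rows are obtained by passing all (nj, dj) pairs up to that time, for example `loglog_ci([(21, 3), (17, 1)])` for time 7.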
Comparison of Survival Curves
 As in most of statistics, a key objective is to test whether
subpopulations behave in the same way.
 In survival analysis it is important to compare the survival times of
different groups.
 Various tests have been proposed for testing for differences in
survival between categorical covariates.
257
Comparison of Survival Curves
 Test whether subpopulations behave in the same way.
 Plot the corresponding estimates of the survivor functions on the
same axes.
 Have a look at the following graph
258
Logrank
 Due to censoring , classical tests such as t-test and Wilcoxon test
cannot be used for comparison of the survival times
 Various tests have been designed for comparison of survival
curves, when censoring is present
 The most popular ones are:
 Logrank test
 Wilcoxon (Gehan) test
 The Logrank test has more power than Wilcoxon for detecting late
differences.
259
Logrank
Compute the following quantities:
$$e_{1i} = \frac{n_{1i}\, d_i}{n_i}, \qquad v_{1i} = \frac{n_{1i}\, n_{2i}\, d_i\, (n_i - d_i)}{n_i^2\, (n_i - 1)}, \qquad \text{where } d_i = d_{1i} + d_{2i} \text{ and } n_i = n_{1i} + n_{2i}$$
$$O_1 - E_1 = \sum_{i=1}^{k} (d_{1i} - e_{1i}), \qquad V_1 = \sum_{i=1}^{k} v_{1i}$$
Compute the "Z"-statistic (software packages often square this to get a chi-square):
$$T_{MH} = \frac{O_1 - E_1}{\sqrt{V_1}} \sim N(0, 1) \quad \text{under } H_0: \text{no differences in survival functions}$$
$$\chi^2_{MH} = \frac{(O_1 - E_1)^2}{V_1} \sim \chi^2_1$$
Alternative (less preferred, but easier computationally) method:
$$O_2 = \sum_{i=1}^{k} d_{2i}, \qquad E_2 = O_1 + O_2 - E_1$$
Compute the chi-square statistic:
$$X^2 = \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2} \sim \chi^2_1 \quad \text{under } H_0: \text{no differences in survival functions}$$
260
Logrank
 Tests the null hypothesis that the survival curves in the two groups
are the same.
 Example of the logrank test on the following fictitious data, where +
indicates a censored observation
 Treatment Old
 3, 5, 7, 9+, 18
 Treatment New
 12, 19, 20, 20+, 33+
Check whether there is a difference on the survival time of
individuals on the new and old treatment using log-rank test.
261
Example Logrank
Days  Trt Old at risk (n1i)  Trt New at risk (n2i)  Trt Old died (d1i)  Trt New died (d2i)  Expected (e1i)  Variance (v1i)
3     5                      5                      1                   0                   0.500           0.2500
5     4                      5                      1                   0                   0.444           0.2469
7     3                      5                      1                   0                   0.375           0.2344
9+    2                      5                      0                   0                   0.000           0.0000
12    1                      5                      0                   1                   0.167           0.1389
18    1                      4                      1                   0                   0.200           0.1600
19    0                      4                      0                   1                   0.000           0.0000
20    0                      3                      0                   1                   0.000           0.0000
33+   0                      1                      0                   0                   0.000           0.0000
Total                                               4                                       1.686           1.0302
where
$$e_{1i} = \frac{n_{1i}\, d_i}{n_i}, \qquad v_{1i} = \frac{n_{1i}\, n_{2i}\, d_i\, (n_i - d_i)}{n_i^2\, (n_i - 1)}$$
$$\chi^2_{MH} = \frac{(O_1 - E_1)^2}{V_1} = \frac{(4 - 1.686)^2}{1.0302} = 5.20$$
which indicates a significant difference at the 5% level (the critical value of $\chi^2_1$ is 3.84).
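The logrank quantities can be computed directly from the raw data of this example. The sketch below is an illustrative implementation of the formulas for e1i and v1i, not the routine of any particular package; each observation is written as (time, status) with status 1 for a death and 0 for a censored time.

```python
def logrank(group1, group2):
    """Two-group logrank test; returns (O1, E1, V1, chi-square).

    Each group is a list of (time, status), status 1 = event, 0 = censored.
    """
    event_times = sorted({t for t, s in group1 + group2 if s == 1})
    o1 = e1 = v1 = 0.0
    for ti in event_times:
        n1 = sum(1 for t, _ in group1 if t >= ti)   # at risk in group 1
        n2 = sum(1 for t, _ in group2 if t >= ti)   # at risk in group 2
        d1 = sum(1 for t, s in group1 if t == ti and s == 1)
        d2 = sum(1 for t, s in group2 if t == ti and s == 1)
        n, d = n1 + n2, d1 + d2
        o1 += d1
        e1 += n1 * d / n                            # e_1i
        if n > 1:
            v1 += n1 * n2 * d * (n - d) / (n ** 2 * (n - 1))   # v_1i
    chi2 = (o1 - e1) ** 2 / v1
    return o1, e1, v1, chi2

old = [(3, 1), (5, 1), (7, 1), (9, 0), (18, 1)]
new = [(12, 1), (19, 1), (20, 1), (20, 0), (33, 0)]
o1, e1, v1, chi2 = logrank(old, new)
print(o1, round(e1, 3), round(v1, 4), round(chi2, 2))
```

The statistic is compared with the 5% critical value of the chi-square distribution with 1 degree of freedom, 3.84.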
262
General Expression for two groups
$$\text{Test statistic} = \frac{\left[\sum_{i=1}^{m} w_i \left(d_{1,(i)} - \hat{e}_{1,(i)}\right)\right]^2}{\sum_{i=1}^{m} w_i^2\, \hat{v}_{1,(i)}} \sim \chi^2_1$$
 The log-rank test uses $w_i = 1$. It puts emphasis on larger values of
time.
 The (generalised) Wilcoxon test uses $w_i = n_i$. It puts emphasis on
smaller values of time.
 The Tarone–Ware test uses $w_i = \sqrt{n_i}$. It puts emphasis on
intermediate values of time.
 m is the number of distinct event times.
263
Practicing Exercise
Exercise: Pollock et al. (1989) radio-tagged 18 quail
(Colinus virginianus L.) and followed their survival. The
following are death or censoring(+) times in weeks:
3, 3, 6, 8, 8+, 9, 9+, 9+, 10, 10+, 12+, 13+, 13+, 13+,
13+, 13+, 13+,13+.
Construct the Kaplan–Meier estimate of the survival
function, the variance of this estimate, and a 95%
confidence interval (plain, log and log-log).
264
Parametric Survival Models
 The Kaplan-Meier estimator is a very useful tool for estimating
survival functions.
 Sometimes, we may want to make more assumptions that allow us
to model the data in more detail.
 By specifying a parametric form for S(t), we can
 easily compute selected quantiles of the distribution
 estimate the expected failure time
 derive a concise equation and smooth function for estimating S(t),
H(t) and h(t)
 estimate S(t) more precisely than KM assuming the parametric form
is correct!
265
Exponential Distribution
Characterized by one parameter λ> 0
Leads to a constant hazard function, h(t)=λ
The survival function is, S(t)= exp(-λt)
The density function is, f(t)= λexp(-λt)
Memoryless (lack-of-memory) property: P(T ≥ t+z | T ≥ t) = P(T ≥ z)
Not reasonable in many applications
 Hazards usually not constant over time
266
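The memoryless property follows directly from S(t) = exp(-λt) and can be checked numerically. The rate λ = 0.2 and the time points below are arbitrary illustrative values.

```python
import math

lam = 0.2   # hypothetical constant hazard (rate parameter)

def surv(t):
    """Exponential survival function S(t) = exp(-lam * t)."""
    return math.exp(-lam * t)

t, z = 3.0, 2.0
conditional = surv(t + z) / surv(t)   # P(T >= t + z | T >= t)
unconditional = surv(z)               # P(T >= z)
print(conditional, unconditional)
```

Having already survived to time t does not change the chance of surviving a further z units, which is why the exponential model is unrealistic when risk changes with age or time since treatment.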
Exponential Distribution
An empirical check of the exponential distribution for a
set of survival data is provided by plotting the log of the
survival function estimate versus time. Since log S(t) = -λt, such a
plot should approximate a straight line through the origin.
267
Weibull Distribution
Two parameters:
α - shape parameter > 0
β - scale parameter > 0
 $S_0(t) = \exp(-\beta t^{\alpha})$
 $f_0(t) = \alpha \beta t^{\alpha - 1} \exp(-\beta t^{\alpha})$
 $h_0(t) = \alpha \beta t^{\alpha - 1}$
268
Weibull Distribution
 hazard decreases monotonically with time if α< 1
 hazard increases monotonically with time if α> 1
 hazard is constant over time if α = 1 (exponential case): second
parameter makes it more flexible than exponential
 An empirical check for the Weibull distribution is provided by a
plot of the log-log estimate versus log time. The plot should give
approximately a straight line.
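The three hazard shapes and the log-log check can be illustrated numerically. The sketch below assumes the parameterization S(t) = exp(-βt^α) used on the previous slide; the value β = 0.3 and the time points are arbitrary illustrative choices, and the function names are not from the lecture.

```python
import math


def weibull_survival(t, alpha, beta):
    """S(t) = exp(-beta * t**alpha)."""
    return math.exp(-beta * t ** alpha)


def weibull_hazard(t, alpha, beta):
    """h(t) = alpha * beta * t**(alpha - 1)."""
    return alpha * beta * t ** (alpha - 1)


def loglog_survival(t, alpha, beta):
    """log(-log S(t)); for a Weibull this equals log(beta) + alpha * log(t)."""
    return math.log(-math.log(weibull_survival(t, alpha, beta)))


times = [0.5, 1.0, 2.0, 4.0]
beta = 0.3  # illustrative scale value
for alpha in (0.5, 1.0, 2.0):
    hazards = [round(weibull_hazard(t, alpha, beta), 4) for t in times]
    # Slope of log(-log S) against log t between t = 1 and t = 4
    slope = ((loglog_survival(4.0, alpha, beta) - loglog_survival(1.0, alpha, beta))
             / (math.log(4.0) - math.log(1.0)))
    print(f"alpha={alpha}: hazards={hazards}, log-log slope={slope:.4f}")
```

The slope of log(-log S(t)) against log t recovers α and the intercept is log β, which is why a straight line on that plot supports the Weibull form. With α = 1 the hazard is constant and the model reduces to the exponential.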
269
Modeling Survival Data
 We could use a semi-parametric model, one where the baseline
hazard rate is not specified.
 The most common semi-parametric model is the Cox Proportional
Hazards model (Cox, 1972), typically called the Cox Model or PH
regression.
 The hazard rate is simply evaluated at every data point in light of
the covariates.
 Cox Regression builds a predictive model for time-to-event data
270
Modeling Survival Data
 Note that information from censored subjects contributes
usefully to the estimation of the model.
 Proportional hazards assumption: the hazard for any individual
is a fixed proportion of the hazard for any other individual
 Multiplicative risk
$h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}$
271
Modeling Survival Data
 In words: the probability that if you survive to t, you will
succumb to the event in the next instant.
Hazard from density and survival: $h(t) = \frac{f(t)}{S(t)}$
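The link between the limit definition of the hazard and the ratio f(t)/S(t) can be written out in a few steps. This sketch assumes T is continuous, so that P(T ≥ t) = S(t), and introduces F(t) = 1 - S(t) for the distribution function, a symbol not used on the slides.

```latex
\begin{aligned}
h(t) &= \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
      = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t)}{\Delta t \, P(T \ge t)} \\
     &= \frac{1}{S(t)} \lim_{\Delta t \to 0} \frac{F(t + \Delta t) - F(t)}{\Delta t}
      = \frac{f(t)}{S(t)}
      = -\frac{d}{dt} \log S(t)
\end{aligned}
```

Integrating the last expression gives the cumulative hazard H(t) = -log S(t), equivalently S(t) = exp(-H(t)).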
272
Components of Cox PH
A baseline hazard function $h_0(t)$ that is left unspecified but must
be positive (= the hazard when all covariates are 0). It can take on any form!
A linear function of a set of k fixed covariates that is
exponentiated (= the relative risk).
$h_i(t) = h_0(t)\, e^{\beta_1 x_{i1} + \dots + \beta_k x_{ik}}$
$\log h_i(t) = \log h_0(t) + \beta_1 x_{i1} + \dots + \beta_k x_{ik}$
273
Cox PH Model
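$h_i(t) = h_0(t)\, e^{\beta_1 x_{i1} + \dots + \beta_k x_{ik}}$
$\log h_i(t) = \log h_0(t) + \beta_1 x_{i1} + \dots + \beta_k x_{ik}$
Hazard ratio of person i (e.g. a smoker) to person j (e.g. a non-smoker):
$HR_{i,j} = \frac{h_i(t)}{h_j(t)} = \frac{h_0(t)\, e^{\beta_1 x_{i1} + \dots + \beta_k x_{ik}}}{h_0(t)\, e^{\beta_1 x_{j1} + \dots + \beta_k x_{jk}}} = e^{\beta_1 (x_{i1} - x_{j1}) + \dots + \beta_k (x_{ik} - x_{jk})}$
The baseline hazard $h_0(t)$ cancels, so the hazard ratio does not depend on time.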
274
Parameter interpretation
Smoker (person i) versus non-smoker (person j), both aged 60:
$HR_{\text{lung cancer/smoking}} = \frac{h_i(t)}{h_j(t)} = \frac{h_0(t)\, e^{\beta_{\text{smoking}}(1) + \beta_{\text{age}}(60)}}{h_0(t)\, e^{\beta_{\text{smoking}}(0) + \beta_{\text{age}}(60)}} = e^{\beta_{\text{smoking}}(1 - 0)} = e^{\beta_{\text{smoking}}}$
 This is the hazard ratio for smoking adjusted for age.
Non-smoker aged 70 (person i) versus non-smoker aged 60 (person j):
$HR_{\text{lung cancer/10-year increase in age}} = \frac{h_i(t)}{h_j(t)} = \frac{h_0(t)\, e^{\beta_{\text{smoking}}(0) + \beta_{\text{age}}(70)}}{h_0(t)\, e^{\beta_{\text{smoking}}(0) + \beta_{\text{age}}(60)}} = e^{\beta_{\text{age}}(70 - 60)} = e^{\beta_{\text{age}}(10)}$
 This is the hazard ratio for a 10-year increase in age,
adjusted for smoking.
 Exponentiating the coefficient of a continuous predictor gives the
hazard ratio for a 1-unit increase in the predictor.
275
Example
 Study on Systolic Hypertension in the Elderly Program (SHEP);
This was a study of 4,736 persons over age 60 with isolated
systolic hypertension (i.e., people with high systolic blood
pressure and normal diastolic blood pressure) to see if treatment
with a low-dose diuretic and/or betablocker would reduce the rate
of strokes compared with the rate in the control group treated with
placebo.
Variable              Coefficient   SE        Exp(Coeff)
Race                  -0.1031       0.2607    0.90
Sex (male)             0.1707       0.1952    1.19
Age                    0.0598       0.01405   1.06
History of diabetes    0.5322       0.2397    1.70
Smoking (baseline)     0.6214       0.2390    1.86
Interpreting results
 This means that a person with untreated systolic hypertension who
has a history of diabetes has 1.7 times the risk of having a stroke
compared with a person with the same other characteristics but no diabetes.
This can also be stated as a 70% greater risk.
 For age, the hazard ratio per 5-year increase is exp(5 × 0.0598) ≈ 1.35:
there is a 35% increase in risk of future stroke per 5-year greater age
at baseline, controlling for all the
other variables in the model.
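The hazard ratios in the SHEP table, and the 5-year figure for age, can be reproduced from the coefficients. The sketch below copies the coefficients and standard errors from the table; the Wald-type 95% interval exp(coef ± 1.96·SE) is an addition that the slides do not show, and the function name `hazard_ratio` is illustrative.

```python
import math

# (coefficient, standard error) copied from the SHEP table in this example
shep = {
    "Race": (-0.1031, 0.2607),
    "Sex (male)": (0.1707, 0.1952),
    "Age": (0.0598, 0.01405),
    "History of diabetes": (0.5322, 0.2397),
    "Smoking (baseline)": (0.6214, 0.2390),
}


def hazard_ratio(coef, se, units=1.0, z=1.96):
    """HR for a `units`-unit increase, with a Wald interval (an assumption here)."""
    return (math.exp(units * coef),
            math.exp(units * (coef - z * se)),
            math.exp(units * (coef + z * se)))


for name, (coef, se) in shep.items():
    hr, lo, hi = hazard_ratio(coef, se)
    print(f"{name:22s} HR={hr:.2f}  95% CI=({lo:.2f}, {hi:.2f})")

# Hazard ratio for a 5-year increase in age: exp(5 * beta_age)
hr5, lo5, hi5 = hazard_ratio(*shep["Age"], units=5)
print(f"Age, per 5 years       HR={hr5:.2f}  95% CI=({lo5:.2f}, {hi5:.2f})")
```

An interval that contains 1 (as for race and sex here) indicates that the data are compatible with no effect of that covariate at the 5% level.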
Chapter Five
Disease Screening
278
Introduction
 Screening is systematic application of a test or investigation to
people
 Screening is the process by which unrecognized disease or
defects are identified, using tests which can be applied rapidly
and to large numbers of people.
Introduction
 A screening programme can be:
 population screening (sometimes referred to as ‘mass screening’), in which
the aim is to screen everyone in a particular population, e.g.
– all newborn babies
– everyone over the age of 50 years
 ‘individual screening’ or ‘targeted screening’, e.g.
– frequent eye tests are carried out on people with diabetes
Introduction
 When examining a screening test we tend to look most closely at its:
 Validity: compare the screening test against some “gold standard” and as a
measure calculate:
– Sensitivity
– Specificity
 Reproducibility: do the tests repeatedly in the same individuals and
calculate measures of:
– Intrasubject variation
– Interobserver variation (for example, Kappa measures of agreement)
 Efficacy: use the following measures:
– Positive Predictive Value and
– Negative Predictive Value
Sensitivity and Specificity
 Sensitivity is the probability that a person having the disease is
detected by the test
= P(test positive | they have the disease)
= P(T+|D+)
 Specificity is the probability that a person who does not have the
disease is classified that way by the test
= P(test negative | they don’t have the disease)
= P(T-|D-)
Sensitivity and Specificity
               Disease (“gold standard”)
Test result    Present                Absent                     Total
Positive       TP                     FP                         All who test +
Negative       FN                     TN                         All who test -
Total          All with the disease   All without the disease
$\text{Sensitivity} = \frac{TP}{TP + FN}$
$\text{Specificity} = \frac{TN}{FP + TN}$
Negative predictive and Positive Predictive values
 For a measure of the efficacy of the test we use
 Positive Predictive Value – is probability that someone who tests
positive for the disease will actually have the disease
= P (have disease | positive test result)
 Negative Predictive Value- is probability that someone who tests
negative for the disease will actually not have the disease
=P (don’t have disease | negative test result)
Negative predictive and Positive Predictive values
               Disease (“gold standard”)
Test result    Present                Absent                     Total
Positive       TP                     FP                         All who test +
Negative       FN                     TN                         All who test -
Total          All with the disease   All without the disease
$PPV = \frac{TP}{TP + FP}$
$NPV = \frac{TN}{FN + TN}$
One of the reasons Positive Predictive Value is used as a measure of
efficacy is because it depends on the prevalence of the disease
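The four measures, and the way PPV moves with prevalence, can be checked with a small calculation. The counts and the sensitivity and specificity values below are hypothetical, invented only for illustration; the prevalence formula is Bayes' theorem applied to P(D+ | T+).

```python
def screening_measures(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from the cells of the 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (fp + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (fn + tn),
    }


def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """P(D+ | T+) by Bayes' theorem, for a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical screening study of 1000 people, 100 of them diseased
m = screening_measures(tp=90, fp=50, fn=10, tn=850)
print({k: round(v, 3) for k, v in m.items()})

# Same test (sensitivity 0.90, specificity 0.95) applied at different prevalences
for prev in (0.001, 0.01, 0.10, 0.30):
    print(prev, round(ppv_from_prevalence(0.90, 0.95, prev), 3))
```

Sensitivity and specificity are properties of the test, whereas PPV falls sharply when the disease is rare, so the same test can perform very differently in a low-prevalence population.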
Related Concepts to Screening
 False Positive:
– the test reports a positive result for a person who is disease
free.
– The false positive rate is given by P(T+ | D-) = FP/(FP + TN) = 1 - specificity.
 False Negative:
– the test reports a negative result for a person who actually
has the disease.
– The false negative rate is given by P(T- | D+) = FN/(TP + FN) = 1 - sensitivity.
Related Concepts to Screening
 Which false result is the more serious depends on the situation.
 But we generally worry more about false positives in screening
tests.
 We don't want to tell someone that they have a serious disease
when they do not really have it.
Calculating False Positive and False Negative
               Disease (“gold standard”)
Test result    Present                Absent                     Total
Positive       TP                     FP                         All who test +
Negative       FN                     TN                         All who test -
Total          All with the disease   All without the disease
$\text{False positive rate} = \frac{FP}{FP + TN}$
$\text{False negative rate} = \frac{FN}{TP + FN}$
Cutoff Point
 Used to screen for a quantitative risk factor
– Plasma glucose levels: diabetes
– Body mass index: obesity
– Blood pressure: hypertension
 A perfect separation between groups is difficult
– Distribution of the test results will overlap
Cutoff Point
 Lowering the cutoff point for the screening test (when values above the
cutoff count as positive) will
– Increase true positives
– Increase sensitivity
– Decrease true negatives
– Decrease specificity
 Raising the cutoff point for the screening test will
– Decrease true positives
– Decrease sensitivity
– Increase true negatives
– Increase specificity
Cutoff Point
There is a trade-off between sensitivity and specificity!
Whether it is worse to fail to detect some true cases (lower sensitivity) or to misclassify some
healthy people as diseased (lower specificity) depends on:
 the prevalence of the disease
 the severity of the disease
 the potential fatality of the disease
 how good the test is
 the acceptability of the test to people
Receiver Operating Characteristic (ROC) curve
 The true positive rate (sensitivity) is plotted as a function of the false
positive rate (100% - specificity) for different cut-off points
 Each point on the ROC curve represents a sensitivity/specificity pair
corresponding to a particular decision threshold
 A test with perfect discrimination (no overlap in the two
distributions) has a ROC curve that passes through the upper left
corner (100% sensitivity, 100% specificity).
 Therefore the closer the ROC curve is to the upper left corner, the
higher the overall accuracy of the test (Zweig & Campbell, 1993).
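The construction of a ROC curve from a set of cutoffs can be sketched as follows. The glucose values and disease labels are hypothetical, a subject is assumed to test positive when the value is at or above the cutoff, and the area under the curve is computed by the trapezoidal rule; none of the function names come from the lecture.

```python
def roc_points(scores, labels):
    """Return (false positive rate, sensitivity) pairs, one per cutoff.

    A subject tests positive when score >= cutoff; labels are 1 (diseased) or 0.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # cutoff above every score: nobody tests positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical fasting glucose values (mg/dl); 1 = diabetic, 0 = not diabetic
glucose = [85, 92, 98, 104, 110, 118, 121, 126, 135, 150]
disease = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

pts = roc_points(glucose, disease)
for fpr, sens in pts:
    print(f"1 - specificity = {fpr:.1f}, sensitivity = {sens:.1f}")
print("AUC =", round(auc(pts), 2))
```

Moving down the list of cutoffs, sensitivity and the false positive rate can only rise, which is the trade-off described on the cutoff slides. An area of 1 corresponds to perfect separation and 0.5 to a test no better than chance.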
Chapter Six
Clinical Trials
295
New Drug Development
Chemicals in test tubes → ? → New medicine widely used in humans
296
Introduction to Clinical Trials
Trial:
 Is from the Anglo-French trier
 Meaning to try broadly
 Refers to the action or process of putting something to a test
or proof
297
Introduction to clinical trial
Clinical:
 Is from clinic
 From the French clinique
 From the Greek klinike
 Refers to the practice of caring for the sick at the bedside
298
Definition of clinical trial
Narrowly defined:
 “the action or process of putting something to a test or proof at the
bedside of the sick.”
 Broadly defined:
“A clinical trial may be defined as a carefully designed,
prospective medical study which attempts to answer a
precisely defined set of questions with respect to the effects
of a particular treatment or treatments.”
299
Old Paradigm of Clinician
Unsystematic observations (experience) are the best
way of developing knowledge.
 Knowledge of pathophysiology coupled with common
sense may effectively guide clinical practice.
300
New Paradigm of clinicians
Evidence-Based Medicine
Clinical instincts are important, but they must be
guided and modified by the results of carefully
recorded unbiased observations.
While pathophysiologic mechanisms are important,
the response to therapy may occasionally be contrary
to expectations.
Clinical trials provide the most objective and unbiased
data to guide therapeutic decisions.
301
What Is Evidence-Based Medicine?
EBM is “the conscientious, explicit and judicious use
of current best evidence in making decisions about the
care of the individual patient. It means integrating
individual clinical expertise with the best available
external clinical evidence from systematic research.”
David Sackett
302
Assessing the Benefit of a New
Therapy
 Expert opinion
 Physiological concepts
 Clinical experience
 Retrospective studies
 Clinical trials
 The primacy of clinical trials can be traced to the rise of
evidence based medicine
303
Evidence Based Medicine
Case report: a demonstration only that some event of
clinical interest is possible.
Case series: a demonstration of certain possibly related
clinical events but subject to large selection biases.
Database analysis: treatment is not determined by
experimental design but by factors such as physician or
patient preference. The data are unlikely to have been
collected specifically to evaluate efficacy.
304
Evidence Based Medicine
Observational study: the investigator takes advantage
of “natural” exposures or treatment selection and
chooses a comparison group by design.
Controlled clinical trials: the treatment is assigned by
design. Endpoint ascertainment is actively performed
and analyses are planned in advance
305
New Drug Development
 Assessing the benefit of a new therapy
 Clinical experiences
 Physiological concepts
 Clinical trials: primacy for EBM
306
10,000 molecules → 1 new product
Cost: $1 billion
Time: 8-10 years (acute treatment); 12 years (chronic treatment)
307
New Drug Development
Pharmaceutical development
Preclinical
Clinical
 Phase I trials
 Phase II trials
 Phase III trials
Postmarketing surveillance
 Phase IV trials/study
308
Pharmaceutical development: discovery of compound, synthesis and
purification of drug substances, manufacturing procedures
Pre-clinical (animal) studies: pharmacological profile, acute toxicity
Investigational New Drug Application
Phase I clinical trials: small; focus on safety
Phase II clinical trials: medium size; focus on safety and
short-term efficacy
Phase III clinical trials: large and comparative; focus on
efficacy and cost benefits
New Drug Application
Phase IV clinical trials: “real world” experience; demonstrate
cost benefits; rare adverse reactions
309
Phase I Clinical Trials
Small number of subjects (20-100)
Focus on safety
Pharmacokinetics
Pharmacodynamics
Toxicity
For toxic drugs: maximum tolerated dose
310
Phase II Clinical Trials
Usually several hundred patients with the medical
condition
Strict inclusion/exclusion criteria
Focus on safety and short-term efficacy
Clarify dose and dose regimen
Basis for design of “pivotal” studies
311
Phase III Clinical Trials
“Pivotal” for NDA submission
Strict inclusion/exclusion criteria
Large (hundreds - thousands of subjects)
Comparative (two or more treatment groups)
Focus on efficacy and cost benefits
312
Phase IV Clinical Trials
Efficacy in routine clinical practice
Assess unusual adverse reactions
Demonstrate cost benefits
No inclusion/exclusion criteria
313
Trial aims
The main focus of different types of studies
Activity
– Main focus: biological effect of the drug on the target system
– Types of studies: preclinical studies and early clinical trials (phase I-II)
Efficacy
– Main focus: a sample of well-defined patients
– Types of studies: clinical trials (phase II-III)
Effectiveness
– Main focus: overall effect of the drug in a population at large
– Types of studies: late clinical trials (phase IV)
Efficiency
– Main focus: balance of costs and effects of the drug from a public health perspective
– Types of studies: pharmacoeconomic studies
314
Organization of a Clinical Trial
Planning the study
– Formulating the hypothesis
– Choosing the endpoint
– Choosing the design and sample size
Conduct of the study
– Patient accrual
– Data collection
Data analysis
Publication of results
315
Elements of a Design
Use of placebo/control group
“Blinding”
Randomization
Early stopping
Parallel groups vs. cross-over trials
Testing for superiority/equivalence
316
Use of Control Group
To obtain the information about mechanisms
not related to treatment.
Use of non-treated controls may be unethical and
problematic.
– Unethical if an effective treatment exists.
– Problematic if the knowledge of treatment can affect
evaluation of treatment effect.
 Use a placebo
317
“Blinding”
Concealing the treatment identity to prevent bias in
treatment outcome evaluation.
 Open-label trials: both patients and clinicians know the
assigned treatment.
 Single-blinded trials: the patient does not know the
treatment (but the clinician does).
 Double-blinded trials: neither the patient nor the
clinician knows the treatment.
318
Randomization
Random assignment of treatment for patients in a
clinical trial
Goal: elimination of the effect of unobserved factors
on response
319
How does randomization work?
Two treatments (A & B) assigned to patients with
probability ½ each.
 Consider sex:
– M males, on average ½ M get A and ½ M get B.
– F females, on average ½ F get A and ½ F get B.
– In each treatment group there will be ≈ ½ M / (½ M + ½ F) males.
– Randomization should balance the distribution of all
factors in the compared treatment groups.
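The balancing argument can be checked by simulation. The sketch below assumes simple (coin-flip) randomization with probability ½ per arm and a hypothetical population of 6000 males and 4000 females; the seed, the counts and the function names are illustrative choices.

```python
import random


def randomize(patients, seed=2020):
    """Assign each patient to arm A or B with probability 1/2 (simple randomization)."""
    rng = random.Random(seed)  # fixed seed so the illustration is reproducible
    arms = {"A": [], "B": []}
    for patient in patients:
        arms[rng.choice("AB")].append(patient)
    return arms


def proportion_male(group):
    """Share of males in a treatment group."""
    return sum(1 for sex in group if sex == "M") / len(group)


# Hypothetical trial population: M = 6000 males, F = 4000 females
patients = ["M"] * 6000 + ["F"] * 4000
arms = randomize(patients)
for name, group in arms.items():
    print(name, len(group), round(proportion_male(group), 3))
```

Both arms should show a proportion of males close to M / (M + F) = 0.6. The same reasoning applies to factors that were never measured, which is the main attraction of randomization; in small trials the balance is only approximate.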
320
Trial Participants Are a Selected Group
Population of patients as defined by eligibility criteria (population P)
→ Patients recruited (sample A)
→ Formal entry into trial (patients agreeing to participate: sample B).
Eligible patients who, for some reason, were not entered into the trial
(often refused consent) are left out at this step.
→ Randomization
→ Treatment group 1 and Treatment group 2
→ Compare outcomes
321
Randomization
 Eliminates all sources of bias except accidental bias
 Tends to ensure balance among treatments with
respect to known and unknown prognostic factors
 Guarantees the distributional assumptions of the test
statistics and estimators
322
Bias in Randomization
Selection bias
 Occurs if the allocation process is predictable. If any bias
exists as to what treatment particular types of participants
should receive, then a selection bias might occur.
Accidental bias
 Can arise if the randomization procedure does not achieve
balance on risk factors or prognostic covariates especially in
small studies.
323
Sample Size Determination
 H0 and HA: how small a treatment difference is it
important to detect, and with what degree of certainty? (δ,
α and β.)
 Parameters used in the calculation are estimates with
uncertainty and are often based on very small prior studies
– Population may be different
– Publication bias: overly optimistic
– Different inclusion and exclusion criteria
324
Sample Size Determination
Quantities used in calculation:
 Variances
 mean values
 response rates
 difference to be detected
Overestimated size:
 unfeasible
 early termination
Underestimated size
 justify an increase
 extension in follow-up
 incorrect conclusion (WORSE)
325
What is α (Type I error)?
 The probability of erroneously rejecting the null hypothesis
 (Putting a useless medicine on the market!)
What is β (Type II error)?
 The probability of erroneously failing to reject the null
hypothesis.
 (Keeping a good medicine away from patients!)
326
What is Power ?
Power quantifies the ability of the study to find true
differences of various values of δ.
Power = 1 - β = P(reject H0 | HA is true)
the chance of correctly identifying HA (correctly identifying
a better medicine)
What is δ?
δ is the minimum difference between groups that is
judged to be clinically important
327
Sample Size Calculation
H0: δ = µt - µc = 0
HA: δ = µt - µc ≠ 0
$N = \frac{2 (Z_{\alpha/2} + Z_{\beta})^2 \sigma^2}{\delta^2}$ (patients per group)
Multiply the above number by 2 to get the
total number of patients in the trial
328
Example
An investigator wishes to estimate the sample size
necessary to detect a 10 mg/dl difference in
cholesterol level in a diet intervention group
compared to the control group. The standard deviation from
other data is estimated to be 50 mg/dl. For a two-sided
5% significance level, Zα/2 = 1.96, and for 90%
power, Zβ = 1.282.
$2N = 4 (1.96 + 1.282)^2 (50)^2 / 10^2 \approx 1051$
329
Interpretation of Sample Size
A sample size of about 526 in each group will have 90%
power to detect a difference in means of 10.0 mg/dl,
assuming that the common standard deviation is 50.0 mg/dl, using a
two-group t-test with a 0.05 two-sided significance
level.
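The cholesterol example can be reproduced with the per-group formula N = 2(Zα/2 + Zβ)²σ²/δ². This is a sketch assuming a normal approximation with σ taken as the common standard deviation; it uses exact normal quantiles from the Python standard library rather than the rounded 1.96 and 1.282, and the function name `n_per_group` is illustrative.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.90):
    """Patients per group to detect a mean difference delta (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 1.282 for 90% power
    return 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2


n = n_per_group(delta=10, sigma=50)
print("per group:", math.ceil(n), " total:", 2 * math.ceil(n))

# Halving the detectable difference roughly quadruples the required size
print("delta = 5:", math.ceil(n_per_group(delta=5, sigma=50)))
```

Rounding up rather than to the nearest integer keeps the achieved power at or above the target. Because N depends on σ²/δ², small errors in the assumed standard deviation or in the clinically important difference change the required size considerably.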
330
Chapter Seven
Research Ethics
331
Introduction
 “An ‘ethic’ is a moral principle or a code of conduct which …
governs what people do.
 It is concerned with the way people act or behave.
 The term ‘ethics’ usually refers to the moral principles, guiding
conduct, which are held by a group or even a profession (though
there is no logical reason why individuals should not have their
own ethical code)” (Wellington, 2000: 54)
 “Ethical concerns should be at the forefront of any research project
and should continue through to the write-up and dissemination
stages” (Wellington, 2000: 3)
332
Ethical Principles
 Guides to moral behavior
 Good:
honesty, keeping promises, helping others, respecting the
rights of others
 Bad: lying, stealing, deceiving, harming others
 Universality of ethical principles: should apply in the same manner
in all countries, cultures, communities
 Relativity of ethical principles:
vary from country to country,
community to community
333
Ethical Principles
 Meanings given to ethics are relative to time, place, circumstance,
and the person involved
 Research ethics aims to protect the rights and welfare of research
participants and to protect the wider society or community within
which the research is being conducted
334
Protection of Human Research Project
 Research: means “a systematic investigation, including
research development, testing, or evaluation, designed to
develop or contribute to generalizable knowledge”
 Human subject: means “a living individual about whom an
investigator … conducting research obtains
 data through intervention or interaction with the individual, or
 identifiable private information”
335
Protection of Human Research Project
 Protection of human subjects is based upon the principles
 Respect for human dignity
 Respect for free and informed consent
 Respect for vulnerable persons
 Respect for privacy and confidentiality
 Respect for justice and inclusiveness
 Balancing harms and benefits: minimizing harm and maximizing
benefit
336
Protection of Human Research Project
Respect for persons
 Every person has the right to determine what shall happen to
him or her – participation must be voluntary
 Protect the multiple and interdependent interests of the person
(bodily, psychological, cultural integrity)
 Special consideration and protection is extended to “vulnerable”
subjects such as children, persons with cognitive disabilities,
prisoners, and institutionalized persons
337
Protection of Human Research Project
 Beneficence
 No person shall be placed at risk unless the risks are reasonable
in relation to the anticipated benefits
 Justice
 Risks and benefits should be justly distributed – who ought to
receive the benefits of research and who should bear its
burdens?
338
Informed Consent
 What is consent?
 Defined as permission, approval, or assent
 What is informed consent?
 Consent given by the patient based on knowledge of the procedure to be
performed, including its risks and benefits, as well as alternatives to the
proposed treatment/action.
 Presumption that individuals have capacity and right to make free
and informed decisions
 In research = dialogue, process, rights, duties, requirements for free
and informed consent by the research subject
339
Informed Consent
 Participants should have access to information about the aims and
objectives of any research in which they are involved, including
sources of help, advice, support and treatment if they experience
any ill effects of participation
 The research cannot proceed without consent
 Informed consent must be maintained throughout
340
Informed Consent
Informed Consent allows individuals:
 To determine whether participating in research fits with their
values and interests.
 To decide whether to contribute to this specific research project.
 To protect themselves from risks.
 To decide whether they can fulfill the requirements necessary
for the research.
341
Informed Consent
 Informed consent should involve the provision or collection of
information on:
 Name and contact details of researcher
 Name and contact details of participant
 Aims and objectives of the research project
 Role of the participant in the research project
 Treatment of material/information collected
 Potential risks to the participant
 Sources of advice/help/support/treatment
 Voluntary participation and freedom to withdraw
342
Research Integrity
 “Integrity" means "firm adherence to a code, especially moral or
artistic values; incorruptibility.”
 Research integrity includes:
 the use of honest and verifiable methods in proposing,
performing, and evaluating research
 reporting research results with particular attention to adherence
to rules, regulations, guidelines, and
 following commonly accepted professional codes or norms.
343
Research Integrity
 “Research misconduct means fabrication, falsification, or
plagiarism in proposing, performing, or reviewing research, or in
reporting research results.”
 In research misconduct there must be a significant departure
from accepted practices of the relevant research community.
 The misconduct must be committed intentionally, knowingly, or
recklessly
344
Research Integrity
 Shared values in scientific research are:
 Honesty:
 convey information truthfully and honoring commitments
 Accuracy:
 report findings precisely and take care to avoid errors
 Efficiency:
 use resources wisely and avoid waste
 Objectivity:
 let the facts speak for themselves and avoid improper bias
345
Authorship Policies
 Authorship of a research publication is an acknowledgement
of the substantial contribution made by a researcher.
 It carries with it both recognition of work done and
responsibility for the material contributed.
 Authorship must therefore be attributed with due regard for
the appropriate conventions.
 All persons designated as authors should qualify for
authorship, and all those who qualify should be listed.
346
Authorship Policies
 Authorship credit for original, research-based works (in any
medium) may be based on:
1. Substantial contributions to conception and design, or
acquisition of data, or analysis and interpretation of data;
2. Drafting the article or revising it critically for important
intellectual content;
3. Sufficient participation in the work to take public
responsibility for appropriate portions of the content; and
4. Final approval of the version to be published.
347
Authorship Policies
 Authors should meet conditions 1, 2, 3, and 4.
 Other contributions such as provision of a key reagent, or
collection of data may also be considered as long as conditions
2, 3 and 4 are met.
Authorship credit for reviews or commentaries not based
in original research should be based on conditions 2, 3
and 4.
348
Authorship Policies
 Acquisition of funding, collection of data (for example, from a fee-
for-service core facility), or general supervision of the research
group (e.g. by former or current mentors not directly involved in
the conception or execution of the publication), alone, does not
justify authorship.
 Financial and material support should be disclosed.
 All contributors who do not meet the criteria for authorship should
be listed in an acknowledgments section.
349
Authorship Policies
 “Ghost-writing,” a practice whereby a commercial entity or its
contractor writes an article or manuscript and a scientist is listed as
an author, is not permissible.
 Making minor revisions to an article or manuscript that is
ghost-written does not justify authorship.
 All contributors who do not meet the criteria for authorship should
be listed in an acknowledgments section.
350
Data and Safety Monitoring
 All clinical investigations, including physiologic, toxicity, and
dose-finding studies (phase I); efficacy studies (phase II); efficacy,
effectiveness and comparative trials (phase III), involving greater
than minimal risk to participants (i.e., full Committee review) are,
at a minimum, required to develop a data and safety monitoring
plan to assure the safety and welfare of the research subjects.
 The method and degree of monitoring needed is related to the
degree of risk involved.
351
Data and Safety Monitoring
 A Data and Safety Monitoring Board (DSMB) is usually required
to determine safe and effective conduct and to recommend
conclusion of the trial when significant benefits or risks have
developed or the study is unlikely to be concluded successfully.
352
Data and Safety Monitoring
 Risk associated with participation in research must be
minimized to the extent practical.
 Monitoring should be commensurate with size and complexity
of the study.
 Monitoring may be conducted in various ways or by various
individuals or groups, depending on the size and scope of the
research effort.
353
Data and Safety Monitoring
 Purposes of the DSMB
– Identify high rates of ineligibility determined after randomization
– Identify protocol violations that suggest clarifications of, or changes to, the protocol
are needed
– Identify unexpectedly high drop-out rates that threaten the trial’s ability to
produce credible results
– Ensure validity of study results
354
THANK YOU
355