Biostatistics course
Part 13
Effect measures in 2 x 2 tables
Dr. Sc. Nicolas Padilla Raygoza
Department of Nursing and Obstetrics
Division Health Sciences and Engineering
University of Guanajuato
Campus Celaya-Salvatierra
Biosketch
• Medical Doctor, Autonomous University of Guadalajara.
• Pediatrician, certified by the Mexican Council of Certification in Pediatrics.
• Postgraduate Diploma in Epidemiology, London School of Hygiene and Tropical Medicine, University of London.
• Master of Science with emphasis in Epidemiology, Atlantic International University.
• Doctorate in Science with emphasis in Epidemiology, Atlantic International University.
• Associate Professor B, Department of Nursing and Obstetrics, Division of Health Sciences and Engineering, University of Guanajuato, Campus Celaya-Salvatierra, Mexico.
padillawarm@gmail.com
Competencies
• The reader will obtain the Risk Ratio (RR) or Odds Ratio (OR) from a 2 x 2 table.
• The reader will calculate the 95% confidence interval for the RR or OR.
• The reader will identify potential confounders and/or interactions.
• The reader will apply the Mantel-Haenszel method to obtain the adjusted RR, adjusted OR and Chi-squared.
Introduction
• In part 12 of the course, we tested the association between two categorical variables.
• Now, we review the methods used to measure the association.
• We will work with binary variables, so we will use 2 x 2 tables.
Example
• A nurse in a poor area of Mexico was informed that many local children attending the nursery had respiratory infections.
• She designed a cohort study to investigate the problem.
• Over the following years, 1000 children were followed.
• The main research question was: Is attending nursery associated with respiratory infection?
Example

Attending    Respiratory infection
nursery      Yes             No              Total
             n      %        n      %
Yes          37    33.9      72    66.1       109
No           43     4.8     848    95.2       891
Total        80     8.0     920    92.0      1000
Risk Ratio (RR)
• In health research, the term "risk" is used for the proportion of a group that develops the outcome.
  - For example: the risk of infection among children attending day care was 33.9%.
• Thus, the risk ratio is the ratio of two proportions.
  - The risk of respiratory infection for those attending the nursery is: 37 / (37 + 72) = 37/109 = 0.339.
  - The risk of respiratory infection in children not attending day care is: 43 / (43 + 848) = 43/891 = 0.048.
  - The risk ratio (RR) is the ratio of these two risks.
  - Risk ratio = 0.339 / 0.048 = 7.06
Risk Ratio (RR)
• In general, the risk ratio can be obtained with the following formula, where a, b, c and d are the frequencies in the 2 x 2 table.

Exposure     Outcome Yes    Outcome No    Total
Yes          a              b             a+b
No           c              d             c+d
Total        a+c            b+d           N

$$RR = \frac{a/(a+b)}{c/(c+d)}$$
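As a minimal sketch in Python, the general formula can be applied to the nursery data. The function name `risk_ratio` is an illustrative choice, not part of the course material. Because the code does not round the two risks before dividing, it gives about 7.03 rather than the 7.06 obtained from the rounded risks 0.339 and 0.048.

```python
def risk_ratio(a, b, c, d):
    """Risk ratio from a 2 x 2 table.

    a, b: exposed with / without the outcome
    c, d: unexposed with / without the outcome
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


# Nursery attendance (exposure) and respiratory infection (outcome)
rr = risk_ratio(37, 72, 43, 848)
print(f"RR = {rr:.2f}")
```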
Odds Ratio (OR)
• The Odds Ratio (OR) is the ratio of the odds of the outcome among the exposed to the odds of the outcome among the non-exposed. The odds are the number with the outcome divided by the number without it.
  - The odds of infection among attendees of the nursery are: 37 / 72 = 0.514
  - The odds of infection among children not attending day care are: 43 / 848 = 0.051
  - The Odds Ratio is the ratio of these two odds: OR = 0.514 / 0.051 = 10.08
  - In general, the Odds Ratio is found with the following formula:

$$OR = \frac{a/b}{c/d} = \frac{ad}{bc}$$
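The same calculation can be sketched in Python. The function name `odds_ratio` is illustrative only. Without intermediate rounding the result is about 10.13, slightly above the 10.08 obtained from the rounded odds 0.514 and 0.051.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2 x 2 table: (a/b) / (c/d) = ad / bc."""
    return (a * d) / (b * c)


# Nursery attendance (exposure) and respiratory infection (outcome)
or_value = odds_ratio(37, 72, 43, 848)
print(f"OR = {or_value:.2f}")
```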
Confidence intervals
• In the analysis of data from children attending day care or not, we can use either the RR or the OR to measure the effect of attendance at the nursery.
• Each value is only an estimate, so it should be reported with a confidence interval.
  - An approximate 95% confidence interval for the RR is found using the following formula, where EF is the error factor:
  - Minimum value: RR / EF
  - Maximum value: RR x EF

$$EF = \exp\left(1.96\sqrt{\frac{1}{a} - \frac{1}{a+b} + \frac{1}{c} - \frac{1}{c+d}}\right)$$
Confidence intervals
• The CI for the data on children who attend day care or not is:

$$EF = \exp\left(1.96\sqrt{\frac{1}{37} - \frac{1}{109} + \frac{1}{43} - \frac{1}{891}}\right) = 1.48$$

  - RR = 7.06
  - Minimum value: 7.06 / 1.48 = 4.77
  - Maximum value: 7.06 x 1.48 = 10.45
• 95% CI = 4.77 to 10.45
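A short Python sketch of this interval follows. The name `rr_confidence_interval` and its `z` parameter are assumptions made for illustration. Since the code keeps full precision for the RR, it returns roughly 4.75 to 10.41 instead of the 4.77 to 10.45 computed from the rounded RR of 7.06.

```python
import math


def rr_confidence_interval(a, b, c, d, z=1.96):
    """Approximate CI for the risk ratio using the error factor (EF)."""
    rr = (a / (a + b)) / (c / (c + d))
    ef = math.exp(z * math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d)))
    return rr, rr / ef, rr * ef


rr, lower, upper = rr_confidence_interval(37, 72, 43, 848)
print(f"RR = {rr:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
```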
Confidence intervals
• An approximate 95% confidence interval for the OR is found using the following formula:
  - Minimum value: OR / EF
  - Maximum value: OR x EF

$$EF = \exp\left(1.96\sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}\right)$$
Confidence intervals
• The CI for the data on children who attend day care or not is:

$$EF = \exp\left(1.96\sqrt{\frac{1}{37} + \frac{1}{72} + \frac{1}{43} + \frac{1}{848}}\right) = 1.65$$

  - OR = 10.08
  - Minimum value: 10.08 / 1.65 = 6.11
  - Maximum value: 10.08 x 1.65 = 16.63
• 95% CI = 6.11 to 16.63
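The OR interval can be sketched the same way. The function name `or_confidence_interval` is hypothetical. With the unrounded OR of about 10.13 the limits come out near 6.14 and 16.73, close to the 6.11 to 16.63 computed from the rounded OR of 10.08.

```python
import math


def or_confidence_interval(a, b, c, d, z=1.96):
    """Approximate CI for the odds ratio using the error factor (EF)."""
    odds_ratio = (a * d) / (b * c)
    ef = math.exp(z * math.sqrt(1 / a + 1 / b + 1 / c + 1 / d))
    return odds_ratio, odds_ratio / ef, odds_ratio * ef


or_value, lower, upper = or_confidence_interval(37, 72, 43, 848)
print(f"OR = {or_value:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
```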
Which measure is best?
• Risk Ratios are calculated for cross-sectional and cohort studies.
  - The formula for the 95% confidence interval for the RR requires larger sample sizes than the one for the OR.
• ORs are calculated for case-control and cross-sectional studies.
  - In case-control studies it is not possible to calculate risks, and therefore the RR cannot be calculated.
• There is an advantage in using the OR: it is a consistent measure of effect, unlike the RR.
Example (Cont…)
• The Mexican children showed a strong association between exposure (attending nursery) and outcome (respiratory infection).
  - However, such an association may be confounded by other factor(s).
  - For example, although children who attend day care seem to have a 7 times higher risk of respiratory infection, the cause of the infection may be something else that is associated with attending day care.
  - In other words, attending the nursery may be a marker of another exposure that causes respiratory infection.
  - If this is true, we say that the association between respiratory infection and attendance at the nursery is confounded.
How to identify a potential confounder?
• To evaluate a potential confounder, we should consider three aspects:
  - The exposure
  - The outcome
  - The confounder
Example
• The nurse is interested in the association between day care attendance and the presence of respiratory infection, but is aware that children might be exposed to other factors that cause respiratory infection.
• For example, overcrowding at home is a risk factor for respiratory infection.
• It is therefore a potential confounder of the association between attendance at day care and respiratory infection.
Confounders
• For a variable to be a potential confounder, it should meet three conditions. It must:
  - be an independent risk factor for the outcome of interest;
  - be associated with the exposure of interest;
  - not be on the causal pathway between exposure and outcome.
Confounders
• How do we check these conditions in the study of Mexican children?
  - Condition 1 of confounding: risk factor for the outcome of interest.
  - Is there an association between overcrowding and respiratory infection (RI)?

Overcrowding    RI      RI
at home         Yes     No      Risk of RI
Yes             54      55      54/109 = 0.5
No              21      870     21/891 = 0.02

RR = 25; 95% CI = 15.72 to 39.75; X2 = 311.67; P << 0.05
Confounders
• How do we check these conditions in the study of Mexican children?
  - Condition 2 of confounding: association with the exposure.
  - Is there an association between overcrowding and attendance at the nursery?

Overcrowding    Attendance at    Attendance at
at home         nursery: Yes     nursery: No
Yes             43               66
No              35               856

X2 = 170.39; P << 0.05
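The X2 values quoted for conditions 1 and 2 can be reproduced, to rounding, with the Pearson Chi-squared statistic for a 2 x 2 table. The sketch below assumes no continuity correction, which is what matches the reported figures; the function name `chi_squared_2x2` is illustrative.

```python
def chi_squared_2x2(a, b, c, d):
    """Pearson Chi-squared for a 2 x 2 table (no continuity correction)."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Condition 1: overcrowding vs respiratory infection
print(f"X2 = {chi_squared_2x2(54, 55, 21, 870):.2f}")
# Condition 2: overcrowding vs attendance at the nursery
print(f"X2 = {chi_squared_2x2(43, 66, 35, 856):.2f}")
```

With 1 degree of freedom, any value above 3.84 corresponds to P < 0.05, so both associations are far beyond that threshold.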
Confounders
• How do we check these conditions in the study of Mexican children?
  - Condition 3 of confounding: is the potential confounder on the causal pathway?
  - In this example, it is unlikely that attendance at the nursery causes overcrowding at home, so overcrowding is not an intermediate step between exposure and outcome.
Do we have a confounder?
• In this study, overcrowding satisfies the three conditions necessary for a confounding variable:
  - It is an independent risk factor for the outcome of interest. Overcrowding is associated with respiratory infection.
  - It is associated with the exposure of interest. Overcrowding is associated with attendance at the nursery.
  - It is not on the causal pathway. Overcrowding is unlikely to be a consequence of attendance at the nursery.
Stratified tables
• Now we know that the data must be analyzed further to take into account the effect of overcrowding.
• To adjust for a confounding variable, we stratify the 2 x 2 table of interest.
• The unstratified table is called the raw (crude) table.
  - It can be divided into strata defined by the confounding variable.
  - The sample is divided into two groups; within each group the overcrowding status is the same.
  - The two groups are: overcrowding and without overcrowding.
Stratified tables
• We want to find out whether attendance at the nursery is associated with respiratory infection when comparing children within the same category of overcrowding.
• The raw table for the relationship between respiratory infection and attendance at the nursery:

Attendance   Respiratory infection
at nursery   Yes             No              Total
             n      %        n      %
Yes          37    33.9      72    66.1       109
No           43     4.8     848    95.2       891
Total        80     8.0     920    92.0      1000
Stratified tables
• The tables stratified by overcrowding are shown here:

Overcrowding
             Respiratory      Respiratory
Nursery      infection Yes    infection No    Total
Yes          61               14              75
No           5                21              26
Total        66               35              101
RR = 4.23; X2 = 32.88; p < 0.0001; 95% CI 1.91 to 9.37

Without overcrowding
             Respiratory      Respiratory
Nursery      infection Yes    infection No    Total
Yes          10               24              34
No           4                861             865
Total        14               885             899
RR = 63.6; X2 = 178.84; p < 0.0001; 95% CI 21.01 to 192.56
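As a rough check, the stratum-specific RRs and their intervals can be recomputed from the cell counts of each stratum (overcrowding: 61, 14, 5, 21; without overcrowding: 10, 24, 4, 861). This is only a sketch; `rr_with_ci` and the dictionary layout are hypothetical names.

```python
import math


def rr_with_ci(a, b, c, d, z=1.96):
    """Risk ratio and approximate CI (error factor method) for one stratum."""
    rr = (a / (a + b)) / (c / (c + d))
    ef = math.exp(z * math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d)))
    return rr, rr / ef, rr * ef


# (a, b, c, d) per stratum: nursery yes/no by respiratory infection yes/no
strata = {
    "Overcrowding": (61, 14, 5, 21),
    "Without overcrowding": (10, 24, 4, 861),
}

for name, cells in strata.items():
    rr, low, high = rr_with_ci(*cells)
    print(f"{name}: RR = {rr:.2f}, 95% CI {low:.2f} to {high:.2f}")
```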
Stratified tables
• Do you think that attendance at nursery is a risk factor for respiratory infection among children with overcrowding?
• Yes, children attending day care have about 4 times the risk of respiratory infection of those who do not attend nursery (RR = 4.23).
• The p value indicates a strong association between attendance at daycare and respiratory infection in the group with overcrowding.
Stratified tables
• Do you think that attendance at nursery is a risk factor for respiratory infection in the group without overcrowding?
• Yes, children attending day care have more than 63 times the risk of respiratory infection of those not attending the nursery (RR = 63.6).
• The p value indicates a strong association between attendance at daycare and respiratory infection in this group.
• Within each stratum, the association between attendance at day care and respiratory infection is now independent of overcrowding at home.
Comparison of results
• How do these results compare with those of the raw table?
• The raw table shows a strong relationship between attendance at day care and respiratory infection. The RR differs between the two stratified tables, but the association remains statistically significant in both.

                        RR      95% CI             X2        P-value
Raw                     7.06    4.77 to 10.45      111.88    <0.05
Overcrowding            4.23    1.91 to 9.37       32.88     <0.05
Without overcrowding    63.6    21.01 to 192.56    178.84    <0.05
Adjusted Risk Ratios
• The nurse does not want to show the data divided into strata; she prefers a single overall estimate of the effect of nursery attendance on respiratory infection, adjusted for overcrowding.
• This can be done by calculating the RR with the Mantel-Haenszel method.
• First, consider the 2 x 2 table in each stratum (the subscript e identifies the stratum):

Exposure     Disease Yes    Disease No    Total
Yes          a_e            b_e
No           c_e            d_e
Total                                     n_e
Risk Ratios from Mantel-Haenszel
• The adjusted (summary) RR can be obtained with:

$$RR_{MH} = \frac{\sum a_e (c_e + d_e)/n_e}{\sum c_e (a_e + b_e)/n_e}$$

• This gives a weighted average of the RRs estimated in each table, with more weight given to tables with a larger sample size.
Adjusted Risk Ratio

We calculate the overcrowding-adjusted RR with the Mantel-Haenszel formula:

Overcrowding
             Respiratory      Respiratory
Nursery      infection Yes    infection No    Total
Yes          61               14              75
No           5                21              26
Total        66               35              101

Non-overcrowding
             Respiratory      Respiratory
Nursery      infection Yes    infection No    Total
Yes          10               24              34
No           4                861             865
Total        14               885             899

$$RR_{MH} = \frac{61(5 + 21)/101 + 10(4 + 861)/899}{5(61 + 14)/101 + 4(10 + 24)/899} = \frac{15.70 + 9.62}{3.71 + 0.15} = \frac{25.32}{3.86} = 6.56$$
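The Mantel-Haenszel summation can be sketched in Python for any number of strata. The function name `mantel_haenszel_rr` and the tuple layout `(a, b, c, d)` are assumptions for illustration. Without intermediate rounding the result is about 6.55, in line with the 6.56 obtained from the rounded terms.

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel adjusted risk ratio.

    strata: iterable of (a, b, c, d) tuples, one 2 x 2 table per stratum.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * (c + d) / n
        denominator += c * (a + b) / n
    return numerator / denominator


strata = [
    (61, 14, 5, 21),    # overcrowding
    (10, 24, 4, 861),   # non-overcrowding
]
print(f"Adjusted RR = {mantel_haenszel_rr(strata):.2f}")
```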
Adjusted Odds Ratio
• The adjusted OR is calculated in a similar way to the adjusted RR.

$$OR_{MH} = \frac{\sum a_e d_e/n_e}{\sum b_e c_e/n_e}$$

Exposure     Disease Yes    Disease No    Total
Yes          a_e            b_e
No           c_e            d_e
Total                                     n_e
Adjusted Odds Ratio
• In a cross-sectional study on the use of quinfamide after amoebic dysentery, the number of carriers of Entamoeba histolytica was reported.

                  Non-carrier    Carrier    Total
Quinfamide        100            54         154
Non-quinfamide    15             72         87
Total             115            126        241
Adjusted Odds Ratio

We calculate the OR adjusted for area of residence with the Mantel-Haenszel formula:

Urban
Quinfamide    Non-carrier    Carrier    Total
Yes           35             39         74
No            10             51         61
Total         45             90         135

Rural
Quinfamide    Non-carrier    Carrier    Total
Yes           65             14         79
No            5              21         26
Total         70             35         105

$$OR_{MH} = \frac{(35 \times 51/135) + (65 \times 21/105)}{(39 \times 10/135) + (14 \times 5/105)} = \frac{13.2 + 13}{2.89 + 0.67} = \frac{26.2}{3.56} = 7.4$$
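A matching Python sketch for the adjusted OR follows; `mantel_haenszel_or` is an illustrative name and the strata are entered as `(a, b, c, d)` tuples in the order used in the formula. Computed without rounding, the urban and rural tables give 7.375, which rounds to the 7.4 shown.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel adjusted odds ratio.

    strata: iterable of (a, b, c, d) tuples, one 2 x 2 table per stratum.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


strata = [
    (35, 39, 10, 51),   # urban
    (65, 14, 5, 21),    # rural
]
print(f"Adjusted OR = {mantel_haenszel_or(strata):.2f}")
```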
Mantel-Haenszel X2
• The nurse now knows that the association between respiratory infection and nursery attendance remains after adjusting for the confounding variable, overcrowding.
• Now she wants to test the significance of this association with a Chi-squared test adjusted for the confounder.
• This can be done by calculating the Mantel-Haenszel X2 test.
• The null hypothesis is that there is no association between nursery attendance and respiratory infection.
H0: OR = 1.

$$X^2_{MH} = \frac{\left[\sum a_e - \sum E(a_e)\right]^2}{\sum V(a_e)}$$
Mantel-Haenszel X2
• We should go step by step, beginning with the 2 x 2 table of each stratum.

Exposure     Disease Yes    Disease No    Total
Yes          a_e            b_e
No           c_e            d_e
Total                                     n_e
Mantel-Haenszel X2
• The Mantel-Haenszel Chi-squared test is a kind of average of the individual Chi-squared tests of each table.
• To calculate the Mantel-Haenszel Chi-squared test, we need three values from each table:
  - a_e: the number of ill and exposed
  - E(a_e): the expected value of a_e
  - V(a_e): the variance (standard error squared) of a_e,
  where

$$E(a_e) = \frac{\text{row total} \times \text{column total}}{\text{grand total}} = \frac{(a_e + b_e)(a_e + c_e)}{n_e}$$

$$V(a_e) = \frac{(a_e + b_e)(c_e + d_e)(a_e + c_e)(b_e + d_e)}{n_e^2 (n_e - 1)}$$
Example
• Overcrowding table
  - a = 61
  - E(a) = 75 x 66 / 101 = 49.01
  - V(a) = (75 x 66 x 26 x 35) / (101² x (101 - 1)) = 4.42
• Non-overcrowding table
  - a = 10
  - E(a) = 34 x 14 / 899 = 0.53
  - V(a) = (34 x 14 x 865 x 885) / (899² x (899 - 1)) = 0.50

To obtain the Mantel-Haenszel Chi-squared test (Chi-squared adjusted for overcrowding), we add these values from the two strata, using the formula:

$$X^2_{MH} = \frac{\left[\sum a_e - \sum E(a_e)\right]^2}{\sum V(a_e)}$$
Example
• Summing the values from the two strata:

         Overcrowding    Non-overcrowding    Total
a        61              10                  71
E(a)     49.01           0.53                49.54
V(a)     4.42            0.50                4.92

$$X^2_{MH} = \frac{(71 - 49.54)^2}{4.92} = 93.60$$
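The step-by-step calculation can be sketched as a single Python function that accumulates the observed count, its expected value and its variance across strata. The name `mantel_haenszel_chi2` is hypothetical. Keeping full precision gives about 93.65, compared with the 93.60 obtained from the rounded sums.

```python
def mantel_haenszel_chi2(strata):
    """Mantel-Haenszel Chi-squared (1 degree of freedom).

    strata: iterable of (a, b, c, d) tuples, one 2 x 2 table per stratum.
    """
    sum_a = 0.0
    sum_expected = 0.0
    sum_variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        sum_a += a
        # Expected exposed cases: row total x column total / grand total
        sum_expected += (a + b) * (a + c) / n
        # Variance of a under the null hypothesis
        sum_variance += (a + b) * (c + d) * (a + c) * (b + d) / (n ** 2 * (n - 1))
    return (sum_a - sum_expected) ** 2 / sum_variance


strata = [
    (61, 14, 5, 21),    # overcrowding
    (10, 24, 4, 861),   # non-overcrowding
]
print(f"Mantel-Haenszel X2 = {mantel_haenszel_chi2(strata):.2f}")
```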
Confounding or no confounding
• How do we decide whether there is confounding?
• There are no statistical tests to demonstrate confounding.
• We calculate statistical tests and effect measures for the raw and stratified tables.
• Then we calculate the summary (adjusted) measures, compare them with the raw ones, and conclude whether there is confounding or not.
Confounding or no confounding
• If there is an important difference between the raw and adjusted estimates, we say that the association of interest is confounded by another factor.
• Consider the data on children attending nursery and respiratory infection.
• After adjusting for overcrowding, the RR decreases from 7.06 to 6.56.
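The size of this change can be expressed as a relative difference between the raw and adjusted RR. The sketch below is illustrative only: the course does not state a numeric cut-off for an "important" difference, so the 10% threshold used here is a commonly quoted rule of thumb and should be read as an assumption, as should the names `relative_change` and `THRESHOLD`.

```python
def relative_change(raw, adjusted):
    """Relative difference between raw and adjusted estimates."""
    return abs(raw - adjusted) / raw


# Assumed rule of thumb, not taken from the course material
THRESHOLD = 0.10

change = relative_change(7.06, 6.56)
print(f"Change after adjustment: {change:.1%}")
print("Exceeds assumed threshold:", change >= THRESHOLD)
```

Under that assumed threshold, a change of about 7% would suggest only modest confounding by overcrowding; the judgment remains a matter of interpretation rather than a formal test.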
Possible effects of confounding
• Generally there is more than one confounder.
• They can have different effects:
  - The association under study may be significant before adjusting for a confounder and not significant afterwards.
  - The association may remain significant after adjusting for a confounder, but with a less significant p-value.
  - The strata may show opposite results; in this case it is better to show the stratified results. This is interaction or effect modification.
  - A confounder can hide an existing relationship.