HandyRef

advertisement
Binary predictor
Binary outcome
HANDY REFERENCE SHEET – HRP/STATS 261, Discrete Data
2x2 Contingency Tables
Exposed (E)
Unexposed (~E)
Disease (D)
a
c
No Disease (~D)
b
d
Measures of Association
Risk Ratio =
a /(a  b)
c /(c  d )
95% CI: RR * exp
Odds Ratio =

1 a /(a b ) 1c /(c  d ) 

 1.96

a
c


, RR * exp

1 a /( a b ) 1c /(c  d ) 

 1.96

a
c


ad
bc
95% CI: OR * exp

1 1 1 1
   
 1.96
a b c d 

, OR * exp

1 1 1 1
   
 1.96
a b c d 

Difference in Proportions:
H0: p D / E  p D / ~ E  0 [for case-control study: p e / d  p e / ~ d  0 ]
Test Statistic:
pˆ d / e  pˆ d / ~ e
~Z




( p d / e )(1  p d / e ) ( p d / ~ e )(1  p d / ~ e )

ne
n~e
95% CI: ( pˆ d / e  pˆ d / ~ e )  1.96 *
( pˆ d / e )(1  pˆ d / e ) ( pˆ d / ~ e )(1  pˆ d / ~ e )

ne
n~e
SAS CODE:
proc freq data=yourdata;
tables yourExposure*yourOutcome /measures cl;
weight counts; *if you have grouped data;
run;
i
Binary predictor
Binary outcome
Categorical covariate
2x2xK Contingency Tables
Notation: outcome=d; predictor=e; categorical covariate=k
Steps
1. Calculate crude ORd-e (or RRd-e)
2. Calculate stratum-specific OR’s: ORd-e/k=K
3. If crude OR and stratum-specific OR’s are all similar  STOP. k is unlikely to
be a confounder or an effect modifier, and you may use usual methods for 2x2
table (see page i), ignoring k.
4. If crude OR and stratum-specific OR’s differproceed to (a) or (b) below:
a) If stratum-specific OR’s are similar to each other  suspect confounding
(i) Apply Cochran-Mantel-Haenszel test of conditional independence:
 H0: e and d are conditionally independent.
 Test Statistic:
k
Disease
Exposed
Unexposed
a
c
No
Disease
b
d
[
 (a
k
 E(a k ))] 2
i 1
~ 1
k
 Var(a
2
k)
E(a k ) 
( a  b) k ( a  c ) k
; where n k  total in stratum k
nk
Var(a k ) 
i 1
(a  b) k (a  c) k (b  d ) k (c  d ) k
nk2 (nk  1)
(ii) Calculate Mantel-Haenszel (MH) summary OR (adjusted for
confounding by k). If you rejected the null in (i), then MH OR should
be  1.0. If you did not reject null in (i), then MH OR should be 1.0.
k
ORd e / k 

ai d i
Ti

bi c i
Ti
i 1
k
i 1
For RR:
RRd e / k 

a i (c i  d i )
Ti

ci (a i  bi )
Ti
k
i 1
k
i 1
b) If stratum-specific OR’s differ from each other suspect interaction
(effect modification)
(i) Apply Breslow-Day test of homogeneity of the OR’s:
 H0: stratum-specific OR’s are equal (homogenous)
 Test Statistic: complex use computer software
 If you reject null effect modification is present. Report stratumspecific OR’s
 If you fail to reject null insufficient evidence of effect
modification; calculate Mantel-Haenszel summary OR, as above.
SAS CODE:
proc freq data=yourdata;
tables yourCovariate*yourExposure*yourOutcome /cmh;
weight counts; *if you have grouped data;
run;
ii
Categorical Variables that lack clear
predictor/outcome distinction
Multi-way Contingency Tables:
Notation: x, y, and z are categorical variables
Chi-square test of independence (RxC table):
H0: x, y, and z are independent
Test Statistic:
(Observed  Expected) 2
~  2 ( rows1)( columns1)
Expected
cells

Log-Linear Model
log( counts)     x x   y y   z z + interactions if appropriate
Odds ratio interpretation (if 2x2 table):
log( ORd e )  log(
ad
)  log a  log d  log b  log c
bc
ad
)  (   d   e   d *e )  ( )  (   d )  (   e )   d *e
bc
 OR  e  d *e
log(
SAS CODE:
Chi-square test
proc freq data=yourdata;
tables yourX*yourY*yourK / chisq;
weight counts; *if you have grouped data;
run;
Log-linear models
proc genmod data=yourdata;
model total = yourX yourY yourK YourInteractions /
dist=poisson link=log pred;
run;
iii
Unrestricted predictors and covariates
Binary outcome
Logistic Regression
Model:
ln(
Simplest form of the logistic likelihood (from 2x2):
p
)     1 x1   2 x 2  ....
1 p
e   e a
1
e c
1 d
l( )  (
) x(
) b x(
) x(
)
  e
  e

1

e
1

e
1 e
1 e
Tests of Model Fit:

Wald Test
Z

Exposed
Disease
a
No Disease
b
Unexposed
c
d
ˆ  0
asymptotic standard error ( ˆ )
Likelihood Ratio Test: where full model has n parameters and reduced model
has n-p
 2 ln
L(reduced)

L( full)
 2 ln( L(reduced))  [2 ln( L( full))] ~  2p
Interpretation of Estimated Coefficients in Odds and Probabilities:
OR interpretation: ORexposure= e
exp
OR interpretation in the presence of interaction (two binary predictors):
ORexposure/interacting factor present = e
ORexposure/interacting factor absent = e
 e   k   ek
e
e  e
Probability interpretation: P(D/E) =
1  e  e
SAS CODE:
proc logistic data = analysis;
class YourPredictor_c (ref= "YourBaseline");
* automatic dummy coding to get >1 OR against the reference group;
model YourOutcome_c (event = "Yes") = YourPredictor / lackfit;
*lackfit gives Hosmer and Lemeshow goodness of fit chi-square test;
output out = OutDataSet
p = Predicted;
*outputs predicted probabilities to new dataset;
run;
iv
Pair-Matched Data
Binary Predictor
Binary Outcome
Pair-Matched Data
Case
Matched-control
Exposed
Unexposed
a
b
c
d
Exposed
Unexposed
Odds Ratio
OR 
b
c
McNemar’s Test


H0: exposure and disease are independent
Test Statistic:
(b  c) 2
~  12
bc
Matched Data
Unrestricted Predictors and Covariates
Binary Outcome
1:M Matched Data
Conditional Logistic Regression:
The simplest likelihood (from 2x2):
e e b
1 c
l( )  (
) x(
)
1  e e
1  e e
OR interpretation: ORexposure= e
exp
SAS CODE:** SAS V9 ONLY
proc logistic data = yourData ;
model YourOutcpme_c (event= "Yes") = YourPredictor1
YourPredictor2 YourPredictor3...;
strata MatchingVariable1 MatchingVariable2 / info;
output out = OutDataset
p = Predicted;
run;
v
Download