732A35/732G28

Before, we considered qualitative predictors (dummy variables).
Now: qualitative responses.
Plan:

Regression with binary response

Simple logistic regression

Multiple logistic regression

Ordinal logistic regression

Nominal logistic regression




Binary response:
Y: Mortality = (Died, Alive). X1: tobacco exposure, X2: age
Binary response:
Y: Borrower = (Yes, No). X1: Income, X2: Age, X3: Marital status, …
Ordinal response:
Y: Number of falls (older people, per 6 months). X1: gender, X2: Strength index, X3: Balance index
Nominal response:
Y: Consumer choice (product A, product B, product C). X1: Age, X2: gender, X3: Salary, …
Straightforward idea:
 Simple regression model with binary response $Y_i \in \{0, 1\}$:
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
 Bernoulli response:
$$P(Y_i = 1) = \pi_i, \qquad P(Y_i = 0) = 1 - \pi_i, \qquad E\{Y_i\} = \pi_i$$
Problems:
1. Nonnormal errors:
◦ When $Y_i = 1$: $\varepsilon_i = 1 - \beta_0 - \beta_1 X_i$
◦ When $Y_i = 0$: $\varepsilon_i = -\beta_0 - \beta_1 X_i$
2. Nonconstant error variance:
$$\sigma^2\{\varepsilon_i\} = \sigma^2\{Y_i\} = \pi_i(1 - \pi_i) = (\beta_0 + \beta_1 X_i)(1 - \beta_0 - \beta_1 X_i)$$
3. Constraints on the response:
$$0 \le E\{Y\} = \pi \le 1$$
 A linear mean response function will not work; we need a response function whose values stay between 0 and 1 (plot the response).
 Model: $Y_i = E\{Y_i\} + \varepsilon_i$ with
$$E\{Y_i\} = \pi_i = \frac{\exp(\beta_0 + \beta_1 X_i)}{1 + \exp(\beta_0 + \beta_1 X_i)} = \frac{1}{1 + \exp(-\beta_0 - \beta_1 X_i)}$$
 Logit transformation:
$$F_L^{-1}(\pi_i) = \ln\!\left(\frac{\pi_i}{1 - \pi_i}\right)$$
 Odds:
$$\frac{\pi_i}{1 - \pi_i}$$
[Figure: typical logistic regression function, an S-shaped curve of $E\{Y\}$ rising from 0 to 1 as $X$ goes from $-10$ to $10$.]
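To make the mean response, odds, and logit concrete, here is a minimal Python sketch; the coefficient values b0 = -3 and b1 = 0.2 are made up for illustration only:

```python
import numpy as np

def logistic_mean(x, b0, b1):
    """Mean response E{Y} = pi = exp(b0 + b1*x) / (1 + exp(b0 + b1*x))."""
    eta = b0 + b1 * x                      # linear predictor (the logit)
    return np.exp(eta) / (1.0 + np.exp(eta))

b0, b1 = -3.0, 0.2                         # hypothetical coefficients, illustration only
x = np.linspace(-10, 10, 5)

pi = logistic_mean(x, b0, b1)
odds = pi / (1.0 - pi)                     # odds = pi / (1 - pi)
logit = np.log(odds)                       # equals b0 + b1*x (up to rounding)

print(np.column_stack([x, pi, odds, logit]))
```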
 Estimation is done by maximum likelihood (instead of least squares)
◦ The method maximizes a likelihood function L(data; β0, β1) and thereby finds estimates of the parameters
◦ L measures how suitable β0, β1 are for the given data
◦ In general, computer-intensive optimization methods are required (for logistic regression in particular)
 Predicting a new observation (see the sketch below):
If the predicted value $\hat{\pi}_h > 0.5$, predict 1; otherwise predict 0.
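A minimal sketch of the maximum likelihood fit and the 0.5 classification rule in Python, assuming the statsmodels package is available; the data arrays are hypothetical and only illustrate the workflow:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical data: predictor x and binary response y (illustration only)
x = np.array([2.0, 5.0, 7.0, 9.0, 12.0, 15.0, 20.0, 25.0, 28.0, 30.0])
y = np.array([0,   0,   0,   1,   0,    1,    1,    0,    1,    1])

X = sm.add_constant(x)            # adds the intercept column
fit = sm.Logit(y, X).fit()        # maximizes the likelihood numerically
print(fit.params)                 # b0, b1
print(fit.llf)                    # maximized log-likelihood

# Predict a new observation at x_h = 18: classify as 1 if pi_hat > 0.5
pi_hat = fit.predict([[1.0, 18.0]])[0]
print(pi_hat, int(pi_hat > 0.5))
```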
Interpretation of b1:
$$\hat{\pi}'(X_i) = b_0 + b_1 X_i = \ln\!\left(\frac{\hat{\pi}_i}{1 - \hat{\pi}_i}\right)$$
 Look at the change in the fitted logit when X increases by one unit:
$$\hat{\pi}'(X_i + 1) - \hat{\pi}'(X_i) = b_1 = \ln\!\left(\frac{\text{odds}_2}{\text{odds}_1}\right)$$
 b1 is the logarithm of the odds ratio:
$$e^{b_1} = \frac{\text{odds}_2}{\text{odds}_1} = OR$$
25 computer programmers with different experience have
performed a test. For each programmer we have recorded
whether he or she passed the test (Y = 1) or not (Y = 0).
Months of experience (X)   Task success (Y)
14                         0
29                         0
6                          0
25                         1
...                        ...
8                          1

[Figure: scatter plot of Success in task (Y) against Months of experience (X), X from 0 to 40.]
Binary Logistic Regression: Task success (Y) versus Months of experience (X)

Link Function: Logit

Response Information
Variable          Value  Count
Task success (Y)  1      11  (Event)
                  0      14
                  Total  25

Logistic Regression Table
                                                             Odds   95% CI
Predictor                 Coef      SE Coef    Z      P      Ratio  Lower  Upper
Constant                  -3.05970  1.25935    -2.43  0.015
Months of experience (X)  0.161486  0.0649801  2.49   0.013  1.18   1.03   1.33

Log-Likelihood = -12.712
Test that all slopes are zero: G = 8.872, DF = 1, P-Value = 0.003

Goodness-of-Fit Tests
Method           Chi-Square  DF  P
Pearson          19.6226     17  0.294
Deviance         19.8794     17  0.280
Hosmer-Lemeshow  5.9457      8   0.653
[Figure: fitted logistic curve, estimated probability of Success in task (Y) against Months of experience (X), X from 0 to 35.]
Our example:
 exp(0.1615) = 1.175
Interpretation: the odds that a programmer succeeds in the task increase by 17.5% for each extra month of experience.
 Predict the probability that a programmer with 12 months of experience succeeds in the task (see the sketch below).
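A minimal Python check of these two calculations, using the coefficients from the Minitab output above (b0 = -3.05970, b1 = 0.161486):

```python
import math

b0, b1 = -3.05970, 0.161486        # from the Minitab output above

# Odds ratio for one extra month of experience
OR = math.exp(b1)
print(OR)                          # about 1.175, i.e. +17.5% in the odds

# Predicted probability of success at X = 12 months
eta = b0 + b1 * 12
pi_hat = math.exp(eta) / (1 + math.exp(eta))
print(pi_hat)                      # roughly 0.25, below the 0.5 cutoff -> predict 0
```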
 Model: $Y_i = E\{Y_i\} + \varepsilon_i$ with
$$E\{Y_i\} = \pi_i = \frac{\exp(\beta_0 + \beta_1 X_{i1} + \ldots + \beta_{p-1} X_{i,p-1})}{1 + \exp(\beta_0 + \beta_1 X_{i1} + \ldots + \beta_{p-1} X_{i,p-1})} = \frac{1}{1 + \exp(-\mathbf{X}'\boldsymbol{\beta})}$$
 Logit transformation:
$$\mathbf{X}'\boldsymbol{\beta} = \ln\!\left(\frac{\pi_i}{1 - \pi_i}\right)$$
 Odds:
$$\frac{\pi_i}{1 - \pi_i}$$
[Figure: typical multiple logistic regression surface, an S-shaped surface of $E\{Y\}$ over two predictors each ranging from $-10$ to $10$.]
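In matrix form, the mean response for one observation is computed from the predictor row X and the coefficient vector β; a minimal Python sketch with made-up numbers (illustration only):

```python
import numpy as np

# Hypothetical coefficient vector beta = (b0, b1, b2) and one observation's
# predictor row X = (1, x1, x2); values are for illustration only.
beta = np.array([-2.5, 0.16, -0.01])
X = np.array([1.0, 12.0, 45.0])

eta = X @ beta                          # X'beta, the linear predictor (logit)
pi = 1.0 / (1.0 + np.exp(-eta))         # E{Y} = pi
odds = pi / (1.0 - pi)                  # equals exp(eta)

print(eta, pi, odds)
```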
 The variance-covariance matrix of b is estimated from the Hessian of the log-likelihood evaluated at b (the inverse of the observed information matrix).
 This is handled by the computer routine.
Wald Test: test about a single βk
$$H_0: \beta_k = 0 \qquad H_a: \beta_k \ne 0$$
Test:
 Step 1: compute $Z = \dfrac{b_k}{s\{b_k\}}$
 Step 2: plot the (standard normal) distribution, mark the points $z(1-\alpha/2)$ and the critical area.
 Step 3: determine where $Z$ falls and reject $H_0$ if it is in the critical area.
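A minimal Python sketch of the Wald test, using the coefficient and standard error for Months of experience from the simple-model output earlier (0.161486 and 0.0649801); scipy is assumed for the normal distribution:

```python
from scipy.stats import norm

b_k, s_bk = 0.161486, 0.0649801     # coefficient and SE from the output above

z = b_k / s_bk                      # Wald statistic
p_value = 2 * norm.sf(abs(z))       # two-sided p-value

alpha = 0.05
z_crit = norm.ppf(1 - alpha / 2)    # z(1 - alpha/2) = 1.96
print(z, p_value, abs(z) > z_crit)  # z about 2.49, p about 0.013 -> reject H0
```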
Likelihood Ratio Test
$$H_0: \beta_q = \beta_{q+1} = \ldots = \beta_{p-1} = 0$$
$$H_a: \text{not all } \beta \text{ in } H_0 \text{ are zero}$$
 Full model: $\pi_F = \dfrac{1}{1 + \exp(-\mathbf{X}'\boldsymbol{\beta}_F)}$
 Reduced model: $\pi_R = \dfrac{1}{1 + \exp(-\mathbf{X}'\boldsymbol{\beta}_R)}$
 Test statistic: $G^2 = -2 \ln\!\left(\dfrac{L(R)}{L(F)}\right)$
 Decision rule: if $G^2 > \chi^2(1-\alpha;\, p-q)$, conclude $H_a$.
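A small Python helper for this test, assuming scipy; the log-likelihood values in the example call are placeholders, not taken from any fit:

```python
from scipy.stats import chi2

def lr_test(llf_full, llf_reduced, df):
    """Likelihood ratio test: G2 = -2 ln(L(R)/L(F)) = 2 (lnL_F - lnL_R)."""
    G2 = 2.0 * (llf_full - llf_reduced)
    p_value = chi2.sf(G2, df)
    return G2, p_value

# Placeholder log-likelihoods, for illustration only
print(lr_test(llf_full=-12.0, llf_reduced=-15.0, df=2))   # G2 = 6.0, p about 0.05
```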
Months of experience   Task success   Age   Sex
14                     0              44    0
29                     0              45    0
6                      0              52    1
25                     1              48    1
18                     1              22    0
4                      0              19    0
18                     0              20    0
12                     0              42    1
22                     1              21    0
6                      0              19    1
30                     1              24    1
11                     0              48    0
30                     1              31    1
5                      0              18    1
20                     1              49    0
13                     0              22    1
9                      0              41    1
32                     1              58    0
24                     0              29    1
13                     1              29    0
19                     0              35    0
4                      0              54    0
28                     1              24    0
22                     1              57    0
8                      1              32    1
For the same 25 programmers we have also gathered information about their age and sex.
Binary Logistic Regression: Task success versus Months of ex, Age, Gender

Link Function: Logit

Response Information
Variable      Value  Count
Task success  1      11  (Event)
              0      14
              Total  25

Logistic Regression Table
                                                           Odds   95% CI
Predictor             Coef        SE Coef    Z      P      Ratio  Lower  Upper
Constant              -2.53002    2.09690    -1.21  0.228
Months of experience  0.163344    0.0661327  2.47   0.014  1.18   1.03   1.34
Age                   -0.0091584  0.0408756  -0.22  0.823  0.99   0.91   1.07
Gender                -0.358858   1.10029    -0.33  0.744  0.70   0.08   6.04

Log-Likelihood = -12.650
Test that all slopes are zero: G = 8.996, DF = 3, P-Value = 0.029

Goodness-of-Fit Tests
Method           Chi-Square  DF  P
Pearson          24.8743     21  0.253
Deviance         25.3007     21  0.234
Hosmer-Lemeshow  7.4477      8   0.489

 Test β1 = 0
 Compute a 95% interval for β1
 Test whether β2 = β3 = 0 (see the sketch below)
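One way to carry out these three tasks in Python, using only the numbers printed in the two Minitab outputs (scipy assumed); this is a sketch of the calculations, not a reproduction of Minitab's internals:

```python
from math import exp
from scipy.stats import norm, chi2

# From the Minitab outputs
b1, s_b1 = 0.163344, 0.0661327        # Months of experience (multiple model)
llf_full = -12.650                    # Months, Age, Gender
llf_reduced = -12.712                 # Months only (simple model)

# 1) Wald test of beta1 = 0
z = b1 / s_b1
print(z, 2 * norm.sf(abs(z)))         # about 2.47, p about 0.014

# 2) Approximate 95% confidence interval for beta1 (and for the odds ratio)
z_crit = norm.ppf(0.975)
lo, hi = b1 - z_crit * s_b1, b1 + z_crit * s_b1
print(lo, hi, exp(lo), exp(hi))       # odds-ratio CI about (1.03, 1.34)

# 3) Likelihood ratio test of beta2 = beta3 = 0 (drop Age and Gender)
G2 = 2 * (llf_full - llf_reduced)
print(G2, chi2.sf(G2, df=2))          # small G2, large p -> Age and Gender can be dropped
```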
Nominal Response
1. Code the C categories 1, …, C
2. Coding: C categories give C−1 response variables
$$Y_{ij} = \begin{cases} 1, & \text{case } i \text{ responds in category } j \\ 0, & \text{otherwise} \end{cases}$$
3. Model (one category is the reference; the others give C−1 sets of βs):
$$\pi_{ij} = \frac{\exp(\mathbf{X}_i'\boldsymbol{\beta}_j)}{1 + \sum_{k=1}^{C-1} \exp(\mathbf{X}_i'\boldsymbol{\beta}_k)}$$
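A minimal sketch of fitting such a model in Python, assuming statsmodels; the data here are randomly generated placeholders, used only to show the workflow:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

# Placeholder data: one predictor and a 3-category nominal response (coded 0, 1, 2)
x = rng.uniform(0, 10, size=200)
y = rng.integers(0, 3, size=200)

X = sm.add_constant(x)
fit = sm.MNLogit(y, X).fit()      # the first category is treated as the reference
print(fit.params)                 # C-1 = 2 sets of (b0, b1), one per non-reference category
print(fit.predict(X[:3]))         # estimated category probabilities for the first rows
```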
Ordinal response
1. Code the categories 1, …, C
2. The order is assumed meaningful, i.e. category 1 < category 2 < …
Model (one slope vector β without β0, and C−1 intercepts α1, …, αC−1):
$$P(Y_i \le j) = \frac{1}{1 + \exp(-\alpha_j - \mathbf{X}_i'\boldsymbol{\beta})}$$
Comments:
 Nominal logistic regression can be used for ordinal data, but ordinal regression should be more precise and easier to interpret.
 Estimation: maximum likelihood.
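A minimal sketch in Python using statsmodels' OrderedModel (available in recent statsmodels versions); the data are placeholders, and note that statsmodels' threshold parameterization is not identical to the α's printed by Minitab:

```python
import numpy as np
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(2)

# Placeholder data: one predictor and an ordinal response coded 1 < 2 < 3
x = rng.uniform(0, 80, size=200)
y = rng.integers(1, 4, size=200)

mod = OrderedModel(y, x[:, None], distr="logit")   # proportional-odds logit model
fit = mod.fit(method="bfgs", disp=False)
print(fit.summary())
print(fit.predict(x[:3, None]))                    # P(Y = 1), P(Y = 2), P(Y = 3) per row
```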
Suppose you are a field biologist and you believe that the adult population of salamanders in the Northeast has gotten smaller over the past few years.
You would like to determine whether any association exists between the length of time a hatched salamander survives and the level of water toxicity, as well as whether there is a regional effect.
Survival time is coded as 1 if < 10 days, 2 = 10 to 30 days, and 3 = 31 to 60 days.
[Figure: scatter plot of Survival (coded 1-3) against Toxic level (0 to 80).]
Nominal Logistic Regression: Survival versus ToxicLevel

Response Information
Variable  Value  Count
Survival  3      12  (Reference Event)
          2      46
          1      15
          Total  73

Logistic Regression Table
                                                    Odds   95% CI
Predictor       Coef      SE Coef    Z      P       Ratio  Lower  Upper
Logit 1: (2/3)
  Constant      -6.50972  2.77595    -2.35  0.019
  ToxicLevel    0.195767  0.0717625  2.73   0.006   1.22   1.06   1.40
Logit 2: (1/3)
  Constant      -10.9783  3.19125    -3.44  0.001
  ToxicLevel    0.268063  0.0788227  3.40   0.001   1.31   1.12   1.53

Log-Likelihood = -57.937
Test that all slopes are zero: G = 17.420, DF = 2, P-Value = 0.000

Goodness-of-Fit Tests
Method    Chi-Square  DF   P
Pearson   266.244     106  0.000
Deviance  87.239      106  0.908
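Using the fitted logits above (reference category 3), the category probabilities at a given toxic level can be computed directly from the nominal-model formula; a small Python sketch, with toxic level 30 chosen arbitrarily for illustration:

```python
from math import exp

toxic = 30.0                                   # arbitrary toxic level, illustration only

eta1 = -6.50972 + 0.195767 * toxic             # logit 1: ln(P(2)/P(3))
eta2 = -10.9783 + 0.268063 * toxic             # logit 2: ln(P(1)/P(3))

p3 = 1.0 / (1.0 + exp(eta1) + exp(eta2))       # reference category
p2 = exp(eta1) * p3
p1 = exp(eta2) * p3

print(p1, p2, p3, p1 + p2 + p3)                # probabilities sum to 1
```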
Ordinal Logistic Regression: Survival versus ToxicLevel

Link Function: Logit

Response Information
Variable  Value  Count
Survival  1      15
          2      46
          3      12
          Total  73

Logistic Regression Table
                                                Odds   95% CI
Predictor   Coef      SE Coef    Z      P       Ratio  Lower  Upper
Const(1)    -6.86978  1.61172    -4.26  0.000
Const(2)    -3.35691  1.40131    -2.40  0.017
ToxicLevel  0.119939  0.0337103  3.56   0.000   1.13   1.06   1.20

Log-Likelihood = -59.374
Test that all slopes are zero: G = 14.546, DF = 1, P-Value = 0.000

Goodness-of-Fit Tests
Method    Chi-Square  DF   P
Pearson   113.052     107  0.326
Deviance  90.113      107  0.880
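Following the model on the ordinal-regression slide, logit P(Y ≤ j) = α_j + X'β, the fitted constants and slope above give cumulative and category probabilities at any toxic level; a minimal Python sketch, again with toxic level 30 chosen arbitrarily (it is assumed here that Minitab's sign convention for the constants matches the slide's formula):

```python
from math import exp

def plogit(eta):
    """Inverse logit: 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + exp(-eta))

a1, a2, b = -6.86978, -3.35691, 0.119939       # Const(1), Const(2), ToxicLevel
toxic = 30.0                                   # arbitrary toxic level, illustration only

cum1 = plogit(a1 + b * toxic)                  # P(Y <= 1): survival < 10 days
cum2 = plogit(a2 + b * toxic)                  # P(Y <= 2): survival <= 30 days

p1, p2, p3 = cum1, cum2 - cum1, 1.0 - cum2     # category probabilities
print(p1, p2, p3)
```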

Ch14: 14.1, 14.2 but only logistic mean response function.
14.3 but not likelihood functions.
14.4 but not Fitting a model
14.5
14.11 but nothing about likelihoods.
14.12 but nothing about likelihoods
14.14 briefly.