Logistic regression models
Example 1: do dogs look like their owners?

▶ Some people believe that dogs look like their owners. Is this true?

▶ To test this hypothesis, The New York Times conducted an online quiz. A group of dogs and their owners were photographed by Fred Conrad.

▶ For each dog, four possible owners are shown in the quiz. Choose the owner for each dog: http://www.nytimes.com/interactive/2015/02/16/sports/westminster-dog-show-quiz.html?_r=0
Example 2: breast cancer data set

▶ Consider the data set collected by Richardson et al. (2006) as an example. The study aims to find genes that are associated with sporadic basal-like cancers (BLC), a distinct class of human breast cancers.

▶ In this example, the response variable Yi is the type of breast cancer. For instance, we use Yi = 0 to represent the non-BLC type and Yi = 1 to represent the BLC type.

▶ The predictors in this example are the gene expression data or the SNP data. For example, we could consider the gene CSF2RA as one of the candidate genes.
Logistic regression models

▶ Consider Yi to be a Bernoulli distributed response. For example, Yi could be failure or success, or could indicate different treatment groups. Assume Yi ∼ Bernoulli(pi) and Yi is associated with the covariates Xi.

▶ Model the conditional expectation of Yi. Recall that, in linear models, we assume E(Yi | Xi) = XiT β, and in non-linear models, E(Yi | Xi) = f(Xi; β).

▶ In the logistic regression model, we assume that E(Yi | Xi) = pi depends on Xi. Namely, E(Yi | Xi) = p(Xi) with 0 ≤ p(Xi) ≤ 1.
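To make the setup concrete, here is a small simulation sketch (the design matrix, coefficients, sample size, and seed are all made up for illustration; assumes NumPy) that draws Bernoulli responses whose conditional mean p(Xi) follows the logit form discussed on the next slide:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design: n observations, intercept plus 2 covariates.
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([-0.5, 1.0, 2.0])   # assumed "true" coefficients

# p(X_i) = exp(X_i^T beta) / {1 + exp(X_i^T beta)} lies in (0, 1).
p = 1.0 / (1.0 + np.exp(-(X @ beta)))

# Y_i ~ Bernoulli(p_i): a 0/1 response with E(Y_i | X_i) = p(X_i).
Y = rng.binomial(1, p)
```

Note that, unlike a linear model, the mean p(Xi) is automatically constrained to (0, 1).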
Link functions

In general, we assume that E(Yi | Xi) = h(XiT β). Here h−1(·) is the link function, which links E(Yi | Xi) with a linear function of Xi. Three commonly used link functions: the logit link, the probit link, and the complementary log-log link.

▶ (Logit link) If p = h(z) = exp(z)/{1 + exp(z)}, then h−1(p) = log{p/(1 − p)}.

▶ (Probit link) If p = h(z) = Φ(z), where Φ(·) is the CDF of a standard normal, then h−1(p) = Φ−1(p).

▶ (Complementary log-log link) If p = h(z) = 1 − exp{−exp(z)}, then h−1(p) = log{−log(1 − p)}.
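The three links and their inverses can be written down directly; the following sketch (assuming NumPy and SciPy for the normal CDF Φ) verifies the h−1(h(z)) = z relationship numerically:

```python
import numpy as np
from scipy.stats import norm

def logit_inv(z):          # h(z) = exp(z) / {1 + exp(z)}
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):              # h^{-1}(p) = log{p / (1 - p)}
    return np.log(p / (1.0 - p))

def probit_inv(z):         # h(z) = Phi(z), standard normal CDF
    return norm.cdf(z)

def probit(p):             # h^{-1}(p) = Phi^{-1}(p)
    return norm.ppf(p)

def cloglog_inv(z):        # h(z) = 1 - exp{-exp(z)}
    return 1.0 - np.exp(-np.exp(z))

def cloglog(p):            # h^{-1}(p) = log{-log(1 - p)}
    return np.log(-np.log(1.0 - p))

# Each inverse link undoes its link: h^{-1}(h(z)) = z.
z = np.linspace(-3, 3, 7)
roundtrips = [np.allclose(hinv(h(z)), z)
              for h, hinv in [(logit_inv, logit),
                              (probit_inv, probit),
                              (cloglog_inv, cloglog)]]
```

All three map the real line onto (0, 1); they differ in how quickly the tails approach 0 and 1.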
A logistic regression model

▶ Response: Bernoulli distributed random variable Yi ∼ Bernoulli(pi), i = 1, · · · , n.

▶ Systematic component: ηi = Σ_{j=1}^{p} Xij βj.

▶ Link function: h(ηi) = pi.
Estimation of β

The estimate of β can be obtained by the maximum likelihood method. The likelihood function for β is

L(β) = Π_{i=1}^{n} pi^{Yi} (1 − pi)^{1−Yi}.

The log-likelihood function for β is

ℓ(β) = log L(β) = Σ_{i=1}^{n} Yi log{pi/(1 − pi)} + Σ_{i=1}^{n} log(1 − pi)
                = Σ_{i=1}^{n} Yi XiT β − Σ_{i=1}^{n} log{1 + exp(XiT β)}.
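The two forms of ℓ(β) above are algebraically identical, which is easy to check numerically; a minimal sketch (simulated data and coefficients are hypothetical, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 3))
beta = np.array([0.5, -1.0, 0.25])
p = 1.0 / (1.0 + np.exp(-(X @ beta)))
Y = rng.binomial(1, p)

# First form: sum Y_i log{p_i/(1 - p_i)} + sum log(1 - p_i)
ll1 = np.sum(Y * np.log(p / (1 - p))) + np.sum(np.log(1 - p))

# Second form: sum Y_i X_i^T beta - sum log{1 + exp(X_i^T beta)}
eta = X @ beta
ll2 = np.sum(Y * eta) - np.sum(np.logaddexp(0.0, eta))
```

The second form is the one used in practice: it avoids computing pi explicitly, and np.logaddexp evaluates log{1 + exp(η)} stably for large |η|.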
MLE of β

▶ The MLE of β is

β̂ = arg max_β ℓ(β),

where ℓ(β) is the log-likelihood function of β.

▶ We do not have a closed-form solution for β̂. But ℓ(β) is a concave function of β, which is relatively easy to optimize.
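Because ℓ(β) is concave, a generic optimizer applied to −ℓ(β) finds the MLE reliably; a sketch (simulated data, hypothetical coefficients, assuming NumPy and SciPy):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
n = 1000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([0.3, 1.0, -0.7])
Y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ beta_true))))

def neg_loglik(beta):
    # -l(beta) = -[sum Y_i X_i^T beta - sum log{1 + exp(X_i^T beta)}]
    eta = X @ beta
    return -(Y @ eta - np.sum(np.logaddexp(0.0, eta)))

res = minimize(neg_loglik, x0=np.zeros(3), method="BFGS")
beta_hat = res.x
```

With n = 1000 observations, beta_hat should land close to beta_true; the next slide derives the score and Hessian, which allow a faster Newton-type iteration instead of a generic optimizer.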
Score function and Hessian matrix

▶ The score function of β is

∂ℓ(β)/∂β = Σ_{i=1}^{n} Xi Yi − Σ_{i=1}^{n} Xi exp(XiT β)/{1 + exp(XiT β)} = Σ_{i=1}^{n} Xi (Yi − pi).

▶ The Hessian matrix of β is

∂²ℓ(β)/∂β∂βT = −Σ_{i=1}^{n} Xi XiT [exp(XiT β)/{1 + exp(XiT β)}] [1 − exp(XiT β)/{1 + exp(XiT β)}]
             = −XT V X,

where V = diag{p1(1 − p1), · · · , pn(1 − pn)} and X = (X1, · · · , Xn)T. Here pi = exp(XiT β)/{1 + exp(XiT β)}.
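With the score XT(Y − p) and Hessian −XT V X in hand, the MLE can be computed by Newton–Raphson; a minimal sketch (simulated data are hypothetical, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
Y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ np.array([0.2, 0.8, -0.5])))))

beta = np.zeros(X.shape[1])          # start from beta = 0
for _ in range(50):
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    score = X.T @ (Y - p)            # sum X_i (Y_i - p_i)
    if np.max(np.abs(score)) < 1e-8: # score = 0 at the MLE
        break
    V = p * (1.0 - p)                # diagonal entries of V
    hessian = -(X.T * V) @ X         # -X^T V X
    # Newton step: beta <- beta - H^{-1} score
    beta = beta - np.linalg.solve(hessian, score)
```

Each step solves a weighted least-squares system, which is why this iteration is also known as iteratively reweighted least squares (IRLS); it typically converges in a handful of steps.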
Extension to Binomial distributed data

Suppose we observe a Binomial distributed response Si ∼ Binomial(ni, pi), where ni is known. We would like to study the association between the response Si and some covariates Xi. A corresponding logistic regression model is

Si ∼ Binomial(ni, pi),
log{pi/(1 − pi)} = XiT β,

for i = 1, · · · , m.
Estimation of β

We can still apply the maximum likelihood method to estimate β. The likelihood function for β is

L(β) = Π_{i=1}^{m} (ni choose Si) pi^{Si} (1 − pi)^{ni−Si}.

The log-likelihood function for β is

ℓ(β) = log L(β) = C + Σ_{i=1}^{m} Si log{pi/(1 − pi)} + Σ_{i=1}^{m} ni log(1 − pi)
                = C + Σ_{i=1}^{m} Si XiT β − Σ_{i=1}^{m} ni log{1 + exp(XiT β)},

where C is a constant that does not depend on β.
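As in the Bernoulli case, the two expressions for ℓ(β) − C agree term by term; a quick numerical check (simulated ni, Si, and covariates are hypothetical, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(4)
m = 100
X = rng.normal(size=(m, 2))
beta = np.array([0.4, -0.9])
p = 1.0 / (1.0 + np.exp(-(X @ beta)))
n_i = rng.integers(1, 20, size=m)    # known trial counts n_i
S = rng.binomial(n_i, p)             # S_i ~ Binomial(n_i, p_i)

eta = X @ beta
# First form (dropping C): sum S_i log{p_i/(1-p_i)} + sum n_i log(1-p_i)
ll1 = np.sum(S * np.log(p / (1 - p))) + np.sum(n_i * np.log(1 - p))
# Second form (dropping C): sum S_i X_i^T beta - sum n_i log{1 + exp(X_i^T beta)}
ll2 = np.sum(S * eta) - np.sum(n_i * np.logaddexp(0.0, eta))
```

Since C does not involve β, it can be dropped entirely for the purpose of maximization.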
Score function and Hessian matrix

▶ The score function of β is

∂ℓ(β)/∂β = Σ_{i=1}^{m} Xi Si − Σ_{i=1}^{m} ni Xi exp(XiT β)/{1 + exp(XiT β)} = Σ_{i=1}^{m} Xi (Si − ni pi).

▶ The Hessian matrix of β is

∂²ℓ(β)/∂β∂βT = −Σ_{i=1}^{m} ni Xi XiT [exp(XiT β)/{1 + exp(XiT β)}] [1 − exp(XiT β)/{1 + exp(XiT β)}]
             = −XT V X,

where V = diag{n1 p1(1 − p1), · · · , nm pm(1 − pm)} and X = (X1, · · · , Xm)T. Here pi = exp(XiT β)/{1 + exp(XiT β)}.
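These expressions plug into the same Newton–Raphson iteration as before, with Yi − pi replaced by Si − ni pi and V rescaled by ni; a minimal sketch (simulated data are hypothetical, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(5)
m = 200
X = np.column_stack([np.ones(m), rng.normal(size=m)])
n_i = rng.integers(5, 30, size=m)    # known trial counts
p_true = 1.0 / (1.0 + np.exp(-(X @ np.array([0.1, 0.6]))))
S = rng.binomial(n_i, p_true)

beta = np.zeros(X.shape[1])
for _ in range(50):
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    score = X.T @ (S - n_i * p)      # sum X_i (S_i - n_i p_i)
    if np.max(np.abs(score)) < 1e-8:
        break
    V = n_i * p * (1.0 - p)          # diagonal of V: n_i p_i (1 - p_i)
    hessian = -(X.T * V) @ X         # -X^T V X
    beta = beta - np.linalg.solve(hessian, score)  # Newton step
```

The Bernoulli model is the special case ni = 1 for all i, which recovers the earlier formulas exactly.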