Over or under dispersion problem

Example 1: dogs and owners data set
- In the dogs and owners example, we had some concerns about the dependence among the measurements from each individual.
- Let $Y_{ij} = 1$ if the $j$-th quiz question was answered correctly by the $i$-th person, and $Y_{ij} = 0$ otherwise. In the data set we collected, $i = 1, \dots, 27$ and $j = 1, \dots, 12$.
- It is reasonable to assume that the $Y_{ij}$ ($j = 1, \dots, 12$) from the same person are dependent on one another.
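
Example 1 continued
- To model the dependence among the $Y_{ij}$ ($j = 1, \dots, 12$), we could assume that $Y_{ij} \mid \nu_i \sim \text{Bernoulli}(\nu_i)$, independently over $j$, where $\nu_i \sim \text{Beta}(\alpha_1, \alpha_2)$ is a random variable.
- Using the properties of the Beta distribution,
  $$E(\nu_i) = \frac{\alpha_1}{\alpha_1 + \alpha_2} \quad \text{and} \quad \mathrm{Var}(\nu_i) = \frac{\alpha_1}{\alpha_1 + \alpha_2} \cdot \frac{\alpha_2}{\alpha_1 + \alpha_2} \cdot \frac{1}{\alpha_1 + \alpha_2 + 1}.$$
- For convenience, define $p_i = \alpha_1/(\alpha_1 + \alpha_2)$ and $\phi = 1/(\alpha_1 + \alpha_2 + 1)$ as an additional parameter. Then
  $$E(\nu_i) = p_i \quad \text{and} \quad \mathrm{Var}(\nu_i) = \phi p_i (1 - p_i).$$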
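
The reparameterization can be checked numerically. The sketch below assumes that matching the two variance expressions gives $\phi = 1/(\alpha_1 + \alpha_2 + 1)$; the function names and the values of `alpha1` and `alpha2` are illustrative and are not taken from the quiz data.

```python
def beta_mean_and_dispersion(alpha1, alpha2):
    """Return (p, phi) for nu ~ Beta(alpha1, alpha2)."""
    total = alpha1 + alpha2
    p = alpha1 / total           # E(nu)
    phi = 1.0 / (total + 1.0)    # chosen so that Var(nu) = phi * p * (1 - p)
    return p, phi


def beta_variance(alpha1, alpha2):
    """Variance of Beta(alpha1, alpha2), written as the three-factor product."""
    total = alpha1 + alpha2
    return (alpha1 / total) * (alpha2 / total) / (total + 1.0)


# Illustrative values only; they are not estimates from the quiz data.
alpha1, alpha2 = 2.0, 3.0
p, phi = beta_mean_and_dispersion(alpha1, alpha2)
variance = beta_variance(alpha1, alpha2)
print(f"p = {p:.4f}, phi = {phi:.4f}")
print(f"Var(nu) = {variance:.6f}, phi * p * (1 - p) = {phi * p * (1 - p):.6f}")
```

Note that $\phi$ lies strictly between 0 and 1 for any finite positive $\alpha_1$ and $\alpha_2$, and approaches 0 as $\alpha_1 + \alpha_2$ grows.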

Example 1 continued
- Under the above model, with the $Y_{ij}$ conditionally independent given $\nu_i$, we know that for $j \neq j'$,
  $$\mathrm{Cov}(Y_{ij}, Y_{ij'}) = E(Y_{ij} Y_{ij'}) - E(Y_{ij}) E(Y_{ij'}) = E(\nu_i^2) - E^2(\nu_i) = \mathrm{Var}(\nu_i) = \phi p_i (1 - p_i).$$
- If $\phi = 0$, then $Y_{ij}$ and $Y_{ij'}$ are uncorrelated. This also implies that $\nu_i$ is degenerate at the constant $p_i$.
- If $\phi > 0$, then $Y_{ij}$ and $Y_{ij'}$ are positively correlated and hence dependent. This corresponds to the over dispersion case.
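
Example 1 continued
Let $S_i = Y_{i1} + \cdots + Y_{i n_i}$. Then we have
$$E(S_i) = E\{E(S_i \mid \nu_i)\} = \sum_{j=1}^{n_i} E(\nu_i) = n_i p_i,$$
and
$$\begin{aligned}
\mathrm{Var}(S_i) &= E\{\mathrm{Var}(S_i \mid \nu_i)\} + \mathrm{Var}\{E(S_i \mid \nu_i)\} \\
&= E\{n_i \nu_i (1 - \nu_i)\} + \mathrm{Var}(n_i \nu_i) \\
&= n_i \{p_i - \phi p_i (1 - p_i) - p_i^2\} + n_i^2 \phi p_i (1 - p_i) \\
&= n_i p_i (1 - p_i) \{1 + (n_i - 1)\phi\}.
\end{aligned}$$
- If $\phi = 0$, there is no over or under dispersion.
- If $\phi > 0$, there is over dispersion; if $\phi < 0$, there is under dispersion.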
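
The variance formula can be verified exactly for the Beta mixture. The sketch below assumes $S \mid \nu \sim \text{Binomial}(n, \nu)$ with $\nu \sim \text{Beta}(\alpha_1, \alpha_2)$, so that $S$ has a beta-binomial distribution, and takes $\phi = 1/(\alpha_1 + \alpha_2 + 1)$. The parameter values are illustrative, not fitted to the quiz data.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(k, n, alpha1, alpha2):
    """P(S = k) when S | nu ~ Binomial(n, nu) and nu ~ Beta(alpha1, alpha2)."""
    log_pmf = (math.log(math.comb(n, k))
               + log_beta(k + alpha1, n - k + alpha2)
               - log_beta(alpha1, alpha2))
    return math.exp(log_pmf)


# Illustrative values: 12 questions per person, as in the example.
n, alpha1, alpha2 = 12, 2.0, 3.0
p = alpha1 / (alpha1 + alpha2)
phi = 1.0 / (alpha1 + alpha2 + 1.0)

pmf = [beta_binomial_pmf(k, n, alpha1, alpha2) for k in range(n + 1)]
mean = sum(k * q for k, q in enumerate(pmf))
var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))

formula_var = n * p * (1 - p) * (1 + (n - 1) * phi)
binomial_var = n * p * (1 - p)
print(f"mean = {mean:.6f}, n * p = {n * p:.6f}")
print(f"exact variance = {var:.6f}, formula = {formula_var:.6f}")
print(f"inflation over binomial = {var / binomial_var:.6f}")
```

The inflation factor printed on the last line is $1 + (n - 1)\phi$, which grows with the number of questions per person.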

Over and under dispersion problems
In a common logistic regression model,
$$S_i \sim \text{Binomial}(n_i, p_i) \quad \text{and} \quad p_i = \frac{\exp(X_i^T \beta)}{1 + \exp(X_i^T \beta)}.$$
If one assumes that the model for $p_i$ is correctly specified but the observed variance of $S_i$ is larger or smaller than the expected variance $n_i p_i (1 - p_i)$, then we have the so-called over or under dispersion problem, respectively.

Detection of over or under dispersion problem
If the usual logistic regression model is correct, then the deviance $D$ approximately follows a chi-square distribution with $m - p$ degrees of freedom, where $m$ is the number of observations $S_i$ and $p$ is the number of regression parameters.
- If $D$ is substantially larger than $m - p = E(\chi^2_{m-p})$, it could be an indicator of the over dispersion problem.
- If $D$ is substantially smaller than $m - p = E(\chi^2_{m-p})$, it could be an indicator of the under dispersion problem.
- But a deviance $D$ far from $m - p$ could also be the result of: (1) under or over fitting; (2) a wrong link function; (3) the existence of outliers; (4) binary data or small $n_i$.

Possible reasons for dispersion
- Variation among success probabilities.
- Correlation among binary responses.
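
Over or under dispersion logistic regression model
Let $S_i$ be the number of successes among $n_i$ trials. An over or under dispersion logistic regression model assumes that
$$E(S_i) = n_i p_i \quad \text{and} \quad \mathrm{Var}(S_i) = \phi n_i p_i (1 - p_i).$$
Moreover,
$$p_i = \frac{\exp(X_i^T \beta)}{1 + \exp(X_i^T \beta)}.$$
Here $\phi$ is called the dispersion parameter. (In Example 1, the corresponding variance inflation factor is $1 + (n_i - 1)\phi$, with $\phi$ as defined there.)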
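
Quasi-likelihood
- Recall that, in a usual logistic regression model,
  $$S_i \sim \text{Binomial}(n_i, p_i) \quad \text{and} \quad p_i = \exp(X_i^T \beta)/\{1 + \exp(X_i^T \beta)\}.$$
- The log-likelihood for $\beta$ is
  $$\ell(\beta) = \sum_{i=1}^m S_i X_i^T \beta - \sum_{i=1}^m n_i \log\{1 + \exp(X_i^T \beta)\} + \text{Constant}.$$
- The score function for $\beta$ is
  $$\frac{\partial \ell(\beta)}{\partial \beta} = \sum_{i=1}^m X_i (S_i - n_i p_i) = \sum_{i=1}^m \frac{S_i - n_i p_i}{n_i p_i (1 - p_i)} \frac{\partial (n_i p_i)}{\partial \beta} = \frac{\partial}{\partial \beta} \sum_{i=1}^m \int_{S_i}^{\mu_i} \frac{S_i - \mu}{V(\mu)} \, d\mu,$$
  where $V(\mu) = \mu(n_i - \mu)/n_i$ and $\mu_i = n_i p_i$.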

Quasi-likelihood for over or under dispersion models
- Define the log quasi-likelihood for $\beta$ as
  $$Q(\beta) = \sum_{i=1}^m Q_i = \sum_{i=1}^m \int_{S_i}^{\mu_i} \frac{S_i - \mu}{\phi V(\mu)} \, d\mu.$$
- The maximum quasi-likelihood estimator of $\beta$ is
  $$\hat\beta = \arg\max_\beta Q(\beta).$$
- The above estimator $\hat\beta$ is the same as the MLE of $\beta$ in a usual logistic regression model, because $\phi$ enters the quasi-score function only through the constant factor $1/\phi$ and therefore does not change its root.
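
A minimal sketch of why $\phi$ does not affect $\hat\beta$: the quasi-score equations are solved below by Fisher scoring for a model with an intercept and one covariate, once with $\phi = 1$ (the usual logistic MLE) and once with another value. The function name, the restriction to two parameters, and the grouped data are all hypothetical choices made for illustration.

```python
import math


def fit_quasi_logistic(x, successes, trials, phi=1.0, iterations=25):
    """Fisher scoring for logit(p_i) = b0 + b1 * x_i with grouped counts."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        u0 = u1 = i00 = i01 = i11 = 0.0
        for xi, si, ni in zip(x, successes, trials):
            prob = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            resid = (si - ni * prob) / phi              # quasi-score term
            weight = ni * prob * (1.0 - prob) / phi     # entry of X^T V X / phi
            u0 += resid
            u1 += xi * resid
            i00 += weight
            i01 += weight * xi
            i11 += weight * xi * xi
        # Update beta by (X^T V X / phi)^{-1} times the quasi-score;
        # the factors of 1 / phi cancel in this 2 x 2 solve.
        det = i00 * i11 - i01 * i01
        b0 += (i11 * u0 - i01 * u1) / det
        b1 += (i00 * u1 - i01 * u0) / det
    return b0, b1


# Hypothetical grouped data: 5 covariate values with 12 trials each.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
trials = [12, 12, 12, 12, 12]
successes = [2, 4, 5, 9, 10]

beta_mle = fit_quasi_logistic(x, successes, trials, phi=1.0)
beta_quasi = fit_quasi_logistic(x, successes, trials, phi=3.0)
print("phi = 1:", beta_mle)
print("phi = 3:", beta_quasi)
```

Both calls return the same coefficients up to rounding, so $\phi$ only matters later, when standard errors and test statistics are computed.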

Estimation of dispersion parameter
- Define the Pearson $\chi^2$ statistic as
  $$\chi^2 = \sum_{i=1}^m \frac{(S_i - n_i \hat p_i)^2}{n_i \hat p_i (1 - \hat p_i)}.$$
- It can be shown that $E(\chi^2) \approx (m - p)\phi$.
- Then we can estimate $\phi$ by $\hat\phi = \chi^2/(m - p)$.
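
The estimator can be sketched in a few lines. The example below uses hypothetical counts and an intercept-only logistic model, for which the fitted probability is the pooled proportion $\sum_i S_i / \sum_i n_i$ and the number of regression parameters is 1. The function name `pearson_dispersion` is an illustrative choice.

```python
def pearson_dispersion(successes, trials, fitted_probs, num_params):
    """Return (Pearson chi-square, phi_hat = chi-square / (m - num_params))."""
    chi2 = 0.0
    for s, n, p_hat in zip(successes, trials, fitted_probs):
        chi2 += (s - n * p_hat) ** 2 / (n * p_hat * (1.0 - p_hat))
    m = len(successes)
    return chi2, chi2 / (m - num_params)


# Hypothetical data: 4 groups with 12 trials each.
successes = [2, 5, 7, 10]
trials = [12, 12, 12, 12]

# Intercept-only logistic model: every group gets the pooled proportion.
p_common = sum(successes) / sum(trials)
fitted_probs = [p_common] * len(successes)

chi2, phi_hat = pearson_dispersion(successes, trials, fitted_probs, num_params=1)
print(f"Pearson chi-square = {chi2:.4f}, phi_hat = {phi_hat:.4f}")
```

A value of `phi_hat` well above 1 points towards over dispersion, and a value well below 1 towards under dispersion, subject to the same caveats as the deviance check.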
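
Deviance
- The deviance for the over or under dispersion logistic regression model is defined as
  $$D = -2\phi Q = -2 \sum_{i=1}^m \int_{S_i}^{\mu_i} \frac{S_i - \mu}{V(\mu)} \, d\mu,$$
  evaluated at the fitted values $\mu_i = n_i \hat p_i$.
- This deviance is the same as that of the usual logistic regression model without a dispersion parameter, because the factor $\phi$ cancels.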
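
With $V(\mu) = \mu(n_i - \mu)/n_i$, a partial-fraction expansion of the integrand gives each term of the deviance in closed form as $2\{S_i \log(S_i/\mu_i) + (n_i - S_i) \log((n_i - S_i)/(n_i - \mu_i))\}$ for $0 < S_i < n_i$, which is the familiar binomial deviance. The sketch below compares this closed form with a numerical evaluation of the integral by Simpson's rule; the function names and the input values are illustrative.

```python
import math


def unit_deviance_closed_form(s, n, mu):
    """2 * [s log(s / mu) + (n - s) log((n - s) / (n - mu))], for 0 < s < n."""
    return 2.0 * (s * math.log(s / mu)
                  + (n - s) * math.log((n - s) / (n - mu)))


def unit_deviance_by_integration(s, n, mu, steps=2000):
    """-2 * integral from s to mu of (s - t) / V(t) dt, with V(t) = t (n - t) / n."""
    def integrand(t):
        return (s - t) * n / (t * (n - t))

    # Composite Simpson's rule; steps must be even.
    h = (mu - s) / steps
    total = integrand(s) + integrand(mu)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(s + k * h)
    return -2.0 * total * h / 3.0


# Illustrative values: 9 successes out of 12 with fitted mean 6.
s, n, mu = 9.0, 12.0, 6.0
closed = unit_deviance_closed_form(s, n, mu)
numeric = unit_deviance_by_integration(s, n, mu)
print(f"closed form = {closed:.8f}, numerical integral = {numeric:.8f}")
```

Neither function involves $\phi$, which is the point of the definition $D = -2\phi Q$.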
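
Wald type inference for β
For the over or under dispersion logistic regression model, approximately,
$$\hat\beta - \beta \sim N(0, \phi (X^T V X)^{-1}),$$
where $V = \mathrm{diag}(n_1 p_1 (1 - p_1), \dots, n_m p_m (1 - p_m))$ and $\phi$ is the dispersion parameter.
In performing the inference, we need to take the dispersion parameter into account: compared with the usual logistic regression model, the standard errors are multiplied by $\sqrt{\phi}$, which is estimated by $\sqrt{\hat\phi}$ in practice.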
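
As a rough illustration of the scaling, the sketch below computes standard errors from $\hat\phi (X^T V X)^{-1}$ for a model with an intercept and a single covariate, so that the matrix is 2 x 2 and can be inverted by hand. The function name, the fitted probabilities, and the value of `phi_hat` are hypothetical.

```python
import math


def wald_standard_errors(x, trials, fitted_probs, phi_hat):
    """Standard errors of (b0, b1) from phi_hat * (X^T V X)^{-1}."""
    i00 = i01 = i11 = 0.0
    for xi, ni, prob in zip(x, trials, fitted_probs):
        weight = ni * prob * (1.0 - prob)   # diagonal entry of V
        i00 += weight
        i01 += weight * xi
        i11 += weight * xi * xi
    det = i00 * i11 - i01 * i01
    # Diagonal of the inverse of [[i00, i01], [i01, i11]], scaled by phi_hat.
    var_b0 = phi_hat * i11 / det
    var_b1 = phi_hat * i00 / det
    return math.sqrt(var_b0), math.sqrt(var_b1)


# Hypothetical design and fitted probabilities (not from a real fit).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
trials = [12, 12, 12, 12, 12]
fitted_probs = [0.20, 0.35, 0.50, 0.65, 0.80]

se_binomial = wald_standard_errors(x, trials, fitted_probs, phi_hat=1.0)
se_dispersed = wald_standard_errors(x, trials, fitted_probs, phi_hat=4.0)
print("phi_hat = 1:", se_binomial)
print("phi_hat = 4:", se_dispersed)
```

With `phi_hat = 4.0` the standard errors are twice those of the ordinary binomial fit, so Wald intervals become twice as wide.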

Likelihood ratio type inference for β
Suppose we consider comparing the following two models

Model   Deviance   Covariates
1       $D_1$      $x_1, \dots, x_l$
2       $D_2$      $x_1, \dots, x_l, x_{l+1}, \dots, x_p$

for $p > l$. This corresponds to testing $H_0: \beta_{l+1} = \cdots = \beta_p = 0$ vs $H_1$: not $H_0$.
The test statistic for testing the above hypothesis is
$$F_n = \frac{(D_1 - D_2)/(p - l)}{\hat\phi},$$
which approximately follows an $F_{p-l, m-p}$ distribution under $H_0$.
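
The test statistic is simple arithmetic once the two deviances and $\hat\phi$ are available. The sketch below uses hypothetical deviances and a hypothetical dispersion estimate, and the function name is an illustrative choice.

```python
def quasi_f_statistic(deviance_small, deviance_large, p, l, phi_hat):
    """F_n = {(D1 - D2) / (p - l)} / phi_hat for nested models with l < p."""
    if p <= l:
        raise ValueError("the larger model must have more covariates (p > l)")
    return ((deviance_small - deviance_large) / (p - l)) / phi_hat


# Hypothetical inputs: Model 1 has l = 2 covariates, Model 2 has p = 4.
d1, d2 = 30.0, 18.0
l, p = 2, 4
phi_hat = 2.0  # e.g. Pearson chi-square of Model 2 divided by (m - p)

f_stat = quasi_f_statistic(d1, d2, p, l, phi_hat)
print(f"F statistic = {f_stat:.4f} on ({p - l}, m - {p}) degrees of freedom")

# If SciPy is installed, a p-value could be obtained with
# scipy.stats.f.sf(f_stat, p - l, m - p), where m is the number of observations.
```

Dividing by $\hat\phi$ is what distinguishes this from the usual chi-square comparison of $D_1 - D_2$, which would overstate significance when the data are over dispersed.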