Likelihood Ratio Test for multiparameter problems:

Statistics 512 Notes 19:
Example 2:
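Gamma distribution:
\[
f(x;\alpha,\beta)=
\begin{cases}
\dfrac{1}{\Gamma(\alpha)\beta^{\alpha}}\,x^{\alpha-1}e^{-x/\beta}, & 0<x<\infty\\[4pt]
0, & \text{elsewhere}
\end{cases}
\]
\[
\frac{\partial \log f(X;\alpha,\beta)}{\partial\alpha}
=-\frac{\Gamma'(\alpha)}{\Gamma(\alpha)}-\log\beta+\log X
\]
\[
\frac{\partial \log f(X;\alpha,\beta)}{\partial\beta}
=-\frac{\alpha}{\beta}+\frac{X}{\beta^{2}}
\]
\[
I(\alpha,\beta)=-E_{\alpha,\beta}
\begin{pmatrix}
-\dfrac{\Gamma''(\alpha)\Gamma(\alpha)-\Gamma'(\alpha)^{2}}{\Gamma(\alpha)^{2}} & -\dfrac{1}{\beta}\\[8pt]
-\dfrac{1}{\beta} & \dfrac{\alpha}{\beta^{2}}-\dfrac{2X}{\beta^{3}}
\end{pmatrix}
=
\begin{pmatrix}
\dfrac{\Gamma''(\alpha)\Gamma(\alpha)-\Gamma'(\alpha)^{2}}{\Gamma(\alpha)^{2}} & \dfrac{1}{\beta}\\[8pt]
\dfrac{1}{\beta} & \dfrac{\alpha}{\beta^{2}}
\end{pmatrix}
\]
since $E_{\alpha,\beta}(X)=\alpha\beta$.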
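For the Illinois rainfall data,
\[
\hat\alpha_{MLE}=.4408,\qquad \hat\beta_{MLE}=.5091
\]
Thus,
\[
I(\hat\alpha_{MLE},\hat\beta_{MLE})=
\begin{pmatrix}
\dfrac{\Gamma''(.4408)\Gamma(.4408)-\Gamma'(.4408)^{2}}{\Gamma(.4408)^{2}} & \dfrac{1}{.5091}\\[8pt]
\dfrac{1}{.5091} & \dfrac{.4408}{.5091^{2}}
\end{pmatrix}
=
\begin{pmatrix}
6.133 & 1.964\\
1.964 & 1.701
\end{pmatrix}
\]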


> infmat=matrix(c(6.133,1.964,1.964,1.704),ncol=2)
> invinfmat=solve(infmat)
> invinfmat
           [,1]       [,2]
[1,]  0.2584428 -0.2978765
[2,] -0.2978765  0.9301816
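Thus, with n = 227 observations,
\[
\begin{pmatrix}\hat\alpha_{MLE}-\alpha\\ \hat\beta_{MLE}-\beta\end{pmatrix}
\approx
N\left(
\begin{pmatrix}0\\0\end{pmatrix},
\begin{pmatrix}
\dfrac{0.259}{227} & \dfrac{-0.298}{227}\\[8pt]
\dfrac{-0.298}{227} & \dfrac{0.930}{227}
\end{pmatrix}
\right)
\]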
Thus, approximate 95% confidence intervals for $\alpha$ and $\beta$ are
\[
\alpha:\ 0.441\pm 1.96\sqrt{\frac{0.259}{227}}=(0.375,\ 0.507)
\]
\[
\beta:\ 0.509\pm 1.96\sqrt{\frac{0.930}{227}}=(0.384,\ 0.634)
\]
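The interval arithmetic can be reproduced in a few lines of R. This is a minimal sketch that takes the MLEs and n = 227 from the notes as given and recomputes the information matrix rather than using the rounded inverse, so the last digit may differ slightly from the hand calculation; the variable names are illustrative.

```r
# Values quoted in the notes (Illinois rainfall data); n = 227 is read off
# the variance terms 0.259/227 and 0.930/227.
n   <- 227
est <- c(alpha = 0.4408, beta = 0.5091)

# Expected information at the MLE and approximate standard errors
info <- matrix(c(trigamma(est[["alpha"]]), 1 / est[["beta"]],
                 1 / est[["beta"]],        est[["alpha"]] / est[["beta"]]^2),
               nrow = 2, byrow = TRUE)
se <- sqrt(diag(solve(info)) / n)

# Wald intervals: estimate +/- z * standard error
z <- qnorm(0.975)   # about 1.96
wald_ci <- cbind(lower = est - z * se, upper = est + z * se)
print(round(wald_ci, 3))
```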
Note: We can also use observed Fisher information or the
parametric bootstrap to form confidence intervals based on
maximum likelihood estimates.
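Observed Fisher information:
\[
\sqrt{n}\,O^{1/2}(\hat\theta_n-\theta)\xrightarrow{D}N_p(0,\ \text{IdentityMatrix})
\]
where the observed information matrix $O$ equals
\[
O_{ij}=-\frac{1}{n}\sum_{k=1}^{n}\frac{\partial^{2}\log f(X_k;\theta)}{\partial\theta_i\,\partial\theta_j}
\]
evaluated at $\theta=\hat\theta_{MLE}$. Hence
\[
\begin{pmatrix}\hat\alpha_{MLE}-\alpha\\ \hat\beta_{MLE}-\beta\end{pmatrix}
\approx
N\left(\begin{pmatrix}0\\0\end{pmatrix},\ \frac{O^{-1}(\hat\theta_{MLE})}{n}\right)
\]
For the gamma model and the Illinois rainfall data,
\[
O=
\begin{pmatrix}
\dfrac{\Gamma''(\hat\alpha_{MLE})\Gamma(\hat\alpha_{MLE})-\Gamma'(\hat\alpha_{MLE})^{2}}{\Gamma(\hat\alpha_{MLE})^{2}} & \dfrac{1}{\hat\beta_{MLE}}\\[8pt]
\dfrac{1}{\hat\beta_{MLE}} & -\dfrac{\hat\alpha_{MLE}}{\hat\beta_{MLE}^{2}}+\dfrac{2\sum_{i=1}^{n}X_i}{n\hat\beta_{MLE}^{3}}
\end{pmatrix}
=
\begin{pmatrix}
6.133 & 1.964\\
1.964 & 1.700
\end{pmatrix}
\]
O is very close to $I(\hat\alpha_{MLE},\hat\beta_{MLE})$.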
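The closeness of the two matrices can be illustrated numerically. The sketch below uses simulated gamma data as a stand-in, since the rainfall data are not reproduced here; the seed, the simulation parameters and the variable names are assumptions. For the gamma model the MLE satisfies $\hat\beta=\bar X/\hat\alpha$, so the lower-right entries coincide and any difference is rounding error.

```r
set.seed(512)                          # simulated stand-in for the rainfall data
x <- rgamma(227, shape = 0.44, scale = 0.51)
n <- length(x)

# Gamma MLE: solve the profile score equation for alpha, then beta = mean / alpha
score_alpha <- function(a) log(a) - digamma(a) - log(mean(x)) + mean(log(x))
alpha_hat <- uniroot(score_alpha, interval = c(0.01, 10), tol = 1e-10)$root
beta_hat  <- mean(x) / alpha_hat

# Observed information: minus the average second derivative at the MLE
obs_info <- matrix(c(trigamma(alpha_hat), 1 / beta_hat,
                     1 / beta_hat,
                     -alpha_hat / beta_hat^2 + 2 * sum(x) / (n * beta_hat^3)),
                   nrow = 2, byrow = TRUE)

# Expected information evaluated at the MLE
exp_info <- matrix(c(trigamma(alpha_hat), 1 / beta_hat,
                     1 / beta_hat,        alpha_hat / beta_hat^2),
                   nrow = 2, byrow = TRUE)

print(obs_info - exp_info)   # differences at rounding-error level
```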
Parametric bootstrap:
Resample from $f(x;\hat\theta_{MLE})$.
For each resampled data set $(X_1^*,\ldots,X_n^*)$, compute
$\hat\theta_{MLE}(X_1^*,\ldots,X_n^*)$. The percentile bootstrap approximate 95%
confidence interval for $\theta_j$ is (2.5% quantile of
$\hat\theta_{MLE,j}(X_1^*,\ldots,X_n^*)$, 97.5% quantile of
$\hat\theta_{MLE,j}(X_1^*,\ldots,X_n^*)$).
# MLE for gamma
# alphahatfunc is the profile score equation for alpha (beta is replaced
# by mean(xvec)/alpha); its root is the MLE of alpha.
alphahatfunc=function(alpha,xvec){
  n=length(xvec);
  eq=-n*digamma(alpha)-n*log(mean(xvec))+n*log(alpha)+sum(log(xvec));
  eq;
}
mlegammafunc=function(X,alphahatlow=.01,alphahathigh=10){
  # Need to make sure that alphahatfunc(alphahatlow)>0,
  # alphahatfunc(alphahathigh)<0
  tempoptim=uniroot(alphahatfunc,interval=c(alphahatlow,alphahathigh),xvec=X);
  mlealphahat=tempoptim$root;
  mlebetahat=mean(X)/mlealphahat;
  list(mlealphahat=mlealphahat,mlebetahat=mlebetahat);
}
# Bootstrap CI
bootcigammafunc=function(X,m,signiflevel){
  # X is a vector containing the original sample
  # m is the desired number of bootstrap replications
  n=length(X);
  mlevec=mlegammafunc(X);
  mlealphahat=mlevec$mlealphahat;
  mlebetahat=mlevec$mlebetahat;
  bootmlealphahatvec=rep(0,m);
  bootmlebetahatvec=rep(0,m);
  for(i in 1:m){
    bootX=rgamma(n,shape=mlealphahat,scale=mlebetahat);
    bootmle=mlegammafunc(bootX);
    bootmlealphahatvec[i]=bootmle$mlealphahat;
    bootmlebetahatvec[i]=bootmle$mlebetahat;
  }
  cutoff=floor((signiflevel/2)*(m+1));
  bootmlealphahatsorted=sort(bootmlealphahatvec);
  bootmlebetahatsorted=sort(bootmlebetahatvec);
  # Lower CI endpoints
  lowercialpha=bootmlealphahatsorted[cutoff];
  lowercibeta=bootmlebetahatsorted[cutoff];
  # Upper CI endpoints
  uppercialpha=bootmlealphahatsorted[m+1-cutoff];
  uppercibeta=bootmlebetahatsorted[m+1-cutoff];
  list(lowercialpha=lowercialpha,uppercialpha=uppercialpha,
       lowercibeta=lowercibeta,uppercibeta=uppercibeta);
}
> bootcigammafunc(illinoisrainfall,1000,.05)
$lowercialpha
[1] 0.3787617
$uppercialpha
[1] 0.516352
$lowercibeta
[1] 0.3914287
$uppercibeta
[1] 0.6354028
Likelihood Ratio Test for multiparameter problems:
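The hypotheses of interest are
\[
H_0:\theta\in\omega\quad\text{versus}\quad H_1:\theta\in\Omega\cap\omega^{C}
\]
where $\omega\subset\Omega$ is defined in terms of $q$, $0<q\le p$,
independent constraints of the form
$g_1(\theta)=a_1,\ldots,g_q(\theta)=a_q$.
Likelihood ratio:
\[
\Lambda=\frac{\max_{\theta\in\omega}L(\theta)}{\max_{\theta\in\Omega}L(\theta)}
\]
Reject for small values of $\Lambda$.
Theorem 6.5.1: Let $X_1,\ldots,X_n$ be iid with pdf
$f(x;\theta)$, $\theta=(\theta_1,\ldots,\theta_p)$, for $\theta\in\Omega$. Assume the regularity
conditions (R6-R9) hold. Under the null hypothesis
$H_0:\theta\in\omega$,
\[
-2\log\Lambda\xrightarrow{D}\chi^{2}(q)
\]
Thus, we reject $H_0:\theta\in\omega$ when $-2\log\Lambda>\chi^{2}_{\alpha}(q)$, where
$\chi^{2}_{\alpha}(q)$ is the upper $\alpha$ critical value of the $\chi^{2}(q)$ distribution.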
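The rejection rule amounts to a chi-square quantile lookup. A minimal sketch in base R follows; `lrt_decision` is a hypothetical helper, and the statistic passed to it below is an arbitrary illustrative number, not a value from any data set.

```r
# Hypothetical helper: compare an observed -2 log(Lambda) with the
# chi-square(q) reference distribution from Theorem 6.5.1.
lrt_decision <- function(stat, q, level = 0.05) {
  crit <- qchisq(1 - level, df = q)                 # upper-level critical value
  pval <- pchisq(stat, df = q, lower.tail = FALSE)  # approximate p-value
  list(statistic = stat, df = q, critical = crit,
       p.value = pval, reject = stat > crit)
}

# Illustrative input only: -2 log(Lambda) = 4.2 with q = 1 constraint
res <- lrt_decision(stat = 4.2, q = 1)
print(res)
```

With `q = 3` and `q = 2` the critical values returned are the 7.81 and 5.99 used in the genetics example later in these notes.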
Example 1: Likelihood ratio test for the mean of a normal
distribution.
Let $X_1,\ldots,X_n$ be a random sample from a normal
distribution with mean $\mu$ and variance $\sigma^{2}$, both unknown.
Suppose we are interested in testing
\[
H_0:\mu=\mu_0\quad\text{versus}\quad H_1:\mu\ne\mu_0
\]
where $\mu_0$ is specified.
Let $\Omega=\{(\mu,\sigma^{2}):-\infty<\mu<\infty,\ \sigma^{2}>0\}$ denote the full
model parameter space. Here the null hypothesis parameter
space is defined as the subset of $\Omega$ for which the function
$g_1(\mu,\sigma^{2})=\mu$ satisfies the constraint
$g_1(\mu,\sigma^{2})=\mu_0$:
\[
\omega=\{(\mu,\sigma^{2}):\mu=\mu_0,\ \sigma^{2}>0\}
\]
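The MLEs for the parameter space $\Omega$ are $\hat\mu=\bar X$ and
$\hat\sigma^{2}=\frac{1}{n}\sum_{i=1}^{n}(X_i-\bar X)^{2}$. It is easy to show that the MLEs
for the parameter space $\omega$ are $\hat\mu_0=\mu_0$ and
$\hat\sigma_0^{2}=\frac{1}{n}\sum_{i=1}^{n}(X_i-\mu_0)^{2}$. Thus, the likelihood ratio is
\[
\Lambda=\frac{\max_{\theta\in\omega}L(\theta)}{\max_{\theta\in\Omega}L(\theta)}
=\frac{\left(\dfrac{1}{2\pi}\right)^{n/2}\left(\dfrac{1}{\hat\sigma_0^{2}}\right)^{n/2}
\exp\left\{-\dfrac{\sum_{i=1}^{n}(X_i-\mu_0)^{2}}{2\hat\sigma_0^{2}}\right\}}
{\left(\dfrac{1}{2\pi}\right)^{n/2}\left(\dfrac{1}{\hat\sigma^{2}}\right)^{n/2}
\exp\left\{-\dfrac{\sum_{i=1}^{n}(X_i-\bar X)^{2}}{2\hat\sigma^{2}}\right\}}
\]
\[
=\frac{\left(\dfrac{1}{\hat\sigma_0}\right)^{n}\exp\left(-\dfrac{n}{2}\right)}
{\left(\dfrac{1}{\hat\sigma}\right)^{n}\exp\left(-\dfrac{n}{2}\right)}
=\left(\frac{\sum_{i=1}^{n}(X_i-\bar X)^{2}}{\sum_{i=1}^{n}(X_i-\mu_0)^{2}}\right)^{n/2}
\]
\[
-2\log\Lambda=n\log\left(\frac{\sum_{i=1}^{n}(X_i-\mu_0)^{2}}{\sum_{i=1}^{n}(X_i-\bar X)^{2}}\right)
\]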
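We reject for $-2\log\Lambda>\chi^{2}_{\alpha}(1)$.
Using the identity
\[
\sum_{i=1}^{n}(X_i-\mu_0)^{2}=\sum_{i=1}^{n}(X_i-\bar X)^{2}+n(\bar X-\mu_0)^{2},
\]
we have
\[
-2\log\Lambda=n\log\left(1+\frac{n(\bar X-\mu_0)^{2}}{\sum_{i=1}^{n}(X_i-\bar X)^{2}}\right)
\]
Thus the likelihood ratio test rejects for large values of
\[
\frac{(\bar X-\mu_0)^{2}}{\sum_{i=1}^{n}(X_i-\bar X)^{2}/n}
\]
or, equivalently, for large values of
\[
\frac{(\bar X-\mu_0)^{2}}{\sum_{i=1}^{n}(X_i-\mu_0)^{2}/n},
\]
which is an increasing function of the first ratio.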
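The link between the likelihood ratio statistic and the usual one-sample t statistic can be checked numerically. The following sketch uses simulated data (seed, sample size and mean are arbitrary assumptions) and relies on the relation $n(\bar X-\mu_0)^2/\sum(X_i-\bar X)^2=t^2/(n-1)$.

```r
set.seed(1)                        # simulated data, purely illustrative
x   <- rnorm(30, mean = 0.4, sd = 1)
mu0 <- 0
n   <- length(x)

# -2 log(Lambda) = n * log( sum (x_i - mu0)^2 / sum (x_i - xbar)^2 )
lrt_stat <- n * log(sum((x - mu0)^2) / sum((x - mean(x))^2))

# Same quantity through the one-sample t statistic: n * log(1 + t^2 / (n - 1))
tstat    <- unname(t.test(x, mu = mu0)$statistic)
lrt_by_t <- n * log(1 + tstat^2 / (n - 1))

print(c(lrt_stat = lrt_stat, lrt_by_t = lrt_by_t,
        p_value = pchisq(lrt_stat, df = 1, lower.tail = FALSE)))
```

Because the statistic is an increasing function of $t^2$, the likelihood ratio test orders samples the same way as the two-sided t test; only the reference distribution (chi-square approximation versus exact t) differs.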
Example 2: Linkage in genetics
Corn can be starchy (S) or sugary (s) and can have a green
base leaf (G) or a white base leaf (g). The traits starchy and
green base leaf are dominant traits. Suppose the alleles for
these two factors occur on separate chromosomes and are
hence independent. Then each parent with alleles SsGg
produces with equal likelihood gametes of the form (S,G),
(S,g), (s,G) and (s,g). If two such hybrid parents are
crossed, the phenotypes of the offspring will occur in the
proportions suggested by the table below. That is, the
probability of an offspring of type (S,G) is 9/16; type (S,g)
is 3/16; type (s,G) is 3/16; type (s,g) is 1/16.
                     Alleles of first parent
Alleles of
second parent    SG       Sg       sG       sg
SG               (S,G)    (S,G)    (S,G)    (S,G)
Sg               (S,G)    (S,g)    (S,G)    (S,g)
sG               (S,G)    (S,G)    (s,G)    (s,G)
sg               (S,G)    (S,g)    (s,G)    (s,g)
The table below shows the results of a set of 3839 SsGg x
SsGg crossings (Carver, 1927, Genetics, "A Genetic Study
of Certain Chlorophyll Deficiencies in Maize").

Phenotype        Number in sample
Starchy green    1997
Starchy white    906
Sugary green     904
Sugary white     32
Does the genetic model with 9:3:3:1 ratios fit the data?
Let $X_i$ denote the phenotype of the ith crossing.
Model: $X_1,\ldots,X_n$ are iid multinomial with
\[
P(X_i=SG)=p_{SG},\quad P(X_i=Sg)=p_{Sg},\quad P(X_i=sG)=p_{sG},\quad P(X_i=sg)=p_{sg}
\]
\[
H_0:p_{SG}=9/16,\ p_{Sg}=3/16,\ p_{sG}=3/16,\ p_{sg}=1/16
\]
$H_1$: At least one of $p_{SG}=9/16$, $p_{Sg}=3/16$, $p_{sG}=3/16$, $p_{sg}=1/16$ is not correct.
Maximum likelihood for multinomial distribution:
Consider a random trial which can result in one, and only
one, of k outcomes or categories. Let $p_1,\ldots,p_{k-1}$ denote
the probabilities of the 1,...,k-1 outcomes (Note:
$p_k=1-p_1-\cdots-p_{k-1}$). Let X denote the outcome of the
trial. For $X_1,\ldots,X_n$ iid, let $Y_1,\ldots,Y_k$ denote the number of
trials whose outcome is 1,...,k respectively.
We have
\[
P(X_1,\ldots,X_n)=p_1^{Y_1}\cdots p_{k-1}^{Y_{k-1}}\,(1-p_1-\cdots-p_{k-1})^{\,n-Y_1-\cdots-Y_{k-1}}
\]
\[
l(p_1,\ldots,p_{k-1})=Y_1\log p_1+\cdots+Y_{k-1}\log p_{k-1}
+(n-Y_1-\cdots-Y_{k-1})\log(1-p_1-\cdots-p_{k-1})
\]
\[
\frac{\partial l}{\partial p_1}=\frac{Y_1}{p_1}-\frac{n-Y_1-\cdots-Y_{k-1}}{1-p_1-\cdots-p_{k-1}},\quad\ldots,\quad
\frac{\partial l}{\partial p_{k-1}}=\frac{Y_{k-1}}{p_{k-1}}-\frac{n-Y_1-\cdots-Y_{k-1}}{1-p_1-\cdots-p_{k-1}}
\]
Setting these partial derivatives equal to zero, it is easily seen that
$\hat p_{j,MLE}=\dfrac{Y_j}{n}$ satisfies the resulting equations.
See (6.4.19) and (6.4.20) in book for information matrix.
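That the sample proportions solve the score equations can be verified numerically. The sketch below uses the corn phenotype counts from the genetics example as the category counts; the function name `score` and the variable names are illustrative assumptions.

```r
# Observed category counts Y_1, ..., Y_k (corn phenotype counts from the notes)
y <- c(SG = 1997, Sg = 906, sG = 904, sg = 32)
n <- sum(y)
k <- length(y)

p_hat <- y / n                      # candidate MLE: sample proportions

# Score for the free parameters p_1, ..., p_{k-1}:
#   dl/dp_j = Y_j / p_j - Y_k / p_k,  with p_k = 1 - p_1 - ... - p_{k-1}
score <- function(p) y[-k] / p[-k] - y[k] / p[k]

print(round(p_hat, 4))
print(score(p_hat))                 # numerically zero at p_hat
```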
Back to genetic model:
Likelihood ratio test:
\[
\Lambda=\frac{\max_{\theta\in\omega}L(\theta)}{\max_{\theta\in\Omega}L(\theta)}
=\frac{(9/16)^{1997}(3/16)^{906}(3/16)^{904}(1/16)^{32}}
{(1997/3839)^{1997}(906/3839)^{906}(904/3839)^{904}(32/3839)^{32}}
\]
\[
-2\log\Lambda=-2\left(1997\log\frac{9/16}{1997/3839}+906\log\frac{3/16}{906/3839}
+904\log\frac{3/16}{904/3839}+32\log\frac{1/16}{32/3839}\right)=387.51
\]
Under $H_0:p_{SG}=9/16,\ p_{Sg}=3/16,\ p_{sG}=3/16,\ p_{sg}=1/16$,
$-2\log\Lambda\sim\chi^{2}(3)$ approximately [there are three extra free parameters in
$H_1$].
Reject $H_0$ if $-2\log\Lambda>\chi^{2}_{.05}(3)=7.81$.
Thus we reject
$H_0:p_{SG}=9/16,\ p_{Sg}=3/16,\ p_{sG}=3/16,\ p_{sg}=1/16$.
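The arithmetic behind this test takes only a few lines of R. This is a sketch using the counts and null probabilities stated above; the variable names are illustrative.

```r
y     <- c(SG = 1997, Sg = 906, sG = 904, sg = 32)   # counts from the notes
p0    <- c(9, 3, 3, 1) / 16                          # null (9:3:3:1) probabilities
p_hat <- y / sum(y)                                  # unrestricted MLE

# -2 log(Lambda) = 2 * sum_j Y_j * log(p_hat_j / p0_j)
lrt_stat <- 2 * sum(y * log(p_hat / p0))
crit     <- qchisq(0.95, df = 3)

print(c(statistic = lrt_stat, critical = crit))
print(pchisq(lrt_stat, df = 3, lower.tail = FALSE))  # approximate p-value
```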
What could be going on? Linkage. See handout.
Model for linkage:
\[
p_{SG}=\frac{1}{4}(2+\theta),\quad p_{Sg}=\frac{1}{4}(1-\theta),\quad
p_{sG}=\frac{1}{4}(1-\theta),\quad p_{sg}=\frac{1}{4}\theta
\]
\[
L(\theta)=\left[\frac{1}{4}(2+\theta)\right]^{Y_1}\left[\frac{1}{4}(1-\theta)\right]^{Y_2}
\left[\frac{1}{4}(1-\theta)\right]^{Y_3}\left[\frac{1}{4}\theta\right]^{n-Y_1-Y_2-Y_3}
\]
Maximum likelihood estimate of $\theta$ for corn data = 0.0357,
see handout.
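The handout's derivation is not reproduced here, but the estimate can be checked numerically. The sketch below maximizes the linkage log-likelihood with `optimize()` and, as a cross-check, solves the quadratic obtained by setting the derivative of $\log L(\theta)$ to zero; this is one way to obtain the value and is not necessarily the handout's method, and the variable names are illustrative.

```r
y <- c(SG = 1997, Sg = 906, sG = 904, sg = 32)   # counts from the notes

# Log-likelihood of the linkage model (additive constants dropped)
loglik <- function(theta) {
  p <- c(2 + theta, 1 - theta, 1 - theta, theta) / 4
  sum(y * log(p))
}

# Numerical maximisation over 0 < theta < 1
fit <- optimize(loglik, interval = c(1e-6, 1 - 1e-6),
                maximum = TRUE, tol = 1e-10)
theta_hat <- fit$maximum

# Cross-check: the score equation reduces to the quadratic
#   n*theta^2 - (Y1 - 2*(Y2 + Y3) - Y4)*theta - 2*Y4 = 0
n <- sum(y)
b <- y[["SG"]] - 2 * (y[["Sg"]] + y[["sG"]]) - y[["sg"]]
theta_root <- (b + sqrt(b^2 + 8 * n * y[["sg"]])) / (2 * n)

print(c(optimize = theta_hat, quadratic = theta_root))
```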
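Test
\[
H_0:p_{SG}=\frac{1}{4}(2+\theta),\ p_{Sg}=\frac{1}{4}(1-\theta),\
p_{sG}=\frac{1}{4}(1-\theta),\ p_{sg}=\frac{1}{4}\theta
\quad\text{for some }\theta,\ 0<\theta<1
\]
vs.
$H_1$: $p_{SG},p_{Sg},p_{sG},p_{sg}$ do not satisfy
\[
p_{SG}=\frac{1}{4}(2+\theta),\ p_{Sg}=\frac{1}{4}(1-\theta),\
p_{sG}=\frac{1}{4}(1-\theta),\ p_{sg}=\frac{1}{4}\theta
\]
for any $\theta$, $0<\theta<1$.
\[
\Lambda=\frac{\max_{\theta\in\omega}L(\theta)}{\max_{\theta\in\Omega}L(\theta)}
=\frac{(.25(2+.0357))^{1997}(.25(1-.0357))^{906}(.25(1-.0357))^{904}(.25(.0357))^{32}}
{(1997/3839)^{1997}(906/3839)^{906}(904/3839)^{904}(32/3839)^{32}}
\]
\[
-2\log\Lambda=2.02
\]
Under $H_0$, $-2\log\Lambda\sim\chi^{2}(2)$ approximately [there are two extra free
parameters in $H_1$].
\[
-2\log\Lambda=2.02<\chi^{2}_{.05}(2)=5.99
\]
Linkage model is not rejected.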
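This statistic can be reproduced as follows. The sketch plugs in the estimate 0.0357 quoted in the notes rather than re-deriving it; the variable names are illustrative.

```r
y <- c(SG = 1997, Sg = 906, sG = 904, sg = 32)   # counts from the notes
theta_hat <- 0.0357                              # MLE quoted in the notes

# Fitted cell probabilities under the linkage model and the unrestricted model
p_null <- c(2 + theta_hat, 1 - theta_hat, 1 - theta_hat, theta_hat) / 4
p_full <- y / sum(y)

# -2 log(Lambda) and the 5% critical value; df = 3 free parameters - 1 under H0
lrt_stat <- 2 * sum(y * log(p_full / p_null))
crit     <- qchisq(0.95, df = 2)

print(c(statistic = lrt_stat, critical = crit))
```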