3.4 Marginal distribution

3.4 Using the Marginal Distribution
(I) Introduction
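Recall that the marginal density of $X$ is
$$m(x) = \begin{cases} \displaystyle\int_{\Theta} f(x \mid \theta)\,\pi(\theta)\,d\theta & \text{(continuous)} \\[2ex] \displaystyle\sum_{\theta \in \Theta} f(x \mid \theta)\,\pi(\theta) & \text{(discrete)} \end{cases}$$
where $f(x \mid \theta)$ is the density of $X$ given $\theta$ and $\pi(\theta)$ is the prior density of $\theta$.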

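Example 12:
$X \mid \theta \sim N(\theta, \sigma_f^2)$, $\theta \sim N(\mu_\pi, \sigma_\pi^2)$.
Then, the marginal density of $X$ is
$$m(x) = \int_{-\infty}^{\infty} f(x \mid \theta)\,\pi(\theta)\,d\theta = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma_f}\, e^{-\frac{(x-\theta)^2}{2\sigma_f^2}} \cdot \frac{1}{\sqrt{2\pi}\,\sigma_\pi}\, e^{-\frac{(\theta-\mu_\pi)^2}{2\sigma_\pi^2}}\, d\theta,$$
that is, marginally $X \sim N(\mu_\pi, \sigma_\pi^2 + \sigma_f^2)$.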
◆
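The closed form in Example 12 can be checked numerically. The sketch below is only an illustration: the function names (`normal_pdf`, `marginal_numeric`, `marginal_closed_form`) and the integration settings are assumptions made here, not part of the notes. It approximates the integral of $f(x \mid \theta)\pi(\theta)$ with the trapezoid rule and compares it with the $N(\mu_\pi, \sigma_\pi^2 + \sigma_f^2)$ density.

```python
import math

def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def marginal_numeric(x, mu_pi, var_pi, var_f, half_width=12.0, steps=4000):
    """Trapezoid-rule approximation of m(x) = integral f(x|theta) pi(theta) d(theta).

    The integration range (mu_pi +/- half_width prior standard deviations)
    and the number of steps are illustrative choices.
    """
    sd_pi = math.sqrt(var_pi)
    lo = mu_pi - half_width * sd_pi
    hi = mu_pi + half_width * sd_pi
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        theta = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # end points get half weight
        total += weight * normal_pdf(x, theta, var_f) * normal_pdf(theta, mu_pi, var_pi)
    return total * h

def marginal_closed_form(x, mu_pi, var_pi, var_f):
    """Density of N(mu_pi, var_pi + var_f), the marginal stated in Example 12."""
    return normal_pdf(x, mu_pi, var_pi + var_f)
```

For instance, `marginal_numeric(0.7, 1.0, 2.0, 1.0)` and `marginal_closed_form(0.7, 1.0, 2.0, 1.0)` should agree to many decimal places.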
Note:
$m(x)$ is sometimes called the predictive distribution for $X$.
Let $m_c(x)$ be the marginal density under the correct prior and $m_w(x)$ be the marginal density under the wrong prior. Then, a statistic obtained from the data should be “close” to the same statistic based on $m_c(x)$, not on $m_w(x)$. For example, let $x_c$ be the mode of $m_c(x)$ and $x_w$ be the mode of $m_w(x)$. Intuitively, the observed data $x$ should be around $x_c$, not $x_w$.
To find the “correct” or “sensible” prior effectively, one could restrict
the choice of the priors to some class. Then, based on some criteria,
the best prior could be found. Several classes of priors are frequently
used.
1. Priors of a given functional form:
$$\Gamma = \{\pi : \pi(\theta) = g(\theta \mid \lambda),\ \lambda \in \Lambda\},$$
where $\Lambda$ is some set and $\lambda$ is called a hyper-parameter of the prior.
Example 13:
$$\Gamma = \{\pi : \pi \sim N(\mu_\pi, \sigma_\pi^2),\ \mu_\pi > 0,\ \sigma_\pi^2 > 0\},$$
where $(\mu_\pi, \sigma_\pi^2)$ is the hyper-parameter.
2. Priors of a given structural form:
$$\Gamma = \Big\{\pi : \pi(\theta) = \prod_{i=1}^{p} \pi_0(\theta_i),\ \pi_0 \text{ is any density},\ \theta = (\theta_1, \ldots, \theta_p)^t\Big\}.$$

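Example 14:
$X = (X_1, X_2, \ldots, X_p)$, $X_i \sim N(\theta_i, \sigma_f^2)$, where $\sigma_f^2$ is a known constant.
$$\Gamma = \Big\{\pi : \pi(\theta) = \prod_{i=1}^{p} \pi_0(\theta_i),\ \pi_0 \sim N(\mu_\pi, \sigma_\pi^2),\ -\infty < \mu_\pi < \infty,\ \sigma_\pi^2 > 0\Big\}.$$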

3. Priors close to an elicited prior:
Any prior “close” to a sensible prior $\pi_0$ would also be reasonable. For example, the $\varepsilon$-contamination class is
$$\Gamma = \{\pi : \pi(\theta) = (1-\varepsilon)\,\pi_0(\theta) + \varepsilon\, q(\theta),\ q \in L\},$$
where $L$ is a class of possible “contaminations” and $q(\theta)$ is some density function for $\theta$.
(II) Prior selection
There are several approaches to selecting a sensible prior:
(i) the ML-II approach
(ii) the moment approach
(iii) the distance approach
(i) ML-II approach
Let $\Gamma$ be a class of priors under consideration. The ML-II (maximum likelihood-type II) approach is to find $\hat{\pi} \in \Gamma$ satisfying
$$m_{\hat{\pi}}(x) = \sup_{\pi \in \Gamma} m_{\pi}(x).$$
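Example 15:
$X = (X_1, X_2, \ldots, X_p)$, $X_i \mid \theta_i \sim N(\theta_i, 1)$ independently, and $\pi(\theta_i) \sim N(\mu_\pi, \sigma_\pi^2)$.
Then,
$$m_\pi(x_i) \sim N(\mu_\pi, 1 + \sigma_\pi^2).$$
The ML-II method is to find $\mu_\pi$ and $\sigma_\pi^2$ maximizing $m_\pi(x)$.
Thus, using $\sum_{i=1}^{p}(x_i - \mu_\pi)^2 = \sum_{i=1}^{p}(x_i - \bar{x})^2 + p(\bar{x} - \mu_\pi)^2$,
$$m_\pi(x) = \prod_{i=1}^{p} \frac{1}{\sqrt{2\pi(1+\sigma_\pi^2)}}\, e^{-\frac{(x_i - \mu_\pi)^2}{2(1+\sigma_\pi^2)}} = \left[2\pi(1+\sigma_\pi^2)\right]^{-\frac{p}{2}} e^{-\frac{\sum_{i=1}^{p}(x_i - \bar{x})^2}{2(1+\sigma_\pi^2)}}\, e^{-\frac{p(\bar{x} - \mu_\pi)^2}{2(1+\sigma_\pi^2)}}.$$
Setting
$$\frac{\partial m_\pi(x)}{\partial \mu_\pi} = 0, \qquad \frac{\partial m_\pi(x)}{\partial \sigma_\pi^2} = 0$$
gives
$$\hat{\mu}_\pi = \bar{x}, \qquad \hat{\sigma}_\pi^2 = \frac{\sum_{i=1}^{p}(x_i - \bar{x})^2}{p} - 1 = s^2 - 1, \quad \text{where } s^2 = \frac{1}{p}\sum_{i=1}^{p}(x_i - \bar{x})^2.$$
Since a variance cannot be negative, $\hat{\sigma}_\pi^2 = \max\{0, s^2 - 1\}$.
Therefore,
$$\hat{\pi}(\theta_i) \sim N\big(\bar{x}, \max\{0, s^2 - 1\}\big).$$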
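The ML-II estimates of Example 15 are simple to compute from data. The following is a minimal sketch, assuming unit sampling variance as in the example; the function names `ml2_normal_prior` and `log_marginal` are hypothetical and introduced only for illustration. The second function evaluates the log of $m_\pi(x)$ so that the estimates can be compared with nearby hyper-parameter values.

```python
import math

def ml2_normal_prior(xs):
    """ML-II hyper-parameter estimates for X_i ~ N(theta_i, 1), theta_i ~ N(mu, var).

    Returns (mu_hat, var_hat) = (x_bar, max(0, s^2 - 1)),
    where s^2 is the sum of squared deviations divided by p (not p - 1).
    """
    p = len(xs)
    x_bar = sum(xs) / p
    s2 = sum((x - x_bar) ** 2 for x in xs) / p
    return x_bar, max(0.0, s2 - 1.0)

def log_marginal(xs, mu, var):
    """Log of the joint marginal density: each x_i is N(mu, 1 + var)."""
    total_var = 1.0 + var
    p = len(xs)
    sum_sq = sum((x - mu) ** 2 for x in xs)
    return -0.5 * p * math.log(2.0 * math.pi * total_var) - sum_sq / (2.0 * total_var)
```

With the illustrative data `[1.0, 3.0, 5.0, 7.0]`, the sample mean is 4 and $s^2 = 5$, so the estimated prior is $N(4, 4)$. When the data are less spread out than the unit sampling variance allows, the variance estimate is truncated at 0.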
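Example 16:
$$\Gamma = \{\pi : \pi(\theta) = (1-\varepsilon)\,\pi_0(\theta) + \varepsilon\, q(\theta),\ q \in L\}.$$
Then,
$$m_\pi(x) = \int f(x \mid \theta)\left[(1-\varepsilon)\,\pi_0(\theta) + \varepsilon\, q(\theta)\right] d\theta = (1-\varepsilon)\int f(x \mid \theta)\,\pi_0(\theta)\,d\theta + \varepsilon \int f(x \mid \theta)\, q(\theta)\,d\theta = (1-\varepsilon)\, m_{\pi_0}(x) + \varepsilon\, m_q(x).$$
Now, the ML-II prior is obtained by finding $\hat{q}$ in $L$ which maximizes $m_q(x)$. If $L$ is the class of all possible distributions, let $\hat{\theta}$ be the value that maximizes $f(x \mid \theta)$ and let $\langle\hat{\theta}\rangle$ be the distribution with $P(\theta = \hat{\theta}) = 1$ (all mass at $\hat{\theta}$). Since
$$m_q(x) = \int f(x \mid \theta)\, q(\theta)\,d\theta \le \int f(x \mid \hat{\theta})\, q(\theta)\,d\theta = f(x \mid \hat{\theta}) \int q(\theta)\,d\theta = f(x \mid \hat{\theta}) = \int f(x \mid \theta)\,\langle\hat{\theta}\rangle(d\theta) = m_{\langle\hat{\theta}\rangle}(x),$$
then
$$\hat{\pi}(\theta) = (1-\varepsilon)\,\pi_0(\theta) + \varepsilon\,\langle\hat{\theta}\rangle(\theta).$$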
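The argument in Example 16 can be illustrated with a concrete case. The sketch below assumes, purely for illustration, a single observation with $X \mid \theta \sim N(\theta, 1)$ and a base prior $\pi_0 = N(\mu_0, \sigma_0^2)$; neither choice comes from the example itself, and the function names are hypothetical. Under these assumptions $\hat{\theta} = x$, and a point-mass contamination at any other value gives a smaller marginal.

```python
import math

def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def contaminated_marginal(x, eps, mu0, var0, theta_q):
    """Marginal of x under (1 - eps) * N(mu0, var0) + eps * (point mass at theta_q).

    Assumes the likelihood X | theta ~ N(theta, 1), so that
    m_{pi_0}(x) is the N(mu0, 1 + var0) density and
    m_q(x) = f(x | theta_q) for a point-mass contamination q.
    """
    m_base = normal_pdf(x, mu0, 1.0 + var0)
    m_q = normal_pdf(x, theta_q, 1.0)
    return (1.0 - eps) * m_base + eps * m_q

def ml2_point_mass(x):
    """Location of the ML-II contamination: the maximizer of f(x | theta), here x."""
    return x
```

For example, with `x = 2.0` and `eps = 0.1`, placing the contaminating mass at `ml2_point_mass(2.0)` yields a larger marginal than placing it at 1.9, 2.1, or any other value.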
(ii) Moment approach
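Let $\mu_f(\theta)$ and $\sigma_f^2(\theta)$ be the conditional mean and variance of $X$ with respect to $f(x \mid \theta)$. Also, let $\mu_m$ and $\sigma_m^2$ be the known marginal mean and variance of $X$ with respect to $m(x)$. Then, the following equations can be used to obtain the moments of the prior density, such as $\mu_\pi$ and $\sigma_\pi^2$:
$$\mu_m = E^{\pi}\left[\mu_f(\theta)\right] \qquad \big(\text{from } E(X) = E\left[E(X \mid Y)\right]\big),$$
$$\sigma_m^2 = E^{\pi}\left[\sigma_f^2(\theta)\right] + E^{\pi}\left[\left(\mu_f(\theta) - \mu_m\right)^2\right] \qquad \big(\text{from } Var(X) = E\left[Var(X \mid Y)\right] + E\left[\left(E(X \mid Y) - E(X)\right)^2\right]\big).$$
One special example is
$$\mu_f(\theta) = \theta \ \Rightarrow\ \mu_m = E^{\pi}(\theta) = \mu_\pi,$$
$$\mu_f(\theta) = \theta,\ \sigma_f^2(\theta) = \sigma_f^2 \ \Rightarrow\ \sigma_m^2 = \sigma_f^2 + \sigma_\pi^2.$$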
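Example 17:
$X \sim N(\theta, 1)$, $\pi(\theta) \sim N(\mu_\pi, \sigma_\pi^2)$.
Suppose we know $\mu_m = 1$, $\sigma_m^2 = 3$. Since
$$\mu_f(\theta) = \theta, \qquad \sigma_f^2(\theta) = 1,$$
then
$$\mu_m = 1 = \mu_\pi, \qquad \sigma_m^2 = 3 = \sigma_f^2 + \sigma_\pi^2 = 1 + \sigma_\pi^2 \ \Rightarrow\ \sigma_\pi^2 = 2.$$
Thus, $\pi(\theta) \sim N(1, 2)$ is the appropriate prior.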
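The calculation in Example 17 amounts to solving the two moment equations of the special case $\mu_f(\theta) = \theta$, $\sigma_f^2(\theta) = \sigma_f^2$. A minimal sketch follows; the function name `prior_moments_from_marginal` is hypothetical, and the error raised for a negative implied variance is a choice made here rather than something stated in the notes.

```python
def prior_moments_from_marginal(mu_m, var_m, var_f):
    """Solve mu_m = mu_pi and var_m = var_f + var_pi for (mu_pi, var_pi).

    Applies only to the special case where the conditional mean of X is theta
    and the conditional variance var_f does not depend on theta.
    """
    var_pi = var_m - var_f
    if var_pi < 0.0:
        # The marginal variance cannot be smaller than the sampling variance.
        raise ValueError("marginal variance is smaller than sampling variance")
    return mu_m, var_pi
```

Calling `prior_moments_from_marginal(1.0, 3.0, 1.0)` returns `(1.0, 2.0)`, matching the $N(1, 2)$ prior found in Example 17.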
(iii) Distance approach
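Let $\hat{m}(x)$ be the marginal density estimate of $X$ obtained from the data. Also, let
$$m_{\hat{\pi}}(x) = \int f(x \mid \theta)\,\hat{\pi}(\theta)\,d\theta$$
be the marginal density when the best prior is found. Then, we try to find $\hat{\pi}$ to minimize
$$d(\hat{m}, m_{\hat{\pi}}) = E^{\hat{m}}\left[\log\frac{\hat{m}(X)}{m_{\hat{\pi}}(X)}\right] = \begin{cases} \displaystyle\int \log\left[\frac{\hat{m}(x)}{m_{\hat{\pi}}(x)}\right]\hat{m}(x)\,dx & \text{(continuous)} \\[2ex] \displaystyle\sum_{x} \log\left[\frac{\hat{m}(x)}{m_{\hat{\pi}}(x)}\right]\hat{m}(x) & \text{(discrete)} \end{cases}$$
Note:
$\hat{m}(x) = m_{\hat{\pi}}(x) \ \Rightarrow\ d(\hat{m}, m_{\hat{\pi}}) = 0$.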
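The discrete form of the distance above is the Kullback-Leibler divergence between the two marginals. The sketch below shows one way to compute it and to pick the closest of several candidate marginals. Representing each marginal as a dictionary from values of $x$ to probabilities, and the names `kl_distance` and `closest_candidate`, are assumptions made for this illustration only.

```python
import math

def kl_distance(m_hat, m_prior):
    """d(m_hat, m_prior) = sum over x of m_hat(x) * log(m_hat(x) / m_prior(x)).

    Both arguments are dicts mapping each value x to its probability.
    """
    total = 0.0
    for x, p in m_hat.items():
        if p == 0.0:
            continue  # terms with m_hat(x) = 0 contribute nothing
        q = m_prior.get(x, 0.0)
        if q == 0.0:
            return math.inf  # m_prior rules out a value that m_hat allows
        total += p * math.log(p / q)
    return total

def closest_candidate(m_hat, candidates):
    """Return the label of the candidate marginal with the smallest distance."""
    return min(candidates, key=lambda name: kl_distance(m_hat, candidates[name]))
```

In practice each candidate marginal would be computed from a candidate prior through $m_\pi(x) = \int f(x \mid \theta)\,\pi(\theta)\,d\theta$; here they are simply supplied as inputs.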