Stat 543 Assignment 1 (due Monday January 25, 2016)

Likelihood Functions, Bayes Paradigm, and Simple Decision Theory
1. In this problem we'll consider some issues concerning the defining and meaning of a
"likelihood function." Underlying what follows, suppose that $X_1, X_2, \ldots, X_{10}$ are iid
exponential with mean $\theta$. That is, we'll suppose that the marginal pdf is
$$f(x \mid \theta) = I[x \ge 0]\,\frac{1}{\theta}\exp\left(-\frac{x}{\theta}\right)$$
Here are two data sets (pseudo-randomly) generated from this model with $\theta = 2$:

Data Set #1: 2.661, 3.764, 0.079, 1.286, 0.081, 0.233, 2.283, 2.600, 0.321, 0.421
Data Set #2: 2.426, 1.852, 0.803, 11.524, 0.055, 1.266, 0.506, 0.448, 0.749, 1.930
(a) Plot, on the same set of axes, the natural logarithms of the likelihood functions (as
defined in class for jointly continuous observations) based on Data Sets #1 and
#2. (It is often convenient to work on a log-likelihood scale, especially in models
with independent observations.) Are these two functions exactly the same? Do
they provide some reliable indication of how Vardeman generated the data?
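To get started, here is a minimal R sketch of one way to plot the two log-likelihoods on a common set of axes; the data vectors are the ones printed above, the theta grid endpoints are arbitrary choices, and the mean parameterization of the problem is converted to the rate parameterization that dexp expects.

```r
# Hypothetical R sketch: exponential log-likelihoods for the two data sets
x1 <- c(2.661, 3.764, 0.079, 1.286, 0.081, 0.233, 2.283, 2.600, 0.321, 0.421)
x2 <- c(2.426, 1.852, 0.803, 11.524, 0.055, 1.266, 0.506, 0.448, 0.749, 1.930)
theta <- seq(0.1, 8, by = 0.01)                      # arbitrary grid of theta values
loglik <- function(x, theta) {
  sapply(theta, function(t) sum(dexp(x, rate = 1 / t, log = TRUE)))
}
plot(theta, loglik(x1, theta), type = "l", ylab = "log-likelihood")
lines(theta, loglik(x2, theta), lty = 2)             # dashed curve for Data Set #2
```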
(b) (Type I left censoring) Suppose that what is observable is not $X_i$, but rather
$W_i = X_i I[X_i \ge 1]$. That is, if an observation is less than 1, I observe a "0"
(that I might as well take as a code for "observation was less than 1"). In this
circumstance, a sensible definition of a likelihood function would seem to be
$$L_{\text{left}}(\theta) = \prod_{i=1}^{10} \left( I[w_i = 0]\left(1 - \exp\left(-\frac{1}{\theta}\right)\right) + I[w_i \ge 1]\,\frac{1}{\theta}\exp\left(-\frac{w_i}{\theta}\right) \right)$$
Plot the logarithm of this likelihood function for Data Set #1. How does it
compare to that for Data Set #1 in part (a)? Does it have a similar "vertical
position" on the plot? Does it have a similar "shape"?
(c) (Rounding/digitalization) Using the joint pdf as a likelihood function essentially
means that regardless of how many digits are written down, a data value entered
to make the likelihood is treated as an "exact" "real number." That is, "2.661"
implicitly has an infinite number of trailing 0's behind it. Another (more realistic)
point of view is to think of "2.661" as meaning "something between 2.6605 and
2.6615." Under this interpretation, a more sensible likelihood function than the
one used in part (a) is
$$L_{.0005}(\theta) = \prod_{i=1}^{10} \bigl( F(x_i + .0005 \mid \theta) - F(x_i - .0005 \mid \theta) \bigr)$$
where $F(x \mid \theta)$ is the exponential cdf. Plot the logarithm of $L_{.0005}(\theta)$ for Data Set
#1 and compare it to the loglikelihood plotted in part (a). Then do something
similar, supposing that Data Set #1 had been recorded and reported to only 2
decimal places, and then to only 1 decimal place. How do these two loglikelihoods
compare to $\ln L_{.0005}(\theta)$ and to the one plotted in part (a)?
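A minimal R sketch of this interval-type ("rounded data") log-likelihood, assuming the same x1 and theta grid as in the earlier sketches and using the exponential cdf pexp, might look like the following; the half-width argument lets the 3-, 2-, and 1-decimal versions share one function.

```r
# Hypothetical R sketch: log-likelihood treating each reported value as an interval
loglik_round <- function(x, theta, half = 0.0005) {
  sapply(theta, function(t) {
    sum(log(pexp(x + half, rate = 1 / t) - pexp(pmax(x - half, 0), rate = 1 / t)))
  })
}
plot(theta, loglik_round(x1, theta), type = "l", ylab = "log-likelihood")
lines(theta, loglik_round(round(x1, 2), theta, half = 0.005), lty = 2)  # 2 decimal places
lines(theta, loglik_round(round(x1, 1), theta, half = 0.05), lty = 3)   # 1 decimal place
```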
2. Suppose that $X_1, X_2, \ldots, X_5$ are iid N$(\mu, \sigma^2)$. Below are two data sets (pseudo-randomly) generated from this model with $\mu = 0$ and $\sigma = 1$:

Data Set #1: 0.041, 0.705, 0.088, 0.103, 1.9203
Data Set #2: 0.195, 0.551, 0.821, 0.319, 0.457
Make contour plots of the loglikelihood functions for these two data sets (for $-2 < \mu < 2$
and $0 < \sigma < 2$) (using the definition of likelihood function given in class for
jointly continuous observations). If you haven't before made a contour plot in R, see
the syntax provided for the last part of Lab #8 at
http://www.public.iastate.edu/~vardeman/stat401/stat401B.html
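In case the Lab #8 syntax is not handy, a hedged sketch of one way to produce such a contour plot in R follows; it assumes the Data Set #1 values are in y1 and uses the grid limits given in the problem.

```r
# Hypothetical R sketch: contour plot of the normal loglikelihood over (mu, sigma)
y1    <- c(0.041, 0.705, 0.088, 0.103, 1.9203)
mu    <- seq(-2, 2, length.out = 200)
sigma <- seq(0.01, 2, length.out = 200)          # start just above 0 to avoid sigma = 0
ll    <- outer(mu, sigma,
               Vectorize(function(m, s) sum(dnorm(y1, mean = m, sd = s, log = TRUE))))
contour(mu, sigma, ll, nlevels = 30, xlab = "mu", ylab = "sigma")
```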
3. There are two possible diagnostic tests for a certain medical condition. Individuals
from some population (say infants born at hospital X) are tested for the condition. In
3 separate studies
10 of 100 tested with Test A alone had a "positive" test result
20 of 100 tested with Test B alone had a "positive" test result
50 tested with both Test A and Test B broke down as below
                  Test B
                   +     -
      Test A  +    4     2
              -    8    36
Think of running both tests on an individual as producing an outcome falling naturally
into one of the 4 cells of the table below (with corresponding probabilities indicated).

                  Test B
                   +           -
      Test A  +    $p_{++}$    $p_{+-}$
              -    $p_{-+}$    $p_{--}$
Since $p_{--} = 1 - (p_{++} + p_{+-} + p_{-+})$, we might think of this as a statistical problem
with a 3-dimensional parameter vector $p = (p_{++}, p_{+-}, p_{-+})$. Write an appropriate
loglikelihood for $p$ based on all data collected in the three studies.
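For checking your algebra (not as a substitute for writing the loglikelihood out), here is a hedged R sketch of one possible formulation; it assumes the three studies are independent, reads the first subscript as the Test A result and the second as the Test B result (an assumption about the table's orientation), and treats the 50 both-test subjects as a single multinomial observation.

```r
# Hypothetical R sketch: one way to evaluate a loglikelihood for p = (p++, p+-, p-+)
loglik_p <- function(p) {
  ppp <- p[1]; ppm <- p[2]; pmp <- p[3]      # p++, p+-, p-+
  pmm <- 1 - ppp - ppm - pmp                 # p-- is implied
  pA  <- ppp + ppm                           # P(Test A positive), assuming first subscript = Test A
  pB  <- ppp + pmp                           # P(Test B positive)
  dbinom(10, 100, pA, log = TRUE) +          # study 1: Test A alone, 10 of 100 positive
    dbinom(20, 100, pB, log = TRUE) +        # study 2: Test B alone, 20 of 100 positive
    dmultinom(c(4, 2, 8, 36),                # study 3: both tests on 50 subjects
              prob = c(ppp, ppm, pmp, pmm), log = TRUE)
}
loglik_p(c(0.10, 0.05, 0.10))                # example evaluation at an arbitrary p
```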
4. The following are required
(a) B&D Problems 1.2.2, 1.2.3, 1.2.14
(b) B&D Problems 1.3.3, 1.3.9, 1.3.10, 1.3.11, 1.3.12, 1.3.18 (use n = 5)
5. B&D Problem 1.2.5 is optional, not required but recommended.
LATE ADJUSTMENT: Hold Problem 2, and B&D 1.3.11 and 1.3.12 for Assignment 2.
Hint for Problem 1.2.14 of B&D
The general story in this problem is this:
For $X_1, X_2, X_3, \ldots, X_n, X_{n+1}$ iid $f(x \mid \theta)$ and $\theta \sim g(\theta)$, the joint density of the observables and
parameter is
$$f(x, \theta) = g(\theta) \prod_{i=1}^{n+1} f(x_i \mid \theta)$$
and from this, one can (in theory) get the marginal of the vector of observables $X$ by
integrating out $\theta$, i.e.
$$f_g(x) = \int g(\theta) \prod_{i=1}^{n+1} f(x_i \mid \theta)\, d\theta$$
THEN, from this one can contemplate finding the conditional distribution of $X_{n+1} \mid X_1, X_2, \ldots, X_n$
that B&D call the predictive distribution (note that you don't get this "for free" in terms of
assumptions ... the form of $f_g$ depends on $g$).
Now actually carrying out the above program is going to be very unpleasant except for
"convenient" cases. In fact, the only multivariate distributions that are really "convenient"
are the MVN distributions. Thankfully (and not surprisingly) this problem uses them.
That is, think of writing
$$X_i = \mu + \epsilon_i$$
for the $\epsilon_i$ iid N$(0, \sigma_0^2)$ independent of $\mu \sim$ N$(\mu_0, \tau_0^2)$. Then
$$(X_1, X_2, \ldots, X_n, X_{n+1}, \mu)' \sim \text{MVN}(\mu_0 \mathbf{1}, \Sigma)$$
for an appropriate $\Sigma$ that you can figure out easily enough, since finding variances and
covariances is straightforward. (Var $X_i = \sigma_0^2 + \tau_0^2$, Cov$(X_i, X_j) = \tau_0^2$, Cov$(X_i, \mu) = \tau_0^2$.)
From this, the marginal of $X_1, X_2, X_3, \ldots, X_n, X_{n+1}$ is obvious (also MVN) and you can
use facts about conditionals of MVN distributions to get the conditional distribution of
$X_{n+1} \mid X_1, X_2, \ldots, X_n$.
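As a numerical sanity check on whatever closed form you derive, here is a hedged R sketch of the MVN conditioning computation; n, mu0, sigma0, tau0, and the "observed" values are arbitrary illustrative choices, and the covariance matrix of $(X_1, \ldots, X_{n+1})$ is built directly from the variances and covariances listed above.

```r
# Hypothetical R sketch: conditional mean/variance of X_{n+1} given X_1,...,X_n
n      <- 5
mu0    <- 1; sigma0 <- 2; tau0 <- 0.5        # arbitrary illustrative values
x_obs  <- c(0.8, 1.3, 2.1, 0.4, 1.7)         # made-up "observed" X_1,...,X_n
Sigma  <- sigma0^2 * diag(n + 1) + tau0^2 * matrix(1, n + 1, n + 1)  # cov of (X_1,...,X_{n+1})
S11    <- Sigma[n + 1, n + 1]                # Var X_{n+1}
S12    <- Sigma[n + 1, 1:n, drop = FALSE]    # Cov(X_{n+1}, (X_1,...,X_n))
S22    <- Sigma[1:n, 1:n]
cond_mean <- mu0 + S12 %*% solve(S22, x_obs - mu0)
cond_var  <- S11 - S12 %*% solve(S22, t(S12))
c(mean = cond_mean, var = cond_var)
```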
In doing the above, you will probably need to use a matrix fact about the inverse of a certain
patterned matrix. The covariance matrix of $X$ (and of any subset of the entries of $X$) is of
the form
$$aI + bJ$$
for $J$ a matrix of 1's. The inverse of such a matrix has the form
$$\alpha I + \beta J$$
for
$$\alpha = \frac{1}{a} \quad \text{and} \quad \beta = \frac{-b}{a(a + kb)}$$
where $k$ is the dimension of the (square) matrix $I$ (or $J$).
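This identity is easy to spot-check numerically; the hedged R snippet below (with arbitrary a, b, and k) compares the claimed inverse to solve().

```r
# Hypothetical R sketch: numeric check of (aI + bJ)^{-1} = (1/a) I - (b / (a*(a + k*b))) J
a <- 4; b <- 0.25; k <- 6                     # arbitrary illustrative values
I <- diag(k); J <- matrix(1, k, k)
M <- a * I + b * J
claimed_inverse <- (1 / a) * I - (b / (a * (a + k * b))) * J
max(abs(claimed_inverse - solve(M)))          # should be numerically ~0
```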