EXAMPLES OF STATISTICAL LIKELIHOODS
1. Binomial likelihood
Suppose that you perform n independent success/fail trials in which the success
probability is p at each trial. Let
0
Yi = 
1
if trial i is a failure
if trial i is a success
Do this for i = 1, 2, …, n.
Let likelihood for Yi is Li = pYi 1  p  i . This is especially simple as Yi only has the
values 0 and 1. This likelihood will be either 1 – p or p.
1Y
The likelihood for the whole problem is

$$L = \prod_{i=1}^{n} L_i = \prod_{i=1}^{n} p^{Y_i}\,(1-p)^{1-Y_i} = p^{\sum_{i=1}^{n} Y_i}\,(1-p)^{\,n - \sum_{i=1}^{n} Y_i}.$$
If you let $X = \sum_{i=1}^{n} Y_i$, it's clear that the likelihood depends only on $X$. Of course, the likelihood of $X$ is known to be $\binom{n}{x}\, p^x (1-p)^{n-x}$, but this likelihood does not involve the $Y_i$'s.
The Yi’s carry information about the ordering of the successes and failures, and the
likelihoods for X and for the Yi’s are different. This distinction plays no role in most
statistical problems.
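To make the distinction concrete, here is a minimal Python sketch (the outcomes and success probability are hypothetical) comparing the likelihood of the ordered sequence with the binomial likelihood of $X$. The two differ by exactly the binomial coefficient, which does not involve $p$, so both are maximized at the same value of $p$.

```python
from math import comb

def sequence_likelihood(y, p):
    """Likelihood of the ordered 0/1 sequence: prod_i p^y_i (1-p)^(1-y_i)."""
    x = sum(y)                        # number of successes
    n = len(y)
    return p**x * (1 - p)**(n - x)

def count_likelihood(x, n, p):
    """Likelihood of the count X alone: C(n, x) p^x (1-p)^(n-x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

y = [1, 0, 0, 1, 1, 0, 1, 0, 1, 1]    # hypothetical trial outcomes
p = 0.6
L_seq = sequence_likelihood(y, p)
L_X = count_likelihood(sum(y), len(y), p)
print(L_seq, L_X, L_X / L_seq)        # the ratio is comb(10, 6) = 210
```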
2. Normal sample likelihood
Suppose that $X_1, X_2, \ldots, X_n$ is a sample from the normal distribution with mean $\mu$ and standard deviation $\sigma$. The likelihood for $X_i$ is

$$L_i = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2\sigma^2}(x_i - \mu)^2}$$
The likelihood for the entire sample is
$$L = \prod_{i=1}^{n} L_i = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2\sigma^2}(x_i - \mu)^2} = \frac{1}{\sigma^n\,(2\pi)^{n/2}}\, e^{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2}$$
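As an illustration, here is a small Python sketch (with a made-up sample) that evaluates the logarithm of this likelihood. Working on the log scale turns the product into a sum and avoids numerical underflow.

```python
import math

def normal_log_likelihood(xs, mu, sigma):
    """log L = -n log(sigma) - (n/2) log(2 pi) - sum((x - mu)^2) / (2 sigma^2)."""
    n = len(xs)
    ss = sum((x - mu) ** 2 for x in xs)
    return -n * math.log(sigma) - 0.5 * n * math.log(2 * math.pi) - ss / (2 * sigma ** 2)

xs = [4.2, 5.1, 3.8, 4.9, 5.5]        # hypothetical sample
print(normal_log_likelihood(xs, mu=4.7, sigma=0.7))
```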
3. Regression likelihood
Suppose that $x_1, x_2, \ldots, x_n$ are fixed numbers. Suppose that $Y_1, Y_2, \ldots, Y_n$ are independent random variables with $Y_i \sim N(\beta_0 + \beta_1 x_i,\, \sigma^2)$. The likelihood for $Y_i$ is

$$L_i = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2\sigma^2}(y_i - \beta_0 - \beta_1 x_i)^2}$$
The likelihood for the entire problem is
$$L = \prod_{i=1}^{n} L_i = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2\sigma^2}(y_i - \beta_0 - \beta_1 x_i)^2} = \frac{1}{\sigma^n\,(2\pi)^{n/2}}\, e^{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2}$$
It can be helpful to look more closely at the sum in the exponent:

$$\sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 = \sum_{i=1}^{n} y_i^2 - 2\beta_0 \sum_{i=1}^{n} y_i - 2\beta_1 \sum_{i=1}^{n} x_i y_i + 2\beta_0 \beta_1 \sum_{i=1}^{n} x_i + n\beta_0^2 + \beta_1^2 \sum_{i=1}^{n} x_i^2$$
The exponent involves five sums. The regression estimates for $\beta_0$, $\beta_1$, and $\sigma^2$ are computed from these five sums.
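As a sketch of how the five sums suffice, the following Python snippet (with hypothetical data) computes the least-squares estimates of $\beta_0$ and $\beta_1$ and the ML estimate of $\sigma^2$, using the expansion above for the residual sum of squares.

```python
def regression_from_sums(xs, ys):
    """Least-squares / ML estimates using only the five sums in the exponent."""
    n = len(xs)
    Sy, Sx = sum(ys), sum(xs)
    Syy = sum(y * y for y in ys)
    Sxy = sum(x * y for x, y in zip(xs, ys))
    Sxx = sum(x * x for x in xs)
    b1 = (Sxy - Sx * Sy / n) / (Sxx - Sx * Sx / n)   # slope estimate
    b0 = (Sy - b1 * Sx) / n                          # intercept estimate
    # Residual sum of squares, straight from the expansion above:
    rss = Syy - 2*b0*Sy - 2*b1*Sxy + 2*b0*b1*Sx + n*b0**2 + b1**2*Sxx
    return b0, b1, rss / n                           # rss/n is the ML estimate of sigma^2

xs = [1.0, 2.0, 3.0, 4.0, 5.0]        # hypothetical fixed x's
ys = [2.1, 3.9, 6.2, 8.1, 9.8]        # hypothetical responses
print(regression_from_sums(xs, ys))
```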
4. Exponential sample
Suppose that $X_1, X_2, \ldots, X_n$ is from the exponential density with mean $\theta$. Then the likelihood for $X_i$ is

$$L_i = \frac{1}{\theta}\, e^{-x_i/\theta}$$
The likelihood for the entire problem is
$$L = \prod_{i=1}^{n} L_i = \prod_{i=1}^{n} \frac{1}{\theta}\, e^{-x_i/\theta} = \frac{1}{\theta^n}\, e^{-\frac{1}{\theta}\sum_{i=1}^{n} x_i}$$
If the individual likelihood is written $L_i = \lambda e^{-\lambda x_i}$, the problem is equivalent. The parameters are related as $\lambda = \dfrac{1}{\theta}$.
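A quick Python sketch (with a hypothetical sample) confirms the equivalence of the two parameterizations: with $\lambda = 1/\theta$, the two log-likelihoods agree exactly.

```python
import math

def log_lik_theta(xs, theta):
    """Mean parameterization: log L = -n log(theta) - sum(x_i) / theta."""
    return -len(xs) * math.log(theta) - sum(xs) / theta

def log_lik_lambda(xs, lam):
    """Rate parameterization: log L = n log(lambda) - lambda * sum(x_i)."""
    return len(xs) * math.log(lam) - lam * sum(xs)

xs = [0.8, 2.3, 1.1, 0.4, 1.9]        # hypothetical sample
theta_hat = sum(xs) / len(xs)         # the ML estimate of the mean is the sample mean
print(log_lik_theta(xs, theta_hat))
print(log_lik_lambda(xs, 1 / theta_hat))   # identical: lambda = 1/theta
```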
5. Censored exponential sample
Suppose that $X_1, X_2, \ldots, X_n$ are independent random variables, each from the density $\lambda e^{-\lambda x}$. Actually, we are able to observe the $X_i$'s only if they have a value $T$ or below.
This corresponds to an experimental framework in which we are observing the lifetimes of $n$ independent objects (light bulbs, say), but the experiment ceases at time $T$.
Suppose that $K$ of the $X_i$'s are observed; call these values $X_1, X_2, X_3, \ldots, X_K$. The remaining $n - K$ values are censored at $T$; operationally, this means that there were $n - K$ light bulbs still burning when the experiment stopped at time $T$.
The random variables here are mixed discrete and continuous, and it is a little painful to think of a likelihood. Make this all discrete by breaking up the time axis into little $dx$ pieces. The likelihood associated with $X_i$ is $\lambda e^{-\lambda x_i}\,dx$. For each of the values noted as $T$, the associated likelihood is $P[\,X > T\,] = e^{-\lambda T}$. The overall likelihood is
$$L = \left(e^{-\lambda T}\right)^{n-K} \prod_{i=1}^{K} \lambda e^{-\lambda x_i}\,dx = \left(e^{-\lambda T}\right)^{n-K} \lambda^K\, e^{-\lambda \sum_{i=1}^{K} x_i}\, (dx)^K$$
It is now convenient to ignore the $(dx)^K$, one factor of $dx$ for each of the $K$ observed values; this is problematic only to those who agonize over the measure theory. Note that $K$ is random in this likelihood. One can debate whether there should be a binomial coefficient $\binom{n}{K}$.
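Here is a small Python sketch of this censored log-likelihood (hypothetical data; the $(dx)^K$ factor is dropped, as above). Setting the derivative with respect to $\lambda$ to zero gives the ML estimate $\hat{\lambda} = K / \left(\sum_{i=1}^{K} x_i + (n-K)T\right)$, failures divided by total time on test.

```python
import math

def censored_log_lik(observed, n, T, lam):
    """log L = K log(lambda) - lambda * (sum of observed x_i + (n - K) * T),
    with the (dx)^K factor dropped since it does not involve lambda."""
    K = len(observed)
    return K * math.log(lam) - lam * (sum(observed) + (n - K) * T)

observed = [0.6, 1.4, 0.9, 2.2]       # hypothetical lifetimes seen before time T
n, T = 10, 3.0                        # ten bulbs in all; experiment stopped at T = 3
K = len(observed)
lam_hat = K / (sum(observed) + (n - K) * T)   # failures per unit of total time on test
print(lam_hat, censored_log_lik(observed, n, T, lam_hat))
```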
6. Birth process
Start with $X_0 = 1$. Thereafter, let the conditional distribution of $X_t \mid X_{t-1} = x_{t-1}$ be Poisson with parameter $\lambda x_{t-1}$. The idea is that one female (for generation 0) produces a number of daughters which follows a Poisson distribution with parameter $\lambda$. Thereafter, these $X_1$ daughters (generation 1) will independently produce daughters according to a Poisson distribution with parameter $\lambda$. The total number of daughters for generation 2 follows a Poisson distribution with parameter $\lambda X_1$. Observe that if at any time the total number of daughters produced is zero, then the process dies out forever.
Let’s write the likelihood through K generations.
It should be clear that we can do this as
$$L(x_1, x_2, \ldots, x_K) = L(x_1)\, L(x_2 \mid x_1)\, L(x_3 \mid x_1, x_2)\, L(x_4 \mid x_3, x_2, x_1) \cdots L(x_K \mid x_{K-1}, \ldots, x_1)$$

$$= L(x_1)\, L(x_2 \mid x_1)\, L(x_3 \mid x_2)\, L(x_4 \mid x_3) \cdots L(x_K \mid x_{K-1})$$
This step is based on the observation that the process is Markovian. Each $X_t$ will depend on $(X_1, X_2, \ldots, X_{t-1})$ only through the most recent value $X_{t-1}$.
$$= \left(e^{-\lambda}\, \frac{\lambda^{x_1}}{x_1!}\right) \left(e^{-\lambda x_1}\, \frac{(\lambda x_1)^{x_2}}{x_2!}\right) \left(e^{-\lambda x_2}\, \frac{(\lambda x_2)^{x_3}}{x_3!}\right) \left(e^{-\lambda x_3}\, \frac{(\lambda x_3)^{x_4}}{x_4!}\right) \cdots \left(e^{-\lambda x_{K-1}}\, \frac{(\lambda x_{K-1})^{x_K}}{x_K!}\right)$$
Each factor is helped by the fact that Poisson random variables combine. Once $X_1$ is observed at value $x_1$, then $X_2$ is generated as the sum of $x_1$ independent Poisson random variables, each with parameter $\lambda$. This sum will have Poisson parameter $\lambda x_1$.
$$= e^{-\lambda \sum_{i=0}^{K-1} x_i}\; \frac{\lambda^{\sum_{i=1}^{K} x_i}\; \prod_{i=1}^{K-1} x_i^{x_{i+1}}}{\prod_{i=1}^{K} x_i!}$$

This uses the notational convenience $x_0 = 1$.
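As an illustration, here is a Python sketch (with a made-up realization of the process) that evaluates the log of this likelihood factor by factor. Differentiating the exponent gives the ML estimate $\hat{\lambda} = \sum_{i=1}^{K} x_i \big/ \sum_{i=0}^{K-1} x_i$, total daughters divided by total mothers.

```python
import math

def birth_log_lik(x, lam):
    """Log-likelihood for generation sizes x = [x0, x1, ..., xK], with x[0] = 1.
    Each X_t given X_{t-1} = x_{t-1} is Poisson with parameter lam * x_{t-1}."""
    ll = 0.0
    for t in range(1, len(x)):
        mean = lam * x[t - 1]             # Poisson parameter for generation t
        ll -= mean                        # the e^{-lam x_{t-1}} factor
        if x[t] > 0:
            ll += x[t] * math.log(mean)   # the (lam x_{t-1})^{x_t} factor
        ll -= math.lgamma(x[t] + 1)       # the x_t! factor
    return ll

x = [1, 3, 5, 9, 12]                  # hypothetical generation sizes, starting from x0 = 1
lam_hat = sum(x[1:]) / sum(x[:-1])    # ML estimate: total daughters / total mothers
print(lam_hat, birth_log_lik(x, lam_hat))
```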