Desirable Properties of Point Estimators

STATISTICS: MODULE 12122
CHAPTER 5 DESIRABLE PROPERTIES OF POINT ESTIMATORS
As we said in Chapter 4, in estimation we are concerned with obtaining the best
possible estimates of unknown population parameters such as µ, µ̃, σ, σ², p, or
functions of these parameters. To help us decide which estimator is best in any
situation, we have to know about the properties of estimators and choose those
which have ‘good’ properties such as unbiasedness, efficiency and consistency. We
therefore need to know what these terms mean, and we also need to know about the
sampling distributions of certain estimators.
5.1. Point estimators
A point estimator is a statistic obtained from the sample, which is used to estimate an
unknown parameter or function of the parameter. It is therefore a random variable as
it varies from sample to sample and so it has a sampling (probability) distribution.
Example 5.1
Suppose X₁, X₂, X₃, ..., Xₙ is a random sample from a population which has
mean µ and variance σ².
Estimating µ
If we wish to estimate the population mean µ, we could use the sample mean X̄, where

X̄ = (X₁ + X₂ + ... + Xₙ)/n,

and we know from Chapter 4 that the sampling distribution of X̄ is exactly Normal
if the population is Normal (see 4.7), or approximately Normal if the population is
non-Normal, the approximation to normality improving as n, the sample size,
increases (see 4.12). We will see in this chapter that X̄ is an unbiased, efficient
and consistent estimator of µ, and so it is a good point estimator of µ.
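As a numerical sketch of this (the population N(10, 9), i.e. µ = 10 and σ = 3, is an assumption chosen purely for illustration), we can simulate many samples and watch the spread of X̄ shrink like σ²/n as n grows:

```python
import random

# Sketch: the sampling distribution of X-bar concentrates around mu as n
# increases, with Var(X-bar) = sigma^2/n. Population parameters assumed.
random.seed(2)
mu, sigma, reps = 10.0, 3.0, 5000

spreads = {}
for n in (5, 50, 500):
    means = [sum(random.gauss(mu, sigma) for _ in range(n)) / n
             for _ in range(reps)]
    centre = sum(means) / reps
    spread = sum((x - centre) ** 2 for x in means) / reps
    spreads[n] = spread
    print(n, round(spread, 3))  # close to sigma^2/n = 9/n
```

The printed spreads track 9/n (about 1.8, 0.18 and 0.018), which is the Var(X̄) = σ²/n result quoted from Chapter 4.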
Estimating σ²
If we wish to estimate the population variance σ², we can use

S² = (1/(n − 1)) Σᵢ₌₁ⁿ (Xᵢ − X̄)²   or   S*² = (1/n) Σᵢ₌₁ⁿ (Xᵢ − X̄)².

We will see in this chapter that S² is an unbiased, consistent point estimator of σ²
which is a more efficient estimator than S*². Also, S*² is a biased estimator of σ², so
the ‘best’ estimator of σ² is S². We will consider the sampling distribution of S² later
on. It involves a distribution called the χ² (chi-square) distribution.
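A quick Monte Carlo sketch of the unbiasedness claim for S² (the population N(5, 4), i.e. σ = 2, and the sample size n = 10 are assumptions chosen only for this illustration): the average of S² over many samples lands close to σ².

```python
import random
import statistics

# Sketch: average S^2 (divisor n-1) over many samples and compare with
# sigma^2, consistent with E(S^2) = sigma^2. Parameters are illustrative.
random.seed(0)
mu, sigma, n, reps = 5.0, 2.0, 10, 20000

s2_values = []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    s2_values.append(statistics.variance(sample))  # divisor n-1

print(round(sum(s2_values) / reps, 2))  # close to sigma^2 = 4
```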
5.2 Desirable Properties of Point Estimators
Certain point estimators are better than others because they have good properties. We
will consider those next.
1. Unbiasedness
Definition Suppose θ̂ is a point estimator of the parameter θ; then θ̂ is an
unbiased estimator of θ if E(θ̂) = θ. The bias of an estimator is given by b(θ̂),
where

b(θ̂) = E(θ̂) − θ.

Asymptotic Bias The asymptotic bias is lim_{n→∞} b(θ̂).
Example 5.2
Suppose X₁, X₂, X₃, ..., Xₙ is a random sample from a Normal population which
has mean µ and variance σ². Show that if

S² = (1/(n − 1)) Σᵢ₌₁ⁿ (Xᵢ − X̄)²,

then E(S²) = σ² and Var(S²) = 2σ⁴/(n − 1). Is S² an unbiased estimator of σ²?
Solution We need to know about the sampling distribution of S².
Sampling distribution of the sample variance S 2
Suppose we take many random samples of size n from this Normal population, and for
each sample we compute the sample variance S². The sample variance S² will vary from
sample to sample. What can we say about its sampling distribution?
In theory,

(n − 1)S²/σ² ~ χ²(n − 1),

i.e. the sampling distribution of S² is some form of chi-square distribution with
parameter (or degrees of freedom) equal to (n − 1). Since a χ²(n − 1) random variable
has mean (n − 1) and variance 2(n − 1), it follows that

E(S²) = (σ²/(n − 1))(n − 1) = σ²  and  Var(S²) = (σ²/(n − 1))² · 2(n − 1) = 2σ⁴/(n − 1),

so S² is indeed an unbiased estimator of σ².
Example 5.3
In a Binomial experiment (e.g. suppose you are looking at whether there is
discrimination against women in the police force with regard to promotion, or
ascertaining how many consumers prefer Goldtaste to other brands of coffee in a
consumer survey), suppose Y is the number of successes in a random sample of size n
(e.g. Y = the number of women police officers who get promoted, or the number of
consumers who prefer Goldtaste). Assume Y ~ Bin(n, p), where p is the probability of
success (e.g. the probability of promotion or the probability of a consumer preferring
Goldtaste). The following two statistics are proposed as estimators of p:

p̂₁ = Y/n   and   p̂₂ = (Y + 1)/(n + 2).
(a) Show that p̂₁ is unbiased but p̂₂ is not. What can you say about the asymptotic
bias of the two estimators?
(b) Suppose a random sample of 240 policewomen is taken and the number of them
who have been promoted is 36. Suppose also that a random sample of 960 policemen
is taken and the number of them who have been promoted is 288. Obtain point
estimates of the probabilities that policemen and policewomen get promoted using
both point estimators defined in (a). Which estimator is likely to give the best
estimates?
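A simulation sketch of part (a)’s claim (n = 20 and p = 0.3 are assumed values, and the binomial sampler is hand-rolled to stay within the standard library): the long-run average of p̂₁ sits at p, while that of p̂₂ sits at (np + 1)/(n + 2), i.e. off by the bias.

```python
import random

# Sketch: compare long-run averages of p1_hat = Y/n (unbiased) and
# p2_hat = (Y+1)/(n+2) (biased) for Y ~ Bin(n, p). n, p are assumed values.
random.seed(1)
n, p, reps = 20, 0.3, 100000

def draw_binomial(n, p):
    # Sum of n Bernoulli(p) trials.
    return sum(1 for _ in range(n) if random.random() < p)

sum1 = sum2 = 0.0
for _ in range(reps):
    y = draw_binomial(n, p)
    sum1 += y / n
    sum2 += (y + 1) / (n + 2)

print(round(sum1 / reps, 3))  # close to p = 0.3
print(round(sum2 / reps, 3))  # close to (n*p + 1)/(n + 2) = 7/22 ≈ 0.318
```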
2. Efficiency
The mean squared error of a point estimator θ̂ is defined as

M.S.E.(θ̂) = E(θ̂ − θ)².

Estimator θ̂₁ is said to be more efficient than estimator θ̂₂ if

M.S.E.(θ̂₁) < M.S.E.(θ̂₂).

We can obtain the mean squared error in terms of the variance of θ̂ and
the bias of θ̂ as follows:

M.S.E.(θ̂) = E(θ̂ − θ)² = E(θ̂² − 2θθ̂ + θ²) = E(θ̂²) − 2θE(θ̂) + θ²
           = [E(θ̂²) − (E(θ̂))²] + [(E(θ̂))² − 2θE(θ̂) + θ²]
           = Var(θ̂) + (E(θ̂) − θ)²
           = Var(θ̂) + [b(θ̂)]².

If an estimator is unbiased then M.S.E.(θ̂) = Var(θ̂), so an unbiased estimator θ̂₁ is
said to be more efficient than another unbiased estimator θ̂₂ if

Var(θ̂₁) < Var(θ̂₂).
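The decomposition M.S.E.(θ̂) = Var(θ̂) + [b(θ̂)]² can be verified exactly for a finite case, for instance θ̂ = p̂₂ = (Y + 1)/(n + 2) with Y ~ Bin(n, p) (n = 12 and p = 0.4 are arbitrary illustrative values), by summing over the binomial probability mass function:

```python
from math import comb

# Exact check of MSE = Var + bias^2 for p2_hat = (Y+1)/(n+2), Y ~ Bin(n, p).
# n and p are arbitrary illustrative values.
n, p = 12, 0.4

pmf = [comb(n, y) * p**y * (1 - p) ** (n - y) for y in range(n + 1)]
est = [(y + 1) / (n + 2) for y in range(n + 1)]

mean = sum(w * e for w, e in zip(pmf, est))
bias = mean - p
var = sum(w * (e - mean) ** 2 for w, e in zip(pmf, est))
mse = sum(w * (e - p) ** 2 for w, e in zip(pmf, est))

print(abs(mse - (var + bias ** 2)) < 1e-12)  # True: the identity holds
```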
Example 5.4 Let X₁, X₂, ..., Xₙ denote a random sample from a population with
probability density function

f(x) = λx^(λ−1),  0 < x < 1, λ > 0;
f(x) = 0 elsewhere.

It is proposed that X̄ be used as an estimator of λ/(λ + 1). Obtain an expression for
M.S.E.(X̄).
3. Consistency
Definition
Let θ̂ be an estimator of parameter θ based on a random sample of size n;
then θ̂ is a consistent estimator of θ if

lim_{n→∞} P(|θ̂ − θ| ≤ ε) = 1 for all ε > 0,

or equivalently plim θ̂ = θ (where ε is very small, e.g. 10⁻²⁰),
i.e. the (sampling) probability distribution of θ̂ gets more and more concentrated
around θ as the sample size n increases towards infinity.
We say that θ̂ converges in probability to θ as n → ∞.
As you can see, this definition is not an easy one to check, but fortunately there is
a sufficient condition for consistency (a sufficient condition is one which guarantees
the truth of something; see Chapter 1, QM1), and this can be used to prove consistency
in many cases.
Sufficient condition for consistency
A sufficient but not necessary condition for estimator θ̂ to be a consistent estimator
of θ is that

M.S.E.(θ̂) → 0 as n → ∞.

N.B. It is important to realise that if M.S.E.(θ̂) does not tend to 0 as n → ∞, it does
not follow that θ̂ is inconsistent. There are some such estimators which you will meet in
Econometrics: their M.S.E.s do not tend to 0 as n → ∞, but they are consistent.
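As a sketch of the sufficient condition in action, take S*² (divisor n) under a Normal population: its bias is −σ²/n, and it can be shown that Var(S*²) = 2(n − 1)σ⁴/n², so M.S.E.(S*²) = (2n − 1)σ⁴/n², which tends to 0 as n → ∞. Hence S*² is consistent despite being biased. Evaluating this (σ² = 1 is an assumed value):

```python
# MSE of S*^2 (divisor n) under a Normal population with sigma^2 = 1:
# MSE = Var + bias^2 = 2(n-1)sigma^4/n^2 + (sigma^2/n)^2 = (2n-1)sigma^4/n^2.
sigma2 = 1.0
mses = []
for n in (5, 50, 500, 5000):
    mse = (2 * n - 1) * sigma2 ** 2 / n ** 2
    mses.append(mse)
    print(n, mse)  # MSE shrinks toward 0 as n grows
```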
Example 5.5
Show that the estimator X̄ is a consistent estimator of λ/(λ + 1) in Example 5.4.
Example 5.6
(a) Suppose we have a random sample from a Normal population N(µ, σ²).
For the sample mean X̄: E(X̄) = µ and Var(X̄) = σ²/n.
For the sample median X̃: E(X̃) = µ and Var(X̃) ≅ πσ²/(2n) for large n.
X̄ and X̃ are both unbiased estimators of µ, but Var(X̄) < Var(X̃). So
X̄ is a more efficient estimator of µ than X̃.
Hence X̄ is the ‘best’ estimator to use when estimating µ.
(b) From Example 5.2, S² is an unbiased estimator of σ², and hence

M.S.E.(S²) = Var(S²) = 2σ⁴/(n − 1).

So as n → ∞, M.S.E.(S²) → 0, so S² is a consistent estimator of σ².
Considering S*² = (1/n) Σᵢ₌₁ⁿ (Xᵢ − X̄)², we have S*² = (n − 1)S²/n, so

E(S*²) = (n − 1)E(S²)/n = (n − 1)σ²/n ≠ σ²,

hence S*² is a biased estimator of σ².
Also, although S*² is a consistent estimator of σ², it can be shown that
M.S.E.(S*²) > M.S.E.(S²), so S*² is not as efficient an estimator of σ² as S².
Hence S² is the ‘best’ estimator to use when estimating σ².
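A Monte Carlo sketch of the bias of S*² (the population N(0, 4), i.e. σ = 2, and n = 8 are assumptions for illustration): its average over many samples is close to (n − 1)σ²/n, not σ².

```python
import random

# Sketch: E(S*^2) = (n-1)*sigma^2/n, so S*^2 (divisor n) underestimates
# sigma^2 on average. Population and n are illustrative assumptions.
random.seed(3)
sigma, n, reps = 2.0, 8, 40000

total = 0.0
for _ in range(reps):
    xs = [random.gauss(0.0, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    total += sum((x - xbar) ** 2 for x in xs) / n  # divisor n, not n-1

print(round(total / reps, 2))  # close to (n-1)*sigma^2/n = 3.5, not 4
```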
C.Osborne March 2000
Example 5.3
In a Binomial experiment, suppose Y is the number of successes in a random sample of size n.
Assume Y ~ Bin(n, p), where p is the probability of success. The following two statistics are
proposed as estimators of p:

p̂₁ = Y/n   and   p̂₂ = (Y + 1)/(n + 2).
(a) Show that p̂₁ is unbiased but p̂₂ is not. What can you say about the asymptotic bias of
the two estimators?
(b) Suppose a random sample of 240 policewomen is taken and the number of them who have
been promoted is 36. Suppose also that a random sample of 960 policemen is taken and the
number of them who have been promoted is 288. Obtain point estimates of the probabilities
that policemen and policewomen get promoted using both point estimators defined in (a).
Which estimator is likely to give the best estimates?
(c) Show that the two estimators are consistent.
Solution Note that I have added part (c) to the original question.
(a) Here the parameter being estimated is p and the estimators are p̂₁ and p̂₂,
i.e. θ = p and θ̂ = p̂₁ or p̂₂, using the notation of Chapter 5.
We need to look at E(p̂₁) and E(p̂₂).
As Y ~ Bin(n, p), E(Y) = mean of the Binomial = np.

E(p̂₁) = E(Y/n) = (1/n)E(Y) = (1/n)np = p, so p̂₁ is an unbiased estimator of p.

E(p̂₂) = E(Y + 1)/(n + 2) = (1/(n + 2))(E(Y) + 1) = (np + 1)/(n + 2) ≠ p.

Hence p̂₂ is a biased estimator of p.
The bias of p̂₂ is

b(p̂₂) = E(p̂₂) − p = (np + 1)/(n + 2) − p = ((np + 1) − p(n + 2))/(n + 2) = (1 − 2p)/(n + 2).
Asymptotic bias of p̂₁ = lim_{n→∞} b(p̂₁) = 0, since b(p̂₁) = 0 (p̂₁ is an unbiased
estimator of p).
Asymptotic bias of p̂₂ = lim_{n→∞} b(p̂₂) = 0, since (1 − 2p)/(n + 2) → 0 as n → ∞.
(b) Let p_W = P(policewoman is promoted) and let p_M = P(policeman is promoted).

Using p̂₁ = Y/n: p̂_W = 36/240 = 0.1500 and p̂_M = 288/960 = 0.3000.

Using p̂₂ = (Y + 1)/(n + 2): p̂_W = 37/242 = 0.1529 and p̂_M = 289/962 = 0.3004.

Which is the best estimator of p, p̂₁ or p̂₂?
This is not an easy question to answer. On the basis of unbiasedness, you would theoretically
choose p̂₁, since it is unbiased whereas p̂₂ is biased. However, as the sample size n → ∞,
b(p̂₂) → 0, and here in this particular example on police promotion you can see that for large
n the estimates obtained using p̂₁ and p̂₂ agree to at least 2 decimal places. So, practically
speaking, there is not much to choose between the two estimators p̂₁ and p̂₂ when considering
unbiasedness.
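The part (b) arithmetic is easy to reproduce in a few lines (the helper function names here are ours, not from the notes):

```python
# Point estimates of promotion probabilities using both estimators.
def p1_hat(y, n):
    return y / n          # p1_hat = Y/n

def p2_hat(y, n):
    return (y + 1) / (n + 2)  # p2_hat = (Y+1)/(n+2)

print(round(p1_hat(36, 240), 4), round(p1_hat(288, 960), 4))  # 0.15 0.3
print(round(p2_hat(36, 240), 4), round(p2_hat(288, 960), 4))  # 0.1529 0.3004
```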
As Y ~ Bin(n, p), we know that Var(Y) = np(1 − p).

Now Var(p̂₁) = Var(Y/n) = (1/n²)Var(Y) = np(1 − p)/n²,

and Var(p̂₂) = Var((Y + 1)/(n + 2)) = (1/(n + 2)²)Var(Y) = np(1 − p)/(n + 2)²,
since Var(Y + 1) = Var(Y).

Comparing these two variances, Var(p̂₂) < Var(p̂₁), so p̂₂ is a less variable estimator
than p̂₁.
(c) In order to show consistency we need to calculate M.S.E.(p̂₁) and M.S.E.(p̂₂)
and, hopefully, show that M.S.E.(p̂₁) → 0 as n → ∞ and M.S.E.(p̂₂) → 0 as n → ∞.

M.S.E.(p̂₁) = Var(p̂₁) + [b(p̂₁)]² = Var(p̂₁), since b(p̂₁) = 0,
           = np(1 − p)/n² = p(1 − p)/n.

As n → ∞, M.S.E.(p̂₁) → 0.

M.S.E.(p̂₂) = Var(p̂₂) + [b(p̂₂)]²
           = np(1 − p)/(n + 2)² + [(1 − 2p)/(n + 2)]², from parts (a) and (b),
           = np(1 − p)/(n² + 4n + 4) + [(1 − 2p)/(n + 2)]²
           = p(1 − p)/(n + 4 + 4/n) + [(1 − 2p)/(n + 2)]²,

and as n → ∞, M.S.E.(p̂₂) → 0.

Hence both p̂₁ and p̂₂ are consistent estimators of p.
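Plugging the part (c) formula for M.S.E.(p̂₂) into code (p = 0.3 is an assumed value) confirms it shrinks to 0 as n grows:

```python
# Exact MSE of p2_hat = (Y+1)/(n+2) from part (c):
# MSE = n*p*(1-p)/(n+2)^2 + ((1-2p)/(n+2))^2, evaluated for increasing n.
p = 0.3
mses = []
for n in (10, 100, 1000, 10000):
    mse = n * p * (1 - p) / (n + 2) ** 2 + ((1 - 2 * p) / (n + 2)) ** 2
    mses.append(mse)
    print(n, round(mse, 6))  # decreases toward 0
```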