Slides Statistics II. Unit 1
Statistics II
Xavier Vilà
Universitat Autònoma de Barcelona
Year 2020-2021
1. Introduction to Inferential Statistics and Estimation

Statistical Inference is a collection of techniques by means of which we can draw conclusions regarding a reality from the study of a sample of that reality.
It is important to understand that

• statistics is based on probabilistic techniques,
• any statistical conclusion drawn from a sample will not be true for sure when applied to the whole reality, but only with a certain probability.
Example 1 When an electoral survey is conducted, it is clear that its results do not exactly coincide with the results in the final election. Nevertheless, if the survey is "well done", that is, if the sample (which in this case is the set of people interviewed) closely represents the whole reality (which in this case is the whole population that has the right to vote), then the survey result will be close to the final results with a high probability.
1.1 Inferential Statistics: Definition and Inference Methods

Statistical inference is mainly built upon four main concepts, which will be defined and described below.

Population  Is the set of elements that are the object of study. The goal will be to draw some conclusion regarding some specific feature of this population.

Example 2 All the apples in the world. The feature at study is whether an apple falls down or not.
Example 3 Labor force in the European Union. The feature at study is whether a worker is unemployed or not.
Example 4 Production of Intel chips in a given day. The feature at study is whether a chip is faulty or not.
Sample  Subset of the Population used to draw conclusions about the population.

Example 5 50 apples in Newton's garden.
Example 6 Unemployment statistics at the European Union.
Example 7 25 Intel chips manufactured in a given day.
Parameter  Is the feature of the population that we want to know something about. This feature has to be a numerical one and, obviously, its true value must be unknown.

Example 8 What is the proportion of falling apples.
Example 9 What is the unemployment rate at the European Union.
Example 10 What is the proportion of faulty chips among those produced in a given day.
Statistic  Computation made using the elements in the sample and used to get an approximation to the true value of the parameter. It is important to notice that this value will be known (since we will compute it) and will be used to draw conclusions on the true value of the parameter, which is unknown and is what is of interest to us.

Example 11 Proportion of falling apples among the 50 sampled apples in Newton's garden.
Example 12 Unemployment rate among the workers interviewed in the unemployment statistics in the European Union.
Example 13 Proportion of faulty chips among the 25 selected chips produced in a given day.
From these four main concepts, the process of statistical inference works as follows:

1. Using sampling techniques that will be explained below, a sample is selected from the population that is going to be studied.
2. From this sample, the proper computations are done in order to obtain a statistic.
3. From this statistic, using some statistical inference technique that we will see in other chapters, some conclusions are drawn regarding the unknown population parameter that represents the feature of the population that is to be studied.
This process can be represented as in Figure 1:

Population [Parameter (unknown)] → Sampling → Sample [Statistic (known)] → Statistical Inference → conclusions about the Parameter

Figure 1: The process of Statistical Inference
We can now provide a definition for Statistics (or Statistical Inference, to be more precise) which is more formal than the one offered in the introduction.

Definition 14 Statistical Inference is a subject whose main objective is to draw conclusions regarding a population through the study of one sample by means of probabilistic techniques.
1.2 Definition, characteristics and distribution of the main sample statistics: mean, variance and proportion

Once the sample is obtained (we will always assume that using a SRS), the process of working with it and drawing conclusions starts.

In this sense, the main task is now to obtain a statistic, one of the main elements in statistical inference. We will use it to produce conclusions regarding the unknown population parameter that is of interest to us.

The definition that follows will remind us what a statistic is. Then, the concept of estimate is defined. Although these two concepts are very similar and closely related, it is very important to notice that they are not the same thing.
Definition 15 A statistic (or estimator) is a formula that uses the values in the sample at hand (observations) in order to produce an approximation to the true value of an unknown population parameter.

Definition 16 An estimate (or estimation) is the particular value of an estimator that is obtained from a particular sample of data and normally used to indicate the value of an unknown population parameter.

Hence,

• a statistic is not a number but a formula,
• an estimate is the number that is obtained when the formula (the estimator) is applied to the observations of the specific sample that we have at hand.
Important  Given that the sample is obtained by means of a random technique, the statistic is a random variable.

The statistic will produce different estimates with different probabilities (depending on the specific sample that is finally "selected" at random). In this sense, an estimate is a specific realization of this random variable.

The following example aims to clarify this idea.
Example 17 We want to know the average number of cars per family in a given population. To keep it simple, we will assume that the population is very small, only 4 families,

Population = {A, B, C, D}

Let us now assume that

• family A owns 1 car,
• families B and C have 2 cars each, and
• family D has 4 cars.
For the study, we

• want to obtain a random sample of size 2,
• compute the average number of cars in the sample, and
• use it to infer some conclusion regarding the true average in the population.

The sample mean (or just mean, for short) will play the role of statistic in this example. We will use it to draw conclusions on the true population parameter that is of interest to us: the average number of cars per family in the whole population, that is, the population mean.
The following table summarizes:

1. the 6 possible samples that can be the result of a sampling process on this population,
2. the probability of being selected (all of them will have the same probability as we are assuming SRS),
3. the estimate value that would result from applying the sample average formula to the corresponding sample.

              Sample 1   Sample 2   Sample 3   Sample 4   Sample 5   Sample 6
Elements      {A, B}     {A, C}     {A, D}     {B, C}     {B, D}     {C, D}
Probability   1/6        1/6        1/6        1/6        1/6        1/6
Estimate      1.5        1.5        2.5        2          3          3
In this example we can see how the statistic at use (sample mean) can take 4 different values, depending on which of the six possible samples is selected by the SRS.

It is easy to see that

• the value 1.5 corresponds to two possible samples (Sample 1 and Sample 2),
• each sample has the same probability of being selected (1/6),

thus, the probability that the statistic takes the value 1.5 is:

P(statistic = 1.5) = P(Sample 1) + P(Sample 2) = 1/6 + 1/6 = 1/3
We summarize the possible values the statistic can take and the probability associated to each of them:

statistic value = 1.5   with p = 1/3
                  2     with p = 1/6
                  2.5   with p = 1/6
                  3     with p = 1/3

In this example, we have seen how the statistic can take different values (4 in this case) with different probabilities. Hence, the statistic is a random variable.
It will be necessary to know their main properties and, especially, the probability distributions of the statistics that are more frequently used.

The main statistics (or estimators) that are studied are

• the sample mean,
• the sample variance, and
• the sample proportion.

In all cases, we will assume that a sample of size n has been obtained by means of a SRS. The elements of the sample will be denoted by {x1, x2, · · · , xn}.
Also, we will assume that the sample has been selected from a population that follows a given distribution. This distribution is very important as it will influence the sampling result and, hence, the possible values of the statistic, as we have seen in the previous example.

In that example we have seen that the population is distributed so that there is

• 1 element with 1 car,
• 2 elements with 2 cars, and
• 1 element with 4 cars.
Therefore, if we pick the sample element xi at random from this population, we will have that:

p(xi = a) = 1/4   if a = 1
            1/2   if a = 2
            1/4   if a = 4
            0     otherwise

This is, in this case, the distribution of the population.
Graphically: [bar chart of the population distribution omitted]
In general, we will assume that the Sample has been obtained by means of a SRS from a population distributed according to a Normal Distribution with some Population Mean µ and some Population Variance σ².

What does it mean? Easy, it means that for any two numbers a and b, for any element xi in our sample, we have that

p(a ≤ xi ≤ b) = p(a − µ ≤ xi − µ ≤ b − µ) = p((a − µ)/σ ≤ (xi − µ)/σ ≤ (b − µ)/σ) = p((a − µ)/σ ≤ Z ≤ (b − µ)/σ)

where Z represents the Standard Normal distribution, usually denoted by N(0, 1), whose associated probabilities are found in tables.
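The standardization step above can be reproduced without tables, since the N(0, 1) CDF has a closed form in terms of the error function. A small sketch (the `normal_cdf` and `prob_between` helpers are our own names, not from the text):

```python
from math import erf, sqrt

def normal_cdf(z):
    # Phi(z) for the standard Normal N(0, 1); plays the role of the tables.
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def prob_between(a, b, mu, sigma):
    # p(a <= x_i <= b) computed via the standardization shown above.
    return normal_cdf((b - mu) / sigma) - normal_cdf((a - mu) / sigma)

# For x_i ~ N(0, 1), about 68.3% of the mass lies within one sigma of the mean.
print(round(prob_between(-1, 1, mu=0, sigma=1), 3))  # → 0.683
```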
Graphically: [density of the N(0, 1) distribution omitted]
We turn next to the study of the distributions of the three main statistics. These, as we have discussed above, will depend on the distribution of the population from which we obtain the sample.

1.2.1 The Sample Mean
1.2.2 The Sample Variance
1.2.3 The Sample Proportion

For each case, we will be interested in knowing what is the distribution, the expectation and the variance of these statistics.
1.2.1 Sample Mean

Sample mean, denoted by X̄, is the statistic that is obtained from the sample using the formula:

X̄ = (Σ_{i=1}^{n} xi)/n

It is normally used to infer conclusions regarding the true value of the Population mean µ.

Distribution

Its distribution depends on the characteristics of both the population and the sample.
1. If the population is Normal, that is, Xi ∼ N(µ, σ²) ∀i, then we have that

   X̄ ∼ N(µ, σ²/n)

2. If the population is not Normal but the sample is big enough, then:

   (X̄ − µ)/√(σ²/n) ∼ N(0, 1)  (approx.)

3. If the population is not Normal and the sample is small, then the distribution of the sample mean X̄ is unknown in general.
4. If the population variance σ² is unknown and the population is Normal, then

   (X̄ − µ)/√(S²/n) ∼ t_{n−1}

where S² is the sample variance (that we will see next) and t_{n−1} is the t-Student distribution with n − 1 degrees of freedom, which is very similar to the N(0, 1) distribution and whose values can also be found in tables.

We turn next to the study of the expectation and variance of this statistic. To do so, we will use the mathematical properties of the expectation and variance of a random variable.

We will assume that the sample has been obtained from a population with population mean µ and population variance σ². That is, E(xi) = µ and V(xi) = σ² for any element xi in the sample.
Expectation

E(X̄) = E(Σ_{i=1}^{n} xi/n) = Σ_{i=1}^{n} E(xi/n) = (1/n) Σ_{i=1}^{n} E(xi) = Σ_{i=1}^{n} µ/n = µ
Variance

V(X̄) = V(Σ_{i=1}^{n} xi/n) = Σ_{i=1}^{n} V(xi/n) = (1/n²) Σ_{i=1}^{n} V(xi) = Σ_{i=1}^{n} σ²/n² = σ²/n

(the second equality uses the fact that the observations in a SRS are independent)
Therefore, for the case of the sample mean X̄ we have that

E(X̄) = µ        V(X̄) = σ²/n
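These two results can be checked by simulation: draw many SRS samples of size n from a Normal population and look at the empirical mean and variance of X̄. A sketch (not from the slides; the population parameters and sample sizes are arbitrary choices):

```python
import random
import statistics

random.seed(1)
mu, sigma, n = 5.0, 2.0, 25

# Draw many samples of size n from N(mu, sigma^2) and record X-bar for each.
means = [statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
         for _ in range(20000)]

print(round(statistics.fmean(means), 2))      # should be close to mu = 5
print(round(statistics.variance(means), 2))   # should be close to sigma^2/n = 0.16
```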
1.2.2 Sample Variance

Sample variance, denoted by S², is the statistic that is obtained from the sample using the formula:

S² = (1/(n − 1)) Σ (xi − X̄)²

It is normally used to infer conclusions regarding the true value of the Population variance σ².

Distribution

Its distribution depends on the characteristics of the population.
1. If the population is Normal (Xi ∼ N(µ, σ²) ∀i), then:

   (n − 1)S²/σ² ∼ χ²_{n−1}

where χ²_{n−1} is the chi-square distribution with n − 1 degrees of freedom, whose values are also in tables.

2. If the population is not Normal, then the distribution of the sample variance is unknown in general, even for large samples.

Since we only know the distribution of the sample variance when the population is Normal, we will use the fact that in that case its distribution is χ²_{n−1} to find the expectation and variance easily. In this sense, we know that for any χ² variable we have that
• E(χ²_{n−1}) = n − 1
• V(χ²_{n−1}) = 2(n − 1).

Hence, we will assume that the sample has been obtained from a Normal population with population mean µ and population variance σ². That is, xi ∼ N(µ, σ²) for any element xi in the sample. Hence:

(n − 1)S²/σ² ∼ χ²_{n−1}

Expectation

E((n − 1)S²/σ²) = n − 1  ⇒  ((n − 1)/σ²) E(S²) = n − 1  ⇒  E(S²) = σ²
34
Statistics II
Variance
4
(n − 1)S 2
(n − 1)2
2σ
2
2
V(
)
=
2(n
−
1)
⇒
V
(S
)
=
2(n
−
1)
⇒
V
(S
)=
σ2
(σ 2)2
n−1
Therefore, for the case of the sample variance
S2
we have that
E(S 2) = σ 2
2
V (S ) =
Year 2019 - 2020
2σ 4
n−1
35
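As with the sample mean, both formulas can be verified by simulation on a Normal population. A sketch (not from the slides; parameters are arbitrary, and `statistics.variance` uses the n − 1 denominator, so it is exactly S²):

```python
import random
import statistics

random.seed(2)
mu, sigma, n = 0.0, 3.0, 10   # population N(0, 9), so sigma^2 = 9

# Compute S^2 for many samples of size n from the Normal population.
s2_values = [statistics.variance([random.gauss(mu, sigma) for _ in range(n)])
             for _ in range(20000)]

print(round(statistics.fmean(s2_values), 1))     # close to sigma^2 = 9
# Theory: V(S^2) = 2*sigma^4 / (n-1) = 2*81/9 = 18
print(round(statistics.variance(s2_values), 1))  # close to 18
```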
1.2.3 Sample Proportion

Sample proportion is used when we are interested in knowing the true proportion of elements in a population that have a given characteristic.

For instance, it might be of interest to know what is the proportion of smokers among the second year students in this school (in this case, the characteristic that is of interest is "whether a student smokes or not").
Sample proportion, denoted by π̂, is the statistic that is obtained using the formula:

π̂ = (Σ xi)/n

where xi = 1 if the i-th element in the sample has the characteristic that we are studying and xi = 0 if it does not.

Sample proportion π̂ is normally used to infer conclusions regarding the true population proportion π.

Distribution

In this case, the population is never Normal since each observation xi comes from a Bernoulli random variable.
Let us assume that we are looking at a population of 100 individuals out of which 45 are smokers. That is, the true population proportion is 45%, or π = 0.45. From this population we want to obtain a sample of size 10. It is clear that for any element xi of the sample we will have that:

p(xi = 1) = 45/100 = 0.45
p(xi = 0) = 55/100 = 0.55

Hence, we see that each element xi in the sample follows a Bernoulli distribution with parameter π (where π is the true and unknown population proportion).
It can be shown then that

π̂ = (Σ xi)/n

is a Binomial random variable. Also, given that when samples are large a Binomial distribution can be approximated by a Normal distribution, we can conclude that, in general:

1. If the sample is large enough (nπ(1 − π) > 5), then (approx.):

   π̂ ∼ N(π, π(1 − π)/n)

This approximation is better the closer π is to 0.5 and the larger is the sample.

2. If the sample is not large, then the approximation is very bad.
Expectation

E(π̂) = π

Variance

V(π̂) = π(1 − π)/n

Therefore, for the case of the sample proportion π̂ we have that

E(π̂) = π        V(π̂) = π(1 − π)/n
1.3 Point and interval estimation

• Statistical estimation is the simplest inference technique. It allows for a quick approximation to the true value of the parameter of interest.
• Its objective is to produce a first approximated measure of the parameter we want to study. This measure will be improved later on by means of more elaborate statistical inference techniques.
• We will learn how to use the statistics learned previously to produce conclusions (very preliminary at this point) regarding the true population parameters.
• Point estimation and confidence intervals will be the techniques that we will use.

Later in this section we will investigate the main properties of these estimators, as well as other more advanced topics like Maximum likelihood estimation and the method of moments, which will allow us to design good estimators for the case we do not know which one to use.
1.3.1 Point estimation

• A point estimation is the simplest method to produce estimations for a population parameter, that is, an approximation to its true value.
• To obtain a point estimation or estimate we just need to apply our estimator to the specific sample at hand.

Example 18 Imagine that we want to obtain an approximation to the true value of the population mean µ of a given population. We know that the sample mean X̄ is a good estimator of µ. Hence, this will be the estimator we use. Imagine that the sample we have is

Sample = {1, 2, 3, 4}
Then

X̄ = (1 + 2 + 3 + 4)/4 = 2.5

Hence, in this case the point estimation (or estimate) we get for µ is 2.5.

• A point estimation has the advantage of being an easy and quick method of estimation.
• On the other hand, it does not provide much information about the parameter, and is not very accurate either.

In the example above:

• The value of X̄ that we have found suggests that the true value of the population mean µ will be around 2.5.
• We do not know, though, if it will be larger or smaller.
• We do not know if it will be near 2.5 or not.
• We do not know anything about the accuracy of our estimation.

Such lack of precision can be somehow fixed with the next method of estimation.
1.3.2 Interval estimation

We will use now the knowledge we have about the probability distribution of the sample statistics to supplement the point estimation with additional information. In this way, we will produce an interval that will contain, with some probability, the true value of the unknown population parameter.

That is, we will be able now to "measure" the accuracy of our estimation. In this sense, the outcome of an interval estimation will be something similar to (for the case of the mean):

µ ∈ [2.25, 2.75] with probability 95%

• The intervals obtained using this method are called confidence intervals.
• The probability that the interval contains the population parameter is the confidence level, usually denoted by 1 − α.
We will study how to construct confidence intervals for:

1.3.2.1 The population mean µ
1.3.2.2 The population variance σ²
1.3.2.3 The population proportion π
1.3.2.1 Confidence Interval for the mean

We will see next how to build the confidence interval for the case when we need to produce an estimation for the population mean µ.

Case I: Normal Population (or large sample) and σ² known

We know that in this case,

(X̄ − µ)/√(σ²/n) ∼ N(0, 1)

hence

p(−z_{1−α/2} ≤ (X̄ − µ)/√(σ²/n) ≤ z_{1−α/2}) = 1 − α
where z_{1−α/2} is the value that corresponds to a N(0, 1) whose left tail contains an area of 1 − α/2. That is,

P(Z ≤ z_{1−α/2}) = 1 − α/2

where Z represents a N(0, 1), and this value can be found in tables.
Graphically: [density of the N(0, 1) with the central area 1 − α between −z_{1−α/2} and z_{1−α/2} and an area of α/2 in each tail, omitted]
Doing some algebra inside the inequalities we get,

p(−X̄ − z_{1−α/2}√(σ²/n) ≤ −µ ≤ −X̄ + z_{1−α/2}√(σ²/n)) = 1 − α

multiplying by −1 we reverse the "direction" of the inequalities, and hence

p(X̄ + z_{1−α/2}√(σ²/n) ≥ µ ≥ X̄ − z_{1−α/2}√(σ²/n)) = 1 − α

at the end we get the interval we were looking for,

µ ∈ [X̄ − z_{1−α/2}√(σ²/n), X̄ + z_{1−α/2}√(σ²/n)] with probability 1 − α
Example 19 Let {x1, x2, · · · , x100} be a random sample of size 100 drawn from a Normal population with unknown mean and variance σ² = 1,000,000. Construct a confidence interval with a confidence level of 95% for the population mean µ if we know that the sample mean is X̄ = 26,000.

If the confidence level is 95% we have that 1 − α = 0.95. Hence, α = 0.05 and α/2 = 0.025. Therefore,

1 − α/2 = 0.975

The interval will be of the form

[X̄ − z_{1−α/2}√(σ²/n), X̄ + z_{1−α/2}√(σ²/n)]

where all the values are known except for the value z that corresponds to a Normal distribution. In this case we have to look up the tables for the value

z_{1−α/2} = z_{0.975}

That is, the value of a N(0, 1) that has to its left a probability of 0.975. In the tables we find

z_{0.975} = 1.96

Thus,

[X̄ − z_{1−α/2}√(σ²/n), X̄ + z_{1−α/2}√(σ²/n)] = [26,000 − 1.96√(1,000,000/100), 26,000 + 1.96√(1,000,000/100)]

Doing the computations, we finally get

µ ∈ [25,804, 26,196] with a probability of 95%
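The computation in Example 19 takes only a few lines of code. A sketch (the helper name is ours; the table value z₀.₉₇₅ = 1.96 is taken from the example):

```python
from math import sqrt

def mean_ci_known_var(xbar, sigma2, n, z):
    # CI for mu with known sigma^2: X-bar +/- z_{1-alpha/2} * sqrt(sigma^2/n)
    half = z * sqrt(sigma2 / n)
    return (xbar - half, xbar + half)

# Example 19: n = 100, X-bar = 26,000, sigma^2 = 1,000,000, z_0.975 = 1.96
low, high = mean_ci_known_var(26_000, 1_000_000, 100, 1.96)
print(round(low), round(high))  # → 25804 26196
```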
Case II: Normal Population (or large sample) and σ² unknown

In the previous case we need to know the true value of the population variance σ² in order to compute the interval. This is highly unusual. To overcome this problem we can replace σ² by its unbiased estimator S². The only difference is that now we cannot use the N(0, 1), but the t-Student with n − 1 degrees of freedom.

µ ∈ [X̄ − t_{1−α/2}√(S²/n), X̄ + t_{1−α/2}√(S²/n)] with probability 1 − α

where t_{1−α/2} is the value that corresponds to a t-Student whose left tail contains an area of 1 − α/2 and that can be found in tables as well (when n is large, then t_{1−α/2} is approximately equal to z_{1−α/2}).
Example 20 Let {x1, x2, · · · , x100} be a random sample of size 100 drawn from a Normal population with unknown mean and variance. Construct a confidence interval with a confidence level of 95% for the population mean µ if we know that the sample mean is X̄ = 26,000 and the sample variance is S² = 980,000.

If the confidence level is 95% we have that 1 − α = 0.95. Hence, α = 0.05 and α/2 = 0.025. Therefore,

1 − α/2 = 0.975

The interval will be of the form

[X̄ − t_{1−α/2}√(S²/n), X̄ + t_{1−α/2}√(S²/n)]

where all the values are known except for the value t that corresponds to a t-Student with n − 1 = 99 degrees of freedom. In this case we have to look up the tables for the value

t_{1−α/2} = t_{0.975}

That is, the value of a t-Student with 99 degrees of freedom that has to its left a probability of 0.975. In the tables we find (since 99 degrees of freedom does not appear in the tables we take the nearest value, 100 degrees of freedom)

t_{0.975}(99) = 1.984

Thus

[X̄ − t_{1−α/2}√(S²/n), X̄ + t_{1−α/2}√(S²/n)] = [26,000 − 1.984√(980,000/100), 26,000 + 1.984√(980,000/100)]

Doing the computations, we finally get

µ ∈ [25,803.56, 26,196.42] with a probability of 95%
1.3.2.2 Confidence Interval for the variance

• In a similar manner, we can also construct a confidence interval for the case of the population variance.
• We must remember, though, that in this case the population must follow a Normal distribution.

We know then that

(n − 1)S²/σ² ∼ χ²_{n−1}

and hence

p(χ²_{α/2} ≤ (n − 1)S²/σ² ≤ χ²_{1−α/2}) = 1 − α

where χ²_{α/2} is the value of a χ²_{n−1} whose left tail contains an area of α/2 and that can be found in tables. Similarly, χ²_{1−α/2} is the value of a χ²_{n−1} whose left tail contains an area of 1 − α/2.
Graphically: [density of the χ²_{n−1} with an area of α/2 to the left of χ²_{α/2} and to the right of χ²_{1−α/2}, omitted]
As before, we can work the inequalities out to obtain

p(1/χ²_{α/2} ≥ σ²/((n − 1)S²) ≥ 1/χ²_{1−α/2}) = 1 − α

p((n − 1)S²/χ²_{α/2} ≥ σ² ≥ (n − 1)S²/χ²_{1−α/2}) = 1 − α

that is,

σ² ∈ [(n − 1)S²/χ²_{1−α/2}, (n − 1)S²/χ²_{α/2}] with probability 1 − α

Example 21 Let {x1, x2, · · · , x100} be a random sample of size 100 drawn from a Normal population with unknown mean and variance. Construct a confidence interval with a confidence level of 95% for the population variance σ² if we know that the sample variance is S² = 4,800.

If the confidence level is 95% we have that 1 − α = 0.95. Hence, α = 0.05 and α/2 = 0.025. Therefore,

1 − α/2 = 0.975

The interval will be of the form

[(n − 1)S²/χ²_{1−α/2}, (n − 1)S²/χ²_{α/2}]

where all the values are known except for the values χ² that correspond to a chi-square with n − 1 = 99 degrees of freedom. In this case we have to look up the tables for the values

χ²_{1−α/2} = χ²_{0.975} and χ²_{α/2} = χ²_{0.025}

That is, the values of a chi-square with 99 degrees of freedom that have to their left a probability of 0.975 and 0.025 respectively. In the tables we find

χ²_{0.975} = 129.561 and χ²_{0.025} = 74.222

Thus,

[(n − 1)S²/χ²_{1−α/2}, (n − 1)S²/χ²_{α/2}] = [99 · 4,800/129.561, 99 · 4,800/74.222]

Doing the computations, we finally get

σ² ∈ [3,667.77, 6,402.41] with a probability of 95%
1.3.2.3 Confidence Interval for the proportion

The case of the proportion is special for, as said before, the approximation to the Normal requires a large sample (nπ(1 − π) > 5). Then we will have

π̂ ∼ N(π, π(1 − π)/n)

and, similarly as in the case of the confidence interval for the mean, we get:

π ∈ [π̂ − z_{1−α/2}√(π̂(1 − π̂)/n), π̂ + z_{1−α/2}√(π̂(1 − π̂)/n)] with probability 1 − α

Example 22 In a random sample of 1000 people, 450 declare that they smoke on a regular basis. Construct a confidence interval with a confidence level of 95% for the proportion of smokers, π, in the population from which the sample has been obtained.
If the confidence level is 95% we have that 1 − α = 0.95. Hence, α = 0.05 and α/2 = 0.025. Therefore,

1 − α/2 = 0.975

Let us first compute the sample proportion, that is, the proportion of smokers in the sample. In this case

π̂ = 450/1000 = 0.45

The interval will be of the form

[π̂ − z_{1−α/2}√(π̂(1 − π̂)/n), π̂ + z_{1−α/2}√(π̂(1 − π̂)/n)]

where all the values are known except for the value z that corresponds to a Normal distribution. In this case we have to look up the tables for the value

z_{1−α/2} = z_{0.975}

That is, the value of a N(0, 1) that has to its left a probability of 0.975. In the tables we find

z_{0.975} = 1.96

Thus,

[π̂ − z_{1−α/2}√(π̂(1 − π̂)/n), π̂ + z_{1−α/2}√(π̂(1 − π̂)/n)] = [0.45 − 1.96√(0.45(1 − 0.45)/1000), 0.45 + 1.96√(0.45(1 − 0.45)/1000)]

Doing the computations, we finally get

π ∈ [0.4191, 0.4808] with a probability of 95%
1.4 Properties of estimators: bias, efficiency and consistency

• Once the main statistics and their probabilistic features (i.e. probability distribution, expectation and variance) are known, we focus in this chapter on the "good" properties that we would like the estimators to have in order for them to provide good approximations to the parameters of interest.
• In this sense, an estimator might, among others, satisfy the properties of being unbiased, efficient, and consistent that we will see next.
1.4.1 Bias

Definition 23 Let θ̂ be an estimator of the population parameter θ. The bias of θ̂ is defined as the difference between the expected value of the estimator and the true value of the population parameter

B(θ̂) = E(θ̂) − θ

Definition 24 An estimator θ̂ is said to be an unbiased estimator of the population parameter θ if its bias is zero

B(θ̂) = 0, or E(θ̂) = θ
Example 25 Let {x1, x2, . . . , xn} be a random sample drawn from a population with population mean µ. Then, for the sample mean X̄ we have:

E(X̄) = µ

Thus, X̄ is an unbiased estimator of µ.
Example 26 Let {x1, x2, . . . , xn} be a random sample drawn from a population with population variance σ². Then, for the sample variance S² we have:

E(S²) = σ²

Thus, S² is an unbiased estimator of σ².
Example 27 Let {x1, x2, . . . , xn} be a random sample drawn from a population with population proportion π. Then, for the sample proportion π̂ we have:

E(π̂) = π

Thus, π̂ is an unbiased estimator of π.
Interpretation of the unbiased property

• We know that an estimator is a random variable, that is, it takes different values with different probabilities.
• Hence, it is clear that it is highly unlikely that the specific value (estimate) that we get once we apply the sample to the estimator exactly coincides with the true parameter value.
• What the unbiased property means is that the above is true "in the sense of expectation". When we apply the specific sample we have to the estimator, the estimate will not coincide (in general) with the true value of the parameter, but if we had 100 different samples to apply to the estimator then the average of the 100 different estimates produced would be very close to the true parameter value.
We can compare an estimator with a "shooter" whose target is the true value of the parameter.

• A good "shooter" (unbiased) always aims at the center of the target, although there is always a small probability that the shot slightly deviates from the center.
• A bad "shooter" (biased) never aims at the center of the target.
1.4.2 Eciency
The eciency criterion for an estimator, that we will see next , has two dierent versions
depending on whether the estimator is biased or unbiased.
1.4.2.1 Unbiased Estimators
Denition 28 Let θ̂1 and θ̂2 be two unbiased estimators of θ. Then, the more
ecient estimator is that of the lesser variance.
1.4.2.2 Biased Estimators
Denition 29 Let θ̂1 and θ̂2 be any two estimators of θ. Then, the more ecient estimator is that of the lesser Mean Quadratic Error (M QE) where:
M QE(θ̂) = E(θ̂ − θ)2 = V (θ̂) + B(θ̂)2
Year 2019 - 2020
77
The second "version" contains the first one as a special case. Indeed, if an estimator has zero bias then its MQE and variance coincide.

Example 30 Let us consider the following alternative estimators of the population mean µ, which will be applied to a sample obtained from a population with population mean µ and population variance σ²:

µ̂1 = (x1 + x2 + x3)/3

µ̂2 = (x1 + x2)/2

Let us check first the bias of each of these estimators:
B(µ̂1) = E(µ̂1) − µ = E((x1 + x2 + x3)/3) − µ = (1/3)(E(x1) + E(x2) + E(x3)) − µ = (1/3)3µ − µ = µ − µ = 0
B(µ̂2) = E(µ̂2) − µ = E((x1 + x2)/2) − µ = (1/2)(E(x1) + E(x2)) − µ = (1/2)2µ − µ = µ − µ = 0

Hence, both estimators are unbiased. Let us now check which one has less variance:
V(µ̂1) = V((x1 + x2 + x3)/3) = (1/9)(V(x1) + V(x2) + V(x3)) = (1/9)3σ² = σ²/3
V(µ̂2) = V((x1 + x2)/2) = (1/4)(V(x1) + V(x2)) = (1/4)2σ² = σ²/2

Therefore, µ̂1 is more efficient as it has less variance (σ²/3 < σ²/2).
Interpretation of the efficiency property

If we compare an unbiased estimator with a "good shooter" (as we have done before) that always aims at the center of the target, then an estimator is more efficient than another one if it "trembles" less. In other words, the more efficient estimator is the one whose values are more concentrated around its mean.
Statistics II
1.4.3 Consistency

• Very often it is very difficult to find efficient estimators for a specific parameter.

• In this case we look at the so-called asymptotic properties, that is, the properties that estimators have when the sample is as large as needed.

• In this sense, we will introduce the asymptotic bias and the asymptotic efficiency or consistency.
1.4.3.1 Asymptotically unbiased estimators

Definition 31 An estimator θ̂ of the population parameter θ is said to be asymptotically unbiased if its bias vanishes as the sample size goes to infinity. Formally, θ̂ is an asymptotically unbiased estimator of θ if

limₙ→∞ B(θ̂) = 0
Example 32 Let us consider the following estimator of the population variance (σ²):

S̃² = (Σᵢ₌₁ⁿ (xᵢ − X̄)²) / n
It is easy to check that if

S² = (Σᵢ₌₁ⁿ (xᵢ − X̄)²) / (n − 1)

then

S̃² = ((n − 1)/n) S²
and hence

E(S̃²) = E(((n − 1)/n) S²) = ((n − 1)/n) E(S²) = ((n − 1)/n) σ²

Therefore

B(S̃²) = E(S̃²) − σ² = ((n − 1)/n) σ² − σ² = −σ²/n

That is, S̃² is a biased estimator of σ² since E(S̃²) ≠ σ².
Nevertheless, S̃² is an asymptotically unbiased estimator of σ², for its bias vanishes as the sample grows. Indeed,

limₙ→∞ B(S̃²) = limₙ→∞ (−σ²/n) = 0
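The sign and size of this bias can be seen in a quick simulation. This sketch assumes a Normal population; the seed, sample sizes, and number of repetitions are arbitrary choices, not part of the slides:

```python
import random

random.seed(1)
sigma2, reps = 4.0, 40_000

def mean_s2_tilde(n):
    """Average of S~^2 (divisor n) over many samples of size n."""
    total = 0.0
    for _ in range(reps):
        xs = [random.gauss(0.0, sigma2 ** 0.5) for _ in range(n)]
        xbar = sum(xs) / n
        total += sum((x - xbar) ** 2 for x in xs) / n
    return total / reps

bias_n5 = mean_s2_tilde(5) - sigma2    # theory: -sigma2/5  = -0.8
bias_n50 = mean_s2_tilde(50) - sigma2  # theory: -sigma2/50 = -0.08
print(bias_n5, bias_n50)
```

Both biases are negative (S̃² underestimates σ²), and the bias shrinks toward zero as n grows, as the limit above predicts.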
1.4.3.2 Consistent Estimators
The property of consistency not only considers the behavior of the bias as the sample grows large, but also looks at the variance. That is, consistency refers to the behavior of the MQE of the estimator as the sample size goes to infinity.
Definition 33 An estimator θ̂ of the population parameter θ is said to be consistent if its Mean Quadratic Error vanishes as the size of the sample goes to infinity. Formally, θ̂ is a consistent estimator of θ if

limₙ→∞ MQE(θ̂) = 0

Example 34 Let us consider the estimator of σ² that we have seen before, S̃². We already know that it is a biased estimator of σ² and that its bias is B(S̃²) = −σ²/n.
We will now compute its variance in order to study the behavior of its MQE as the sample size goes to infinity:

V(S̃²) = V(((n − 1)/n) S²) = ((n − 1)/n)² V(S²) = ((n − 1)²/n²) · (2(σ²)²/(n − 1)) = 2(n − 1)σ⁴/n²
Hence

MQE(S̃²) = V(S̃²) + B(S̃²)² = 2(n − 1)σ⁴/n² + (−σ²/n)² = (2n − 1)σ⁴/n²

and then

limₙ→∞ MQE(S̃²) = limₙ→∞ (2n − 1)σ⁴/n² = 0

Therefore, S̃² is a consistent estimator of σ².
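The consistency of S̃² can likewise be illustrated numerically. This sketch assumes a standard Normal population (so σ² = 1) and compares the simulated MQE with the formula (2n − 1)σ⁴/n²; the seed and sample sizes are arbitrary assumptions:

```python
import random

random.seed(2)
sigma2, reps = 1.0, 30_000

def mqe_s2_tilde(n):
    """Simulated MQE of S~^2 for samples of size n."""
    total = 0.0
    for _ in range(reps):
        xs = [random.gauss(0.0, 1.0) for _ in range(n)]
        xbar = sum(xs) / n
        s2_tilde = sum((x - xbar) ** 2 for x in xs) / n
        total += (s2_tilde - sigma2) ** 2
    return total / reps

results = {}
for n in (5, 20, 80):
    theory = (2 * n - 1) * sigma2 ** 2 / n ** 2
    results[n] = (mqe_s2_tilde(n), theory)
    print(n, results[n])
```

The MQE shrinks as n grows and tracks the theoretical value, which is what consistency requires.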
1.5 Methods of point estimation: maximum likelihood and method of moments

• When we need to produce estimations for population parameters that are "standard" (µ, σ², π), there are good estimators at hand: (X̄, S², π̂).

• When we need to estimate a different population parameter (for instance the median or the kurtosis) we do not have a "candidate" estimator.

• The Maximum Likelihood method and the Method of Moments provide techniques to build good estimators of a given population parameter.
1.5.1 Maximum Likelihood estimation

The intuition of the method is as follows:

• After performing a simple random sampling (SRS) we obtain a specific sample, and there must be a reason for it (since we could have obtained a different one).

• Well, probably we have obtained this specific sample because the parameter value we want to estimate is such that the sample we have obtained is the one with the highest probability of being selected.

• In this sense, the maximum likelihood method finds the value of the parameter that maximizes the probability of obtaining the sample at hand.
The process takes three steps, starting with the sample we have, {x1, x2, ··· xn}, and the probability density function of the population that contains the parameter (θ) we want to estimate, f(x; θ).
We will first introduce the general method, and later we offer an example to clarify it. Suppose that we want to estimate the parameter θ of a population with a distribution given by f(x; θ) using the sample that we have obtained, {x1, x2, ··· xn}.

Step 1 Build the Likelihood function

The Likelihood function is the "formula" that computes the probability of having obtained the sample we have, conditional on the population parameter we want to estimate.
L(x1, x2, · · · xn; θ) = P (X1 = x1, X2 = x2, · · · Xn = xn; θ)
Since the sample has been obtained from a population with a probability distribution given by f(x; θ), and the elements in the sample are independent from each other, the joint probability P(X1 = x1, X2 = x2, ··· Xn = xn; θ) can be computed as

P(X1 = x1, X2 = x2, ··· Xn = xn; θ) = f(x1; θ) · f(x2; θ) · ... · f(xn; θ)

hence,

L(x1, x2, ··· xn; θ) = f(x1; θ) · f(x2; θ) · ... · f(xn; θ) = ∏ᵢ₌₁ⁿ f(xᵢ; θ)

Step 2 Apply logarithms
The functional form of the likelihood function is often involved (a product of functions). Using logarithms we can simplify the function so that it becomes easier to deal with. Therefore, in this step we simply apply ln and then use the properties of logarithms in order to simplify the form of the likelihood function:
ln L(x1, x2, ··· xn; θ) = ln ∏ᵢ₌₁ⁿ f(xᵢ; θ) = Σᵢ₌₁ⁿ ln f(xᵢ; θ)

Step 3 Maximize
The last step is to maximize the likelihood function, that is, to find the value of θ that maximizes the function ln L (the probability of having obtained the sample we have). We must compute the derivative of the (log of the) likelihood function ln L with respect to the parameter θ and set it equal to zero to find the value of θ that maximizes it:

∂ ln L(x1, ··· xn; θ)/∂θ = 0
From here we find the value of θ that solves the above equation. The solution will be the maximum likelihood estimator of θ, usually denoted by θ̂ML.

Example 35 Let {x1, x2, ··· xn} be an (independent) sample obtained from a Normal population with population mean µ and population variance σ². Find the maximum likelihood estimator of µ.
First, let us remember the probability density function corresponding to a N(µ, σ²):

f(x; µ, σ²) = (1/(√(2π) σ)) e^(−(1/2)((x−µ)/σ)²)
Step 1 Likelihood Function

L(x1, x2, ··· xn; µ) = ∏ᵢ₌₁ⁿ (1/(√(2π) σ)) e^(−(1/2)((xᵢ−µ)/σ)²) = (1/(√(2π) σ))ⁿ · e^(−(1/2) Σᵢ₌₁ⁿ ((xᵢ−µ)/σ)²)

This would be hard to work with! That's why we need to use logarithms.
Step 2 Logarithms

ln L(x1, ··· xn; µ) = ln[(1/(√(2π) σ))ⁿ · e^(−(1/2) Σᵢ₌₁ⁿ ((xᵢ−µ)/σ)²)]

It still looks hard, but after using some of the properties of logarithms¹ the simplification will be important:

ln[(1/(√(2π) σ))ⁿ · e^(−(1/2) Σᵢ₌₁ⁿ ((xᵢ−µ)/σ)²)] = ln(1/(√(2π) σ))ⁿ + ln e^(−(1/2) Σᵢ₌₁ⁿ ((xᵢ−µ)/σ)²) = n ln(1/(√(2π) σ)) − (1/2) Σᵢ₌₁ⁿ ((xᵢ−µ)/σ)²

¹ The logarithm of a product is the sum of logarithms, etc.
Hence

ln L(x1, ··· xn; µ) = n ln(1/(√(2π) σ)) − (1/2) Σᵢ₌₁ⁿ ((xᵢ−µ)/σ)²

Step 3 Maximize
We have to compute the derivative of ln L(x1, ··· xn; µ) with respect to µ and equate it to zero:

∂ ln L(x1, ··· xn; µ)/∂µ = ∂/∂µ [n ln(1/(√(2π) σ)) − (1/2) Σᵢ₌₁ⁿ ((xᵢ−µ)/σ)²] =

= 0 − (1/2) Σᵢ₌₁ⁿ ∂/∂µ ((xᵢ−µ)/σ)² = −(1/2) Σᵢ₌₁ⁿ 2((xᵢ−µ)/σ)(−1/σ) = Σᵢ₌₁ⁿ (xᵢ−µ)/σ²
Hence,

∂ ln L(x1, ··· xn; µ)/∂µ = 0 ⇒ Σᵢ₌₁ⁿ (xᵢ−µ)/σ² = 0 ⇒ (1/σ²)(Σᵢ₌₁ⁿ xᵢ − Σᵢ₌₁ⁿ µ) = 0

and finally,

Σᵢ₌₁ⁿ xᵢ = Σᵢ₌₁ⁿ µ ⇒ Σᵢ₌₁ⁿ xᵢ = nµ ⇒ µ = (Σᵢ₌₁ⁿ xᵢ)/n
That is, the maximum likelihood estimator of the population mean µ is the sample mean X̄:

µ̂ML = (Σᵢ₌₁ⁿ xᵢ)/n = X̄
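The three steps of Example 35 can be mirrored numerically: evaluating ln L on a grid of candidate values of µ, the maximum should sit at (or right next to) the sample mean X̄. This is a sketch with arbitrary simulated data (µ = 7, σ = 2, and the grid bounds are assumptions):

```python
import math
import random

random.seed(3)
sigma = 2.0
xs = [random.gauss(7.0, sigma) for _ in range(200)]
xbar = sum(xs) / len(xs)

def log_lik(mu):
    """ln L(x1,...,xn; mu) for the Normal model with known sigma."""
    n = len(xs)
    return n * math.log(1.0 / (math.sqrt(2.0 * math.pi) * sigma)) \
        - 0.5 * sum(((x - mu) / sigma) ** 2 for x in xs)

# Grid search over mu in [4, 10) with step 0.001
grid = [4.0 + 0.001 * k for k in range(6000)]
mu_ml = max(grid, key=log_lik)
print(mu_ml, xbar)
```

The grid maximizer agrees with X̄ up to the grid spacing, matching the closed-form result µ̂ML = X̄.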
1.5.2 Method of moments

Consider a population distributed according to the density function f(x; θ), where θ is the unknown population parameter that we want to study. The method of moments proceeds in 3 simple steps.

Step 1 Compute the expectation of X according to the density function above and set it equal to the population mean µ:

µ = E(X) = ∫ x f(x; θ) dx

The result of this integral will be a function of the parameter θ. Hence we should have something like

µ = g(θ)
Step 2 Since we know that X̄ is a good estimator of µ, we just set µ = X̄, that is:

X̄ = g(θ)

Step 3 Finally, just inverting the function g we can express θ as a function of X̄, and we are done! We have found an estimator for θ that is called the method of moments estimator θ̂MM:

θ̂MM = g⁻¹(X̄)
Example 36 Consider a population distributed according to the density function

f(x; θ) = (θ + 1) x^θ if 0 ≤ x ≤ 1, and 0 otherwise

Find the Method of Moments estimator of θ.
Step 1 Expectation

µ = E(X) = ∫₀¹ x (θ + 1) x^θ dx = (θ + 1) ∫₀¹ x^(θ+1) dx = (θ + 1) [x^(θ+2)/(θ + 2)]₀¹ = (θ + 1)/(θ + 2)

Hence, we can write

µ = (θ + 1)/(θ + 2)

Step 2 Use the estimator of µ

X̄ = (θ + 1)/(θ + 2)
Step 3 Solve for θ

θ̂MM = (1 − 2X̄)/(X̄ − 1)
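Example 36 can be verified by simulation: since the CDF of f(x; θ) = (θ + 1)x^θ on [0, 1] is F(x) = x^(θ+1), we can draw X = U^(1/(θ+1)) by inverse-transform sampling and check that θ̂MM recovers the true θ. A sketch with an arbitrary true value θ = 2 (so µ = 3/4):

```python
import random

random.seed(4)
theta_true = 2.0
n = 200_000

# Inverse-transform sampling: F(x) = x^(theta+1) on [0,1] => X = U^(1/(theta+1))
xs = [random.random() ** (1.0 / (theta_true + 1.0)) for _ in range(n)]
xbar = sum(xs) / n   # should be close to (theta+1)/(theta+2) = 0.75

theta_mm = (1.0 - 2.0 * xbar) / (xbar - 1.0)
print(theta_mm)      # should be close to theta_true = 2
```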