STATSLIDE8
SAMPLE STATISTICS
A random sample x₁, x₂, . . . , xₙ from a distribution f(x) is a set of independent and identically distributed random variables with xᵢ ∼ f(x) for all i. Their joint p.d.f. is

f(x₁, x₂, . . . , xₙ) = f(x₁)f(x₂) · · · f(xₙ) = ∏ᵢ₌₁ⁿ f(xᵢ).
The sample moments provide estimates of the moments of f(x). We need to know how these estimates are distributed.
The mean x̄ of a random sample is an unbiased estimate of the population moment µ = E(x), since

E(x̄) = E{(1/n)Σᵢ xᵢ} = (1/n)Σᵢ E(xᵢ) = (1/n)nµ = µ.
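As a quick numerical illustration, the unbiasedness of x̄ can be checked by simulation. This is a sketch in Python's standard library; the population values (µ = 5, σ = 2) and the sample size n = 10 are illustrative choices, not taken from the text.

```python
import random
import statistics

# Monte Carlo check that E(x-bar) = mu: average the sample mean over many
# replications and compare it with the population mean.
# mu = 5, sigma = 2 and n = 10 are illustrative values.
random.seed(0)
mu, sigma, n = 5.0, 2.0, 10
means = [statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
         for _ in range(20_000)]
grand_mean = statistics.fmean(means)
assert abs(grand_mean - mu) < 0.05  # the average of x-bar settles near mu
```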
The variance of a sum of independent variables is the sum of their variances, since covariances
are zero. Therefore
V(x̄) = V{(1/n)Σᵢ xᵢ} = (1/n²)Σᵢ V(xᵢ) = (1/n²)nσ² = σ²/n.
Observe that V (x̄) → 0 as n → ∞. Since E(x̄) = µ, the estimates become increasingly concentrated around the true population parameter. Such an estimate is said to be
consistent.
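The result V(x̄) = σ²/n can likewise be illustrated by simulation. A minimal sketch, assuming an illustrative population with σ = 3 and samples of size n = 25:

```python
import random
import statistics

# Check that the variance of x-bar is close to sigma^2/n, which shrinks
# as n grows (this is the source of consistency).
# sigma = 3 and n = 25 are illustrative values.
random.seed(1)
sigma, n = 3.0, 25
means = [statistics.fmean(random.gauss(0.0, sigma) for _ in range(n))
         for _ in range(40_000)]
var_xbar = statistics.pvariance(means)
assert abs(var_xbar - sigma**2 / n) < 0.02  # close to 9/25 = 0.36
```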
The sample variance s² = (1/n)Σᵢ(xᵢ − x̄)² is not an unbiased estimate of σ² = V(x), since

E(s²) = E{(1/n)Σᵢ(xᵢ − x̄)²}
 = E{(1/n)Σᵢ[(xᵢ − µ) + (µ − x̄)]²}
 = E{(1/n)Σᵢ[(xᵢ − µ)² + 2(xᵢ − µ)(µ − x̄) + (µ − x̄)²]}
 = V(x) − 2E{(x̄ − µ)²} + E{(x̄ − µ)²} = V(x) − V(x̄).
Here, we have used the result that

E{(1/n)Σᵢ(xᵢ − µ)(µ − x̄)} = −E{(µ − x̄)²} = −V(x̄),

which follows since (1/n)Σᵢ(xᵢ − µ) = x̄ − µ.
It follows that

E(s²) = V(x) − V(x̄) = σ² − σ²/n = σ²(n − 1)/n.
Therefore, s² is a biased estimator of the population variance. For an unbiased estimate, we should use

σ̂² = s²·n/(n − 1) = Σᵢ(xᵢ − x̄)²/(n − 1).
However, s² is still a consistent estimator, since E(s²) → σ² as n → ∞ and also V(s²) → 0. The value of V(s²) depends on the distribution of the underlying population, which is often assumed to be normal.
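The bias factor (n − 1)/n, and its removal by the n − 1 divisor, can be checked numerically. A sketch under assumed values σ² = 4 and n = 8:

```python
import random
import statistics

# Check of E(s^2) = sigma^2 (n-1)/n: the divide-by-n sample variance
# should average about 4 * 7/8 = 3.5, while the n-1 divisor is unbiased.
# sigma^2 = 4 and n = 8 are illustrative values.
random.seed(2)
sigma2, n = 4.0, 8

def s2_divide_by_n(xs):
    m = statistics.fmean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

samples = [[random.gauss(0.0, sigma2 ** 0.5) for _ in range(n)]
           for _ in range(50_000)]
mean_s2 = statistics.fmean(s2_divide_by_n(xs) for xs in samples)
# statistics.variance uses the n-1 divisor, i.e. the unbiased estimate
mean_unbiased = statistics.fmean(statistics.variance(xs) for xs in samples)
assert abs(mean_s2 - (n - 1) / n * sigma2) < 0.1
assert abs(mean_unbiased - sigma2) < 0.1
```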
Theorem. Let x1 , x2 , . . . , xn be a random sample from the normal population N (µ, σ 2 ).
Then, y = Σᵢ aᵢxᵢ is normally distributed with

E(y) = Σᵢ aᵢE(xᵢ) = µΣᵢ aᵢ  and  V(y) = Σᵢ aᵢ²V(xᵢ) = σ²Σᵢ aᵢ².
Any linear function of a set of normally distributed variables is normally distributed. If xᵢ ∼ N(µ, σ²), i = 1, . . . , n, is a normal random sample, then x̄ ∼ N(µ, σ²/n).
Let µ = [µ₁, µ₂, . . . , µₙ]′ = E(x) be the expected value of x = [x₁, x₂, . . . , xₙ]′ and let Σ = [σᵢⱼ; i, j = 1, 2, . . . , n] be the variance–covariance matrix. If a = [a₁, a₂, . . . , aₙ]′ is a constant vector, then a′x ∼ N(a′µ, a′Σa) is normally distributed with a mean of

E(a′x) = a′µ = Σᵢ aᵢµᵢ

and a variance of

V(a′x) = a′Σa = Σᵢ Σⱼ aᵢaⱼσᵢⱼ = Σᵢ aᵢ²σᵢᵢ + Σᵢ Σⱼ aᵢaⱼσᵢⱼ, where the second double sum runs over j ≠ i.
Let ι = [1, 1, . . . , 1]′. Then, if x = [x₁, x₂, . . . , xₙ]′ has xᵢ ∼ N(µ, σ²) for all i, there is x ∼ N(µι, σ²Iₙ), where µι = [µ, µ, . . . , µ]′ and Iₙ is the identity matrix of order n. Written explicitly, the mean vector has µ in every element and the variance–covariance matrix σ²Iₙ has σ² in every diagonal position and zeros elsewhere.
Then, there is

x̄ = (ι′ι)⁻¹ι′x = (1/n)ι′x ∼ N(µ, σ²/n)
and

V(x̄) = (1/n²)ι′{σ²I}ι = σ²ι′ι/n² = σ²/n,

where we have used repeatedly the result that ι′ι = n.
If we do not know the form of the distribution from which the sample has been taken, we
can still say that, under very general conditions, the distribution of x̄ tends to normality as
n → ∞:
The Central Limit Theorem states that, if x₁, x₂, . . . , xₙ is a random sample from a distribution with mean µ and variance σ², then the distribution of x̄ tends to the normal distribution N(µ, σ²/n) as n → ∞. Equivalently, (x̄ − µ)/(σ/√n) tends in distribution to the standard normal N(0, 1) distribution.
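The theorem is easy to see at work numerically: means of uniform draws, which are far from normal individually, standardise to something close to N(0, 1). A sketch with the illustrative choice n = 50 and the Uniform(0, 1) population (µ = 1/2, σ² = 1/12):

```python
import math
import random
import statistics

# CLT illustration: standardised means of n = 50 Uniform(0,1) draws
# should behave approximately like N(0, 1).
random.seed(3)
n = 50
mu, sigma = 0.5, math.sqrt(1.0 / 12.0)  # mean and s.d. of Uniform(0,1)
z = [(statistics.fmean(random.random() for _ in range(n)) - mu)
     / (sigma / math.sqrt(n))
     for _ in range(30_000)]
frac_within = sum(abs(v) <= 1.96 for v in z) / len(z)
assert abs(statistics.fmean(z)) < 0.05       # mean near 0
assert abs(statistics.pstdev(z) - 1.0) < 0.05  # s.d. near 1
assert abs(frac_within - 0.95) < 0.01        # ~95% within +/- 1.96
```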
Distribution of the sample variance
To describe the distribution of the sample variance, we need to define the chi-square distribution:
Definition. If z ∼ N (0, 1) is distributed as a standard normal variable, then z 2 ∼ χ2 (1) is
distributed as a chi-square variate with one degree of freedom.
If zᵢ ∼ N(0, 1); i = 1, 2, . . . , n are a set of independent and identically distributed standard normal variates, then the sum of their squares is a chi-square variate of n degrees of freedom:

u = Σᵢ₌₁ⁿ zᵢ² ∼ χ²(n).
The mean and the variance of the χ2 (n) variate are E(u) = n and V (u) = 2n respectively.
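Both moments can be confirmed by building chi-square draws directly from their definition as sums of squared standard normals. A sketch with the illustrative choice n = 6:

```python
import random
import statistics

# Simulated check of E(u) = n and V(u) = 2n for u ~ chi-square(n),
# generating u as a sum of n squared N(0,1) draws. n = 6 is illustrative.
random.seed(4)
n = 6
u = [sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n))
     for _ in range(60_000)]
assert abs(statistics.fmean(u) - n) < 0.1       # mean near n = 6
assert abs(statistics.pvariance(u) - 2 * n) < 0.5  # variance near 2n = 12
```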
Theorem. The sum of two independent chi-square variates is a chi-square variate with
degrees of freedom equal to the sum of the degrees of freedom of its additive components. If
x ∼ χ2 (n) and y ∼ χ2 (m), then (x + y) ∼ χ2 (m + n).
If x₁, x₂, . . . , xₙ is a random sample from a standard normal N(0, 1) distribution, then Σᵢ xᵢ² ∼ χ²(n).
If x1 , x2 , . . . , xn is a random sample from an N (µ, σ 2 ) distribution, then (xi −µ)/σ ∼ N (0, 1)
and, therefore,
Σᵢ₌₁ⁿ (xᵢ − µ)²/σ² ∼ χ²(n).
Consider the identity

Σ(xᵢ − µ)² = Σ({xᵢ − x̄} + {x̄ − µ})²
 = Σ({xᵢ − x̄}² + 2{x̄ − µ}{xᵢ − x̄} + {x̄ − µ}²)
 = Σ{xᵢ − x̄}² + n{x̄ − µ}²,

which follows from the fact that the cross-product term is 2{x̄ − µ}Σ{xᵢ − x̄} = 0. This decomposition of a sum of squares features in the following result:
The Decomposition of a Chi-square statistic. If x₁, x₂, . . . , xₙ is a random sample from a normal N(µ, σ²) distribution, then

Σᵢ₌₁ⁿ (xᵢ − µ)²/σ² = Σᵢ₌₁ⁿ (xᵢ − x̄)²/σ² + n(x̄ − µ)²/σ²,

with

(1) Σᵢ₌₁ⁿ (xᵢ − µ)²/σ² ∼ χ²(n),

(2) Σᵢ₌₁ⁿ (xᵢ − x̄)²/σ² ∼ χ²(n − 1),

(3) n(x̄ − µ)²/σ² ∼ χ²(1),
where the statistics under (2) and (3) are independently distributed.
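The algebraic identity underlying the decomposition holds exactly for any sample, which a direct numerical check confirms. A sketch with illustrative values µ = 2 and n = 12:

```python
import random
import statistics

# Exact numeric check of the decomposition
#   sum (x_i - mu)^2 = sum (x_i - xbar)^2 + n (xbar - mu)^2.
# mu = 2 and n = 12 are illustrative values.
random.seed(5)
mu, n = 2.0, 12
x = [random.gauss(mu, 1.5) for _ in range(n)]
xbar = statistics.fmean(x)
lhs = sum((xi - mu) ** 2 for xi in x)
rhs = sum((xi - xbar) ** 2 for xi in x) + n * (xbar - mu) ** 2
assert abs(lhs - rhs) < 1e-9  # equality up to floating-point rounding
```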
Sampling Distributions
(1) If u ∼ χ²(m) and v ∼ χ²(n) are independent chi-square variates with m and n degrees of freedom respectively, then

F = (u/m)/(v/n) ∼ F(m, n),

which is the ratio of the chi-squares divided by their respective degrees of freedom, and has an F distribution of m and n degrees of freedom, denoted by F(m, n).
(2) If z ∼ N(0, 1) is a standard normal variate, if v ∼ χ²(n) is a chi-square variate of n degrees of freedom, and if the two variates are distributed independently, then the ratio

t = z/√(v/n) ∼ t(n)

has a t distribution of n degrees of freedom, denoted t(n).
Notice that

t² = z²/(v/n) ∼ {χ²(1)/1}/{χ²(n)/n} = F(1, n).
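The relation t² ∼ F(1, n) can be probed by simulation: squared t(n) draws, built from their definition, should have mean close to E{F(1, n)} = n/(n − 2). A sketch with the illustrative choice n = 12:

```python
import random
import statistics

# Simulated check that t(n)^2 behaves like F(1, n): the mean of t^2
# should approach n/(n - 2), the mean of F(1, n). n = 12 is illustrative.
random.seed(10)
n = 12
tsq = []
for _ in range(60_000):
    z = random.gauss(0.0, 1.0)                           # N(0, 1)
    v = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n))  # chi-square(n)
    tsq.append(z * z / (v / n))                          # t^2 = z^2 / (v/n)
assert abs(statistics.fmean(tsq) - n / (n - 2)) < 0.1    # near 12/10 = 1.2
```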
CONFIDENCE INTERVALS
Let z ∼ N (0, 1). From the tables of the standard normal, we can find numbers a, b such
that, for any Q ∈ (0, 1), there is P (a ≤ z ≤ b) = Q. The interval [a, b] is called a Q × 100%
confidence interval for z. The length of the interval is minimised when it is centred on
E(z) = 0.
A confidence interval for the mean. Let xi ∼ N (µ, σ 2 ); i = 1, . . . , n be a random sample.
Then

x̄ ∼ N(µ, σ²/n)  and  (x̄ − µ)/(σ/√n) ∼ N(0, 1).
Therefore, we can find numbers ±β such that

P{−β ≤ (x̄ − µ)/(σ/√n) ≤ β} = Q.
But, the following events are equivalent:

−β ≤ (x̄ − µ)/(σ/√n) ≤ β ⟺ x̄ − βσ/√n ≤ µ ≤ x̄ + βσ/√n.
The probability that the interval [x̄ − βσ/√n, x̄ + βσ/√n] covers µ is

P{x̄ − βσ/√n ≤ µ ≤ x̄ + βσ/√n} = Q,

which means that we are Q × 100% confident that µ lies in the interval.
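The coverage property can be verified by simulation: build the interval many times and count how often it covers µ. This sketch uses the standard library's NormalDist class to supply the quantile β ≈ 1.96 for Q = 0.95; µ = 10, σ = 2 and n = 16 are illustrative values.

```python
import math
import random
import statistics
from statistics import NormalDist

# Coverage check of the Q = 0.95 interval when sigma is known.
# beta is the standard-normal 97.5% point (about 1.96), taken from the
# stdlib NormalDist class rather than from a printed table.
random.seed(6)
mu, sigma, n = 10.0, 2.0, 16     # illustrative values
beta = NormalDist().inv_cdf(0.975)
trials = 20_000
hits = 0
for _ in range(trials):
    xbar = statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
    half = beta * sigma / math.sqrt(n)
    hits += xbar - half <= mu <= xbar + half
coverage = hits / trials
assert abs(coverage - 0.95) < 0.01  # about 95% of intervals cover mu
```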
A confidence interval for µ when σ² is unknown. The unbiased estimate of σ² is σ̂² = Σ(xᵢ − x̄)²/(n − 1). When σ² is replaced by σ̂²,

√n(x̄ − µ)/σ ∼ N(0, 1)

is replaced by

√n(x̄ − µ)/σ̂ ∼ t(n − 1).
To demonstrate this result, consider writing

√n(x̄ − µ)/σ̂ = {√n(x̄ − µ)/σ} ÷ √{Σ(xᵢ − x̄)²/(σ²(n − 1))},

and observe that σ is cancelled from the numerator and the denominator.
The denominator contains Σ(xᵢ − x̄)²/σ² ∼ χ²(n − 1). The numerator is a standard normal variate. Therefore, the ratio has the form

N(0, 1)/√{χ²(n − 1)/(n − 1)} ∼ t(n − 1).
To construct a confidence interval, the numbers ±β from the table of the N(0, 1) distribution are replaced by corresponding numbers ±b from the t(n − 1) table. Then

P{x̄ − bσ̂/√n ≤ µ ≤ x̄ + bσ̂/√n} = Q.
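The t-based interval can be checked in the same way. This sketch hard-codes b = 2.262, the tabulated t(9) point for Q = 0.95 (an assumed table value, since the standard library has no t quantile function); µ = 0, σ = 1 and n = 10 are illustrative.

```python
import math
import random
import statistics

# Coverage check of the t-based interval when sigma is estimated.
# b = 2.262 is the assumed tabulated t(9) upper 2.5% point;
# mu = 0, sigma = 1 and n = 10 are illustrative values.
random.seed(7)
mu, sigma, n = 0.0, 1.0, 10
b = 2.262
trials = 20_000
hits = 0
for _ in range(trials):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = statistics.fmean(xs)
    # stdev is the square root of the n-1 divisor (unbiased) variance estimate
    sigma_hat = statistics.stdev(xs)
    half = b * sigma_hat / math.sqrt(n)
    hits += xbar - half <= mu <= xbar + half
coverage = hits / trials
assert abs(coverage - 0.95) < 0.01  # about 95% coverage despite estimating sigma
```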
A confidence interval for the difference between two means. Imagine a treatment
that affects the mean of a normal population without affecting its variance.
To establish a confidence interval for the change in the mean, take samples before and after the treatment. Before treatment, there is

xᵢ ∼ N(µx, σ²); i = 1, . . . , n  and  x̄ ∼ N(µx, σ²/n),

and, after treatment, there is

yⱼ ∼ N(µy, σ²); j = 1, . . . , m  and  ȳ ∼ N(µy, σ²/m).
Assuming that the samples are mutually independent, the difference between their means is

(x̄ − ȳ) ∼ N(µx − µy, σ²/n + σ²/m).

Hence

{(x̄ − ȳ) − (µx − µy)}/√(σ²/n + σ²/m) ∼ N(0, 1).
If σ² were known, then, for any given value of Q ∈ (0, 1), a number β can be found from the N(0, 1) table such that

P{(x̄ − ȳ) − β√(σ²/n + σ²/m) ≤ µx − µy ≤ (x̄ − ȳ) + β√(σ²/n + σ²/m)} = Q,
giving a confidence interval for µx − µy . Usually, σ 2 has to be estimated from the sample
information. There are

Σ(xᵢ − x̄)²/σ² ∼ χ²(n − 1)  and  Σ(yⱼ − ȳ)²/σ² ∼ χ²(m − 1),
which are independent variates with expectations equal to the numbers of their degrees of
freedom. The sum of independent chi-squares is itself a chi-square with degrees of freedom
equal to the sum of those of its constituent parts. Therefore,
{Σ(xᵢ − x̄)² + Σ(yⱼ − ȳ)²}/σ² ∼ χ²(n + m − 2)
has an expected value of n + m − 2, whence the unbiased estimate of the variance is

σ̂² = {Σ(xᵢ − x̄)² + Σ(yⱼ − ȳ)²}/(n + m − 2).
If the estimate is used in place of the unknown value of σ², then we get

{(x̄ − ȳ) − (µx − µy)}/√(σ̂²/n + σ̂²/m)
 = {(x̄ − ȳ) − (µx − µy)}/√(σ²/n + σ²/m) ÷ √[{Σ(xᵢ − x̄)² + Σ(yⱼ − ȳ)²}/{σ²(n + m − 2)}]
 ∼ N(0, 1)/√{χ²(n + m − 2)/(n + m − 2)} = t(n + m − 2),
which is the basis for a confidence interval.
A confidence interval for the variance. If xᵢ ∼ N(µ, σ²); i = 1, . . . , n is a random sample, then Σ(xᵢ − x̄)²/(n − 1) is an unbiased estimate of the variance and Σ(xᵢ − x̄)²/σ² ∼ χ²(n − 1). Therefore, from the appropriate chi-square table, we can find numbers α and β such that

P{α ≤ Σ(xᵢ − x̄)²/σ² ≤ β} = Q
for some chosen Q ∈ (0, 1). From this, it follows that

P{1/α ≥ σ²/Σ(xᵢ − x̄)² ≥ 1/β} = Q ⟺ P{Σ(xᵢ − x̄)²/β ≤ σ² ≤ Σ(xᵢ − x̄)²/α} = Q,

and the latter provides a confidence interval for σ².
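A coverage check for this interval follows the same pattern as before. The sketch hard-codes α = 2.700 and β = 19.023, the assumed tabulated 2.5% and 97.5% points of χ²(9) (the standard library has no chi-square quantile function); σ² = 4 and n = 10 are illustrative.

```python
import random
import statistics

# Coverage check of the chi-square interval for sigma^2 with n = 10.
# alpha = 2.700 and beta = 19.023 are assumed tabulated chi-square(9)
# points; sigma = 2 is an illustrative value.
random.seed(9)
sigma, n = 2.0, 10
alpha, beta = 2.700, 19.023
trials = 20_000
hits = 0
for _ in range(trials):
    xs = [random.gauss(0.0, sigma) for _ in range(n)]
    xbar = statistics.fmean(xs)
    ss = sum((x - xbar) ** 2 for x in xs)   # sum of squares about x-bar
    hits += ss / beta <= sigma ** 2 <= ss / alpha
coverage = hits / trials
assert abs(coverage - 0.95) < 0.01  # about 95% of intervals cover sigma^2
```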
The confidence interval for the ratio of two variances. Imagine a treatment that affects the variance of a normal population. It is possible that the mean is also affected. Let xᵢ ∼ N(µx, σx²); i = 1, . . . , n be a random sample taken from the population before treatment and let yⱼ ∼ N(µy, σy²); j = 1, . . . , m be a random sample taken after treatment. Then

Σ(xᵢ − x̄)²/σx² ∼ χ²(n − 1)  and  Σ(yⱼ − ȳ)²/σy² ∼ χ²(m − 1)

are independent chi-square variates, and hence

F = {Σ(xᵢ − x̄)²/(σx²(n − 1))} ÷ {Σ(yⱼ − ȳ)²/(σy²(m − 1))} ∼ F(n − 1, m − 1).
It is possible to find numbers α and β such that P(α ≤ F ≤ β) = Q, where Q ∈ (0, 1) is some chosen probability value. Given such values, we may make the following probability statement:

P{α·Σ(yⱼ − ȳ)²(n − 1)/[Σ(xᵢ − x̄)²(m − 1)] ≤ σy²/σx² ≤ β·Σ(yⱼ − ȳ)²(n − 1)/[Σ(xᵢ − x̄)²(m − 1)]} = Q.