Applied Statistics I
Liang Zhang
Department of Mathematics, University of Utah
July 14, 2008
Point Estimation
Problem: when there is more than one point estimator for a parameter θ,
which one should we use?
There are a few criteria for selecting the best point estimator:
unbiasedness,
minimum variance,
and mean square error.
Definition
A point estimator θ̂ is said to be an unbiased estimator of θ if E (θ̂) = θ
for every possible value of θ. If θ̂ is not unbiased, the difference E (θ̂) − θ
is called the bias of θ̂.
Principle of Unbiased Estimation
When choosing among several different estimators of θ, select one that is
unbiased.
Proposition
Let X1, X2, ..., Xn be a random sample from a distribution with mean µ
and variance σ². Then the estimators
$\hat{\mu} = \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$ and $\hat{\sigma}^2 = S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$
are unbiased estimators of µ and σ², respectively.
If in addition the distribution is continuous and symmetric, then the sample median X̃ and
any trimmed mean are also unbiased estimators of µ.
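The unbiasedness of S² can be checked exactly on a small case. The sketch below assumes a hypothetical population (a variable equally likely to be 0, 1 or 2) and a hypothetical sample size n = 3, neither of which comes from the slides. It enumerates every equally likely sample and compares the exact expectation of the estimator with divisor n − 1, and of the alternative with divisor n, against σ².

```python
from fractions import Fraction
from itertools import product

# Hypothetical population: X is equally likely to be 0, 1 or 2.
values = [0, 1, 2]
n = 3  # hypothetical sample size

mu = Fraction(sum(values), len(values))
sigma2 = sum((Fraction(v) - mu) ** 2 for v in values) / len(values)

def sum_sq_dev(sample):
    """Sum of squared deviations from the sample mean."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((Fraction(x) - xbar) ** 2 for x in sample)

# All 3**n samples are equally likely, so an expectation is a plain average.
samples = list(product(values, repeat=n))
mean_of_xbar = sum(Fraction(sum(s), n) for s in samples) / len(samples)
mean_s2 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)
mean_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)

print("mu =", mu, " E(Xbar) =", mean_of_xbar)
print("sigma^2 =", sigma2, " E(S^2) =", mean_s2)
print("E(divisor-n estimator) =", mean_div_n)
```

In this small case the divisor n − 1 reproduces σ² exactly, while the divisor n gives (n − 1)/n times σ², which is why it is biased downward.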
Principle of Minimum Variance Unbiased Estimation
Among all estimators of θ that are unbiased, choose the one that has
minimum variance. The resulting θ̂ is called the minimum variance
unbiased estimator (MVUE) of θ.
Theorem
Let X1, X2, ..., Xn be a random sample from a normal distribution with
mean µ and variance σ². Then the estimator µ̂ = X̄ is the MVUE for µ.
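A simulation cannot prove the theorem, but it can illustrate it against one competitor. The sketch below assumes hypothetical settings (standard normal data, n = 10, 5000 repetitions, an arbitrary seed) and compares the sampling variance of the sample mean with that of the sample median, which is also unbiased for µ in the normal case.

```python
import random
import statistics

random.seed(2008)  # arbitrary seed so the run is repeatable

mu, sigma, n, reps = 0.0, 1.0, 10, 5000  # hypothetical settings

means, medians = [], []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

# Variance of each estimator across the simulated samples.
var_mean = statistics.pvariance(means)
var_median = statistics.pvariance(medians)
print(f"variance of sample mean   ~ {var_mean:.4f} (theory: {sigma**2 / n:.4f})")
print(f"variance of sample median ~ {var_median:.4f}")
```

The simulated variance of the mean should sit close to σ²/n and below that of the median, consistent with the theorem.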
Definition
Let θ̂ be a point estimator of parameter θ. Then the quantity E[(θ̂ − θ)²]
is called the mean square error (MSE) of θ̂.
Proposition
MSE = E[(θ̂ − θ)²] = V(θ̂) + [E(θ̂) − θ]²
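One way to see why this decomposition holds is to add and subtract E(θ̂) inside the square. The cross term vanishes because E[θ̂ − E(θ̂)] = 0. The derivation below is a standard argument written out for reference and is not taken from the slides.

```latex
\begin{align*}
E\big[(\hat{\theta} - \theta)^2\big]
  &= E\big[\big(\hat{\theta} - E(\hat{\theta}) + E(\hat{\theta}) - \theta\big)^2\big] \\
  &= E\big[\big(\hat{\theta} - E(\hat{\theta})\big)^2\big]
     + 2\,\big[E(\hat{\theta}) - \theta\big]\,E\big[\hat{\theta} - E(\hat{\theta})\big]
     + \big[E(\hat{\theta}) - \theta\big]^2 \\
  &= V(\hat{\theta}) + \big[E(\hat{\theta}) - \theta\big]^2 .
\end{align*}
```

In particular, for an unbiased estimator the MSE equals the variance.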
Definition
The standard error of an estimator θ̂ is its standard deviation
$\sigma_{\hat{\theta}} = \sqrt{V(\hat{\theta})}$. If the standard error itself involves unknown parameters
whose values can be estimated, substitution of these estimates into $\sigma_{\hat{\theta}}$
yields the estimated standard error (estimated standard deviation) of
the estimator. The estimated standard error can be denoted either by $\hat{\sigma}_{\hat{\theta}}$
or by $s_{\hat{\theta}}$.
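For the sample mean, the standard error is σ/√n, and when σ is unknown the estimated standard error replaces σ with the sample standard deviation s. The sketch below applies this to the ten manufacturing times from the confidence-interval example in these slides. Choosing that data set here is an assumption made for illustration.

```python
import math
import statistics

# Manufacturing times from the confidence-interval example in these slides.
times = [63.8, 60.5, 65.3, 65.7, 61.9, 68.2, 68.1, 64.8, 65.8, 65.4]

n = len(times)
xbar = statistics.fmean(times)
s = statistics.stdev(times)      # sample standard deviation (divisor n - 1)
est_se = s / math.sqrt(n)        # estimated standard error of the sample mean

print(f"xbar = {xbar:.2f}, s = {s:.3f}, estimated SE = {est_se:.3f}")
```

For these data the estimated standard error comes out at roughly 0.76.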
Methods of Point Estimation
The Invariance Principle
Let θ̂ be the mle of the parameter θ. Then the mle of any function h(θ) of
this parameter is the function h(θ̂).
Proposition
Under very general conditions on the joint distribution of the sample, when
the sample size n is large, the maximum likelihood estimator of any
parameter θ is approximately unbiased [E (θ̂) ≈ θ] and has variance that is
nearly as small as can be achieved by any estimator. Stated another way,
the mle θ̂ is approximately the MVUE of θ.
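As an illustration of the invariance principle, consider normal data, where the MLE of σ² is the average squared deviation with divisor n (a standard result not derived in these slides). Invariance then says the MLE of σ is simply its square root. The sketch below uses hypothetical observations and checks numerically that this value of σ gives a higher likelihood than nearby values.

```python
import math

# Hypothetical observations, assumed to come from a normal distribution.
data = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7]
n = len(data)

mu_hat = sum(data) / n
var_hat = sum((x - mu_hat) ** 2 for x in data) / n   # MLE of sigma^2 (divisor n)
sigma_hat = math.sqrt(var_hat)                        # invariance: h(theta) = sqrt(theta)

def log_likelihood(mu, sigma):
    """Normal log-likelihood of the data at (mu, sigma)."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)
        for x in data
    )

# sigma_hat should beat nearby values of sigma when mu is held at mu_hat.
best = log_likelihood(mu_hat, sigma_hat)
for factor in (0.8, 0.9, 1.1, 1.2):
    assert log_likelihood(mu_hat, factor * sigma_hat) < best

print(f"MLE of sigma^2 = {var_hat:.4f}, MLE of sigma = {sigma_hat:.4f}")
```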
Confidence Intervals
Example (a variant of Problem 62, Ch5)
The total time for manufacturing a certain component is known to have a
normal distribution. However, the mean µ and variance σ² of the normal
distribution are unknown. In an experiment we manufactured
10 components and recorded the following times:
component   1     2     3     4     5     6     7     8     9     10
time        63.8  60.5  65.3  65.7  61.9  68.2  68.1  64.8  65.8  65.4
We know that both the MME and the MLE for the population mean µ are the
sample mean X̄, i.e. µ̂ = X̄ = 64.95. How accurate is this estimate?
• Assume the other parameter σ is known, e.g. σ = 2.7.
• X̄ is normally distributed with mean µ and variance σ²/n. Therefore,
$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ is a standard normal random variable.
• For the interval [−A, A], how large should A be so that Z falls in that
interval with probability 95%?
$P(-A < Z < A) = 0.95$
A is the 97.5th percentile of the standard normal distribution, which is 1.96.
• $P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = 0.95$
• $P\left(\bar{X} - 1.96 \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96 \cdot \frac{\sigma}{\sqrt{n}}\right) = 0.95$
• The interval $\left(\bar{X} - 1.96 \cdot \frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96 \cdot \frac{\sigma}{\sqrt{n}}\right)$ is called the 95%
confidence interval for µ.
• In our case, the 95% confidence interval for µ is (63.28, 66.62).
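The steps above can be reproduced numerically. The sketch below uses the ten recorded times and the assumed known σ = 2.7 from the example, and obtains the 97.5th percentile from Python's standard-library `statistics.NormalDist` rather than a table. The variable names are illustrative.

```python
import math
from statistics import NormalDist, fmean

times = [63.8, 60.5, 65.3, 65.7, 61.9, 68.2, 68.1, 64.8, 65.8, 65.4]
sigma = 2.7            # assumed known, as in the slides
level = 0.95

n = len(times)
xbar = fmean(times)
z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # 97.5th percentile, about 1.96
half_width = z * sigma / math.sqrt(n)

lower, upper = xbar - half_width, xbar + half_width
print(f"z = {z:.2f}, interval = ({lower:.2f}, {upper:.2f})")
```

Rounded to two decimals, this matches the interval (63.28, 66.62) stated in the slides.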
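Interpretation of Confidence Interval
• The 95% confidence interval (63.28, 66.62) for µ does not mean
P(µ falls in the interval (63.28, 66.62)) = 0.95.
Once the sample has been observed, both µ and the endpoints are fixed
numbers, so µ either lies in this particular interval or it does not.
• It is a long-run effect: if we draw 1000 random samples, then for
approximately 950 of them µ falls in the interval
$\left(\bar{X} - 1.96 \cdot \frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96 \cdot \frac{\sigma}{\sqrt{n}}\right)$ computed from that sample.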
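The long-run reading can be illustrated by simulation. The sketch below assumes a hypothetical true mean of 65.0 (the slides never reveal µ) together with σ = 2.7 and n = 10 from the example, draws 1000 samples with an arbitrary seed, and counts how many of the resulting intervals contain the true mean.

```python
import math
import random
from statistics import fmean

random.seed(14)  # arbitrary seed so the run is repeatable

mu_true, sigma, n = 65.0, 2.7, 10   # mu_true is a hypothetical "true" mean
z = 1.96
trials = 1000

covered = 0
for _ in range(trials):
    sample = [random.gauss(mu_true, sigma) for _ in range(n)]
    xbar = fmean(sample)
    half_width = z * sigma / math.sqrt(n)
    # The interval changes from sample to sample; mu_true does not.
    if xbar - half_width < mu_true < xbar + half_width:
        covered += 1

print(f"{covered} of {trials} intervals contain mu")
```

The count should land near 950, with some run-to-run variation if the seed is changed.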
Example (a variant of Problem 62, Ch5)
The total time for manufacturing a certain component is known to have a
normal distribution. However, the mean µ of the normal distribution is
unknown. In an experiment we manufactured 10 components
and recorded the following times:
component   1     2     3     4     5     6     7     8     9     10
time        63.8  60.5  65.3  65.7  61.9  68.2  68.1  64.8  65.8  65.4
We know that both the MME and the MLE for the population mean µ are the
sample mean X̄, i.e. µ̂ = X̄ = 64.95. We further assume the standard
deviation is known to be σ = 2.7. What is the 99% confidence
interval for µ?
Definition
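A 100(1 − α)% confidence interval for the mean µ of a normal
population when the value of σ is known is given by
$\left( \bar{x} - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \right)$
or, equivalently, by $\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$.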
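The definition translates directly into a small helper. The sketch below is one possible implementation, with a hypothetical function name `z_interval`, applied to the 99% question posed for the manufacturing-time data (x̄ = 64.95, σ = 2.7, n = 10).

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, level):
    """Two-sided confidence interval for a normal mean with known sigma."""
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)     # z_{alpha/2}
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Values from the manufacturing-time example: xbar = 64.95, sigma = 2.7, n = 10.
lower, upper = z_interval(64.95, 2.7, 10, 0.99)
print(f"99% interval: ({lower:.2f}, {upper:.2f})")
```

With these inputs the 99% interval is roughly (62.75, 67.15), wider than the 95% interval because a higher confidence level requires a larger z value.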
Graphical interpretation: [figure not included]
Example (a variant of Problem 62, Ch5)
The total time for manufacturing a certain component is known to have a
normal distribution. However, the mean µ of the normal distribution is
unknown. We therefore decide to run an experiment in which we manufacture n
components to estimate the population mean µ. We know that both the
MME and the MLE for the population mean µ are the sample mean X̄, i.e.
µ̂ = X̄. We further assume the standard deviation is known to be σ = 2.7.
If we want a 99% confidence interval for µ with width 3.34, how
large should n be?
Proposition
To obtain a 100(1 − α)% confidence interval with width w for the mean µ
of a normal population when the value of σ is known, we need a random
sample of size at least
$n = \left( 2 z_{\alpha/2} \cdot \frac{\sigma}{w} \right)^2$
Remark:
The half-width w/2 of the 100(1 − α)% CI is called the bound on the error
of estimation associated with a 100(1 − α)% confidence level.
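Because n must be a whole number, the formula is rounded up in practice. The sketch below, with a hypothetical function name `required_sample_size`, applies it to the question of a 99% interval of width 3.34 with σ = 2.7.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, width, level):
    """Smallest n whose z-interval for the mean has total width at most `width`."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # z_{alpha/2}
    return math.ceil((2 * z * sigma / width) ** 2)

# Values from the example: sigma = 2.7, width 3.34, 99% confidence.
n_needed = required_sample_size(2.7, 3.34, 0.99)
print("required n =", n_needed)
```

The formula gives about 17.3 before rounding, so 18 components are needed under these assumptions.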
Example:
Extensive experience with fans of a certain type used in diesel engines has
suggested that the exponential distribution provides a good model for time
until failure. However, the parameter λ is unknown. The following table
records the data for a sample of size 10:
fan    1      2      3      4      5      6      7      8      9      10
time   1.199  0.105  0.373  0.266  0.888  0.574  0.244  0.008  0.689  0.235
What is a 95% confidence interval for λ?
Proposition
Let X1, X2, ..., Xn be i.i.d. random variables from an exponential distribution
with parameter λ. Then the random variable $Y = 2\lambda \sum_{i=1}^{n} X_i$ has the
chi-squared distribution with 2n degrees of freedom, i.e., Y ∼ χ²(2n).
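This proposition yields an interval for λ: since Y lies between the lower and upper 2.5% points of χ²(2n) with probability 0.95, dividing through by 2ΣXi isolates λ. The sketch below applies that reasoning to the ten fan failure times from the example. To stay within the standard library, the two critical values for 20 degrees of freedom (9.591 and 34.170) are hard-coded from a standard chi-squared table, which is an assumption of this sketch rather than something given in the slides.

```python
# Failure times from the fan example in these slides.
times = [1.199, 0.105, 0.373, 0.266, 0.888, 0.574, 0.244, 0.008, 0.689, 0.235]

n = len(times)
total = sum(times)

# Chi-squared critical values for 2n = 20 degrees of freedom, copied from a
# standard table (assumption: lower 2.5% point 9.591, upper 2.5% point 34.170).
CHI2_LOWER_20 = 9.591
CHI2_UPPER_20 = 34.170
assert 2 * n == 20, "the hard-coded critical values only apply to 20 df"

lambda_hat = n / total                   # MLE of lambda, shown for reference
lower = CHI2_LOWER_20 / (2 * total)      # from  chi2_lower < 2 * lambda * total
upper = CHI2_UPPER_20 / (2 * total)      # from  2 * lambda * total < chi2_upper

print(f"lambda_hat = {lambda_hat:.3f}")
print(f"95% interval for lambda: ({lower:.3f}, {upper:.3f})")
```

Under these assumptions the interval is roughly (1.05, 3.73). It is not symmetric about the point estimate because the chi-squared distribution is skewed.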