Point Estimation
Definition
A point estimate of a parameter θ is a single number that can be
regarded as a sensible value for θ. A point estimate is obtained by
selecting a suitable statistic and computing its value from the
given sample data. The selected statistic is called the point
estimator of θ.
Definition
A point estimator θ̂ is said to be an unbiased estimator of θ if
E (θ̂) = θ for every possible value of θ. If θ̂ is not unbiased, the
difference E (θ̂) − θ is called the bias of θ̂.
Principle of Unbiased Estimation
When choosing among several different estimators of θ, select one
that is unbiased.
Proposition
Let X1, X2, ..., Xn be a random sample from a distribution with
mean µ and variance σ². Then the estimators
\[ \hat{\mu} = \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \quad\text{and}\quad \hat{\sigma}^2 = S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1} \]
are unbiased estimators of µ and σ², respectively.
If in addition the distribution is continuous and symmetric, then the sample median X̃
and any trimmed mean are also unbiased estimators of µ.
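The unbiasedness of the sample mean and of S² can be checked exactly on a small case. The sketch below is only an illustration and is not part of the proposition: it assumes a fair six-sided die as the population and n = 3, enumerates all 216 equally likely samples, and averages each estimator using exact fractions. The function name `exact_expectations` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def exact_expectations(values, n):
    """Enumerate every equally likely sample of size n (drawn with replacement)
    and return (E[X-bar], E[S^2], E[sum of squares / n]) as exact fractions."""
    values = [Fraction(v) for v in values]
    samples = list(product(values, repeat=n))
    total_mean = total_s2 = total_div_n = Fraction(0)
    for sample in samples:
        xbar = sum(sample) / n
        ss = sum((x - xbar) ** 2 for x in sample)  # sum of squared deviations
        total_mean += xbar
        total_s2 += ss / (n - 1)   # S^2, divisor n - 1
        total_div_n += ss / n      # alternative estimator, divisor n
    count = len(samples)
    return total_mean / count, total_s2 / count, total_div_n / count


# Assumed population: one roll of a fair die.
die = [1, 2, 3, 4, 5, 6]
mu = Fraction(sum(die), len(die))
sigma2 = sum((Fraction(v) - mu) ** 2 for v in die) / len(die)

e_mean, e_s2, e_div_n = exact_expectations(die, n=3)
print("mu      =", mu, " E[X-bar]      =", e_mean)
print("sigma^2 =", sigma2, " E[S^2]        =", e_s2)
print("E[divide-by-n estimator] =", e_div_n)
```

In this case the expected value of the divide-by-n estimator comes out as (n − 1)/n times σ², which is why the divisor n − 1 is used in S².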
Principle of Minimum Variance Unbiased Estimation
Among all estimators of θ that are unbiased, choose the one that
has minimum variance. The resulting θ̂ is called the minimum
variance unbiased estimator ( MVUE) of θ.
Theorem
Let X1, X2, ..., Xn be a random sample from a normal distribution
with mean µ and variance σ². Then the estimator µ̂ = X̄ is the
MVUE for µ.
Definition
Let θ̂ be a point estimator of parameter θ. Then the quantity
E[(θ̂ − θ)²] is called the mean square error (MSE) of θ̂.
Proposition
MSE = E[(θ̂ − θ)²] = V(θ̂) + [E(θ̂) − θ]²
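The decomposition of the MSE into variance plus squared bias can be verified exactly for a biased estimator. The sketch below is an illustration under stated assumptions: the population is a fair die (so σ² = 35/12), n = 3, and the estimator is the divide-by-n variance estimator. The names `mse_parts` and `variance_divide_by_n` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def mse_parts(estimator, theta, values, n):
    """Return (MSE, variance, bias) of `estimator`, computed exactly by
    enumerating all equally likely samples of size n drawn with replacement."""
    samples = list(product([Fraction(v) for v in values], repeat=n))
    estimates = [estimator(s) for s in samples]
    count = len(estimates)
    mean_est = sum(estimates) / count                        # E(theta-hat)
    variance = sum((e - mean_est) ** 2 for e in estimates) / count
    mse = sum((e - theta) ** 2 for e in estimates) / count   # E[(theta-hat - theta)^2]
    return mse, variance, mean_est - theta


def variance_divide_by_n(sample):
    """Biased variance estimator: sum of squared deviations divided by n."""
    xbar = sum(sample) / len(sample)
    return sum((x - xbar) ** 2 for x in sample) / len(sample)


die = [1, 2, 3, 4, 5, 6]      # assumed population
sigma2 = Fraction(35, 12)     # variance of one fair die roll

mse, variance, bias = mse_parts(variance_divide_by_n, sigma2, die, n=3)
print("bias         =", bias)
print("MSE          =", mse)
print("V + bias^2   =", variance + bias ** 2)
```

Because everything is computed with exact fractions, the two sides of the identity agree exactly rather than only up to rounding.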
Definition
The standard error of an estimator θ̂ is its standard deviation
$\sigma_{\hat{\theta}} = \sqrt{V(\hat{\theta})}$. If the standard error itself involves unknown
parameters whose values can be estimated, substitution of these
estimates into $\sigma_{\hat{\theta}}$ yields the estimated standard error (estimated
standard deviation) of the estimator. The estimated standard error
can be denoted either by $\hat{\sigma}_{\hat{\theta}}$ or by $s_{\hat{\theta}}$.
Confidence Intervals
Example (a variant of Problem 62, Ch5)
The total time for manufacturing a certain component is known to
have a normal distribution. However, the mean µ and variance σ²
of this distribution are unknown. In an experiment we manufactured
10 components and recorded the following times:

Component   1     2     3     4     5     6     7     8     9     10
Time        63.8  60.5  65.3  65.7  61.9  68.2  68.1  64.8  65.8  65.4
The method-of-moments estimator (MME) and the maximum likelihood
estimator (MLE) of the population mean µ are both the sample mean X̄,
i.e. µ̂ = X̄ = 64.95. How accurate is this estimate?
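One way to start answering the accuracy question is to compute the point estimate together with its estimated standard error s/√n. The following is a minimal sketch using only the ten recorded times; the variable names are illustrative.

```python
import math
import statistics

# Recorded manufacturing times for the 10 components.
times = [63.8, 60.5, 65.3, 65.7, 61.9, 68.2, 68.1, 64.8, 65.8, 65.4]

n = len(times)
xbar = statistics.mean(times)    # point estimate of mu
s = statistics.stdev(times)      # sample standard deviation (divisor n - 1)
est_se = s / math.sqrt(n)        # estimated standard error of the sample mean

print(f"n = {n}, sample mean = {xbar:.2f}")
print(f"s = {s:.3f}, estimated standard error = {est_se:.3f}")
```

The estimated standard error replaces the unknown σ by s; the slides that follow instead assume σ is known.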
• Assume the other parameter σ is known, e.g. σ = 2.7.
• X̄ is normally distributed with mean µ and variance σ²/n.
Therefore, $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ is a standard normal random variable.
• For the interval [−A, A], how large should A be so that Z falls
in that interval with probability .95?
P(−A < Z < A) = .95
A is the 97.5th percentile of the standard normal distribution, which is 1.96.
• $P\left( -1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96 \right) = .95$
• $P\left( \bar{X} - 1.96 \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96 \cdot \frac{\sigma}{\sqrt{n}} \right) = .95$
• The interval $\left( \bar{X} - 1.96 \cdot \frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96 \cdot \frac{\sigma}{\sqrt{n}} \right)$ is called the
95% confidence interval for µ.
• In our case, the 95% confidence interval for µ is (63.28, 66.62).
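The steps above can be reproduced numerically. This is a small sketch using Python's standard-library `statistics.NormalDist`; the values 64.95, 2.7 and 10 are taken from the example, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

xbar, sigma, n = 64.95, 2.7, 10   # sample mean, known sigma, sample size

std_normal = NormalDist()          # mean 0, standard deviation 1
A = std_normal.inv_cdf(0.975)      # 97.5th percentile of Z
coverage = std_normal.cdf(A) - std_normal.cdf(-A)   # P(-A < Z < A)

half_width = A * sigma / math.sqrt(n)
lower, upper = xbar - half_width, xbar + half_width

print(f"A = {A:.2f}, P(-A < Z < A) = {coverage:.3f}")
print(f"95% CI for mu: ({lower:.2f}, {upper:.2f})")
```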
Interpretation of Confidence Interval
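• The 95% confidence interval (63.28, 66.62) for µ does not
mean
P(µ falls in the interval (63.28, 66.62)) = .95,
because µ is a fixed constant and the computed endpoints are fixed
numbers, so nothing in that statement is random.
• It is a long-run statement: if we draw 1000 random samples and
compute the interval from each, then for approximately 950 of them
µ falls in the computed interval
$\left( \bar{X} - 1.96 \cdot \frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96 \cdot \frac{\sigma}{\sqrt{n}} \right)$.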
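The long-run interpretation can be illustrated by simulation. The sketch below is not from the original example: it assumes a true mean of µ = 65.0 (in practice µ is unknown), keeps σ = 2.7 and n = 10, and counts how often the interval computed from a fresh sample contains µ. The function name `coverage_rate` and the seed are arbitrary choices.

```python
import math
import random


def coverage_rate(mu, sigma, n, reps, z=1.96, seed=0):
    """Fraction of simulated samples whose interval
    (xbar - z*sigma/sqrt(n), xbar + z*sigma/sqrt(n)) contains mu."""
    rng = random.Random(seed)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / reps


# mu = 65.0 is an assumed "true" value used only for the simulation.
rate = coverage_rate(mu=65.0, sigma=2.7, n=10, reps=4000)
print(f"fraction of intervals containing mu: {rate:.3f}")
```

The printed fraction should be close to .95; it varies with the seed and the number of replications.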
Example (a variant of Problem 62, Ch5)
The total time for manufacturing a certain component is known to
have a normal distribution. However, the mean µ of this
distribution is unknown. In an experiment we manufactured 10
components and recorded the following times:

Component   1     2     3     4     5     6     7     8     9     10
Time        63.8  60.5  65.3  65.7  61.9  68.2  68.1  64.8  65.8  65.4

The MME and the MLE of the population mean µ are both the sample
mean X̄, i.e. µ̂ = X̄ = 64.95. We further assume the
standard deviation is known to be σ = 2.7. What is the 99%
confidence interval for µ?
Definition
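A 100(1 − α)% confidence interval for the mean µ of a normal
population when the value of σ is known is given by
\[ \left( \bar{x} - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \right) \]
or, equivalently, by $\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$.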
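The definition translates directly into a small helper for any confidence level. This is a sketch, with the hypothetical function name `z_interval`, applied to the 99% question for the manufacturing-time data (x̄ = 64.95, σ = 2.7, n = 10).

```python
import math
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence):
    """100(1 - alpha)% confidence interval for mu when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


lower, upper = z_interval(xbar=64.95, sigma=2.7, n=10, confidence=0.99)
print(f"99% CI for mu: ({lower:.2f}, {upper:.2f})")
```

Raising the confidence level from 95% to 99% increases the critical value, so the interval becomes wider for the same data.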
[Figure: graphical interpretation of the confidence interval]
Example (a variant of Problem 62, Ch5)
The total time for manufacturing a certain component is known to
have a normal distribution. However, the mean µ of this
distribution is unknown. Thus we decide to do an experiment in
which we manufacture n components to estimate the population
mean µ. The MME and the MLE of the population mean µ are both
the sample mean X̄, i.e. µ̂ = X̄. We further assume the
standard deviation is known to be σ = 2.7. If we want a 99%
confidence interval for µ with width 3.34, how large should
n be?
Proposition
To obtain a 100(1 − α)% confidence interval with width w for the
mean µ of a normal population when the value of σ is known, we
need a random sample of size at least
\[ n = \left( 2 z_{\alpha/2} \cdot \frac{\sigma}{w} \right)^2 \]
Remark:
The half-width w/2 of the 100(1 − α)% CI is called the bound on
the error of estimation associated with a 100(1 − α)%
confidence level.
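The sample-size formula can be applied to the manufacturing example (σ = 2.7, 99% confidence, width 3.34). A minimal sketch follows; the function name `required_sample_size` is hypothetical, and the result is rounded up because n must be a whole number.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, width, confidence):
    """Smallest integer n with 2 * z_{alpha/2} * sigma / sqrt(n) <= width."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    return math.ceil((2 * z * sigma / width) ** 2)


n = required_sample_size(sigma=2.7, width=3.34, confidence=0.99)
print("required sample size:", n)
```

Rounding up means the resulting interval is slightly narrower than the requested width rather than exactly equal to it.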
Example:
Extensive experience with fans of a certain type used in diesel
engines has suggested that the exponential distribution provides a
good model for time until failure. However, the parameter λ is
unknown. The following table records the data for a sample of
size 10:

Fan    1      2      3      4      5      6      7      8      9      10
Time   1.199  0.105  0.373  0.266  0.888  0.574  0.244  0.008  0.689  0.235

What is a 95% confidence interval for λ?
Proposition
Let X1, X2, ..., Xn be i.i.d. random variables from an exponential
distribution with parameter λ. Then the random variable
$Y = 2\lambda \sum_{i=1}^{n} X_i$ has the chi-squared distribution with 2n degrees
of freedom, i.e., Y ∼ χ²(2n).
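One way to use this proposition for the fan data: if a and b are the 2.5th and 97.5th percentiles of χ²(2n), then P(a < 2λΣXi < b) = .95, which rearranges to a/(2Σxi) < λ < b/(2Σxi). The sketch below follows that reasoning using only the standard library. It relies on the closed-form CDF available when the degrees of freedom are even and inverts it by bisection; the helper names `chi2_cdf_even` and `chi2_quantile_even` are hypothetical.

```python
import math


def chi2_cdf_even(y, n):
    """CDF at y of the chi-squared distribution with 2n degrees of freedom,
    using 1 - exp(-y/2) * sum_{k<n} (y/2)^k / k!  (valid for even df)."""
    half = y / 2
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= half / k
        total += term
    return 1 - math.exp(-half) * total


def chi2_quantile_even(p, n, tol=1e-10):
    """Invert chi2_cdf_even by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, n) < p:   # expand until the quantile is bracketed
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if chi2_cdf_even(mid, n) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


times = [1.199, 0.105, 0.373, 0.266, 0.888, 0.574, 0.244, 0.008, 0.689, 0.235]
n = len(times)
total = sum(times)

q_low = chi2_quantile_even(0.025, n)    # 2.5th percentile of chi-squared(2n)
q_high = chi2_quantile_even(0.975, n)   # 97.5th percentile of chi-squared(2n)
lam_lower = q_low / (2 * total)
lam_upper = q_high / (2 * total)
print(f"95% CI for lambda: ({lam_lower:.3f}, {lam_upper:.3f})")
```

With a statistics package available, the two percentiles could be taken from its chi-squared quantile function instead of the bisection used here.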
Large-Sample Confidence Intervals
Proposition
If n is sufficiently large, the standardized variable
\[ Z = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
has approximately a standard normal distribution. This implies that
\[ \bar{x} \pm z_{\alpha/2} \cdot \frac{s}{\sqrt{n}} \]
is a large-sample confidence interval for µ with confidence level
approximately 100(1 − α)%. This formula is valid regardless of the
shape of the population distribution.
Example (a variant of Problem 16)
The charge-to-tap time (min) for a carbon steel in one type of
open hearth furnace was determined for each heat in a sample of
size 46, resulting in a sample mean time of 382.1 and a sample
standard deviation of 31.5. Calculate a 95% confidence interval for
true average charge-to-tap time.
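For this example the large-sample interval x̄ ± z·s/√n can be evaluated as follows. This is a sketch with the hypothetical function name `large_sample_interval`; the inputs 382.1, 31.5 and 46 come from the problem statement.

```python
import math
from statistics import NormalDist


def large_sample_interval(xbar, s, n, confidence):
    """Approximate 100(1 - alpha)% CI for mu, using s in place of sigma."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    half_width = z * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


lower, upper = large_sample_interval(xbar=382.1, s=31.5, n=46, confidence=0.95)
print(f"95% CI for true average charge-to-tap time: ({lower:.1f}, {upper:.1f})")
```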
Example (Problem 19)
The article “Limited Yield Estimation for Visual Defect Sources”
(IEEE Trans. on Semiconductor Manuf., 1997: 17-23) reported
that, in a study of a particular wafer inspection process, 356 dies
were examined by an inspection probe and 201 of these passed the
probe. Assuming a stable process, calculate a 95% confidence
interval for the proportion of all dies that pass the probe.
Proposition
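A confidence interval for a population proportion p with
confidence level approximately 100(1 − α)% has
\[ \text{lower confidence limit} = \frac{\hat{p} + \dfrac{z_{\alpha/2}^2}{2n} - z_{\alpha/2} \sqrt{\dfrac{\hat{p}\hat{q}}{n} + \dfrac{z_{\alpha/2}^2}{4n^2}}}{1 + (z_{\alpha/2}^2)/n} \]
and
\[ \text{upper confidence limit} = \frac{\hat{p} + \dfrac{z_{\alpha/2}^2}{2n} + z_{\alpha/2} \sqrt{\dfrac{\hat{p}\hat{q}}{n} + \dfrac{z_{\alpha/2}^2}{4n^2}}}{1 + (z_{\alpha/2}^2)/n} \]
where q̂ = 1 − p̂.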
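The two limits can be computed for the wafer-inspection data (201 of 356 dies passing). The sketch below implements the formula as stated; the function name `score_interval` is hypothetical.

```python
import math
from statistics import NormalDist


def score_interval(successes, n, confidence):
    """Large-sample confidence limits for a proportion p."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    p_hat = successes / n
    q_hat = 1 - p_hat
    centre = p_hat + z ** 2 / (2 * n)
    spread = z * math.sqrt(p_hat * q_hat / n + z ** 2 / (4 * n ** 2))
    denom = 1 + z ** 2 / n
    return (centre - spread) / denom, (centre + spread) / denom


lower, upper = score_interval(successes=201, n=356, confidence=0.95)
print(f"95% CI for p: ({lower:.3f}, {upper:.3f})")
```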
Example (Problem 16)
The charge-to-tap time (min) for a carbon steel in one type of
open hearth furnace was determined for each heat in a sample of
size 46, resulting in a sample mean time of 382.1 and a sample
standard deviation of 31.5. Calculate a 95% upper confidence
bound for true average charge-to-tap time.
Example (Problem 19)
The article “Limited Yield Estimation for Visual Defect Sources”
(IEEE Trans. on Semiconductor Manuf., 1997: 17-23) reported
that, in a study of a particular wafer inspection process, 356 dies
were examined by an inspection probe and 201 of these passed the
probe. Assuming a stable process, calculate a 95% lower
confidence bound for the proportion of all dies that pass the probe.
Proposition
A large-sample upper confidence bound for µ is
\[ \mu < \bar{x} + z_{\alpha} \cdot \frac{s}{\sqrt{n}} \]
and a large-sample lower confidence bound for µ is
\[ \mu > \bar{x} - z_{\alpha} \cdot \frac{s}{\sqrt{n}} \]
A one-sided confidence bound for p results from replacing $z_{\alpha/2}$
by $z_{\alpha}$ and ± by either + or − in the CI formula for p. In all cases
the confidence level is approximately 100(1 − α)%.
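Both one-sided examples can be worked with this proposition. The sketch below uses the hypothetical helper names `upper_bound_mean` and `lower_bound_proportion`. It takes the charge-to-tap figures (382.1, 31.5, n = 46) and the wafer data (201 of 356) from the problems above, and uses the critical value z_α rather than z_{α/2}.

```python
import math
from statistics import NormalDist


def upper_bound_mean(xbar, s, n, confidence):
    """Large-sample upper confidence bound for mu."""
    z = NormalDist().inv_cdf(confidence)   # z_alpha (one-sided)
    return xbar + z * s / math.sqrt(n)


def lower_bound_proportion(successes, n, confidence):
    """Large-sample lower confidence bound for p: the lower limit of the
    CI formula for p with z_{alpha/2} replaced by z_alpha."""
    z = NormalDist().inv_cdf(confidence)   # z_alpha (one-sided)
    p_hat = successes / n
    q_hat = 1 - p_hat
    numerator = (p_hat + z ** 2 / (2 * n)
                 - z * math.sqrt(p_hat * q_hat / n + z ** 2 / (4 * n ** 2)))
    return numerator / (1 + z ** 2 / n)


mu_upper = upper_bound_mean(xbar=382.1, s=31.5, n=46, confidence=0.95)
p_lower = lower_bound_proportion(successes=201, n=356, confidence=0.95)
print(f"95% upper confidence bound for mu: {mu_upper:.2f}")
print(f"95% lower confidence bound for p:  {p_lower:.3f}")
```

Because all of the error probability α sits on one side, each bound is closer to the point estimate than the corresponding endpoint of the two-sided 95% interval.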