Applied Statistics I Liang Zhang July 17, 2008

Applied Statistics I
Liang Zhang
Department of Mathematics, University of Utah
July 17, 2008
Large-Sample Confidence Intervals
Proposition
If n is sufficiently large, the standardized variable
\[ Z = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
has approximately a standard normal distribution. This implies that
\[ \bar{x} \pm z_{\alpha/2} \cdot \frac{s}{\sqrt{n}} \]
is a large-sample confidence interval for µ with confidence level
approximately 100(1 − α)%. This formula is valid regardless of the shape
of the population distribution.
Example (a variant of Problem 16)
The charge-to-tap time (min) for a carbon steel in one type of open hearth
furnace was determined for each heat in a sample of size 46, resulting in a
sample mean time of 382.1 and a sample standard deviation of 31.5.
Calculate a 95% confidence interval for true average charge-to-tap time.
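One way to carry out this calculation is sketched below. It applies the large-sample interval x̄ ± z_{α/2} · s/√n to the numbers in the example; the function name `large_sample_ci` is an illustrative choice, not part of the course material.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_ci(xbar, s, n, conf=0.95):
    """Return (lower, upper) for the large-sample z interval xbar ± z_{α/2}·s/√n."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{α/2}, about 1.96 when conf = 0.95
    half_width = z * s / sqrt(n)
    return xbar - half_width, xbar + half_width


# Charge-to-tap example: n = 46, sample mean 382.1, sample standard deviation 31.5
lower, upper = large_sample_ci(382.1, 31.5, 46)
print(f"95% CI for the mean: ({lower:.1f}, {upper:.1f})")  # about (373.0, 391.2)
```

With n = 46 the normal approximation is being trusted; the interval is only approximately 95%.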
Example (Problem 19)
The article “Limited Yield Estimation for Visual Defect Sources” (IEEE
Trans. on Semiconductor Manuf., 1997: 17-23) reported that, in a study
of a particular wafer inspection process, 356 dies were examined by an
inspection probe and 201 of these passed the probe. Assuming a stable
process, calculate a 95% confidence interval for the proportion of all dies
that pass the probe.
Proposition
A confidence interval for a population proportion p with confidence
level approximately 100(1 − α)% has
\[ \text{lower confidence limit} = \frac{\hat{p} + \dfrac{z_{\alpha/2}^2}{2n} - z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n} + \dfrac{z_{\alpha/2}^2}{4n^2}}}{1 + z_{\alpha/2}^2/n} \]
and
\[ \text{upper confidence limit} = \frac{\hat{p} + \dfrac{z_{\alpha/2}^2}{2n} + z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n} + \dfrac{z_{\alpha/2}^2}{4n^2}}}{1 + z_{\alpha/2}^2/n} \]
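As a rough check of the proportion interval on the wafer inspection example (201 of 356 dies passed), the sketch below evaluates both limits, taking q̂ = 1 − p̂. This form is commonly known as the score (Wilson) interval; the function name `score_ci` is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def score_ci(successes, n, conf=0.95):
    """Return (lower, upper) confidence limits for a population proportion."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{α/2}
    p_hat = successes / n
    q_hat = 1 - p_hat
    center = p_hat + z**2 / (2 * n)
    spread = z * sqrt(p_hat * q_hat / n + z**2 / (4 * n**2))
    denom = 1 + z**2 / n
    return (center - spread) / denom, (center + spread) / denom


# Wafer inspection example: 201 of 356 dies passed the probe
lower, upper = score_ci(201, 356)
print(f"95% CI for p: ({lower:.3f}, {upper:.3f})")  # about (0.513, 0.615)
```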
Example (Problem 16)
The charge-to-tap time (min) for a carbon steel in one type of open hearth
furnace was determined for each heat in a sample of size 46, resulting in a
sample mean time of 382.1 and a sample standard deviation of 31.5.
Calculate a 95% upper confidence bound for true average charge-to-tap
time.
Example (Problem 19)
The article “Limited Yield Estimation for Visual Defect Sources” (IEEE
Trans. on Semiconductor Manuf., 1997: 17-23) reported that, in a study
of a particular wafer inspection process, 356 dies were examined by an
inspection probe and 201 of these passed the probe. Assuming a stable
process, calculate a 95% lower confidence bound for the proportion of all
dies that pass the probe.
Proposition
A large-sample upper confidence bound for µ is
\[ \mu < \bar{x} + z_{\alpha} \cdot \frac{s}{\sqrt{n}} \]
and a large-sample lower confidence bound for µ is
\[ \mu > \bar{x} - z_{\alpha} \cdot \frac{s}{\sqrt{n}} \]
A one-sided confidence bound for p results from replacing $z_{\alpha/2}$ by $z_{\alpha}$
and ± by either + or − in the CI formula for p. In all cases the confidence
level is approximately 100(1 − α)%.
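The one-sided bounds can be sketched on the two running examples: a 95% upper bound for the mean charge-to-tap time (n = 46, x̄ = 382.1, s = 31.5) and a 95% lower bound for the proportion of dies that pass (201 of 356). The variable names are illustrative; the only change from the two-sided case is that z_α replaces z_{α/2}.

```python
from math import sqrt
from statistics import NormalDist

z_alpha = NormalDist().inv_cdf(0.95)  # z_{0.05}, about 1.645

# Upper confidence bound for the mean: x̄ + z_α · s/√n
upper_mu = 382.1 + z_alpha * 31.5 / sqrt(46)

# Lower confidence bound for p: the "−" branch of the proportion formula with z_α
n = 356
p_hat = 201 / n
q_hat = 1 - p_hat
numerator = (p_hat + z_alpha**2 / (2 * n)
             - z_alpha * sqrt(p_hat * q_hat / n + z_alpha**2 / (4 * n**2)))
lower_p = numerator / (1 + z_alpha**2 / n)

print(f"mean charge-to-tap time < {upper_mu:.1f}")  # about 389.7
print(f"proportion passing      > {lower_p:.3f}")   # about 0.521
```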
Confidence Intervals for Normal Distribution
Example (a variant of Problem 62, Ch5)
The total time for manufacturing a certain component is known to have a
normal distribution. However, the mean µ and variance σ² of the normal
distribution are unknown. In an experiment in which we manufactured
10 components, we recorded the following sample times:

component   1     2     3     4     5     6     7     8     9     10
time        63.8  60.5  65.3  65.7  61.9  68.2  68.1  64.8  65.8  65.4

with x̄ = 64.95, s = 2.42.
What is the 95% confidence interval for the population mean µ?
Theorem
Let X1, X2, . . . , Xn be a random sample from a normal distribution with
mean µ and variance σ², where µ and σ are unknown. The random
variable
\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
has a probability distribution called a t distribution with n − 1
degrees of freedom (df). Here X̄ is the sample mean and S is the
sample standard deviation.
Properties of t Distributions:
Let $t_\nu$ denote the density function curve for ν df.
1. $t_\nu$ is governed by only one parameter ν, the number of degrees of
freedom.
2. Each $t_\nu$ curve is bell-shaped and centered at 0.
3. Each $t_\nu$ curve is more spread out than the standard normal (z) curve.
4. As ν increases, the spread of the corresponding $t_\nu$ curve decreases.
5. As ν → ∞, the sequence of $t_\nu$ curves approaches the standard normal
curve (so the z curve is often called the t curve with df = ∞).
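These properties can be checked numerically. The sketch below uses the standard closed form of the t density, which the slides do not state, and compares it with the standard normal density at the center (x = 0) and in the tail (x = 3). The helper names `t_pdf` and `z_pdf` are illustrative.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(x, nu):
    """Density of the t distribution with nu degrees of freedom."""
    log_c = lgamma((nu + 1) / 2) - lgamma(nu / 2) - 0.5 * log(nu * pi)
    return exp(log_c - (nu + 1) / 2 * log(1 + x * x / nu))


def z_pdf(x):
    """Density of the standard normal distribution."""
    return exp(-x * x / 2) / sqrt(2 * pi)


# Lower peak and heavier tail than z for small nu; both gaps shrink as nu grows.
for nu in (1, 5, 30, 1000):
    print(f"nu = {nu:4d}: density at 0 = {t_pdf(0.0, nu):.4f}, at 3 = {t_pdf(3.0, nu):.5f}")
print(f"z curve  : density at 0 = {z_pdf(0.0):.4f}, at 3 = {z_pdf(3.0):.5f}")
```

A lower peak together with a heavier tail is what "more spread out" means in property 3.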
Notation
Let $t_{\alpha,\nu}$ = the number on the measurement axis for which the area under
the t curve with ν df to the right of $t_{\alpha,\nu}$ is α; $t_{\alpha,\nu}$ is called a t critical
value.
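In practice t critical values are read from a table or from statistical software. To make the definition concrete, the sketch below finds the value numerically using only the Python standard library: it integrates the t density with Simpson's rule and then bisects until the area to the right equals α. The function names, step count and search range are illustrative assumptions.

```python
from math import exp, lgamma, log, pi


def t_pdf(x, nu):
    """Density of the t distribution with nu degrees of freedom."""
    log_c = lgamma((nu + 1) / 2) - lgamma(nu / 2) - 0.5 * log(nu * pi)
    return exp(log_c - (nu + 1) / 2 * log(1 + x * x / nu))


def t_upper_tail(x, nu, steps=2000):
    """Area under the t curve to the right of x >= 0 (composite Simpson's rule)."""
    h = x / steps
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 - total * h / 3  # by symmetry, the area to the right of 0 is 0.5


def t_critical(alpha, nu, lo=0.0, hi=100.0):
    """Approximate t_{alpha,nu} by bisection, for 0 < alpha < 0.5."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, nu) > alpha:
            lo = mid  # too much area to the right, so the critical value is larger
        else:
            hi = mid
    return (lo + hi) / 2


print(f"{t_critical(0.025, 9):.3f}")  # about 2.262, matching standard t tables
```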
Proposition
Let x̄ and s be the sample mean and sample standard deviation computed
from the results of a random sample from a normal population with mean
µ. Then a 100(1 − α)% confidence interval for µ is
\[ \left( \bar{x} - t_{\alpha/2,\,n-1} \cdot \frac{s}{\sqrt{n}},\ \ \bar{x} + t_{\alpha/2,\,n-1} \cdot \frac{s}{\sqrt{n}} \right) \]
or, more compactly, $\bar{x} \pm t_{\alpha/2,\,n-1} \cdot s/\sqrt{n}$.
An upper confidence bound for µ is
\[ \bar{x} + t_{\alpha,\,n-1} \cdot \frac{s}{\sqrt{n}} \]
and replacing + by − in this latter expression gives a lower confidence
bound for µ, both with confidence level 100(1 − α)%.
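Applied to the ten manufacturing times from the example in this section, the two-sided interval can be sketched as follows. The critical value t_{0.025,9} = 2.262 is taken from a standard t table rather than computed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Manufacturing times for the 10 components in the example
times = [63.8, 60.5, 65.3, 65.7, 61.9, 68.2, 68.1, 64.8, 65.8, 65.4]

n = len(times)
xbar = mean(times)  # sample mean, 64.95
s = stdev(times)    # sample standard deviation (divisor n − 1), about 2.42

T_CRIT = 2.262      # t_{0.025, 9}, assumed from a standard t table

half_width = T_CRIT * s / sqrt(n)
ci = (xbar - half_width, xbar + half_width)
print(f"95% CI for the mean: ({ci[0]:.2f}, {ci[1]:.2f})")  # about (63.22, 66.68)
```

Using z = 1.96 in place of 2.262 would give a narrower interval, which is not justified with only n = 10 observations.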
Example (a variant of Problem 62, Ch5), continued
For the same sample of n = 10 manufacturing times from a normal
population (x̄ = 64.95, s = 2.42), what is a 95% prediction interval for
the manufacturing time of the 11th component?
Proposition
A prediction interval (PI) for a single observation to be selected from a
normal population distribution is
\[ \bar{x} \pm t_{\alpha/2,\,n-1} \cdot s \sqrt{1 + \frac{1}{n}} \]
The prediction level is 100(1 − α)%.
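For the manufacturing-time data (n = 10), a 95% prediction interval for the next observation can be sketched as below. As before, t_{0.025,9} = 2.262 is assumed from a standard t table and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Manufacturing times for the 10 components in the example
times = [63.8, 60.5, 65.3, 65.7, 61.9, 68.2, 68.1, 64.8, 65.8, 65.4]

n = len(times)
xbar = mean(times)
s = stdev(times)  # sample standard deviation (divisor n − 1)

T_CRIT = 2.262    # t_{0.025, 9}, assumed from a standard t table

# The factor sqrt(1 + 1/n) accounts for the variability of the new observation
# itself in addition to the uncertainty in x̄.
half_width = T_CRIT * s * sqrt(1 + 1 / n)
pred = (xbar - half_width, xbar + half_width)
print(f"95% PI for the 11th time: ({pred[0]:.2f}, {pred[1]:.2f})")  # about (59.22, 70.68)
```

The prediction interval is much wider than the confidence interval for µ computed from the same data, because a single future value varies more than a sample mean.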
Example (a variant of Problem 62, Ch5), continued
For the same sample of n = 10 manufacturing times (x̄ = 64.95, s = 2.42),
what interval captures, with 95% confidence, at least 90% of the values
in the population?
Proposition
A tolerance interval for capturing at least k% of the values in a normal
population distribution with a confidence level of 95% has the form
\[ \bar{x} \pm (\text{tolerance critical value}) \cdot s \]
The tolerance critical values for k = 90, 95, and 99 in combination with
various sample sizes are given in Appendix Table A.6.
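A sketch of the tolerance interval for the manufacturing-time data follows. The tolerance critical value 2.839 (n = 10, k = 90, 95% confidence) is an assumption taken from commonly published two-sided tolerance factor tables; it should be checked against Appendix Table A.6 before use.

```python
from statistics import mean, stdev

# Manufacturing times for the 10 components in the example
times = [63.8, 60.5, 65.3, 65.7, 61.9, 68.2, 68.1, 64.8, 65.8, 65.4]

xbar = mean(times)
s = stdev(times)  # sample standard deviation (divisor n − 1)

# Assumed tolerance critical value for n = 10, k = 90%, 95% confidence.
# Verify against Appendix Table A.6.
TOL_CRIT = 2.839

tol = (xbar - TOL_CRIT * s, xbar + TOL_CRIT * s)
print(f"95% tolerance interval for 90% of values: ({tol[0]:.2f}, {tol[1]:.2f})")
# about (58.09, 71.81) with the assumed critical value
```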
Confidence Intervals for the Variance of a Normal Population
Example (a variant of Problem 62, Ch5), continued
For the same sample of n = 10 manufacturing times (x̄ = 64.95, s = 2.42),
what is a 95% confidence interval for the population variance σ²?
Theorem
Let X1, X2, . . . , Xn be a random sample from a normal distribution with
mean µ and variance σ². Then the random variable
\[ \frac{(n-1)S^2}{\sigma^2} = \frac{\sum (X_i - \bar{X})^2}{\sigma^2} \]
has a chi-squared (χ²) probability distribution with n − 1 degrees of
freedom (df).
Notation
Let $\chi^2_{\alpha,\nu}$, called a chi-squared critical value, denote the number on the
measurement axis such that α of the area under the chi-squared curve
with ν df lies to the right of $\chi^2_{\alpha,\nu}$.
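A short sketch of how these critical values lead to an interval for σ², assuming a normal population so that (n − 1)S²/σ² has a chi-squared distribution with n − 1 df. Because the chi-squared curve is not symmetric, two different critical values are needed, one for each tail:

```latex
\[
P\!\left( \chi^2_{1-\alpha/2,\,n-1} \;<\; \frac{(n-1)S^2}{\sigma^2} \;<\; \chi^2_{\alpha/2,\,n-1} \right) = 1 - \alpha
\]
\[
\Longleftrightarrow\quad
P\!\left( \frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}} \;<\; \sigma^2 \;<\; \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}} \right) = 1 - \alpha
\]
```

Taking reciprocals reverses the inequalities, which is why the larger critical value ends up in the lower limit.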
Proposition
A 100(1 − α)% confidence interval for the variance σ² of a normal
population has lower limit
\[ (n-1)s^2 / \chi^2_{\alpha/2,\,n-1} \]
and upper limit
\[ (n-1)s^2 / \chi^2_{1-\alpha/2,\,n-1} \]
A confidence interval for σ has lower and upper limits that are the
square roots of the corresponding limits in the interval for σ².
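For the manufacturing-time data (n = 10), the interval can be sketched as below. The chi-squared critical values for 9 df, χ²_{0.025,9} = 19.023 and χ²_{0.975,9} = 2.700, are assumed from a standard chi-squared table; the variable names are illustrative.

```python
from math import sqrt
from statistics import variance

# Manufacturing times for the 10 components in the example
times = [63.8, 60.5, 65.3, 65.7, 61.9, 68.2, 68.1, 64.8, 65.8, 65.4]

n = len(times)
s2 = variance(times)  # sample variance s² (divisor n − 1), about 5.84

# Chi-squared critical values for 9 df, assumed from a standard table
CHI2_RIGHT = 19.023   # χ²_{0.025, 9}: area 0.025 to its right
CHI2_LEFT = 2.700     # χ²_{0.975, 9}: area 0.975 to its right

var_ci = ((n - 1) * s2 / CHI2_RIGHT, (n - 1) * s2 / CHI2_LEFT)
sd_ci = (sqrt(var_ci[0]), sqrt(var_ci[1]))

print(f"95% CI for the variance: ({var_ci[0]:.3f}, {var_ci[1]:.3f})")  # about (2.762, 19.461)
print(f"95% CI for the std dev:  ({sd_ci[0]:.3f}, {sd_ci[1]:.3f})")    # about (1.662, 4.411)
```

The interval is not symmetric about s² ≈ 5.84, reflecting the skewness of the chi-squared distribution.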