Six Sigma Black Belt Training

LSSG Black Belt Training
Estimation: Central Limit Theorem and
Confidence Intervals
Central Limit Theorem
Assume a population with a non-normal distribution.
Mean = µ
Stdev = σ
If we took a sample of size 50 from this population, what would it look like?
CLT - Multiple Samples from the same Population
Each sample of n=50 from the same
population will tend to look like the
population, and the sample means
(X-bar1, X-bar2, X-bar3, ...) will be
close to the population mean.
The sample means are unbiased
estimators of the population mean.
They will vary randomly above and
below the actual population mean.
If all such samples (n=50) were
drawn, how would the sample
means be distributed?
CLT – Sampling Distribution of X-bar
Most sample means will be close to the population mean.
 Some sample means will be a little farther away.
 A few will be quite a bit off the mark.
 A rare number will be extremely far away.
In other words, the X-bar values will be approximately normally distributed,
centered on the mean of the sample means.
How would the mean of this distribution compare to the original
population mean? How about the standard deviation of this distribution?
How would sample size affect this relationship?
Central Limit Theorem Statement
For sufficiently large sample sizes (typically n>30), the distribution of
the sample means (X-Bar) is approximately normal, and
1. Mean of sample means = Population Mean (µ)
2. Standard Deviation of sample means =
(Std Dev. of Population / square root of n), i.e. σ/√n
This standard deviation of the sample means is also called the standard error.
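The two parts of the statement can be checked with a small simulation. The sketch below is illustrative only: the exponential population (mean 1, standard deviation 1), the 5000 repetitions, the seed, and the function name simulate_sample_means are assumptions chosen for the example, not part of the course material.

```python
import random
import statistics

def simulate_sample_means(n=50, num_samples=5000, seed=1):
    """Draw repeated samples from a skewed population and return their means."""
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        # Exponential population with rate 1: mean = 1, std dev = 1 (right-skewed, non-normal).
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

n = 50
means = simulate_sample_means(n=n)
mean_of_means = statistics.fmean(means)   # should be close to the population mean, 1
sd_of_means = statistics.stdev(means)     # should be close to sigma / sqrt(n)
predicted_se = 1.0 / n ** 0.5             # sigma / sqrt(n) with sigma = 1

print(f"mean of sample means:    {mean_of_means:.3f} (population mean 1.000)")
print(f"std dev of sample means: {sd_of_means:.3f} (predicted SE {predicted_se:.3f})")
```

Rerunning with a larger n should shrink the spread of the sample means in proportion to 1/√n.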
Additional inference:
Since the X-bars are approximately normally distributed, about 95% of all
samples (large enough n) from a population will yield an X-bar that is
within 2 standard errors of the population mean.
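The "2 standard errors" rule of thumb can be compared with the exact normal values. This is a minimal sketch using Python's standard-library NormalDist; the variable names are only illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution (mean 0, std dev 1)

# Probability that a normally distributed X-bar falls within 2 standard errors of the mean.
within_2_se = z.cdf(2) - z.cdf(-2)

# Number of standard errors that captures exactly 95% (2.5% left in each tail).
z_95 = z.inv_cdf(0.975)

print(f"P(within 2 SE) = {within_2_se:.4f}")
print(f"z for 95%      = {z_95:.2f}")
```

Two standard errors cover slightly more than 95%, which is why the exact multiplier for 95% is a little below 2.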
Confidence Intervals
We take a sample of 64 parts from a population, and want to
estimate the population mean of the part length. The sample
mean is 25 mm. The population standard deviation is known to
be 0.16 mm.
From CLT, we know that this sample mean (25) is within 2 standard
errors (actually 1.96) of the population mean, with 95%
confidence.
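Hence the reverse is also true: the population mean is within 2
standard errors of this sample mean, with 95% confidence.
Thus, the population mean is estimated by
X-bar ± 2 * SE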
Here, SE = 0.16 / √64 = 0.16/8 = 0.02
Thus 95% CI for µ is given by
25 ± 2*0.02, or 25 ± 0.04 mm
The value 0.04 is the Margin of Error (MOE)
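The same interval can be computed in a few lines. This sketch assumes σ = 0.16 mm, the value used in the standard error calculation above, and uses the exact multiplier 1.96 rather than the rounded 2, so the margin of error comes out slightly under 0.04. The function name z_confidence_interval is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for the population mean when sigma is known."""
    se = sigma / sqrt(n)                              # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)    # about 1.96 for 95%
    moe = z * se                                      # margin of error
    return sample_mean - moe, sample_mean + moe, moe

# Values from the worked example: X-bar = 25 mm, n = 64, sigma assumed to be 0.16 mm.
low, high, moe = z_confidence_interval(25.0, 0.16, 64)
print(f"95% CI: {low:.4f} mm to {high:.4f} mm (MOE {moe:.4f} mm)")
```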
Confidence Intervals – Unknown σ
In reality, σ is generally unknown, and must be substituted
with s, the sample standard deviation. In that case, the
margin of error is higher, and is computed using the
t-distribution rather than the standard normal (z-distribution). Thus
instead of 1.96 standard errors for 95% confidence, we
use a larger number obtained from the t-tables.
(In Excel, type =tinv(0.05,df), where df is the degrees of
freedom, equal to n-1. Here in the previous problem, df is
63).
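For readers without Excel or t-tables, the two-tailed critical value can be approximated numerically. The sketch below is an illustration using only the Python standard library (Simpson's rule for the t CDF plus bisection); it is not how Excel computes the value, and the function names t_pdf, t_cdf and t_critical are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using symmetry about zero."""
    if x < 0:
        return 1.0 - t_cdf(-x, df, steps)
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(alpha, df):
    """Two-tailed critical value: t such that P(|T| > t) = alpha."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):  # bisection on the monotone CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

t_95 = t_critical(0.05, 63)  # same inputs as the Excel formula, with df = 63
print(f"t critical value for 95% confidence, df=63: {t_95:.3f}")
```

With 63 degrees of freedom the result is only slightly larger than 1.96; the gap widens for smaller samples.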