Uploaded by Ogochukwu A.D.

ADMS 2320 Notes

ADMS 2320: Business Statistics
Chapter 10: Introduction To Estimation
We use sample data to estimate a population parameter by using estimators.
Confidence interval estimator of 𝜇.
It is used to estimate the population mean when the population standard deviation is known.
$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
The probability 1 - 𝛼 is called the confidence level; CI = 1 - 𝛼
$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ is called the lower confidence limit (LCL).
$\bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ is called the upper confidence limit (UCL).
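Both sides of the confidence interval formula are based on the standardized sample mean
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$,
which lies between $-z_{\alpha/2}$ and $z_{\alpha/2}$ with probability 1 - 𝛼.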
Rearranging the formula, the confidence interval can also be written as:
$\mu = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
where, for the symbol ±, the “+” gives the upper confidence limit (UCL) and the “-” gives the
lower confidence limit (LCL).
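As a sketch of where the interval comes from, assume (as these notes do) that the sample mean is normally distributed with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$. The probability statement for the standardized mean can then be rearranged step by step to isolate $\mu$:

```latex
\begin{align*}
P\left(-z_{\alpha/2} < \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) &= 1-\alpha \\
P\left(-z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \bar{x}-\mu < z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) &= 1-\alpha \\
P\left(\bar{x}-z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x}+z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) &= 1-\alpha
\end{align*}
```

The last line has the LCL on the left and the UCL on the right of 𝜇.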
Examples
Q10.27: How many rounds of golf do physicians (who play golf) play per year? A survey of 12 physicians
revealed the following numbers:
3   41   17   1   33   37   18   15   17   12   29   51
Estimate with 95% confidence the mean number of rounds per year played by physicians, assuming that
the number of rounds is normally distributed with a standard deviation of 12.
S10.27:
This question wants us to estimate the population mean (𝜇) when population SD (𝛔) is known.
Use confidence interval estimator of 𝜇: $\mu = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
CI = 1 – 𝛼
.95 = 1 – 𝛼
𝛼 = .05
𝜎 = 12
n = 12
$\bar{x} = \frac{\sum x_i}{n} = \frac{3 + 41 + 17 + 1 + 33 + 37 + 18 + 15 + 17 + 12 + 29 + 51}{12} = \frac{274}{12} = 22.83$
$z_{\alpha/2} = z_{.05/2} = z_{.025}$; the area to the left of $z_{.025}$ is 1 – .025 = .975.
From the z table, $z_{.025} = 1.96$
$\mu = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 22.83 \pm z_{.05/2}\frac{12}{\sqrt{12}} = 22.83 \pm 1.96\frac{12}{\sqrt{12}} = 22.83 \pm 6.79$
16.04 < 𝜇 < 29.62
We estimate with 95% confidence that the mean number of rounds of golf physicians play per year is between 16.04 and 29.62.
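The calculation in S10.27 can be checked with a short Python sketch. The function name `z_interval` is a hypothetical helper, not something from the textbook, and it takes the z value from the standard library's normal distribution instead of a printed table, so later decimals may differ slightly from hand calculations.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_interval(data, sigma, confidence):
    """Return (LCL, UCL) for the population mean when sigma is known."""
    n = len(data)
    x_bar = mean(data)
    alpha = 1 - confidence
    # z_{alpha/2} is the point with area 1 - alpha/2 to its left
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Rounds of golf reported by the 12 physicians in Q10.27
rounds = [3, 41, 17, 1, 33, 37, 18, 15, 17, 12, 29, 51]
lcl, ucl = z_interval(rounds, sigma=12, confidence=0.95)
print(f"{lcl:.2f} < mu < {ucl:.2f}")
```

Rounded to two decimals, the limits agree with the 16.04 and 29.62 found by hand.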
Q10.37: A statistics professor is in the process of investigating how many classes university students
miss each semester. To help answer this question, she took a random sample of 100 university students
and asked each to report how many classes he or she had missed in the previous semester. Estimate
the mean number of classes missed by all students at the university. Use a 99% confidence level and
assume that the population standard deviation is known to be 2.2 classes.
S10.37:
This question wants us to estimate the population mean (𝜇) when population SD (𝛔) is known.
Use confidence interval estimator of 𝜇: $\mu = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
𝜎 = 2.2
CI = 99%
n = 100
CI = 1 – 𝛼
.99 = 1 – 𝛼
𝛼 = .01
𝑥̅ = 10.21 (from Appendix A of the textbook)
$z_{\alpha/2} = z_{.01/2} = z_{.005}$; the area to the left of $z_{.005}$ is 1 – .005 = .995.
From the z table, $z_{.005} = 2.57$
$\mu = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 10.21 \pm (2.57)\frac{2.2}{\sqrt{100}} = 10.21 \pm 0.5654$
9.64 < 𝜇 < 10.78
LCL = 9.64
UCL = 10.78
We estimate with 99% confidence that the mean number of classes missed by all students at the university is between 9.64 and 10.78.
Q10.13:
a. A statistics practitioner took a random sample of 50 observations from a population with a
standard deviation of 25 and computed the sample mean to be 100. Estimate the population
mean with 90% confidence.
b. Repeat part (a) using a 95% confidence level.
c. Repeat part (a) using a 99% confidence level.
d. Describe the effect on the confidence interval estimate of increasing the confidence level.
S10.13:
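No worked solution for Q10.13 appears in these notes, so the following is only a sketch of how parts (a) to (c) could be computed from the summary statistics given in the question (n = 50, 𝜎 = 25, 𝑥̅ = 100). The helper name `z_interval_from_summary` is hypothetical, and the z values come from the standard library, not a printed table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval_from_summary(x_bar, sigma, n, confidence):
    """Return (LCL, UCL) for the mean from summary statistics, sigma known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

intervals = {}
for confidence in (0.90, 0.95, 0.99):
    lcl, ucl = z_interval_from_summary(x_bar=100, sigma=25, n=50, confidence=confidence)
    intervals[confidence] = (lcl, ucl)
    print(f"{confidence:.0%}: {lcl:.2f} < mu < {ucl:.2f} (width {ucl - lcl:.2f})")
```

For part (d), the printed widths grow as the confidence level rises: a higher confidence level requires a larger $z_{\alpha/2}$ and therefore a wider interval.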
Determining sample size:
$n = \left(\frac{z_{\alpha/2}\,\sigma}{B}\right)^2$
Solve for n when the population standard deviation 𝛔, the confidence level 1 - 𝛼, and the bound on the error
of estimation B are known.
Any non-integer value must be rounded up.
E.g. If $n = \left(\frac{z_{\alpha/2}\,\sigma}{B}\right)^2 = 84.41$, round up to 85.
If $n = \left(\frac{z_{\alpha/2}\,\sigma}{B}\right)^2 = 389.67$, round up to 390.
Examples
Q10.47:
a. Determine the sample size required to estimate a population mean to within 10 units given that
the population standard deviation is 50. A confidence level of 90% is judged to be appropriate.
b. Repeat part (a) changing the standard deviation to 100.
c. Re-do part (a) using a 95% confidence level.
d. Repeat part (a) wherein we wish to estimate the population mean to within 20 units.
S10.47:
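Determining sample size to estimate a mean: $n = \left(\frac{z_{\alpha/2}\,\sigma}{B}\right)^2$,
where B, the margin of error $z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, is given.
a. 𝜎 = 50
B = 10
CI = 90% or 0.90 (CI = 1 - 𝛼), thus 𝛼 = .10
$z_{\alpha/2} = z_{.10/2} = z_{.05}$; the area to the left of $z_{.05}$ is 1 – .05 = .95.
From the z table, $z_{.05} = 1.645$
$n = \left(\frac{z_{\alpha/2}\,\sigma}{B}\right)^2 = \left(\frac{1.645 \times 50}{10}\right)^2 = 67.6506$ (round up) ≈ 68
b. 𝜎 = 100
$n = \left(\frac{z_{\alpha/2}\,\sigma}{B}\right)^2 = \left(\frac{1.645 \times 100}{10}\right)^2 = 270.6025$ (round up) ≈ 271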
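c. .95 = 1 - 𝛼, thus 𝛼 = .05
$z_{\alpha/2} = z_{.05/2} = z_{.025}$; the area to the left of $z_{.025}$ is 1 – .025 = .975.
From the z table, $z_{.025} = 1.96$
$n = \left(\frac{z_{\alpha/2}\,\sigma}{B}\right)^2 = \left(\frac{1.96 \times 50}{10}\right)^2 = 96.0400$ (round up) ≈ 97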
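d. B = 20 (𝜎 = 50 and 90% confidence as in part a, with $z_{.05} = 1.645$)
$n = \left(\frac{z_{\alpha/2}\,\sigma}{B}\right)^2 = \left(\frac{1.645 \times 50}{20}\right)^2 = 16.9127$ (round up) ≈ 17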
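The four parts of Q10.47 differ only in 𝜎, B, or the confidence level, so they can be compared with one small function. This is a minimal sketch: `sample_size` is a hypothetical helper, and it uses the exact normal quantile from the standard library, so a result can differ by a unit or two from a hand calculation that uses a z value rounded to two decimals.

```python
from math import ceil
from statistics import NormalDist

def sample_size(sigma, bound, confidence):
    """Smallest n that estimates the mean to within `bound` (sigma known)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n_exact = (z * sigma / bound) ** 2
    # Any non-integer value must be rounded up
    return ceil(n_exact)

print("a:", sample_size(sigma=50, bound=10, confidence=0.90))
print("b:", sample_size(sigma=100, bound=10, confidence=0.90))
print("c:", sample_size(sigma=50, bound=10, confidence=0.95))
print("d:", sample_size(sigma=50, bound=20, confidence=0.90))
```

Doubling 𝜎 roughly quadruples n, raising the confidence level increases n, and doubling B cuts n to about a quarter.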
Q10.59: A statistics professor wants to compare today’s students with those 25 years ago. All his
current students’ marks are stored on a computer so that he can easily determine the population mean.
However, the marks 25 years ago reside only in his musty files. He does not want to retrieve all the
marks and will be satisfied with a 95% confidence interval estimate of the mean mark 25 years ago. If
he assumes that the population standard deviation is 12, how large a sample should he take to estimate
the mean to within 2 marks?
S10.59:
We want to find the sample size to estimate the mean.
The margin of error of the estimate should be no more than 2 marks.
B=2
CI = 95%
.95 = 1 - 𝛼
𝛼 = .05
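$z_{\alpha/2} = z_{.05/2} = z_{.025}$; the area to the left of $z_{.025}$ is 1 – .025 = .975.
From the z table, $z_{.025} = 1.96$
$n = \left(\frac{z_{\alpha/2}\,\sigma}{B}\right)^2 = \frac{(1.96 \times 12)^2}{2^2} = 138.2976$ (round up) ≈ 139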
He should take a sample size of 139 to estimate the mean to within 2 marks.
In Chapters 10 and 11, we made inferences about a population mean 𝜇 when the standard deviation 𝛔 is
known, using the z-statistic and the z-estimator of 𝜇.
In this chapter, we look at a more realistic situation, in which 𝛔 is unknown.
We use the t-statistic and the t-estimator instead.
Inference about a population mean 𝜇 when the standard deviation 𝛔 is unknown.
When the population standard deviation is unknown and the population is normal, the test statistic for
testing hypotheses about 𝜇 is
$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
which is Student t-distributed with 𝜈 = 𝑛 − 1 degrees of freedom.
This formula is similar to the z-statistic, except that 𝛔 is replaced by s, the sample
standard deviation.
Confidence interval estimator of the population mean 𝜇 when the standard deviation 𝛔 is unknown is
$\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}}$ ; $\nu = n - 1$
where
$\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}}$ is the lower confidence limit (LCL)
$\bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}$ is the upper confidence limit (UCL)
Examples
Q12.15:
A random sample of 8 observations was drawn from a normal population. The sample mean and sample
standard deviation are x̅ = 40 and s = 10.
a. Estimate the population mean with 95% confidence.
b. Repeat part (a) assuming that you know that the population standard deviation is 𝛔 = 10.
c. Explain why the interval estimate produced in part (b) is narrower than that in part (a).
S12.15:
a. CI = 1 – 𝛼
.95 = 1 – 𝛼
𝛼 = .05
Since the sample standard deviation is given instead of the population standard deviation, use the confidence interval estimator of 𝜇
when 𝛔 is unknown:
$\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}}$ ; $\nu = n - 1$
$t_{\alpha/2,\,\nu} = t_{\alpha/2,\,n-1} = t_{.025,\,7} = 2.365$ (use t table)
$\mu = 40 \pm t_{.025,\,7}\frac{10}{\sqrt{8}} = 40 \pm 2.365\,(3.536) = 40 \pm 8.36$
LCL = 31.64 and UCL = 48.36
b. 𝛔 = 10
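Since 𝛔 is given, use confidence interval estimator of 𝜇 when 𝛔 is known (from Chapter 10):
$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
$z_{\alpha/2} = z_{.05/2} = z_{.025} = 1.96$
$\mu = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 40 \pm z_{.025}\frac{10}{\sqrt{8}} = 40 \pm 1.96\frac{10}{\sqrt{8}} = 40 \pm 6.93$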
LCL = 33.07 and UCL = 46.93
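c. The Student t distribution is more widely spread out than the standard normal distribution.
∴ $z_{\alpha/2} = 1.96$ is smaller than $t_{\alpha/2} = 2.365$, so the interval in part (b) is narrower than the interval in part (a).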
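The comparison in Q12.15 can be sketched in a few lines of Python. The critical values are entered as constants copied from the tables used in the notes (Python's standard library has no Student t quantile function), and the helper name `interval` is hypothetical.

```python
from math import sqrt

def interval(x_bar, spread, n, critical_value):
    """Return (LCL, UCL) given a critical value and s or sigma as `spread`."""
    margin = critical_value * spread / sqrt(n)
    return x_bar - margin, x_bar + margin

T_025_7 = 2.365  # t table value for alpha/2 = .025 with 7 degrees of freedom
Z_025 = 1.96     # z table value for alpha/2 = .025

t_lcl, t_ucl = interval(x_bar=40, spread=10, n=8, critical_value=T_025_7)
z_lcl, z_ucl = interval(x_bar=40, spread=10, n=8, critical_value=Z_025)

print(f"t interval: {t_lcl:.2f} to {t_ucl:.2f} (width {t_ucl - t_lcl:.2f})")
print(f"z interval: {z_lcl:.2f} to {z_ucl:.2f} (width {z_ucl - z_lcl:.2f})")
```

With the same 𝑥̅, n, and spread, the only difference between the two intervals is the critical value, which is why the z interval is the narrower one.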
Q12.19:
a. A random sample of 11 observations was taken from a normal population. The sample mean
and standard deviation are x̅ = 74.5 and s = 9. Can we infer at the 5% significance level that the
population mean is greater than 70?
b. Repeat part (a) assuming that you know that the population standard deviation is 9 .
c. Explain why the conclusions produced in parts (a) and (b) differ.
S12.19:
a. n = 11, x̅ = 74.5, s = 9, and 𝛔 is unknown.
𝛼 = 0.05
H0: 𝜇 = 70
H1: 𝜇 > 70
Rejection Region:
(right tail test) $t_{\alpha,\,\nu} = t_{.05,\,11-1} = t_{.05,\,10} = 1.812$ (from t table)
Therefore reject H0 if $t > t_{\alpha} = 1.812$
$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{74.5 - 70}{9/\sqrt{11}} = 1.6583$
∵ 𝑡=1.6583 < 𝑡𝛼 = 1.812 ∴ Do not reject H0
There is not enough evidence to infer that the population mean is greater than 70.
b. H0: 𝜇 = 70
H1: 𝜇 > 70
Population standard deviation 𝛔 is given = 9 (use Z distribution)
Rejection Region:
(right tail test) $z_{\alpha} = z_{.05} = 1.645$ (from Z table)
Therefore reject H0 if $z > z_{\alpha} = 1.645$
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{74.5 - 70}{9/\sqrt{11}} = 1.6583 \approx 1.66$
Since z = 1.6583 > $z_{\alpha}$ = 1.645
∴ Reject H0
And (optional) p-value = P(z > 1.66) = 1 - .9515 = .0485 (< 0.05, so H0 is rejected)
There is enough evidence to infer that the population mean is greater than 70.
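c. The Student t distribution is more widely spread out than the standard normal distribution, so the critical value $t_{.05,\,10} = 1.812$ is larger than $z_{.05} = 1.645$. The same test statistic, 1.6583, falls in the rejection region in part (b) but not in part (a).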
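The two tests in Q12.19 use the same arithmetic and differ only in the critical value, which the following sketch makes explicit. The t critical value is a constant copied from the t table in the notes, the z critical value and p-value come from the standard library, and `standardized_statistic` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def standardized_statistic(x_bar, mu_0, spread, n):
    """(x_bar - mu_0) / (spread / sqrt(n)); spread is s or sigma."""
    return (x_bar - mu_0) / (spread / sqrt(n))

stat = standardized_statistic(x_bar=74.5, mu_0=70, spread=9, n=11)

T_05_10 = 1.812                      # t table value, alpha = .05, 10 d.f.
z_crit = NormalDist().inv_cdf(0.95)  # about 1.645

reject_t = stat > T_05_10            # part (a): sigma unknown
reject_z = stat > z_crit             # part (b): sigma known
p_value_z = 1 - NormalDist().cdf(stat)

print(f"statistic = {stat:.4f}")
print(f"part (a): reject H0? {reject_t}")
print(f"part (b): reject H0? {reject_z} (p-value {p_value_z:.4f})")
```

The p-value printed here uses the unrounded statistic, so it can differ in the fourth decimal from the table lookup at z = 1.66.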
Q12.30:
University bookstores order books that instructors adopt for their courses. The number of copies
ordered matches the projected demands. However, at the end of the semester, the bookstore has too
many copies on hand and must return them to the publisher. A bookstore has a policy that the
proportion of books returned should be kept as small as possible. The average is supposed to be less
than 10%. To see whether the policy is working, random sample of book titles was drawn, and the
fraction of the total originally ordered that are returned is recorded and listed here. Can we infer at the
10% significance level that the mean proportion of returns is less than 10%?
4   15   11   7   5   9   4   3   5   8
S12.30:
n = 10 𝛼 = 0.10 𝛔 is unknown
H0: 𝜇 = 10
H1: 𝜇 < 10 (mean proportion of returns is less than 10%)
Sample mean and standard deviation:
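$\bar{x} = \frac{\sum x_i}{n} = \frac{4+15+11+7+5+9+4+3+5+8}{10} = \frac{71}{10} = 7.1$
$s^2 = \frac{1}{n-1}\left[\sum x_i^2 - \frac{(\sum x_i)^2}{n}\right] = \frac{1}{10-1}\left[631 - \frac{71^2}{10}\right] = \frac{126.9}{9} = 14.1$ ∴ $s = \sqrt{14.1} = 3.7550$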
Rejection Region:
(left tail test) $t < -t_{\alpha,\,n-1} = -t_{.10,\,9} = -1.383$ (t table)
Test statistic:
$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{7.1 - 10}{3.7550/\sqrt{10}} = -2.4422$
∵ $t < -t_{\alpha}$ → -2.4422 < -1.383
∴ Reject H0
There is enough evidence to infer that the mean proportion of returns is less than 10%.
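The sample statistics and test statistic in S12.30 can be reproduced from the raw data. This is a sketch using only the standard library, where `statistics.stdev` divides by n − 1 and therefore returns the sample standard deviation s. The critical value is a constant copied from the t table used in the notes.

```python
from math import sqrt
from statistics import mean, stdev

# Percentage of each title's order that was returned (data from Q12.30)
returns = [4, 15, 11, 7, 5, 9, 4, 3, 5, 8]

n = len(returns)
x_bar = mean(returns)
s = stdev(returns)                     # sample standard deviation (n - 1 divisor)
t_stat = (x_bar - 10) / (s / sqrt(n))  # H0: mu = 10

T_10_9 = 1.383                         # t table value, alpha = .10, 9 d.f.
reject = t_stat < -T_10_9              # left tail test

print(f"x_bar = {x_bar:.1f}, s = {s:.4f}, t = {t_stat:.4f}, reject H0: {reject}")
```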
Inference about a population variance.
We are interested in drawing inferences about a population’s variability, so the parameter we need to
investigate is the population variance $\sigma^2$.
E.g. H1: $\sigma^2$ > # (where # is a specified value)
Test statistic for $\sigma^2$:
The test statistic used to test hypotheses about $\sigma^2$ is
$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$
which is chi-squared distributed with 𝜈 = 𝑛 − 1 degrees of freedom when the population random
variable is normally distributed with variance equal to $\sigma^2$.
Confidence interval estimator of $\sigma^2$:
Lower Confidence Limit (LCL) = $\frac{(n-1)s^2}{\chi^2_{\alpha/2}}$
Upper Confidence Limit (UCL) = $\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$
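As a sketch of why the larger critical value appears in the lower limit, assume the subscript convention these notes appear to use, in which $\chi^2_{A}$ is the point with area $A$ to its right (so $\chi^2_{1-\alpha/2} < \chi^2_{\alpha/2}$). Starting from the test statistic and solving for $\sigma^2$ reverses the inequalities:

```latex
\begin{align*}
P\left(\chi^2_{1-\alpha/2} < \frac{(n-1)s^2}{\sigma^2} < \chi^2_{\alpha/2}\right) &= 1-\alpha \\
P\left(\frac{1}{\chi^2_{\alpha/2}} < \frac{\sigma^2}{(n-1)s^2} < \frac{1}{\chi^2_{1-\alpha/2}}\right) &= 1-\alpha \\
P\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}\right) &= 1-\alpha
\end{align*}
```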
Q12.71:
a. The sample variance of a random sample of 50 observations from a normal population was found to
be $s^2$ = 80. Can we infer at the 1% significance level that $\sigma^2$ is less than 100?
n = 50, $s^2$ = 80, 𝛼 = 0.01
H0: $\sigma^2 = 100$
H1: $\sigma^2 < 100$
Rejection Region: $\chi^2 < \chi^2_{1-\alpha,\,n-1} = \chi^2_{.99,\,49} \approx 29.7$ ($\chi^2$ table)
Reject H0 if $\chi^2 < 29.7$
$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(50-1)80}{100} = 39.2$ ; p-value = .1596
∵ $\chi^2 = 39.2 > \chi^2_{1-\alpha} = 29.7$
∴ Do not reject H0
There is not enough evidence to conclude that the population variance (𝛔2) is less than 100.
b. Repeat part (a) increasing the sample size to 100.
n = 100
Reject H0 if $\chi^2 < \chi^2_{1-.01,\,100-1} = \chi^2_{.99,\,99} \approx 70.1$ ($\chi^2$ table)
$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(100-1)80}{100} = 79.2$ ; p-value = .0714
∵ $\chi^2 = 79.2 > \chi^2_{1-\alpha} = 70.1$
∴ Do not reject H0
There is not enough evidence to infer that the population variance (𝛔2) is less than 100.
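Both parts of Q12.71 apply the same statistic with a different n, which a short sketch can show side by side. The critical values are the approximate table values quoted in the notes (the standard library has no chi-squared quantile function), and `chi_squared_statistic` is a hypothetical helper name.

```python
def chi_squared_statistic(n, sample_variance, sigma_sq_0):
    """(n - 1) * s^2 / sigma_0^2 for a test about a population variance."""
    return (n - 1) * sample_variance / sigma_sq_0

# n -> approximate chi-squared table critical value for alpha = .01 (left tail)
critical_values = {50: 29.7, 100: 70.1}

results = {}
for n, critical in critical_values.items():
    stat = chi_squared_statistic(n, sample_variance=80, sigma_sq_0=100)
    reject = stat < critical  # H1: sigma^2 < 100, so reject for small values
    results[n] = (stat, reject)
    print(f"n = {n}: chi-squared = {stat:.1f}, critical = {critical}, reject H0: {reject}")
```

In both cases the statistic stays above the critical value, so H0 is not rejected, although the larger sample brings the statistic much closer to the rejection region.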