252solnA2 1/31/08 (Open this document in 'Page Layout' view!)
A. Parameter Estimation
1. Review of the Normal Distribution
A1
2. Point and Interval Estimation
3. A Confidence Interval for the Mean when the Population Variance is Known.
4. A Confidence Interval for the Mean when the Population Variance is not Known.
A2, text 8.20, 8.50 [8.21, 8.50] (8.21, 8.50) – Answers from both editions will be provided for 8.21. 8.95 on CD (8.93)
Graded Assignment 1 (Will be posted)
5. Deciding on Sample Size when working with a Mean
A3, 8.38 [8.36] (8.36)
6. A Confidence Interval for a Proportion.
Text 8.24, 8.25, 8.26, 8.58, 8.94 on CD [8.22, 8.23, 8.24, 8.58, 8.93a,c on CD] (8.22, 8.23, 8.25, 8.58, 8.91a,c)
7. A Confidence Interval for a Variance.
Text 12.1-12.2 [9.72] (9.67), A4
8. (A Confidence Interval for a Median.)
Optional - A5 -- solution is posted.
-----------------------------------------------------------------------------
Problems A2 through 8.36 are in this document.
Problems involving A Confidence Interval for the Mean
PROBLEM A2: If $n = 64$ and $\bar{x} = 11.50$, find 95% confidence intervals for the mean under the following circumstances:
a. $\sigma = 6.30$, $N = 3000$
b. $\sigma = 6.30$, $N = 300$
c. $s = 6.30$, $N = 3000$
d. $s = 6.30$, $N = 300$
Solution: Use the formulas from Table 3 of the syllabus supplement or from the outline.
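a) $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = 11.50 \pm 1.960(0.7875) = 11.50 \pm 1.54$, or 9.96 to 13.04.
Here $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{6.30}{\sqrt{64}} = 0.7875$ and $z_{\alpha/2} = z_{.025} = 1.960$. More formally, we can say $P(9.96 \le \mu \le 13.04) = .95$ or make a diagram.
b) $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = 11.50 \pm 1.960(0.6996) = 11.50 \pm 1.37$, or 10.13 to 12.87.
Here $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{6.30}{\sqrt{64}}\sqrt{\frac{300-64}{300-1}} = 0.7875\sqrt{\frac{236}{299}} = 0.7875(0.8884) = 0.6996$ and $z_{\alpha/2} = z_{.025} = 1.960$.
c) $\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\,s_{\bar{x}} = 11.50 \pm 1.998(0.7875) = 11.50 \pm 1.57$, or 9.93 to 13.07.
Here $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{6.30}{\sqrt{64}} = 0.7875$ and $t^{(n-1)}_{\alpha/2} = t^{(63)}_{.025} = 1.998$.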
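d) $\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\,s_{\bar{x}} = 11.50 \pm 1.998(0.6996) = 11.50 \pm 1.40$, or 10.10 to 12.90.
Here $s_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{6.30}{\sqrt{64}}\sqrt{\frac{300-64}{300-1}} = 0.6996$ and $t^{(n-1)}_{\alpha/2} = t^{(63)}_{.025} = 1.998$.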
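The four cases of Problem A2 differ only in the critical value and in whether the finite population correction is applied, so they can be checked together. The following is a minimal sketch, not part of the original solution: the helper names `std_error` and `interval` are made up for illustration, the 5% rule for the correction is the one stated later in this document, and the t value 1.998 is copied from the table value quoted above rather than computed.

```python
from math import sqrt
from statistics import NormalDist

def std_error(sd, n, N=None):
    """Standard error of the mean. The finite population correction is
    applied only when the sample is more than 5% of the population."""
    se = sd / sqrt(n)
    if N is not None and n > 0.05 * N:
        se *= sqrt((N - n) / (N - 1))
    return se

def interval(xbar, crit, se):
    """Return (lower, upper) for xbar +/- crit * se."""
    half = crit * se
    return xbar - half, xbar + half

n, xbar, sd = 64, 11.50, 6.30
z = NormalDist().inv_cdf(0.975)  # z_.025, about 1.960
t63 = 1.998                      # t table value quoted in the solution (assumed, not computed)

results = {
    "a": interval(xbar, z, std_error(sd, n, 3000)),    # sigma known, no correction
    "b": interval(xbar, z, std_error(sd, n, 300)),     # sigma known, correction
    "c": interval(xbar, t63, std_error(sd, n, 3000)),  # s only, no correction
    "d": interval(xbar, t63, std_error(sd, n, 300)),   # s only, correction
}
for part, (lo, hi) in results.items():
    print(f"{part}) {lo:.2f} to {hi:.2f}")
```

Rounded to two decimals, the printed limits should agree with the intervals worked out by hand in parts a) through d).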
Exercise 8.21 (8th Edition): Problem asks for 95% confidence intervals for costs of hotels and cars using a
sample of 20 cities. It then asks for assumptions about the population needed to assure that interval is valid.
Row  City            Hotel x1   Car x2      x1²     x2²
 1   Seattle            174       46       30276    2116
 2   San Francisco      194       48       37636    2304
 3   Los Angeles        188       39       35344    1521
 4   Phoenix            120       32       14400    1024
 5   Denver             121       40       14641    1600
 6   Minneapolis        142       47       20164    2209
 7   Chicago            198       53       39204    2809
 8   St. Louis          125       48       15625    2304
 9   Dallas             179       51       32041    2601
10   Houston            138       48       19044    2304
11   Detroit            130       49       16900    2401
12   Cleveland          129       43       16641    1849
13   New Orleans        117       45       13689    2025
14   Pittsburgh         149       44       22201    1936
15   Atlanta            144       50       20736    2500
16   Boston             275       39       75625    1521
17   New York           311       72       96721    5184
18   Washington         216       52       46656    2704
19   Orlando            124       36       15376    1296
20   Miami              123       31       15129     961
     Sum               3297      913      598049   43169
a) For hotels, $\sum x_1 = 3297$, $\sum x_1^2 = 598049$ and $n = 20$. So $\bar{x} = \frac{\sum x}{n} = \frac{3297}{20} = 164.85$, $s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{598049 - 20(164.85)^2}{19} = \frac{54538.55}{19} = 2870.45$, and $s = \sqrt{2870.45} = 53.5766$.
We use the formula for a confidence interval when the population standard deviation is unknown. $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{53.5766}{\sqrt{20}} = 11.9801$ and $t^{(n-1)}_{\alpha/2} = t^{(19)}_{.025} = 2.093$, so
$\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\,s_{\bar{x}} = 164.85 \pm 2.093(11.9801) = 164.85 \pm 25.07$, or 139.78 to 189.92.
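b) For car rentals, $\sum x_2 = 913$, $\sum x_2^2 = 43169$ and $n = 20$. So $\bar{x} = \frac{\sum x}{n} = \frac{913}{20} = 45.65$, $s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = 78.4500$, and $s = 8.8572$. So $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{8.8572}{\sqrt{20}} = 1.9805$ and $t^{(n-1)}_{\alpha/2} = t^{(19)}_{.025} = 2.093$, so
$\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\,s_{\bar{x}} = 45.65 \pm 4.15$, or $41.50 \le \mu \le 49.80$.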
c) According to the Instructor’s Solution Manual we need to assume that the population has a normal
distribution since the sample size of 20 is not large enough to invoke the central limit theorem for distributions
that are not symmetric.
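The hand computation above works from the column sums; the same intervals can be reproduced directly from the raw 8th-edition data. This is only an illustrative sketch: the function name `t_interval` is hypothetical, and the critical value 2.093 is taken from the t table value quoted in the solution rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# 8th-edition data for the 20 cities, in table order
hotel = [174, 194, 188, 120, 121, 142, 198, 125, 179, 138,
         130, 129, 117, 149, 144, 275, 311, 216, 124, 123]
car = [46, 48, 39, 32, 40, 47, 53, 48, 51, 48,
       49, 43, 45, 44, 50, 39, 72, 52, 36, 31]

T_19_025 = 2.093  # t table value for 19 degrees of freedom, as quoted above

def t_interval(data, t_crit):
    """Return (lower, upper) for xbar +/- t * s / sqrt(n)."""
    n = len(data)
    xbar = mean(data)
    se = stdev(data) / sqrt(n)  # stdev divides by n - 1, matching the s^2 formula
    half = t_crit * se
    return xbar - half, xbar + half

hotel_ci = t_interval(hotel, T_19_025)
car_ci = t_interval(car, T_19_025)
print(f"hotel: {hotel_ci[0]:.2f} to {hotel_ci[1]:.2f}")
print(f"car:   {car_ci[0]:.2f} to {car_ci[1]:.2f}")
```

Note that the code says nothing about part c): the intervals are only trustworthy if the Normality assumption is reasonable for these data.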
Exercise 8.21 (9th Edition): Problem asks for 95% confidence intervals for costs of hotels and cars using a
sample of 20 cities. It then asks for assumptions about the population needed to assure that the interval
is valid.
Row  City            Hotel x1   Car x2      x1²     x2²
 1   Seattle            176       45       30976    2025
 2   San Francisco      178       42       31684    1764
 3   Los Angeles        223       36       49729    1296
 4   Phoenix            124       38       15376    1444
 5   Denver             139       38       19321    1444
 6   Minneapolis        167       53       27889    2809
 7   Chicago            257       51       66049    2601
 8   St. Louis          159       53       25281    2809
 9   Dallas             167       46       27889    2116
10   Houston            180       48       32400    2304
11   Detroit            141       53       19881    2809
12   Cleveland          145       40       21025    1600
13   New Orleans        142       49       20164    2401
14   Pittsburgh         148       49       21904    2401
15   Atlanta            173       46       29929    2116
16   Boston             243       46       59049    2116
17   New York           273       69       74529    4761
18   Washington         262       47       68644    2209
19   Orlando            133       40       17689    1600
20   Miami              116       39       13456    1521
     Sum               3546      928      672864   44146
a) For hotels, $\sum x_1 = 3546$, $\sum x_1^2 = 672864$ and $n = 20$. So $\bar{x} = \frac{\sum x}{n} = \frac{3546}{20} = 177.30$, $s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{672864 - 20(177.3)^2}{19} = \frac{44158.2}{19} = 2324.116$, and $s = \sqrt{2324.116} = 48.2091$.
We use the formula for a confidence interval when the population standard deviation is unknown. $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{48.2091}{\sqrt{20}} = 10.7799$ and $t^{(n-1)}_{\alpha/2} = t^{(19)}_{.025} = 2.093$, so
$\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\,s_{\bar{x}} = 177.30 \pm 2.093(10.7799) = 177.30 \pm 22.56$, or 154.74 to 199.86.
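b) For car rentals, $\sum x_2 = 928$, $\sum x_2^2 = 44146$ and $n = 20$. So $\bar{x} = \frac{\sum x}{n} = \frac{928}{20} = 46.40$, $s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = 57.2000$, and $s = 7.5631$. So $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{7.5631}{\sqrt{20}} = 1.6912$ and $t^{(n-1)}_{\alpha/2} = t^{(19)}_{.025} = 2.093$, so
$\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\,s_{\bar{x}} = 46.40 \pm 3.54$, or $42.86 \le \mu \le 49.94$.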
c) According to the Instructor’s Solution Manual we need to assume that the population has a normal
distribution since the sample size of 20 is not large enough to invoke the central limit theorem for distributions
that are not symmetric. Given the approach taken by the text to checking this, I decided to run some graphs on
Minitab. The results follow.
MTB > boxplot hotel
[Figure: Boxplot of Hotel]
MTB > pplot hotel
[Figure: Probability Plot of Hotel]
MTB > boxplot car
[Figure: Boxplot of Car]
MTB > pplot car
[Figure: Probability Plot of Car]
The Minitab 'help' menu gives the following information for understanding probability plots:
The y-axis, and sometimes the x-axis, of a probability plot are transformed so that the fitted
distribution (center blue line) forms a straight line.
If the distribution fits your data:
• The plotted points will roughly form a straight line.
• The plotted points will fall close to the fitted line.
• The Anderson-Darling statistic will be small, and the associated p-value will be larger than your chosen
α-level. (Commonly chosen levels for α include 0.05 and 0.10.)
Minitab also displays approximate 95% confidence intervals (curved blue lines) for the fitted
distribution. These confidence intervals are point-wise, meaning that they are calculated
separately for each point on the fitted distribution without controlling for family-wise error. Usually,
points outside the confidence intervals occur in the tails. In the lower half of the plot, points to the
right of the confidence band indicate that there are fewer data in the left tail than one would
expect based on the fitted distribution. In the upper half, points to the right of the confidence band
indicate that there are more data in the right tail than one would expect.
I’m not going to get into how the Anderson-Darling statistic is calculated, but the p-value on the graph
should be above .05 if the distribution does not differ from Normal. In the case of ‘Hotel,’ the two graphs
together seem to indicate definite skewness to the right. The results from ‘Car’ are much less clear, since
the trouble seems to be caused by one pesky point and the p-value is relatively large. We can say that the
confidence interval for ‘Hotel’ is not reliable and that we are not sure about ‘Car.’
Exercise 8.20 (10th Edition) (I should have assigned 8.22):
a) Asks for a 95% confidence interval for the mean number of days it takes to resolve a complaint.
Days ($x_1$), 50 observations:
54, 5, 35, 137, 31, 27, 152, 2, 123, 81,
74, 27, 11, 19, 126, 110, 110, 29, 61, 35,
94, 31, 26, 5, 12, 4, 165, 32, 29, 28,
29, 26, 25, 1, 14, 13, 13, 10, 5, 27,
4, 52, 30, 22, 36, 26, 20, 23, 33, 68
Sum: 2152
Days Squared ($x_1^2$):
2916, 25, 1225, 18769, 961, 729, 23104, 4, 15129, 6561,
5476, 729, 121, 361, 15876, 12100, 12100, 841, 3721, 1225,
8836, 961, 676, 25, 144, 16, 27225, 1024, 841, 784,
841, 676, 625, 1, 196, 169, 169, 100, 25, 729,
16, 2704, 900, 484, 1296, 676, 400, 529, 1089, 4624
Sum: 178754
For days, $\sum x_1 = 2152$, $\sum x_1^2 = 178754$ and $n = 50$. So $\bar{x} = \frac{\sum x}{n} = \frac{2152}{50} = 43.04$, $s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{178754 - 50(43.04)^2}{49} = \frac{86131.92}{49} = 1757.794286$, and $s = \sqrt{1757.794286} = 41.92606$.
We use the formula for a confidence interval when the population standard deviation is unknown. $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{41.92606}{\sqrt{50}} = \sqrt{\frac{1757.794286}{50}} = \sqrt{35.15589} = 5.9292$ and $t^{(n-1)}_{\alpha/2} = t^{(49)}_{.025} = 2.010$, so
$\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\,s_{\bar{x}} = 43.04 \pm 2.010(5.9292) = 43.04 \pm 11.92$, or 31.12 to 54.96.
b) The problem asks what assumption you should make in a). The answer is that the population should be Normally distributed.
c) The problem asks if the assumptions are seriously violated. A Normal probability plot is displayed which shows that the distribution is not Normal. Better, note that the median is 28.50, Q1 = 13.75 and Q3 = 55.75. Since the mean is above the median and the distance between the median and Q3 (27.25) is larger than the distance between the median and Q1 (14.75), we can conclude that the distribution is skewed to the right.
d) The problem asks if this is a serious problem. It is not, because the sample size is well above 30, so that $\bar{x}$ is approximately Normally distributed.
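The quartile comparison used in part c) can be reproduced from the raw data. The sketch below is illustrative only: it assumes that Python's `statistics.quantiles` with `method="exclusive"` (positions at (n+1)p) uses the same quartile convention as the values quoted above, and the variable names are made up for this example.

```python
from statistics import mean, quantiles

days = [54, 5, 35, 137, 31, 27, 152, 2, 123, 81,
        74, 27, 11, 19, 126, 110, 110, 29, 61, 35,
        94, 31, 26, 5, 12, 4, 165, 32, 29, 28,
        29, 26, 25, 1, 14, 13, 13, 10, 5, 27,
        4, 52, 30, 22, 36, 26, 20, 23, 33, 68]

# Quartiles by the (n + 1)p position rule (assumed to match the quoted values)
q1, median, q3 = quantiles(days, n=4, method="exclusive")
xbar = mean(days)

lower_spread = median - q1  # distance from Q1 up to the median
upper_spread = q3 - median  # distance from the median up to Q3

# Informal right-skew check: mean above median and a longer upper spread
right_skewed = xbar > median and upper_spread > lower_spread
print(q1, median, q3, xbar, right_skewed)
```

This is an informal check in the spirit of part c), not a formal test of Normality.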
Graphical Checks for Normality. Since the Instructor's Solution Manual used graphs to check for
Normality, I have followed suit. The column of data was placed in C1 and the commands were enabled.
The two commands used were Boxplot 'days' and pplot 'days'. The results follow.
MTB > Boxplot 'days'
[Figure: Boxplot of days]
MTB > pplot 'days'
[Figure: Probability Plot of days]
In the boxplot, skewness is shown by the spread out values above the median. In the (Normal) probability
plot, the deviation of the points from a straight line reveals that the distribution is not Normal.
Exercise 8.50: Problem asks for a 99% confidence interval for the population total using given data.
We are told that $n = 25$, $N = 500$, $\bar{x} = 25.7$, $s = 7.8$ and $\alpha = .01$.
Solution: Since the population size is exactly 20 times the sample size, the Instructor’s Solution Manual
uses a finite population correction, though it is not really necessary. We use the formula for the confidence
interval when the population standard deviation is not known.
$s_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{7.8}{\sqrt{25}}\sqrt{\frac{500-25}{500-1}} = 1.5220$ and $t^{(n-1)}_{\alpha/2} = t^{(24)}_{.005} = 2.797$.
$\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\,s_{\bar{x}} = 25.7 \pm 2.797(1.5220) = 25.7 \pm 4.26$, or 21.44 to 29.96. If we want a confidence interval
for the population total, we can multiply our results by $N = 500$. If you really need an equation, you can
write $N\mu = N\bar{x} \pm N\,t^{(n-1)}_{\alpha/2}\,s_{\bar{x}}$. If we just multiply the numbers by 500, we get 10720 to 14980. The
Instructor's Solution Manual, using a slightly more accurate value of $t$, gets
$N\bar{x} \pm N\,t\,\frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 500(25.7) \pm 500(2.7969)\frac{7.8}{\sqrt{25}}\sqrt{\frac{500-25}{500-1}}$, or $10,721.53 ≤ Population Total ≤ $14,978.47.
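The two versions of the population-total interval differ only in the t value and in when the rounding happens. A minimal sketch follows, assuming the t values quoted above (2.797 from the table, 2.7969 from the Instructor's Solution Manual); the function name `total_interval` is hypothetical.

```python
from math import sqrt

def total_interval(n, N, xbar, s, t_crit):
    """Interval for the population total N * mu, always applying
    the finite population correction as the manual does here."""
    se = (s / sqrt(n)) * sqrt((N - n) / (N - 1))
    half = t_crit * se
    return N * (xbar - half), N * (xbar + half)

n, N, xbar, s = 25, 500, 25.7, 7.8

table_t = total_interval(n, N, xbar, s, 2.797)    # table value of t
manual_t = total_interval(n, N, xbar, s, 2.7969)  # manual's more accurate t

print(f"table t:  {table_t[0]:.2f} to {table_t[1]:.2f}")
print(f"manual t: {manual_t[0]:.2f} to {manual_t[1]:.2f}")
```

With the table value the unrounded limits land slightly inside 10720 to 14980, because the hand calculation rounds the interval for the mean to two decimals before multiplying by 500.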
Exercise 8.96 on CD [8.95 in 9th edition] (8.93 in 8th Edition): A stationery store wants to estimate the
mean retail value of greeting cards that it has in its 300-card inventory. (This version uses a corrected value
of $t$.)
a. Set up a 95% confidence interval estimate of the population mean value of all greeting cards
that are in its inventory if a random sample of 20 greeting cards selected without replacement indicates an
average value of $1.67 and a standard deviation of $0.32.
b. What is your answer to (a) if the store has 500 greeting cards in its inventory?
Solution: a) Since $n = 20$ is more than 5% of $N = 300$, we need a finite population correction factor.
$s_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{0.32}{\sqrt{20}}\sqrt{\frac{300-20}{300-1}} = 0.07155\sqrt{0.9364548} = 0.07155(0.9677) = 0.069243$ and $t^{(n-1)}_{\alpha/2} = t^{(19)}_{.025} = 2.093$.
$\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\,s_{\bar{x}} = 1.67 \pm 2.093(0.069243) = 1.67 \pm 0.15$, or $1.52 to $1.82. The 10th edition answer book
gets $1.53 to $1.81.
b) Since $n = 20$ is not more than 5% of $N = 500$, we do not need a finite population correction factor.
$s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{0.32}{\sqrt{20}} = 0.07155$ and $t^{(n-1)}_{\alpha/2} = t^{(19)}_{.025} = 2.093$.
$\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\,s_{\bar{x}} = 1.67 \pm 2.093(0.07155) = 1.67 \pm 0.15$, or $1.52 to $1.82.
If we are extremely cautious and use the finite population correction factor anyway,
$s_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{0.32}{\sqrt{20}}\sqrt{\frac{500-20}{500-1}} = 0.07155\sqrt{0.9619238} = 0.07155(0.9808) = 0.070175$ and $t^{(n-1)}_{\alpha/2} = t^{(19)}_{.025} = 2.093$.
$\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\,s_{\bar{x}} = 1.67 \pm 2.093(0.070175) = 1.67 \pm 0.15$, or $1.52 to $1.82.
Check these results – there is no answer given by the author!
PROBLEM A3: In a study of a grain market in an African country we want to figure out how large a sample
we must take to find a daily average price for a grain transaction. (Assume a standard deviation of
5 cents.)
a. We want a 99% confidence interval for the mean with an error of ±1 cent.
b. What if the error is to be ±1/2 cent?
Solution: We use the formula $n = \frac{z^2\sigma^2}{e^2}$, where $z = z_{\alpha/2} = z_{.005}$ since $\alpha = .01$.
a) We are told that the maximum error must be $e = 1$ (or $e = .01$) and that $\sigma = 5$ (or $\sigma = .05$).
From the t table, $z_{.005} = 2.576$ so that $n = \frac{z^2\sigma^2}{e^2} = \frac{2.576^2(5^2)}{1^2} = 165.89$. Since we always
round this quantity up, use a sample size of at least 166. Note that if we use $n = 165$, we find that
(if we assume that $\bar{x} = 90$) $\mu = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 90 \pm 2.576\frac{5}{\sqrt{165}} = 90 \pm 1.003$. The error term will
be slightly above 1. However, if we use $n = 166$, $\mu = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 90 \pm 2.576\frac{5}{\sqrt{166}} = 90 \pm 1.000$.
b) This time the maximum allowable error is $e = 0.5$, so $n = \frac{z^2\sigma^2}{e^2} = \frac{2.576^2(5^2)}{0.5^2} = 663.57$ and
we must use a sample size of 664. Note that this sample size is four times the size in part a.
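The sample-size rule used in Problem A3 (compute $z^2\sigma^2/e^2$ and always round up) is easy to express as a small function. This is a sketch under stated assumptions: the name `sample_size` is made up for illustration, and the z value is computed from the standard Normal distribution rather than read from the table, so it carries a few more digits than 2.576.

```python
from math import ceil
from statistics import NormalDist

def sample_size(sigma, e, confidence):
    """Smallest n with z * sigma / sqrt(n) <= e, for a known sigma."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    return ceil((z * sigma / e) ** 2)        # always round up

n_a = sample_size(sigma=5, e=1, confidence=0.99)    # part a
n_b = sample_size(sigma=5, e=0.5, confidence=0.99)  # part b
print(n_a, n_b)
```

Because $e$ enters the formula squared, halving the allowable error multiplies the required sample size by about four, which is the pattern noted in part b.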
Exercise 8.38 (8.36 in 8th and 9th edition): a) The problem wants a bound $e = \pm 50$ on the error term, when
$\sigma = 400$. b) It now asks for the same result with a bound on the error term of $e = \pm 25$.
Solution: We use the formula for $n$, $n = \frac{z^2\sigma^2}{e^2}$, and since $\alpha = .05$, $z = z_{\alpha/2} = z_{.025} = 1.960$.
a) $e = 50$ and $n = \frac{z^2\sigma^2}{e^2} = \frac{1.960^2(400^2)}{50^2} = 245.862$. Since this quantity is always rounded up, we use a
sample size of 246 or more.
b) The only change is that now $e = \pm 25$. On the basis of problem A3b), we expect our answer will be four
times as large. $n = \frac{z^2\sigma^2}{e^2} = \frac{1.960^2(400^2)}{25^2} = 983.450$. Use n = 984 or larger.