251solnO1

251solnO1 4/08/08 (Open this document in 'Page Layout' view!)
O. Estimation of Parameters.
1. Point and Interval Estimation. Properties of Estimators.
2. A Confidence Interval for $\mu$ When $\sigma$ is Known.
Text 8.1-8.3, 8.8. (8.1-8.3, 8.8.)
3. A Confidence Interval for $\mu$ When $\sigma$ is not known.
Text 8.10, 8.11, 8.17, 8.21, 8.89! on CD [8.10, 8.11, 8.15, 8.20*]. (8.10, 8.11, 8.15, 8.17*). O1.
4. A Confidence Interval for a Proportion.
O2.
---------------------------------------------------------------------------------------------------------------------------
Confidence Intervals when the Population Standard Deviation is Known
From the Outline:
$\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}$. This is not what you actually use most of the time! All that "$\sigma$ unknown" means is that we do not have a value of the population variance $\sigma^2$. If you only have the sample variance $s^2$, ignore this stuff and use the t table.
Remember that $\sigma_{\bar{x}} = \dfrac{\sigma_x}{\sqrt{n}}$ or $\sigma_{\bar{x}} = \dfrac{\sigma_x}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}}$, but that only the first formula is used in these exercises.
Exercise 8.1: If $\bar{x} = 85$, $\sigma = 8$ and $n = 64$, construct a 95% confidence interval for the population mean.
Solution: $\alpha = .05$. $z_{\alpha/2} = z_{.025} = 1.960$ can be found on the last line of the t table. So
$\mu = \bar{x} \pm z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} = 85 \pm 1.960\dfrac{8}{\sqrt{64}} = 85 \pm 1.96$ or $83.04 \le \mu \le 86.96$. More formally, $P(83.04 \le \mu \le 86.96) = .95$.
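The arithmetic in Exercise 8.1 can be checked with a short Python sketch. The function name `z_interval` is my own, and the sketch takes $z_{\alpha/2}$ from the standard library's Normal distribution rather than from the last line of the t table, so it is only a cross-check under those assumptions.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Two-sided interval for the mean when the population sigma is known.

    z_interval is an illustrative helper, not something from the text.
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}, about 1.960 for 95%
    half_width = z * sigma / sqrt(n)         # z times the standard error
    return xbar - half_width, xbar + half_width

# Exercise 8.1: xbar = 85, sigma = 8, n = 64, 95% confidence
low, high = z_interval(85, 8, 64, 0.95)
print(round(low, 2), round(high, 2))
```

Calling `z_interval(125, 24, 36, 0.99)` would repeat the check for a 99% interval with the same formula.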
Exercise 8.2: If $\bar{x} = 125$, $\sigma = 24$ and $n = 36$, construct a 99% confidence interval for the population mean.
Solution: $\alpha = .01$. $z_{\alpha/2} = z_{.005} = 2.576$ from the last line of the t table. So
$\mu = \bar{x} \pm z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} = 125 \pm 2.576\dfrac{24}{\sqrt{36}} = 125 \pm 10.30$ or $114.70 \le \mu \le 135.30$.
Exercise 8.3: A market researcher says that she has 95% confidence that the mean monthly sales of a product are between $170,000 and $200,000. Explain the meaning of this statement.
Solution: According to the Instructor’s Solutions Manual, if all possible samples of the same size n are taken, 95% of them include the true population average monthly sales of the product within the interval developed. Thus we are 95 percent confident that this sample is one that does correctly estimate the true average amount.
Exercise 8.8: The quality control manager at a light bulb factory needs to estimate the mean life of a batch
(population) of light bulbs. We assume that the (population) standard deviation is 100
hours. A random sample of 64 light bulbs from the batch yields a sample mean of 350.
a) Construct a 95% confidence interval for the population mean of light bulbs in this
batch.
b) Do you think that the manufacturer has the right to state that the average life of the
light bulbs is 400 hours? Explain.
c) Must you assume that the population of light bulb lives is Normal? Explain.
d) Suppose that the (population) standard deviation changed to 80 hours. How would this
change your answers to b) and c)?
Solution: We are given $\alpha = .05$, $\bar{x} = 350$, $\sigma = 100$, $n = 64$. The answer below has been edited from the Instructor’s Solutions Manuals for the 9th and 10th editions.
(a) The Instructor’s Solutions Manual for the 9th edition does not show the details of this solution and gives only a printout. The 10th edition actually says that
$\bar{X} \pm Z\dfrac{\sigma}{\sqrt{n}} = 350 \pm 1.96\dfrac{100}{\sqrt{64}}$ or $325.50 \le \mu \le 374.50$. If you can’t show this using the examples on the previous page, you have a serious problem!
(b) No. The manufacturer cannot support a claim that the bulbs last an average 400 hours. Based on the data from the sample, a mean of 400 hours would represent a distance of 4 standard errors ($\sigma_{\bar{x}} = 100/\sqrt{64} = 12.5$ hours) above the sample mean of 350 hours. This is what is called a hypothesis test! Our so-called null hypothesis is $H_0: \mu = 400$. Showing that 400 does not fall on the confidence interval constructed from our sample statistic, we have shown that the mean is significantly different from 400 (at the 5% significance level).
(c) No. Since $\sigma$ is known and n = 64, from the central limit theorem, we may assume that the sampling distribution of $\bar{x}$ is approximately normal.
(d) in the 9th edition asks us to explain why an observed value of 320 hours would not be unusual even though it would be outside of the confidence interval just calculated. The Instructor’s Solutions Manual for the 9th edition gives the following answer. An individual value of 320 is only 0.30 standard deviations below the sample mean of 350. The confidence interval represents bounds on the estimate of a sample of 64, not an individual value.
(d) in the 10th edition is (e) in the 9th edition. The confidence interval is narrower based on a process standard deviation of 80 hours rather than the original assumption of 100 hours.
(a) $\bar{X} \pm Z\dfrac{\sigma}{\sqrt{n}} = 350 \pm 1.96\dfrac{80}{\sqrt{64}}$ or $330.4 \le \mu \le 369.6$.
(b) Based on the smaller standard deviation, a mean of 400 hours would represent a distance of 5 standard errors ($\sigma_{\bar{x}} = 80/\sqrt{64} = 10$ hours) above the sample mean of 350 hours. No, the manufacturer cannot support a claim that the bulbs last an average of 400 hours.
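As a rough check of Exercise 8.8, the following Python sketch recomputes both intervals and the distance of the claimed 400 hours from the sample mean, measured in units of $\sigma/\sqrt{n}$. The helper name `known_sigma_interval` and the `results` dictionary are illustrative choices of mine; $z = 1.96$ is taken from the solution above.

```python
from math import sqrt

def known_sigma_interval(xbar, sigma, n, z):
    """Return (low, high, standard error) for a known-sigma interval."""
    se = sigma / sqrt(n)               # standard error of the sample mean
    return xbar - z * se, xbar + z * se, se

results = {}
for sigma in (100, 80):                # original and revised process sigma
    low, high, se = known_sigma_interval(350, sigma, 64, 1.96)
    results[sigma] = {
        "low": low,
        "high": high,
        "distance": (400 - 350) / se,  # how far 400 is from 350, in standard errors
        "covers_400": low <= 400 <= high,
    }
    print(sigma, round(low, 1), round(high, 1), results[sigma])
```

In both cases the claimed mean of 400 hours lies outside the interval, which is the confidence-interval form of rejecting $H_0: \mu = 400$.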
On the other hand, if the process standard deviation was sufficiently above 100 hours, we could deny that the mean was significantly different from 400 hours. How large would that have to be? Good question! The confidence interval is
$\mu = \bar{x} \pm z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} = 350 \pm 1.960\dfrac{\sigma}{\sqrt{64}}$.
If $350 + 1.960\dfrac{\sigma}{\sqrt{64}} \ge 400$, then $1.960\dfrac{\sigma}{\sqrt{64}} \ge 400 - 350 = 50$. If we take this as an equality we get $1.960\dfrac{\sigma}{\sqrt{64}} = 50$ or $\sigma = \dfrac{50\sqrt{64}}{1.960} = 204.08$. So if $\sigma$ is larger than 204.08, we will not be able to say that the mean is significantly different from 400.
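The threshold can be verified numerically. This is a minimal sketch with variable names of my own choosing; it solves the equality for $\sigma$ and confirms that the upper confidence limit then lands on 400.

```python
from math import sqrt

z, n = 1.960, 64          # critical value and sample size from the exercise
xbar, claimed = 350, 400  # sample mean and the manufacturer's claimed mean

# Solve 1.960 * sigma / sqrt(64) = 400 - 350 for sigma.
sigma_threshold = (claimed - xbar) * sqrt(n) / z

# With that sigma, the upper limit of the interval should equal 400.
upper_limit = xbar + z * sigma_threshold / sqrt(n)

print(round(sigma_threshold, 2), round(upper_limit, 2))
```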
Confidence Intervals when the Population Standard Deviation is Unknown
From the Outline:
$\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\,s_{\bar{x}}$. This is what you actually use most of the time! All that "$\sigma$ unknown" means is that we do not have a value of the population variance. If you only have the sample variance, use the t table. Remember that $s_{\bar{x}} = \dfrac{s_x}{\sqrt{n}}$ or $s_{\bar{x}} = \dfrac{s_x}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}}$. Only the first formula is used in most text problems.
Exercise 8.10: Find $t^{(n-1)}_{\alpha/2}$ for a) a 95% confidence level and a sample size of 10, b) a 99% confidence level and a sample size of 10, c) a 95% confidence level and a sample size of 32, d) a 95% confidence level and a sample size of 65 and e) a 90% confidence level and a sample size of 16.
Solution: In each case we use $t^{(n-1)}_{\alpha/2}$.
(a) $1 - \alpha = .95$, so $\alpha = .05$ and $\alpha/2 = .025$. $n = 10$, so $n - 1 = 9$, and $t^{(9)}_{.025} = 2.262$.
(b) $t^{(9)}_{.005} = 3.250$
(c) $t^{(31)}_{.025} = 2.040$
(d) $t^{(64)}_{.025} = 1.998$
(e) $t^{(15)}_{.05} = 1.753$
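The bookkeeping in Exercise 8.10 (confidence level to tail area, sample size to degrees of freedom) can be sketched in Python. The dictionary `T_TABLE` holds only the five table values quoted in the solution, and `t_lookup` is a hypothetical helper of mine; nothing here computes t quantiles from scratch.

```python
# Table values copied from the solution above, keyed by (degrees of freedom, tail area).
T_TABLE = {
    (9, 0.025): 2.262,
    (9, 0.005): 3.250,
    (31, 0.025): 2.040,
    (64, 0.025): 1.998,
    (15, 0.05): 1.753,
}

def t_lookup(confidence, n):
    """Return (df, tail area, table t) for a two-sided interval."""
    alpha = 1 - confidence
    tail = round(alpha / 2, 4)  # alpha/2, rounded to avoid floating-point noise
    df = n - 1                  # degrees of freedom
    return df, tail, T_TABLE[(df, tail)]

for confidence, n in [(0.95, 10), (0.99, 10), (0.95, 32), (0.95, 65), (0.90, 16)]:
    print(confidence, n, t_lookup(confidence, n))
```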
Exercise 8.11: If $\bar{x} = 75$, $s = 24$, $n = 36$ and the parent population is Normal, find a 95% confidence interval for the population mean.
Solution: $\alpha = .05$. $t^{(n-1)}_{\alpha/2} = t^{(35)}_{.025} = 2.030$.
$\mu = \bar{x} \pm t\,s_{\bar{x}} = 75 \pm 2.030\dfrac{24}{\sqrt{36}} = 75 \pm 8.12$ or $66.88 \le \mu \le 83.12$.
Exercise 8.17 [8.15 in 9th]: The problem is about a tread wear index, an important measure of a tire’s
performance. A brand of tires is graded 200. A random sample of 18 tires is taken by a consumer
organization and gives a sample mean tread wear index of 195.3 and a sample standard deviation of 21.4.
Assume a Normal distribution.
a. Set up a 95% confidence interval for the population mean.
b. Should the organization accuse the manufacturer of not meeting the standard for the grade?
c. Why is a tread wear index of 210 for a given tire not unusual?
Solution: We have $\bar{x} = 195.3$, $s = 21.4$, $n = 18$ and $\alpha = .05$, so $t^{(n-1)}_{\alpha/2} = t^{(17)}_{.025} = 2.110$.
(a) $\mu = \bar{x} \pm t\,s_{\bar{x}} = 195.3 \pm 2.110\dfrac{21.4}{\sqrt{18}} = 195.3 \pm 10.643$ or $184.657 \le \mu \le 205.943$.
(b) No, a grade of 200 is in the interval.
(c) It is not unusual. A tread-wear index of 210 for a particular tire is only 0.69 standard deviations above the sample mean of 195.3.
Exercise 8.89 [Not in 9th ??]: If we are sampling without replacement from a population of 200 and from a sample of 36 get a sample mean of 75 and a standard deviation of 24, construct a 95% confidence interval for the mean.
Solution: Our facts are $\bar{x} = 75$, $s = 24$, $n = 36$, $N = 200$ and $\alpha = .05$. Our basic formula remains $\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\,s_{\bar{x}}$ but because the sample is more than 5% of the population, we need a finite population correction.
$s_{\bar{x}} = \dfrac{s_x}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = \dfrac{24}{\sqrt{36}}\sqrt{\dfrac{200-36}{200-1}} = 4\sqrt{\dfrac{164}{199}} = 4\sqrt{0.82412} = 4(.9078) = 3.631$.
The value of $t^{(n-1)}_{\alpha/2}$ that we need is $t^{(35)}_{.025} = 2.030$. So the interval is $\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\,s_{\bar{x}} = 75 \pm 2.030(3.631) = 75 \pm 7.37$, so we can say $P(67.63 \le \mu \le 82.37) = .95$.
Exercise 8.17(in 8th edition only): In order to estimate dental expenses to plan for a proposed dental plan,
a personnel department takes a random sample of dental expenses for the families of 10 employees over
the previous year. (Dental data set on disk)
Expenses: 110, 362, 246, 85, 510, 208, 173, 425, 316, 179
a. Set up a 90% confidence interval estimate of mean family dental expenses for all employees.
b. What assumption must be made about the population distribution in a)?
c. Give an example of a family dental expense that is outside the confidence interval but that is not unusual for an individual family and explain why this is not a contradiction.
d. Repeat a) for a 95% interval.
e. What would the effect be in a) of changing the fourth value from $85 to $585?
Solution: Since the file is available on your disk, I downloaded it to Minitab and got the following results with my comments added.

---------- 4/9/2003 2:42:54 PM ----------
Welcome to Minitab, press F1 for help.
MTB > Retrieve "E:\Content\Data Files\Minitab Data Files\Dental.MTW".
Retrieving worksheet from file: E:\Content\Data Files\Minitab Data Files\Dental.MTW
# Worksheet was saved on Mon Apr 27 1998
Results for: Dental.MTW
(The data you have is labeled Expenses and is in C1.)
MTB > Let c2=c1*c1
(To compute the variance I squared the data and placed it in C2.)
MTB > print c1 c2
Data Display
Row   Expenses       C2
  1        110    12100
  2        362   131044
  3        246    60516
  4         85     7225
  5        510   260100
  6        208    43264
  7        173    29929
  8        425   180625
  9        316    99856
 10        179    32041
MTB > sum c1
Sum of Expenses
Sum of Expenses = 2614.0      ($\sum x = 2614$)
MTB > sum c2
Sum of C2
Sum of C2 = 856700      ($\sum x^2 = 856700$)
MTB > Describe c1
(This is the basic command for describing a data set.)
Descriptive Statistics: Expenses
Variable      N     Mean   Median   TrMean    StDev   SE Mean
Expenses     10    261.4    227.0    252.4    138.8      43.9
Variable    Minimum   Maximum      Q1      Q3
Expenses       85.0     510.0   157.3   377.8
MTB > tinterval 90 c1
(This is the command to get a 90% confidence interval.)
One-Sample T: Expenses
Variable      N     Mean    StDev   SE Mean          90.0% CI
Expenses     10    261.4    138.8      43.9   (180.9, 341.9)
So the interval is $180.9 \le \mu \le 341.9$.
(a) Printout from the Instructor’s Solutions Manual follows:
If we do this by hand, we write down the two columns above.
   x       x^2
 110     12100
 362    131044
 246     60516
  85      7225
 510    260100
 208     43264
 173     29929
 425    180625
 316     99856
 179     32041
2614    856700
If $\alpha = .10$, $n = 10$, $\sum x = 2614$, $\sum x^2 = 856700$, then $\bar{x} = \dfrac{\sum x}{n} = \dfrac{2614}{10} = 261.4$ and
$s^2 = \dfrac{\sum x^2 - n\bar{x}^2}{n-1} = \dfrac{856700 - 10(261.4)^2}{9} = 19266.711$. $s = \sqrt{19266.711} = 138.8046$.
We use $t^{(n-1)}_{\alpha/2} = t^{(9)}_{.05} = 1.833$.
$\mu = \bar{x} \pm t\,s_{\bar{x}} = 261.4 \pm 1.833\dfrac{138.80}{\sqrt{10}} = 261.40 \pm 80.45$ or 180.95 to 341.85.
(b) The population of dental expenses must be approximately normally distributed.
(c) I’ll let you think about this one.
(d) If $\alpha = .05$, $n = 10$, $\bar{x} = 261.4$ and $s = 138.8046$, we use $t^{(n-1)}_{\alpha/2} = t^{(9)}_{.025} = 2.262$.
$\mu = \bar{x} \pm t\,s_{\bar{x}} = 261.4 \pm 2.262\dfrac{138.80}{\sqrt{10}} = 261.40 \pm 2.262(43.892) = 261.40 \pm 99.28$ or 162.12 to 360.68.
(e) The additional $500 in dental expenses, divided across the sample of 10, raises the mean by $50 and increases the standard deviation by nearly $20. The interval half-width increases by roughly $11 in the process (from 80.45 to 91.04). The new interval is:
$\mu = \bar{x} \pm t\dfrac{s}{\sqrt{n}} = 311.40 \pm 1.8331\dfrac{157.056}{\sqrt{10}} = 311.40 \pm 91.04$ or 220.36 to 402.44.
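Parts (a), (d) and (e) of the dental problem can be reproduced from the raw data with Python's standard library. This is a sketch under my own naming (`t_interval_from_data`, `changed`); the t values 1.833 and 2.262 are the table values used above, and small differences in the last decimal place against the hand work come from rounding intermediate results.

```python
from math import sqrt
from statistics import mean, stdev

expenses = [110, 362, 246, 85, 510, 208, 173, 425, 316, 179]

def t_interval_from_data(data, t_crit):
    """Interval for the mean from raw data, given a table t value."""
    n = len(data)
    half_width = t_crit * stdev(data) / sqrt(n)  # stdev uses the n - 1 divisor
    return mean(data) - half_width, mean(data) + half_width

ci90 = t_interval_from_data(expenses, 1.833)          # part (a)
ci95 = t_interval_from_data(expenses, 2.262)          # part (d)

changed = expenses.copy()
changed[3] = 585                                      # part (e): fourth value 85 -> 585
ci90_changed = t_interval_from_data(changed, 1.833)

for name, ci in [("90%", ci90), ("95%", ci95), ("90%, changed", ci90_changed)]:
    print(name, round(ci[0], 2), round(ci[1], 2))
```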
Exercise 8.21(8.20 in 9th ): In New York a random sample was taken of the time required in days to
approve 27 Savings Bank Life Insurance policies. (Insurance data set on disk)
Time: 73, 31, 92, 19, 56, 63, 16, 22, 50, 64, 18, 51, 28, 45, 69, 28, 48, 16, 31, 17, 17, 90, 17, 60, 17, 56, 91
a. Set up a 95% confidence interval estimate of mean processing time.
b. What assumption must be made about the population distribution in a)?
c. Do you think that the assumption made in b) has been seriously violated? Explain.
d. Compare the conclusions reached in a) with those of Problem 3.61 on page 126.
Solution: Since the file was available on disk, I downloaded it to Minitab and got the results below.
MTB > Retrieve "C:\Berenson\Data_Files-9th\Minitab\INSURANCE.MTW".
Retrieving worksheet from file: C:\Berenson\Data_Files-9th\Minitab\INSURANCE.MTW
# Worksheet was saved on Mon Apr 09 2001
Results for: INSURANCE.MTW
(The data you have is labeled Time and is in C1.)
MTB > let c2 = c1*c1
(To compute the variance I squared the data and placed it in C2.)
MTB > print c1 c2
Data Display
Row   Time     C2
  1     73   5329
  2     19    361
  3     16    256
  4     64   4096
  5     28    784
  6     28    784
  7     31    961
  8     90   8100
  9     60   3600
 10     56   3136
 11     31    961
 12     56   3136
 13     22    484
 14     18    324
 15     45   2025
 16     48   2304
 17     17    289
 18     17    289
 19     17    289
 20     91   8281
 21     92   8464
 22     63   3969
 23     50   2500
 24     51   2601
 25     69   4761
 26     16    256
 27     17    289
MTB > sum c1
Sum of Time
Sum of Time = 1185.0      ($\sum x = 1185$)
MTB > sum c2
Sum of C2
Sum of C2 = 68629      ($\sum x^2 = 68629$)
MTB > describe c1
(This is the basic command for describing a data set.)
Descriptive Statistics: Time
Variable     N    Mean   Median   TrMean   StDev   SE Mean
Time        27   43.89    45.00    43.08   25.28      4.87
Variable   Minimum   Maximum      Q1      Q3
Time         16.00     92.00   18.00   63.00
MTB > tinterval 95 c1
(This is the command to get a 95% confidence interval.)
One-Sample T: Time
Variable     N    Mean   StDev   SE Mean          95.0% CI
Time        27   43.89   25.28      4.87   (33.89, 53.89)
If we do this by hand we get the following.
obs      x     x^2
  1     73    5329
  2     19     361
  3     16     256
  4     64    4096
  5     28     784
  6     28     784
  7     31     961
  8     90    8100
  9     60    3600
 10     56    3136
 11     31     961
 12     56    3136
 13     22     484
 14     18     324
 15     45    2025
 16     48    2304
 17     17     289
 18     17     289
 19     17     289
 20     91    8281
 21     92    8464
 22     63    3969
 23     50    2500
 24     51    2601
 25     69    4761
 26     16     256
 27     17     289
Sum   1185   68629
If $\alpha = .05$, $n = 27$, $\sum x = 1185$, $\sum x^2 = 68629$, then $\bar{x} = \dfrac{\sum x}{n} = \dfrac{1185}{27} = 43.88889$ and
$s^2 = \dfrac{\sum x^2 - n\bar{x}^2}{n-1} = \dfrac{68629 - 27(43.88889)^2}{26} = \dfrac{16620.6667}{26} = 639.2564$.
$s = \sqrt{639.2564} = 25.2835$. For a 2-sided interval use $t^{(n-1)}_{\alpha/2} = t^{(26)}_{.025} = 2.056$.
$\mu = \bar{x} \pm t\,s_{\bar{x}} = 43.88889 \pm 2.056\dfrac{25.2835}{\sqrt{27}} = 43.89 \pm 10.00$ or 33.89 to 53.89, which means we can say $P(33.89 \le \mu \le 53.89) = .95$. Or make a diagram of an almost Normal curve with a mean at 43.89 and mark 33.89 and 53.89. Label the area between these two points with 95% and the area in each of the tails with 2.5%.
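The computational formula $s^2 = (\sum x^2 - n\bar{x}^2)/(n-1)$ used in the hand calculation can be compared against the library's definitional variance. The sketch below is only a cross-check; the variable names are mine, the data are listed in the Minitab row order, and 2.056 is the table value $t^{(26)}_{.025}$.

```python
from math import sqrt
from statistics import variance

times = [73, 19, 16, 64, 28, 28, 31, 90, 60, 56, 31, 56, 22, 18,
         45, 48, 17, 17, 17, 91, 92, 63, 50, 51, 69, 16, 17]

n = len(times)
sum_x = sum(times)                     # the Sum row of the x column
sum_x2 = sum(x * x for x in times)     # the Sum row of the x^2 column
xbar = sum_x / n

# Computational (shortcut) formula for the sample variance.
s2 = (sum_x2 - n * xbar ** 2) / (n - 1)
s = sqrt(s2)

half_width = 2.056 * s / sqrt(n)
low, high = xbar - half_width, xbar + half_width

print(sum_x, sum_x2, round(s2, 4), round(s, 4))
print(round(low, 2), round(high, 2))
print(abs(s2 - variance(times)) < 1e-6)  # shortcut and definitional forms agree
```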
The remainder of the solution comes from the Instructor’s Solutions Manual.
(b) The population distribution needs to be normally distributed.
(c) [Figure: Normal Probability Plot of Time against Z Value]
[Figure: Box-and-whisker Plot of Time]
Both the normal probability plot and the box-and-whisker plot show that the population distribution is not normally distributed and is skewed to the right.
(d) With a sample size of 27 and the population distribution that appears to be skewed, the method used in (a) is not reliable and, hence, any comparison with Problem 2.64 is likely to be invalid.
A Summary Exam-type Problem
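We are using the following formulas.
When $\sigma$ is known, $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}$, where $\sigma_{\bar{x}} = \dfrac{\sigma_x}{\sqrt{n}}$, or $\sigma_{\bar{x}} = \dfrac{\sigma_x}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}}$ when the sample is more than 5% of the population.
When $\sigma$ is unknown, $\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\,s_{\bar{x}}$, where $s_{\bar{x}} = \dfrac{s_x}{\sqrt{n}}$, or $s_{\bar{x}} = \dfrac{s_x}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}}$ when the sample is more than 5% of the population.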
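Problem O1: If $n = 64$ and $\bar{x} = 11.50$, find 95% confidence intervals for the mean under the following circumstances:
a. $\sigma = 6.30$, $N = 3000$
b. $\sigma = 6.30$, $N = 300$
c. $s = 6.30$, $N = 3000$
d. $s = 6.30$, $N = 300$
Solution: Use the formulas from Table 3 of the syllabus supplement or from the outline.
a) $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = 11.50 \pm 1.960(.7875) = 11.50 \pm 1.54$ or 9.96 to 13.04.
Here $\sigma_{\bar{x}} = \dfrac{\sigma_x}{\sqrt{n}} = \dfrac{6.30}{\sqrt{64}} = .7875$ and $z_{\alpha/2} = z_{.025} = 1.960$. More formally, we can say $P(9.96 \le \mu \le 13.04) = .95$ or make a diagram.
b) $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = 11.50 \pm 1.960(.6996) = 11.50 \pm 1.37$ or 10.13 to 12.87.
Here $\sigma_{\bar{x}} = \dfrac{\sigma_x}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = \dfrac{6.30}{\sqrt{64}}\sqrt{\dfrac{300-64}{300-1}} = 0.7875\sqrt{\dfrac{236}{299}} = 0.7875(.8884) = .6996$ and $z_{\alpha/2} = z_{.025} = 1.96$.
c) $\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\,s_{\bar{x}} = 11.50 \pm 1.998(.7875) = 11.50 \pm 1.57$ or 9.93 to 13.07.
Here $s_{\bar{x}} = \dfrac{s_x}{\sqrt{n}} = \dfrac{6.30}{\sqrt{64}} = .7875$ and $t^{(n-1)}_{\alpha/2} = t^{(63)}_{.025} = 1.998$.
d) $\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2}\,s_{\bar{x}} = 11.50 \pm 1.998(.6996) = 11.50 \pm 1.40$ or 10.10 to 12.90.
Here $s_{\bar{x}} = \dfrac{s_x}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = \dfrac{6.30}{\sqrt{64}}\sqrt{\dfrac{300-64}{300-1}} = .6996$ and $t^{(n-1)}_{\alpha/2} = t^{(63)}_{.025} = 1.998$.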
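The four cases of Problem O1 differ only in the critical value (z or t) and in whether the finite population correction applies. A compact Python sketch of that decision logic follows; `standard_error`, `cases` and `intervals` are names I made up, and the critical values 1.960 and 1.998 are the table values quoted in the solution.

```python
from math import sqrt

def standard_error(sd, n, N):
    """Standard error of the mean, with the finite population correction
    applied only when the sample is more than 5% of the population."""
    se = sd / sqrt(n)
    if n > 0.05 * N:
        se *= sqrt((N - n) / (N - 1))
    return se

# label -> (critical value, population size); sd = 6.30, n = 64, xbar = 11.50 throughout
cases = {"a": (1.960, 3000), "b": (1.960, 300), "c": (1.998, 3000), "d": (1.998, 300)}

intervals = {}
for label, (crit, N) in cases.items():
    half_width = crit * standard_error(6.30, 64, N)
    intervals[label] = (round(11.50 - half_width, 2), round(11.50 + half_width, 2))
    print(label, intervals[label])
```

Cases a and c use z and t with the same standard error, so the t interval is slightly wider; cases b and d are narrower because 64 is more than 5% of 300.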
A Confidence Interval for a Proportion
Problem O2 (Black): a) A researcher wants to know what share her company holds in a large city. A sample
of 1003 people who bought CDs in the last month is taken and 256 turn out to have bought her company’s products
(CDs). Create a 95% confidence interval for the proportion that bought her company’s products. b) CD sales aren’t
what they used to be. What if we find out that there were only 10000 people who bought CDs in the city last month?
Solution: a) $p = \bar{p} \pm z_{\alpha/2}\,s_{\bar{p}} = \bar{p} \pm z_{\alpha/2}\sqrt{\dfrac{\bar{p}\bar{q}}{n}}$. $\bar{p} = \dfrac{x}{n} = \dfrac{256}{1003} = .2552$, $\bar{q} = 1 - \bar{p} = .7448$.
$s_{\bar{p}} = \sqrt{\dfrac{.2552(.7448)}{1003}} = \sqrt{.0001895} = .01377$. $z_{\alpha/2} = z_{.025} = 1.96$. So $p = .2552 \pm 1.960(.01377) = .2552 \pm .0270$ or .2282 to .2822.
b) If $N = 10000$ our sample is more than 5% of the population and we have
$s_{\bar{p}} = \sqrt{\dfrac{\bar{p}\bar{q}}{n}}\sqrt{\dfrac{N-n}{N-1}} = \sqrt{\dfrac{10000-1003}{10000-1}}(.01377) = \sqrt{0.89979}(.01377) = 0.94857(.01377) = .01306$. So $p = .2552 \pm 1.960(.01306) = .2552 \pm .0256$.
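Both parts of Problem O2 can be checked with a few lines of Python. The function `proportion_se` and its optional `N` argument are illustrative assumptions on my part; the code carries full precision, so its limits can differ from the hand results in the fourth decimal place.

```python
from math import sqrt

def proportion_se(x, n, N=None):
    """Sample proportion and its estimated standard error.

    If a population size N is given and the sample exceeds 5% of it,
    the finite population correction is applied.
    """
    p = x / n
    se = sqrt(p * (1 - p) / n)
    if N is not None and n > 0.05 * N:
        se *= sqrt((N - n) / (N - 1))
    return p, se

z = 1.960
p, se_a = proportion_se(256, 1003)            # part a: large city, no correction
_, se_b = proportion_se(256, 1003, N=10000)   # part b: only 10000 CD buyers

print(round(p, 4), round(se_a, 5), round(p - z * se_a, 4), round(p + z * se_a, 4))
print(round(se_b, 5), round(z * se_b, 4))
```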
