252y0751 10/19/07 (Open in ‘Print Layout’ format)
ECO252 QBA2
FIRST EXAM
October 4 and 8, 2007
Version 1
Name _____ KEY _________
Class hour: _____________
Student number: __________
Show your work! Make Diagrams! Include a vertical line in the middle! Exam is normed on 50 points. Answers without reasons are not usually acceptable.
I. (8 points) Do all the following. \(x \sim N(3, 11)\)
1. \(P(-38.2 \le x \le 2) = P\left(\frac{-38.2-3}{11} \le z \le \frac{2-3}{11}\right) = P(-3.75 \le z \le -0.09) = P(-3.75 \le z \le 0) - P(-0.09 \le z \le 0) = .4999 - .0359 = .4640\)
For z make a diagram. Draw a Normal curve with a mean at 0. Indicate the mean by a vertical line!
Shade the area between -3.75 and -0.09. Because this is entirely on the left side of zero, we must subtract the area between -0.09 and zero from the area between -3.75 and zero. If you wish, make a completely separate diagram for x. Draw a Normal curve with a mean at 3. Indicate the mean by a vertical line!
Shade the area between -38.2 and 2. This area is entirely on the left side of the mean (3), so we subtract the smaller area between 2 and the mean from the larger area between -38.2 and the mean.
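As a cross-check on the table lookup, the same probability can be computed directly. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; it assumes, as the solution does, that x has mean 3 and standard deviation 11. The variable names are illustrative only.

```python
from statistics import NormalDist

# Assumption taken from the solution above: x ~ N(mean 3, standard deviation 11)
x_dist = NormalDist(mu=3, sigma=11)

# P(-38.2 <= x <= 2) as a difference of two cumulative probabilities
prob = x_dist.cdf(2) - x_dist.cdf(-38.2)

# Same probability through the standardized variable z = (x - 3) / 11
z_low = (-38.2 - 3) / 11
z_high = (2 - 3) / 11
prob_z = NormalDist().cdf(z_high) - NormalDist().cdf(z_low)

print(round(z_low, 2), round(z_high, 2), round(prob, 4))
```

The computed value should agree with the table answer of .4640 up to the rounding of z to two decimal places.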
2. \(P(-3 \le x \le 3) = P\left(\frac{-3-3}{11} \le z \le \frac{3-3}{11}\right) = P(-0.55 \le z \le 0) = .2088\)
For z make a diagram. Draw a Normal curve with a mean at 0. Indicate the mean by a vertical line!
Shade the area between -0.55 and zero. Because this is completely on the left of zero, but ends at zero, this is exactly the kind of probability given by the standard Normal table. Look up the probability between zero and 0.55 on the table and you are done. If you wish, make a completely separate diagram for x . Draw a
Normal curve with a mean at 3. Indicate the mean by a vertical line! Shade the area between -3 and the mean (3). Since this area ends at the mean we do not need to add or subtract.
3. \(P(x \ge 0) = P\left(z \ge \frac{0-3}{11}\right) = P(z \ge -0.27) = P(-0.27 \le z \le 0) + P(z \ge 0) = .1064 + .5 = .6064\)
For z make a diagram. Draw a Normal curve with a mean at 0. Indicate the mean by a vertical line!
Shade the entire area above -0.27. Because this is on both sides of zero, we must add area between -0.27 and zero to the area above zero.
This is identical to the way you get the p-value for a right-sided test when the z ratio is negative. If you wish, make a completely separate diagram for x. Draw a Normal curve with a mean at 3. Indicate the mean by a vertical line! Shade the area above zero. This area is on both sides of the mean (3), so we add the area between 0 and the mean to the area (50%) above the mean.
4. \(x_{.125}\) (Do not try to use the t table to get this.) For z make a diagram. Draw a Normal curve with a mean at 0. \(z_{.125}\) is the value of z with 12.5% of the distribution above it. Since 100 - 12.5 = 87.5, it is also the .875 fractile. Since 50% of the standardized Normal distribution is below zero, your diagram should show that the probability between \(z_{.125}\) and zero is 87.5% - 50% = 37.5%, or \(P(0 \le z \le z_{.125}) = .3750\). If we check this against the Normal table, the closest we can come to .3750 is \(P(0 \le z \le 1.15) = .3749\). (1.16 is also acceptable here, but clearly worse.) So \(z_{.125} = 1.15\). This is the value of z that you need for a 75% confidence interval. To get from \(z_{.125}\) to \(x_{.125}\), use the formula \(x = \mu + z\sigma\), which is the opposite of \(z = \frac{x - \mu}{\sigma}\). \(x_{.125} = 3 + 1.15(11) = 15.65\). If you wish, make a completely separate diagram for x. Draw a Normal curve with a mean at 3. Show that 50% of the distribution is below the mean (3). If 12.5% of the distribution is above \(x_{.125}\), it must be above the mean and have 37.5% of the distribution between it and the mean.
Check: \(P(x \ge 15.65) = P\left(z \ge \frac{15.65 - 3}{11}\right) = P(z \ge 1.15) = P(z \ge 0) - P(0 \le z \le 1.15) = .5 - .3749 = .1251 \approx .125\).
This is identical to the way you normally get a p-value for a right-sided test.
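The fractile calculation in question 4 can also be done without the table. The following is a sketch only, assuming the same N(3, 11) distribution and using the standard-library `NormalDist.inv_cdf`; the names `z_125` and `x_125` are illustrative.

```python
from statistics import NormalDist

# Assumption from Part I: x ~ N(mean 3, standard deviation 11)
mu, sigma = 3, 11

# z_.125 has 12.5% of the standard Normal above it, so it is the .875 fractile
z_125 = NormalDist().inv_cdf(1 - 0.125)

# Convert back to x with x = mu + z * sigma
x_125 = mu + z_125 * sigma

# Check: the upper tail above x_.125 should be 12.5%
upper_tail = 1 - NormalDist(mu, sigma).cdf(x_125)

print(round(z_125, 2), round(x_125, 2), round(upper_tail, 3))
```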
II. (9 points-2 point penalty for not trying part a.)
Our sales of microwave ovens in five randomly picked months appear below
123 126 140 141 149
a. Compute the sample standard deviation, s, of expenditures. Show your work! (2)
b. Assuming that the underlying distribution is Normal, compute a 99% confidence interval for the mean. (2)
c. Redo b) when you find out that there were only 12 months to pick the data from. (2)
d. Assume that the population standard deviation is 10 and create a 75% two-sided confidence interval for the mean. (2)
e. Use your results in a) to test the hypothesis that the mean is below 140 at the 99% level. (3) State your hypotheses clearly!
f. (Extra Credit) Given the data, test the hypothesis that the population standard deviation is below 15 (3)?
Solution: a) Compute the sample standard deviation, s, of expenditures.
Row   x     x^2     x - xbar   (x - xbar)^2
1     123   15129   -12.8      163.84
2     126   15876    -9.8       96.04
3     140   19600     4.2       17.64
4     141   19881     5.2       27.04
5     149   22201    13.2      174.24
Sum   679   92687     0.0      478.80
The first two columns are needed for the computational (shortcut) method. The first, third and fourth are needed for the definitional method. Using (both methods or) the definitional method wastes time.
\(\bar{x} = \frac{\sum x}{n} = \frac{679}{5} = 135.80\), \(s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{92687 - 5(135.80)^2}{4} = \frac{478.8}{4} = 119.70\), or \(s^2 = \frac{\sum (x - \bar{x})^2}{n-1} = \frac{478.80}{4} = 119.70\), and \(\sum (x - \bar{x}) = 0\) (a check). \(s = \sqrt{119.70} = 10.9407\)
b) Assuming that the underlying distribution is Normal, compute a 99% confidence interval for the mean.
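(2) \(\mu = \bar{x} \pm t^{(n-1)}_{\alpha/2} s_{\bar{x}} = 135.80 \pm 4.604(4.8929) = 135.80 \pm 22.53\) or 113.27 to 158.33, where \(s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{10.9407}{\sqrt{5}} = \sqrt{\frac{119.70}{5}} = \sqrt{23.94} = 4.8929\) and \(t^{(4)}_{\alpha/2} = t^{(4)}_{.005} = 4.604\).
c) Redo b) when you find out that there were only 12 months to pick the data from. (2)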
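\(\mu = \bar{x} \pm t_{\alpha/2} s_{\bar{x}} = 135.80 \pm 4.604(3.903) = 135.80 \pm 17.97\) or 117.83 to 153.77, where \(s_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{10.9407}{\sqrt{5}}\sqrt{\frac{12-5}{12-1}} = \sqrt{\frac{119.70}{5}\cdot\frac{7}{11}} = \sqrt{23.94\cdot\frac{7}{11}} = \sqrt{15.2345} = 3.903\).
d) Assume that the population standard deviation is 10 and create a 75% two-sided confidence interval for the mean.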
(2) We found \(z_{.125} = 1.15\) on the last page. We have \(\sigma = 10\), \(n = 5\), \(\bar{x} = 135.80\) and \(\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{10}{\sqrt{5}} = \sqrt{\frac{100}{5}} = \sqrt{20} = 4.4721\). \(\mu = \bar{x} \pm z_{.125}\,\sigma_{\bar{x}} = 135.80 \pm 1.15(4.4721) = 135.80 \pm 5.14\) or 130.66 to 140.94.
e) Use your results in a) to test the hypothesis that the mean is below 140 at the 99% level. (3) State your hypotheses clearly! The statement that the mean is below 140 does not contain an equality, so it must be an alternate hypothesis. We have the following information.
\(\alpha = .01\), \(\bar{x} = 135.80\), \(n = 5\) and \(s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{10.9407}{\sqrt{5}} = 4.8929\). Since this is a one-sided hypothesis we will use \(t^{(n-1)}_{\alpha} = t^{(4)}_{.01} = 3.747\).
Needless to say, because of the small sample size, we are assuming that the parent distribution is Normal.
Our hypotheses are \(H_0: \mu \ge 140\) and \(H_1: \mu < 140\), with \(\alpha = .01\), \(\bar{x} = 135.80\), \(n = 5\), \(\mu_0 = 140\), \(t^{(4)}_{.01} = 3.747\) and \(s_{\bar{x}} = 4.8929\). Since we are worrying about the mean being too small, this is a left-sided test.
There are three ways to do this. Do only one of them.
(i) Test Ratio: \(t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}} = \frac{135.80 - 140}{4.8929} = -0.8584\). This is a left-sided test - the smaller the sample mean is, the more negative will be this ratio. We will reject the null hypothesis if the ratio is smaller than \(-t^{(4)}_{\alpha} = -t^{(4)}_{.01} = -3.747\). Make a diagram showing a Normal curve with a mean at 0 and a shaded 'reject' zone below -3.747. Since the test ratio is not below -3.747, we cannot reject \(H_0\).
If you wish to find a p-value for your hypothesis, note that the t-ratio is -0.8584. The p-value will be the probability that t is below -0.8584. The line of the t table for 4 degrees of freedom is below.
df  .45    .40    .35    .30    .25    .20    .15    .10    .05    .025   .01    .005   .001
4   0.134  0.271  0.414  0.569  0.741  0.941  1.190  1.533  2.132  2.776  3.747  4.604  7.173
What this tells us, among other things, is that \(P(t \ge 0.941) = .20\) and \(P(t \ge 0.741) = .25\), so that by symmetry \(P(t \le -0.941) = .20\) and \(P(t \le -0.741) = .25\). Since -0.8584 lies between -0.941 and -0.741, the probability that t lies below -.8584 must be between .20 and .25, that is, \(.20 < p\text{-value} < .25\). This is above our significance level of .01, so we will not reject the null hypothesis.
(ii) Critical value: We need a critical value for \(\bar{x}\) below 140. Common sense says that if the sample mean is too far below 140, we will not believe \(H_0: \mu \ge 140\). The formula for a critical value for the sample mean is \(\bar{x}_{cv} = \mu_0 \pm t^{(n-1)}_{\alpha/2} s_{\bar{x}}\), but we want a single value below 140, so use \(\bar{x}_{cv} = \mu_0 - t^{(n-1)}_{\alpha} s_{\bar{x}} = 140 - 3.747(4.8929) = 121.67\).
Make a diagram showing an almost Normal curve with a mean at 140 and a shaded 'reject' zone below 121.67. Since \(\bar{x} = 135.80\) is not below 121.67, we do not reject \(H_0\).
(iii) Confidence interval: \(\mu = \bar{x} \pm t_{\alpha/2} s_{\bar{x}}\) is the formula for a two-sided interval. The rule for a one-sided confidence interval is that it should always go in the same direction as the alternate hypothesis. Since the alternative hypothesis is \(H_1: \mu < 140\), the confidence interval is \(\mu \le \bar{x} + t_{\alpha} s_{\bar{x}} = 135.80 + 3.747(4.8929)\) or \(\mu \le 154.134\).
Make a diagram showing an almost Normal curve with a mean at \(\bar{x} = 135.80\) and, to represent the confidence interval, shade the area below 154.134 in one direction. Then, on the same diagram, to represent the null hypothesis, \(H_0: \mu \ge 140\), shade the area above 140 in the opposite direction. Notice that these overlap. What the diagram is telling you is that it is possible for \(\mu \le 154.134\) and \(H_0: \mu \ge 140\) to both be true. (If you follow my more recent suggestions, it is actually enough to show that 140 is on the confidence interval.) So we do not reject \(H_0\).
f) (Extra Credit) Given the data, test the hypothesis that the population standard deviation is below 15 (3)?
This is an alternate hypothesis, \(H_1: \sigma < 15\). The null hypothesis is \(H_0: \sigma \ge 15\). Remember \(\alpha = .01\), \(n = 5\), \(s^2 = 119.70\). Table 3 says that the test ratio is \(\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{4(119.70)}{15^2} = 2.128\).
Recall \(df = n - 1 = 4\).
The first paragraph of the chi-squared table appears below. If we look at the 4 column, we see that the lower 1% of values of chi-squared are cut off by \(\chi^{2(4)}_{.99} = 0.2971\), so that the reject region is below 0.2971. Since \(\chi^2 = 2.128\) is above 0.2971, do not reject the null hypothesis.
Degrees of Freedom
1 2 3 4 5 6 7 8 9
0.005 7.87946 10.5966 12.8382 14.8603 16.7496 18.5476 20.2778 21.9550 23.5893
0.010 6.63491 9.2103 11.3449 13.2767 15.0863 16.8119 18.4753 20.0902 21.6660
0.025 5.02389 7.3778 9.3484 11.1433 12.8325 14.4494 16.0128 17.5346 19.0228
0.050 3.84146 5.9915 7.8147 9.4877 11.0705 12.5916 14.0671 15.5073 16.9190
0.100 2.70554 4.6052 6.2514 7.7794 9.2364 10.6446 12.0170 13.3616 14.6837
0.900 0.01579 0.2107 0.5844 1.0636 1.6103 2.2041 2.8331 3.4895 4.1682
0.950 0.00393 0.1026 0.3518 0.7107 1.1455 1.6354 2.1674 2.7326 3.3251
0.975 0.00098 0.0506 0.2158 0.4844 0.8312 1.2373 1.6899 2.1797 2.7004
0.990 0.00016 0.0201 0.1148 0.2971 0.5543 0.8721 1.2390 1.6465 2.0879
0.995 0.00004 0.0100 0.0717 0.2070 0.4117 0.6757 0.9893 1.344 1.7349
Computer output for parts b) d) e) and f) follows.
Welcome to Minitab, press F1 for help.
MTB > WOpen "C:\Documents and Settings\rbove\My Documents\Minitab\251x0751-
21.MTW".
Retrieving worksheet from file: 'C:\Documents and Settings\rbove\My
Documents\Minitab\251x0751-21.MTW'
Worksheet was saved on Wed Sep 26 2007
MTB > Onet c1;
SUBC> Confidence 99.0.
Part b)
Variable  N     Mean   StDev  SE Mean  99% CI
x         5  135.800  10.941    4.893  (113.273, 158.327)
MTB > OneZ c1;
SUBC> Sigma 10;
SUBC> Confidence 75.0.
Part d)
The assumed standard deviation = 10
Variable  N     Mean   StDev  SE Mean  75% CI
x         5  135.800  10.941    4.472  (130.655, 140.945)
MTB > Onet c1;
SUBC> Test 140;
SUBC> Confidence 99.0;
SUBC> Alternative -1.
Part e)
Test of mu = 140 vs < 140
                                        99% Upper
Variable  N     Mean   StDev  SE Mean      Bound      T      P
x         5  135.800  10.941    4.893    154.133  -0.86  0.220
MTB > WSave "C:\Documents and Settings\RBOVE\My Documents\Minitab\251x0751-
21.MTW";
SUBC> Replace.
Saving file as: 'C:\Documents and Settings\RBOVE\My
Documents\Minitab\251x0751-21.MTW'
Existing file replaced. Data was stored so macro could be found
MTB > %sigtest c1 225 Tests data in column 1 for variance of 225. Packaged
Minitab Macro. This is a 2-sided test.
Executing from file: sigtest.MAC
The value of the test statistic is 2.1280.
If the test statistic is less than 0.4844 or greater
than 11.1433 then there is statistical evidence indicating
that your variance does not equal to 225.0000, at alpha =
0.0500.
Part f)
Using text of the macro, I set up a 1-sided test.
MTB > name k1 'a'           # a is signif level
MTB > name k2 'df'
MTB > name k3 'sig0'        # Sig0 is variance from H0
MTB > name k4 'stdev'
MTB > name k5 'fcalc'       # fcalc is the test statistic
MTB > name k6 'lower'       # For a left-sided test this is signif level
MTB > name k7 'upper'       # Not used in 1-sided test
MTB > name k8 'flow'        # Reject null below this number
MTB > name k9 'fupper'      # Not used in 1-sided test
MTB > name k10 'var'        # Variance
MTB > let var = stdev(x)
MTB > let var = var * var   # Variance is std dev. squared.
MTB > let df = n(x) - 1     # Sample size minus 1
MTB > let lower = a
MTB > let sig0 = 225
MTB > let Fcalc = df*var/sig0
MTB > invcdf lower Flow;    # Sets lower limit as chi-squared alpha
SUBC> chis df.
MTB > print fcalc flow sig0 a
fcalc 2.12800               # The value of chi-squared we tested
flow 0.297109               # The value from the table
sig0 225.000                # The null hypothesis variance
a 0.0100000                 # Significance level
MTB > CDF 'fcalc';
SUBC> ChiSquare 4.
Chi-Square with 4 DF
x P( X <= x )
2.128 0.287770              # This is the value of test statistic and p-value
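The hand calculations in parts a) through f) can be re-checked with a few lines of Python. This is only a sketch: it takes the critical values 4.604 and 1.15 from the tables quoted in the solution rather than computing them, and all variable names are illustrative.

```python
from math import sqrt

sales = [123, 126, 140, 141, 149]
n = len(sales)

# a) sample mean and standard deviation (computational method)
mean = sum(sales) / n
var = (sum(x * x for x in sales) - n * mean**2) / (n - 1)
s = sqrt(var)

# Table values quoted in the solution (assumed, not computed here)
T_005_4DF = 4.604   # t with .005 in the upper tail, 4 df
Z_125 = 1.15        # z with .125 in the upper tail

# b) 99% interval, c) same with finite population correction (N = 12)
se = s / sqrt(n)
ci_b = (mean - T_005_4DF * se, mean + T_005_4DF * se)
se_fpc = se * sqrt((12 - n) / (12 - 1))
ci_c = (mean - T_005_4DF * se_fpc, mean + T_005_4DF * se_fpc)

# d) 75% interval with known sigma = 10
se_z = 10 / sqrt(n)
ci_d = (mean - Z_125 * se_z, mean + Z_125 * se_z)

# e) t ratio for H0: mu >= 140, f) chi-squared ratio for H0: sigma >= 15
t_ratio = (mean - 140) / se
chi_sq = (n - 1) * var / 15**2

print(round(s, 4), [round(v, 2) for v in ci_b], [round(v, 2) for v in ci_c])
print([round(v, 2) for v in ci_d], round(t_ratio, 4), round(chi_sq, 3))
```

The t ratio is then compared with -3.747 and the chi-squared ratio with 0.2971, as in the text.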
III. Do as many of the following problems as you can.(2 points each unless marked otherwise adding to
13+ points). Show your work except in multiple choice questions. (Actually – it doesn’t hurt there either.) If the answer is ‘None of the above,’ put in the correct answer if possible.
1) If I want to test to see if the mean of x is larger than the given population mean
0
my null hypothesis is: i) ii)
0
0 iii)
0 iv) *
0 v) Could be any of the above. We need more information. vi) None of the above
Explanation: \(\mu > \mu_0\) is our alternate hypothesis since it doesn’t contain an equality. \(\mu \le \mu_0\) is the opposite, so it must be the null hypothesis.
2) Assuming that you have a sample mean of 100 based on a sample of 36 taken from a population of 300 with a known population standard deviation of 80, the 99% confidence interval for the population mean is a) 100
2 .
576
80
36
b) * 100
2 .
576
300
300
36
1
80
36 c) 100
2 .
576
d) 100
2 .
724
80
300
80
300
e) h)
100
100
2 .
724
2 .
438
300
300
36
1 f) 100
2 .
724
g) 100
2 .
438
80
36
80
300
300
300
36
1
80
36
80
36 i) 100
2 .
438
80
36
j) 100
2 .
438
300
300
36
1
80
300 k) None of the above. Fill in a correct answer.
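Explanation: The formula for a confidence interval when the variance is known when the sample is more than 20% of the population was given in the solution to problem A2 as \(\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}\), where \(\bar{x} = 100\), \(n = 36\), \(N = 300\), \(\sigma = 80\), \(1 - \alpha = 99\%\) and \(\alpha = .01\).
Here \(\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{80}{\sqrt{36}}\sqrt{\frac{300-36}{300-1}}\) and \(z_{\alpha/2} = z_{.005} = 2.576\). So \(\mu = 100 \pm 2.576\,\frac{80}{\sqrt{36}}\sqrt{\frac{300-36}{300-1}}\).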
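The interval in answer b) is left unevaluated in the key. A short sketch of the arithmetic, assuming only the numbers given in the question and the table value 2.576, is:

```python
from math import sqrt

x_bar, sigma, n, N = 100, 80, 36, 300
Z_005 = 2.576  # table value for a 99% two-sided interval

# Standard error with the finite population correction
se = (sigma / sqrt(n)) * sqrt((N - n) / (N - 1))
margin = Z_005 * se
interval = (x_bar - margin, x_bar + margin)

print(round(se, 4), round(margin, 2), [round(v, 2) for v in interval])
```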
3) Which of the following is a Type 2 error? a) Rejecting the null hypothesis when the null hypothesis is false. b) Rejecting the null hypothesis when the null hypothesis is true. c) Not rejecting the null hypothesis when the null hypothesis is true. d) *Not rejecting the null hypothesis when the null hypothesis is false. e) All of the above f) None of the above.
4) If a random sample is gathered to get information about a population proportion, what do we mean by a p-value? a) P-value is the population proportion in the null hypothesis. b) P-value is the population proportion in the alternate hypothesis. c) P-value is the probability of a type 2 error. d) P-value is the probability that, if the null hypothesis was false, that, if we were to repeat the experiment many times, we would get a sample proportion as extreme as or more extreme than the sample proportion actually observed. e) *P-value is the probability that, if the null hypothesis was true, that, if we were to repeat the experiment many times, we would get a sample proportion as extreme as or more extreme than the sample proportion actually observed. f) P-value is the probability that the alternate hypothesis is true, given the sample proportion actually observed. g) None of the above is true.
5) If a difference in proportions (in a business-related problem) is called statistically significant at the 1% significance level, this means that a) If the null hypothesis is true, the difference in proportions is surprisingly small. b) *We must reject the null hypothesis. c) The difference in proportions is small enough so that we must take account of it in our business decisions. d) The null hypothesis is very likely to be true. e) All of the above
Assume a Normal distribution in 6) and 7).
[10]
6) (Wonnacott & Wonnacott) A company is discharging treated waste into a river. The firm is supposed to be fined if the average pollution level is above 16 parts per million. It is known that the population standard deviation is 6 parts per million. 9 measurements are taken. If we assume that the firm is discharging 16 parts or fewer per million, how high must the sample mean level of pollution be to cause us to doubt the assumption? (Use a 1% significance level, and don’t forget to state your hypotheses.) (3)
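Solution: We have \(\mu_0 = 16\), \(\sigma = 6\), \(n = 9\), \(H_0: \mu \le 16\), \(H_1: \mu > 16\), \(\alpha = .01\), \(z_{.01} = 2.327\), \(\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{6}{\sqrt{9}} = 2\), \(\bar{x}_{cv} = \mu_0 + z_{\alpha}\sigma_{\bar{x}} = 16 + 2.327(2) = 20.654\), so we doubt the assumption (reject \(H_0\)) if the sample mean is above 20.654.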
7) The politicians, who don’t know any statistics, decide that they will fine the company in 6) if the level of pollution exceeds 20 parts per million. It is known that the population standard deviation is 6 parts per million. 9 measurements are taken. If we assume that the firm is discharging 16 parts per million what is the probability that they will be fined? (Think p-value?) (2)
Solution: \(P(\bar{x} > 20) = P\left(z > \frac{20 - 16}{2}\right) = P(z > 2) = .5 - .4772 = .0228\)
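The probability in question 7 can be verified directly. The sketch below assumes, as the question does, a true mean of 16, a population standard deviation of 6 and 9 measurements; `p_fine` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n = 16, 6, 9
se = sigma / sqrt(n)  # standard error of the sample mean, 6 / 3 = 2

# Probability that the sample mean exceeds 20 when the true mean is 16
p_fine = 1 - NormalDist(mu_0, se).cdf(20)
print(round(p_fine, 4))
```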
[15]
8) It is a well-known fact that your factory has been producing a product that is 20% defective. We take a sample of 500 units of the product this month and find that 103 are defective. a) Assuming that the 20% figure is correct, how many units of the product must be examined before we can state our defect rate as a proportion \(\pm .03\)? (Use a 95% confidence level!!!)
Solution: The outline says “The usually suggested formula is \(n = \frac{pqz^2}{e^2}\). This is the formula everyone forgets that we covered.” So, we have \(p = .20\), \(q = 1 - p = 1 - .2 = .8\), \(z = 1.960\), \(e = .03\) and \(n = \frac{.2(.8)(1.960)^2}{(.03)^2} = 682.951\). So use a sample of at least 683.
b) If we wish to test our belief that the proportion has risen over the previous figure, let p represent the proportion of defective items. What are our null and alternative hypotheses? (1)
Solution: The statement implied in the problem, \(p > .20\), does not contain an equality, so it is the alternative hypothesis \(H_1: p > .20\), and the null hypothesis is the opposite, \(H_0: p \le .20\).
c) You already know that the sample size is 500. Using a 95% confidence level and assuming that your hypothesis in b) is correct, test the hypothesis. (2) [20]
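Solution: From Table 3.
Interval for: Proportion
Confidence Interval: \(p = \bar{p} \pm z_{\alpha/2}\, s_p\), where \(s_p = \sqrt{\frac{\bar{p}\bar{q}}{n}}\) and \(\bar{q} = 1 - \bar{p}\)
Hypotheses: \(H_0: p = p_0\), \(H_1: p \ne p_0\)
Test Ratio: \(z = \frac{\bar{p} - p_0}{\sigma_p}\)
Critical Value: \(\bar{p}_{cv} = p_0 \pm z_{\alpha/2}\,\sigma_p\), where \(\sigma_p = \sqrt{\frac{p_0 q_0}{n}}\) and \(q_0 = 1 - p_0\)
We have \(\alpha = .05\), \(z_{\alpha} = z_{.05} = 1.645\), \(p_0 = .20\), \(q_0 = 1 - .2 = .8\), \(x = 103\) and \(n = 500\). First, for the test ratio or critical value we need \(\sigma_p = \sqrt{\frac{p_0 q_0}{n}} = \sqrt{\frac{.2(.8)}{500}} = \sqrt{.00032} = .017889\). For the confidence interval we need \(\bar{p} = \frac{103}{500} = .206\), \(\bar{q} = 1 - \bar{p} = 1 - .206 = .794\) and \(s_p = \sqrt{\frac{\bar{p}\bar{q}}{n}} = \sqrt{\frac{.206(.794)}{500}} = \sqrt{.00033} = .018087\). Use one of the following 3 methods.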
Critical Value Method: Since we have \(H_1: p > .20\), this is a right-sided test. We use a critical value for the proportion of \(\bar{p}_{cv} = p_0 + z_{\alpha}\sigma_p = .20 + 1.645(.017889) = .2294\). Make a diagram showing a normal curve with a center at \(p_0 = .20\) and a rejection region above .2294. Since \(\bar{p} = .206\) is below .2294, we cannot reject the null hypothesis.
Test Ratio Method: \(z = \frac{\bar{p} - p_0}{\sigma_p} = \frac{.206 - .20}{.017889} = 0.3354\). Make a diagram showing a normal curve with a center at zero and a rejection region above 1.645. Since \(z = 0.3354\) is below 1.645, we cannot reject the null hypothesis. The p-value would be \(P(\bar{p} \ge .206) = P(z \ge 0.3354) = .5 - .1368 = .3632\).
This would lead to nonrejection of the null hypothesis for most common values of \(\alpha\), since the p-value would usually be above the significance level.
Confidence interval method: Since we have \(H_1: p > .20\), we need a one-sided ‘\(\ge\)’ interval. This would be \(p \ge \bar{p} - z_{\alpha} s_p = .206 - 1.645(0.018087) = .1762\). The null hypothesis \(H_0: p \le .20\) is not contradicted by the confidence interval \(p \ge .1762\), since any value of the proportion between .1762 and .20 will satisfy both.
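The sample-size formula in a) and the test ratio in c) can be reproduced as follows. This is a sketch under the assumptions stated in the problem (p0 = .20, 103 defectives in 500, table values 1.960 and 1.645); the p-value here uses the unrounded z, so it can differ slightly from the table-based figure.

```python
from math import ceil, sqrt
from statistics import NormalDist

p0, q0 = 0.20, 0.80

# a) sample size for an error of at most .03 at 95% confidence
z_025, e = 1.960, 0.03
n_needed = ceil(p0 * q0 * z_025**2 / e**2)

# c) right-sided test of H0: p <= .20 with 103 defectives in 500
x, n = 103, 500
p_bar = x / n
sigma_p = sqrt(p0 * q0 / n)
z_ratio = (p_bar - p0) / sigma_p
p_value = 1 - NormalDist().cdf(z_ratio)
reject = z_ratio > 1.645  # z_.05 from the table

print(n_needed, round(z_ratio, 4), round(p_value, 4), reject)
```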
ECO252 QBA2
FIRST EXAM
October 8, 2007
TAKE HOME SECTION
-
Name: _________________________
Student Number and class: _________________________
IV. Do at least 3 problems (at least 7 each) (or do sections adding to at least 20 points - Anything extra you do helps, and grades wrap around). Show your work! State \(H_0\) and \(H_1\) where appropriate. You have not done a hypothesis test unless you have stated your hypotheses, run the numbers and stated your conclusion.
(Use a 95% confidence level unless another level is specified.) Answers without reasons usually are not acceptable. Neatness and clarity of explanation are expected. This must be turned in when you take the in-class exam. Note that answers without reasons and citation of appropriate statistical tests receive no credit.
Failing to be transparent about which section of which problem you are doing can lose you credit. Many answers require a statistical test, that is, stating or implying a hypothesis and showing why it is true or false by citing a table value or a p-value. If you haven’t done it lately, take a fast look at ECO 252 - Things That You Should Never Do on a Statistics Exam (or
Anywhere Else) .
A group of 30 employees are interviewed to determine the minimum amount that they will take to give up a vacation day. After careful interviewing, a psychologist reports the following amounts.
479 648 522 595 547 657 578 539 553 520 499 606 612
616 488 557 621 628 511 634 625 612 509 633 616 598
627 622 512 631
My calculations say that the sum of these 30 numbers is \(\sum x = 17395\) and that the sum of squares is \(\sum x^2 = 10171575\). This is a sample of 30.
Personalize these data as follows. Take the second to last digit of your student number and multiply it by 5.
Add this quantity to each of the 30 numbers. If the second to last digit of your student number is 0, add 50.
Label your exam by version number as follows. If the second to last digit of your student number is 1, you are doing Version 1. If the second to last digit is 2, you are doing Version 2. Etc. If the second to last digit is zero you are doing version 10. Last term's exam said the following.
If you add a quantity a to a column of numbers, \(\sum (x + a) = \sum x + na\) and \(\sum (x + a)^2 = \sum x^2 + 2a\sum x + na^2\). For example, if \(a = 60\), \(\sum (x + 60) = \sum x + 30(60) = 17395 + 1800 = ?\) and \(\sum (x + 60)^2 = \sum x^2 + 2(60)\sum x + 30(60)^2 = 10171575 + 120(17395) + 30(3600)\).
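The two identities for shifted data can be checked numerically against the 30 amounts listed above. This is a minimal sketch; `shifted_sums` is a hypothetical helper name, and a = 60 is just the example value used in the text.

```python
amounts = [479, 648, 522, 595, 547, 657, 578, 539, 553, 520, 499, 606, 612,
           616, 488, 557, 621, 628, 511, 634, 625, 612, 509, 633, 616, 598,
           627, 622, 512, 631]
n = len(amounts)
sum_x = sum(amounts)
sum_x2 = sum(x * x for x in amounts)

def shifted_sums(a):
    """Return (sum, sum of squares) after adding a to every value, via the identities."""
    return sum_x + n * a, sum_x2 + 2 * a * sum_x + n * a * a

# Compare the shortcut with a direct recomputation for a = 60
a = 60
direct = (sum(x + a for x in amounts), sum((x + a) ** 2 for x in amounts))
print(sum_x, sum_x2, shifted_sums(a), direct)
```

Because the identities hold for any a, each version's totals follow from the unshifted sums without retyping the data.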
Problem 1: Count the number of people in your sample that demand more than $602.50 and make it into a sample proportion. Test the following 3 hypotheses: I) that 60% demand more than $602.50, II) that more than 60% demand more than $602.50 and III) that less than 60% demand more than $602.50, using a 98% confidence level.
For each of these three tests a) state your null and alternative hypotheses (2), b) test each one using a test ratio or a critical value for the proportion (2) and c) find a p-value for the null hypotheses (3). Label each part clearly so that I know which is I, II and III and a), b), c). Make sure that I know where the ‘reject’ zone is. d) Using the proportion you found above, how large a sample would you need to estimate a 2-sided 98% confidence interval for the proportion with an error of at most .001? Assume that your sample is of that size and show that the confidence interval has an error of at most .001. (3) [10] e) (Extra credit) Assume that you are testing the hypothesis that (II) more than 60% demand over $602.50, find the power of the test if you use a sample of 30 and the true proportion is 70% (3)
Problem 2: Assume that the underlying data for problem 1 is not Normal and using the data for problem 1 test the following three hypotheses: I) that the median demand is $602.50, II) that median demand is more than $602.50 and III) that the median demand is less than $602.50, using a 98% confidence level. a) state your null and alternative hypotheses and the hypotheses that you will actually test for each of the 3 tests
(3), b) test each one using a test ratio or a critical value (3), c) find a p-value for the 2-sided test and explain whether and why it would lead to a rejection of the null hypothesis at the 95% confidence level (1), d)
(extra credit) Show explicitly what the conclusion in c) would be if the sample of 30 came from a population of 60. (1) e) (extra credit) find a two sided confidence interval for the median (2) [17]
Problem 3: a) Find the sample mean and sample standard deviation for the data in Problem 1 (1) b) Test the hypothesis that the mean is 602.50 using critical values for the sample mean, first stating your hypotheses clearly. Use a 98% confidence level (2) c) Test the hypothesis in b) using a test ratio. Find an approximate p-value and state and explain whether this will lead to a rejection of the null hypothesis if we continue to use a 98% confidence level. (2) d) Using the test ratio you found in c) find a p-value for the null hypothesis that the mean is at most 602.50
(1) e) Using the test ratio you found in c) find a p-value for the null hypothesis that the mean is at least 602.50
(1) f) Test the null hypothesis that the mean is at most 602.50 using an appropriate confidence interval (1) g) Test the null hypothesis that the mean is at least 602.50 using an appropriate confidence interval (1)
[26]
Problem 4: Assume that the population standard deviation is known to be 30 but that we are still working with a problem like Problem 3. (98% confidence level, sample of 30.) Do either Problem 4.1 or Problem
4.2. Make sure that I know which one!
Problem 4.1. a) Find a critical value for the sample mean if we are testing whether the population mean is below 30. Clearly state your null and alternative hypotheses (2) b) Assume that the sample mean is 30 minus the second to last digit of your student number. (Use 10 if this digit is zero.) Find a p-value for your null hypothesis. (1) c) Create a power curve for the test (6)
Problem 4.2. a) Find critical values for the sample mean if we are testing whether the population mean is 30. Clearly state your null and alternative hypotheses (2) b) Assume that the sample mean is 30 minus the second to last digit of your student number. (Use 10 if this digit is zero.) Find a p-value for your null hypothesis. (1) c) Create a power curve for the test (8)
Problem 5: In problem 4 we assumed that the population standard deviation is 30.
[37] a) Do a 98% confidence interval for the mean using the mean that you found in Problem 3 and assuming that our sample of 30 came from a population of 300. (2) b) How large a sample would we need if we wanted to make the error term no more than
1 and the sample came from an infinite population? (2) c) Using a 98% confidence level and a sample size of 30 create a confidence interval for the population standard deviation using your sample variance or standard deviation from Problem 3. (2) d) Repeat c) assuming that you had a sample of 300. (2) e) Can we say that the standard deviation is significantly different from 30 on the basis of c) and d)? (1) f) Using the data and sample size from problem 3 can we say that the standard deviation is above 30? State your hypotheses and do an appropriate hypothesis test. (3) [49]
Problem 1: Count the number of people in your sample that demand more than $602.50 and make it into a sample proportion. Test the following 3 hypotheses: I) that 60% demand more than $602.50, II) that more than 60% demand more than $602.50 and III) that less than 60% demand more than $602.50, using a 98% confidence level.
For each of these three tests a) state your null and alternative hypotheses (2), b) test each one using a test ratio or a critical value for the proportion (2) and c) find a p-value for the null hypotheses (3). Label each part clearly so that I know which is I, II and III and a), b), c). Make sure that I know where the ‘reject’ zone is. d) Using the proportion you found above, how large a sample would you need to estimate a 2-sided 98% confidence interval for the proportion with an error of at most .001? Assume that your sample is of that size and show that the confidence interval has an error of at most .001. (3) [10] e) (Extra credit) Assume that you are testing the hypothesis that (II) more than 60% demand over $602.50, find the power of the test if you use a sample of 30 and the true proportion is 70% (3)
Solution: The data sets that you had are presented in order. A line divides the numbers above $602.50 from those below. \(x_b\) is the number below 602.50. \(x = 30 - x_b\) is the number above 602.50.
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
1 484 489 494 499 504 509 514 519 524 529
2 493 498 503 508 513 518 523 528 533 538
3 504 509 514 519 524 529 534 539 544 549
4 514 519 524 529 534 539 544 549 554 559
5 516 521 526 531 536 541 546 551 556 561
6 517 522 527 532 537 542 547 552 557 562
7 525 530 535 540 545 550 555 560 565 570
8 527 532 537 542 547 552 557 562 567 572
9 544 549 554 559 564 569 574 579 584 589
10 552 557 562 567 572 577 582 587 592 597
11 558 563 568 573 578 583 588 593 598 603
12 562 567 572 577 582 587 592 597 602 607
13 583 588 593 598 603 608 613 618 623 628
14 600 605 610 615 620 625 630 635 640 645
15 603 608 613 618 623 628 633 638 643 648
16 611 616 621 626 631 636 641 646 651 656
17 617 622 627 632 637 642 647 652 657 662
18 617 622 627 632 637 642 647 652 657 662
19 621 626 631 636 641 646 651 656 661 666
20 621 626 631 636 641 646 651 656 661 666
21 626 631 636 641 646 651 656 661 666 671
22 627 632 637 642 647 652 657 662 667 672
23 630 635 640 645 650 655 660 665 670 675
24 632 637 642 647 652 657 662 667 672 677
25 633 638 643 648 653 658 663 668 673 678
26 636 641 646 651 656 661 666 671 676 681
27 638 643 648 653 658 663 668 673 678 683
28 639 644 649 654 659 664 669 674 679 684
29 653 658 663 668 673 678 683 688 693 698
30 662 667 672 677 682 687 692 697 702 707
x_b 14 13 13 13 12 12 12 12 12 10
x 16 17 17 17 18 18 18 18 18 20
Let p be the proportion that demand more than $602.50. a) State your null and alternative hypotheses (2)
I) 60% demand more than $602.50: \(H_0: p = .6\), \(H_1: p \ne .6\)
II) More than 60% demand more than $602.50: \(H_0: p \le .6\), \(H_1: p > .6\)
III) Less than 60% demand more than $602.50: \(H_0: p \ge .6\), \(H_1: p < .6\)
b) Test each hypothesis using a test ratio or a critical value for the proportion (2)
The relevant formulas are in Table 3. \(n = 30\), \(\alpha = .02\), \(z_{\alpha} = z_{.02} = 2.054\) was found in Grass1 and the t table says \(z_{\alpha/2} = z_{.01} = 2.327\). \(\sigma_p = \sqrt{\frac{p_0 q_0}{n}} = \sqrt{\frac{.6(.4)}{30}} = \sqrt{.008} = .08944\)
Version V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
x 16 17 17 17 18 18 18 18 18 20
\(\bar{p} = \frac{x}{n} = \frac{x}{30}\) .5333 .5667 .5667 .5667 .6000 .6000 .6000 .6000 .6000 .6667
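Here is the slice of Table 3 for proportions.
Interval for: Proportion
Confidence Interval: \(p = \bar{p} \pm z_{\alpha/2}\, s_p\), where \(s_p = \sqrt{\frac{\bar{p}\bar{q}}{n}}\) and \(\bar{q} = 1 - \bar{p}\)
Hypotheses: \(H_0: p = p_0\), \(H_1: p \ne p_0\)
Test Ratio: \(z = \frac{\bar{p} - p_0}{\sigma_p}\)
Critical Value: \(\bar{p}_{cv} = p_0 \pm z_{\alpha/2}\,\sigma_p\), where \(\sigma_p = \sqrt{\frac{p_0 q_0}{n}}\) and \(q_0 = 1 - p_0\)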
I) 60% demand more than $602.50: \(H_0: p = .6\), \(H_1: p \ne .6\)
Critical Value: \(\bar{p}_{cv} = p_0 \pm z_{\alpha/2}\,\sigma_p = .6 \pm 2.327(.08944) = .6 \pm .2081\)
Make a diagram. Draw a Normal curve centered at .6 with rejection zones below .3919 and above .8081. None of the values of \(\bar{p}\) falls into the rejection region.
Test Ratio: \(z = \frac{\bar{p} - p_0}{\sigma_p} = \frac{\bar{p} - .6}{.08944}\). We reject the null hypothesis unless z falls between \(-z_{\alpha/2} = -z_{.01} = -2.327\) and \(z_{\alpha/2} = z_{.01} = 2.327\). V1: \(z = \frac{.5333 - .6}{.08944} = -0.7458\), V2-4: \(z = \frac{.5667 - .6}{.08944} = -0.3723\), V5-9: \(z = \frac{.6000 - .6}{.08944} = 0\), V10: \(z = \frac{.6667 - .6}{.08944} = 0.7458\). None of these falls in the ‘reject’ region.
II) More than 60% demand more than $602.50: \(H_0: p \le .6\), \(H_1: p > .6\)
Critical Value: \(\bar{p}_{cv} = p_0 + z_{\alpha}\sigma_p = .6 + 2.054(.08944) = .6 + 0.1837 = .7837\). Make a diagram. Draw a Normal curve centered at .6 with a rejection zone above .7837. None of our values of \(\bar{p}\) falls into the rejection zone.
Test Ratio: \(z = \frac{\bar{p} - p_0}{\sigma_p} = \frac{\bar{p} - .6}{.08944}\). We reject the null hypothesis if z falls above \(z_{.02} = 2.054\).
None of our values of z falls into the rejection zone.
III) Less than 60% demand more than $602.50: \(H_0: p \ge .6\), \(H_1: p < .6\)
Critical Value: \(\bar p_{cv} = p_0 - z_{\alpha}\, \sigma_{\bar p} = .6 - 2.054(.08966) = .6 - .1842 = .4158\). Make a diagram. Draw a Normal curve centered at .6 with a rejection zone below .4158. None of our values of \(\bar p\) falls into the rejection zone.
Test Ratio: \(z = \frac{\bar p - p_0}{\sigma_{\bar p}} = \frac{\bar p - .6}{.08944}\). We reject the null hypothesis if z falls below \(-z_{.02} = -2.054\).
None of our values of z falls into the rejection zone.
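The test ratios for the three proportion tests can be checked with a short script. This is only a sketch: the helper name `proportion_z` is made up for illustration, the critical values 2.327 and 2.054 are the table values quoted in the text, and because the script does not round \(\bar p\) to four places its z values can differ from the key in the third or fourth decimal.

```python
from math import sqrt

def proportion_z(x, n, p0):
    """Return (p_bar, z) where z = (p_bar - p0) / sigma_p under the null."""
    p_bar = x / n
    sigma_p = sqrt(p0 * (1 - p0) / n)  # standard error computed from p0, not p_bar
    return p_bar, (p_bar - p0) / sigma_p

n, p0 = 30, 0.6
z_two_sided, z_one_sided = 2.327, 2.054  # z_.01 and z_.02 as quoted in the text

for x in (16, 17, 18, 20):  # counts above $602.50 for V1, V2-4, V5-9, V10
    p_bar, z = proportion_z(x, n, p0)
    print(f"x={x} p_bar={p_bar:.4f} z={z:+.4f} "
          f"reject I: {abs(z) > z_two_sided} "
          f"reject II: {z > z_one_sided} "
          f"reject III: {z < -z_one_sided}")
```

Every version should report no rejection for all three tests, in agreement with the conclusions above.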
c) Find a p-value for the null hypotheses
In response to a student inquiry, I wrote the following paragraph about p-value.
To compute a p-value from a value of z or t: if it is a left-sided test, find the probability below your value of z using what you know about finding Normal probabilities (if it is t, approximate the probability using the t table). If it is a right-sided test, find the probability above your value of z. If it is a 2-sided test and z is negative, proceed as you would in a left-sided test and double the probability. If it is a 2-sided test and z is positive, proceed as you would in a right-sided test and double the probability.
I) 60% demand more than $602.50: \(H_0: p = .6\), \(H_1: p \ne .6\)
V1: \(z = -0.7458\), p-value \(= 2P(z \le -0.7458) = 2(.5 - .2734) = .4532\)
V2-4: \(z = -0.3723\), p-value \(= 2P(z \le -0.3723) = 2(.5 - .1443) = .7114\)
V5-9: \(z = 0\), p-value \(= 2P(z \le 0) = 2(.5) = 1.0000\)
V10: \(z = 0.7458\), p-value \(= 2P(z \ge 0.7458) = 2(.5 - .2734) = .4532\)
II) More than 60% demand more than $602.50: \(H_0: p \le .6\), \(H_1: p > .6\)
V1: \(z = -0.7458\), p-value \(= P(z \ge -0.7458) = .5 + .2734 = .7734\)
V2-4: \(z = -0.3723\), p-value \(= P(z \ge -0.3723) = .5 + .1443 = .6443\)
V5-9: \(z = 0\), p-value \(= P(z \ge 0) = .5\)
V10: \(z = 0.7458\), p-value \(= P(z \ge 0.7458) = .5 - .2734 = .2266\)
III) Less than 60% demand more than $602.50: \(H_0: p \ge .6\), \(H_1: p < .6\)
V1: \(z = -0.7458\), p-value \(= P(z \le -0.7458) = .5 - .2734 = .2266\)
V2-4: \(z = -0.3723\), p-value \(= P(z \le -0.3723) = .5 - .1443 = .3557\)
V5-9: \(z = 0\), p-value \(= P(z \le 0) = .5\)
V10: \(z = 0.7458\), p-value \(= P(z \le 0.7458) = .5 + .2734 = .7734\)
d) Using the proportion you found above, how large a sample would you need to estimate a 2-sided 98% confidence interval for the proportion with an error of at most .001? Assume that your sample is of that size and show that the confidence interval has an error of at most .001. (3) [10] n
pqz
2 e
2
. I have not worked this out for all versions, and it is up to you to decide what confidence level you will use. A solution for Version 1 with that you could get. n
.
5333
.
6667
.
2
2 .
328
2
1203 .
3
.
01 is probably the largest result
The sample should be at least 1204.
e) (Extra credit) Assume that you are testing the hypothesis that (II) more than 60% demand over $602.50. Find the power of the test if you use a sample of 30 and the true proportion is 70%. (3)
Find \(P(\bar p \ge .7842 \mid p = .7) = P\left(z \ge \frac{.7842 - .7}{\sqrt{.7(.3)/30}}\right)\). If your answer was close to this and I didn’t give you credit, complain.
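As a rough check on the power expression, the probability can be evaluated numerically. This is a sketch under the Normal approximation used in the text; the function name `power_right_sided` is hypothetical and the critical value .7842 is taken from part b).

```python
from math import sqrt
from statistics import NormalDist

def power_right_sided(p_cv, p1, n):
    """P(p_bar >= p_cv) when the true proportion is p1 (Normal approximation)."""
    sigma_p1 = sqrt(p1 * (1 - p1) / n)  # standard error under the alternative
    z = (p_cv - p1) / sigma_p1
    return 1 - NormalDist().cdf(z)

power = power_right_sided(p_cv=0.7842, p1=0.7, n=30)
print(f"power = {power:.4f}")
```

Note that the standard error here uses the true proportion .7, not the null value .6, because power is computed under the alternative.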
Problem 2: Assume that the underlying data for problem 1 is not Normal and using the data for problem 1 test the following three hypotheses: I) that the median demand is $602.50, II) that median demand is more than $602.50 and III) that the median demand is less than $602.50, using a 98% confidence level. a) state your null and alternative hypotheses and the hypotheses that you will actually test for each of the 3 tests
(3), b) test each one using a test ratio or a critical value (3), c) find a p-value for the 2-sided test and explain whether and why it would lead to a rejection of the null hypothesis at the 95% confidence level (1), d)
(extra credit) Show explicitly what the conclusion in c) would be if the sample of 30 came from a population of 60. (1) e) (extra credit) find a two sided confidence interval for the median (2) [17]
Let p be the proportion that demand more than $602.50. The data has been exhibited in Problem 1. We have calculated the following.
Version V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
x 16 17 17 17 18 18 18 18 18 20
\(\bar p = x/n = x/30\) .5333 .5667 .5667 .5667 .6000 .6000 .6000 .6000 .6000 .6667
The relevant formulas are in Table 3. \(n = 30\), \(\alpha = .02\), \(z_{\alpha} = z_{.02} = 2.054\) was found in Grass1 and the t table says \(z_{\alpha/2} = z_{.01} = 2.327\).
\(\sigma_{\bar p} = \sqrt{\frac{p_0 q_0}{n}} = \sqrt{\frac{.5(.5)}{30}} = \sqrt{.00833} = .091287\).
If we check the table in the outline { 252ones }, we have the correspondences below, where \(\eta\) is the median and \(\eta_0\) is its hypothesized value. Since p is the proportion above $602.50, we will use the hypotheses about the proportion above \(\eta_0\).
Hypotheses about a median   |   If p is the proportion above \(\eta_0\)   |   If p is the proportion below \(\eta_0\)
\(H_0: \eta = \eta_0\), \(H_1: \eta \ne \eta_0\)   |   \(H_0: p = .5\), \(H_1: p \ne .5\)   |   \(H_0: p = .5\), \(H_1: p \ne .5\)
\(H_0: \eta \ge \eta_0\), \(H_1: \eta < \eta_0\)   |   \(H_0: p \ge .5\), \(H_1: p < .5\)   |   \(H_0: p \le .5\), \(H_1: p > .5\)
\(H_0: \eta \le \eta_0\), \(H_1: \eta > \eta_0\)   |   \(H_0: p \le .5\), \(H_1: p > .5\)   |   \(H_0: p \ge .5\), \(H_1: p < .5\)
a) State your null and alternative hypotheses and the hypotheses that you will actually test for each of the 3 tests (3) b) Test each one using a test ratio or a critical value (3)
I) The median demand is $602.50: \(H_0: \text{median} = 602.50\), \(H_1: \text{median} \ne 602.50\), which we test as \(H_0: p = .5\), \(H_1: p \ne .5\)
Critical Value: \(\bar p_{cv} = p_0 \pm z_{\alpha/2}\, \sigma_{\bar p} = .5 \pm 2.327(.091287) = .5 \pm .2124\). Make a diagram. Draw a Normal curve centered at .5 with rejection zones below .2876 and above .7124. None of the values of \(\bar p\) falls into the rejection region.
Test Ratio: \(z = \frac{\bar p - p_0}{\sigma_{\bar p}} = \frac{\bar p - .5}{.091287}\). We reject the null hypothesis unless z falls between \(-z_{\alpha/2} = -z_{.01} = -2.327\) and \(z_{\alpha/2} = z_{.01} = 2.327\).
V1: \(z = \frac{.5333 - .5}{.091287} = 0.3648\)
V2-4: \(z = \frac{.5667 - .5}{.091287} = 0.7307\)
V5-9: \(z = \frac{.6000 - .5}{.091287} = 1.0954\)
V10: \(z = \frac{.6667 - .5}{.091287} = 1.8261\)
None of these falls in the ‘reject’ region.
II) The median demand is more than $602.50: \(H_0: \text{median} \le 602.50\), \(H_1: \text{median} > 602.50\), which we test as \(H_0: p \le .5\), \(H_1: p > .5\)
Critical Value: This is a right-sided test so our critical value is \(\bar p_{cv} = p_0 + z_{\alpha}\, \sigma_{\bar p} = .5 + 2.054(.091287) = .5 + .1875 = .6875\). Make a diagram. Draw a Normal curve centered at .5 with a rejection zone above .6875. None of the values of \(\bar p\) falls into the rejection region.
Test Ratio: \(z = \frac{\bar p - p_0}{\sigma_{\bar p}} = \frac{\bar p - .5}{.091287}\). We reject the null hypothesis if z falls above \(z_{\alpha} = z_{.02} = 2.054\). V1: \(z = 0.3648\), V2-4: \(z = 0.7307\), V5-9: \(z = 1.0954\), V10: \(z = 1.8261\). None of these falls in the ‘reject’ region.
III) The median demand is less than $602.50: \(H_0: \text{median} \ge 602.50\), \(H_1: \text{median} < 602.50\), which we test as \(H_0: p \ge .5\), \(H_1: p < .5\)
Critical Value: This is a left-sided test so our critical value is \(\bar p_{cv} = p_0 - z_{\alpha}\, \sigma_{\bar p} = .5 - 2.054(.091287) = .5 - .1875 = .3125\). Make a diagram. Draw a Normal curve centered at .5 with a rejection zone below .3125. None of the values of \(\bar p\) falls into the rejection region.
Test Ratio: \(z = \frac{\bar p - p_0}{\sigma_{\bar p}} = \frac{\bar p - .5}{.091287}\). We reject the null hypothesis if z falls below \(-z_{.02} = -2.054\). V1: \(z = 0.3648\), V2-4: \(z = 0.7307\), V5-9: \(z = 1.0954\), V10: \(z = 1.8261\). None of these falls in the ‘reject’ region.
Alternate formulas for this section include those below.
i. Test Ratio: With continuity correction z
p
.
5 n
p
p
0
,
p
p
0 q
0 n
or z
2 x
1
n
. This is the n same as testing z against
z
2
.
5 n
p
. Without continuity correction z
2 x
n
. To allow for a finite n population, divide these by ii. Critical Value: p cv
p
0
N
n
N
1
2 n
1
.
z
2
p
. To make a finite population correction, multiply
p
by
N
n
N
1
.
iii. Confidence Interval: p
p
1
2 n
z
2 s p
. To make a finite population correction, multiply s p
by
N
N
n
1
. c) Find a p-value for the 2-sided test and explain whether and why it would lead to a rejection of the null hypothesis at the 95% confidence level (1) . See problem 1 for an explanation of p-value.
I) The median demand is $602.50: \(H_0: \text{median} = 602.50\), \(H_1: \text{median} \ne 602.50\) (\(H_0: p = .5\), \(H_1: p \ne .5\))
V1: \(z = 0.3648\), p-value \(= 2P(z \ge 0.3648) = 2(.5 - .1406) = .7188\)
V2-4: \(z = 0.7307\), p-value \(= 2P(z \ge 0.7307) = 2(.5 - .2673) = .4654\)
V5-9: \(z = 1.0954\), p-value \(= 2P(z \ge 1.0954) = 2(.5 - .3621) = .2758\)
V10: \(z = 1.8261\), p-value \(= 2P(z \ge 1.8261) = 2(.5 - .4664) = .0672\)
Since none of these are below the significance level of 5%, none of these lead to a rejection of the null hypothesis at a 95% confidence level.
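The two-sided p-values for the median test can be reproduced numerically. The sketch below uses the Normal approximation without a continuity correction, as in the worked solution; `sign_test_two_sided` is an illustrative name, and since it uses the exact Normal cdf instead of a two-decimal table lookup, its results differ slightly from the table-based figures.

```python
from math import sqrt
from statistics import NormalDist

def sign_test_two_sided(x, n):
    """Return (z, p_value) for H0: p = .5 against H1: p != .5."""
    p_bar = x / n
    sigma_p = sqrt(0.5 * 0.5 / n)       # standard error under p0 = .5
    z = (p_bar - 0.5) / sigma_p
    tail = 1 - NormalDist().cdf(abs(z))  # area beyond |z| in one tail
    return z, 2 * tail

for label, x in (("V1", 16), ("V2-4", 17), ("V5-9", 18), ("V10", 20)):
    z, p_value = sign_test_two_sided(x, 30)
    print(f"{label}: z={z:.4f} p-value={p_value:.4f} reject at 5%: {p_value < 0.05}")
```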
II) The median demand is more than $602.50: \(H_0: \text{median} \le 602.50\), \(H_1: \text{median} > 602.50\) (\(H_0: p \le .5\), \(H_1: p > .5\))
V1: \(z = 0.3648\), p-value \(= P(z \ge 0.3648) = .5 - .1406 = .3594\)
V2-4: \(z = 0.7307\), p-value \(= P(z \ge 0.7307) = .5 - .2673 = .2327\)
V5-9: \(z = 1.0954\), p-value \(= P(z \ge 1.0954) = .5 - .3621 = .1379\)
V10: \(z = 1.8261\), p-value \(= P(z \ge 1.8261) = .5 - .4664 = .0336\)
Since only the p-value for Version 10 is below the significance level of 5%, only in version 10 do we reject the null hypothesis at a 95% confidence level.
III) The median demand is less than $602.50: \(H_0: \text{median} \ge 602.50\), \(H_1: \text{median} < 602.50\) (\(H_0: p \ge .5\), \(H_1: p < .5\))
V1: \(z = 0.3648\), p-value \(= P(z \le 0.3648) = .5 + .1406 = .6406\)
V2-4: \(z = 0.7307\), p-value \(= P(z \le 0.7307) = .5 + .2673 = .7673\)
V5-9: \(z = 1.0954\), p-value \(= P(z \le 1.0954) = .5 + .3621 = .8621\)
V10: \(z = 1.8261\), p-value \(= P(z \le 1.8261) = .5 + .4664 = .9664\)
Since none of these are below the significance level of 5%, none of these lead to a rejection of the null hypothesis at a 95% confidence level.
d) (Extra credit) Show explicitly what the conclusion in c) would be if the sample of 30 came from a population of 60. (1)
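\(\sigma_{\bar p} = \sqrt{\frac{N-n}{N-1}\cdot\frac{p_0 q_0}{n}} = \sqrt{\frac{60-30}{60-1}\cdot\frac{.5(.5)}{30}} = \sqrt{0.50847(.00833)} = \sqrt{.0042373} = .065094\)
The test ratio is \(z = \frac{\bar p - p_0}{\sigma_{\bar p}} = \frac{\bar p - .5}{.065094}\) and is now larger in absolute value than it was in c). We can put the p-values for the one-sided hypotheses under the hypotheses in the table below.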
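Right-sided p-values are for \(H_0: \text{median} \le 602.50\), \(H_1: \text{median} > 602.50\) (\(H_0: p \le .5\), \(H_1: p > .5\)); left-sided p-values are for \(H_0: \text{median} \ge 602.50\), \(H_1: \text{median} < 602.50\) (\(H_0: p \ge .5\), \(H_1: p < .5\)).
V1: \(z = \frac{.5333 - .5}{.065094} = 0.5116\); right-sided \(P(z \ge 0.5116) = .5 - .1950 = .3050\); left-sided \(P(z \le 0.5116) = .5 + .1950 = .6950\)
V2-4: \(z = \frac{.5667 - .5}{.065094} = 1.0247\); right-sided \(P(z \ge 1.0247) = .5 - .3461 = .1539\); left-sided \(P(z \le 1.0247) = .5 + .3461 = .8461\)
V5-9: \(z = \frac{.6000 - .5}{.065094} = 1.5362\); right-sided \(P(z \ge 1.5362) = .5 - .4382 = .0618\); left-sided \(P(z \le 1.5362) = .5 + .4382 = .9382\)
V10: \(z = \frac{.6667 - .5}{.065094} = 2.5609\); right-sided \(P(z \ge 2.5609) = .5 - .4948 = .0052\); left-sided \(P(z \le 2.5609) = .5 + .4948 = .9948\)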
If we look at these, mentally double the smaller of the two probabilities and compare the p-values with \(\alpha = .05\), we see that, though some of these p-values have fallen, the only change to our results is that for Version 10 we will now reject the null hypothesis for \(H_0: \text{median} = 602.50\), \(H_1: \text{median} \ne 602.50\) (\(H_0: p = .5\), \(H_1: p \ne .5\)) as well as for the right-sided test.
e) (Extra credit) Find a two sided confidence interval for the median (2). At this point I’m unsure if there is any significance level implied. The easiest way for me to do this is to copy out the first part of the binomial table for \(n = 30\). { bin } Recall that if we take the kth number from both the top and the bottom as our interval, we get a significance level of \(2P(x \le k-1)\), when \(p = .5\). For example if \(n = 50\), \(2P(x \le 19-1) = 2(.03245) = .06490\), \(2P(x \le 18-1) = 2(.01642) = .03284\), \(2P(x \le 17-1) = 2(.00767) = .01534\) and \(2P(x \le 16-1) = 2(.00330) = .00660\). If the confidence level is to be at least \(1-\alpha\), the significance level must be at most \(\alpha\). So, if we want a 95% confidence interval we need \(k = 18\) (and \(50 - k + 1 = 33\)), for a 98% confidence level we need \(k = 17\) (and 34) and for a 99% confidence level we need \(k = 16\) (and 35). Unfortunately, we do not have a binomial table for \(n = 30\) so we must try \(k = \frac{n + 1 - z_{\alpha/2}\sqrt{n}}{2}\). For 95% this would be \(k = \frac{30 + 1 - 1.960\sqrt{30}}{2} = 10.13\), for 98% this would be \(k = \frac{30 + 1 - 2.327\sqrt{30}}{2} = 9.13\) and for 99% this would be \(k = \frac{30 + 1 - 2.576\sqrt{30}}{2} = 8.44\). These are all rounded down and are paired with the number with index \(n - k + 1 = 30 - k + 1\), which takes the values 21, 22 and 23.
The confidence intervals are given on the following table.
95% confidence interval for the median
Index V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
10 552 557 562 567 572 577 582 587 592 597
to
21 626 631 636 641 646 651 656 661 666 671
98% confidence interval for the median
Index V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
9 544 549 554 559 564 569 574 579 584 589
to
22 627 632 637 642 647 652 657 662 667 672
99% confidence interval for the median
Index V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
8 527 532 537 542 547 552 557 562 567 572
to
23 630 635 640 645 650 655 660 665 670 675
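The index rule for these intervals can be sketched in a few lines. The code assumes the Normal-approximation rule \(k = (n + 1 - z_{\alpha/2}\sqrt{n})/2\) rounded down, paired with index \(n - k + 1\); the function names are illustrative and the data are the Version 1 column from Problem 1.

```python
from math import floor, sqrt

def median_ci_indices(n, z):
    """Return 1-based order-statistic indices (k, n - k + 1) for the median interval."""
    k = floor((n + 1 - z * sqrt(n)) / 2)
    return k, n - k + 1

def median_ci(data, z):
    """Pick the k-th smallest and k-th largest observations as the interval ends."""
    ordered = sorted(data)
    k, upper = median_ci_indices(len(ordered), z)
    return ordered[k - 1], ordered[upper - 1]

v1 = [484, 653, 527, 600, 552, 662, 583, 544, 558, 525,
      504, 611, 617, 621, 493, 562, 626, 633, 516, 639,
      630, 617, 514, 638, 621, 603, 632, 627, 517, 636]

for level, z in (("95%", 1.960), ("98%", 2.327), ("99%", 2.576)):
    print(level, median_ci_indices(len(v1), z), median_ci(v1, z))
```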
Problem 3: a) Find the sample mean and sample standard deviation for the data in Problem 1 (1) b) Test the hypothesis that the mean is 602.50 using critical values for the sample mean, first stating your hypotheses clearly. Use a 98% confidence level (2) c) Test the hypothesis in b) using a test ratio. Find an approximate p-value and state and explain whether this will lead to a rejection of the null hypothesis if we continue to use a 98% confidence level. (2) d) Using the test ratio you found in c) find a p-value for the null hypothesis that the mean is at most 602.50
(1) e) Using the test ratio you found in c) find a p-value for the null hypothesis that the mean is at least 602.50
(1) f) Test the null hypothesis that the mean is at most 602.50 using an appropriate confidence interval (1) g) Test the null hypothesis that the mean is at least 602.50 using an appropriate confidence interval (1)
[26]
The data in use are as below.
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
1 484 489 494 499 504 509 514 519 524 529
2 653 658 663 668 673 678 683 688 693 698
3 527 532 537 542 547 552 557 562 567 572
4 600 605 610 615 620 625 630 635 640 645
5 552 557 562 567 572 577 582 587 592 597
6 662 667 672 677 682 687 692 697 702 707
7 583 588 593 598 603 608 613 618 623 628
8 544 549 554 559 564 569 574 579 584 589
9 558 563 568 573 578 583 588 593 598 603
10 525 530 535 540 545 550 555 560 565 570
11 504 509 514 519 524 529 534 539 544 549
12 611 616 621 626 631 636 641 646 651 656
13 617 622 627 632 637 642 647 652 657 662
14 621 626 631 636 641 646 651 656 661 666
15 493 498 503 508 513 518 523 528 533 538
16 562 567 572 577 582 587 592 597 602 607
17 626 631 636 641 646 651 656 661 666 671
18 633 638 643 648 653 658 663 668 673 678
19 516 521 526 531 536 541 546 551 556 561
20 639 644 649 654 659 664 669 674 679 684
21 630 635 640 645 650 655 660 665 670 675
22 617 622 627 632 637 642 647 652 657 662
23 514 519 524 529 534 539 544 549 554 559
24 638 643 648 653 658 663 668 673 678 683
25 621 626 631 636 641 646 651 656 661 666
26 603 608 613 618 623 628 633 638 643 648
27 632 637 642 647 652 657 662 667 672 677
28 627 632 637 642 647 652 657 662 667 672
29 517 522 527 532 537 542 547 552 557 562
30 636 641 646 651 656 661 666 671 676 681
Minitab offers the summary statistics below.
Version n \(\bar x\) \(s_{\bar x}\) \(s\) Q1 Median Q3 \(\sum x\) \(\sum x^2\)
1 30 584.83 9.91 54.26 526.50 607.00 630.50 17545 10346275
2 30 589.83 9.91 54.26 531.50 612.00 635.50 17695 10522475
3 30 594.83 9.91 54.26 536.50 617.00 640.50 17845 10700175
4 30 599.83 9.91 54.26 541.50 622.00 645.50 17995 10879375
5 30 604.83 9.91 54.26 546.50 627.00 650.50 18145 11060075
6 30 609.83 9.91 54.26 551.50 632.00 655.50 18295 11242275
7 30 614.83 9.91 54.26 556.50 637.00 660.50 18445 11425975
8 30 619.83 9.91 54.26 561.50 642.00 665.50 18595 11611175
9 30 624.83 9.91 54.26 566.50 647.00 670.50 18745 11797875
10 30 629.83 9.91 54.26 571.50 652.00 675.50 18895 11986075
a) Find the sample mean and sample standard deviation for the data in Problem 1 (1)
Solution: There isn’t a good reason to repeat the calculations here for more than one example.
So I will stick to Version 1
Index \(x\) \(x^2\) \(x - \bar x\) \((x - \bar x)^2\)
1 484 234256 -100.833 10167.4
2 653 426409 68.167 4646.7
3 527 277729 -57.833 3344.7
4 600 360000 15.167 230.0
5 552 304704 -32.833 1078.0
6 662 438244 77.167 5954.7
7 583 339889 -1.833 3.4
8 544 295936 -40.833 1667.4
9 558 311364 -26.833 720.0
10 525 275625 -59.833 3580.0
11 504 254016 -80.833 6534.0
12 611 373321 26.167 684.7
13 617 380689 32.167 1034.7
14 621 385641 36.167 1308.0
15 493 243049 -91.833 8433.4
16 562 315844 -22.833 521.4
17 626 391876 41.167 1694.7
18 633 400689 48.167 2320.0
19 516 266256 -68.833 4738.0
20 639 408321 54.167 2934.0
21 630 396900 45.167 2040.0
22 617 380689 32.167 1034.7
23 514 264196 -70.833 5017.4
24 638 407044 53.167 2826.7
25 621 385641 36.167 1308.0
26 603 363609 18.167 330.0
27 632 399424 47.167 2224.7
28 627 393129 42.167 1778.0
29 517 267289 -67.833 4601.4
30 636 404496 51.167 2618.0
Sum 17545 10346275 0.000 85374.2
For these numbers \(\sum x = 17545\), \(\sum x^2 = 10346275\) and \(n = 30\). This means that \(\bar x = \frac{\sum x}{n} = \frac{17545}{30} = 584.8333\). If we subtract this mean from all 30 numbers in the first column, we get the 3rd column, which has the sum \(\sum (x - \bar x) = 0\). If we square the 3rd column, we get \(\sum (x - \bar x)^2 = 85374.2\).
Using the computational or definitional formula, we get the following.
\(s^2 = \frac{\sum x^2 - n\bar x^2}{n-1} = \frac{10346275 - 30(584.8333)^2}{29} = \frac{\sum (x - \bar x)^2}{n-1} = \frac{85374.17}{29} = 2943.9367\)
So \(s = \sqrt{2943.9367} = 54.25806\) and \(s_{\bar x} = \frac{s}{\sqrt{n}} = \sqrt{\frac{2943.9367}{30}} = \sqrt{98.1312} = 9.9061\).
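These summary statistics are easy to confirm by machine. The following is a small sketch using only the Python standard library on the Version 1 data; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

v1 = [484, 653, 527, 600, 552, 662, 583, 544, 558, 525,
      504, 611, 617, 621, 493, 562, 626, 633, 516, 639,
      630, 617, 514, 638, 621, 603, 632, 627, 517, 636]

n = len(v1)
x_bar = mean(v1)        # sum(x) / n
s = stdev(v1)           # sample standard deviation, divisor n - 1
se_mean = s / sqrt(n)   # standard error of the mean

# The computational formula gives the same variance as the definitional one.
s2_computational = (sum(x * x for x in v1) - n * x_bar ** 2) / (n - 1)

print(f"n={n} sum={sum(v1)} mean={x_bar:.4f} s={s:.5f} se={se_mean:.4f}")
```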
You could have gotten this using the shortcut at the beginning of the Takehome document as follows.
x
a
1071575
na
2
17545
,
17395
330
2
30
17545
10346275 .
x
a
2
x
2
na
2
b) Test the hypothesis that the mean is 602.50 using critical values for the sample mean, first stating your hypotheses clearly. Use a 98% confidence level (2)
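Solution: As usual, we go back to Table 3.
Interval for: Mean (\(\sigma\) unknown)
Confidence Interval: \(\mu = \bar x \pm t_{\alpha/2}\, s_{\bar x}\), with \(DF = n - 1\)
Hypotheses: \(H_0: \mu = \mu_0\), \(H_1: \mu \ne \mu_0\)
Test Ratio: \(t = \frac{\bar x - \mu_0}{s_{\bar x}}\)
Critical Value: \(\bar x_{cv} = \mu_0 \pm t_{\alpha/2}\, s_{\bar x}\), where \(s_{\bar x} = \frac{s}{\sqrt{n}}\)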
Note that \(\alpha = .02\) and \(n = 30\), so that \(t_{\alpha/2}^{(29)} = t_{.01}^{(29)} = 2.462\). Recall that \(s_{\bar x} = 9.9061\) and \(H_0: \mu = 602.50\), \(H_1: \mu \ne 602.50\).
\(\bar x_{cv} = \mu_0 \pm t_{\alpha/2}\, s_{\bar x} = 602.50 \pm 2.462(9.9061) = 602.50 \pm 24.39\) or 578.11 to 626.89
Make a diagram. Show an approximately Normal curve with a mean at 602.50 and shaded ‘reject’ zones above 626.89 and below 578.11. None of the means below will fall into a ‘reject’ zone except the sample mean for Version 10.
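The critical-value comparison can be scripted as follows. This sketch takes the table value 2.462 and the standard error 9.9061 from the text as given constants; the variable names are illustrative.

```python
mu0 = 602.50
t_crit = 2.462   # t_.01 with 29 degrees of freedom, from the t table
se_mean = 9.9061 # standard error of the mean from part a)

lower = mu0 - t_crit * se_mean
upper = mu0 + t_crit * se_mean

# Sample means for Versions 1 through 10 step up by 5 from 584.83.
means = [584.83 + 5 * i for i in range(10)]
rejected = [version for version, m in enumerate(means, start=1)
            if not lower <= m <= upper]

print(f"do-not-reject zone: {lower:.2f} to {upper:.2f}")
print("versions that reject:", rejected)
```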
Version x
1 584.83
2 589.83
3 594.83
4 599.83
5 604.83
6 609.83
7 614.83
8 619.83
9 624.83
10 629.83
c) Test the hypothesis in b) using a test ratio. Find an approximate p-value and state and explain whether this will lead to a rejection of the null hypothesis if we continue to use a 98% confidence level. (2)
\(t = \frac{\bar x - \mu_0}{s_{\bar x}} = \frac{\bar x - 602.50}{9.9061}\). If you wish, make an approximately Normal curve with a mean at zero and ‘reject’ zones above \(t_{.01}^{(29)} = 2.462\) and below -2.462 and compare your value of t with the ratios computed below.
In order to find the p-value, we look at the t table to find the following for 29 degrees of freedom.
df .45 .40 .35 .30 .25 .20 .15 .10 .05 .025 .01 .005 .001
29 0.127 0.256 0.389 0.530 0.683 0.854 1.055 1.311 1.699 2.045 2.462 2.756 3.396
Values of t appear in the table below. If we compare \(t = -1.78375\), the ratio for Version 1 (\(\bar x = 584.83\)), with the table values we find \(t_{.05} = 1.699 < 1.78375 < t_{.025} = 2.045\). Since \(t_{.05} = 1.699\) means \(P(t > 1.699) = .05\), we can conclude that \(.025 < P(t > 1.78375) < .05\) or, by symmetry, \(.025 < P(t < -1.78375) < .05\). For a two-sided test the p-value is the probability of getting something as extreme as or more extreme than the observed ratio, which is twice the probability \(P(t < -1.78375)\), so we can say that \(.05 < p\text{-value} < .10\). The rest are shown on the table below.
Version   \(\bar x\)   \(t_{comp}\)   Location of \(|t_{comp}|\)   Approximate p-value
1   584.83   -1.78375   \(t_{.05} = 1.699 < |t_{comp}| < t_{.025} = 2.045\)   \(.05 < p\text{-value} < .10\)
2   589.83   -1.27901   \(t_{.15} = 1.055 < |t_{comp}| < t_{.10} = 1.311\)   \(.20 < p\text{-value} < .30\)
3   594.83   -0.77427   \(t_{.25} = 0.683 < |t_{comp}| < t_{.20} = 0.854\)   \(.40 < p\text{-value} < .50\)
4   599.83   -0.26953   \(t_{.40} = 0.256 < |t_{comp}| < t_{.35} = 0.389\)   \(.70 < p\text{-value} < .80\)
5   604.83   0.23521   \(t_{.45} = 0.127 < t_{comp} < t_{.40} = 0.256\)   \(.80 < p\text{-value} < .90\)
6   609.83   0.73995   \(t_{.25} = 0.683 < t_{comp} < t_{.20} = 0.854\)   \(.40 < p\text{-value} < .50\)
7   614.83   1.24469   \(t_{.15} = 1.055 < t_{comp} < t_{.10} = 1.311\)   \(.20 < p\text{-value} < .30\)
8   619.83   1.74943   \(t_{.05} = 1.699 < t_{comp} < t_{.025} = 2.045\)   \(.05 < p\text{-value} < .10\)
9   624.83   2.25417   \(t_{.025} = 2.045 < t_{comp} < t_{.01} = 2.462\)   \(.02 < p\text{-value} < .05\)
10   629.83   2.75891   \(t_{.005} = 2.756 < t_{comp} < t_{.001} = 3.396\)   \(.002 < p\text{-value} < .01\)
If \(\alpha = .02\), only Version 10 would give a rejection of the null hypothesis.
d) Using the test ratio you found in c) find a p-value for the null hypothesis that the mean is at most 602.50 (1)
We are now testing \(H_0: \mu \le 602.50\), \(H_1: \mu > 602.50\). This is a right-sided test.
Values of t are repeated in the table below. If we compare \(t = -1.78375\), the ratio for Version 1 (\(\bar x = 584.83\)), with the table values we find \(t_{.05} = 1.699 < 1.78375 < t_{.025} = 2.045\). Since \(t_{.05} = 1.699\) means \(P(t > 1.699) = .05\), we can conclude that \(.025 < P(t > 1.78375) < .05\) or, by symmetry, \(.025 < P(t < -1.78375) < .05\). For a right-sided test the p-value is the probability of getting something as high as or higher than the observed ratio, \(P(t \ge -1.78375)\), so we can say that \(.95 < p\text{-value} < .975\). The rest are shown on the table below. Note that if the significance level is .02, we will definitely reject the null hypothesis in Version 10 and probably in Version 9.
Version   \(\bar x\)   \(t_{comp}\)   Location of \(|t_{comp}|\)   Approximate p-value
1   584.83   -1.78375   \(t_{.05} = 1.699 < |t_{comp}| < t_{.025} = 2.045\)   \(.95 < p\text{-value} < .975\)
2   589.83   -1.27901   \(t_{.15} = 1.055 < |t_{comp}| < t_{.10} = 1.311\)   \(.85 < p\text{-value} < .90\)
3   594.83   -0.77427   \(t_{.25} = 0.683 < |t_{comp}| < t_{.20} = 0.854\)   \(.75 < p\text{-value} < .80\)
4   599.83   -0.26953   \(t_{.40} = 0.256 < |t_{comp}| < t_{.35} = 0.389\)   \(.60 < p\text{-value} < .65\)
5   604.83   0.23521   \(t_{.45} = 0.127 < t_{comp} < t_{.40} = 0.256\)   \(.40 < p\text{-value} < .45\)
6   609.83   0.73995   \(t_{.25} = 0.683 < t_{comp} < t_{.20} = 0.854\)   \(.20 < p\text{-value} < .25\)
7   614.83   1.24469   \(t_{.15} = 1.055 < t_{comp} < t_{.10} = 1.311\)   \(.10 < p\text{-value} < .15\)
8   619.83   1.74943   \(t_{.05} = 1.699 < t_{comp} < t_{.025} = 2.045\)   \(.025 < p\text{-value} < .05\)
9   624.83   2.25417   \(t_{.025} = 2.045 < t_{comp} < t_{.01} = 2.462\)   \(.01 < p\text{-value} < .025\)
10   629.83   2.75891   \(t_{.005} = 2.756 < t_{comp} < t_{.001} = 3.396\)   \(.001 < p\text{-value} < .005\)
e) Using the test ratio you found in c) find a p-value for the null hypothesis that the mean is at least 602.50 (1)
We are now testing \(H_0: \mu \ge 602.50\), \(H_1: \mu < 602.50\). This is a left-sided test.
Values of t are repeated in the table below. If we compare \(t = -1.78375\), the ratio for Version 1 (\(\bar x = 584.83\)), with the table values we find \(t_{.05} = 1.699 < 1.78375 < t_{.025} = 2.045\). Since \(t_{.05} = 1.699\) means \(P(t > 1.699) = .05\), we can conclude that \(.025 < P(t > 1.78375) < .05\) or, by symmetry, \(.025 < P(t < -1.78375) < .05\). For a left-sided test the p-value is the probability of getting something as low as or lower than the observed ratio, \(P(t \le -1.78375)\), so we can say that \(.025 < p\text{-value} < .05\). The rest are shown on the table below. Note that your p-values for d) and e) should add to 1.
Version   \(\bar x\)   \(t_{comp}\)   Location of \(|t_{comp}|\)   Approximate p-value
1   584.83   -1.78375   \(t_{.05} = 1.699 < |t_{comp}| < t_{.025} = 2.045\)   \(.025 < p\text{-value} < .05\)
2   589.83   -1.27901   \(t_{.15} = 1.055 < |t_{comp}| < t_{.10} = 1.311\)   \(.10 < p\text{-value} < .15\)
3   594.83   -0.77427   \(t_{.25} = 0.683 < |t_{comp}| < t_{.20} = 0.854\)   \(.20 < p\text{-value} < .25\)
4   599.83   -0.26953   \(t_{.40} = 0.256 < |t_{comp}| < t_{.35} = 0.389\)   \(.35 < p\text{-value} < .40\)
5   604.83   0.23521   \(t_{.45} = 0.127 < t_{comp} < t_{.40} = 0.256\)   \(.55 < p\text{-value} < .60\)
6   609.83   0.73995   \(t_{.25} = 0.683 < t_{comp} < t_{.20} = 0.854\)   \(.75 < p\text{-value} < .80\)
7   614.83   1.24469   \(t_{.15} = 1.055 < t_{comp} < t_{.10} = 1.311\)   \(.85 < p\text{-value} < .90\)
8   619.83   1.74943   \(t_{.05} = 1.699 < t_{comp} < t_{.025} = 2.045\)   \(.95 < p\text{-value} < .975\)
9   624.83   2.25417   \(t_{.025} = 2.045 < t_{comp} < t_{.01} = 2.462\)   \(.975 < p\text{-value} < .99\)
10   629.83   2.75891   \(t_{.005} = 2.756 < t_{comp} < t_{.001} = 3.396\)   \(.995 < p\text{-value} < .999\)
Note that if the significance level is .02, we will never reject the null hypothesis.
f) Test the null hypothesis that the mean is at most 602.50 using an appropriate confidence interval (1)
I’m surprised that no one called me on this. To do this correctly you need \(t_{.02}^{(29)} = 2.150\), which is not on any of your tables. Here you get points for thinking, so I’ll see what you did. It would be reasonable to use \(t_{.025}\) in place of the 2% value. You also might use \(z_{.02}\), if you explained that you were desperate.
\(H_0: \mu \le 602.50\), \(H_1: \mu > 602.50\). Recall \(\alpha = .02\), \(n = 30\), \(s_{\bar x} = 9.9061\) and the two-sided formula is \(\mu = \bar x \pm t_{\alpha/2}\, s_{\bar x}\), which becomes \(\mu \ge \bar x - t_{\alpha}\, s_{\bar x} = \bar x - 2.150(9.9061) = \bar x - 21.30\). For the results see the table after g).
g) Test the null hypothesis that the mean is at least 602.50 using an appropriate confidence interval (1)
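\(H_0: \mu \ge 602.50\), \(H_1: \mu < 602.50\). Recall \(\alpha = .02\), \(t_{.02}^{(29)} = 2.150\), \(n = 30\), \(s_{\bar x} = 9.9061\) and the two-sided formula is \(\mu = \bar x \pm t_{\alpha/2}\, s_{\bar x}\), which becomes \(\mu \le \bar x + t_{\alpha}\, s_{\bar x} = \bar x + 2.150(9.9061) = \bar x + 21.30\). The intervals in both f) and g) contain the sample mean.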
Version   \(\bar x\)   \(H_0: \mu \le 602.50\): \(\mu \ge \bar x - t_{\alpha} s_{\bar x}\)   \(H_0: \mu \ge 602.50\): \(\mu \le \bar x + t_{\alpha} s_{\bar x}\)
1   584.83   = 563.53   = 606.13
2   589.83   = 568.53   = 611.13
3   594.83   = 573.53   = 616.13
4   599.83   = 578.53   = 621.13
5   604.83   = 583.53   = 626.13
6   609.83   = 588.53   = 631.13
7   614.83   = 593.53   = 636.13
8   619.83   = 598.53   = 641.13
9   624.83   = 603.53 *   = 646.13
10   629.83   = 608.53 *   = 651.13
Note that the two starred confidence intervals contradict the null hypothesis and thus imply rejection.
Problem 4: Assume that the population standard deviation is known to be 30 but that we are still working with a problem like Problem 3. (98% confidence level, sample of 30.) Do either Problem 4.1 or Problem
4.2. Make sure that I know which one!
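Let’s start with Table 3. \(\alpha = .02\), \(n = 30\), \(\sigma = 30\) and \(\sigma_{\bar x} = \frac{\sigma}{\sqrt{n}} = \frac{30}{\sqrt{30}} = \sqrt{30} = 5.4772\)
Interval for: Mean (\(\sigma\) known)
Confidence Interval: \(\mu = \bar x \pm z_{\alpha/2}\, \sigma_{\bar x}\), where \(\sigma_{\bar x} = \frac{\sigma}{\sqrt{n}}\)
Hypotheses: \(H_0: \mu = \mu_0\), \(H_1: \mu \ne \mu_0\)
Test Ratio: \(z = \frac{\bar x - \mu_0}{\sigma_{\bar x}}\)
Critical Value: \(\bar x_{cv} = \mu_0 \pm z_{\alpha/2}\, \sigma_{\bar x}\)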
Problem 4.1
. a) Find a critical value for the sample mean if we are testing whether the population mean is below 30. Clearly state your null and alternative hypotheses (2) b) Assume that the sample mean is 30 minus the second to last digit of your student number. (Use 10 if this digit is zero.) Find a p-value for your null hypothesis. (1) c) Create a power curve for the test (6)
Solution: a) We find a critical value for the sample mean if we are testing whether the population mean is below 30. We state our null and alternative hypotheses: \(H_0: \mu \ge 30\), \(H_1: \mu < 30\). We need a critical value for the mean that is below 30. We use \(z_{\alpha} = z_{.02} = 2.054\) and \(\bar x_{cv} = \mu_0 - z_{\alpha}\, \sigma_{\bar x} = 30 - 2.054(5.4772) = 30 - 11.250 = 18.750\).
b) We assume that the sample mean is 30 minus the second to last digit of our student number. (Use 10 if this digit is zero.) We find a p-value for our null hypothesis. \(z_{calc} = \frac{\bar x - \mu_0}{\sigma_{\bar x}} = \frac{\bar x - 30}{5.4772}\) will be our test ratio and we will calculate \(p\text{-value} = P(z \le z_{calc})\). For example if the mean is 29, we compute \(z_{calc} = \frac{29 - 30}{5.4772} = -0.18\). Using the Normal table we find \(p\text{-value} = P(z \le -0.18) = .5 - .0714 = .4286\).
The values below were computer generated. Yours should be close.
Version   \(\bar x\)   \(z_{calc}\)   \(P(z \le z_{calc})\)
1 29 -0.18258 0.427566
2 28 -0.36515 0.357500
3 27 -0.54773 0.291940
4 26 -0.73030 0.232603
5 25 -0.91288 0.180654
6 24 -1.09545 0.136660
7 23 -1.27803 0.100620
8 22 -1.46060 0.072063
9 21 -1.64318 0.050173
10 20 -1.82575 0.033944
c) We create a power curve for the test. We do not reject the null hypothesis if our sample mean is above \(\bar x_{cv} = 18.750\). Remember that our hypotheses are \(H_0: \mu \ge 30\), \(H_1: \mu < 30\) and that we need a power curve for all possible values of \(\mu\) that are below 30. The distance between 30 and the critical value is 30 – 18.750 = 11.25, half of that is 5.62, which we can round to 6. Let’s try using 30, 24, 18.75, 12 and 6 as \(\mu_1\).
We will compute \(\beta = P(\bar x \ge 18.750 \mid \mu = \mu_1) = P\left(z \ge \frac{18.750 - \mu_1}{5.4772}\right)\). Make a diagram. Show a Normal curve with a mean of 30 and shade a ‘reject’ zone below 18.750. On the same diagram make a second Normal curve of the same size as the first one with a mean at a value of \(\mu_1\) and shade a ‘do not reject’ zone that includes the entire area under the second curve above 18.750. For \(\mu_1 = 24\) this becomes \(P(\bar x \ge 18.750 \mid \mu_1 = 24) = P\left(z \ge \frac{18.750 - 24}{5.4772}\right) = P(z \ge -0.96) = .5 + .3315 = .8315\). If we let the computer do the dirty work, we get the following.
Point   \(\mu_1\)   \(z_{calc} = \frac{18.750 - \mu_1}{5.4772}\)   \(\beta = P(z \ge z_{calc})\)   power \(= 1 - \beta\)
1 30.00 -2.05397 0.980011 0.019989
2 24.00 -0.95852 0.831099 0.168901
3 18.75 0.00000 0.500000 0.500000
4 12.00 1.23238 0.108903 0.891097
5 6.00 2.32783 0.009961 0.990039
As was explained in class, you do not need to do the calculations for points 1 and 3 since the power for \(\mu_0\) is always equal to the significance level and the power at \(\mu_1 = \bar x_{cv}\) is always .5. Graph the power on your y-axis against \(\mu_1\) on your x-axis.
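The power column for this left-sided test can be regenerated with a few lines. This is a sketch assuming the critical value 18.750 and \(\sigma_{\bar x} = 30/\sqrt{30}\) from the solution; `power_left_sided` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

SIGMA_XBAR = 30 / sqrt(30)  # known sigma = 30, n = 30
X_CV = 18.750               # critical value from part a)

def power_left_sided(mu1):
    """Probability of rejecting H0 (sample mean below X_CV) when the true mean is mu1."""
    z_calc = (X_CV - mu1) / SIGMA_XBAR
    return NormalDist().cdf(z_calc)  # power = 1 - P(z >= z_calc)

for mu1 in (30, 24, 18.75, 12, 6):
    print(f"mu1={mu1:>6}  power={power_left_sided(mu1):.6f}")
```

Plotting these points with \(\mu_1\) on the x-axis gives the power curve requested in c).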
Problem 4.2
. a) Find critical values for the sample mean if we are testing whether the population mean is 30. Clearly state your null and alternative hypotheses (2) b) Assume that the sample mean is 30 minus the second to last digit of your student number. (Use 10 if this digit is zero.) find a p-value for your null hypothesis. (1) c) Create a power curve for the test (8) [37]
Solution: a) We find critical values for the sample mean if we are testing whether the population mean is 30. We state our null and alternative hypotheses: \(H_0: \mu = 30\), \(H_1: \mu \ne 30\). \(\sigma_{\bar x} = 5.4772\), \(\alpha = .02\) and \(z_{.01} = 2.327\).
We need critical values for the sample mean that are both above and below 30. These are \(\bar x_{cv} = \mu_0 \pm z_{\alpha/2}\, \sigma_{\bar x} = 30 \pm 2.327(5.4772) = 30 \pm 12.745\) or 17.255 and 42.745. We reject the null hypothesis if the sample mean does not fall between these values.
b) We assume that the sample mean is 30 minus the second to last digit of our student number. (We use 10 if this digit is zero.) We find a p-value for our null hypothesis. \(z_{calc} = \frac{\bar x - \mu_0}{\sigma_{\bar x}} = \frac{\bar x - 30}{5.4772}\) will be our test ratio and we will calculate \(p\text{-value} = 2P(z \le z_{calc})\) (since all our values of \(\bar x\) are to the left of the alleged mean). For example if the mean is 29, we compute \(z_{calc} = \frac{29 - 30}{5.4772} = -0.18\). Using the Normal table we find \(p\text{-value} = 2P(z \le -0.18) = 2(.5 - .0714) = 2(.4286) = .8572\).
If we let the computer do the work, we get the table below. Your results should be similar.
Version   \(\bar x\)   \(z_{calc}\)   \(2P(z \le z_{calc})\)
1 29 -0.18258 0.855131
2 28 -0.36515 0.714999
3 27 -0.54773 0.583881
4 26 -0.73030 0.465207
5 25 -0.91288 0.361308
6 24 -1.09545 0.273319
7 23 -1.27803 0.201241
8 22 -1.46060 0.144125
9 21 -1.64318 0.100347
10 20 -1.82575 0.067888
c) We create a power curve for the test. We do not reject the null hypothesis if the sample mean lies between 17.255 and 42.745. The alleged mean is 30 and the distance between 30 and the critical values is 12.745, half of which we can round to 6.5. We need the power for every value of the mean. Let’s try using 30, 36.5, 42.745, 49.5 and 56 for \(\mu_1\) on the top side of 30 and 30, 23.5, 17.255, 10.5 and 4 on the bottom side of 30. We will compute \(\beta = P(17.255 \le \bar x \le 42.745 \mid \mu = \mu_1) = P\left(\frac{17.255 - \mu_1}{5.4772} \le z \le \frac{42.745 - \mu_1}{5.4772}\right)\). For example if \(\mu_1 = 36.5\), we find \(P(17.255 \le \bar x \le 42.745 \mid \mu_1 = 36.5) = P\left(\frac{17.255 - 36.5}{5.4772} \le z \le \frac{42.745 - 36.5}{5.4772}\right) = P(-3.51 \le z \le 1.14) = .4998 + .3729 = .8727\).
The table that follows is computer generated. Because of rounding error in the standard deviation only the first four significant figures of the operating characteristic (\(\beta\)) and the power columns should be taken seriously, but your results should be very close to these.
Point   \(\mu_1\)   \(z_{calc1} = \frac{17.255 - \mu_1}{5.4772}\)   \(z_{calc2} = \frac{42.745 - \mu_1}{5.4772}\)   \(\beta = P(z_{calc1} \le z \le z_{calc2})\)   power \(= 1 - \beta\)
1   4.000   2.42003   7.07387   0.007760   0.992240
2   10.500   1.23329   5.88713   0.108733   0.891267
3   17.255   0.00000   4.65384   0.499998   0.500002
4   23.500   -1.14018   3.51366   0.872674   0.127326
5   30.000   -2.32692   2.32692   0.980030   0.019970
6   36.500   -3.51366   1.14018   0.872674   0.127326
7   42.745   -4.65384   0.00000   0.499998   0.500002
8   49.500   -5.88713   -1.23329   0.108733   0.891267
9   56.000   -7.07387   -2.42003   0.007760   0.992240
Of course, this is much less work than it looks like. Only points 1, 2 and 4 need to be computed. Note that points 3 and 7 are at critical values and give powers of .5 and that point 5 is the null hypothesis mean and gives a power equal to the significance level (2%). Also the power for the points 9 through 6 is identical to the power for points 1 through 4, so that only three computations are necessary to compute the operating characteristic curve.
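The symmetry noted above shows up directly if the two-sided power is computed for all nine points. The sketch below assumes the critical values 17.255 and 42.745 and \(\sigma_{\bar x} = 30/\sqrt{30}\); `power_two_sided` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

SIGMA_XBAR = 30 / sqrt(30)       # known sigma = 30, n = 30
LOWER_CV, UPPER_CV = 17.255, 42.745

def power_two_sided(mu1):
    """1 - beta, where beta = P(LOWER_CV <= x_bar <= UPPER_CV) for true mean mu1."""
    normal = NormalDist()
    beta = (normal.cdf((UPPER_CV - mu1) / SIGMA_XBAR)
            - normal.cdf((LOWER_CV - mu1) / SIGMA_XBAR))
    return 1 - beta

for mu1 in (4, 10.5, 17.255, 23.5, 30, 36.5, 42.745, 49.5, 56):
    print(f"mu1={mu1:>7}  power={power_two_sided(mu1):.6f}")
```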
Problem 5: In problem 4 we assumed that the population standard deviation is 30.
a) Do a 98% confidence interval for the mean using the mean that you found in Problem 3 and assuming that our sample of 30 came from a population of 300. (2)
b) How large a sample would we need if we wanted to make the error term no more than $\pm 1$ and the sample came from an infinite population? (2)
c) Using a 98% confidence level and a sample size of 30 create a confidence interval for the population standard deviation using your sample variance or standard deviation from Problem 3. (2)
d) Repeat c) assuming that you had a sample of 300. (2)
e) Can we say that the standard deviation is significantly different from 30 on the basis of c) and d)? (1)
f) Using the data and sample size from problem 3 can we say that the standard deviation is above 30? State your hypotheses and do an appropriate hypothesis test. (3) [49]
Solution:
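From Table 3:
Interval for: Mean ($\sigma$ unknown)
Confidence Interval: $\mu = \bar{x} \pm t_{\alpha/2}\, s_{\bar{x}}$, $DF = n - 1$
Hypotheses: $H_0: \mu = \mu_0$, $H_1: \mu \ne \mu_0$
Test Ratio: $t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}}$
Critical Value: $\bar{x}_{cv} = \mu_0 \pm t_{\alpha/2}\, s_{\bar{x}}$
Here $s_{\bar{x}} = \frac{s}{\sqrt{n}}$.

a) Do a 98% confidence interval for the mean using the mean that you found in Problem 3 and assuming that our sample of 30 came from a population of 300. (2)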
In problem 3 we found $t^{(29)}_{.01} = 2.462$, $s_x^2 = 2943.9367$, $s_x = 54.25806$ and used $\mu = \bar{x} \pm t_{\alpha/2}\, s_{\bar{x}}$. With the finite population correction, we have the following.
$s_{\bar{x}} = \frac{s_x}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} = (9.9061)\sqrt{\frac{270}{299}} = \sqrt{\frac{2943.9367}{30} \cdot \frac{270}{299}} = \sqrt{88.6135} = 9.4135$, so that $\mu = \bar{x} \pm 2.462(9.4135) = \bar{x} \pm 23.18$.
If we use the population variance at the beginning of this problem, $z_{.01} = 2.327$ and $\mu = \bar{x} \pm z_{\alpha/2}\, \sigma_{\bar{x}}$. With the finite population correction we have the following. $\sigma_x = 30$, $\sigma_x^2 = 900$ and
$\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} = (5.4772)\sqrt{\frac{270}{299}} = \sqrt{\frac{900}{30} \cdot \frac{270}{299}} = \sqrt{27.0903} = 5.2048$, so that $\mu = \bar{x} \pm 2.327(5.2048) = \bar{x} \pm 12.11$.
Using the means for the various versions, we can get our intervals easily.

Version   x-bar     mu = x-bar +/- 23.18    mu = x-bar +/- 12.11
1         584.83    561.65 to 608.01        572.72 to 596.94
2         589.83    566.65 to 613.01        577.72 to 601.94
3         594.83    571.65 to 618.01        582.72 to 606.94
4         599.83    576.65 to 623.01        587.72 to 611.94
5         604.83    581.65 to 628.01        592.72 to 616.94
6         609.83    586.65 to 633.01        597.72 to 621.94
7         614.83    591.65 to 638.01        602.72 to 626.94
8         619.83    596.65 to 643.01        607.72 to 631.94
9         624.83    601.65 to 648.01        612.72 to 636.94
10        629.83    606.65 to 653.01        617.72 to 641.94

b) How large a sample would we need if we wanted to make the error term no more than $\pm 1$ and the sample came from an infinite population? (2)
Solution: $n = \frac{z^2 \sigma^2}{e^2}$. Depending on what we believe, we can use $\sigma^2 = 900$ or $s_x^2 = 2943.9367$ in the $\sigma^2$ slot. If the confidence level is 98%, we will use $z_{.01} = 2.327$ and, since $e^2 = 1$, we can leave it out of the equation. We have either $n = 2.327^2 (900) = 4873.44$ or $n = 2.327^2 (2943.9367) = 15941.21$ and use 4874 or 15942.
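The arithmetic in parts a) and b) can be checked with a minimal Python sketch. The function names `fpc_standard_error` and `required_sample_size` are hypothetical helpers introduced here for illustration; the inputs are the values quoted in the solution. The standard error divides the standard deviation by the square root of n before the finite population correction is applied.

```python
import math


def fpc_standard_error(sd, n, N):
    """Standard error of the mean with the finite population correction.

    sd is the (sample or population) standard deviation, n the sample
    size and N the population size.
    """
    return (sd / math.sqrt(n)) * math.sqrt((N - n) / (N - 1))


def required_sample_size(z, variance, e):
    """Smallest whole n with z * sqrt(variance / n) <= e (infinite population)."""
    return math.ceil(z ** 2 * variance / e ** 2)


# Part a): error terms for the t-based and z-based intervals.
t_error = 2.462 * fpc_standard_error(54.25806, 30, 300)
z_error = 2.327 * fpc_standard_error(30, 30, 300)
print(round(t_error, 2), round(z_error, 2))

# Part b): sample sizes for an error term of 1 at 98% confidence.
print(required_sample_size(2.327, 900, 1))
print(required_sample_size(2.327, 2943.9367, 1))
```

Rounding the sample size up rather than to the nearest integer keeps the error term at or below the target.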
Problems c-f concern the variance and standard deviation and use formulas from Table 3.
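Interval for: Variance - Small Sample
Confidence Interval: $\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$
Hypotheses: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 \ne \sigma_0^2$
Test Ratio: $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$
Critical Value: $s^2_{cv} = \frac{\chi^2_{1-\alpha/2}\, \sigma_0^2}{n-1}$ and $s^2_{cv} = \frac{\chi^2_{\alpha/2}\, \sigma_0^2}{n-1}$

Interval for: Variance - Large Sample
Confidence Interval: $\frac{s\sqrt{2DF}}{z_{\alpha/2} + \sqrt{2DF}} \le \sigma \le \frac{s\sqrt{2DF}}{-z_{\alpha/2} + \sqrt{2DF}}$
Hypotheses: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 \ne \sigma_0^2$
Test Ratio: $z = \sqrt{2\chi^2} - \sqrt{2DF - 1}$
Critical Value: $s_{cv} = \frac{\sigma_0 \sqrt{2DF}}{\pm z_{\alpha/2} + \sqrt{2DF}}$

c) Using a 98% confidence level and a sample size of 30 create a confidence interval for the population standard deviation using your sample variance or standard deviation from Problem 3. (2)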
Solution: Recall the following. $n = 30$ and $s_x^2 = 2943.9367$ or $s_x = 54.25806$. For (30 - 1) degrees of freedom, $\chi^2_{.99} = 14.2565$ and $\chi^2_{.01} = 49.5881$. Take the formula $\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$ and substitute $s_x^2 = 2943.9367$ to get
$\frac{29(2943.9367)}{49.5881} \le \sigma^2 \le \frac{29(2943.9367)}{14.2565}$ or $1721.6663 \le \sigma^2 \le 5988.4379$.
Or, if we take square roots, $41.49 \le \sigma \le 77.38$.

d) Repeat c) assuming that you had a sample of 300. (2)
For the appropriate value of $z$, $z_{.01} = 2.327$, and the square root of twice the degrees of freedom, $\sqrt{2DF} = \sqrt{2(299)} = 24.45404$, take the formula $\frac{s\sqrt{2DF}}{z_{\alpha/2} + \sqrt{2DF}} \le \sigma \le \frac{s\sqrt{2DF}}{-z_{\alpha/2} + \sqrt{2DF}}$ and substitute $s_x = 54.25806$ to get, for $\alpha = .02$,
$\frac{54.25806(24.45404)}{2.327 + 24.45404} \le \sigma \le \frac{54.25806(24.45404)}{-2.327 + 24.45404}$ or $49.54 \le \sigma \le 59.96$.

e) Can we say that the standard deviation is significantly different from 30 on the basis of c) and d)? (1)
It is enough to check our results from the confidence intervals, though a more formal test of $H_0: \sigma = 30$ could be done using $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$ and/or setting $z = \sqrt{2\chi^2} - \sqrt{2DF - 1}$. Simply put, since 30 falls in neither interval, there is a significant difference between 30 and the standard deviation from our sample.

f) Using the data and sample size from problem 3 can we say that the standard deviation is above 30? State your hypotheses and do an appropriate hypothesis test. (3)
The pair of hypotheses $H_0: \sigma \le 30$, $H_1: \sigma > 30$ are equivalent to $H_0: \sigma^2 \le 900$, $H_1: \sigma^2 > 900$. If $n = 30$, so there are 29 degrees of freedom, we can use the test ratio
$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{29(2943.9367)}{900} = 94.8602$.
If we maintain a 98% confidence level, our ‘reject’ zone will be the area above $\chi^2_{.02}$. Our table does not give us this value, but we can say that $\chi^2_{.01} = 49.5881$ and $\chi^2_{.025} = 45.7224$, so that $\chi^2_{.02}$ must lie between them and 94.8602 must be in the ‘reject’ zone. A p-value approach would observe that the largest number in the df = 29 column is $\chi^2_{.005} = 52.3360$ and, since 94.8602 is above this, p-value $= P\left(\chi^2 > 94.8602\right) < .005 < .02$. So we reject $H_0: \sigma \le 30$.
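The variance calculations in parts c) through f) can be checked with a small Python sketch. It takes the chi-squared table values for 29 degrees of freedom as given constants rather than computing them, and the function names are hypothetical helpers introduced only for this illustration.

```python
import math

S2 = 2943.9367          # sample variance from Problem 3
S = 54.25806            # sample standard deviation from Problem 3
CHI2_UPPER_01 = 49.5881  # table value with .01 above it, 29 df (from the text)
CHI2_UPPER_99 = 14.2565  # table value with .99 above it, 29 df (from the text)
Z_01 = 2.327


def small_sample_sd_interval(s2, n, chi2_high, chi2_low):
    """Chi-squared interval for sigma: square roots of (n-1)s^2 / chi2."""
    df = n - 1
    return math.sqrt(df * s2 / chi2_high), math.sqrt(df * s2 / chi2_low)


def large_sample_sd_interval(s, n, z):
    """Large-sample interval for sigma using sqrt(2 DF)."""
    root = math.sqrt(2 * (n - 1))
    return s * root / (z + root), s * root / (root - z)


def chi2_test_ratio(s2, n, sigma0):
    """Test ratio (n-1)s^2 / sigma0^2 for a hypothesis about the variance."""
    return (n - 1) * s2 / sigma0 ** 2


print(small_sample_sd_interval(S2, 30, CHI2_UPPER_01, CHI2_UPPER_99))  # part c)
print(large_sample_sd_interval(S, 300, Z_01))                          # part d)

ratio = chi2_test_ratio(S2, 30, 30)                                    # part f)
print(ratio, ratio > CHI2_UPPER_01)  # above the .01 value, so above the .02 value too
```

Comparing the test ratio with the .01 table value is a conservative shortcut: anything above it is necessarily above the unavailable .02 value as well.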