
252y0751 10/19/07 (Open in ‘Print Layout’ format)

ECO252 QBA2

FIRST EXAM

October 4 and 8, 2007

Version 1

Name _____ KEY _________

Class hour: _____________

Student number: __________

Show your work! Make Diagrams! Include a vertical line in the middle! Exam is normed on 50 points. Answers without reasons are not usually acceptable.

I. (8 points) Do all the following. $x \sim N(3, 11)$

1. $P(-38.2 \le x \le 2) = P\left(\frac{-38.2-3}{11} \le z \le \frac{2-3}{11}\right) = P(-3.75 \le z \le -0.09)$

$= P(-3.75 \le z \le 0) - P(-0.09 \le z \le 0) = .4999 - .0359 = .4640$

For z make a diagram. Draw a Normal curve with a mean at 0. Indicate the mean by a vertical line!

Shade the area between -3.75 and -0.09. Because this is entirely on the left side of zero, we must subtract the area between -0.09 and zero from the area between -3.75 and zero. If you wish, make a completely separate diagram for x. Draw a Normal curve with a mean at 3. Indicate the mean by a vertical line!

Shade the area between -38.2 and 2. This area is entirely on the left side of the mean (3), so we subtract the smaller area between 2 and the mean from the larger area between -38.2 and the mean.

2. $P(-3 \le x \le 3) = P\left(\frac{-3-3}{11} \le z \le \frac{3-3}{11}\right) = P(-0.55 \le z \le 0) = .2088$

For z make a diagram. Draw a Normal curve with a mean at 0. Indicate the mean by a vertical line!

Shade the area between -0.55 and zero. Because this is completely on the left of zero, but ends at zero, this is exactly the kind of probability given by the standard Normal table. Look up the probability between zero and 0.55 on the table and you are done. If you wish, make a completely separate diagram for x. Draw a Normal curve with a mean at 3. Indicate the mean by a vertical line! Shade the area between -3 and the mean (3). Since this area ends at the mean we do not need to add or subtract.


3. $P(x \ge 0) = P\left(z \ge \frac{0-3}{11}\right) = P(z \ge -0.27) = P(-0.27 \le z \le 0) + P(z \ge 0) = .1064 + .5 = .6064$

For z make a diagram. Draw a Normal curve with a mean at 0. Indicate the mean by a vertical line!

Shade the entire area above -0.27. Because this is on both sides of zero, we must add the area between -0.27 and zero to the area above zero.

This is identical to the way you get the p-value for a right-sided test when the z ratio is negative. If you wish, make a completely separate diagram for x. Draw a Normal curve with a mean at 3. Indicate the mean by a vertical line! Shade the area above zero. This area is on both sides of the mean (3), so we add the area between 0 and the mean to the area (50%) above the mean.

4. $x_{.125}$ (Do not try to use the t table to get this.) For z make a diagram. Draw a Normal curve with a mean at 0. $z_{.125}$ is the value of z with 12.5% of the distribution above it. Since 100 – 12.5 = 87.5, it is also the .875 fractile. Since 50% of the standardized Normal distribution is below zero, your diagram should show that the probability between $z_{.125}$ and zero is 87.5% - 50% = 37.5%, or $P(0 \le z \le z_{.125}) = .3750$. If we check this against the Normal table, the closest we can come to .3750 is $P(0 \le z \le 1.15) = .3749$. (1.16 is also acceptable here, but clearly worse.) So $z_{.125} = 1.15$. This is the value of z that you need for a 75% confidence interval.

To get from $z_{.125}$ to $x_{.125}$, use the formula $x = \mu + z\sigma$, which is the opposite of $z = \frac{x-\mu}{\sigma}$. $x_{.125} = 3 + 1.15(11) = 15.65$. If you wish, make a completely separate diagram for x. Draw a Normal curve with a mean at 3. Show that 50% of the distribution is below the mean (3). If 12.5% of the distribution is above $x_{.125}$, it must be above the mean and have 37.5% of the distribution between it and the mean.

Check: $P(x \ge 15.65) = P\left(z \ge \frac{15.65-3}{11}\right) = P(z \ge 1.15) = P(z \ge 0) - P(0 \le z \le 1.15) = .5 - .3749 = .1251 \approx .125$.

This is identical to the way you normally get a p-value for a right-sided test.
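The four answers in Part I can be cross-checked numerically. The following is only a sketch, assuming Python 3.8 or later so that `statistics.NormalDist` is available; the variable names are illustrative. Because it does not round z to two decimals, its results differ slightly from the table-based answers.

```python
from statistics import NormalDist

# x ~ N(3, 11): mean 3, standard deviation 11, as in Part I.
x_dist = NormalDist(mu=3, sigma=11)
z_dist = NormalDist()  # standard Normal

p1 = x_dist.cdf(2) - x_dist.cdf(-38.2)   # problem 1: P(-38.2 <= x <= 2)
p2 = x_dist.cdf(3) - x_dist.cdf(-3)      # problem 2: P(-3 <= x <= 3)
p3 = 1 - x_dist.cdf(0)                   # problem 3: P(x >= 0)

# Problem 4: the value with 12.5% of the distribution above it.
z_125 = z_dist.inv_cdf(1 - 0.125)
x_125 = 3 + z_125 * 11                   # x = mu + z * sigma

print(round(p1, 4), round(p2, 4), round(p3, 4))
print(round(z_125, 2), round(x_125, 2))
```

The small gaps between these values and .4640, .2088 and .6064 come from rounding z to two decimal places before using the table.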


II. (9 points-2 point penalty for not trying part a.)

Our sales of microwave ovens in five randomly picked months appear below.

123 126 140 141 149

a. Compute the sample standard deviation, s, of expenditures. Show your work! (2)
b. Assuming that the underlying distribution is Normal, compute a 99% confidence interval for the mean. (2)
c. Redo b) when you find out that there were only 12 months to pick the data from. (2)
d. Assume that the population standard deviation is 10 and create a 75% two-sided confidence interval for the mean. (2)
e. Use your results in a) to test the hypothesis that the mean is below 140 at the 99% level. (3) State your hypotheses clearly!
f. (Extra Credit) Given the data, test the hypothesis that the population standard deviation is below 15 (3)?

Solution: a) Compute the sample standard deviation, s, of expenditures.

Row   $x$    $x^2$    $x - \bar{x}$   $(x - \bar{x})^2$
1     123    15129    -12.8           163.84
2     126    15876     -9.8            96.04
3     140    19600      4.2            17.64
4     141    19881      5.2            27.04
5     149    22201     13.2           174.24
Sum   679    92687      0.0           478.80

The first two columns are needed for the computational (shortcut) method. The first, third and fourth are needed for the definitional method. Using (both methods or) the definitional method wastes time.

$\sum x = 679$, $\sum x^2 = 92687$, $\sum (x - \bar{x}) = 0$ (a check), $\sum (x - \bar{x})^2 = 478.80$ and $n = 5$.

$\bar{x} = \frac{\sum x}{n} = \frac{679}{5} = 135.80$

$s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{92687 - 5(135.80)^2}{4} = \frac{478.80}{4} = 119.70$ or $s^2 = \frac{\sum (x - \bar{x})^2}{n-1} = \frac{478.80}{4} = 119.70$

$s = \sqrt{119.70} = 10.9407$

b) Assuming that the underlying distribution is Normal, compute a 99% confidence interval for the mean.

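(2)

$\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)} s_{\bar{x}}$, where $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{10.9407}{\sqrt{5}} = \sqrt{\frac{119.70}{5}} = \sqrt{23.94} = 4.8929$ and $t_{\alpha/2}^{(n-1)} = t_{.005}^{(4)} = 4.604$. So $\mu = 135.80 \pm 4.604(4.8929) = 135.80 \pm 22.53$ or 113.27 to 158.33.

c) Redo b) when you find out that there were only 12 months to pick the data from. (2)

$\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)} s_{\bar{x}}$, where $s_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{10.9407}{\sqrt{5}}\sqrt{\frac{12-5}{12-1}} = \sqrt{\frac{119.70}{5}\cdot\frac{7}{11}} = \sqrt{23.94\cdot\frac{7}{11}} = \sqrt{15.2345} = 3.903$. So $\mu = 135.80 \pm 4.604(3.903) = 135.80 \pm 17.97$ or 117.83 to 153.77.

d) Assume that the population standard deviation is 10 and create a 75% two-sided confidence interval for the mean.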

(2) We found $z_{.125} = 1.15$ on the last page. We have $\sigma = 10$, $n = 5$, $\bar{x} = 135.80$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{10}{\sqrt{5}} = \sqrt{\frac{100}{5}} = \sqrt{20} = 4.4721$. So $\mu = \bar{x} \pm z_{\alpha/2}\sigma_{\bar{x}} = 135.80 \pm 1.15(4.4721) = 135.80 \pm 5.14$ or 130.66 to 140.94.

e) Use your results in a) to test the hypothesis that the mean is below 140 at the 99% level. (3) State your hypotheses clearly! The statement that the mean is below 140 does not contain an equality, so it must be an alternate hypothesis. We have the following information: $\alpha = .01$, $\bar{x} = 135.80$, $n = 5$ and $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{10.9407}{\sqrt{5}} = 4.8929$. Since this is a one-sided hypothesis we will use $t_{\alpha}^{(n-1)} = t_{.01}^{(4)} = 3.747$.

Needless to say, because of the small sample size, we are assuming that the parent distribution is Normal.


Our hypotheses are $H_0: \mu \ge 140$ and $H_1: \mu < 140$. We have $\alpha = .01$, $\bar{x} = 135.80$, $n = 5$, $\mu_0 = 140$, $s_{\bar{x}} = 4.8929$ and $t_{\alpha}^{(n-1)} = t_{.01}^{(4)} = 3.747$. Since we are worrying about the mean being too small, this is a left-sided test.

There are three ways to do this. Do only one of them.

(i) Test Ratio: $t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}} = \frac{135.80 - 140}{4.8929} = -0.8584$. This is a left-sided test - the smaller the sample mean is, the more negative will be this ratio. We will reject the null hypothesis if the ratio is smaller than $-t_{\alpha}^{(n-1)} = -t_{.01}^{(4)} = -3.747$. Make a diagram showing a Normal curve with a mean at 0 and a shaded 'reject' zone below -3.747. Since the test ratio is not below -3.747, we cannot reject $H_0$.

If you wish to find a p-value for your hypothesis, note that the t-ratio is -0.8584. The p-value will be the probability that t is below -0.8584. The line of the t table for 4 degrees of freedom is below.

df   .45    .40    .35    .30    .25    .20    .15    .10    .05    .025   .01    .005   .001
4    0.134  0.271  0.414  0.569  0.741  0.941  1.190  1.533  2.132  2.776  3.747  4.604  7.173

What this tells us, among other things, is that $P(t \le -0.941) = .20$ and $P(t \le -0.741) = .25$. Since -0.8584 lies between -0.941 and -0.741, the probability that t lies below -.8584 must be between .20 and .25. $.20 < p\text{-value} < .25$. This is above our significance level of .01, so we will not reject the null hypothesis.

(ii) Critical value: We need a critical value for $\bar{x}$ below 140. Common sense says that if the sample mean is too far below 140, we will not believe $H_0: \mu \ge 140$. The formula for a critical value for the sample mean is $\bar{x}_{cv} = \mu_0 \pm t_{\alpha/2}^{(n-1)} s_{\bar{x}}$, but we want a single value below 140, so use $\bar{x}_{cv} = \mu_0 - t_{\alpha}^{(n-1)} s_{\bar{x}} = 140 - 3.747(4.8929) = 121.67$. Make a diagram showing an almost Normal curve with a mean at 140 and a shaded 'reject' zone below 121.67. Since $\bar{x} = 135.80$ is not below 121.67, we do not reject $H_0$.

(iii) Confidence interval: $\mu = \bar{x} \pm t_{\alpha/2} s_{\bar{x}}$ is the formula for a two sided interval. The rule for a one-sided confidence interval is that it should always go in the same direction as the alternate hypothesis. Since the alternative hypothesis is $H_1: \mu < 140$, the confidence interval is $\mu \le \bar{x} + t_{\alpha}^{(n-1)} s_{\bar{x}} = 135.80 + 3.747(4.8929)$ or $\mu \le 154.134$. Make a diagram showing an almost Normal curve with a mean at $\bar{x} = 135.80$ and, to represent the confidence interval, shade the area below 154.134 in one direction. Then, on the same diagram, to represent the null hypothesis, $H_0: \mu \ge 140$, shade the area above 140 in the opposite direction. Notice that these overlap. What the diagram is telling you is that it is possible for $\mu \le 154.134$ and $H_0: \mu \ge 140$ to both be true. (If you follow my more recent suggestions, it is actually enough to show that 140 is on the confidence interval.) So we do not reject $H_0$.

f) (Extra Credit) Given the data, test the hypothesis that the population standard deviation is below 15 (3)?

This is an alternate hypothesis, $H_1: \sigma < 15$. The null hypothesis is $H_0: \sigma \ge 15$. Remember $\alpha = .01$, $n = 5$ and $s_x^2 = 119.70$. Table 3 says that the test ratio is $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{4(119.70)}{15^2} = 2.128$. Recall $df = n - 1 = 4$.

The first paragraph of the chi-squared table appears below. If we look at the 4 column, we see that the lower 1% of values of chi-squared are cut off by $\chi^{2(4)}_{.99} = 0.2971$, so that the reject region is below 0.2971. Since $\chi^2 = 2.128$ is above 0.2971, do not reject the null hypothesis.

Degrees of Freedom

1 2 3 4 5 6 7 8 9

0.005 7.87946 10.5966 12.8382 14.8603 16.7496 18.5476 20.2778 21.9550 23.5893

0.010 6.63491 9.2103 11.3449 13.2767 15.0863 16.8119 18.4753 20.0902 21.6660

0.025 5.02389 7.3778 9.3484 11.1433 12.8325 14.4494 16.0128 17.5346 19.0228

0.050 3.84146 5.9915 7.8147 9.4877 11.0705 12.5916 14.0671 15.5073 16.9190

0.100 2.70554 4.6052 6.2514 7.7794 9.2364 10.6446 12.0170 13.3616 14.6837

0.900 0.01579 0.2107 0.5844 1.0636 1.6103 2.2041 2.8331 3.4895 4.1682

0.950 0.00393 0.1026 0.3518 0.7107 1.1455 1.6354 2.1674 2.7326 3.3251

0.975 0.00098 0.0506 0.2158 0.4844 0.8312 1.2373 1.6899 2.1797 2.7004

0.990 0.00016 0.0201 0.1148 0.2971 0.5543 0.8721 1.2390 1.6465 2.0879

0.995 0.00004 0.0100 0.0717 0.2070 0.4117 0.6757 0.9893 1.344 1.7349
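Parts a) through e) of Problem II can be cross-checked with a short script. This is a sketch rather than part of the original key: the t and z table values are typed in as constants taken from the solution, and the helper `t_lower_tail` is a hypothetical name for a simple Simpson's-rule integration of the t density, used here only to approximate the p-value in part e).

```python
import math
from statistics import mean, stdev

sales = [123, 126, 140, 141, 149]
n = len(sales)
xbar = mean(sales)             # sample mean
s = stdev(sales)               # sample standard deviation (divisor n - 1)
se = s / math.sqrt(n)          # standard error of the mean

# Table values quoted in the solution (typed in, not computed).
T_005_4 = 4.604                # t, 4 df, .005 in the upper tail
T_01_4 = 3.747                 # t, 4 df, .01 in the upper tail
Z_125 = 1.15                   # z with .125 in the upper tail

# b) 99% confidence interval
ci_b = (xbar - T_005_4 * se, xbar + T_005_4 * se)

# c) same interval with the finite population correction, N = 12
N = 12
se_fpc = se * math.sqrt((N - n) / (N - 1))
ci_c = (xbar - T_005_4 * se_fpc, xbar + T_005_4 * se_fpc)

# d) 75% interval with known sigma = 10
sigma_xbar = 10 / math.sqrt(n)
ci_d = (xbar - Z_125 * sigma_xbar, xbar + Z_125 * sigma_xbar)

# e) left-sided test of H0: mu >= 140
t_ratio = (xbar - 140) / se
x_cv = 140 - T_01_4 * se       # critical value for the sample mean
upper_bound = xbar + T_01_4 * se


def t_pdf(t, df):
    """Density of Student's t with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def t_lower_tail(t, df, steps=2000):
    """Approximate P(T <= t) for t <= 0 by Simpson's rule on [t, 0]."""
    h = (0 - t) / steps
    total = t_pdf(t, df) + t_pdf(0, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(t + i * h, df)
    return 0.5 - total * h / 3


p_value = t_lower_tail(t_ratio, n - 1)

print(round(s, 4), [round(v, 2) for v in ci_b])
print([round(v, 2) for v in ci_c], [round(v, 2) for v in ci_d])
print(round(t_ratio, 4), round(x_cv, 2), round(upper_bound, 3), round(p_value, 3))
```

The p-value it produces falls between .20 and .25, in line with the bracket read off the t table.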

Computer output for parts b) d) e) and f) follows.

————— 10/2/2007 10:12:27 PM ————————————————————

Welcome to Minitab, press F1 for help.

MTB > WOpen "C:\Documents and Settings\rbove\My Documents\Minitab\251x0751-21.MTW".

Retrieving worksheet from file: 'C:\Documents and Settings\rbove\My Documents\Minitab\251x0751-21.MTW'

Worksheet was saved on Wed Sep 26 2007

Results for: 251x0751-21.MTW

MTB > Onet c1;

SUBC> Confidence 99.0.

Part b)

One-Sample T: x

Variable  N     Mean   StDev  SE Mean              99% CI
x         5  135.800  10.941    4.893  (113.273, 158.327)

MTB > OneZ c1;

SUBC> Sigma 10;

SUBC> Confidence 75.0.

Part d)

One-Sample Z: x

The assumed standard deviation = 10

Variable  N     Mean   StDev  SE Mean              75% CI
x         5  135.800  10.941    4.472  (130.655, 140.945)

MTB > Onet c1;

SUBC> Test 140;

SUBC> Confidence 99.0;

SUBC> Alternative -1.

Part e)

One-Sample T: x

Test of mu = 140 vs < 140

                                       99% Upper
Variable  N     Mean   StDev  SE Mean      Bound      T      P
x         5  135.800  10.941    4.893    154.133  -0.86  0.220

MTB > WSave "C:\Documents and Settings\RBOVE\My Documents\Minitab\251x0751-21.MTW";

SUBC> Replace.

Saving file as: 'C:\Documents and Settings\RBOVE\My Documents\Minitab\251x0751-21.MTW'

Existing file replaced. Data was stored so macro could be found


MTB > %sigtest c1 225 Tests data in column 1 for variance of 225. Packaged

Minitab Macro. This is a 2-sided test.

Executing from file: sigtest.MAC

The value of the test statistic is 2.1280.

If the test statistic is less than 0.4844 or greater

than 11.1433 then there is statistical evidence indicating

that your variance does not equal to 225.0000, at alpha =

0.0500.

Part f)

Using text of the macro, I set up a 1-sided test.

MTB > name k1 'a'                a is signif level
MTB > name k2 'df'
MTB > name k3 'sig0'             Sig0 is variance from H0
MTB > name k4 'stdev'
MTB > name k5 'fcalc'            fcalc is the test statistic
MTB > name k6 'lower'            For a left-sided test this is signif level
MTB > name k7 'upper'            Not used in 1-sided test
MTB > name k8 'flow'             Reject null below this number
MTB > name k9 'fupper'           Not used in 1-sided test
MTB > name k10 'var'             Variance
MTB > let var = stdev(x)
MTB > let var = var * var        Variance is std dev. squared.
MTB > let df = n(x) - 1          Sample size minus 1
MTB > let lower = a
MTB > let sig0 = 225
MTB > let Fcalc = df*var/sig0
MTB > invcdf lower Flow;         Sets lower limit as chi-squared alpha
SUBC> chis df.
MTB > print fcalc flow sig0 a

Data Display

fcalc  2.12800                   The value of chi-squared we tested
flow   0.297109                  The value from the table
sig0   225.000                   The null hypothesis variance
a      0.0100000                 Significance level

MTB > CDF 'fcalc';
SUBC> ChiSquare 4.

Cumulative Distribution Function

Chi-Square with 4 DF

x      P( X <= x )
2.128  0.287770                  This is the value of test statistic and p-value
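The chi-squared figures in part f) can be reproduced without Minitab. The sketch below is an assumption-laden stand-in, not the macro's own code: `chi2_cdf` is a hypothetical helper that evaluates the chi-squared cumulative distribution through the standard series for the regularized lower incomplete gamma function, using only the Python standard library.

```python
import math


def chi2_cdf(x, df, terms=200):
    """P(X <= x) for a chi-squared variable with df degrees of freedom, x > 0.

    Uses the series gamma(a, y) = y**a * exp(-y) * sum(y**k / (a (a+1) ... (a+k)))
    with a = df / 2 and y = x / 2, divided by Gamma(a).
    """
    a, y = df / 2, x / 2
    term = 1.0 / a
    total = term
    for k in range(1, terms):
        term *= y / (a + k)
        total += term
    return total * math.exp(-y + a * math.log(y) - math.lgamma(a))


n = 5
s_squared = 119.70           # sample variance from part a)
sigma0_squared = 15 ** 2     # variance under the null hypothesis

chi2 = (n - 1) * s_squared / sigma0_squared   # test ratio
p_value = chi2_cdf(chi2, n - 1)               # left-sided p-value
table_check = chi2_cdf(0.2971, 4)             # should be close to .01

print(round(chi2, 3), round(p_value, 4), round(table_check, 4))
```

Since the left-sided p-value is far above .01, the script agrees with the decision not to reject the null hypothesis.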


III. Do as many of the following problems as you can. (2 points each unless marked otherwise adding to 13+ points). Show your work except in multiple choice questions. (Actually – it doesn’t hurt there either.) If the answer is ‘None of the above,’ put in the correct answer if possible.

1) If I want to test to see if the mean of x is larger than the given population mean

0

my null hypothesis is: i) ii)

  

0

  

0 iii)

  

0 iv) *

  

0 v) Could be any of the above. We need more information. vi) None of the above

Explanation: $\mu > \mu_0$ is our alternate hypothesis since it doesn’t contain an equality. $\mu \le \mu_0$ is the opposite, so it must be the null hypothesis.

2) Assuming that you have a sample mean of 100 based on a sample of 36 taken from a population of 300 with a known population standard deviation of 80, the 99% confidence interval for the population mean is a) 100

2 .

576



80

36

 b) * 100

2 .

576

300

300

36

1

80

36 c) 100

2 .

576

 d) 100

2 .

724



80

300



80

300

 e) h)

100

100

2 .

724

2 .

438

300

300

36

1 f) 100

2 .

724

 g) 100

2 .

438



80

36



80

300



300

300

36

1

80

36

80

36 i) 100

2 .

438



80

36

 j) 100

2 .

438

300

300

36

1

80

300 k) None of the above. Fill in a correct answer.

Explanation: The formula for a confidence interval when the variance is known when the sample is more than 20% of the population was given in the solution to problem A2 as $\mu = \bar{x} \pm z_{\alpha/2}\sigma_{\bar{x}}$, where $\bar{x} = 100$, $n = 36$, $N = 300$, $\sigma = 80$, $1 - \alpha = 99\%$ and $\alpha = .01$.

Here $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{80}{\sqrt{36}}\sqrt{\frac{300-36}{300-1}}$ and $z_{\alpha/2} = z_{.005} = 2.576$. So $\mu = 100 \pm 2.576\,\frac{80}{\sqrt{36}}\sqrt{\frac{300-36}{300-1}}$
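To see what the interval in question 2 works out to numerically, here is a minimal sketch. It assumes Python's `statistics.NormalDist` for the z value; the variable names are illustrative and not from the exam.

```python
import math
from statistics import NormalDist

xbar, sigma, n, N = 100, 80, 36, 300
alpha = 0.01

z = NormalDist().inv_cdf(1 - alpha / 2)            # z with .005 in the upper tail
fpc = math.sqrt((N - n) / (N - 1))                 # finite population correction
sigma_xbar = sigma / math.sqrt(n) * fpc            # standard error of the mean
margin = z * sigma_xbar

print(round(z, 3), round(sigma_xbar, 4))
print(round(xbar - margin, 2), round(xbar + margin, 2))
```

Dropping `fpc` from `sigma_xbar` gives the wider interval of option a), which ignores the fact that the sample is a sizeable share of the population.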

3) Which of the following is a Type 2 error?
a) Rejecting the null hypothesis when the null hypothesis is false.
b) Rejecting the null hypothesis when the null hypothesis is true.
c) Not rejecting the null hypothesis when the null hypothesis is true.
d) *Not rejecting the null hypothesis when the null hypothesis is false.
e) All of the above
f) None of the above.

4) If a random sample is gathered to get information about a population proportion, what do we mean by a p-value?
a) P-value is the population proportion in the null hypothesis.
b) P-value is the population proportion in the alternate hypothesis.
c) P-value is the probability of a type 2 error.
d) P-value is the probability that, if the null hypothesis was false, that, if we were to repeat the experiment many times, we would get a sample proportion as extreme as or more extreme than the sample proportion actually observed.
e) *P-value is the probability that, if the null hypothesis was true, that, if we were to repeat the experiment many times, we would get a sample proportion as extreme as or more extreme than the sample proportion actually observed.
f) P-value is the probability that the alternate hypothesis is true, given the sample proportion actually observed.
g) None of the above is true.

5) If a difference in proportions (in a business-related problem) is called statistically significant at the 1% significance level, this means that
a) If the null hypothesis is true, the difference in proportions is surprisingly small.
b) *We must reject the null hypothesis.
c) The difference in proportions is small enough so that we must take account of it in our business decisions.
d) The null hypothesis is very likely to be true.
e) All of the above

Assume a Normal distribution in 6) and 7).

[10]

6) (Wonnacott & Wonnacott) A company is discharging treated waste into a river. The firm is supposed to be fined if the average pollution level is above 16 parts per million. It is known that the population standard deviation is 6 parts per million. 9 measurements are taken. If we assume that the firm is discharging 16 parts or fewer per million, how high must the sample mean level of pollution be to cause us to doubt the assumption? (Use a 1% significance level, and don’t forget to state your hypotheses.) (3)

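Solution: We have $\mu_0 = 16$, $\sigma = 6$, $n = 9$, $H_1: \mu > 16$, $H_0: \mu \le 16$, $\alpha = .01$, $z_{.01} = 2.327$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{6}{\sqrt{9}} = 2$. Since this is a right-sided test, $\bar{x}_{cv} = \mu_0 + z_{\alpha}\sigma_{\bar{x}} = 16 + 2.327(2) = 20.654$, so we doubt the assumption if the sample mean is above 20.654.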


7) The politicians, who don’t know any statistics, decide that they will fine the company in 6) if the level of pollution exceeds 20 parts per million. It is known that the population standard deviation is 6 parts per million. 9 measurements are taken. If we assume that the firm is discharging 16 parts per million what is the probability that they will be fined? (Think p-value?) (2)

Solution: $P(\bar{x} > 20) = P\left(z > \frac{20-16}{2}\right) = P(z > 2) = .5 - .4772 = .0228$

[15]
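Questions 6) and 7) both rest on the sampling distribution of the mean of 9 measurements. The sketch below is illustrative only (the variable names are not from the exam): it computes the one-sided 1% critical value for the sample mean and the chance that a sample mean exceeds 20 when the true mean is 16.

```python
import math
from statistics import NormalDist

mu0, sigma, n = 16, 6, 9
se = sigma / math.sqrt(n)                   # standard error of the mean: 6 / 3 = 2

# 6) right-sided test at the 1% level: z with 1% in the upper tail
z_01 = NormalDist().inv_cdf(1 - 0.01)
x_cv = mu0 + z_01 * se                      # doubt H0 if the sample mean exceeds this

# 7) probability of a fine when the firm is really at 16 ppm
p_fine = 1 - NormalDist(mu=mu0, sigma=se).cdf(20)

print(round(z_01, 3), round(x_cv, 2), round(p_fine, 4))
```

The politicians' cut-off of 20 sits slightly below the 1% critical value, which is why the probability of a fine comes out somewhat above 1%.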

8) It is a well-known fact that your factory has been producing a product that is 20% defective. We take a sample of 500 units of the product this month and find that 103 are defective.

a) Assuming that the 20% figure is correct, how many units of the product must be examined before we can state our defect rate as a proportion $\pm .03$? (Use a 95% confidence level!!!)

Solution: The outline says “The usually suggested formula is $n = \frac{pqz^2}{e^2}$ ….. This is the formula everyone forgets that we covered.” So, we have $p = .20$, $q = 1 - p = 1 - .2 = .8$, $z = 1.960$, $e = .03$ and $n = \frac{.2(.8)(1.960)^2}{(.03)^2} = 682.951$. So use a sample of at least 683.

b) If we wish to test our belief that the proportion has risen over the previous figure, let p represent the proportion of defective items. What are our null and alternative hypotheses? (1)

Solution: The statement implied in the problem, $p > .20$, does not contain an equality, so the null hypothesis is the opposite, $p \le .20$.

c) You already know that the sample size is 500. Using a 95% confidence level and assuming that your hypothesis in b) is correct, test the hypothesis. (2) [20]

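Solution: From Table 3.

Interval for: Proportion
Confidence Interval: $p = \bar{p} \pm z_{\alpha/2} s_p$, where $s_p = \sqrt{\frac{\bar{p}\bar{q}}{n}}$ and $\bar{q} = 1 - \bar{p}$
Hypotheses: $H_0: p = p_0$, $H_1: p \ne p_0$
Test Ratio: $z = \frac{\bar{p} - p_0}{\sigma_p}$
Critical Value: $\bar{p}_{cv} = p_0 \pm z_{\alpha/2}\sigma_p$, where $\sigma_p = \sqrt{\frac{p_0 q_0}{n}}$ and $q_0 = 1 - p_0$

We have $\alpha = .05$, $z_{\alpha} = z_{.05} = 1.645$, $p_0 = .20$, $q_0 = 1 - .2 = .8$, $x = 103$ and $n = 500$. First, for the test ratio or critical value we need $\sigma_p = \sqrt{\frac{p_0 q_0}{n}} = \sqrt{\frac{.2(.8)}{500}} = \sqrt{.00032} = .017889$. For the confidence interval we need $\bar{p} = \frac{103}{500} = .206$, $\bar{q} = 1 - \bar{p} = 1 - .206 = .794$ and $s_p = \sqrt{\frac{\bar{p}\bar{q}}{n}} = \sqrt{\frac{.206(.794)}{500}} = \sqrt{.00033} = .018087$. Use one of the following 3 methods.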

Critical Value Method: Since we have $H_1: p > .20$, this is a right-sided test. We use a critical value for the proportion of $\bar{p}_{cv} = p_0 + z_{\alpha}\sigma_p = .20 + 1.645(.017889) = .2294$. Make a diagram showing a normal curve with a center at $p_0 = .20$ and a rejection region above .2294. Since $\bar{p} = .206$ is below .2294, we cannot reject the null hypothesis.

Test Ratio Method: $z = \frac{\bar{p} - p_0}{\sigma_p} = \frac{.206 - .20}{.017889} = 0.3354$. Make a diagram showing a normal curve with a center at zero and a rejection region above 1.645. Since $z = 0.3354$ is below 1.645, we cannot reject the null hypothesis. The p-value would be $P(\bar{p} \ge .206) = P(z \ge 0.3354) = .5 - .1368 = .3632$.

This would lead to nonrejection of the null hypothesis for most common values of $\alpha$, since the p-value would usually be above the significance level.


Confidence interval method: Since we have $H_1: p > .20$, we need a one-sided ‘$\ge$’ interval. This would be $p \ge \bar{p} - z_{\alpha} s_p = .206 - 1.645(0.018087) = .1762$. The null hypothesis $H_0: p \le .20$ is not contradicted by the confidence interval $p \ge .1762$, since any value of the proportion between .1762 and .20 will satisfy both.
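All three methods in question 8), plus the sample-size calculation in part a), can be checked together. The following is a sketch under the same inputs as the solution; the table values 1.960 and 1.645 are typed in as constants, and the exact Normal p-value it prints is a little different from the table-based figure because z is not rounded.

```python
import math
from statistics import NormalDist

p0, n, x = 0.20, 500, 103
q0 = 1 - p0

# a) sample size for an error of at most .03 at 95% confidence
z_025, e = 1.960, 0.03
n_needed = math.ceil(p0 * q0 * z_025 ** 2 / e ** 2)

# c) right-sided test of H0: p <= .20 at the 5% level
z_05 = 1.645
p_bar = x / n
sigma_p = math.sqrt(p0 * q0 / n)                 # uses the null proportion
s_p = math.sqrt(p_bar * (1 - p_bar) / n)         # uses the sample proportion

p_cv = p0 + z_05 * sigma_p                       # critical value method
z_ratio = (p_bar - p0) / sigma_p                 # test ratio method
p_value = 1 - NormalDist().cdf(z_ratio)          # right-tail p-value
lower_bound = p_bar - z_05 * s_p                 # one-sided confidence interval

reject = p_bar > p_cv
print(n_needed, round(p_cv, 4), round(z_ratio, 4), round(lower_bound, 4), reject)
print(round(p_value, 4))
```

All three approaches point the same way here: the sample proportion is below the critical value, the test ratio is below 1.645, and the one-sided interval still contains values at or below .20.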


ECO252 QBA2

FIRST EXAM

October 8, 2007

TAKE HOME SECTION

-

Name: _________________________

Student Number and class: _________________________

IV. Do at least 3 problems (at least 7 each) (or do sections adding to at least 20 points - Anything extra you do helps, and grades wrap around). Show your work! State $H_0$ and $H_1$ where appropriate. You have not done a hypothesis test unless you have stated your hypotheses, run the numbers and stated your conclusion.

(Use a 95% confidence level unless another level is specified.) Answers without reasons usually are not acceptable. Neatness and clarity of explanation are expected. This must be turned in when you take the in-class exam. Note that answers without reasons and citation of appropriate statistical tests receive no credit.

Failing to be transparent about which section of which problem you are doing can lose you credit. Many answers require a statistical test, that is, stating or implying a hypothesis and showing why it is true or false by citing a table value or a p-value. If you haven’t done it lately, take a fast look at ECO 252 - Things That You Should Never Do on a Statistics Exam (or Anywhere Else).

A group of 30 employees are interviewed to determine the minimum amount that they will take to give up a vacation day. After careful interviewing, a psychologist reports the following amounts.

479 648 522 595 547 657 578 539 553 520 499 606 612

616 488 557 621 628 511 634 625 612 509 633 616 598

627 622 512 631

My calculations say that the sum of these 30 numbers is $\sum x = 17395$ and that the sum of squares is $\sum x^2 = 10171575$. This is a sample of 30.

Personalize these data as follows. Take the second to last digit of your student number and multiply it by 5.

Add this quantity to each of the 30 numbers. If the second to last digit of your student number is 0, add 50.

Label your exam by version number as follows. If the second to last digit of your student number is 1, you are doing Version 1. If the second to last digit is 2, you are doing Version 2. Etc. If the second to last digit is zero you are doing version 10. Last term's exam said the following.

If you add a quantity a to a column of numbers, $\sum (x + a) = \sum x + na$ and $\sum (x + a)^2 = \sum x^2 + 2a\sum x + na^2$. For example, if $a = 60$, $\sum (x + 60) = \sum x + 30(60) = 17395 + 1800 = ?$ and $\sum (x + 60)^2 = \sum x^2 + 2(60)\sum x + 30(60)^2 = 10171575 + 120(17395) + 30(3600)$.
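The shortcut for shifted data can be confirmed directly on the 30 amounts. The sketch below uses a hypothetical helper name, `shifted_sums`, and compares the shortcut with a brute-force recomputation for the example shift of 60.

```python
data = [
    479, 648, 522, 595, 547, 657, 578, 539, 553, 520, 499, 606, 612,
    616, 488, 557, 621, 628, 511, 634, 625, 612, 509, 633, 616, 598,
    627, 622, 512, 631,
]
n = len(data)
sum_x = sum(data)
sum_x2 = sum(v * v for v in data)


def shifted_sums(sum_x, sum_x2, n, a):
    """Sum and sum of squares after adding a to every observation."""
    return sum_x + n * a, sum_x2 + 2 * a * sum_x + n * a * a


a = 60
shortcut = shifted_sums(sum_x, sum_x2, n, a)
direct = (sum(v + a for v in data), sum((v + a) ** 2 for v in data))

print(sum_x, sum_x2)
print(shortcut, direct)
```

Because only the two sums are needed, a personalized version of the data never has to be retyped to get its mean and variance.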

Problem 1: Count the number of people in your sample that demand more than $602.50 and make it into a sample proportion. Test the following 3 hypotheses: I) that 60% demand more than $602.50, II) that more than 60% demand more than $602.50 and III) that less than 60% demand more than $602.50, using a 98% confidence level.

For each of these three tests a) state your null and alternative hypotheses (2), b) test each one using a test ratio or a critical value for the proportion (2) and c) find a p-value for the null hypotheses (3). Label each part clearly so that I know which is I, II and III and a), b) c). Make sure that I know where the ‘reject’ zone is. d) Using the proportion you found above, how large a sample would you need to estimate a 2-sided 98% confidence interval for the proportion with an error of at most .001? Assume that your sample is of that size and show that the confidence interval has an error of at most .001. (3) [10] e) (Extra credit) Assume that you are testing the hypothesis that (II) more than 60% demand over $602.50, find the power of the test if you use a sample of 30 and the true proportion is 70% (3)


Problem 2: Assume that the underlying data for problem 1 is not Normal and using the data for problem 1 test the following three hypotheses: I) that the median demand is $602.50, II) that median demand is more than $602.50 and III) that the median demand is less than $602.50, using a 98% confidence level. a) state your null and alternative hypotheses and the hypotheses that you will actually test for each of the 3 tests (3), b) test each one using a test ratio or a critical value (3), c) find a p-value for the 2-sided test and explain whether and why it would lead to a rejection of the null hypothesis at the 95% confidence level (1), d) (extra credit) Show explicitly what the conclusion in c) would be if the sample of 30 came from a population of 60. (1) e) (extra credit) find a two sided confidence interval for the median (2) [17]

Problem 3: a) Find the sample mean and sample standard deviation for the data in Problem 1 (1) b) Test the hypothesis that the mean is 602.50 using critical values for the sample mean, first stating your hypotheses clearly. Use a 98% confidence level (2) c) Test the hypothesis in b) using a test ratio. Find an approximate p-value and state and explain whether this will lead to a rejection of the null hypothesis if we continue to use a 98% confidence level. (2) d) Using the test ratio you found in c) find a p-value for the null hypothesis that the mean is at most 602.50 (1) e) Using the test ratio you found in c) find a p-value for the null hypothesis that the mean is at least 602.50 (1) f) Test the null hypothesis that the mean is at most 602.50 using an appropriate confidence interval (1) g) Test the null hypothesis that the mean is at least 602.50 using an appropriate confidence interval (1) [26]

Problem 4: Assume that the population standard deviation is known to be 30 but that we are still working with a problem like Problem 3. (98% confidence level, sample of 30.) Do either Problem 4.1 or Problem 4.2. Make sure that I know which one!

Problem 4.1. a) Find a critical value for the sample mean if we are testing whether the population mean is below 30. Clearly state your null and alternative hypotheses (2) b) Assume that the sample mean is 30 minus the second to last digit of your student number. (Use 10 if this digit is zero.) Find a p-value for your null hypothesis. (1) c) Create a power curve for the test (6)

Problem 4.2. a) Find critical values for the sample mean if we are testing whether the population mean is 30. Clearly state your null and alternative hypotheses (2) b) Assume that the sample mean is 30 minus the second to last digit of your student number. (Use 10 if this digit is zero.) Find a p-value for your null hypothesis. (1) c) Create a power curve for the test (8) [37]

Problem 5: In problem 4 we assumed that the population standard deviation is 30. a) Do a 98% confidence interval for the mean using the mean that you found in Problem 3 and assuming that our sample of 30 came from a population of 300. (2) b) How large a sample would we need if we wanted to make the error term no more than 1 and the sample came from an infinite population? (2) c) Using a 98% confidence level and a sample size of 30 create a confidence interval for the population standard deviation using your sample variance or standard deviation from Problem 3. (2) d) Repeat c) assuming that you had a sample of 300. (2) e) Can we say that the standard deviation is significantly different from 30 on the basis of c) and d)? (1) f) Using the data and sample size from problem 3 can we say that the standard deviation is above 30? State your hypotheses and do an appropriate hypothesis test. (3) [49]


Problem 1: Count the number of people in your sample that demand more than $602.50 and make it into a sample proportion. Test the following 3 hypotheses: I) that 60% demand more than $602.50, II) that more than 60% demand more than $602.50 and III) that less than 60% demand more than $602.50, using a 98% confidence level.

For each of these three tests a) state your null and alternative hypotheses (2), b) test each one using a test ratio or a critical value for the proportion (2) and c) find a p-value for the null hypotheses (3). Label each part clearly so that I know which is I, II and III and a), b) c). Make sure that I know where the ‘reject’ zone is. d) Using the proportion you found above, how large a sample would you need to estimate a 2-sided 98% confidence interval for the proportion with an error of at most .001? Assume that your sample is of that size and show that the confidence interval has an error of at most .001. (3) [10] e) (Extra credit) Assume that you are testing the hypothesis that (II) more than 60% demand over $602.50, find the power of the test if you use a sample of 30 and the true proportion is 70% (3)

Solution: The data sets that you had are presented in order. A line divides the numbers above $602.50 from those below. In the last two rows, x_b is the number below 602.50 and x = 30 - x_b is the number above 602.50.

V1 V2 V3 V4 V5 V6 V7 V8 V9 V10

1 484 489 494 499 504 509 514 519 524 529

2 493 498 503 508 513 518 523 528 533 538

3 504 509 514 519 524 529 534 539 544 549

4 514 519 524 529 534 539 544 549 554 559

5 516 521 526 531 536 541 546 551 556 561

6 517 522 527 532 537 542 547 552 557 562

7 525 530 535 540 545 550 555 560 565 570

8 527 532 537 542 547 552 557 562 567 572

9 544 549 554 559 564 569 574 579 584 589

10 552 557 562 567 572 577 582 587 592 597

11 558 563 568 573 578 583 588 593 598 603

12 562 567 572 577 582 587 592 597 602 607

13 583 588 593 598 603 608 613 618 623 628

14 600 605 610 615 620 625 630 635 640 645

15 603 608 613 618 623 628 633 638 643 648

16 611 616 621 626 631 636 641 646 651 656

17 617 622 627 632 637 642 647 652 657 662

18 617 622 627 632 637 642 647 652 657 662

19 621 626 631 636 641 646 651 656 661 666

20 621 626 631 636 641 646 651 656 661 666

21 626 631 636 641 646 651 656 661 666 671

22 627 632 637 642 647 652 657 662 667 672

23 630 635 640 645 650 655 660 665 670 675

24 632 637 642 647 652 657 662 667 672 677

25 633 638 643 648 653 658 663 668 673 678

26 636 641 646 651 656 661 666 671 676 681

27 638 643 648 653 658 663 668 673 678 683

28 639 644 649 654 659 664 669 674 679 684

29 653 658 663 668 673 678 683 688 693 698

30 662 667 672 677 682 687 692 697 702 707

\(x_b\) 14 13 13 13 12 12 12 12 12 10

\(x\) 16 17 17 17 18 18 18 18 18 20

Let \(p\) be the proportion that demand more than $602.50.

a) State your null and alternative hypotheses (2)

I) 60% demand more than $602.50: \(H_0: p = .6\), \(H_1: p \ne .6\)

II) More than 60% demand more than $602.50: \(H_0: p \le .6\), \(H_1: p > .6\)


III) Less than 60% demand more than $602.50: \(H_0: p \ge .6\), \(H_1: p < .6\)

b) Test each hypothesis using a test ratio or a critical value for the proportion (2)

The relevant formulas are in Table 3. \(n = 30\), \(\alpha = .02\), \(z_{\alpha} = z_{.02} = 2.054\) was found in Grass1 and the t table says \(z_{\alpha/2} = z_{.01} = 2.327\).

\(\sigma_p = \sqrt{\frac{p_0 q_0}{n}} = \sqrt{\frac{.6(.4)}{30}} = \sqrt{.008} = .08944\)

Version V1 V2 V3 V4 V5 V6 V7 V8 V9 V10

\(x\) 16 17 17 17 18 18 18 18 18 20

\(\bar p=\frac{x}{n}=\frac{x}{30}\) .5333 .5667 .5667 .5667 .6000 .6000 .6000 .6000 .6000 .6667

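Here is the slice of Table 3 for proportions.

Interval for: Proportion

Confidence Interval: \(p = \bar p \pm z_{\alpha/2}\, s_p\), where \(s_p = \sqrt{\frac{\bar p \bar q}{n}}\) and \(\bar q = 1 - \bar p\)

Hypotheses: \(H_0: p = p_0\), \(H_1: p \ne p_0\)

Test Ratio: \(z = \frac{\bar p - p_0}{\sigma_p}\)

Critical Value: \(p_{cv} = p_0 \pm z_{\alpha/2}\, \sigma_p\), where \(\sigma_p = \sqrt{\frac{p_0 q_0}{n}}\) and \(q_0 = 1 - p_0\)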

I) 60% demand more than $602.50

\(H_0: p = .6\), \(H_1: p \ne .6\)

Critical Value: \(p_{cv} = p_0 \pm z_{\alpha/2}\,\sigma_p = .6 \pm 2.327(.08944) = .6 \pm .2081\)

Make a diagram. Draw a Normal curve centered at .6 with rejection zones below .3919 and above .8081. None of the values of \(\bar p\) falls into the rejection region.

Test Ratio: \(z = \frac{\bar p - p_0}{\sigma_p} = \frac{\bar p - .6}{.08944}\). We reject the null hypothesis unless \(z\) falls between \(-z_{\alpha/2} = -z_{.01} = -2.327\) and \(z_{\alpha/2} = z_{.01} = 2.327\).

V1: \(z = \frac{.5333 - .6}{.08944} = -0.7458\), V2-4: \(z = \frac{.5667 - .6}{.08944} = -0.3723\), V5-9: \(z = \frac{.6000 - .6}{.08944} = 0\), V10: \(z = \frac{.6667 - .6}{.08944} = 0.7458\). None of these falls in the ‘reject’ region.

II) More than 60% demand more than $602.50

\(H_0: p \le .6\), \(H_1: p > .6\)

Critical Value: \(p_{cv} = p_0 + z_{\alpha}\,\sigma_p = .6 + 2.054(.08944) = .6 + .1837 = .7837\). Make a diagram. Draw a Normal curve centered at .6 with a rejection zone above .7837. None of our values of \(\bar p\) falls into the rejection zone.

Test Ratio: \(z = \frac{\bar p - p_0}{\sigma_p} = \frac{\bar p - .6}{.08944}\). We reject the null hypothesis if \(z\) falls above \(z_{.02} = 2.054\).

None of our values of z falls into the rejection zone.

III) Less than 60% demand more than $602.50

\(H_0: p \ge .6\), \(H_1: p < .6\)

Critical Value: \(p_{cv} = p_0 - z_{\alpha}\,\sigma_p = .6 - 2.054(.08944) = .6 - .1837 = .4163\). Make a diagram. Draw a Normal curve centered at .6 with a rejection zone below .4163. None of our values of \(\bar p\) falls into the rejection zone.

Test Ratio: \(z = \frac{\bar p - p_0}{\sigma_p} = \frac{\bar p - .6}{.08944}\). We reject the null hypothesis if \(z\) falls below \(-z_{.02} = -2.054\).

None of our values of z falls into the rejection zone.
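The test-ratio checks in b) can be sketched in Python. This is only a minimal sketch and not part of the original key: the names `proportion_z_test` and `rejects` are hypothetical, and the critical values 2.327 and 2.054 are simply copied from the text. Because the code does not round \(\bar p\) first, its ratios can differ from the hand values in the fourth decimal.

```python
import math

# Counts above $602.50 for Versions 1-10, copied from the x row of the key.
X_COUNTS = [16, 17, 17, 17, 18, 18, 18, 18, 18, 20]


def proportion_z_test(x, n, p0):
    """Return (p_bar, sigma_p, z) for a one-sample test of a proportion."""
    p_bar = x / n
    sigma_p = math.sqrt(p0 * (1 - p0) / n)  # standard error under the null
    z = (p_bar - p0) / sigma_p
    return p_bar, sigma_p, z


def rejects(z, z_two_sided=2.327, z_one_sided=2.054):
    """Reject decisions for tests I (two-sided), II (right) and III (left)."""
    return {
        "I": abs(z) > z_two_sided,
        "II": z > z_one_sided,
        "III": z < -z_one_sided,
    }


if __name__ == "__main__":
    for version, x in enumerate(X_COUNTS, start=1):
        p_bar, sigma_p, z = proportion_z_test(x, 30, 0.6)
        print(version, round(p_bar, 4), round(z, 4), rejects(z))
```

Running it prints one line per version, and every decision comes out as "do not reject", which agrees with the conclusions above.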

c) Find a p-value for the null hypotheses

In response to a student inquiry, I wrote the following paragraph about p-value.

We could say that to compute a p-value from a computed value of z or t: if it is a left-sided test, find the probability below your value of z using what you know about finding Normal probabilities (if it is t, approximate the probability using the t table). If it is a right-sided test, find the probability above your value of z. If it is a 2-sided test and z is negative, proceed as you would in a left-sided test and double the probability. If it is a 2-sided test and z is positive, proceed as you would in a right-sided test and double the probability.
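The rule in the paragraph above can be written as a short Python function. This is a sketch under stated assumptions: the names `normal_cdf` and `p_value` and the tail labels are hypothetical, and the Normal probability comes from `math.erf` instead of the printed table, so answers can differ slightly from table-based values that round z to two decimals.

```python
import math


def normal_cdf(z):
    """Standard Normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_value(z, tail):
    """p-value for a computed z; tail is 'left', 'right' or 'two'."""
    if tail == "left":
        return normal_cdf(z)            # probability below z
    if tail == "right":
        return 1.0 - normal_cdf(z)      # probability above z
    if tail == "two":
        # double the tail probability on the side where z fell
        return 2.0 * (1.0 - normal_cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right' or 'two'")


if __name__ == "__main__":
    for z in (-0.7458, -0.3723, 0.0, 0.7458):
        print(z, [round(p_value(z, t), 4) for t in ("two", "right", "left")])
```

For any z the left-sided and right-sided p-values add to 1, which is a quick check on hand calculations.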

I) 60% demand more than $602.50

\(H_0: p = .6\), \(H_1: p \ne .6\)

V1: \(z = -0.7458\), p-value \(= 2P(z \le -0.7458) = 2(.5 - .2734) = .4532\)

V2-4: \(z = -0.3723\), p-value \(= 2P(z \le -0.3723) = 2(.5 - .1443) = .7114\)

V5-9: \(z = 0\), p-value \(= 2P(z \le 0) = 2(.5) = 1.0000\)

V10: \(z = 0.7458\), p-value \(= 2P(z \ge 0.7458) = 2(.5 - .2734) = .4532\)

II) More than 60% demand more than $602.50

\(H_0: p \le .6\), \(H_1: p > .6\)

V1: \(z = -0.7458\), p-value \(= P(z \ge -0.7458) = .5 + .2734 = .7734\)

V2-4: \(z = -0.3723\), p-value \(= P(z \ge -0.3723) = .5 + .1443 = .6443\)

V5-9: \(z = 0\), p-value \(= P(z \ge 0) = .5\)

V10: \(z = 0.7458\), p-value \(= P(z \ge 0.7458) = .5 - .2734 = .2266\)

III) Less than 60% demand more than $602.50

\(H_0: p \ge .6\), \(H_1: p < .6\)

V1: \(z = -0.7458\), p-value \(= P(z \le -0.7458) = .5 - .2734 = .2266\)

V2-4: \(z = -0.3723\), p-value \(= P(z \le -0.3723) = .5 - .1443 = .3557\)

V5-9: \(z = 0\), p-value \(= P(z \le 0) = .5\)

V10: \(z = 0.7458\), p-value \(= P(z \le 0.7458) = .5 + .2734 = .7734\)

d) Using the proportion you found above, how large a sample would you need to estimate a 2-sided 98% confidence interval for the proportion with an error of at most .001? Assume that your sample is of that size and show that the confidence interval has an error of at most .001. (3) [10]

\(n = \frac{pqz^2}{e^2}\)

. I have not worked this out for all versions, and it is up to you to decide what confidence level you will use. A solution for Version 1 with that you could get. n

.

5333

.

6667



.

 

2

2 .

328

2

1203 .

3

 

.

01 is probably the largest result

The sample should be at least 1204.

e) (Extra credit) Assume that you are testing the hypothesis that (II) more than 60% demand over $602.50; find the power of the test if you use a sample of 30 and the true proportion is 70% (3)

The critical value for II) is \(p_{cv} = .6 + 2.054(.08944) = .7837\). Find \(P(\bar p \ge .7837 \mid p = .7) = P\left(z \ge \frac{.7837 - .7}{\sqrt{\frac{.7(.3)}{30}}}\right) = P(z \ge 1.00) = .5 - .3413 = .1587\). If your answer was close to this and I didn’t give you credit, complain.
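The power calculation in e) can be checked with a few lines of Python. This is a hedged sketch, not the key's own computation: `power_right_sided` is a hypothetical helper, the value 2.054 is taken from the text, and the Normal probability is computed with `math.erf` in place of the table, so the result should be close to, but not identical with, a table-based answer.

```python
import math


def normal_cdf(z):
    """Standard Normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def power_right_sided(p0, p1, n, z_alpha):
    """Critical value and power of a right-sided test of H0: p <= p0.

    Power is P(p_bar > p_cv) when the true proportion is p1.
    """
    sigma0 = math.sqrt(p0 * (1 - p0) / n)   # standard error under H0
    p_cv = p0 + z_alpha * sigma0            # reject when p_bar exceeds this
    sigma1 = math.sqrt(p1 * (1 - p1) / n)   # standard error under the true p1
    z = (p_cv - p1) / sigma1
    return p_cv, 1.0 - normal_cdf(z)


if __name__ == "__main__":
    p_cv, power = power_right_sided(p0=0.6, p1=0.7, n=30, z_alpha=2.054)
    print(round(p_cv, 4), round(power, 4))
```

The power is low (roughly one chance in six of rejecting), which is what one would expect with only 30 observations and a true proportion just 10 points above the null value.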


Problem 2: Assume that the underlying data for problem 1 is not Normal and using the data for problem 1 test the following three hypotheses: I) that the median demand is $602.50, II) that median demand is more than $602.50 and III) that the median demand is less than $602.50, using a 98% confidence level. a) state your null and alternative hypotheses and the hypotheses that you will actually test for each of the 3 tests (3), b) test each one using a test ratio or a critical value (3), c) find a p-value for the 2-sided test and explain whether and why it would lead to a rejection of the null hypothesis at the 95% confidence level (1), d) (extra credit) Show explicitly what the conclusion in c) would be if the sample of 30 came from a population of 60. (1) e) (extra credit) find a two sided confidence interval for the median (2) [17]

Let \(p\) be the proportion that demand more than $602.50. The data has been exhibited in Problem 1. We have calculated the following.

Version V1 V2 V3 V4 V5 V6 V7 V8 V9 V10

\(x\) 16 17 17 17 18 18 18 18 18 20

\(\bar p=\frac{x}{n}=\frac{x}{30}\) .5333 .5667 .5667 .5667 .6000 .6000 .6000 .6000 .6000 .6667

The relevant formulas are in Table 3. \(n = 30\), \(\alpha = .02\), \(z_{\alpha} = z_{.02} = 2.054\) was found in Grass1 and the t table says \(z_{\alpha/2} = z_{.01} = 2.327\).

\(\sigma_p = \sqrt{\frac{p_0 q_0}{n}} = \sqrt{\frac{.5(.5)}{30}} = \sqrt{.00833} = .091287\).

If we check the table in the outline { 252ones }, we have the correspondences below, where \(\eta\) is the median. We will use the hypotheses about a proportion on the left.

Hypotheses about a median | Hypotheses about a proportion if \(p\) is the proportion above \(\eta_0\) | Hypotheses about a proportion if \(p\) is the proportion below \(\eta_0\)

\(H_0: \eta = \eta_0\), \(H_1: \eta \ne \eta_0\) | \(H_0: p = .5\), \(H_1: p \ne .5\) | \(H_0: p = .5\), \(H_1: p \ne .5\)

\(H_0: \eta \ge \eta_0\), \(H_1: \eta < \eta_0\) | \(H_0: p \ge .5\), \(H_1: p < .5\) | \(H_0: p \le .5\), \(H_1: p > .5\)

\(H_0: \eta \le \eta_0\), \(H_1: \eta > \eta_0\) | \(H_0: p \le .5\), \(H_1: p > .5\) | \(H_0: p \ge .5\), \(H_1: p < .5\)

a) State your null and alternative hypotheses and the hypotheses that you will actually test for each of the 3 tests (3) b) Test each one using a test ratio or a critical value (3)

I) The median demand is $602.50

\(H_0: \eta = 602.50\), \(H_1: \eta \ne 602.50\), which we test as \(H_0: p = .5\), \(H_1: p \ne .5\)

Critical Value: \(p_{cv} = p_0 \pm z_{\alpha/2}\,\sigma_p = .5 \pm 2.327(.091287) = .5 \pm .2124\). Make a diagram. Draw a Normal curve centered at .5 with rejection zones below .2876 and above .7124. None of the values of \(\bar p\) fall into the rejection region.

Test Ratio: \(z = \frac{\bar p - p_0}{\sigma_p} = \frac{\bar p - .5}{.091287}\). We reject the null hypothesis unless \(z\) falls between \(-z_{\alpha/2} = -z_{.01} = -2.327\) and \(z_{\alpha/2} = z_{.01} = 2.327\).

V1: \(z = \frac{.5333 - .5}{.091287} = 0.3648\), V2-4: \(z = \frac{.5667 - .5}{.091287} = 0.7307\), V5-9: \(z = \frac{.6000 - .5}{.091287} = 1.0954\), V10: \(z = \frac{.6667 - .5}{.091287} = 1.8261\). None of these falls in the ‘reject’ region.


II) The median demand is more than $602.50

\(H_0: \eta \le 602.50\), \(H_1: \eta > 602.50\), which we test as \(H_0: p \le .5\), \(H_1: p > .5\)

Critical Value: This is a right-sided test so our critical value is \(p_{cv} = p_0 + z_{\alpha}\,\sigma_p = .5 + 2.054(.091287) = .5 + .1875 = .6875\). Make a diagram. Draw a Normal curve centered at .5 with a rejection zone above .6875. None of the values of \(\bar p\) fall into the rejection region.

Test Ratio: \(z = \frac{\bar p - p_0}{\sigma_p} = \frac{\bar p - .5}{.091287}\). We reject the null hypothesis if \(z\) falls above \(z_{\alpha} = z_{.02} = 2.054\). V1: \(z = 0.3648\), V2-4: \(z = 0.7307\), V5-9: \(z = 1.0954\), V10: \(z = 1.8261\). None of these falls in the ‘reject’ region.

III) The median demand is less than $602.50

\(H_0: \eta \ge 602.50\), \(H_1: \eta < 602.50\), which we test as \(H_0: p \ge .5\), \(H_1: p < .5\)

Critical Value: This is a left-sided test so our critical value is \(p_{cv} = p_0 - z_{\alpha}\,\sigma_p = .5 - 2.054(.091287) = .5 - .1875 = .3125\). Make a diagram. Draw a Normal curve centered at .5 with a rejection zone below .3125. None of the values of \(\bar p\) fall into the rejection region.

Test Ratio: \(z = \frac{\bar p - p_0}{\sigma_p} = \frac{\bar p - .5}{.091287}\). We reject the null hypothesis if \(z\) falls below \(-z_{\alpha} = -z_{.02} = -2.054\). V1: \(z = 0.3648\), V2-4: \(z = 0.7307\), V5-9: \(z = 1.0954\), V10: \(z = 1.8261\). None of these falls in the ‘reject’ region.
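The median tests in b) amount to counting how many observations lie above 602.50 and testing that count against \(p = .5\). A minimal Python sketch follows, using the Version 1 data from Problem 1. The function name `sign_test_z` is hypothetical, and the code uses the unrounded proportion, so its ratio may differ from the hand value in the third or fourth decimal.

```python
import math

# Version 1 data, copied from the ordered table in Problem 1.
V1 = [484, 493, 504, 514, 516, 517, 525, 527, 544, 552,
      558, 562, 583, 600, 603, 611, 617, 617, 621, 621,
      626, 627, 630, 632, 633, 636, 638, 639, 653, 662]


def sign_test_z(data, eta0):
    """Return (x, n, z) where x counts the values above eta0.

    z = (p_bar - .5) / sqrt(.25 / n), which equals (2x - n) / sqrt(n).
    Values exactly equal to eta0 are dropped (none occur in this data).
    """
    kept = [v for v in data if v != eta0]
    n = len(kept)
    x = sum(1 for v in kept if v > eta0)
    z = (2 * x - n) / math.sqrt(n)
    return x, n, z


if __name__ == "__main__":
    x, n, z = sign_test_z(V1, 602.50)
    print(x, n, round(z, 4))
    # Two-sided test at alpha = .02: reject only if |z| > 2.327
    print("reject two-sided:", abs(z) > 2.327)
```

The form \((2x - n)/\sqrt n\) is the version without continuity correction; it is algebraically the same as the ratio with \(\sigma_p = .091287\) used above.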

Alternate formulas for this section include those below.

i. Test Ratio: With continuity correction \(z = \frac{\bar p \pm \frac{.5}{n} - p_0}{\sigma_p}\), where \(\sigma_p = \sqrt{\frac{p_0 q_0}{n}}\) and the sign is chosen to move \(\bar p\) toward \(p_0\), or \(z = \frac{2x \pm 1 - n}{\sqrt n}\). This is the same as testing the uncorrected \(z\) against \(\pm\left(z_{\alpha/2} + \frac{.5}{n\,\sigma_p}\right)\). Without continuity correction \(z = \frac{2x - n}{\sqrt n}\). To allow for a finite population, divide these by \(\sqrt{\frac{N-n}{N-1}}\).

ii. Critical Value: \(p_{cv} = p_0 \pm \left(\frac{1}{2n} + z_{\alpha/2}\,\sigma_p\right)\). To make a finite population correction, multiply \(\sigma_p\) by \(\sqrt{\frac{N-n}{N-1}}\).

iii. Confidence Interval: \(p = \bar p \pm \left(\frac{1}{2n} + z_{\alpha/2}\, s_p\right)\). To make a finite population correction, multiply \(s_p\) by \(\sqrt{\frac{N-n}{N-1}}\).

c) Find a p-value for the 2-sided test and explain whether and why it would lead to a rejection of the null hypothesis at the 95% confidence level (1). See problem 1 for an explanation of p-value.

I) The median demand is $602.50

\(H_0: \eta = 602.50\), \(H_1: \eta \ne 602.50\), which we test as \(H_0: p = .5\), \(H_1: p \ne .5\)

V1: \(z = 0.3648\), p-value \(= 2P(z \ge 0.3648) = 2(.5 - .1406) = .7188\)

V2-4: \(z = 0.7307\), p-value \(= 2P(z \ge 0.7307) = 2(.5 - .2673) = .4654\)

V5-9: \(z = 1.0954\), p-value \(= 2P(z \ge 1.0954) = 2(.5 - .3621) = .2758\)

V10: \(z = 1.8261\), p-value \(= 2P(z \ge 1.8261) = 2(.5 - .4664) = .0672\)

Since none of these is below the significance level of 5%, none of these leads to a rejection of the null hypothesis at a 95% confidence level.


II) The median demand is more than $602.50

\(H_0: \eta \le 602.50\), \(H_1: \eta > 602.50\), which we test as \(H_0: p \le .5\), \(H_1: p > .5\)

V1: \(z = 0.3648\), p-value \(= P(z \ge 0.3648) = .5 - .1406 = .3594\)

V2-4: \(z = 0.7307\), p-value \(= P(z \ge 0.7307) = .5 - .2673 = .2327\)

V5-9: \(z = 1.0954\), p-value \(= P(z \ge 1.0954) = .5 - .3621 = .1379\)

V10: \(z = 1.8261\), p-value \(= P(z \ge 1.8261) = .5 - .4664 = .0336\)

Since only the p-value for Version 10 is below the significance level of 5%, only in version 10 do we reject the null hypothesis at a 95% confidence level.

III) The median demand is less than $602.50

\(H_0: \eta \ge 602.50\), \(H_1: \eta < 602.50\), which we test as \(H_0: p \ge .5\), \(H_1: p < .5\)

V1: \(z = 0.3648\), p-value \(= P(z \le 0.3648) = .5 + .1406 = .6406\)

V2-4: \(z = 0.7307\), p-value \(= P(z \le 0.7307) = .5 + .2673 = .7673\)

V5-9: \(z = 1.0954\), p-value \(= P(z \le 1.0954) = .5 + .3621 = .8621\)

V10: \(z = 1.8261\), p-value \(= P(z \le 1.8261) = .5 + .4664 = .9664\)

Since none of these is below the significance level of 5%, none of these leads to a rejection of the null hypothesis at a 95% confidence level.

d) (Extra credit) Show explicitly what the conclusion in c) would be if the sample of 30 came from a population of 60. (1)

\(\sigma_p = \sqrt{\frac{N-n}{N-1}}\sqrt{\frac{p_0 q_0}{n}} = \sqrt{\frac{60-30}{60-1}}\sqrt{\frac{.5(.5)}{30}} = \sqrt{0.50847(.00833)} = \sqrt{.0042373} = .065094\)

The test ratio is \(z = \frac{\bar p - p_0}{\sigma_p} = \frac{\bar p - .5}{.065094}\) and is now larger in absolute value than it was in c). We can put the p-values for the one-sided hypotheses under the hypotheses in the table below.

Version | z-score | \(H_0: \eta \ge 602.50\), \(H_1: \eta < 602.50\) (\(H_0: p \ge .5\), \(H_1: p < .5\)) | \(H_0: \eta \le 602.50\), \(H_1: \eta > 602.50\) (\(H_0: p \le .5\), \(H_1: p > .5\))

V1 | \(z = \frac{.5333 - .5}{.065094} = 0.5116\) | \(P(z \le 0.5116) = .5 + .1950 = .6950\) | \(P(z \ge 0.5116) = .5 - .1950 = .3050\)

V2-4 | \(z = \frac{.5667 - .5}{.065094} = 1.0247\) | \(P(z \le 1.0247) = .5 + .3461 = .8461\) | \(P(z \ge 1.0247) = .5 - .3461 = .1539\)

V5-9 | \(z = \frac{.6000 - .5}{.065094} = 1.5362\) | \(P(z \le 1.5362) = .5 + .4382 = .9382\) | \(P(z \ge 1.5362) = .5 - .4382 = .0618\)

V10 | \(z = \frac{.6667 - .5}{.065094} = 2.5609\) | \(P(z \le 2.5609) = .5 + .4948 = .9948\) | \(P(z \ge 2.5609) = .5 - .4948 = .0052\)

If we look at these, mentally double the smaller of the two probabilities and compare the p-values with \(\alpha = .05\), we see that, though some of these p-values have fallen, the only change to our results is that for Version 10 we will now reject the null hypothesis for \(H_0: \eta = 602.50\), \(H_1: \eta \ne 602.50\) (\(H_0: p = .5\), \(H_1: p \ne .5\)) as well as for the right-sided test.

e) (Extra credit) Find a two sided confidence interval for the median (2). At this point I’m unsure if there is any significance level implied. The easiest way for me to do this is to copy out the first part of the binomial table for \(n = 30\). { bin } Recall that if we take the \(k\)th number from both the top and the bottom as our interval, we get a significance level of \(\alpha = 2P(x \le k - 1)\), when \(p = .5\). For example if \(n = 50\), \(2P(x \le 19 - 1) = 2(.03245) = .06490\), \(2P(x \le 18 - 1) = 2(.01642) = .03284\), \(2P(x \le 17 - 1) = 2(.00767) = .01534\) and \(2P(x \le 16 - 1) = 2(.00330) = .00660\). If the confidence level is to be at least \(1 - \alpha\), the significance level must be at most \(\alpha\). So, if we want a 95% confidence interval we need \(k = 18\) (and \(50 - k + 1 = 33\)), for a 98% confidence level we need \(k = 17\) (and 34) and for a 99% confidence level we need \(k = 16\) (and 35).

Unfortunately, we do not have a binomial table for \(n = 30\) so we must try \(k = \frac{n + 1 - z_{\alpha/2}\sqrt n}{2}\). For 95% this would be \(k = \frac{30 + 1 - 1.960\sqrt{30}}{2} = 10.13\), for 98% this would be \(k = \frac{30 + 1 - 2.327\sqrt{30}}{2} = 9.13\) and for 99% this would be \(k = \frac{30 + 1 - 2.576\sqrt{30}}{2} = 8.44\). These are all rounded down and are paired with the number with index \(30 - k + 1\), which takes the values 21, 22 and 23.

The confidence intervals are given on the following table.

95% confidence interval for the median

Index V1 V2 V3 V4 V5 V6 V7 V8 V9 V10

10 552 557 562 567 572 577 582 587 592 597

to

21 626 631 636 641 646 651 656 661 666 671

98% confidence interval for the median

Index V1 V2 V3 V4 V5 V6 V7 V8 V9 V10

9 544 549 554 559 564 569 574 579 584 589

to

22 627 632 637 642 647 652 657 662 667 672

99% confidence interval for the median

Index V1 V2 V3 V4 V5 V6 V7 V8 V9 V10

8 527 532 537 542 547 552 557 562 567 572

to

23 630 635 640 645 650 655 660 665 670 675
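The index rule for the median interval can be sketched in Python. This is an illustrative sketch only: `approx_k` follows the formula \(k = (n + 1 - z_{\alpha/2}\sqrt n)/2\) quoted in e), while `exact_k` is a hypothetical helper that replaces the binomial table by summing binomial probabilities directly with `math.comb`.

```python
import math


def approx_k(n, z_half_alpha):
    """Normal-approximation index k = (n + 1 - z*sqrt(n)) / 2, rounded down."""
    return math.floor((n + 1 - z_half_alpha * math.sqrt(n)) / 2)


def exact_k(n, alpha):
    """Largest k with 2*P(X <= k - 1) <= alpha for X ~ Binomial(n, .5).

    Returns 0 if even k = 1 gives a significance level above alpha.
    """
    cumulative = 0.0
    k = 0
    for j in range(n // 2 + 1):
        cumulative += math.comb(n, j) * 0.5 ** n   # now equals P(X <= j)
        if 2 * cumulative <= alpha:
            k = j + 1                               # here k - 1 = j
        else:
            break
    return k


if __name__ == "__main__":
    for level, z, alpha in ((95, 1.960, 0.05), (98, 2.327, 0.02), (99, 2.576, 0.01)):
        k = approx_k(30, z)
        print(level, "approx k:", k, "upper index:", 30 - k + 1,
              "exact k:", exact_k(30, alpha))
```

For \(n = 30\) the approximate and exact indices agree at all three confidence levels, so the intervals in the tables above would not change if the binomial distribution were used directly.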


Problem 3: a) Find the sample mean and sample standard deviation for the data in Problem 1 (1) b) Test the hypothesis that the mean is 602.50 using critical values for the sample mean, first stating your hypotheses clearly. Use a 98% confidence level (2) c) Test the hypothesis in b) using a test ratio. Find an approximate p-value and state and explain whether this will lead to a rejection of the null hypothesis if we continue to use a 98% confidence level. (2) d) Using the test ratio you found in c) find a p-value for the null hypothesis that the mean is at most 602.50 (1) e) Using the test ratio you found in c) find a p-value for the null hypothesis that the mean is at least 602.50 (1) f) Test the null hypothesis that the mean is at most 602.50 using an appropriate confidence interval (1) g) Test the null hypothesis that the mean is at least 602.50 using an appropriate confidence interval (1) [26]

The data in use are as below.

V1 V2 V3 V4 V5 V6 V7 V8 V9 V10

1 484 489 494 499 504 509 514 519 524 529

2 653 658 663 668 673 678 683 688 693 698

3 527 532 537 542 547 552 557 562 567 572

4 600 605 610 615 620 625 630 635 640 645

5 552 557 562 567 572 577 582 587 592 597

6 662 667 672 677 682 687 692 697 702 707

7 583 588 593 598 603 608 613 618 623 628

8 544 549 554 559 564 569 574 579 584 589

9 558 563 568 573 578 583 588 593 598 603

10 525 530 535 540 545 550 555 560 565 570

11 504 509 514 519 524 529 534 539 544 549

12 611 616 621 626 631 636 641 646 651 656

13 617 622 627 632 637 642 647 652 657 662

14 621 626 631 636 641 646 651 656 661 666

15 493 498 503 508 513 518 523 528 533 538

16 562 567 572 577 582 587 592 597 602 607

17 626 631 636 641 646 651 656 661 666 671

18 633 638 643 648 653 658 663 668 673 678

19 516 521 526 531 536 541 546 551 556 561

20 639 644 649 654 659 664 669 674 679 684

21 630 635 640 645 650 655 660 665 670 675

22 617 622 627 632 637 642 647 652 657 662

23 514 519 524 529 534 539 544 549 554 559

24 638 643 648 653 658 663 668 673 678 683

25 621 626 631 636 641 646 651 656 661 666

26 603 608 613 618 623 628 633 638 643 648

27 632 637 642 647 652 657 662 667 672 677

28 627 632 637 642 647 652 657 662 667 672

29 517 522 527 532 537 542 547 552 557 562

30 636 641 646 651 656 661 666 671 676 681

Minitab offers the summary statistics below.

Version \(n\) \(\bar x\) \(s_{\bar x}\) \(s_x\) Q1 Median Q3 \(\sum x\) \(\sum x^2\)

1 30 584.83 9.91 54.26 526.50 607.00 630.50 17545 10346275

2 30 589.83 9.91 54.26 531.50 612.00 635.50 17695 10522475

3 30 594.83 9.91 54.26 536.50 617.00 640.50 17845 10700175

4 30 599.83 9.91 54.26 541.50 622.00 645.50 17995 10879375

5 30 604.83 9.91 54.26 546.50 627.00 650.50 18145 11060075

6 30 609.83 9.91 54.26 551.50 632.00 655.50 18295 11242275

7 30 614.83 9.91 54.26 556.50 637.00 660.50 18445 11425975

8 30 619.83 9.91 54.26 561.50 642.00 665.50 18595 11611175

9 30 624.83 9.91 54.26 566.50 647.00 670.50 18745 11797875

10 30 629.83 9.91 54.26 571.50 652.00 675.50 18895 11986075

a) Find the sample mean and sample standard deviation for the data in Problem 1 (1)

Solution: There isn’t a good reason to repeat the calculations here for more than one example, so I will stick to Version 1.

Index \(x\) \(x^2\) \(x-\bar x\) \((x-\bar x)^2\)

1 484 234256 -100.833 10167.4

2 653 426409 68.167 4646.7

3 527 277729 -57.833 3344.7

4 600 360000 15.167 230.0

5 552 304704 -32.833 1078.0

6 662 438244 77.167 5954.7

7 583 339889 -1.833 3.4

8 544 295936 -40.833 1667.4

9 558 311364 -26.833 720.0

10 525 275625 -59.833 3580.0

11 504 254016 -80.833 6534.0

12 611 373321 26.167 684.7

13 617 380689 32.167 1034.7

14 621 385641 36.167 1308.0

15 493 243049 -91.833 8433.4

16 562 315844 -22.833 521.4

17 626 391876 41.167 1694.7

18 633 400689 48.167 2320.0

19 516 266256 -68.833 4738.0

20 639 408321 54.167 2934.0

21 630 396900 45.167 2040.0

22 617 380689 32.167 1034.7

23 514 264196 -70.833 5017.4

24 638 407044 53.167 2826.7

25 621 385641 36.167 1308.0

26 603 363609 18.167 330.0

27 632 399424 47.167 2224.7

28 627 393129 42.167 1778.0

29 517 267289 -67.833 4601.4

30 636 404496 51.167 2618.0

Sum 17545 10346275 0.000 85374.2

For these numbers \(\sum x = 17545\), \(\sum x^2 = 10346275\) and \(n = 30\). This means that \(\bar x = \frac{\sum x}{n} = \frac{17545}{30} = 584.8333\). If we subtract this mean from all 30 numbers in the first column, we get the 3rd column which has the sum \(\sum (x - \bar x) = 0\). If we square the 3rd column, we get \(\sum (x - \bar x)^2 = 85374.2\).

Using the computational or definitional formula, we get the following.

\(s^2 = \frac{\sum x^2 - n\bar x^2}{n - 1} = \frac{10346275 - 30(584.8333)^2}{29} = \frac{\sum (x - \bar x)^2}{n - 1} = \frac{85374.17}{29} = 2943.9367\)

So \(s_x = \sqrt{2943.9367} = 54.25806\) and \(s_{\bar x} = \frac{s_x}{\sqrt n} = \sqrt{\frac{2943.9367}{30}} = \sqrt{98.1312} = 9.9061\).

You could have gotten this using the shortcut at the beginning of the Takehome document as follows. With \(a = 5\), \(\sum x = \sum (x - a) + na = 17395 + 30(5) = 17545\) and \(\sum x^2 = \sum (x - a)^2 + 2a\sum (x - a) + na^2 = 10171575 + 2(5)(17395) + 30(5)^2 = 10346275\).
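The arithmetic in a) can be reproduced with a short Python check. This is a sketch for verification only, not the method the key prescribes: the function name `summary_stats` is hypothetical, and it uses the computational formula \(s^2 = (\sum x^2 - n\bar x^2)/(n-1)\) on the Version 1 data as listed in the table above.

```python
import math

# Version 1 data in the order used in Problem 3.
V1 = [484, 653, 527, 600, 552, 662, 583, 544, 558, 525,
      504, 611, 617, 621, 493, 562, 626, 633, 516, 639,
      630, 617, 514, 638, 621, 603, 632, 627, 517, 636]


def summary_stats(data):
    """Return sum, sum of squares, mean, sample sd and standard error."""
    n = len(data)
    total = sum(data)
    total_sq = sum(v * v for v in data)
    mean = total / n
    variance = (total_sq - n * mean ** 2) / (n - 1)   # computational formula
    sd = math.sqrt(variance)
    se = sd / math.sqrt(n)                            # std. error of the mean
    return total, total_sq, mean, sd, se


if __name__ == "__main__":
    total, total_sq, mean, sd, se = summary_stats(V1)
    print(total, total_sq, round(mean, 4), round(sd, 4), round(se, 4))
```

Because every other version adds a constant to each observation, only the mean changes from version to version; the standard deviation and standard error stay the same, as the Minitab summary shows.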

b) Test the hypothesis that the mean is 602.50 using critical values for the sample mean, first stating your hypotheses clearly. Use a 98% confidence level (2)

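Solution: As usual, we go back to Table 3.

Interval for: Mean (\(\sigma\) unknown)

Confidence Interval: \(\mu = \bar x \pm t_{\alpha/2}\, s_{\bar x}\), with \(DF = n - 1\)

Hypotheses: \(H_0: \mu = \mu_0\), \(H_1: \mu \ne \mu_0\)

Test Ratio: \(t = \frac{\bar x - \mu_0}{s_{\bar x}}\)

Critical Value: \(\bar x_{cv} = \mu_0 \pm t_{\alpha/2}\, s_{\bar x}\), where \(s_{\bar x} = \frac{s}{\sqrt n}\)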

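Note that \(\alpha = .02\) and \(n = 30\), so that \(t_{\alpha/2}^{(n-1)} = t_{.01}^{(29)} = 2.462\). Recall that \(s_{\bar x} = 9.9061\) and \(H_0: \mu = 602.50\), \(H_1: \mu \ne 602.50\).

\(\bar x_{cv} = \mu_0 \pm t_{\alpha/2}\, s_{\bar x} = 602.50 \pm 2.462(9.9061) = 602.50 \pm 24.39\) or 578.11 to 626.89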

Make a diagram. Show an approximately Normal curve with a mean at 602.50 and shaded ‘reject’ zones above 626.89 and below 578.11. None of the means below will fall into a ‘reject’ zone except the sample mean for Version 10.

Version \(\bar x\)

1 584.83

2 589.83

3 594.83

4 599.83

5 604.83

6 609.83

7 614.83

8 619.83

9 624.83

10 629.83

c) Test the hypothesis in b) using a test ratio. Find an approximate p-value and state and explain whether this will lead to a rejection of the null hypothesis if we continue to use a 98% confidence level. (2)

\(t = \frac{\bar x - \mu_0}{s_{\bar x}} = \frac{\bar x - 602.50}{9.9061}\)

If you wish, make an approximately Normal curve with a mean at zero and ‘reject’ zones above \(t_{.01}^{(29)} = 2.462\) and below -2.462 and compare your value of t with the ratios computed below.

In order to find the p-value, we look at the t table to find the following for 29 degrees of freedom.

df .45 .40 .35 .30 .25 .20 .15 .10 .05 .025 .01 .005 .001

29 0.127 0.256 0.389 0.530 0.683 0.854 1.055 1.311 1.699 2.045 2.462 2.756 3.396

Values of t appear in the table below. If we compare \(t = -1.78375\), the ratio for Version 1, with the table values we find \(t_{.05} = 1.699 < 1.78375 < t_{.025} = 2.045\). Since \(t_{.05}^{(29)} = 1.699\) means \(P(t > 1.699) = .05\), we can conclude that \(.025 < P(t > 1.78375) < .05\) or, by symmetry, \(.025 < P(t < -1.78375) < .05\). For a two-sided test the p-value is the probability of getting something as extreme as or more extreme than \(\bar x = 584.83\), which is twice the probability \(P(t < -1.78375)\), so we can say that \(.05 < p\text{-value} < .10\). The rest are shown on the table below.

Version | \(\bar x\) | \(t_{comp}\) | Location of \(t_{comp}\) | Approximate p-value

1 | 584.83 | -1.78375 | \(t_{.05} = 1.699 < \lvert t_{comp} \rvert < t_{.025} = 2.045\) | \(.05 < p\text{-value} < .10\)

2 | 589.83 | -1.27901 | \(t_{.15} = 1.055 < \lvert t_{comp} \rvert < t_{.10} = 1.311\) | \(.20 < p\text{-value} < .30\)

3 | 594.83 | -0.77427 | \(t_{.25} = 0.683 < \lvert t_{comp} \rvert < t_{.20} = 0.854\) | \(.40 < p\text{-value} < .50\)

4 | 599.83 | -0.26953 | \(t_{.40} = 0.256 < \lvert t_{comp} \rvert < t_{.35} = 0.389\) | \(.70 < p\text{-value} < .80\)

5 | 604.83 | 0.23521 | \(t_{.45} = 0.127 < t_{comp} < t_{.40} = 0.256\) | \(.80 < p\text{-value} < .90\)

6 | 609.83 | 0.73995 | \(t_{.25} = 0.683 < t_{comp} < t_{.20} = 0.854\) | \(.40 < p\text{-value} < .50\)

7 | 614.83 | 1.24469 | \(t_{.15} = 1.055 < t_{comp} < t_{.10} = 1.311\) | \(.20 < p\text{-value} < .30\)

8 | 619.83 | 1.74943 | \(t_{.05} = 1.699 < t_{comp} < t_{.025} = 2.045\) | \(.05 < p\text{-value} < .10\)

9 | 624.83 | 2.25417 | \(t_{.025} = 2.045 < t_{comp} < t_{.01} = 2.462\) | \(.02 < p\text{-value} < .05\)

10 | 629.83 | 2.75891 | \(t_{.005} = 2.756 < t_{comp} < t_{.001} = 3.396\) | \(.002 < p\text{-value} < .01\)

If \(\alpha = .02\), only Version 10 would give a rejection of the null hypothesis.

d) Using the test ratio you found in c) find a p-value for the null hypothesis that the mean is at most 602.50 (1)

We are now testing \(H_0: \mu \le 602.50\), \(H_1: \mu > 602.50\). This is a right-sided test.

Values of t are repeated in the table below. If we compare \(t = -1.78375\), the ratio for Version 1, with the table values we find \(t_{.05} = 1.699 < 1.78375 < t_{.025} = 2.045\). Since \(t_{.05}^{(29)} = 1.699\) means \(P(t > 1.699) = .05\), we can conclude that \(.025 < P(t > 1.78375) < .05\) or, by symmetry, \(.025 < P(t < -1.78375) < .05\). For a right-sided test the p-value is the probability of getting something as high as or higher than \(\bar x = 584.83\), which is the probability \(P(t > -1.78375)\), so we can say that \(.95 < p\text{-value} < .975\). The rest are shown on the table below. Note that if the significance level is .02, we will definitely reject the null hypothesis in Version 10 and probably in Version 9.

Version | \(\bar x\) | \(t_{comp}\) | Location of \(t_{comp}\) | Approximate p-value

1 | 584.83 | -1.78375 | \(t_{.05} = 1.699 < \lvert t_{comp} \rvert < t_{.025} = 2.045\) | \(.95 < p\text{-value} < .975\)

2 | 589.83 | -1.27901 | \(t_{.15} = 1.055 < \lvert t_{comp} \rvert < t_{.10} = 1.311\) | \(.85 < p\text{-value} < .90\)

3 | 594.83 | -0.77427 | \(t_{.25} = 0.683 < \lvert t_{comp} \rvert < t_{.20} = 0.854\) | \(.75 < p\text{-value} < .80\)

4 | 599.83 | -0.26953 | \(t_{.40} = 0.256 < \lvert t_{comp} \rvert < t_{.35} = 0.389\) | \(.60 < p\text{-value} < .65\)

5 | 604.83 | 0.23521 | \(t_{.45} = 0.127 < t_{comp} < t_{.40} = 0.256\) | \(.40 < p\text{-value} < .45\)

6 | 609.83 | 0.73995 | \(t_{.25} = 0.683 < t_{comp} < t_{.20} = 0.854\) | \(.20 < p\text{-value} < .25\)

7 | 614.83 | 1.24469 | \(t_{.15} = 1.055 < t_{comp} < t_{.10} = 1.311\) | \(.10 < p\text{-value} < .15\)

8 | 619.83 | 1.74943 | \(t_{.05} = 1.699 < t_{comp} < t_{.025} = 2.045\) | \(.025 < p\text{-value} < .05\)

9 | 624.83 | 2.25417 | \(t_{.025} = 2.045 < t_{comp} < t_{.01} = 2.462\) | \(.01 < p\text{-value} < .025\)

10 | 629.83 | 2.75891 | \(t_{.005} = 2.756 < t_{comp} < t_{.001} = 3.396\) | \(.001 < p\text{-value} < .005\)

e) Using the test ratio you found in c) find a p-value for the null hypothesis that the mean is at least 602.50 (1)

We are now testing \(H_0: \mu \ge 602.50\), \(H_1: \mu < 602.50\). This is a left-sided test.

Values of t are repeated in the table below. If we compare \(t = -1.78375\), the ratio for Version 1, with the table values we find \(t_{.05} = 1.699 < 1.78375 < t_{.025} = 2.045\). Since \(t_{.05}^{(29)} = 1.699\) means \(P(t > 1.699) = .05\), we can conclude that \(.025 < P(t > 1.78375) < .05\) or, by symmetry, \(.025 < P(t < -1.78375) < .05\). For a left-sided test the p-value is the probability of getting something as low as or lower than \(\bar x = 584.83\), which is the probability \(P(t < -1.78375)\), so we can say that \(.025 < p\text{-value} < .05\). The rest are shown on the table below. Note that your p-values for d) and e) should add to 1.

Version | \(\bar x\) | \(t_{comp}\) | Location of \(t_{comp}\) | Approximate p-value

1 | 584.83 | -1.78375 | \(t_{.05} = 1.699 < \lvert t_{comp} \rvert < t_{.025} = 2.045\) | \(.025 < p\text{-value} < .05\)

2 | 589.83 | -1.27901 | \(t_{.15} = 1.055 < \lvert t_{comp} \rvert < t_{.10} = 1.311\) | \(.10 < p\text{-value} < .15\)

3 | 594.83 | -0.77427 | \(t_{.25} = 0.683 < \lvert t_{comp} \rvert < t_{.20} = 0.854\) | \(.20 < p\text{-value} < .25\)

4 | 599.83 | -0.26953 | \(t_{.40} = 0.256 < \lvert t_{comp} \rvert < t_{.35} = 0.389\) | \(.35 < p\text{-value} < .40\)

5 | 604.83 | 0.23521 | \(t_{.45} = 0.127 < t_{comp} < t_{.40} = 0.256\) | \(.55 < p\text{-value} < .60\)

6 | 609.83 | 0.73995 | \(t_{.25} = 0.683 < t_{comp} < t_{.20} = 0.854\) | \(.75 < p\text{-value} < .80\)

7 | 614.83 | 1.24469 | \(t_{.15} = 1.055 < t_{comp} < t_{.10} = 1.311\) | \(.85 < p\text{-value} < .90\)

8 | 619.83 | 1.74943 | \(t_{.05} = 1.699 < t_{comp} < t_{.025} = 2.045\) | \(.95 < p\text{-value} < .975\)

9 | 624.83 | 2.25417 | \(t_{.025} = 2.045 < t_{comp} < t_{.01} = 2.462\) | \(.975 < p\text{-value} < .99\)

10 | 629.83 | 2.75891 | \(t_{.005} = 2.756 < t_{comp} < t_{.001} = 3.396\) | \(.995 < p\text{-value} < .999\)

Note that if the significance level is .02, we will never reject the null hypothesis.

f) Test the null hypothesis that the mean is at most 602.50 using an appropriate confidence interval (1)

I’m surprised that no one called me on this. To do this correctly you need \(t_{.02}^{(29)} = 2.150\), which is not on any of your tables. Here you get points for thinking, so I’ll see what you did. It would be reasonable to use \(t_{.025}\) in place of the 2% value. You also might use \(z_{.02}\), if you explained that you were desperate.

\(H_0: \mu \le 602.50\), \(H_1: \mu > 602.50\). Recall \(\alpha = .02\), \(n = 30\), \(s_{\bar x} = 9.9061\) and the two-sided formula is \(\mu = \bar x \pm t_{\alpha/2}\, s_{\bar x}\), which becomes \(\mu \ge \bar x - t_{\alpha}\, s_{\bar x} = \bar x - 2.150(9.9061) = \bar x - 21.30\). For the results see the table after g).

g) Test the null hypothesis that the mean is at least 602.50 using an appropriate confidence interval (1)

\(H_0: \mu \ge 602.50\), \(H_1: \mu < 602.50\). Recall \(\alpha = .02\), \(t_{.02}^{(29)} = 2.150\), \(n = 30\), \(s_{\bar x} = 9.9061\) and the two-sided formula is \(\mu = \bar x \pm t_{\alpha/2}\, s_{\bar x}\), which becomes \(\mu \le \bar x + t_{\alpha}\, s_{\bar x} = \bar x + 2.150(9.9061) = \bar x + 21.30\). The intervals in both f) and g) contain the sample mean.

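Version | \(\bar x\) | \(H_0: \mu \le 602.50\), interval \(\mu \ge \bar x - t_{\alpha}\, s_{\bar x}\) | \(H_0: \mu \ge 602.50\), interval \(\mu \le \bar x + t_{\alpha}\, s_{\bar x}\)

1 | 584.83 | \(\mu \ge 563.53\) | \(\mu \le 606.13\)

2 | 589.83 | \(\mu \ge 568.53\) | \(\mu \le 611.13\)

3 | 594.83 | \(\mu \ge 573.53\) | \(\mu \le 616.13\)

4 | 599.83 | \(\mu \ge 578.53\) | \(\mu \le 621.13\)

5 | 604.83 | \(\mu \ge 583.53\) | \(\mu \le 626.13\)

6 | 609.83 | \(\mu \ge 588.53\) | \(\mu \le 631.13\)

7 | 614.83 | \(\mu \ge 593.53\) | \(\mu \le 636.13\)

8 | 619.83 | \(\mu \ge 598.53\) | \(\mu \le 641.13\)

9 | 624.83 | \(\mu \ge 603.53\) * | \(\mu \le 646.13\)

10 | 629.83 | \(\mu \ge 608.53\) * | \(\mu \le 651.13\)

Note that the two starred confidence intervals contradict the null hypothesis and thus imply rejection.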
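Parts b) through g) of Problem 3 can be cross-checked with a small Python sketch. It is only an illustration under stated assumptions: the t values 2.462 and 2.150 are copied from the text and are not computed from a t distribution, and the helper name `t_analysis` is hypothetical.

```python
# Sample means for Versions 1-10 and the common standard error from the key.
X_BARS = [584.83, 589.83, 594.83, 599.83, 604.83,
          609.83, 614.83, 619.83, 624.83, 629.83]
SE = 9.9061
MU0 = 602.50
T_TWO_SIDED = 2.462   # t(.01) with 29 df, from the t table
T_ONE_SIDED = 2.150   # t(.02) with 29 df, the value quoted in f)


def t_analysis(x_bar, mu0=MU0, se=SE):
    """Return the test ratio, two-sided decision and one-sided bounds."""
    t = (x_bar - mu0) / se
    lower_cv = mu0 - T_TWO_SIDED * se
    upper_cv = mu0 + T_TWO_SIDED * se
    return {
        "t": t,
        "reject_two_sided": x_bar < lower_cv or x_bar > upper_cv,
        "lower_bound": x_bar - T_ONE_SIDED * se,   # interval mu >= bound
        "upper_bound": x_bar + T_ONE_SIDED * se,   # interval mu <= bound
    }


if __name__ == "__main__":
    for version, x_bar in enumerate(X_BARS, start=1):
        r = t_analysis(x_bar)
        star = "*" if r["lower_bound"] > MU0 else ""
        print(version, round(r["t"], 5), r["reject_two_sided"],
              round(r["lower_bound"], 2), star, round(r["upper_bound"], 2))
```

A lower bound above 602.50 contradicts \(H_0: \mu \le 602.50\), which is how the starred rows in the table arise; no upper bound falls below 602.50, so \(H_0: \mu \ge 602.50\) is never rejected.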


Problem 4: Assume that the population standard deviation is known to be 30 but that we are still working with a problem like Problem 3. (98% confidence level, sample of 30.) Do either Problem 4.1 or Problem 4.2. Make sure that I know which one!

Let’s start with Table 3. \(\alpha = .02\), \(n = 30\), \(\sigma = 30\) and \(\sigma_{\bar x} = \frac{\sigma}{\sqrt n} = \frac{30}{\sqrt{30}} = \sqrt{30} = 5.4772\)

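Interval for: Mean (\(\sigma\) known)

Confidence Interval: \(\mu = \bar x \pm z_{\alpha/2}\, \sigma_{\bar x}\), where \(\sigma_{\bar x} = \frac{\sigma}{\sqrt n}\)

Hypotheses: \(H_0: \mu = \mu_0\), \(H_1: \mu \ne \mu_0\)

Test Ratio: \(z = \frac{\bar x - \mu_0}{\sigma_{\bar x}}\)

Critical Value: \(\bar x_{cv} = \mu_0 \pm z_{\alpha/2}\, \sigma_{\bar x}\)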

Problem 4.1. a) Find a critical value for the sample mean if we are testing whether the population mean is below 30. Clearly state your null and alternative hypotheses (2) b) Assume that the sample mean is 30 minus the second to last digit of your student number. (Use 10 if this digit is zero.) Find a p-value for your null hypothesis. (1) c) Create a power curve for the test (6)

Solution: a) We find a critical value for the sample mean if we are testing whether the population mean is below 30. We state our null and alternative hypotheses. (2)

$H_0: \mu \ge 30$, $H_1: \mu < 30$, $\sigma_{\bar{x}} = 5.4772$, $z_{.02} = 2.054$.

We need a critical value for the mean that is below 30. We use $\bar{x}_{cv} = \mu_0 - z_{\alpha}\,\sigma_{\bar{x}} = 30 - 2.054(5.4772) = 30 - 11.250 = 18.750$.

b) We assume that the sample mean is 30 minus the second to last digit of our student number. (Use 10 if this digit is zero.) We find a p-value for our null hypothesis. $z_{calc} = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - 30}{5.4772}$ will be our test ratio and we will calculate $p\text{-value} = P(z \le z_{calc})$. For example, if the mean is 29, we compute $z_{calc} = \frac{29 - 30}{5.4772} = -0.18$. Using the Normal table we find $p\text{-value} = P(z \le -0.18) = .5 - .0714 = .4286$.
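The critical value in a) and the p-values in b) can be checked with a few lines of Python. This is only a sketch: it uses the standard library's `statistics.NormalDist` in place of the Normal table, and the names `se`, `x_cv` and `p_value_lower` are illustrative choices, not part of the exam.

```python
from math import sqrt
from statistics import NormalDist

# Values from Problem 4: sigma = 30, n = 30, alpha = .02, H0: mu >= 30
sigma, n, mu0, alpha = 30.0, 30, 30.0, 0.02

std_normal = NormalDist()                  # standard Normal: mean 0, sd 1
se = sigma / sqrt(n)                       # standard error of the mean, about 5.4772
z_alpha = std_normal.inv_cdf(1 - alpha)    # about 2.054
x_cv = mu0 - z_alpha * se                  # lower-tail critical value, about 18.75


def p_value_lower(xbar):
    """P(z <= z_calc) for the left-sided test, z_calc = (xbar - mu0) / se."""
    return std_normal.cdf((xbar - mu0) / se)


for xbar in (29, 25, 20):
    print(xbar, round(p_value_lower(xbar), 4))
```

Results from the Normal table will differ slightly (for example .4286 instead of about .4276 for a mean of 29) because the table rounds $z$ to two decimal places.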

The values below were computer generated. Yours should be close.

Version   $\bar{x}$   $z_{calc}$    $P(z \le z_{calc})$
1         29          -0.18258      0.427566
2         28          -0.36515      0.357500
3         27          -0.54773      0.291940
4         26          -0.73030      0.232603
5         25          -0.91288      0.180654
6         24          -1.09545      0.136660
7         23          -1.27803      0.100620
8         22          -1.46060      0.072063
9         21          -1.64318      0.050173
10        20          -1.82575      0.033944

c) We create a power curve for the test. We do not reject the null hypothesis if our sample mean is above $\bar{x}_{cv} = 18.750$. Remember that our hypotheses are $H_0: \mu \ge 30$ and $H_1: \mu < 30$ and that we need a power curve for all possible values of $\mu$ that are below 30. The distance between 30 and the critical value is 30 – 18.750 = 11.25; half of that is 5.625, which we can round to 6. Let’s try using 30, 24, 18.75, 12 and 6 as $\mu_1$.

We will compute $\beta = P(\bar{x} > 18.750 \mid \mu = \mu_1) = P\left(z > \frac{18.750 - \mu_1}{5.4772}\right)$. Make a diagram. Show a Normal curve with a mean of 30 and shade a ‘reject’ zone below 18.750. On the same diagram make a second Normal curve of the same size as the first one with a mean at a value of $\mu_1$ and shade a ‘do not reject’ zone that includes the entire area under the second curve above 18.750. For $\mu_1 = 24$ this becomes $\beta = P(\bar{x} > 18.750 \mid \mu_1 = 24) = P\left(z > \frac{18.750 - 24}{5.4772}\right) = P(z > -0.96) = .5 + .3315 = .8315$, so the power is $1 - .8315 = .1685$. If we let the computer do the dirty work, we get the following.

Point   $\mu_1$   $z_{calc} = \frac{18.750 - \mu_1}{5.4772}$   $\beta = P(z \ge z_{calc})$   power $= 1 - \beta$
1       30.00     -2.05397      0.980011      0.019989
2       24.00     -0.95852      0.831099      0.168901
3       18.75      0.00000      0.500000      0.500000
4       12.00      1.23238      0.108903      0.891097
5        6.00      2.32783      0.009961      0.990039

As was explained in class, you do not need to do the calculations for points 1 and 3 since the power for $\mu_1 = \mu_0$ is always equal to the significance level and the power at $\mu_1 = \bar{x}_{cv}$ is always .5. Graph the power on your y-axis against $\mu_1$ on your x-axis.
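The points on this power curve can be reproduced with a short Python sketch. It assumes the critical value 18.750 found in part a) and again uses `statistics.NormalDist` instead of the Normal table; the function name `power_lower` is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist

se = 30 / sqrt(30)          # sigma / sqrt(n), about 5.4772
x_cv = 18.750               # critical value from part a)
std_normal = NormalDist()   # standard Normal distribution


def power_lower(mu1):
    """Power of the left-sided test when the true mean is mu1."""
    # beta = P(do not reject | mu1) = P(xbar > x_cv | mu1)
    beta = 1 - std_normal.cdf((x_cv - mu1) / se)
    return 1 - beta


for mu1 in (30, 24, 18.75, 12, 6):
    print(f"{mu1:6.2f}  power = {power_lower(mu1):.4f}")
```

Plotting `power_lower` over a finer grid of values below 30 would give a smooth version of the curve that the five points only outline.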

Problem 4.2.
a) Find critical values for the sample mean if we are testing whether the population mean is 30. Clearly state your null and alternative hypotheses. (2)
b) Assume that the sample mean is 30 minus the second to last digit of your student number. (Use 10 if this digit is zero.) Find a p-value for your null hypothesis. (1)
c) Create a power curve for the test. (8) [37]

Solution: a) We find critical values for the sample mean if we are testing whether the population mean is 30. We state our null and alternative hypotheses.

$H_0: \mu = 30$, $H_1: \mu \ne 30$, $\sigma_{\bar{x}} = 5.4772$, $\alpha = .02$ and $z_{.01} = 2.327$.

We need critical values for the sample mean that are both above and below 30. These are $\bar{x}_{cv} = \mu_0 \pm z_{\alpha/2}\,\sigma_{\bar{x}} = 30 \pm 2.327(5.4772) = 30 \pm 12.745$, or 17.255 and 42.745. We reject the null hypothesis if the sample mean does not fall between these values.

b) We assume that the sample mean is 30 minus the second to last digit of our student number (we use 10 if this digit is zero) and find a p-value for our null hypothesis. $z_{calc} = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - 30}{5.4772}$ will be our test ratio and we will calculate $p\text{-value} = 2P(z \le z_{calc})$ (since all our values of $\bar{x}$ are to the left of the alleged mean). For example, if the mean is 29, we compute $z_{calc} = \frac{29 - 30}{5.4772} = -0.18$. Using the Normal table we find $p\text{-value} = 2P(z \le -0.18) = 2(.5 - .0714) = 2(.4286) = .8572$.
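The two-sided critical values and p-values can be checked the same way. This is a sketch under the same assumptions as the text ($\sigma = 30$, $n = 30$, $\alpha = .02$); `statistics.NormalDist` stands in for the Normal table, and the names `lower_cv`, `upper_cv` and `p_value_two_sided` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sigma, n, mu0, alpha = 30.0, 30, 30.0, 0.02

std_normal = NormalDist()                       # standard Normal distribution
se = sigma / sqrt(n)                            # about 5.4772
z_half = std_normal.inv_cdf(1 - alpha / 2)      # about 2.326 (the table gives 2.327)
lower_cv = mu0 - z_half * se                    # about 17.26
upper_cv = mu0 + z_half * se                    # about 42.74


def p_value_two_sided(xbar):
    """Two-sided p-value: twice the tail area beyond |z_calc|."""
    z_calc = (xbar - mu0) / se
    return 2 * std_normal.cdf(-abs(z_calc))


for xbar in (29, 25, 20):
    print(xbar, round(p_value_two_sided(xbar), 4))
```

Using `-abs(z_calc)` makes the function work for sample means on either side of 30, although every version of this exam has a mean below 30.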

If we let the computer do the work, we get the table below. Your results should be similar.

Version   $\bar{x}$   $z_{calc}$    $2P(z \le z_{calc})$
1         29          -0.18258      0.855131
2         28          -0.36515      0.714999
3         27          -0.54773      0.583881
4         26          -0.73030      0.465207
5         25          -0.91288      0.361308
6         24          -1.09545      0.273319
7         23          -1.27803      0.201241
8         22          -1.46060      0.144125
9         21          -1.64318      0.100347
10        20          -1.82575      0.067888

c) We create a power curve for the test. We do not reject the null hypothesis if the sample mean lies between 17.255 and 42.745. The alleged mean is 30 and the distance between 30 and the critical values is 12.745, half of which we can round to 6.5. We need the power for every value of the mean. Let’s try using 30, 36.5, 42.745, 49.5 and 56 for $\mu_1$ on the top side of 30 and 30, 23.5, 17.255, 10.5 and 4 on the bottom side of 30. We will compute $\beta = P(17.255 \le \bar{x} \le 42.745 \mid \mu = \mu_1) = P\left(\frac{17.255 - \mu_1}{5.4772} \le z \le \frac{42.745 - \mu_1}{5.4772}\right)$. For example, if $\mu_1 = 36.5$, we find $\beta = P(17.255 \le \bar{x} \le 42.745 \mid \mu_1 = 36.5) = P\left(\frac{17.255 - 36.5}{5.4772} \le z \le \frac{42.745 - 36.5}{5.4772}\right) = P(-3.51 \le z \le 1.14) = .4998 + .3729 = .8727$, so the power is $1 - .8727 = .1273$.

The table that follows is computer generated. Because of rounding error in the standard deviation, only the first four significant figures of the operating characteristic ($\beta$) and the power columns should be taken seriously, but your results should be very close to these.

Here $z_{calc\,1} = \frac{17.255 - \mu_1}{5.4772}$, $z_{calc\,2} = \frac{42.745 - \mu_1}{5.4772}$, $\beta = P(z_{calc\,1} \le z \le z_{calc\,2})$ and power $= 1 - \beta$.

Point   $\mu_1$    $z_{calc\,1}$   $z_{calc\,2}$   $\beta$      power
1        4.000      2.42003         7.07387        0.007760     0.992240
2       10.500      1.23329         5.88713        0.108733     0.891267
3       17.255      0.00000         4.65384        0.499998     0.500002
4       23.500     -1.14018         3.51366        0.872674     0.127326
5       30.000     -2.32692         2.32692        0.980030     0.019970
6       36.500     -3.51366         1.14018        0.872674     0.127326
7       42.745     -4.65384         0.00000        0.499998     0.500002
8       49.500     -5.88713        -1.23329        0.108733     0.891267
9       56.000     -7.07387        -2.42003        0.007760     0.992240

Of course, this is much less work than it looks like. Only points 1, 2 and 4 need to be computed. Note that points 3 and 7 are at critical values and give powers of .5 and that point 5 is the null hypothesis mean and gives a power equal to the significance level (2%). Also the power for the points 9 through 6 is identical to the power for points 1 through 4, so that only three computations are necessary to compute the operating characteristic curve.
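The symmetry noted above is easy to see if the operating characteristic is written as a function of $\mu_1$. The following Python sketch assumes the critical values 17.255 and 42.745 from part a) and uses `statistics.NormalDist` in place of the Normal table; `beta_two_sided` and `power_two_sided` are illustrative names.

```python
from math import sqrt
from statistics import NormalDist

se = 30 / sqrt(30)                    # sigma / sqrt(n), about 5.4772
lower_cv, upper_cv = 17.255, 42.745   # critical values from part a)
std_normal = NormalDist()             # standard Normal distribution


def beta_two_sided(mu1):
    """Operating characteristic: P(lower_cv <= xbar <= upper_cv | mu1)."""
    z1 = (lower_cv - mu1) / se
    z2 = (upper_cv - mu1) / se
    return std_normal.cdf(z2) - std_normal.cdf(z1)


def power_two_sided(mu1):
    """Power = 1 - beta for the two-sided test."""
    return 1 - beta_two_sided(mu1)


for mu1 in (4, 10.5, 17.255, 23.5, 30, 36.5, 42.745, 49.5, 56):
    print(f"{mu1:7.3f}  beta = {beta_two_sided(mu1):.4f}  power = {power_two_sided(mu1):.4f}")
```

Because the critical values are equally far from 30, values of $\mu_1$ at equal distances above and below 30 give the same power, which is why only three of the nine points need to be worked out by hand.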


Problem 5: In problem 4 we assumed that the population standard deviation is 30.
a) Do a 98% confidence interval for the mean using the mean that you found in Problem 3 and assuming that our sample of 30 came from a population of 300. (2)
b) How large a sample would we need if we wanted to make the error term no more than 1 and the sample came from an infinite population? (2)
c) Using a 98% confidence level and a sample size of 30 create a confidence interval for the population standard deviation using your sample variance or standard deviation from Problem 3. (2)
d) Repeat c) assuming that you had a sample of 300. (2)
e) Can we say that the standard deviation is significantly different from 30 on the basis of c) and d)? (1)
f) Using the data and sample size from problem 3 can we say that the standard deviation is above 30? State your hypotheses and do an appropriate hypothesis test. (3) [49]

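Solution:

Interval for: Mean ($\sigma$ unknown)
Confidence Interval: $\mu = \bar{x} \pm t_{\alpha/2}\,s_{\bar{x}}$, with $DF = n - 1$ and $s_{\bar{x}} = \frac{s}{\sqrt{n}}$
Hypotheses: $H_0: \mu = \mu_0$, $H_1: \mu \ne \mu_0$
Test Ratio: $t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}}$
Critical Value: $\bar{x}_{cv} = \mu_0 \pm t_{\alpha/2}\,s_{\bar{x}}$

a) Do a 98% confidence interval for the mean using the mean that you found in Problem 3 and assuming that our sample of 30 came from a population of 300. (2)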

In problem 3 we found $t_{.01}^{(29)} = 2.462$, $s_x^2 = 2943.9367$, $s_x = 54.25806$ and used $\mu = \bar{x} \pm t_{\alpha/2}\,s_{\bar{x}}$. With the finite population correction, we have the following. $s_{\bar{x}} = \frac{s_x}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}} = (9.9061)\sqrt{\frac{270}{299}} = 9.4135$, so that $\mu = \bar{x} \pm 2.462(9.4135) = \bar{x} \pm 23.18$.

If we use the population variance at the beginning of this problem, $\sigma_x = 30$, $\sigma_x^2 = 900$, $z_{.01} = 2.327$ and $\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}$. With the finite population correction we have the following. $\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}} = (5.4772)\sqrt{\frac{270}{299}} = 5.2048$, so that $\mu = \bar{x} \pm 2.327(5.2048) = \bar{x} \pm 12.11$.

Using the means for the various versions, we can get our intervals easily.

Version   $\bar{x}$   $\mu = \bar{x} \pm 23.18$   $\mu = \bar{x} \pm 12.11$
1         584.83      561.65 to 608.01            572.72 to 596.94
2         589.83      566.65 to 613.01            577.72 to 601.94
3         594.83      571.65 to 618.01            582.72 to 606.94
4         599.83      576.65 to 623.01            587.72 to 611.94
5         604.83      581.65 to 628.01            592.72 to 616.94
6         609.83      586.65 to 633.01            597.72 to 621.94
7         614.83      591.65 to 638.01            602.72 to 626.94
8         619.83      596.65 to 643.01            607.72 to 631.94
9         624.83      601.65 to 648.01            612.72 to 636.94
10        629.83      606.65 to 653.01            617.72 to 641.94

b) How large a sample would we need if we wanted to make the error term no more than 1 and the sample came from an infinite population? (2)

Solution: $n = \frac{z^2 \sigma^2}{e^2}$. Depending on what we believe, we can use $\sigma^2 = 900$ or $s_x^2 = 2943.9367$ in the $\sigma^2$ slot. If the confidence level is 98%, we will use $z_{.01} = 2.327$ and, since $e^2 = 1$, we can leave it out of the equation. We have either $n = 2.327^2(900) = 4873.44$ or $n = 2.327^2(2943.9367) = 15941.21$ and use 4874 or 15942.
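The finite population correction in a) and the sample size formula in b) can be sketched in Python as follows. The correction is applied to the standard error $s/\sqrt{n}$, so the corrected error term is somewhat smaller than the uncorrected $2.462(9.9061)$. The helper names `fpc_standard_error` and `sample_size` are illustrative, and the $t$ and $z$ values are the table values quoted in the text rather than computed ones.

```python
from math import ceil, sqrt


def fpc_standard_error(sd, n, N):
    """Standard error of the mean with the finite population correction."""
    return sd / sqrt(n) * sqrt((N - n) / (N - 1))


def sample_size(z, variance, e):
    """Smallest n with z * sqrt(variance / n) <= e (infinite population)."""
    return ceil(z ** 2 * variance / e ** 2)


# Part a): sample of 30 from a population of 300
se_s = fpc_standard_error(54.25806, 30, 300)     # using s_x, about 9.41
se_sigma = fpc_standard_error(30.0, 30, 300)     # using sigma = 30, about 5.20
print(round(2.462 * se_s, 2))                    # error term with t = 2.462
print(round(2.327 * se_sigma, 2))                # error term with z = 2.327

# Part b): error term of 1 at 98% confidence
print(sample_size(2.327, 900, 1))
print(sample_size(2.327, 2943.9367, 1))
```

Rounding up with `ceil` reflects the usual rule that a fractional sample size must be increased to the next whole observation.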

Problems c-f concern the variance and standard deviation and use formulas from Table 3.

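Interval for: Variance, Small Sample
Confidence Interval: $\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$
Hypotheses: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 \ne \sigma_0^2$
Test Ratio: $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$
Critical Value: $s^2_{cv} = \frac{\chi^2_{\alpha/2}\,\sigma_0^2}{n-1}$ and $s^2_{cv} = \frac{\chi^2_{1-\alpha/2}\,\sigma_0^2}{n-1}$

Interval for: Variance, Large Sample
Confidence Interval: $\frac{s\sqrt{2DF}}{z_{\alpha/2} + \sqrt{2DF}} \le \sigma \le \frac{s\sqrt{2DF}}{-z_{\alpha/2} + \sqrt{2DF}}$
Hypotheses: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 \ne \sigma_0^2$
Test Ratio: $z = \sqrt{2\chi^2} - \sqrt{2DF - 1}$
Critical Value: $s_{cv} = \frac{\sigma_0\sqrt{2DF}}{\pm z_{\alpha/2} + \sqrt{2DF}}$

c) Using a 98% confidence level and a sample size of 30 create a confidence interval for the population standard deviation using your sample variance or standard deviation from Problem 3. (2)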

Solution: Recall the following. $n = 30$, $\alpha = .02$, $s_x^2 = 2943.9367$ or $s_x = 54.25806$. For (30 – 1) degrees of freedom, $\chi^2_{.99} = 14.2565$ and $\chi^2_{.01} = 49.5881$. Take the formula $\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$ and substitute $s_x^2 = 2943.9367$ to get $\frac{29(2943.9367)}{49.5881} \le \sigma^2 \le \frac{29(2943.9367)}{14.2565}$, or $1721.6663 \le \sigma^2 \le 5988.4379$. Or, if we take square roots, $41.49 \le \sigma \le 77.38$.

d) Repeat c) assuming that you had a sample of 300. (2) For the appropriate value of $z_{\alpha/2} = z_{.01} = 2.327$ and the square root of twice the degrees of freedom, $\sqrt{2DF} = \sqrt{2(299)} = 24.45404$, take the formula $\frac{s\sqrt{2DF}}{z_{\alpha/2} + \sqrt{2DF}} \le \sigma \le \frac{s\sqrt{2DF}}{-z_{\alpha/2} + \sqrt{2DF}}$ and substitute $s_x = 54.25806$ to get, for $\alpha = .02$, $\frac{54.25806(24.45404)}{2.327 + 24.45404} \le \sigma \le \frac{54.25806(24.45404)}{-2.327 + 24.45404}$, or $49.54 \le \sigma \le 59.96$.

e) Can we say that the standard deviation is significantly different from 30 on the basis of c) and d)? (1)

It is enough to check our results from the confidence intervals, though a more formal test of $H_0: \sigma = 30$ could be done using $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$ and/or setting $z = \sqrt{2\chi^2} - \sqrt{2DF - 1}$. Simply put, since 30 falls on neither interval, there is a significant difference between 30 and the standard deviation from our sample.

f) Using the data and sample size from problem 3 can we say that the standard deviation is above 30? State your hypotheses and do an appropriate hypothesis test. (3)

The pair of hypotheses $H_0: \sigma \le 30$, $H_1: \sigma > 30$ are equivalent to $H_0: \sigma^2 \le 900$, $H_1: \sigma^2 > 900$. If $n = 30$, so there are 29 degrees of freedom, we can use the test ratio $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{29(2943.9367)}{900} = 94.8602$. If we maintain a 98% confidence level, our ‘reject’ zone will be the area above $\chi^2_{.02}$. Our table does not give us this value, but we can say that $\chi^2_{.01} = 49.5881$ and $\chi^2_{.025} = 45.7224$, so that $\chi^2_{.02}$ must lie between them and 94.8602 must be in the ‘reject’ zone. A p-value approach would observe that the largest number in the df = 29 column is $\chi^2_{.005} = 52.3360$ and, since 94.8602 is above this, $p\text{-value} = P(\chi^2 > 94.8602) < .005 < \alpha = .02$. So we reject $H_0$.
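The arithmetic in parts c), d) and f) can be checked with the Python sketch below. It does not compute chi-squared percentiles; the values 49.5881, 14.2565 and 45.7224 are the table values quoted in the text, and the function names are illustrative assumptions.

```python
from math import sqrt

n, s2 = 30, 2943.9367             # sample size and sample variance from Problem 3
s = sqrt(s2)                      # about 54.258
chi2_01, chi2_99 = 49.5881, 14.2565   # table values for 29 degrees of freedom


def variance_interval(s2, n, chi2_upper_tail, chi2_lower_tail):
    """Small-sample interval: (n-1)s^2 / chi2 for each tail value."""
    df = n - 1
    return df * s2 / chi2_upper_tail, df * s2 / chi2_lower_tail


def large_sample_sd_interval(s, n, z):
    """Large-sample interval for sigma using sqrt(2 * DF)."""
    root = sqrt(2 * (n - 1))
    return s * root / (z + root), s * root / (root - z)


# c) n = 30: interval for the variance, then for the standard deviation
var_lo, var_hi = variance_interval(s2, n, chi2_01, chi2_99)
print(round(sqrt(var_lo), 2), round(sqrt(var_hi), 2))

# d) n = 300: large-sample interval for the standard deviation
sd_lo, sd_hi = large_sample_sd_interval(s, 300, 2.327)
print(round(sd_lo, 2), round(sd_hi, 2))

# f) test ratio for H0: sigma^2 <= 900 against H1: sigma^2 > 900
chi2_stat = (n - 1) * s2 / 900
reject = chi2_stat > chi2_01      # chi2_.02 lies below chi2_.01 = 49.5881
print(round(chi2_stat, 4), reject)
```

Comparing the test ratio with $\chi^2_{.01}$ is conservative here: any value above $\chi^2_{.01}$ is also above the unavailable $\chi^2_{.02}$.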
