CHAPTER 5: Solutions to Selected Exercises

5.1 a. $r_{x_1 x_5} = .910$, $r_{x_2 x_5} = .933$, $r_{x_4 x_5} = .936$

$VIF_2 = 8,933.1$, $VIF_4 = 9,597.6$, $VIF_5 = 23.3$

b. $x_2$, $x_4$, $x_5$

c. Yes, $b_4$ (it says that labor hours go down as patient load increases).

d. Perhaps. All p-values for the independent variables exceed .02, and .02 is 20 times larger than .001.

e. Model 1 seems to be best. It has the smallest s (equivalently, the largest $\bar{R}^2$) and the smallest C. It is chosen by both the stepwise and backward elimination procedures. It has the shorter prediction interval, with length 17,618 – 14,511 = 3,107 compared to 17,601 – 14,460 = 3,141. The p-values for the independent variables are all less than .10.
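The size of the VIFs in part a can be read through the standard identity $VIF_j = 1/(1 - R_j^2)$, where $R_j^2$ comes from regressing $x_j$ on the other independent variables. The following is a minimal Python sketch of that conversion. The function names are hypothetical, the variable labels follow our reading of part a, and the $R_j^2$ values are only implied by the reported VIFs; they are not quoted in the solution.

```python
def vif_from_r2(r2_j: float) -> float:
    """VIF for x_j, given R_j^2 from regressing x_j on the other predictors."""
    if not 0.0 <= r2_j < 1.0:
        raise ValueError("R_j^2 must lie in [0, 1)")
    return 1.0 / (1.0 - r2_j)


def r2_from_vif(vif_j: float) -> float:
    """Invert the identity: R_j^2 = 1 - 1 / VIF_j."""
    if vif_j < 1.0:
        raise ValueError("a VIF cannot be smaller than 1")
    return 1.0 - 1.0 / vif_j


# VIFs as read from 5.1 a (assumed labels); the R_j^2 values are derived here.
reported_vifs = {"x2": 8933.1, "x4": 9597.6, "x5": 23.3}
implied_r2 = {name: r2_from_vif(v) for name, v in reported_vifs.items()}

for name, r2 in implied_r2.items():
    print(f"{name}: VIF = {reported_vifs[name]:,.1f}, implied R_j^2 = {r2:.5f}")
```

A VIF in the thousands corresponds to an $R_j^2$ extremely close to 1, which is why such variables are flagged for multicollinearity.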

5.2

The model using FloorSp and PresPct seems to be best. It has the smallest s (equivalently, the largest $\bar{R}^2$) and the smallest C. All the independent variables have p-values less than .01.

$b_1 = -.003842$: An increase in floor space of 1 square foot corresponds to a decrease in mean prescription sales of 3.842 units when the prescription space percentage remains the same.

$b_2 = -.5819$: An increase of 1% in the percentage allocated to the prescription department corresponds to a decrease in mean prescription sales of 581.9 units when floor space remains the same.

5.3 Model with 12 terms (7 squared and interaction terms): s = 210.70, $R^2$ = .974

Model with 5 linear terms (TIME, MKTPOTEN, ADVER, MKTSHARE, CHANGE): s = 430.25, $R^2$ = .893

The model with 12 terms has a much smaller s and a larger $R^2$.

5.4

Approximate horizontal band appearance. No violations indicated.

5.5

Possible violation of the normality assumption because the histogram appears to be skewed to the right and the normal probability plot deviates from a straight line at both ends.

Possible violation of the constant variance assumption because the residual plot may funnel in.

5.6

No

5.7 a. $P_{(4)} = \dfrac{3i - 1}{3n + 1}(100) = \dfrac{3(4) - 1}{3(11) + 1}(100) = (.3235)(100) = 32.35$

$P_{(10)} = \dfrac{3i - 1}{3n + 1}(100) = \dfrac{3(10) - 1}{3(11) + 1}(100) = (.8529)(100) = 85.29$

b. No
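The arithmetic in 5.7 a can be checked with a few lines of Python. This is only a sketch: the function name `percentile_position` is hypothetical, and n = 11 is taken from the denominator 3(11) + 1 in the worked values.

```python
def percentile_position(i: int, n: int) -> float:
    """Return 100 * (3i - 1) / (3n + 1) for the i-th smallest of n values."""
    if not 1 <= i <= n:
        raise ValueError("i must satisfy 1 <= i <= n")
    return 100.0 * (3 * i - 1) / (3 * n + 1)


n = 11  # sample size implied by the denominator 3(11) + 1
for i in (4, 10):
    print(f"P({i}) = {percentile_position(i, n):.2f}")
```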


5.11 a. b.

5.12 a. b.

5.13 a. b.

5.14 a.

5.8 The residual plot exhibits a possible cyclical appearance.

5.9

In Figure 5.27(b), the points seem to fan out as the number of desktops increases. The service time appears to vary more when more desktops are being serviced. In Figure 5.27(c), the points fan out. The variation of the residuals is greater for a greater number of desktops.

5.10

a. (i) $\hat{y}^* = 5.0206$

95% P.I. for $y^*$ = [4.3402, 5.7010]

(ii) $\hat{y} = e^{5.0206} = 151.5022$

95% P.I. for $y$ = [$e^{4.3402}$, $e^{5.7010}$] = [76.7229, 299.1664]

b.

The residual plot is curved indicating the straight-line model does not fit the data appropriately.
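The back-transformation in 5.10 a (ii) amounts to exponentiating the point prediction and both interval endpoints. A minimal Python sketch follows. It assumes that $y^*$ denotes the natural logarithm of $y$, which the use of $e$ in the solution suggests but does not state, and the variable names are illustrative.

```python
import math

# Values on the transformed (assumed natural-log) scale, from 5.10 a (i).
log_point_prediction = 5.0206
log_interval = (4.3402, 5.7010)

# Exponentiate to return to the original scale of y.
point_prediction = math.exp(log_point_prediction)
interval = tuple(math.exp(bound) for bound in log_interval)

print(f"point prediction: {point_prediction:.4f}")
print(f"95% P.I.: [{interval[0]:.4f}, {interval[1]:.4f}]")
```

Because the exponential function is increasing, the endpoints keep their order, but the back-transformed interval is no longer symmetric about the point prediction.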

Yes. $\hat{y} = 7(25.0069) = 175.048$

95% C.I. [7(21.4335), 7(28.5803)] = [150.0345, 200.0621]

95% P.I. [7(13.3044), 7(36.7094)] = [93.1308, 256.9658]

Allow 200 minutes.

The residual plot has a curved appearance, low in the middle and high at both ends.

The residual plot has a horizontal band appearance.

Yes. $\hat{y} = 220(5.635) = 1239.70$

95% C.I. [220(5.306), 220(5.964)] = [1167.32, 1312.08]

95% P.I. [220(3.994), 220(7.276)] = [878.68, 1,600.72]
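Each C.I. and P.I. endpoint above is written as a multiplier (7 or 220) times an endpoint on a per-unit scale. The Python sketch below reproduces that rescaling. The helper name `scale_interval` is hypothetical, and reading 7 and 220 as values of the independent variable is our assumption rather than something the solution states.

```python
def scale_interval(multiplier: float, interval: tuple[float, float]) -> tuple[float, float]:
    """Multiply both endpoints by a positive constant, keeping their order."""
    if multiplier <= 0:
        raise ValueError("this sketch only handles a positive multiplier")
    lower, upper = interval
    return (multiplier * lower, multiplier * upper)


# Per-unit endpoints copied from the worked intervals above.
ci_7 = scale_interval(7, (21.4335, 28.5803))
pi_7 = scale_interval(7, (13.3044, 36.7094))
ci_220 = scale_interval(220, (5.306, 5.964))
pi_220 = scale_interval(220, (3.994, 7.276))

for label, (lo, hi) in [("C.I. x7", ci_7), ("P.I. x7", pi_7),
                        ("C.I. x220", ci_220), ("P.I. x220", pi_220)]:
    print(f"{label}: [{lo:.4f}, {hi:.4f}]")
```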

The points lie approximately on a straight line, but we should look at some other graphs and the Anderson-Darling statistic for further verification.

b. Large hospitals make it somewhat difficult to look at these plots, but no assumptions appear to be violated. The obvious outliers should be investigated.

5.15 a. They seem about the same except possibly hospital 12.

b. $t_{[.025]}^{(n-(k+2))} = t_{[.025]}^{(16-(3+2))} = 2.201$

$t_{[.005]}^{(n-(k+2))} = t_{[.005]}^{(16-(3+2))} = 3.106$

Since $|d_{12}| = 2.2241$ lies between 2.201 and 3.106, there is some evidence that hospital 12 is an outlier with respect to y, but not strong evidence. We should not be very concerned.


c. Since $2(k+1)/n = 2(3+1)/16 = .5$ and the leverage value of each of the three hospitals exceeds this cutoff, all three hospitals are outliers with respect to the x values.

d. Yes. Cook’s D for hospital 17 is 5.033 when hospital 14 is included and is reduced to 1.317 when hospital 14 is removed.

e. Yes. Cook’s D for original hospital 16 when hospital 14 is removed is 1.384, which is considerably less than 5.033 for hospital 17 when hospital 14 is included.
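The two cutoff rules used in 5.15 b and c can be sketched in Python as follows. The function names and the labels "strong", "some", and "none" are our own shorthand, and the critical values are typed in from the solution rather than computed.

```python
def leverage_cutoff(n: int, k: int) -> float:
    """Cutoff 2(k + 1)/n for flagging outliers with respect to the x values."""
    return 2 * (k + 1) / n


def outlier_evidence(d: float, t_025: float, t_005: float) -> str:
    """Compare |d| with the .025 and .005 t points (assumed supplied by the user)."""
    size = abs(d)
    if size > t_005:
        return "strong"
    if size > t_025:
        return "some"
    return "none"


# 5.15 c: n = 16 hospitals, k = 3 independent variables.
print(leverage_cutoff(16, 3))

# 5.15 b: d_12 = 2.2241 against the 11-degree-of-freedom t points.
print(outlier_evidence(2.2241, t_025=2.201, t_005=3.106))
```

The same comparison applied to a value of 1.4058 returns "none", since 1.4058 is below 2.201.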

5.16 a. $b_4 = 2871.7828$. The mean monthly labor hours for a large hospital are estimated to exceed those for a small (not large) hospital by 2871.7828 hours when the values of the other variables remain the same.

Since the p-value = .0003 < .001, we have very strong evidence that $\beta_4$ is different from 0.

b. $d_{14} = 1.4058$ and $t_{[.025]}^{(n-(k+2))} = t_{[.025]}^{(17-(4+2))} = 2.201$. Since |1.4058| < 2.201, we do not have evidence that hospital 14 is an outlier with respect to y.

c. Cook’s D for hospital 17 has been reduced from 5.033 to .738 by including the dummy variable in the model. Hospital 17 is no longer influential.

d. n = 17, Dummy: 17,030 – 15,175 = 1,855

n = 16, No Dummy: 16,886 – 14,906 = 1,980

n = 17, No Dummy: 17,618 – 14,511 = 3,107

Model with the dummy variable for all 17 hospitals.

e. The model with the dummy variable for all 17 hospitals has a residual plot with the most horizontal band appearance.

f. The best model for evaluating hospitals appears to be $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 D_L + \varepsilon$, using estimation from all 17 hospitals.

It has small p-values for all independent variables.

It has the smallest s and the shortest P.I. for the questionable hospital.

The residual plots seem to have the most horizontal band appearance.

The influence of the individual hospitals on the estimates is low.
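The comparison in 5.16 d comes down to subtracting interval endpoints and picking the smallest length. A short Python sketch using the endpoints quoted there (the dictionary and variable names are illustrative):

```python
# (lower, upper) endpoints of the prediction intervals quoted in 5.16 d.
intervals = {
    "n = 17, Dummy": (15_175, 17_030),
    "n = 16, No Dummy": (14_906, 16_886),
    "n = 17, No Dummy": (14_511, 17_618),
}

# Interval length = upper endpoint minus lower endpoint.
lengths = {label: upper - lower for label, (lower, upper) in intervals.items()}
shortest = min(lengths, key=lengths.get)

for label, length in lengths.items():
    print(f"{label}: {length:,}")
print("shortest:", shortest)
```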

5.17 Removing observation 17 would substantially change the least squares point estimate of $\beta_1$.

5.18

Removing observation 17 would substantially change the point prediction of $y_{17}$.
