

12/14/98 252z9871

2. Data from 1) is repeated below.

$\sum y = 82.6$, $\sum x_1 y = ?$, $\sum x_2 y = 34.1$, $\sum y^2 = 786.64$, $\sum x_1 = 130.0$, $\sum x_1^2 = 2364.0$, $\sum x_2 = 4.0$, $\sum x_2^2 = 4.0$, $\sum x_1 x_2 = 54.0$ and $n = 9$.

a. Do a multiple regression of salary against months and gender. (12)
b. Compute $R^2$ and $R^2$ adjusted for degrees of freedom. Compare both with values for the previous problem. (5)
c. Do an F test to see if gender helps explain salary. (6)
d. Use your equation to predict salary for a female employee with 30 months service. (2)
e. Using the method suggested in the text, make your answer to d into prediction and confidence intervals. (4)
f. Using the results from this problem and the previous one, draw a graph with 3 regression lines showing salary against months for (i) employees in general, (ii) male employees and (iii) female employees. (4)

Solution: a) First, we compute the means: $\bar Y = \frac{\sum Y}{n} = \frac{82.6}{9} = 9.17778$, $\bar X_1 = \frac{130.0}{9} = 14.44444$ and $\bar X_2 = \frac{4.0}{9} = 0.44444$.

Second, we compute or copy $\sum X_1 Y = 1306.50$, $\sum X_2 Y = 34.1$, $\sum Y^2 = 786.64$, $\sum X_1^2 = 2364.0$, $\sum X_1 X_2 = 54.0$ and $\sum X_2^2 = 4.0$.

Third, we compute

$\sum X_1 Y - n \bar X_1 \bar Y = 1306.50 - 9(14.44444)(9.17778) = 113.38889$,
$\sum X_2 Y - n \bar X_2 \bar Y = 34.1 - 9(0.44444)(9.17778) = -2.61111$,
$\sum X_1^2 - n \bar X_1^2 = 2364.0 - 9(14.4444)^2 = 486.22222$,
$\sum X_2^2 - n \bar X_2^2 = 4.0 - 9(0.44444)^2 = 2.22222$,
$\sum X_1 X_2 - n \bar X_1 \bar X_2 = 54.0 - 9(14.4444)(0.44444) = -3.77778$ and
$\sum Y^2 - n \bar Y^2 = 786.64 - 9(9.17778)^2 = 28.55556$.

Fourth, we substitute these numbers into the Simplified Normal Equations:

$\sum X_1 Y - n \bar X_1 \bar Y = b_1 \left( \sum X_1^2 - n \bar X_1^2 \right) + b_2 \left( \sum X_1 X_2 - n \bar X_1 \bar X_2 \right)$
$\sum X_2 Y - n \bar X_2 \bar Y = b_1 \left( \sum X_1 X_2 - n \bar X_1 \bar X_2 \right) + b_2 \left( \sum X_2^2 - n \bar X_2^2 \right)$

which are

$113.3889 = 486.2222 b_1 - 3.7778 b_2$
$-2.6111 = -3.7778 b_1 + 2.2222 b_2$,

and solve them as two equations in two unknowns for $b_1$ and $b_2$. We do this by multiplying the second equation by 1.70, which is 3.7778 divided by 2.2222, so that the two equations become

$113.3889 = 486.2222 b_1 - 3.7778 b_2$
$-4.4389 = -6.4222 b_1 + 3.7778 b_2$.

We then add these two together to get $108.95 = 479.80 b_1$, so that $b_1 = \frac{108.95}{479.80} = 0.22707$. The first of the two normal equations can now be rearranged to $3.7778 b_2 = 486.2222 b_1 - 113.3889$, which, carrying $b_1$ at full precision, gives us $b_2 = -0.78897$. Finally we get $b_0$ by solving $b_0 = \bar Y - b_1 \bar X_1 - b_2 \bar X_2 = 9.17778 - 0.22707(14.4444) - (-0.78897)(0.4444) = 6.24853$. Thus our equation is $\hat Y = b_0 + b_1 X_1 + b_2 X_2 = 6.24853 + 0.22707 X_1 - 0.78897 X_2$.

b) The coefficient of determination is

$R^2 = \frac{b_1 \left( \sum X_1 Y - n \bar X_1 \bar Y \right) + b_2 \left( \sum X_2 Y - n \bar X_2 \bar Y \right)}{\sum Y^2 - n \bar Y^2} = \frac{0.22707(113.3889) + (-0.78897)(-2.6111)}{28.5556} = .97379$.

(The standard error is given by $s_e^2 = \frac{\left( \sum Y^2 - n \bar Y^2 \right) \left( 1 - R^2 \right)}{n - 3}$, but we don't need it yet.) Our results can be summarized below as:

$R^2$     $n$   $k$   $\bar R^2$
.92601    9     1     .9154
.97379    9     2     .9651
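The hand arithmetic in a) and b) can be checked with a short script. What follows is only a minimal Python sketch, not the method the text prescribes: it solves the same two simplified normal equations by Cramer's rule rather than by elimination. The salary and months values are the raw data for these nine employees; the `gender` list is an assumption inferred from the sums above ($\sum X_2 = 4$, $\sum X_1 X_2 = 54$, $\sum X_2 Y = 34.1$), which match coding the last four employees as 1. The helper names are made up for illustration.

```python
# Raw data for the nine employees (salary, months of service).
salary = [7.5, 8.6, 9.1, 10.3, 13.0, 6.2, 8.7, 9.4, 9.8]
months = [6, 10, 12, 18, 30, 5, 13, 15, 21]
# ASSUMPTION: gender coding inferred from the sums in the problem statement.
gender = [0, 0, 0, 0, 0, 1, 1, 1, 1]


def corrected_sum(u, v):
    """Return sum(u*v) - n * mean(u) * mean(v)."""
    n = len(u)
    return sum(a * b for a, b in zip(u, v)) - sum(u) * sum(v) / n


def fit_two_variable(y, x1, x2):
    """Solve the simplified normal equations; return b0, b1, b2 and R^2."""
    s1y, s2y = corrected_sum(x1, y), corrected_sum(x2, y)
    s11, s22 = corrected_sum(x1, x1), corrected_sum(x2, x2)
    s12 = corrected_sum(x1, x2)
    syy = corrected_sum(y, y)
    det = s11 * s22 - s12 ** 2  # determinant of the 2x2 system
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    n = len(y)
    b0 = sum(y) / n - b1 * sum(x1) / n - b2 * sum(x2) / n
    r2 = (b1 * s1y + b2 * s2y) / syy
    return b0, b1, b2, r2


def adjusted_r2(r2, n, k):
    """R^2 adjusted for degrees of freedom, k independent variables."""
    return ((n - 1) * r2 - k) / (n - k - 1)


b0, b1, b2, r2 = fit_two_variable(salary, months, gender)
r2_adj = adjusted_r2(r2, len(salary), 2)
print(f"b0={b0:.5f} b1={b1:.5f} b2={b2:.5f} R2={r2:.5f} adjR2={r2_adj:.4f}")
```

Because the script carries full precision throughout, its last digits can differ slightly from the rounded hand figures.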


$\bar R^2$, which is $R^2$ adjusted for degrees of freedom, has the formula $\bar R^2 = \frac{(n - 1) R^2 - k}{n - k - 1}$, where $k$ is the number of independent variables. $R^2$ adjusted for degrees of freedom seems to show that our second regression is better.

c) The easiest way to do the F test and have it look right is to note that $\sum Y^2 - n \bar Y^2 = 28.55556$. For the regression with one independent variable the regression sum of squares is $R^2 \left( \sum Y^2 - n \bar Y^2 \right) = .92601(28.55556) = 26.443$. For the regression with two independent variables the regression sum of squares is $R^2 \left( \sum Y^2 - n \bar Y^2 \right) = .97379(28.55556) = 27.807$. The difference between these is 1.364. The remaining unexplained variation is 28.556 - 27.807 = 0.749. The ANOVA table is

Source   SS       DF   MS       F       $F_{.05}$
$X_1$    26.443   1    26.443
$X_2$     1.364   1     1.364   10.93   $F^{(1,6)}_{.05} = 5.99$
Error     0.749   6     0.1248
Total    28.556   8
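The sums of squares in the ANOVA table can be recomputed directly from the data. The sketch below is a hedged illustration in Python: the `gender` coding is an assumption inferred from the sums in the problem statement, `partial_f` is a hypothetical helper name, and the critical value 5.99 is simply copied from the table rather than computed. Since it works from unrounded values, its F statistic may differ somewhat from the hand-computed figure.

```python
salary = [7.5, 8.6, 9.1, 10.3, 13.0, 6.2, 8.7, 9.4, 9.8]
months = [6, 10, 12, 18, 30, 5, 13, 15, 21]
# ASSUMPTION: last four employees coded 1, consistent with sum(x2) = 4.
gender = [0, 0, 0, 0, 0, 1, 1, 1, 1]


def corrected_sum(u, v):
    """Return sum(u*v) - n * mean(u) * mean(v)."""
    n = len(u)
    return sum(a * b for a, b in zip(u, v)) - sum(u) * sum(v) / n


def partial_f(y, x1, x2):
    """F statistic for adding x2 to a regression that already contains x1."""
    n = len(y)
    s1y, s2y = corrected_sum(x1, y), corrected_sum(x2, y)
    s11, s22 = corrected_sum(x1, x1), corrected_sum(x2, x2)
    s12 = corrected_sum(x1, x2)
    syy = corrected_sum(y, y)  # total sum of squares
    ssr_one = s1y ** 2 / s11   # regression SS with x1 only
    det = s11 * s22 - s12 ** 2
    # Regression SS with both variables: b1*s1y + b2*s2y written out.
    ssr_two = (s1y ** 2 * s22 - 2 * s1y * s2y * s12 + s2y ** 2 * s11) / det
    sse = syy - ssr_two        # unexplained variation
    mse = sse / (n - 3)        # error mean square, n - k - 1 = 6 df
    f_stat = (ssr_two - ssr_one) / mse
    return ssr_one, ssr_two, sse, f_stat


F_CRIT = 5.99  # 5% value for (1, 6) degrees of freedom, taken from the table

ssr_one, ssr_two, sse, f_stat = partial_f(salary, months, gender)
print(f"SSR(X1)={ssr_one:.3f} SSR(X1,X2)={ssr_two:.3f} SSE={sse:.3f} F={f_stat:.2f}")
print("reject H0" if f_stat > F_CRIT else "do not reject H0")
```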

Since our computed F is larger than the table F, we reject our null hypothesis that $X_2$ has no effect.

d) $\hat Y = b_0 + b_1 X_1 + b_2 X_2 = 6.24853 + 0.22707 X_1 - 0.78897 X_2 = 6.24853 + 0.22707(30) - 0.78897(1) = 12.2717$.

e) According to the text, we can use the following:

Confidence interval: $\hat Y \pm t^{(n-k-1)}_{\alpha/2} \frac{s_e}{\sqrt{n}}$
Prediction interval: $\hat Y \pm t^{(n-k-1)}_{\alpha/2} s_e$.

f) We get our general equation from the last problem, and then take the equation from this problem with

$X_2 = 0$ for men and $X_2 = 1$ for women.

                                          If $X_1 = 0$       If $X_1 = 20$
General   $\hat Y = 5.81 + 0.233 X_1$     $\hat Y = 5.81$    $\hat Y = 10.47$
Men       $\hat Y = 6.25 + 0.227 X_1$     $\hat Y = 6.25$    $\hat Y = 10.79$
Women     $\hat Y = 5.46 + 0.227 X_1$     $\hat Y = 5.46$    $\hat Y = 10.00$

These points enable us to graph the lines.
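Parts d) through f) lend themselves to a quick numerical check. The following Python sketch plugs the rounded coefficients from part a) into the prediction, applies the two interval formulas from part e), and evaluates the three lines at $X_1 = 0$ and $X_1 = 20$. Two things here are assumptions rather than anything stated in the solution: a 95% confidence level, and the two-sided t value 2.447 for 6 degrees of freedom read from a standard t table. The standard error is derived from the total sum of squares and $R^2$ as in part b). Function and variable names are illustrative only.

```python
import math

B0, B1, B2 = 6.24853, 0.22707, -0.78897  # coefficients from part a)
N, K = 9, 2                              # observations, independent variables
SST, R2 = 28.55556, 0.97379              # total SS and R^2 from part b)
T_CRIT = 2.447  # ASSUMPTION: 95% two-sided t value for n - k - 1 = 6 df


def predict(months, female):
    """Predicted salary; female is 1 for women and 0 for men."""
    return B0 + B1 * months + B2 * female


def intervals(y_hat):
    """Approximate confidence and prediction intervals around y_hat."""
    s_e = math.sqrt(SST * (1 - R2) / (N - K - 1))
    half_ci = T_CRIT * s_e / math.sqrt(N)
    half_pi = T_CRIT * s_e
    return (y_hat - half_ci, y_hat + half_ci), (y_hat - half_pi, y_hat + half_pi)


y_hat = predict(30, 1)
conf_int, pred_int = intervals(y_hat)
print(f"prediction: {y_hat:.4f}")
print(f"confidence interval: {conf_int[0]:.2f} to {conf_int[1]:.2f}")
print(f"prediction interval: {pred_int[0]:.2f} to {pred_int[1]:.2f}")

# Intercept and slope of each line in part f); the general line uses the
# rounded coefficients quoted from the previous problem.
lines = {"General": (5.81, 0.233), "Men": (B0, B1), "Women": (B0 + B2, B1)}
endpoints = {name: (a, a + b * 20) for name, (a, b) in lines.items()}
for name, (at_0, at_20) in endpoints.items():
    print(f"{name}: {at_0:.2f} at X1=0, {at_20:.2f} at X1=20")
```

Plotting the two endpoints of each line and joining them gives the graph asked for in f).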


3. Data from the previous problem is repeated again but with sales replacing gender!

a) Compute the correlation between sales and salary. Is it significant? (5)
b) Compute a rank correlation between sales and salary. Is it significant? Why might we expect rank correlation to be higher than the conventional correlation? (5)
c) Compute Kendall's W for this data and test it for significance. (6)

Boldface data is additional computations.

$y$        $x_1$     $x_2$     $x_2 y$     $x_2^2$
Salary     months    sales
 7.5        6        2.25      16.875       5.0625
 8.6       10        2.58      22.188       6.6564
 9.1       12        2.73      24.843       7.4529
10.3       18        3.09      31.827       9.5481
13.0       30        3.90      50.700      15.2100
 6.2        5        2.86      17.732       8.1796
 8.7       13        3.81      33.147      14.5161
 9.4       15        3.82      35.908      14.5924
 9.8       21        3.94      38.612      15.5236
82.6                28.98     271.832      96.7416

Solution: a) $\sum Y = 82.6$, $\sum Y^2 = 786.64$, $\sum X_2 = 28.98$, $\sum X_2^2 = 96.7416$, $\sum X_2 Y = 271.832$ and $n = 9$. This means that $\bar X_2 = \frac{28.98}{9} = 3.22$. Other sums come from previous problems.

$r_{xy}^2 = \frac{\left( \sum X_2 Y - n \bar X_2 \bar Y \right)^2}{\left( \sum X_2^2 - n \bar X_2^2 \right) \left( \sum Y^2 - n \bar Y^2 \right)} = \frac{\left[ 271.832 - 9(3.22)(9.17778) \right]^2}{\left[ 96.7416 - 9(3.22)^2 \right] (28.55556)} = .3510$, so $r_{xy} = \sqrt{.3510} = .59246$.

For the significance test, $t = \frac{r}{\sqrt{\frac{1 - r^2}{n - 2}}} = \frac{.5925}{\sqrt{\frac{1 - .3510}{7}}} = 1.945$. For a 1-sided test $H_0: \rho \le 0$, $H_1: \rho > 0$ we find $t^{(n-2)}_{.05} = t^{(7)}_{.05} = 1.895$. Since the computed t is more than the critical value we reject $H_0$ and conclude that the correlation is significant. However, for a 2-sided test $H_0: \rho = 0$, $H_1: \rho \ne 0$ we find $t^{(n-2)}_{.025} = t^{(7)}_{.025} = 2.365$. Since the computed t is less than the critical value we accept $H_0$ and conclude that the correlation is not significant.

b) Computations for both b) and c) appear in the tables below.

$r_y$   $r_{x_2}$   $d$    $d^2$
2       1            1     1
3       2            1     1
5       3            2     4
8       5            3     9
9       8            1     1
1       4           -3     9
4       6           -2     4
6       7           -1     1
7       9           -2     4
Sum                  0    34

$r_y$   $r_{x_1}$   $r_{x_2}$   $SR$   $SR^2$
2       2           1             5      25
3       3           2             8      64
5       4           3            12     144
8       7           5            20     400
9       9           8            26     676
1       1           4             6      36
4       5           6            15     225
6       6           7            19     361
7       8           9            24     576
Sum                             135    2507


The first 4 columns are for the rank correlation. We rank the items in columns. $d$ is the difference between the rankings and must sum to zero. $r_s = 1 - \frac{6 \sum d^2}{n \left( n^2 - 1 \right)} = 1 - \frac{6(34)}{9(81 - 1)} = 1 - .2833 = .7167$. From the table for a 1-sided test and $n = 9$, the 5% critical value is 0.600, so the correlation is significant.

c) To compute Kendall's $W$, we ranked the data in columns and added across columns to get row sums. We square and sum these row sums. $\overline{SR} = \frac{\sum SR}{n} = \frac{135}{9} = 15$, $S = \sum SR^2 - n \overline{SR}^2 = 2507 - 9(15)^2 = 482$ and if $k$ is the number of columns (judges), $W = \frac{S}{\frac{1}{12} k^2 \left( n^3 - n \right)} = \frac{482}{\frac{1}{12} (9)(729 - 9)} = .8926$. According to the table, the 5% critical value for $S$ is 54.0, indicating significant agreement, since our computed $S$ is higher. (We reject $H_0$, disagreement.)
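All three statistics in this problem can be recomputed from the data table. The Python sketch below is an illustration under stated assumptions, not a replacement for the table lookups: the `ranks` helper assumes there are no tied values (true for these three columns), the function names are invented for this example, and the critical values still have to come from the tables cited in the solution.

```python
import math

salary = [7.5, 8.6, 9.1, 10.3, 13.0, 6.2, 8.7, 9.4, 9.8]
months = [6, 10, 12, 18, 30, 5, 13, 15, 21]
sales = [2.25, 2.58, 2.73, 3.09, 3.90, 2.86, 3.81, 3.82, 3.94]


def ranks(values):
    """Rank from 1 (smallest). ASSUMPTION: no ties in the data."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def pearson_r(x, y):
    """Conventional correlation from corrected sums of squares."""
    n = len(x)
    sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    sxx = sum(a * a for a in x) - sum(x) ** 2 / n
    syy = sum(b * b for b in y) - sum(y) ** 2 / n
    return sxy / math.sqrt(sxx * syy)


def t_for_r(r, n):
    """t statistic with n - 2 df for testing a zero correlation."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))


def spearman_rs(x, y):
    """Rank correlation from squared rank differences."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def kendall_w(*columns):
    """Return S and Kendall's W for k columns of n observations each."""
    k, n = len(columns), len(columns[0])
    row_sums = [sum(row) for row in zip(*(ranks(c) for c in columns))]
    mean_sr = sum(row_sums) / n
    s = sum(sr ** 2 for sr in row_sums) - n * mean_sr ** 2
    return s, s / (k ** 2 * (n ** 3 - n) / 12)


r = pearson_r(sales, salary)
t = t_for_r(r, len(salary))
rs = spearman_rs(sales, salary)
s, w = kendall_w(salary, months, sales)
print(f"r={r:.5f} t={t:.3f} rs={rs:.4f} S={s:.0f} W={w:.4f}")
```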

Solution Continues in 252zz9871
