Multicollinearity Exercise

Use the attached SAS output to answer the questions. [OPTIONAL: Copy the SAS program below into the SAS editor window and run it.] Please don’t print all of the output shown below, or the output from the SAS job if you decide to run it.

1. Use at least three different methods to diagnose whether multicollinearity is a problem for this set of data.
2. Identify which variables are key participants in the most serious near-linear dependency in the data. (How do you know this?)
3. Which variable has the “wrong sign” for its coefficient in this regression? Explain why its sign is wrong.
4. What is the smallest value of the ridge constant (k) that “fixes” the sign of the coefficient you named in #3?
5. What is the smallest value of the ridge constant (k) that reduces all VIFs so that they are below the guideline of 10?
6. What is the smallest value of k that seems (in your opinion) to stabilize the coefficients?
7. If one principal component is removed, give the estimated coefficients for X1, X2, X3, X4. Does this fix the one with the “wrong” sign?
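For reference, the quantities named in the questions are usually defined as follows. This is a sketch in standard textbook notation; the symbols R_j^2, lambda_i and Z are assumptions introduced here, not notation from the handout, and the exact scaling SAS applies should be checked against the PROC REG documentation.

```latex
% Variance inflation factor for predictor X_j, where R_j^2 is the
% R-square from regressing X_j on the remaining predictors.
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}

% Condition index for the i-th eigenvalue \lambda_i of the scaled
% cross-products matrix (largest eigenvalue in the numerator).
\eta_i = \sqrt{\frac{\lambda_{\max}}{\lambda_i}}

% Ridge estimator for ridge constant k, with Z the centered and scaled
% predictors; k = 0 gives ordinary least squares.
\hat{\beta}(k) = \left( Z^{\top} Z + k I \right)^{-1} Z^{\top} y
```

A VIF above 10 corresponds to R_j^2 above 0.90, which is where the guideline in question 5 comes from.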
*******************************************************************
************ LAW SCHOOL ADMISSION DATA ******************
**************** PARTLY FROM PAGE 599 OF SMITH ***************
*******************************************************************;
**** DATA FOR 20 STUDENTS ******
Y IS THE LAW SCHOOL GPA
X1 IS THE UNDERGRADUATE SCHOOL GPA
X2 IS THE LMAT PERCENTILE
X3 IS A RATING OF THE UNDERGRADUATE SCHOOL QUALITY
X4 IS THE GRE SCORE;
DATA LAW;
INPUT Y X1 X2 X3 X4 NO $;
CARDS;
3.42 3.28 .96 6 1330 1
3.60 3.18 .97 7 1370 2
3.28 2.89 .93 5 1140 3
3.75 3.72 .99 8 1520 4
3.36 3.18 .95 6 1270 5
3.96 3.50 .98 8 1450 6
3.31 3.04 .94 5 1200 7
3.33 3.87 .95 5 1340 8
3.60 3.54 .96 7 1350 9
4.00 3.27 .99 10 1480 a
3.28 3.30 .95 5 1280 b
3.44 3.29 .91 7 1080 c
3.25 3.17 .93 5 1170 d
3.75 3.62 .97 8 1410 e
3.30 3.34 .96 5 1330 f
3.20 3.08 .90 4 1010 g
3.50 3.37 .96 6 1340 h
3.28 3.16 .94 5 1220 i
3.17 3.20 .95 4 1270 j
3.31 3.10 .94 5 1210 k
;
TITLE 'LAW SCHOOL ADMISSIONS DATA';
PROC CORR; VAR Y X1 X2 X3 X4;
PROC REG; MODEL Y=X1 X2 X3 X4 / COLLIN VIF;
PROC REG RIDGE = 0 TO .01 BY .001 OUTEST=B;
MODEL Y=X1 X2 X3 X4 ;
PROC PRINT;
PROC PLOT; PLOT (X1 X2 X3 X4) * _RIDGE_ / VREF=0 VPOS=25 HPOS=45;
PROC REG DATA = LAW RIDGE= 0 TO .01 BY .001 OUTEST=C OUTVIF;
MODEL Y=X1 X2 X3 X4;
PROC PRINT;
PROC REG DATA = LAW PCOMIT=1 2 3 OUTEST=C;
MODEL Y=X1 X2 X3 X4;
PROC PRINT;
run;


LAW SCHOOL ADMISSIONS DATA

Correlation Analysis

Pearson Correlation Coefficients / Prob > |R| under Ho: Rho=0 / N = 20

            Y        X1        X2        X3        X4
Y     1.00000   0.47331   0.76094   0.95925   0.76574
      0.0       0.0350    0.0001    0.0001    0.0001
X1    0.47331   1.00000   0.52911   0.42078   0.65377
      0.0350    0.0       0.0164    0.0647    0.0018
X2    0.76094   0.52911   1.00000   0.69724   0.98781
      0.0001    0.0164    0.0       0.0006    0.0001
X3    0.95925   0.42078   0.69724   1.00000   0.69983
      0.0001    0.0647    0.0006    0.0       0.0006
X4    0.76574   0.65377   0.98781   0.69983   1.00000
      0.0001    0.0018    0.0001    0.0006    0.0


Model: MODEL1
Dependent Variable: Y

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Value   Prob>F
Model       4          1.07143       0.26786    56.542   0.0001
Error      15          0.07106       0.00474
C Total    19          1.14249

Root MSE   0.06883     R-square   0.9378
Dep Mean   3.45450     Adj R-sq   0.9212
C.V.       1.99243

Parameter Estimates

Variable   DF   Parameter Estimate   Standard Error   T for H0: Parameter=0   Prob > |T|
INTERCEP    1            -2.378637      24.38266385                  -0.098       0.9236
X1          1             0.125719       0.64288464                   0.196       0.8476
X2          1             6.058256      32.14872369                   0.188       0.8531
X3          1             0.129773       0.01417076                   9.158       0.0001
X4          1            -0.000878       0.00646192                  -0.136       0.8937

Variable   DF   Variance Inflation
INTERCEP    1           0.00000000
X1          1          96.67364263
X2          1         2280.9435770
X3          1           1.99014566
X4          1         2880.9838356

Collinearity Diagnostics

                      Condition   Var Prop   Var Prop   Var Prop   Var Prop   Var Prop
Number   Eigenvalue       Index   INTERCEP         X1         X2         X3         X4
1           4.95347     1.00000     0.0000     0.0000     0.0000     0.0012     0.0000
2           0.04096    10.99716     0.0000     0.0001     0.0000     0.5624     0.0000
3           0.00348    37.70759     0.0000     0.0022     0.0000     0.3141     0.0004
4           0.00209    48.66811     0.0000     0.0150     0.0000     0.1117     0.0005
5        1.47569E-7        5794     1.0000     0.9827     1.0000     0.0106     0.9991


Listing of OUTEST=B (RIDGE = 0 TO .01 BY .001)

OBS  _MODEL_  _TYPE_    _DEPVAR_  _RIDGE_  _PCOMIT_    _RMSE_  INTERCEP       X1       X2       X3          X4   Y
  1  MODEL1   PARMS     Y               .         .  0.068829  -2.37864  0.12572  6.05826  0.12977  -.00087848  -1
  2  MODEL1   RIDGE     Y           0.000         .  0.068829  -2.37864  0.12572  6.05826  0.12977  -.00087848  -1
  3  MODEL1   RIDGE     Y           0.001         .  0.068871   0.89844  0.03989  1.73602  0.12932  -.00000776  -1
  4  MODEL1   RIDGE     Y           0.002         .  0.068880   1.17880  0.03249  1.36524  0.12907  0.00006863  -1
  5  MODEL1   RIDGE     Y           0.003         .  0.068886   1.28047  0.02977  1.23008  0.12883  0.00009765  -1
  6  MODEL1   RIDGE     Y           0.004         .  0.068892   1.33140  0.02838  1.16182  0.12859  0.00011320  -1
  7  MODEL1   RIDGE     Y           0.005         .  0.068899   1.36094  0.02755  1.12178  0.12836  0.00012307  -1
  8  MODEL1   RIDGE     Y           0.006         .  0.068907   1.37947  0.02700  1.09627  0.12813  0.00013001  -1
  9  MODEL1   RIDGE     Y           0.007         .  0.068915   1.39160  0.02663  1.07920  0.12790  0.00013524  -1
 10  MODEL1   RIDGE     Y           0.008         .  0.068925   1.39968  0.02637  1.06748  0.12767  0.00013938  -1
 11  MODEL1   RIDGE     Y           0.009         .  0.068935   1.40504  0.02617  1.05935  0.12744  0.00014279  -1
 12  MODEL1   RIDGE     Y           0.010         .  0.068947   1.40850  0.02603  1.05373  0.12721  0.00014568  -1

[Plot of X1*_RIDGE_. Legend: A = 1 obs, B = 2 obs, etc. Vertical axis: X1; horizontal axis: Ridge regression control value, 0.000 to 0.010.]
[Plot of X2*_RIDGE_. Legend: A = 1 obs, B = 2 obs, etc. Vertical axis: X2; horizontal axis: Ridge regression control value, 0.000 to 0.010.]
[Plot of X3*_RIDGE_. Legend: A = 1 obs, B = 2 obs, etc. Vertical axis: X3; horizontal axis: Ridge regression control value, 0.000 to 0.010.]
[Plot of X4*_RIDGE_. Legend: A = 1 obs, B = 2 obs, etc. Vertical axis: X4; horizontal axis: Ridge regression control value, 0.000 to 0.010.]


Listing of OUTEST=C (RIDGE = 0 TO .01 BY .001, OUTVIF)

OBS  _MODEL_  _TYPE_    _DEPVAR_  _RIDGE_  _PCOMIT_    _RMSE_  INTERCEP       X1       X2       X3       X4   Y
  1  MODEL1   PARMS     Y               .         .  0.068829  -2.37864   0.1257     6.06  0.12977    -0.00  -1
  2  MODEL1   RIDGEVIF  Y           0.000         .         .         .  96.6736  2280.94  1.99015  2880.98  -1
  3  MODEL1   RIDGE     Y           0.000         .  0.068829  -2.37864   0.1257     6.06  0.12977    -0.00  -1
  4  MODEL1   RIDGEVIF  Y           0.001         .         .         .   3.9053    59.05  1.95543    74.08  -1
  5  MODEL1   RIDGE     Y           0.001         .  0.068871   0.89844   0.0399     1.74  0.12932    -0.00  -1
  6  MODEL1   RIDGEVIF  Y           0.002         .         .         .   2.1861    17.99  1.94548    22.21  -1
  7  MODEL1   RIDGE     Y           0.002         .  0.068880   1.17880   0.0325     1.37  0.12907     0.00  -1
  8  MODEL1   RIDGEVIF  Y           0.003         .         .         .   1.8012     8.89  1.93597    10.72  -1
  9  MODEL1   RIDGE     Y           0.003         .  0.068886   1.28047   0.0298     1.23  0.12883     0.00  -1
 10  MODEL1   RIDGEVIF  Y           0.004         .         .         .   1.6537     5.48  1.92660     6.41  -1
 11  MODEL1   RIDGE     Y           0.004         .  0.068892   1.33140   0.0284     1.16  0.12859     0.00  -1
 12  MODEL1   RIDGEVIF  Y           0.005         .         .         .   1.5803     3.84  1.91732     4.34  -1
 13  MODEL1   RIDGE     Y           0.005         .  0.068899   1.36094   0.0275     1.12  0.12836     0.00  -1
 14  MODEL1   RIDGEVIF  Y           0.006         .         .         .   1.5373     2.93  1.90811     3.19  -1
 15  MODEL1   RIDGE     Y           0.006         .  0.068907   1.37947   0.0270     1.10  0.12813     0.00  -1
 16  MODEL1   RIDGEVIF  Y           0.007         .         .         .   1.5090     2.37  1.89898     2.48  -1
 17  MODEL1   RIDGE     Y           0.007         .  0.068915   1.39160   0.0266     1.08  0.12790     0.00  -1
 18  MODEL1   RIDGEVIF  Y           0.008         .         .         .   1.4887     2.00  1.88992     2.02  -1
 19  MODEL1   RIDGE     Y           0.008         .  0.068925   1.39968   0.0264     1.07  0.12767     0.00  -1
 20  MODEL1   RIDGEVIF  Y           0.009         .         .         .   1.4731     1.74  1.88093     1.70  -1
 21  MODEL1   RIDGE     Y           0.009         .  0.068935   1.40504   0.0262     1.06  0.12744     0.00  -1
 22  MODEL1   RIDGEVIF  Y           0.010         .         .         .   1.4606     1.55  1.87201     1.47  -1
 23  MODEL1   RIDGE     Y           0.010         .  0.068947   1.40850   0.0260     1.05  0.12721     0.00  -1


Listing of OUTEST=C (PCOMIT = 1 2 3)

OBS  _MODEL_  _TYPE_  _DEPVAR_  _RIDGE_  _PCOMIT_   _RMSE_  INTERCEP        X1       X2       X3          X4   Y
  1  MODEL1   PARMS   Y               .         .  0.06883  -2.37864   0.12572  6.05826  0.12977  -.00087848  -1
  2  MODEL1   IPC     Y               .         1  0.06670   1.52761   0.02349  0.90751  0.12951  0.00015692  -1
  3  MODEL1   IPC     Y               .         2  0.10775  -0.66338  -0.13747  3.65788  0.06579  0.00053840  -1
  4  MODEL1   IPC     Y               .         3  0.13095  -0.76037   0.20866  2.78208  0.03571  0.00051382  -1
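The VIF and condition-index columns of the PROC REG output can be cross-checked outside SAS. The following is a minimal sketch in plain Python, not part of the original exercise. It assumes the CARDS data are exactly the data behind the printed output, and the names `LAW`, `correlation_matrix`, `invert`, `vifs` and `condition_indices` are hypothetical helpers introduced here. VIFs are taken as the diagonal of the inverse of the predictor correlation matrix, and condition indices as the square root of the largest eigenvalue divided by each eigenvalue printed under Collinearity Diagnostics.

```python
import math

# (Y, X1, X2, X3, X4) for the 20 students, copied from the CARDS block.
LAW = [
    (3.42, 3.28, 0.96, 6, 1330), (3.60, 3.18, 0.97, 7, 1370),
    (3.28, 2.89, 0.93, 5, 1140), (3.75, 3.72, 0.99, 8, 1520),
    (3.36, 3.18, 0.95, 6, 1270), (3.96, 3.50, 0.98, 8, 1450),
    (3.31, 3.04, 0.94, 5, 1200), (3.33, 3.87, 0.95, 5, 1340),
    (3.60, 3.54, 0.96, 7, 1350), (4.00, 3.27, 0.99, 10, 1480),
    (3.28, 3.30, 0.95, 5, 1280), (3.44, 3.29, 0.91, 7, 1080),
    (3.25, 3.17, 0.93, 5, 1170), (3.75, 3.62, 0.97, 8, 1410),
    (3.30, 3.34, 0.96, 5, 1330), (3.20, 3.08, 0.90, 4, 1010),
    (3.50, 3.37, 0.96, 6, 1340), (3.28, 3.16, 0.94, 5, 1220),
    (3.17, 3.20, 0.95, 4, 1270), (3.31, 3.10, 0.94, 5, 1210),
]


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length columns."""
    unit = []
    for col in columns:
        mean = sum(col) / len(col)
        dev = [v - mean for v in col]
        norm = math.sqrt(sum(d * d for d in dev))
        unit.append([d / norm for d in dev])  # centered, unit-length column
    return [[sum(a * b for a, b in zip(u, v)) for v in unit] for u in unit]


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (adequate for a 4x4)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vifs(rows):
    """VIF for X1..X4: diagonal of the inverse predictor correlation matrix."""
    predictors = [[row[j] for row in rows] for j in range(1, 5)]
    inverse = invert(correlation_matrix(predictors))
    return {f"X{j + 1}": inverse[j][j] for j in range(4)}


def condition_indices(eigenvalues):
    """Condition index: sqrt(largest eigenvalue / each eigenvalue)."""
    largest = max(eigenvalues)
    return [math.sqrt(largest / ev) for ev in eigenvalues]


# Eigenvalues as printed under Collinearity Diagnostics.
EIGENVALUES = [4.95347, 0.04096, 0.00348, 0.00209, 1.47569e-7]

LAW_VIF = vifs(LAW)
LAW_CI = condition_indices(EIGENVALUES)

for name, value in LAW_VIF.items():
    print(f"VIF {name}: {value:.2f}")
for number, value in enumerate(LAW_CI, start=1):
    print(f"Condition index {number}: {value:.2f}")
```

If the data match, the printed VIFs should be close to the Variance Inflation column. The condition indices may differ from the printed ones in the second decimal because the eigenvalues above are rounded.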