Economics 310
Lecture 13
Heteroscedasticity Continued

Tests to be Discussed
- Goldfeld-Quandt Test
    - Assumes the variance is monotonically associated with some variable.
- Breusch-Pagan-Godfrey Test
    - Variance is a linear function of a set of variables, or a function of a linear combination of variables.
- White General Heteroscedasticity Test
    - Source of heteroscedasticity unknown, but it may exist.

Goldfeld-Quandt Test
The test assumes $\sigma_i^2 = f(Z_i)$, where $Z$ is some variable and $f$ is a monotonic function (i.e. $df/dZ > 0$ or $df/dZ < 0$ for all $Z$).
1. Order the data according to $Z$.
2. Divide the data into 3 groups of size $(n-c)/2$, $c$, $(n-c)/2$.
3. Estimate the model in the first and third groups.
4. Get variance estimates for each group.
5. The test statistic is an F with $(n-c)/2 - k$ and $(n-c)/2 - k$ degrees of freedom:
   $F = \hat{\sigma}_2^2 / \hat{\sigma}_1^2$, arranged so the largest variance estimate is on top.
6. $H_0$: Homoscedasticity. $H_a$: Heteroscedasticity.

Data Organization
Group 1: (n-c)/2 = (20-4)/2 = 8 obs. Omitted center: c = 4 obs. Group 2: (n-c)/2 = (20-4)/2 = 8 obs.

    Obs      Y        X1      X2      Z     Group
     1     370.11    10.00   30.00   0.2    Group 1
     2     361.45    12.65   28.79   0.4    Group 1
     3     351.30    15.65   27.36   0.6    Group 1
     4     342.23    17.95   25.94   0.8    Group 1
     5     332.64    20.09   24.22   1.0    Group 1
     6     325.50    22.96   23.73   1.2    Group 1
     7     314.92    25.90   22.24   1.4    Group 1
     8     310.00    27.97   21.58   1.6    Group 1
     9     302.15    30.69   20.96   1.8    c = 4
    10     297.38    32.93   20.72   2.0    c = 4
    11     291.26    35.16   20.07   2.2    c = 4
    12     285.30    37.84   19.78   2.4    c = 4
    13     277.43    40.08   19.02   2.6    Group 2
    14     272.61    42.76   18.81   2.8    Group 2
    15     261.92    45.16   17.20   3.0    Group 2
    16     259.55    47.24   16.97   3.2    Group 2
    17     246.14    49.69   15.03   3.4    Group 2
    18     240.57    51.95   14.11   3.6    Group 2
    19     229.52    54.08   12.14   3.8    Group 2
    20     220.45    56.92   11.42   4.0    Group 2

Obstetrics Example
- Data from 800+ hospitals.
- Dependent variable is the average length of stay in the maternity ward.
- Explanatory variables are the charge per day and the % of deliveries that are c-sections.
- Expect greater variability in length of stay at hospitals that are not subject to high managed care.
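The six Goldfeld-Quandt steps listed earlier can be sketched in a few lines of Python. This is only an illustrative sketch, not Shazam's own routine: the function names `ols_sse` and `goldfeld_quandt` are hypothetical, the data are synthetic (generated so that the error standard deviation grows with Z), and only NumPy is assumed.

```python
import numpy as np

def ols_sse(y, X):
    """Return the OLS sum of squared errors of y on X (X includes the constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def goldfeld_quandt(y, X, z, c):
    """Goldfeld-Quandt F statistic after ordering by z and dropping c central observations."""
    n, k = X.shape
    order = np.argsort(z)                      # step 1: order data according to Z
    y, X = y[order], X[order]
    m = (n - c) // 2                           # step 2: size of the first and third groups
    var1 = ols_sse(y[:m], X[:m]) / (m - k)     # steps 3-4: variance estimate, first group
    var2 = ols_sse(y[-m:], X[-m:]) / (m - k)   # steps 3-4: variance estimate, third group
    f_stat = max(var1, var2) / min(var1, var2) # step 5: larger estimate on top
    return f_stat, m - k, m - k

# Synthetic data (an assumption for illustration): error s.d. equals z.
rng = np.random.default_rng(0)
n = 200
z = np.linspace(0.2, 4.0, n)
x = rng.uniform(10, 60, size=n)
X = np.column_stack([np.ones(n), x])
y = 5.0 + 2.0 * x + rng.normal(scale=z, size=n)

f_stat, df1, df2 = goldfeld_quandt(y, X, z, c=40)
print(f_stat, df1, df2)
```

The statistic is compared with an F critical value on (df1, df2) degrees of freedom; a large value is evidence against homoscedasticity.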
Shazam Commands

    sample 1 859
    read (d:\econom~1\classe~1\ob1_het.txt) cases rate los cost billed neo mcph mcpm
    genr charge=billed/los
    ols los rate charge
    diagnos / chowone=589

Shazam Output for Goldfeld-Quandt Test

    VARIABLE   ESTIMATED      STANDARD     T-RATIO             PARTIAL  STANDARDIZED  ELASTICITY
      NAME     COEFFICIENT     ERROR       856 DF    P-VALUE    CORR.   COEFFICIENT    AT MEANS
    RATE        4.1647        0.2426        17.17     0.000     0.506     0.4940        0.4056
    CHARGE     -0.43644E-03   0.2652E-04   -16.46     0.000    -0.490    -0.4737       -0.3229
    CONSTANT    2.1049        0.6367E-01    33.06     0.000     0.749     0.0000        0.9173

    |_diagnos / chowone=589
    REQUIRED MEMORY IS PAR= 123 CURRENT PAR= 500
    DEPENDENT VARIABLE = LOS          859 OBSERVATIONS
    REGRESSION COEFFICIENTS
       4.16471785492   -0.436436399782E-03   2.10491763784

    SEQUENTIAL CHOW AND GOLDFELD-QUANDT TESTS
     N1    N2    SSE1     SSE2     CHOW    PVALUE    G-Q    DF1   DF2   PVALUE
    589   270   238.20   21.357   40.564    0.000   5.082   586   267    0.000

Breusch-Pagan-Godfrey Test
Model: $Y_i = \beta_1 + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + \varepsilon_i$
$\sigma_i^2 = f(\alpha_1 + \alpha_2 Z_{2i} + \dots + \alpha_m Z_{mi})$, or more specifically $\sigma_i^2 = \alpha_1 + \alpha_2 Z_{2i} + \dots + \alpha_m Z_{mi}$
1. Estimate the model by OLS and get the residuals.
2. Obtain the maximum likelihood estimate of the variance, $\tilde{\sigma}^2 = SSE/n$.
3. Construct $p_i = \hat{\varepsilon}_i^2 / \tilde{\sigma}^2$.
4. Estimate the regression $p_i = \alpha_1 + \alpha_2 Z_{2i} + \dots + \alpha_m Z_{mi} + \nu_i$.
5. Obtain the SSR for the regression in step 4 and define $\Theta = \tfrac{1}{2}(SSR) \sim \chi^2_{m-1}$.
6. Test the null hypothesis of homoscedasticity using the above chi-squared random variable.

Example of BPG Test using OB Data
- Null hypothesis is homoscedasticity.
- Let the Z's be (1) the number of OB cases per year and (2) whether the hospital is under high managed care.
- Expect the variance to be negatively related to both variables.

Shazam Code for OB Example

    * performing Breusch-Pagan Test on cases and mcph
    ?ols los rate charge / resid=e dn anova
    gen1 sigsq=$sig2
    genr esq=e*e
    genr p=esq/sigsq
    ols p cases mcph / anova
    gen1 ess=$ssr
    gen1 pbg=ess/2
    Print pbg

Shazam Output for OB Example

    |_ols p cases mcph / anova
    VARIABLE   ESTIMATED      STANDARD     T-RATIO             PARTIAL  STANDARDIZED  ELASTICITY
      NAME     COEFFICIENT     ERROR       856 DF    P-VALUE    CORR.   COEFFICIENT    AT MEANS
    CASES      -0.37767E-03   0.2353E-03   -1.605     0.109    -0.055    -0.0551       -0.5201
    MCPH       -0.52494       0.6677       -0.7861    0.432    -0.027    -0.0270       -0.1650
    CONSTANT    1.6851        0.4774        3.529     0.000     0.120     0.0000        1.6851

    |_gen1 ess=$ssr
    ..NOTE..CURRENT VALUE OF $SSR = 288.00
    |_gen1 pbg=ess/2
    |_Print pbg
        PBG
       144.0017

Built in BPG Test in Shazam
- Shazam has a built-in BPG test.
- It uses the explanatory variables as the Z's.
- It is invoked by using the command "DIAGNOS" with the option "HET" right after the "OLS" command, i.e.

    ols y x1 x2
    diagnos / het

Using HET on OB example

    |_?ols los rate charge
    |_diagnos / het
    REQUIRED MEMORY IS PAR= 123 CURRENT PAR= 500
    DEPENDENT VARIABLE = LOS          859 OBSERVATIONS
    REGRESSION COEFFICIENTS
       4.16471785492   -0.436436399782E-03   2.10491763784

    HETEROSKEDASTICITY TESTS
    E**2 ON YHAT:                    CHI-SQUARE = 19.852 WITH 1 D.F.
    E**2 ON YHAT**2:                 CHI-SQUARE = 80.223 WITH 1 D.F.
    E**2 ON LOG(YHAT**2):            CHI-SQUARE =  1.018 WITH 1 D.F.
    E**2 ON X (B-P-G) TEST:          CHI-SQUARE = 35.644 WITH 2 D.F.
    E**2 ON LAG(E**2) ARCH TEST:     CHI-SQUARE =  0.027 WITH 1 D.F.
    LOG(E**2) ON X (HARVEY) TEST:    CHI-SQUARE =  5.259 WITH 2 D.F.
    ABS(E) ON X (GLEJSER) TEST:      CHI-SQUARE = 62.292 WITH 2 D.F.

White General Test for Heteroscedasticity
- This is a general test.
- No preconception of the cause of heteroscedasticity.
- It is a Lagrange-Multiplier test.
- Regress the squared residuals on the explanatory variables, their squares and their cross products.
- $n R^2$ is a chi-squared variable.

White Test
Model: $Y_i = \beta_1 + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + \varepsilon_i$
Obtain the residuals, $\hat{\varepsilon}_i$, from the above regression and form the following auxiliary regression:
$\hat{\varepsilon}_i^2 = \alpha_1 + \alpha_2 X_{2i} + \dots + \alpha_k X_{ki} + \alpha_{k+1} X_{2i}^2 + \dots + \alpha_{2k-1} X_{ki}^2 + \alpha_{2k} X_{2i} X_{3i} + \dots + \alpha_{k(k+1)/2} X_{k-1,i} X_{ki} + \nu_i$
Test statistic: $n R^2_{aux} \sim \chi^2_{k(k+1)/2 - 1}$.
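Both the Breusch-Pagan-Godfrey statistic (half the regression sum of squares) and White's statistic (n times the auxiliary R-squared) described above come from an auxiliary regression on the squared OLS residuals. The Python sketch below shows one way to compute them with NumPy alone. The function names are hypothetical, the data are synthetic, and the code is not meant to reproduce Shazam's DIAGNOS / HET output.

```python
import numpy as np

def ols(y, X):
    """OLS fit: returns coefficients, residuals and fitted values (X includes the constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    return beta, y - fitted, fitted

def bpg_statistic(y, X, Z):
    """BPG statistic: half the regression sum of squares from regressing
    p_i = e_i^2 / (SSE/n) on Z (Z includes the constant)."""
    _, e, _ = ols(y, X)
    sigma2_ml = e @ e / len(y)                   # maximum likelihood variance estimate
    p = e**2 / sigma2_ml
    _, _, p_hat = ols(p, Z)
    ssr = float(np.sum((p_hat - p.mean())**2))   # regression (explained) sum of squares
    return ssr / 2.0, Z.shape[1] - 1             # statistic, chi-squared d.f. (m - 1)

def white_statistic(y, X):
    """White statistic n * R^2 from regressing e_i^2 on the regressors, their
    squares and their cross products (constant assumed in column 0 of X)."""
    n = len(y)
    _, e, _ = ols(y, X)
    x = X[:, 1:]                                 # regressors without the constant
    cols = [np.ones(n)] + [x[:, j] for j in range(x.shape[1])]
    for i in range(x.shape[1]):
        for j in range(i, x.shape[1]):
            cols.append(x[:, i] * x[:, j])       # squares (i == j) and cross products
    A = np.column_stack(cols)
    esq = e**2
    _, u, _ = ols(esq, A)
    r2 = 1.0 - (u @ u) / np.sum((esq - esq.mean())**2)
    return n * r2, A.shape[1] - 1                # statistic, chi-squared d.f.

# Synthetic data (an assumption for illustration): error s.d. proportional to x1.
rng = np.random.default_rng(1)
n = 500
x1 = rng.uniform(1.0, 5.0, size=n)
x2 = rng.uniform(0.0, 1.0, size=n)
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(scale=x1, size=n)

bpg, bpg_df = bpg_statistic(y, X, X)             # regressors used as the Z's
white, white_df = white_statistic(y, X)
print(bpg, bpg_df, white, white_df)
```

With two regressors plus a constant (k = 3), the White auxiliary regression has five slope terms, so the sketch reports 5 degrees of freedom, in line with k(k+1)/2 - 1.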
Shazam code for White test for OB example

    ?ols los rate charge / resid=e
    genr esq=e*e
    genr rate2=rate*rate
    genr charge2=charge*charge
    genr charrate=charge*rate
    ?ols esq rate charge rate2 charge2 charrate
    gen1 rsqaux=$r2
    gen1 numb=$n
    gen1 white=numb*rsqaux
    print white

Results of White's test for OB example

    |_?ols los rate charge / resid=e
    |_genr esq=e*e
    |_genr rate2=rate*rate
    |_genr charge2=charge*charge
    |_genr charrate=charge*rate
    |_?ols esq rate charge rate2 charge2 charrate
    |_gen1 rsqaux=$r2
    ..NOTE..CURRENT VALUE OF $R2 = 0.44959
    |_gen1 numb=$n
    ..NOTE..CURRENT VALUE OF $N = 859.00
    |_gen1 white=numb*rsqaux
    |_print white
        WHITE
       386.2005

Note: with k = 3 coefficients in the original model, the statistic has k(k+1)/2 - 1 = 5 degrees of freedom. The 5% critical chi-square value with 5 d.f. is 11.0705, so homoscedasticity is rejected.

White Correction
- We do not know the source of the heteroscedasticity.
- We are forced to use the OLS estimates.
- The correction is a consistent estimate of the true variance-covariance matrix of the OLS estimators.
- It gives tests of hypotheses that are asymptotically valid.

Covariance Matrix OLS
$b = (X'X)^{-1}X'y = \beta + (X'X)^{-1}X'e$
$\Sigma_{bb} = E[(b-\beta)(b-\beta)'] = E[(X'X)^{-1}X'ee'X(X'X)^{-1}] = (X'X)^{-1}X'WX(X'X)^{-1}$, where $W = E[ee']$.
The White correction gives a consistent estimate of the above variance-covariance matrix.

OB Example with White Correction

    |_ols los rate charge / hetcov
    USING HETEROSKEDASTICITY-CONSISTENT COVARIANCE MATRIX
    R-SQUARE = 0.3424     R-SQUARE ADJUSTED = 0.3409
    VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.34648
    STANDARD ERROR OF THE ESTIMATE-SIGMA = 0.58863
    SUM OF SQUARED ERRORS-SSE= 296.59
    MEAN OF DEPENDENT VARIABLE = 2.2946
    LOG OF THE LIKELIHOOD FUNCTION = -762.125

    VARIABLE   ESTIMATED      STANDARD     T-RATIO             PARTIAL  STANDARDIZED  ELASTICITY
      NAME     COEFFICIENT     ERROR       856 DF    P-VALUE    CORR.   COEFFICIENT    AT MEANS
    RATE        4.1647        1.189         3.501     0.000     0.119     0.4940        0.4056
    CHARGE     -0.43644E-03   0.5672E-04   -7.694     0.000    -0.254    -0.4737       -0.3229
    CONSTANT    2.1049        0.1944        10.83     0.000     0.347     0.0000        0.9173

Estimated Generalized Least Squares
The estimated generalized least squares estimator is obtained by replacing the actual variance-covariance matrix with an estimate of it.
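$\hat{\beta} = (X'\hat{W}^{-1}X)^{-1}X'\hat{W}^{-1}y$
The estimated variance-covariance matrix of the estimated generalized least squares estimator is
$\hat{\Sigma}_{\hat{\beta}\hat{\beta}} = (X'\hat{W}^{-1}X)^{-1}$

Possible variance forms
1. Variance proportional to $X_i$: $\sigma_i^2 = \sigma^2 X_i$, weight $= 1/\sqrt{X_i}$.
2. Variance proportional to a power of $X_i$: $\sigma_i^2 = \sigma^2 X_i^{\delta}$, with the exponent (written here as $\delta$) to be estimated, weight $= X_i^{-\hat{\delta}/2}$.
3. Variance proportional to the square of the expected value of Y: $\sigma_i^2 = \sigma^2 (E(y_i))^2 = \sigma^2 (\beta_1 + \beta_2 X_i)^2$, weight $= 1/\hat{y}_i$.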
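When the covariance matrix is diagonal, as in the variance forms above, the estimator reduces to weighted least squares: multiply each observation by its weight and run OLS on the transformed data. The Python sketch below illustrates this for the first form (variance proportional to X, weight 1/sqrt(X)). The function name `wls` is hypothetical and the data are synthetic, so it should be read as an illustration under those assumptions rather than a reproduction of any Shazam command.

```python
import numpy as np

def wls(y, X, w):
    """Weighted least squares: scale each row of y and X by w_i, then run OLS.
    Returns the coefficients and their estimated covariance matrix."""
    yw = y * w
    Xw = X * w[:, None]
    beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    resid = yw - Xw @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])   # variance of the transformed errors
    cov = sigma2 * np.linalg.inv(Xw.T @ Xw)
    return beta, cov

# Synthetic data (an assumption for illustration): Var(error_i) = x_i.
rng = np.random.default_rng(2)
n = 500
x = rng.uniform(10.0, 60.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 3.0 + 2.0 * x + np.sqrt(x) * rng.normal(size=n)

weights = 1.0 / np.sqrt(x)        # form 1: variance proportional to X
beta, cov = wls(y, X, weights)
print(beta)
print(np.sqrt(np.diag(cov)))      # standard errors
```

For the other two forms only the weight vector changes: an estimated power of X for form 2, or the reciprocal of the OLS fitted values for form 3.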