Economics 310

Lecture 13
Heteroscedasticity Continued
Tests to be Discussed
Goldfeld-Quandt Test: assumes the variance is monotonically associated with some variable.
Breusch-Pagan-Godfrey Test: assumes the variance is a linear function of a set of variables, or a function of a linear combination of variables.
White General Heteroscedasticity Test: the source of the heteroscedasticity is unknown, but heteroscedasticity may exist.
Goldfeld-Quandt Test
Test assumes $\sigma_i^2 = f(Z_i)$, where $Z$ is some variable and $f$ is a monotonic function (i.e. $df/dZ > 0$ or $df/dZ < 0$ for all $Z$).
1. Order data according to $Z$.
2. Divide data into 3 groups of size $(n-c)/2$, $c$, $(n-c)/2$.
3. Estimate model in the first and third groups.
4. Get variance estimates for each group.
5. Test statistic is an F with $(n-c)/2 - k$ and $(n-c)/2 - k$ degrees of freedom:
   $F = \hat\sigma_2^2 / \hat\sigma_1^2$, arranged so the largest variance estimate is on top.
6. $H_0$: Homoscedasticity
   $H_a$: Heteroscedasticity
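The steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the model has a constant and a single regressor (so k = 2), the function names `fit_sse` and `goldfeld_quandt` are hypothetical, and the demo data are made up so that the error spread grows with Z. The resulting F would still have to be compared with a critical value from an F table.

```python
def fit_sse(x, y):
    """OLS of y on a constant and x; returns the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def goldfeld_quandt(x, y, z, c, k=2):
    """Return (F, df_numerator, df_denominator) for the Goldfeld-Quandt test."""
    n = len(y)
    if (n - c) % 2 != 0:
        raise ValueError("n - c must be even so the two outer groups match")
    size = (n - c) // 2
    # Step 1: order the observations by Z.
    order = sorted(range(n), key=lambda i: z[i])
    # Step 2: keep the first and last groups; the middle c are dropped.
    first, last = order[:size], order[n - size:]
    # Steps 3-4: fit each group separately and estimate its error variance.
    df = size - k
    variances = []
    for group in (first, last):
        sse = fit_sse([x[i] for i in group], [y[i] for i in group])
        variances.append(sse / df)
    # Step 5: the larger variance estimate goes on top.
    f_stat = max(variances) / min(variances)
    return f_stat, df, df


# Made-up demo data: the error magnitude is proportional to Z (= X here).
x = list(range(1, 21))
z = list(x)
noise = [(-1) ** xi * 0.1 * xi for xi in x]
y = [2 + 3 * xi + ei for xi, ei in zip(x, noise)]

f_stat, df_num, df_den = goldfeld_quandt(x, y, z, c=4)
print(f_stat, df_num, df_den)
```

With n = 20 and c = 4 each outer group has 8 observations, so both degrees of freedom are 8 - 2 = 6.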
Data Organization
Observation   Y        X1      X2      Z     Group
1             370.11   10.00   30.00   0.2   Group 1: (n-c)/2 = (20-4)/2 = 8 obs
2             361.45   12.65   28.79   0.4   Group 1
3             351.30   15.65   27.36   0.6   Group 1
4             342.23   17.95   25.94   0.8   Group 1
5             332.64   20.09   24.22   1.0   Group 1
6             325.50   22.96   23.73   1.2   Group 1
7             314.92   25.90   22.24   1.4   Group 1
8             310.00   27.97   21.58   1.6   Group 1
9             302.15   30.69   20.96   1.8   Omitted middle: C = 4
10            297.38   32.93   20.72   2.0   Omitted middle
11            291.26   35.16   20.07   2.2   Omitted middle
12            285.30   37.84   19.78   2.4   Omitted middle
13            277.43   40.08   19.02   2.6   Group 2: (n-c)/2 = (20-4)/2 = 8 obs
14            272.61   42.76   18.81   2.8   Group 2
15            261.92   45.16   17.20   3.0   Group 2
16            259.55   47.24   16.97   3.2   Group 2
17            246.14   49.69   15.03   3.4   Group 2
18            240.57   51.95   14.11   3.6   Group 2
19            229.52   54.08   12.14   3.8   Group 2
20            220.45   56.92   11.42   4.0   Group 2
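The grouping for this 20-observation layout can be checked with a few lines of Python. The helper name `gq_groups` is hypothetical, and the sketch assumes the observations are already numbered in increasing order of Z.

```python
def gq_groups(n, c):
    """Split observation numbers 1..n (already ordered by Z) into the
    low-Z group, the omitted middle block of c, and the high-Z group."""
    if (n - c) % 2 != 0:
        raise ValueError("n - c must be even so the two outer groups match")
    size = (n - c) // 2
    obs = list(range(1, n + 1))
    return obs[:size], obs[size:size + c], obs[size + c:]


group1, omitted, group2 = gq_groups(20, 4)
print(group1)   # observations used for the first regression
print(omitted)  # the c = 4 middle observations that are left out
print(group2)   # observations used for the second regression
```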
Obstetrics Example
Data from 800+ hospitals.
Dependent variable is the average length of stay in the maternity ward.
Explanatory variables are the charge per day and the % of deliveries that are c-sections.
Expect greater variability in length of stay at hospitals that are not subject to high managed care.
Shazam Commands
sample 1 859
read (d:\econom~1\classe~1\ob1_het.txt) cases rate los cost billed neo mcph mcpm
genr charge=billed/los
ols los rate charge
diagnos / chowone=589
Shazam Output for Goldfeld-Quandt Test
VARIABLE   ESTIMATED      STANDARD     T-RATIO            PARTIAL  STANDARDIZED  ELASTICITY
NAME       COEFFICIENT    ERROR        856 DF    P-VALUE  CORR.    COEFFICIENT   AT MEANS
RATE       4.1647         0.2426       17.17     0.000    0.506    0.4940        0.4056
CHARGE     -0.43644E-03   0.2652E-04   -16.46    0.000    -0.490   -0.4737       -0.3229
CONSTANT   2.1049         0.6367E-01   33.06     0.000    0.749    0.0000        0.9173
|_diagnos / chowone=589
REQUIRED MEMORY IS PAR= 123 CURRENT PAR= 500
DEPENDENT VARIABLE = LOS     859 OBSERVATIONS
REGRESSION COEFFICIENTS
4.16471785492   -0.436436399782E-03   2.10491763784
SEQUENTIAL CHOW AND GOLDFELD-QUANDT TESTS
N1    N2    SSE1     SSE2     CHOW     PVALUE   G-Q     DF1   DF2   PVALUE
589   270   238.20   21.357   40.564   0.000    5.082   586   267   0.000
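As a quick arithmetic check, the G-Q value in the output can be reproduced from the printed sums of squared errors and degrees of freedom. The sketch assumes the statistic is the ratio of the two sub-sample variance estimates, SSE/DF, with the larger one on top; that reading is an assumption, but it matches the printed 5.082.

```python
# Values copied from the SHAZAM output above.
n1, n2 = 589, 270
sse1, sse2 = 238.20, 21.357
k = 3  # constant, RATE, CHARGE

df1 = n1 - k  # 586
df2 = n2 - k  # 267

# Ratio of the two variance estimates, larger one in the numerator.
gq = (sse1 / df1) / (sse2 / df2)
print(round(gq, 3))
```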
Breusch-Pagan-Godfrey Test
Model: $Y_i = \beta_1 + \beta_2 X_{2i} + ... + \beta_k X_{ki} + \varepsilon_i$
$\sigma_i^2 = f(\alpha_1 + \alpha_2 Z_{2i} + ... + \alpha_m Z_{mi})$ or, more specifically,
$\sigma_i^2 = \alpha_1 + \alpha_2 Z_{2i} + ... + \alpha_m Z_{mi}$
1. Estimate the model by OLS and get the residuals $\hat\varepsilon_i$.
2. Obtain the maximum likelihood estimate of the variance, $\tilde\sigma^2 = SSE/n$.
3. Construct $p_i = \hat\varepsilon_i^2 / \tilde\sigma^2$.
4. Estimate the regression: $p_i = \alpha_1 + \alpha_2 Z_{2i} + ... + \alpha_m Z_{mi} + \nu_i$
5. Obtain the SSR (the regression, i.e. explained, sum of squares) for the regression in step 4 and define
   $\frac{1}{2}(SSR) \sim \chi^2_{m-1}$
6. Test the null hypothesis of homoscedasticity using the above chi-squared random variable.
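The six steps can be traced in a short Python sketch. It is an illustration only: it assumes one regressor X in the model and one variable Z in the variance equation (so m = 2 and the statistic has 1 degree of freedom), the function names are hypothetical, and the demo data are made up. For 1 degree of freedom the chi-squared upper tail equals `erfc(sqrt(x/2))`, which is what the p-value line uses.

```python
import math


def simple_ols(x, y):
    """OLS of y on a constant and x; returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def breusch_pagan_godfrey(x, y, z):
    """Return (statistic, df, p_value) for one X and one Z."""
    n = len(y)
    # Step 1: OLS residuals from the original model.
    a, b = simple_ols(x, y)
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    # Step 2: maximum likelihood variance estimate, SSE / n.
    sigma2_ml = sum(e * e for e in resid) / n
    # Step 3: scaled squared residuals.
    p = [e * e / sigma2_ml for e in resid]
    # Step 4: regress p on a constant and Z.
    a_p, b_p = simple_ols(z, p)
    # Step 5: explained sum of squares of that regression, halved.
    p_bar = sum(p) / n
    ssr = sum((a_p + b_p * zi - p_bar) ** 2 for zi in z)
    stat = ssr / 2
    df = 1  # m - 1 with m = 2
    # Step 6: chi-squared(1) upper-tail probability.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, df, p_value


# Made-up demo data: the error magnitude grows with Z (= X here).
x = list(range(1, 21))
z = list(x)
y = [2 + 3 * xi + (-1) ** xi * 0.1 * xi for xi in x]

stat, df, p_value = breusch_pagan_godfrey(x, y, z)
print(stat, df, p_value)
```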
Example of BPG Test using OB Data
Null hypothesis is homoscedasticity.
Let the Z's be (1) the number of OB cases per year and (2) whether the hospital is under high managed care.
Expect variance to be negatively related to both variables.
Shazam Code for OB Example
* performing Breusch-Pagan Test on cases and mcph
?ols los rate charge / resid=e dn anova
gen1 sigsq=$sig2
genr esq=e*e
genr p=esq/sigsq
ols p cases mcph / anova
gen1 ess=$ssr
gen1 pbg=ess/2
Print pbg
Shazam Output for OB Example
|_ols p cases mcph / anova
VARIABLE   ESTIMATED      STANDARD     T-RATIO            PARTIAL  STANDARDIZED  ELASTICITY
NAME       COEFFICIENT    ERROR        856 DF    P-VALUE  CORR.    COEFFICIENT   AT MEANS
CASES      -0.37767E-03   0.2353E-03   -1.605    0.109    -0.055   -0.0551       -0.5201
MCPH       -0.52494       0.6677       -0.7861   0.432    -0.027   -0.0270       -0.1650
CONSTANT   1.6851         0.4774       3.529     0.000    0.120    0.0000        1.6851
|_gen1 ess=$ssr
..NOTE..CURRENT VALUE OF $SSR = 288.00
|_gen1 pbg=ess/2
|_Print pbg
PBG
144.0017
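The printed statistic can be compared with a chi-squared critical value by hand. The sketch below uses the rounded $SSR shown in the output, which is why it gives 144.0 rather than 144.0017. The regression has two Z variables besides the constant, so the statistic has 2 degrees of freedom; for 2 degrees of freedom the chi-squared upper tail is exactly exp(-x/2). The 5% critical value 5.991 is the standard table value and does not come from the slides.

```python
import math

ssr = 288.00            # $SSR as printed (rounded) in the output above
pbg = ssr / 2           # BPG statistic: half the explained sum of squares
df = 2                  # m - 1: CASES and MCPH besides the constant
critical_5pct = 5.991   # chi-squared table value for 2 d.f. (assumed 5% level)

p_value = math.exp(-pbg / 2)  # exact chi-squared(2) upper tail
reject = pbg > critical_5pct
print(pbg, reject, p_value)
```

On these numbers the null hypothesis of homoscedasticity is rejected at the 5% level.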
Built-in BPG Test in Shazam
Shazam has a built-in BPG test.
Uses the explanatory variables as the Zs.
Invoked by using the command "DIAGNOS" with the option "HET" right after the "OLS" command, e.g.
ols y x1 x2
diagnos / het
Using HET on OB example
|_?ols los rate charge
|_diagnos / het
REQUIRED MEMORY IS PAR= 123 CURRENT PAR= 500
DEPENDENT VARIABLE = LOS
859 OBSERVATIONS
REGRESSION COEFFICIENTS
4.16471785492 -0.436436399782E-03 2.10491763784
HETEROSKEDASTICITY TESTS
E**2 ON YHAT:                  CHI-SQUARE = 19.852 WITH 1 D.F.
E**2 ON YHAT**2:               CHI-SQUARE = 80.223 WITH 1 D.F.
E**2 ON LOG(YHAT**2):          CHI-SQUARE = 1.018 WITH 1 D.F.
E**2 ON X (B-P-G) TEST:        CHI-SQUARE = 35.644 WITH 2 D.F.
E**2 ON LAG(E**2) ARCH TEST:   CHI-SQUARE = 0.027 WITH 1 D.F.
LOG(E**2) ON X (HARVEY) TEST:  CHI-SQUARE = 5.259 WITH 2 D.F.
ABS(E) ON X (GLEJSER) TEST:    CHI-SQUARE = 62.292 WITH 2 D.F.
White General Test for Heteroscedasticity
This is a general test.
No preconception of the cause of heteroscedasticity.
It is a Lagrange multiplier test.
Regress the squared residuals on the explanatory variables, their squares and their cross products.
$n R^2$ from that regression is a chi-squared variable.
White Test
Model: $Y_i = \beta_1 + \beta_2 X_{2i} + ... + \beta_k X_{ki} + \varepsilon_i$
Obtain residuals, $\hat\varepsilon_i$, from the above regression and form the following auxiliary regression:
$\hat\varepsilon_i^2 = \alpha_1 + \alpha_2 X_{2i} + ... + \alpha_k X_{ki} + \alpha_{k+1} X_{2i}^2 + ... + \alpha_{2k-1} X_{ki}^2 + \alpha_{2k} X_{2i} X_{3i} + ... + \alpha_{k(k+1)/2} X_{k-1,i} X_{ki} + \nu_i$
Test statistic:
$n R^2_{aux} \sim \chi^2_{k(k+1)/2 - 1}$
Shazam code for White test for OB example
?ols los rate charge / resid=e
genr esq=e*e
genr rate2=rate*rate
genr charge2=charge*charge
genr charrate=charge*rate
?ols esq rate charge rate2 charge2 charrate
gen1 rsqaux=$r2
gen1 numb=$n
gen1 white=numb*rsqaux
print white
Results of White's test for OB example
|_?ols los rate charge / resid=e
|_genr esq=e*e
|_genr rate2=rate*rate
|_genr charge2=charge*charge
|_genr charrate=charge*rate
|_?ols esq rate charge rate2 charge2 charrate
|_gen1 rsqaux=$r2
..NOTE..CURRENT VALUE OF $R2 = 0.44959
|_gen1 numb=$n
..NOTE..CURRENT VALUE OF $N = 859.00
|_gen1 white=numb*rsqaux
|_print white
WHITE
386.2005
Note: Critical chi-square at the 5% level with 5 d.f. = 11.0705
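The statistic and its degrees of freedom can be recomputed from the printed values. The sketch uses the rounded $R2 from the output, so it gives about 386.198 rather than 386.2005. The helper `white_df` is a hypothetical name; it counts the slope terms in the auxiliary regression (levels, squares and cross products), which for k = 3 coefficients in the original model equals k(k+1)/2 - 1 = 5.

```python
def white_df(num_regressors):
    """Number of slope terms in White's auxiliary regression:
    levels + squares + pairwise cross products."""
    q = num_regressors
    return q + q + q * (q - 1) // 2


n = 859                 # $N from the output above
r2_aux = 0.44959        # $R2 of the auxiliary regression (rounded)
critical_5pct = 11.0705 # critical value quoted with the results above

white = n * r2_aux
df = white_df(2)        # two regressors: rate and charge
reject = white > critical_5pct
print(round(white, 3), df, reject)
```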
White Correction
Used when we do not know the source of the heteroscedasticity.
We are forced to use the OLS estimates.
Provides a consistent estimate of the true variance-covariance matrix of the OLS estimators.
Gives tests of hypotheses that are asymptotically valid.
Covariance Matrix OLS
$b = (X'X)^{-1} X'y = \beta + (X'X)^{-1} X'e$
$\Sigma_{bb} = E[(X'X)^{-1} X'ee'X (X'X)^{-1}] = (X'X)^{-1} X'WX (X'X)^{-1}$
The White correction gives a consistent estimate of
the above variance - covariance matrix.
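A minimal sketch of the sandwich formula for a model with a constant and one regressor is given below. It assumes the basic form of the correction, in which W is replaced by a diagonal matrix of squared OLS residuals; whether SHAZAM's `hetcov` option applies any further degrees-of-freedom scaling is not stated in the slides. The function names and the demo data are hypothetical.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def white_covariance(x, y):
    """OLS of y on a constant and x. Returns ((b0, b1), cov), where cov is
    (X'X)^-1 X' diag(e_i^2) X (X'X)^-1."""
    n = len(y)
    sx = sum(x)
    sxx = sum(xi * xi for xi in x)
    det = n * sxx - sx * sx
    xtx_inv = [[sxx / det, -sx / det], [-sx / det, n / det]]  # (X'X)^-1
    sy = sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    b0 = xtx_inv[0][0] * sy + xtx_inv[0][1] * sxy
    b1 = xtx_inv[1][0] * sy + xtx_inv[1][1] * sxy
    e2 = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]
    # Middle term X' diag(e^2) X for X = [1, x].
    m00 = sum(e2)
    m01 = sum(ei * xi for ei, xi in zip(e2, x))
    m11 = sum(ei * xi * xi for ei, xi in zip(e2, x))
    meat = [[m00, m01], [m01, m11]]
    cov = matmul2(matmul2(xtx_inv, meat), xtx_inv)
    return (b0, b1), cov


# Made-up demo data with error spread growing in x.
x = list(range(1, 21))
y = [2 + 3 * xi + (-1) ** xi * 0.1 * xi for xi in x]

coefs, cov = white_covariance(x, y)
robust_se = [cov[0][0] ** 0.5, cov[1][1] ** 0.5]
print(coefs, robust_se)
```

The coefficient estimates are the ordinary OLS ones; only the standard errors change.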
OB Example with White Correction
|_ols los rate charge / hetcov
USING HETEROSKEDASTICITY-CONSISTENT COVARIANCE MATRIX
R-SQUARE = 0.3424     R-SQUARE ADJUSTED = 0.3409
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.34648
STANDARD ERROR OF THE ESTIMATE-SIGMA = 0.58863
SUM OF SQUARED ERRORS-SSE= 296.59
MEAN OF DEPENDENT VARIABLE = 2.2946
LOG OF THE LIKELIHOOD FUNCTION = -762.125
VARIABLE   ESTIMATED      STANDARD     T-RATIO            PARTIAL  STANDARDIZED  ELASTICITY
NAME       COEFFICIENT    ERROR        856 DF    P-VALUE  CORR.    COEFFICIENT   AT MEANS
RATE       4.1647         1.189        3.501     0.000    0.119    0.4940        0.4056
CHARGE     -0.43644E-03   0.5672E-04   -7.694    0.000    -0.254   -0.4737       -0.3229
CONSTANT   2.1049         0.1944       10.83     0.000    0.347    0.0000        0.9173
Estimated Generalized Least Squares
The estimated generalized least squares estimator is obtained by replacing the actual variance-covariance matrix with an estimate of it.
$\hat\beta = (X'\hat W^{-1} X)^{-1} X'\hat W^{-1} y$
The estimated variance-covariance matrix of the estimated generalized least squares estimator is
$\hat\Sigma_{\hat\beta\hat\beta} = (X'\hat W^{-1} X)^{-1}$

Possible variance forms
1. $\sigma_i^2 = \sigma^2 X_i$, weight $= 1/\sqrt{X_i}$
2. $\sigma_i^2 = \sigma^2 X_i^{\gamma}$, $\gamma$ to be estimated, weight $= X_i^{-\hat\gamma/2}$
3. Variance proportional to the square of the expected value of Y:
   $\sigma_i^2 = \sigma^2 (E(y_i))^2 = \sigma^2 (\beta_1 + \beta_2 X_i)^2$, weight $= 1/\hat y_i$
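To show how such a weight is used, here is a small weighted least squares sketch for a model with a constant and one regressor. It assumes that "weight" means the factor by which each observation (including the constant) is multiplied before running OLS on the transformed data. The function name and demo data are hypothetical, and the demo uses the first variance form, with weight $1/\sqrt{X_i}$.

```python
import math


def weighted_least_squares(x, y, weights):
    """Multiply every observation by its weight and run OLS on
    w*y = b1*w + b2*(w*x) + error. Returns (b1, b2)."""
    c = list(weights)                              # transformed constant
    xs = [w * xi for w, xi in zip(weights, x)]     # transformed regressor
    ys = [w * yi for w, yi in zip(weights, y)]     # transformed dependent
    # Normal equations for the two transformed regressors (no extra intercept).
    scc = sum(ci * ci for ci in c)
    scx = sum(ci * xi for ci, xi in zip(c, xs))
    sxx = sum(xi * xi for xi in xs)
    scy = sum(ci * yi for ci, yi in zip(c, ys))
    sxy = sum(xi * yi for xi, yi in zip(xs, ys))
    det = scc * sxx - scx * scx
    b1 = (sxx * scy - scx * sxy) / det
    b2 = (scc * sxy - scx * scy) / det
    return b1, b2


# Made-up demo data: error standard deviation proportional to sqrt(x).
x = list(range(1, 21))
y = [2 + 3 * xi + (-1) ** xi * 0.1 * math.sqrt(xi) for xi in x]

weights = [1 / math.sqrt(xi) for xi in x]          # variance form 1
b1, b2 = weighted_least_squares(x, y, weights)
print(b1, b2)
```

For the third form the weights would instead be built from the fitted values of a first-stage OLS regression, $1/\hat y_i$.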