Class no. 5
Bachelor Studies in Finance 2012/2013
Subject: Heteroskedasticity (Goldfeld-Quandt test, White’s test).

1. In White’s test for heteroskedasticity:
a) Write down the auxiliary regression for each of the following base regression models:
   I. $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$,
   II. $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon$,
   III. $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_2^2 + \varepsilon$.
b) How many explanatory variables are there in the auxiliary regression if the base regression has 4 (or 5, or 6) explanatory variables? Assume that none of the variables in the base regression is a square or a product of other variables.

2. The model
$$R_{CHF,t} = \beta_0 + \beta_1 R_{EUR,t} + \beta_2 R_{EUR,t-1} + \beta_3 R_{USD,t} + \beta_4 R_{USD,t-1} + \varepsilon_t, \qquad \varepsilon \sim N(0, \sigma^2 I)$$
was estimated by ordinary least squares from 236 daily exchange rate returns of PLN/CHF ($R_{CHF}$), PLN/EUR ($R_{EUR}$) and PLN/USD ($R_{USD}$).
a) Write down the auxiliary regression for White’s heteroskedasticity test.
b) Can you reject the null hypothesis that the disturbance term is homoskedastic, given that the coefficient of determination of the auxiliary regression equals 0.02194? Calculate the p-value.

3. Using n = 44 daily exchange rate returns of PLN/CHF ($R_{CHF}$), PLN/EUR ($R_{EUR}$) and PLN/USD ($R_{USD}$) from the period March 2004 - April 2004, the model
$$R_{CHF,t} = \alpha_0 + \alpha_1 R_{EUR,t} + \alpha_2 R_{USD,t-1} + \varepsilon_t$$
was estimated by ordinary least squares. Using the appropriate statistical test, examine the hypothesis that the variance of the disturbance term was significantly higher in March than in April. The RSS values for March and April were 7 and 5 respectively, and the numbers of working days in March and April were 23 and 21 respectively. Assume a 5% significance level.

Exercises for presentation
Group 1: Arleta Gałecka, Marta Kossakowska, Tatiana Dmytrenko
Group 2: Mateusz Pliszka, Kamil Olejnik, Alicja Skrzyszewska

1. In the file Data_Set_Class_5.xls you will find data on expenditures (Y) and income (X) for a group of n = 20 families.
Perform the calculations in MS Excel and present the results in class.
a) Estimate (using the OLS method) the parameters of the model:
$$Y = \beta_0 + \beta_1 X + \varepsilon. \tag{1}$$
Interpret the results. Calculate the residuals. Analyze the plot of the residuals (mark the family number on the X axis). Can any conclusion be drawn about the variance of the disturbance term? Explain why.
b) Consider the following action: you sort the observations in ascending order of the variable X and then estimate the parameters of model (1) once again. What do you think the result of this action would be? Explain why. Carry out the operation. Analyze the plot of the residuals once more (this time ordered by ascending value of X). What do you suspect about the variance of the disturbance term?
c) Using the Goldfeld-Quandt test, verify the hypothesis that the disturbance term is homoskedastic. Assume that each sub-sample consists of the eight most extreme observations.
d) Using White’s test, verify the hypothesis that the disturbance term is homoskedastic. Assume the following form of the auxiliary regression:
$$e_i^2 = \gamma_0 + \gamma_1 X_i + \gamma_2 X_i^2 + \nu_i.$$
e) What are the consequences of heteroskedasticity for the properties of OLS estimates?
f) Estimate (using the OLS method) the log-linear model (use the ordered data set from section b):
$$\ln Y = \alpha_0 + \alpha_1 \ln X + \mu \tag{2}$$
Analyze the plot of the residuals. Compare the result with the graph from section b. How did the logarithmic transformation affect the variance of the residuals? Apply White’s test once more, in the same form as in section d, to investigate whether the logarithmic transformation eliminates the heteroskedasticity.

2. The file Data_Set_Class_5.xls contains international cross-sectional data on aggregate expenditure on education (EDUC), gross domestic product (GDP), and population (POP) for a sample of 38 countries in 1997. EDUC and GDP are measured in millions of US $ and POP is measured in thousands.
Perform the calculations in MS Excel and present the results in class.
a) Plot a scatter diagram of EDUC on GDP, and comment on whether the data set appears to be subject to heteroskedasticity.
b) Estimate (using the OLS method) the parameters of the model:
$$EDUC = \beta_0 + \beta_1 GDP + \varepsilon \tag{1}$$
c) Sort the data set by GDP and perform a Goldfeld–Quandt test for heteroskedasticity, running regressions on the subsamples of the 14 countries with the smallest GDP and the 14 countries with the greatest GDP.
d) One way of dealing with heteroskedasticity is to use Weighted Least Squares. To do that, you need to specify a leading variable to which the variance of the error term is proportional. If the leading variable is $x$, then
$$\sigma_i^2 = \sigma_0^2 \cdot x_i$$
and the weights are
$$\omega_i = \frac{1}{x_i}.$$
The true variance of the error term might depend on one of the variables included in the model, on a variable not included in the model, or on any function of such variables. Unfortunately, it is not easy to find this leading variable and the proper weights. Assume that the leading variable is:
1) $GDP^2$, in which case you need to divide model (1) by GDP (scale by Gross Domestic Product):
$$\frac{EDUC}{GDP} = \beta_0 \frac{1}{GDP} + \beta_1 + \frac{\varepsilon}{GDP}. \tag{2}$$
2) $POP^2$, in which case you need to divide model (1) by POP (scale by population):
$$\frac{EDUC}{POP} = \beta_0 \frac{1}{POP} + \beta_1 \frac{GDP}{POP} + \frac{\varepsilon}{POP}. \tag{3}$$
Note that model (3) has no intercept (in the MS Excel Data Analysis – Regression specification window, check the box “Constant is zero” (“Stała wynosi zero”)).
Estimate both models (2) and (3). Use the Goldfeld–Quandt test (as in section c, but for model (2) sort the data set by $1/GDP$ and for model (3) sort by $GDP/POP$) to investigate whether scaling by GDP or by population eliminates the heteroskedasticity. Choose the best specification out of the three estimated models.

3. In the file Data_Set_Class_5.xls you will find data on the import of machines and appliances (M) and the money invested to buy the machines (I) in the years 1970-1989 in a given company.
Perform the calculations in MS Excel and present the results in class.
a) Estimate (using the OLS method) the parameters of the model:
$$M_t = \beta_0 + \beta_1 I_t + \varepsilon_t. \tag{1}$$
b) The CEO of the company informed you that in the years 1977-1989 the company had difficulties with financing its investments. Apply the Goldfeld-Quandt test to find out whether the variance of the error term was constant during the years 1970-1989.
c) Estimate the model once again, this time correcting for heteroskedasticity using White’s method. White’s method is convenient when we do not know the leading variable to which the variance of the error term is proportional. The method involves a few simple steps:
Step 1) Estimate (using the OLS method) the parameters of the auxiliary model:
$$\begin{aligned} z_t = {} & \alpha_0 + \alpha_1 X_{1t} + \alpha_2 X_{2t} + \dots + \alpha_m X_{mt} + \gamma_1 X_{1t}^2 + \gamma_2 X_{2t}^2 + \dots + \gamma_m X_{mt}^2 \\ & + \nu_1 X_{1t} X_{2t} + \nu_2 X_{1t} X_{3t} + \dots + \nu_l X_{m-1,t} X_{mt} + \xi_t \end{aligned} \tag{2}$$
The dependent variable $z_t$ is the logarithm of the squared residuals from model (1): $z_t = \ln(e_t^2)$. The independent variables in the auxiliary model (2) are all independent variables from model (1), their squares, and their cross-products.
Step 2) Calculate the fitted values from the auxiliary model, $\hat{z}_t$, and the estimated variances $\hat{\sigma}_t^2 = \exp(\hat{z}_t)$.
Step 3) Calculate the weights $w_t = 1 / \hat{\sigma}_t$.
Step 4) Multiply each variable in model (1) by the weights and estimate the model:
$$w_t Y_t = \beta'_0 w_t + \beta'_1 (w_t X_{1t}) + \dots + \beta'_m (w_t X_{mt}) + \varepsilon'_t \tag{3}$$
The estimated parameters $\beta'$ from model (3) are corrected for heteroskedasticity.
d) Apply the Goldfeld-Quandt test to model (3) to investigate whether White’s method eliminates the heteroskedasticity.
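
The four steps of White’s method in section c can be sketched in Python for a model with one explanatory variable, such as model (1). This is only an illustration and not part of the exercise, which asks for MS Excel: the data below are synthetic, and the names `ols`, `white_weights_fgls`, `invest` and `imports` are assumptions introduced here. With a single regressor the auxiliary model (2) has no cross-products, so it contains only the regressor and its square.

```python
import numpy as np


def ols(X, y):
    """OLS coefficients for y = X b + e; X must already contain any constant column."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def white_weights_fgls(x, y):
    """Sketch of Steps 1-4 for a model with a constant and one regressor x."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])       # regressors of model (1)
    beta_ols = ols(X, y)
    e = y - X @ beta_ols                       # OLS residuals from model (1)

    # Step 1: regress z_t = ln(e_t^2) on x and x^2 (no cross-products with one regressor).
    z = np.log(e ** 2)
    A = np.column_stack([np.ones(n), x, x ** 2])
    alpha = ols(A, z)

    # Step 2: fitted values and estimated variances sigma_hat_t^2 = exp(z_hat_t).
    sigma2_hat = np.exp(A @ alpha)

    # Step 3: weights w_t = 1 / sigma_hat_t.
    w = 1.0 / np.sqrt(sigma2_hat)

    # Step 4: multiply every variable (including the constant) by w_t and
    # re-estimate by OLS without adding a new intercept, as in model (3).
    beta_fgls = ols(X * w[:, None], w * y)
    return beta_ols, beta_fgls, w


# Synthetic data (hypothetical, NOT the contents of Data_Set_Class_5.xls):
# 20 annual observations whose error standard deviation grows with the regressor.
rng = np.random.default_rng(0)
invest = np.linspace(10.0, 105.0, 20)
imports = 5.0 + 0.8 * invest + rng.normal(0.0, 0.05 * invest)

beta_ols, beta_fgls, w = white_weights_fgls(invest, imports)
print("OLS estimates:      ", beta_ols)
print("Weighted estimates: ", beta_fgls)
```

Because the estimated variances come from exponentiating fitted values, they are always positive, so the weights in Step 3 are always defined. The weighted regression in Step 4 is run on the transformed constant $w_t$ rather than on a column of ones, which is why no separate intercept is added.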