Faculty of Economy
International Business and Development
November 24 2011

3rd TEST (Type AD)
Economic Statistics
Duration – 50 minutes
Examination Aids: Calculator

                EXERCISE 1   EXERCISE 2   Total
Point Value         8            2          10
Point Earned

In the calculations, use no more than two decimal places.
Always remember to comment on the results obtained.

EXERCISE 1 – Type AD (eight points)

Recent UN data from several nations on:
Y = crude birth rate (number of births per 1000 population),
X2 = women's economic activity (female labor force as a percentage of male),
X3 = GNP (per capita, in thousands of dollars)

The human resource director is interested in using regression modeling to help explain the variability of the birth rate. As independent variables we choose the two variables X2 (female labor force) and X3 (GNP per capita). Here are: i) the mean and standard deviation of each variable, ii) the correlation coefficients between the variables, and iii) the OLS estimates of the model parameters and the ANOVA table.

Summary Statistics, using the observations 1 - 26

Variable                   Mean       Std. Dev.
Y (Birth_Rate)             22.2846    10.3519
X2 (Female_Labor_force)    49.4400    20.2363
X3 (GNP_PC)                9.22000    9.55789

Correlation coefficients, using the observations 1 - 26

         Y          X2         X3
Y      1.0000
X2    -0.5222     1.0000
X3    -0.7381     0.5809     1.0000

Model 1: OLS, using observations 1-26
Dependent variable: Y (Birth_Rate)

          Coefficient   Std. Error   t-ratio
const       34.533        4.123
X2          -0.131        0.09546     -1.37
X3          -0.644        0.1957      -3.29

Mean dependent var    22.2846     S.D. dependent var    10.3519
Sum squared resid     1106.13     S.E. of regression
R-squared                         Adjusted R-squared
F                     13.54838

Analysis of Variance (ANOVA)

                    Sum of squares    df    Mean square
Regression              1427.26        2
Residual (error)        1106.13       23
Total                   2533.39       25

A) [1 point] Write down the estimated regression equation. Interpret the coefficient estimates for X2 and X3.
B) [1 point] Compute the coefficient of determination, R2, and interpret its meaning.
C) [1 point] Test whether the overall regression model is significant using a 0.05 significance level.
D) [2 points] At the 0.05 level of significance, determine whether each independent variable makes a contribution to the regression model. Indicate the most appropriate regression model for this set of data.
E) [1 point] Sketch on a single graph the relationship between Y and X3 when X2 = 0. Interpret the results.
F) [2 points] Find the estimated standardized regression coefficients for the model, and interpret them.

a) Write down the estimated regression equation. Interpret the coefficient estimates for X2 and X3.

Birth_Rate_hat = 34.5 - 0.131*Female_Labor_fo - 0.644*GNP_pc
                (4.12)  (0.0955)                (0.196)
n = 26, R-squared = 0.563 (standard errors in parentheses)

X2: using the sample data, we can say that if the female labor force (as a percentage of male) increases by one percentage point, the number of births per 1000 population decreases by 0.131 on average, holding all the other variables constant (in our case only X3).
X3: using the sample data, we can say that if GNP per capita increases by $1000, the number of births per 1000 population decreases by 0.644 on average, controlling for X2.

b) [1 point] Compute the coefficient of determination, R2, and interpret its meaning.

R-squared = ESS/TSS = 1427.26/2533.39 = 0.56

The variables X2 and X3 together explain 56% of the variability of Y. In other terms, using X2 and X3 to predict the crude birth rate produces a 56% reduction in prediction error relative to using only the mean of Y (Y bar).

c) Test whether the overall regression model is significant (i.e., valid) using a 0.05 significance level.
Solution:

[Hypotheses]
H0: β2 = β3 = 0
H1: Not all βj = 0 for j = 2, 3
Or:
H0: β2 = β3 = 0
H1: At least one βj ≠ 0 for j = 2, 3
Or:
H0: The model is not significant
H1: The model is significant

[Test statistic]
With k = 3 estimated parameters and n = 26 observations:
F(k - 1, n - k) = [ESS/(k - 1)] / [RSS/(n - k)] = (1427.26/2) / (1106.13/23) = 14.84

[Decision rule]
From the F-table, F(0.05, 2, 23) = 3.42. The decision rule is to reject H0 if F > 3.42, and not to reject H0 if F ≤ 3.42.
The test statistic is F = 14.84, which falls in the rejection region. Reject H0 and conclude that the model is significant.

d) [2 points] At the 0.05 level of significance, determine whether each independent variable makes a contribution to the regression model. Indicate the most appropriate regression model for this set of data.

H0: βj = 0, j = X2, X3
H1: βj ≠ 0, j = X2, X3
t = (bj - 0)/sbj, j = X2, X3

The critical value from the t-table is t = 2.069 with 23 degrees of freedom. At the 0.05 significance level, reject H0 if t ≥ 2.069 or t ≤ -2.069. Do not reject H0 if -2.069 < t < 2.069.
Compare the t statistics in the above table (-1.37 for X2 and -3.29 for X3) to the critical value: X2 is not a significant independent variable and X3 is a significant independent variable. We can rewrite our regression model as a simple linear regression model:

Birth_Rate_hat = 34.5 - 0.644*GNP_pc
                (4.123) (0.196)

Strictly, the coefficients and standard errors of this reduced model should be re-estimated; the values shown here are carried over from the full model.

e) Sketch on a single graph the relationship between Y and X3 when X2 = 0. Interpret the results.

If X2 = 0 the equation is:
Birth_Rate_hat = 34.5 - 0.644*GNP_pc

There is a negative relationship between GNP per capita and the crude birth rate. If GNP were equal to zero the birth rate would be 34.533, but GNP = 0 has no meaning. As GNP increases the birth rate decreases; it would be 0 for a GNP value of about 53.62 (thousand dollars). Negative values of GNP or of the birth rate cannot be observed.

f) [2 points] Find the estimated standardized regression coefficients for the model, and interpret them.
We obtain estimates of the standardized regression coefficients starting from the partial regression coefficients:

For beta2: beta2 = b2*sX2/sY = -0.131*20.2363/10.3519 = -0.256
For beta3: beta3 = b3*sX3/sY = -0.644*9.55789/10.3519 = -0.595

These standardized regression slopes are comparable independently of the scales on which the predictors are measured. So it can be stated that X3 (GNP pc), whose standardized coefficient is larger in absolute value, has the greater relative effect on the value of Y.

EXERCISE 2 – Type AD (two points)

A) [1 point] All of the following are possible effects of multicollinearity EXCEPT:
a) The variances of the regression coefficient estimators may be larger than expected
b) The signs of the regression coefficients may be opposite of what is expected
c) A significant F ratio may result even though the t ratios are not significant
d) Removal of one data point may cause large changes in the coefficient estimates
e) The VIF is zero

B) [1 point] In a multiple regression model, what methods are available to identify the most important explanatory variable?

When independent variables are correlated, as they normally are, determining the relative importance of the predictor variables is a very complex process. We provide only a few notes about the problem:
1) The variable with the highest absolute value of the standardized partial regression coefficient (betaj) is the most important independent variable.
2) Through the decomposition of R-squared into the contributions of each variable. In the trivariate model we have: R2 = beta2*rYX2 + beta3*rYX3. The variable associated with the greatest value of betaj*rYXj is the most important explanatory variable.
3) Computing the partial correlation coefficient. For example, the larger the absolute value of rYX2.X3, the stronger the association between Y and X2, controlling for X3.
4) Comparing the t statistics: the variable with the largest absolute t statistic is the most important. Note that this method works only if every variable is statistically significant.
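
The hand calculations of Exercise 1, together with the R-squared decomposition mentioned in note 2 above, can be checked with a short script. What follows is only a minimal sketch in Python: it uses nothing but the figures printed in the exam tables, the variable names are our own, and the critical values 3.42 and 2.069 are copied from the solution rather than computed. With the rounded figures printed in the exam, the sum of the contributions betaj*rYXj is close to, but not exactly equal to, R-squared.

```python
# Figures copied from the exam tables of Exercise 1 (names are illustrative).
ess, rss, tss = 1427.26, 1106.13, 2533.39      # ANOVA sums of squares
df_reg, df_res = 2, 23                         # k - 1 and n - k
b = {"X2": -0.131, "X3": -0.644}               # OLS slope estimates
se = {"X2": 0.09546, "X3": 0.1957}             # their standard errors
sd = {"Y": 10.3519, "X2": 20.2363, "X3": 9.55789}   # sample std. deviations
r_y = {"X2": -0.5222, "X3": -0.7381}           # correlations with Y

# Critical values taken from the solution text (assumed, not computed here).
F_CRIT = 3.42    # F(0.05, 2, 23)
T_CRIT = 2.069   # two-sided t(0.05, 23)

# b) coefficient of determination
r2 = ess / tss

# c) overall F test
f_stat = (ess / df_reg) / (rss / df_res)
model_significant = f_stat > F_CRIT

# d) individual t tests
t = {name: b[name] / se[name] for name in b}
significant = {name: abs(t[name]) >= T_CRIT for name in b}

# f) standardized coefficients: betaj = bj * sXj / sY
beta = {name: b[name] * sd[name] / sd["Y"] for name in b}

# Exercise 2 B, note 2: contribution of each variable, betaj * rYXj
contrib = {name: beta[name] * r_y[name] for name in b}

print(f"R-squared = {r2:.2f}")
print(f"F = {f_stat:.2f}, reject H0: {model_significant}")
for name in b:
    print(f"{name}: t = {t[name]:.2f}, significant: {significant[name]}, "
          f"beta = {beta[name]:.3f}, contribution = {contrib[name]:.3f}")
print(f"Sum of contributions = {sum(contrib.values()):.3f}")
```

Keeping the inputs in dictionaries keyed by variable name makes it easy to see that the same ranking of X2 and X3 comes out of the t statistics, the standardized coefficients, and the R-squared contributions.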