Faculty of Economy International Business and Development

Faculty of Economy
International Business and Development
November 24, 2011 3rd TEST (Type AD)
Economic Statistics
Duration – 50 minutes
Examination Aids: Calculator
               EXERCISE 1   EXERCISE 2   Total
Point Value    8            2            10
Point Earned
In the calculations, use no more than two decimal places.
Always remember to comment on the results obtained.
EXERCISE 1 – Type AD (eight points)
Recent UN data from several nations on:
Y = crude birth rate (number of births per 1000 population size),
X2 = women's economic activity (female labor force as percentage of male),
X3 = GNP (per capita, in thousands of dollars)
The human resource director is interested in using regression modeling to help
explain the variability of the birth rate. As independent variables we choose
X2 (female labour force) and X3 (GNP per capita).
Here are: i) the mean and standard deviation of each variable, ii) the correlation
coefficients between the variables, and iii) the OLS estimates of the model
parameters and the ANOVA table.
Summary Statistics, using the observations 1 - 26
Variable                   Mean       Std. Dev.
Y (Birth_Rate)             22.2846    10.3519
X2 (Female_Labor_force)    49.4400    20.2363
X3 (GNP_PC)                9.22000    9.55789
Correlation coefficients, using the observations 1 - 26
        Y          X2         X3
Y       1.0000     -0.5222    -0.7381
X2                 1.0000     0.5809
X3                            1.0000
Model 1: OLS, using observations 1-26
Dependent variable: Y (Birth_Rate)
          Coefficient   Std. Error   t-ratio
const     34.533        4.123
X2        -0.131        0.09546      -1.37
X3        -0.644        0.1957       -3.29

Mean dependent var    22.2846      S.D. dependent var    10.3519
Sum squared resid     1106.13      S.E. of regression
R-squared                          Adjusted R-squared
F                     13.54838
Analysis of Variance (ANOVA)
                    Sum of squares   df   Mean square
Regression          1427.26          2
Residual (error)    1106.13          23
Total               2533.39          25
A) [1 point] Write down the estimated regression equation. Interpret the coefficient
estimates for X2 and X3
B) [1 point] Compute the coefficient of determination, R2, and interpret its meaning.
C) [1 point] Test if the overall regression model is significant using a 0.05 significance
level.
D) [2 points] At the 0.05 level of significance, determine whether each independent
variable makes a contribution to the regression model. Indicate the most
appropriate regression model for this set of data.
E) [1 point] Sketch on a single graph the relationship between Y and X3 when X2 = 0.
Interpret the results.
F) [2 points] Find the estimated standardized regression coefficients for the model,
and interpret.
a) Write down the estimated regression equation. Interpret the coefficient estimates
for X2 and X3
Birth_Rate_hat = 34.5 - 0.131*Female_Labor_fo - 0.644*GNP_pc
                (4.123) (0.0955)                (0.196)
n = 26, R-squared = 0.563
(standard errors in parentheses)
X2: based on the sample data, if the female labor force (as a percentage of the
male labor force) increases by one percentage point, the number of births per 1000
population is estimated to decrease by 0.131, holding all other variables constant
(in our case only X3).
X3: based on the sample data, if GNP per capita increases by 1000 dollars, the
number of births per 1000 population is estimated to decrease by 0.644, holding X2
constant.
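The "holding the other variable constant" reading can be checked numerically. The following is only a small sketch that plugs the reported coefficients into the fitted equation; the function name `predict_birth_rate` is introduced here for illustration and is not part of the original exam.

```python
# Coefficients as reported in the Model 1 output.
B1, B2, B3 = 34.533, -0.131, -0.644


def predict_birth_rate(x2, x3):
    """Fitted crude birth rate (births per 1000) for given X2 and X3."""
    return B1 + B2 * x2 + B3 * x3


# Sample means taken from the summary statistics table.
mean_x2, mean_x3 = 49.44, 9.22

at_means = predict_birth_rate(mean_x2, mean_x3)
# Change in the fitted value when one predictor rises by one unit
# while the other is held fixed: this reproduces the slope.
effect_x2 = predict_birth_rate(mean_x2 + 1, mean_x3) - at_means
effect_x3 = predict_birth_rate(mean_x2, mean_x3 + 1) - at_means

print(f"fitted value at the means: {at_means:.2f}")
print(f"effect of +1 in X2, X3 fixed: {effect_x2:.3f}")
print(f"effect of +1 in X3, X2 fixed: {effect_x3:.3f}")
```

The fitted value at the sample means can be compared with the sample mean of Y (22.28) as a rough plausibility check.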
b) [1 point] Compute the coefficient of determination, R2, and interpret its meaning
R-squared = ESS/TSS = 1427.26/2533.39 = 0.56
X2 and X3 together explain 56% of the variability of Y. In other terms, using
X2 and X3 to predict the crude birth rate reduces the squared prediction error by
56% relative to using only the mean of Y (Y bar).
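The same ANOVA figures also give the quantities left blank in the model output. This is a sketch under the assumption that n = 26 and k = 3 (constant, X2, X3), as implied by the degrees of freedom in the ANOVA table; the variable names are chosen here for illustration.

```python
import math

ess = 1427.26   # regression (explained) sum of squares
rss = 1106.13   # residual sum of squares
tss = 2533.39   # total sum of squares
n, k = 26, 3    # observations; estimated parameters (const, X2, X3)

# Coefficient of determination: share of total variation explained.
r2 = ess / tss
# Adjusted R-squared: compares mean squares instead of sums of squares.
adj_r2 = 1 - (rss / (n - k)) / (tss / (n - 1))
# Standard error of the regression: square root of the residual mean square.
se_regression = math.sqrt(rss / (n - k))

print(f"R-squared          = {r2:.2f}")
print(f"Adjusted R-squared = {adj_r2:.2f}")
print(f"S.E. of regression = {se_regression:.2f}")
```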
c) Test if the overall regression model is significant using a 0.05 significance level.
Test the overall significance (i.e., validity) of the multiple regression model
using a 5% significance level.
Solution:
[Hypotheses]: H0: β2 = β3 = 0
H1: Not all βj = 0 for j = 2, 3
Or:
H0: β2 = β3 = 0
H1: At least one βj ≠ 0 for j = 2, 3
Or
H0: The model is not significant
H1: The model is significant
The test statistic is
F(k - 1, n - k) = [ESS/(k - 1)] / [RSS/(n - k)] = (1427.26/2) / (1106.13/23) = 14.84
where k = 3 is the number of estimated parameters and n = 26 the number of observations.
Decision Rule
From the F-table, F(0.05, 2, 23) = 3.42. The decision rule is to reject H0 if F > 3.42,
and not to reject H0 if F ≤ 3.42.
The test statistic is F = 14.84 which falls in the rejection region. Reject H0 and
conclude that the model is significant.
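The F test above can be reproduced step by step. In this sketch the critical value 3.42 is simply copied from the F-table quoted in the solution, not computed; the variable names are illustrative.

```python
ess, rss = 1427.26, 1106.13   # regression and residual sums of squares
n, k = 26, 3                  # observations; estimated parameters

msr = ess / (k - 1)           # regression mean square
mse = rss / (n - k)           # residual mean square
f_stat = msr / mse

F_CRIT = 3.42                 # F(0.05, 2, 23), taken from the F-table
reject_h0 = f_stat > F_CRIT

print(f"MSR = {msr:.2f}, MSE = {mse:.2f}, F = {f_stat:.2f}")
print("reject H0" if reject_h0 else "do not reject H0")
```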
d) [2 points] At the 0.05 level of significance, determine whether each independent
variable makes a contribution to the regression model. Indicate the most
appropriate regression model for this set of data.
H0 : βj = 0, j = X2 , X3
H1 : βj ≠ 0, j = X2 , X3
t = (bj-0)/sbj j = X2 , X3
At the 0.05 significance level, reject H0 if t ≥ 2.069 or t ≤ -2.069. Do not reject H0 if
-2.069 < t < 2.069.
The critical value from the t-table is t = 2.069 with 23 degrees of freedom.
Comparing the t statistics in the above table (-1.37 and -3.29) with the critical value:
X2 is not a significant independent variable (|-1.37| < 2.069), while X3 is a
significant independent variable (|-3.29| > 2.069).
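Since X2 is not significant, the most appropriate model for this set of data is the
simple linear regression of Y on X3. Its coefficients must be re-estimated without X2
and generally differ from those of Model 1. From the summary statistics, the slope is
approximately rYX3*sY/sX3 = -0.7381*10.3519/9.55789 = -0.80 and the intercept
approximately 22.28 + 0.80*9.22 = 29.66:
Birth_Rate_hat = 29.66 - 0.80*GNP_pc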
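The t ratios used in this part are each coefficient divided by its standard error. A minimal sketch of the comparison, with the critical value 2.069 copied from the t-table rather than computed:

```python
# (coefficient, standard error) pairs from the Model 1 output.
coefficients = {"X2": (-0.131, 0.09546), "X3": (-0.644, 0.1957)}
T_CRIT = 2.069  # two-sided 5% critical value with 23 degrees of freedom

results = {}
for name, (b, se) in coefficients.items():
    t_ratio = (b - 0) / se                 # test of H0: beta_j = 0
    significant = abs(t_ratio) >= T_CRIT   # two-sided decision rule
    results[name] = (t_ratio, significant)
    verdict = "significant" if significant else "not significant"
    print(f"{name}: t = {t_ratio:.2f} -> {verdict}")
```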
e) Sketch on a single graph the relationship between Y and X3 when X2 = 0. Interpret
the results.
If X2 = 0 the equation is: Birth_Rate_hat = 34.5 - 0.644*GNP_pc
There is a negative relationship between GNP per capita and the crude birth rate. If GNP
were equal to zero the birth rate would be 34.533, but GNP = 0 has no practical meaning.
As GNP increases the birth rate decreases; it would reach 0 at a GNP value of 53.62
(34.533/0.644). Negative values of GNP or of the birth rate cannot be observed.
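To draw the line by hand, a few points and the two intercepts are enough. The sketch below tabulates them from the Model 1 coefficients with X2 set to 0; the grid of GNP values is an arbitrary choice made here for illustration.

```python
B1, B3 = 34.533, -0.644  # constant and X3 slope from Model 1


def birth_rate_at_x2_zero(gnp_pc):
    """Fitted birth rate as a function of GNP per capita when X2 = 0."""
    return B1 + B3 * gnp_pc


# Horizontal intercept: the GNP value at which the fitted line hits zero.
gnp_at_zero = -B1 / B3

# A few points to plot (GNP per capita in thousands of dollars).
points = [(g, birth_rate_at_x2_zero(g)) for g in (0, 10, 20, 30, 40, 50)]

for gnp, rate in points:
    print(f"GNP_pc = {gnp:>2}: Birth_Rate_hat = {rate:.2f}")
print(f"line crosses zero at GNP_pc = {gnp_at_zero:.2f}")
```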
f) [2 points] Find the estimated standardized regression coefficients for the model, and
interpret.
We obtain estimates of the standardized regression coefficients starting from the
partial regression coefficients:
For beta2:
beta2 = b2 * sX2 / sY = -0.131 * 20.2363 / 10.3519 = -0.256
For beta3:
beta3 = b3 * sX3 / sY = -0.644 * 9.55789 / 10.3519 = -0.595
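These standardized regression slopes are comparable regardless of the scales on
which the predictors are measured: a one standard deviation increase in X2 is
associated with a decrease of 0.256 standard deviations in Y, holding X3 constant,
and a one standard deviation increase in X3 with a decrease of 0.595 standard
deviations, holding X2 constant. So it can be stated that X3 (GNP pc), which has the
larger coefficient in absolute value, has the greater relative effect on the value of Y.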
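The rescaling can be written as a one-line function. This is a sketch using the reported coefficients and standard deviations; the helper name `standardized_coefficient` is introduced here for illustration.

```python
def standardized_coefficient(b, s_x, s_y):
    """Rescale a partial slope b to standard-deviation units: b * s_x / s_y."""
    return b * s_x / s_y


S_Y = 10.3519                      # standard deviation of Y
beta2 = standardized_coefficient(-0.131, 20.2363, S_Y)   # X2
beta3 = standardized_coefficient(-0.644, 9.55789, S_Y)   # X3

print(f"beta2 = {beta2:.3f}")
print(f"beta3 = {beta3:.3f}")
# The predictor with the larger absolute standardized slope
# has the greater relative effect on Y.
stronger = "X3" if abs(beta3) > abs(beta2) else "X2"
print(f"larger relative effect: {stronger}")
```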
EXERCISE 2 – Type AD (two points)
A) [1 point]
All of the following are possible effects of multicollinearity EXCEPT:
a) The variances of regression coefficients estimators may be larger than expected
b) The signs of the regression coefficients may be opposite of what is expected
c) A significant F ratio may result even though the t ratios are not significant
d) Removal of one data point may cause large changes in the coefficient estimates
e) The VIF is zero
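As background for this question, the variance inflation factor of predictor j is 1/(1 - Rj2), where Rj2 is the R-squared from regressing that predictor on the others. With only two predictors, Rj2 is the squared correlation between them. The sketch below applies this to the correlation between X2 and X3 reported in Exercise 1; the function name is an illustrative choice.

```python
def vif_two_predictors(r_23):
    """VIF in a model with two predictors: 1 / (1 - r_23^2)."""
    return 1.0 / (1.0 - r_23 ** 2)


# Correlation between X2 and X3 from the Exercise 1 correlation table.
vif = vif_two_predictors(0.5809)
# With uncorrelated predictors the factor takes its minimum value.
vif_uncorrelated = vif_two_predictors(0.0)

print(f"VIF for X2 and X3: {vif:.2f}")
print(f"VIF with uncorrelated predictors: {vif_uncorrelated:.2f}")
```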
B) [1 point] In a multiple regression model what methods are available to identify the most
important explanatory variable?
When independent variables are correlated, as they normally are, determining the
relative importance of the predictor variables is a very complex process. We provide
only a few notes about the problem:
1) The variable with the largest standardized partial regression coefficient
(betaj), in absolute value, is the most important independent variable.
2) Through the decomposition of R-squared into the contributions of each
variable. In the trivariate model we have: R2 = beta2*rY2 + beta3*rY3. The
variable associated with the greatest value of betaj*rYXj is the most important
explanatory variable.
3) Computing the partial correlation coefficients. For example, the larger the
absolute value of rYX2.X3, the stronger the association between Y and X2,
controlling for X3.
4) Comparing the t statistics to see which variable has the greatest one in
absolute value. Note that this method works only if every variable is
statistically significant.
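Methods 2) and 3) can be illustrated with the correlations from Exercise 1. In this sketch the standardized coefficients are derived from the correlation matrix alone, using the standard two-predictor formulas, so they need not coincide exactly with values derived from the Model 1 output. The variable names are illustrative.

```python
import math

# Correlations from the Exercise 1 table.
r_y2, r_y3, r_23 = -0.5222, -0.7381, 0.5809

# Standardized coefficients implied by the correlations (two predictors).
det = 1 - r_23 ** 2
beta2 = (r_y2 - r_y3 * r_23) / det
beta3 = (r_y3 - r_y2 * r_23) / det

# Method 2: decomposition R2 = beta2*rY2 + beta3*rY3.
contrib2 = beta2 * r_y2
contrib3 = beta3 * r_y3
r2 = contrib2 + contrib3

# Method 3: partial correlations of Y with each predictor,
# controlling for the other one.
partial_y2_3 = (r_y2 - r_y3 * r_23) / math.sqrt((1 - r_y3 ** 2) * (1 - r_23 ** 2))
partial_y3_2 = (r_y3 - r_y2 * r_23) / math.sqrt((1 - r_y2 ** 2) * (1 - r_23 ** 2))

print(f"contribution of X2: {contrib2:.3f}, of X3: {contrib3:.3f}, sum: {r2:.3f}")
print(f"r(Y,X2 | X3) = {partial_y2_3:.3f}, r(Y,X3 | X2) = {partial_y3_2:.3f}")
```

Under both criteria the larger value belongs to X3.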