
1. Estimate the α and β coefficients and compare them.
a-
y = β0 + β1 x + u
y* = α0 + α1 x* + u, where x* = 10x and y* = 10y
b-
y = β0 + β1 x + u
y* = α0 + α1 x + u, where y* = y + 5
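A quick numerical check can make the comparison in question 1 concrete. The sketch below is only an illustration: the data are made up, and `ols` is a small helper written for this example, not part of the exercise. It fits the original regression, then the rescaled one (case a) and the shifted one (case b), so the intercepts and slopes can be compared directly.

```python
def ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.7]

base = ols(x, y)                                         # y on x
scaled = ols([10 * v for v in x], [10 * v for v in y])   # case a: 10y on 10x
shifted = ols(x, [v + 5 for v in y])                     # case b: y + 5 on x

print("original:", base)
print("case a  :", scaled)
print("case b  :", shifted)
```

In this sketch the slope is the same in all three fits, the intercept in case a is ten times the original intercept, and the intercept in case b is the original intercept plus 5.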
2. A researcher has international cross-section data on aggregate wages, W, aggregate profits, P, and
aggregate income, Y, for a sample of n countries. By definition: Yi = Wi + Pi
The regressions
Wi = a1 + a2 Yi
Pi = b1 + b2 Yi
are fitted using OLS regression analysis. Show that the regression coefficients will
automatically satisfy the equation a2 + b2 = 1.
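The identity in question 2 can be checked numerically before proving it. The following sketch uses invented figures for Y and W (hypothetical values, not from any data set), builds P from the definition Y = W + P, and fits both regressions with a small illustrative helper `ols`.

```python
def ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical income and wage figures for six countries.
Y = [120.0, 95.0, 210.0, 160.0, 75.0, 185.0]
W = [70.0, 60.0, 115.0, 100.0, 40.0, 120.0]
P = [yi - wi for yi, wi in zip(Y, W)]   # profits follow from Y = W + P

a1, a2 = ols(Y, W)   # W on Y
b1, b2 = ols(Y, P)   # P on Y

print("a2 + b2 =", a2 + b2)
print("a1 + b1 =", a1 + b1)
```

The slopes sum to one because the covariance of Y with W plus the covariance of Y with P equals the variance of Y. In this sketch the intercepts also sum to zero, for the same reason.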
1. A researcher has annual data on demand for labour, L, aggregate output in current prices, Y,
average wages at current prices, W, and a general price index, P, for the manufacturing sector of a
certain industrialized country for the period 1973–2002. L is measured as the average number of
workers employed. Y and P are measured as index numbers equal to 100 in the year 2000. He fits
the following regression (standard errors in parentheses; RSS is residual sum of squares):
Log L = –3.12 + 0.42 log Y – 0.34 log W – 0.11 log P     R2 = 0.96     (1)
(0.13) (0.09) (0.10) (0.06)
He next regresses L on real output, Y/P, and real wages, W/P:
Log L = –2.56 + 0.46 log Y/P – 0.32 log W/P     R2 = 0.90     (2)
(0.13) (0.07) (0.07)
(a) Give an economic interpretation and statistical tests of the slope coefficients in equation (1).
(b) Is the equation (1) significant as a whole?
(c) Explain why the second specification is a restricted version of the first, stating the restriction.
(d) Perform an F test of the restriction.
(e) Assuming that the restriction is valid, explain why regression (2) is preferable to regression (1).
(g) Regression (2) has a worse fit than regression (1). Does this matter?
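Parts (b) and (d) can be worked from the reported R2 values alone, because both equations have the same dependent variable. The sketch below assumes n = 30 annual observations (1973 to 2002), four parameters in equation (1), and a single restriction linking equation (2) to equation (1). The variable names are chosen for this example.

```python
n = 30            # annual observations, 1973-2002 (assumed from the stated period)
k_unres = 4       # parameters in equation (1): intercept plus three slopes
r2_unres = 0.96   # R2 of equation (1)
r2_res = 0.90     # R2 of equation (2)
restrictions = 1  # assumed: one linear restriction on the coefficients

df_unres = n - k_unres

# (b) F test of the joint significance of equation (1).
f_overall = (r2_unres / (k_unres - 1)) / ((1 - r2_unres) / df_unres)

# (d) F test of the restriction, using the R2 form of the statistic.
f_restriction = ((r2_unres - r2_res) / restrictions) / ((1 - r2_unres) / df_unres)

print(f"F overall     = {f_overall:.1f} with ({k_unres - 1}, {df_unres}) df")
print(f"F restriction = {f_restriction:.1f} with ({restrictions}, {df_unres}) df")
```

Each statistic is then compared with the critical value of the F distribution for the degrees of freedom shown.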
1. A researcher wants to estimate a Cobb-Douglas type production function (Q = A*C^β1*L^β2) using
data on 30 firms, consisting of output (Q), capital (C) and labor (L).
a. The equation is estimated using all the data. Interpret the coefficients of variables C and L.
Ln(Qi) = 3.26 + 0.75*Ln(Ci) + 0.20*Ln(Li)
Std.Err.        (0.20)        (0.05)
R2 = 0.85
RSS = 30.18
b. The researcher respecifies the above equation under an assumption and estimates the
second equation below. Clearly state the assumption made and check whether the data support this
assumption.
Ln(Qi/Ci) = 2.15 + 0.70*Ln(Li/Ci)
RSS = 35.76
c. The equation in part a is estimated using only the 15 firms with the highest production
levels, and then again using only the 15 firms with the lowest production levels.
Do these equations have different coefficients?
Ln(Qi) = 1.62 + 0.80*Ln(Ci) + 0.15*Ln(Li)
Std.Err.        (0.30)        (0.07)
R2 = 0.91
RSS = 17.84
Ln(Qi) = 2.82 + 0.60*Ln(Ci) + 0.35*Ln(Li)
Std.Err.        (0.20)        (0.07)
R2 = 0.88
RSS = 9.15
d. 10 of the 40 firms are international firms and the others are not. To measure the effect of
internationality, the researcher adds a dummy variable to the model (D = 1 if the firm is
international, 0 otherwise). The new equation is estimated. Does internationality have a
significant effect on the production function? If yes, explain its effects.
Ln(Qi) = 3.26 + 0.65*Ln(Ci) + 0.30*Ln(Li) + 0.50*Di + 0.15*Di*Ln(Ci) - 0.10*Di*Ln(Li)
Std.Err.        (0.25)        (0.05)        (0.35)        (0.06)        (0.08)
RSS = 25.54
e. Check for heteroscedasticity in the equation given in part a (use the equations given in part c).
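Parts b, c and e all reduce to ratios of residual sums of squares. The sketch below is one possible way to set them up, under these assumptions: the respecified equation in part b imposes one restriction, part c is treated as a Chow test, and part e is treated as a Goldfeld-Quandt style variance ratio of the two subsample regressions. The helper `f_restriction` is written for this example.

```python
def f_restriction(rss_r, rss_u, q, df_u):
    """F statistic for q restrictions: ((RSS_R - RSS_U) / q) / (RSS_U / df_U)."""
    return ((rss_r - rss_u) / q) / (rss_u / df_u)

n, k = 30, 3   # 30 firms; intercept plus two slope coefficients

# Part b: respecified (restricted) equation against the equation in part a.
f_b = f_restriction(rss_r=35.76, rss_u=30.18, q=1, df_u=n - k)

# Part c: Chow test. The pooled regression is the restricted model and the two
# subsample regressions together form the unrestricted model.
rss_split = 17.84 + 9.15
f_c = f_restriction(rss_r=30.18, rss_u=rss_split, q=k, df_u=n - 2 * k)

# Part e: ratio of the subsample residual variances, larger over smaller.
# Each subsample has 15 - 3 = 12 degrees of freedom.
df_sub = 15 - k
f_e = (17.84 / df_sub) / (9.15 / df_sub)

print(f"part b: F(1, {n - k}) = {f_b:.3f}")
print(f"part c: F({k}, {n - 2 * k}) = {f_c:.3f}")
print(f"part e: F({df_sub}, {df_sub}) = {f_e:.3f}")
```

Each statistic is compared with the critical F value for its degrees of freedom. Whether a Goldfeld-Quandt ratio is what part e intends is an assumption here, since the subsamples are split by output level.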
2. To explain the sales of a durable good producer, the following regression equation was estimated.
SQ = sales, PC = price of the main competitor, PQ = price of the producer, Y = income, C =
consumption, N = the number of stores that the firm operates. The estimated standard errors of the
coefficients are given in parentheses.
SQ = -7.2 + 200.3 PC – 150.6 PQ + 20.6 Y – 15.8 C + 201.1 N
          (250.1)    (125.6)    (40.1)   (10.6)   (103.8)
R2 = 0.73
n = 26
i- Are the coefficients significant?
ii- Test the significance of the equation.
iii- Compare the results of i. and ii. What is the possible problem with the equation? Rearrange
the estimated equation in order to solve the problem.
iv- If the variable Y is dropped from the equation, calculate the new R2 of the equation.
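The arithmetic for parts i, ii and iv can be sketched as follows. It assumes the five standard errors belong to the five slope coefficients in the order they are printed (PC, PQ, Y, C, N). For part iv it uses the fact that the F statistic for dropping a single variable equals the square of that variable's t statistic, which lets the restricted R2 be recovered.

```python
coefs = {"PC": 200.3, "PQ": -150.6, "Y": 20.6, "C": -15.8, "N": 201.1}
# Assumed pairing of standard errors with slopes, in printed order.
std_errs = {"PC": 250.1, "PQ": 125.6, "Y": 40.1, "C": 10.6, "N": 103.8}

n, k, r2 = 26, 6, 0.73   # k counts the intercept and five slopes
df = n - k

# i. t statistics of the slope coefficients.
t_stats = {name: coefs[name] / std_errs[name] for name in coefs}

# ii. F statistic for the joint significance of the equation.
f_stat = (r2 / (k - 1)) / ((1 - r2) / df)

# iv. t_Y^2 = (R2 - R2_new) / ((1 - R2) / df), solved for R2_new.
r2_without_y = r2 - t_stats["Y"] ** 2 * (1 - r2) / df

for name, t in t_stats.items():
    print(f"t({name}) = {t:.3f}")
print(f"F({k - 1}, {df}) = {f_stat:.2f}")
print(f"R2 without Y = {r2_without_y:.4f}")
```

The t statistics are compared with the critical t value for 20 degrees of freedom and the F statistic with the critical F(5, 20) value.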
1. From the household budget survey of 1980 of the Dutch Central Bureau of Statistics, J. S. Cramer
obtained the following logit model based on a sample of 2820 households. (The results given here
are based on the method of maximum likelihood and are after the third iteration.) The purpose of
the logit model was to determine car ownership as a function of (logarithm of) income.
Car ownership was a binary variable: Y = 1 if a household owns a car, zero otherwise.
Zi = −2.772 + 0.347 ln Income
t = (−3.35) (4.05)
where Zi = estimated logit and where ln Income is the logarithm of income.
a) Interpret the estimated logit model.
b) What is the probability that a household with an income of 20,000 will own a car?
c) What is the effect of income increases on the probability of car ownership? (assume income
level 25,000)
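Parts b) and c) follow from the logistic transformation P = 1 / (1 + exp(-Z)). The sketch below assumes that "ln" is the natural logarithm and that income is entered in the same units as in the estimated model. The function name `car_probability` is chosen for this example.

```python
import math

def car_probability(income):
    """Probability of car ownership implied by the estimated logit."""
    z = -2.772 + 0.347 * math.log(income)
    return 1.0 / (1.0 + math.exp(-z))

# b) Probability at an income of 20,000.
p_20k = car_probability(20000)

# c) Marginal effect at an income of 25,000.
p_25k = car_probability(25000)
dp_dlog_income = 0.347 * p_25k * (1 - p_25k)   # change in P per unit of ln(income)
dp_dincome = dp_dlog_income / 25000            # change in P per unit of income

print(f"P(car | income = 20,000) = {p_20k:.3f}")
print(f"P(car | income = 25,000) = {p_25k:.3f}")
print(f"dP/d ln(income) at 25,000 = {dp_dlog_income:.4f}")
print(f"dP/d income at 25,000 = {dp_dincome:.2e}")
```

Because the regressor is the logarithm of income, the effect of a one-unit change in income shrinks as income rises, which is why the derivative is divided by the income level.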
3. The wages of workers employed in the textile industry are examined in order to explain
the effect of labor union membership. The estimated standard errors of the coefficients are given in
parentheses.
Ln W = -1.40 + 0.30 A – 0.003 A^2 + 0.06 S + 0.14 U
             (0.10)   (0.001)     (0.01)   (0.05)
n = 30
R2 = 0.14
W = hourly wage
A = age of worker
S = education level (in years)
U = dummy variable (1 if the worker is a union member, 0 otherwise)
i. Interpret the effect of union membership on wages. Is it significant?
ii. What is the effect of education level on wages? Is it significant?
iii. What is the value of the adjusted R2?
iv. Holding the other variables fixed, at what age do wages reach their maximum level?
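The calculations behind parts i to iv can be sketched as follows. The sketch assumes the four standard errors belong to A, A squared, S and U in that order, and that k = 5 parameters (including the intercept) are estimated. The variable names are chosen for this example.

```python
import math

n, k, r2 = 30, 5, 0.14

# i. Union effect: t statistic and the implied proportional wage difference.
t_union = 0.14 / 0.05
union_premium = math.exp(0.14) - 1   # exact effect in a log-linear model

# ii. Education: t statistic; the coefficient is roughly a 6% wage rise per year.
t_education = 0.06 / 0.01

# iii. Adjusted R2.
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)

# iv. Age at which ln W peaks: set d(ln W)/dA = 0.30 - 2 * 0.003 * A to zero.
peak_age = 0.30 / (2 * 0.003)

print(f"t(U) = {t_union:.2f}, union premium = {union_premium:.1%}")
print(f"t(S) = {t_education:.2f}")
print(f"adjusted R2 = {adj_r2:.4f}")
print(f"wage-maximising age = {peak_age:.0f}")
```

The t statistics are compared with the critical t value for n - k = 25 degrees of freedom.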
2. The output below shows the result of regressing the weight of the respondent in 1985, measured in
pounds, on his or her height, measured in inches.
a) Provide an interpretation of the regression coefficients. [5 marks]
b) State the hypotheses and test the significance of the regression coefficients. [5 marks]
c) Calculate the 95 percent confidence interval for the coefficient of height. [5 marks]
d) Interpret the R2. [5 marks]
e) Calculate the F statistic. [5 marks]
f) If the height of a man is 70 inches, what is the estimated weight? [5 marks]
. reg weight85 height

  Source |       SS       df       MS              Number of obs =     550
---------+------------------------------           F(  1,   548) =       ?
   Model |  245463.095     1  245463.095           Prob > F      =  0.0000
Residual |  392166.897   548  715.633025           R-squared     =  0.3850
---------+------------------------------           Adj R-squared =  0.3838
   Total |  637629.993   549  1161.43897           Root MSE      =  26.751

------------------------------------------------------------------------------
weight85 |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
  height |   5.399304   .2915345        ?        ?                  ?
   _cons |  -210.1883   19.85925   -10.584    0.000    -249.1979   -171.1788
------------------------------------------------------------------------------
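The missing entries in the output can be recovered from the figures that are shown. In the sketch below, the critical value `t_crit` is an assumed approximation to the 97.5th percentile of the t distribution with 548 degrees of freedom, taken from tables rather than computed. The variable names are chosen for this example.

```python
coef_height, se_height = 5.399304, 0.2915345
coef_cons = -210.1883
df_resid = 548

# t statistic for the height coefficient.
t_height = coef_height / se_height

# 95% confidence interval. t_crit is an assumed table value (about 1.964
# for 548 degrees of freedom, close to the normal value 1.96).
t_crit = 1.964
ci_low = coef_height - t_crit * se_height
ci_high = coef_height + t_crit * se_height

# F statistic: model mean square over residual mean square.
f_stat = 245463.095 / 715.633025

# Predicted weight for a height of 70 inches.
weight_at_70 = coef_cons + coef_height * 70

print(f"t(height) = {t_height:.3f}")
print(f"95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
print(f"F(1, {df_resid}) = {f_stat:.2f}")
print(f"predicted weight at 70 inches = {weight_at_70:.2f} pounds")
```

With a single regressor, the F statistic equals the square of the slope's t statistic, which gives a useful consistency check on the two values.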
3. Using 30 observations, the equation y = α + βx + u was estimated by OLS. Find the missing values.
Variable    Coefficient    Standard Error    t-value
Constant    26.034         1.741             14.955
X           0.137          0.028             ? [5 marks]
Mean of X = 54.478
Mean of Y = ? [5 marks]
R2 = ? [5 marks]
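The three missing values follow from standard results for simple regression: the t value is the coefficient divided by its standard error, the fitted line passes through the point of sample means, and with one regressor R2 = t^2 / (t^2 + n - 2). The sketch below assumes that 54.478 is the sample mean of X. The variable names are chosen for this example.

```python
n = 30
b0 = 26.034              # estimated constant
b1, se_b1 = 0.137, 0.028 # estimated slope and its standard error
x_bar = 54.478           # assumed to be the sample mean of X

# t value of the slope coefficient.
t_x = b1 / se_b1

# The OLS line passes through (x_bar, y_bar).
y_bar = b0 + b1 * x_bar

# In a simple regression, F = t^2 and R2 = F / (F + n - 2).
r2 = t_x ** 2 / (t_x ** 2 + (n - 2))

print(f"t value of X = {t_x:.3f}")
print(f"mean of Y = {y_bar:.3f}")
print(f"R2 = {r2:.3f}")
```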
2. A researcher has a sample of 43 observations on a dependent variable, Y, and two potential
explanatory variables, X and Z. He defines two further variables V and W as the sum of X and Z
and the difference between them:
Vi=Xi+Zi
Wi=Xi-Zi
He fits the following four regressions
(1) A regression of Y on X and Z
(2) A regression of Y on V and W
(3) A regression of Y on V
(4) A regression of Y on Z and V
The table shows the regression results (standard errors in parentheses; RSS = residual sum of
squares; there was an intercept, not shown, in each regression). Unfortunately, a goat ate part of the
regression output and some of the numbers are missing. These are indicated by letters.
              (1)       (2)       (3)       (4)
X             0.60      —         —         —
             (0.04)
Z             0.80      —         —         H
             (0.04)                        (I)
V             —         A         0.72      J
                       (B)       (0.02)    (K)
W             —         C         —         —
                       (D)
R2            0.60      E         G         L
RSS           200       F         220       M
a) Reconstruct each missing number if this is possible, giving a brief explanation. Detailed
mathematical analysis is not required. If the calculation is too complicated to do without a
calculator, you may instead earn full marks by indicating how the missing value should be
calculated. If it is not possible to reconstruct a number, give a brief explanation.
[2 marks each: A B C D E F H J L M,
5 marks: K,
10 marks each: G I]
b) The correlation between X and Z was high. That between V and W was low. Explain the
implications, if any, for a comparison of the regression results for specifications (1), (2), and (3)
(i) making no assumption concerning the true values of the coefficients of X and Z in
specification (1) [5 marks]
(ii) assuming that the true coefficients of X and Z are the same. [5 marks]
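Several of the missing numbers follow from the fact that specifications (2) and (4) are reparameterisations of specification (1), using X = (V + W)/2, Z = (V - W)/2 and X = V - Z. The sketch below is one possible line of reasoning, not a model answer. It leaves out B and D, and the dictionary `answers` is just a container chosen for this example.

```python
import math

n = 43
b_x, b_z, se_x = 0.60, 0.80, 0.04   # from specification (1)
r2_1, rss_1 = 0.60, 200.0
rss_3 = 220.0

answers = {}

# (2): Y = b_x*X + b_z*Z rewritten in terms of V = X + Z and W = X - Z.
answers["A"] = (b_x + b_z) / 2
answers["C"] = (b_x - b_z) / 2
answers["E"], answers["F"] = r2_1, rss_1   # same fit as specification (1)

# (4): substitute X = V - Z, so Y = b_x*V + (b_z - b_x)*Z.
answers["J"] = b_x
answers["H"] = b_z - b_x
answers["K"] = se_x                         # J is the same estimator as the X coefficient
answers["L"], answers["M"] = r2_1, rss_1    # same fit as specification (1)

# (3): same dependent variable, so TSS can be recovered from specification (1).
tss = rss_1 / (1 - r2_1)
answers["G"] = 1 - rss_3 / tss

# I: H = 0 is the restriction that turns (1) into (3), so the t statistic on H
# is the square root of the F statistic comparing the two residual sums of squares.
f_equal = ((rss_3 - rss_1) / 1) / (rss_1 / (n - 3))
answers["I"] = abs(answers["H"]) / math.sqrt(f_equal)

for letter in sorted(answers):
    print(letter, round(answers[letter], 4))
```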