
More exercises for seminars ECON 4150
Problem set 6. Testing simultaneous hypotheses on the regression coefficients - F tests.
Solve the exercises: 8.9 using the file codd.xls, 8.13 using the data in the file cars.xls.
Problem set 7. This is an exercise in demand analysis. You will find the data in the file
table 8-3.xls
Demand analysis:
We are concerned with analysing the demand for beer in a sample of households. Our
sample reports observations on 5 variables: the quantity of beer demanded (QB), the
price of beer (PB), the price of other liquor (PL), a price index of the remaining goods
and services on the households’ budgets (PR), and finally the households’ income (INC).
We are not certain what the appropriate specification of this demand function is, so we will
try different forms. We start with the ln-ln functional form:
(1) $\ln(QB_i) = \beta_0 + \beta_1 \ln(PB_i) + \beta_2 \ln(PL_i) + \beta_3 \ln(PR_i) + \beta_4 \ln(INC_i) + \varepsilon_i$
where $\varepsilon_i$ denotes the random disturbances.
Question A. How will you interpret the regression parameters in this demand function?
Use PcGive and the attached data file table 8-3.xls to estimate this model.
Question B. Do you think the signs of the estimates are reasonable? Substantiate your
assertions. Explain how the numbers in the column of p-values (t-prob) are calculated.
Standard consumer demand theory tells us that if prices and income increase by the same
proportion we should expect no change in the quantity demanded. The consumers are
said to have no money illusion.
Question C. Show that the assumption of no money illusion applied to the demand
function (1) implies the restriction:
(2) $\beta_1 + \beta_2 + \beta_3 + \beta_4 = 0$
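As a hint for Question C, the argument can be sketched as follows. The notation is an assumption: the intercept and the four slope coefficients of (1) are written $\beta_0, \ldots, \beta_4$, and all prices and income are multiplied by the same factor $\lambda > 0$.

```latex
\begin{align*}
\ln QB_i(\lambda)
  &= \beta_0 + \beta_1 \ln(\lambda PB_i) + \beta_2 \ln(\lambda PL_i)
     + \beta_3 \ln(\lambda PR_i) + \beta_4 \ln(\lambda INC_i) + \varepsilon_i \\
  &= \ln QB_i + (\beta_1 + \beta_2 + \beta_3 + \beta_4) \ln \lambda
\end{align*}
```

The quantity demanded is unchanged for every $\lambda > 0$ only if the term multiplying $\ln \lambda$ vanishes, which is the restriction on the sum of the four slope coefficients.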
Question D. Use results and information from your regression to test the hypothesis:
$H_0: \beta_1 + \beta_2 + \beta_3 + \beta_4 = 0$
against
$H_a: \beta_1 + \beta_2 + \beta_3 + \beta_4 \neq 0$
with significance level $\alpha = 0.05$.
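One possible route for Question D is a t test on the sum of the four slope estimates, which needs their estimated covariance matrix from the regression output. The sketch below only illustrates the arithmetic. The function name `t_stat_sum_zero` is hypothetical, and the numbers are invented for illustration; they are not estimates from table 8-3.xls.

```python
import math

def t_stat_sum_zero(coefs, cov):
    """t statistic for H0: sum(coefs) = 0, given the coefficient covariance matrix."""
    total = sum(coefs)
    # Var(sum of the estimates) equals the sum of all elements of the covariance matrix
    var_total = sum(sum(row) for row in cov)
    return total / math.sqrt(var_total)

# Hypothetical slope estimates and covariance matrix, for illustration only
b = [-1.0, 0.2, 0.3, 0.9]
V = [[0.04, 0.0, 0.0, 0.0],
     [0.0, 0.01, 0.0, 0.0],
     [0.0, 0.0, 0.01, 0.0],
     [0.0, 0.0, 0.0, 0.04]]

t = t_stat_sum_zero(b, V)
print(f"t = {t:.3f}")
```

The absolute value of the statistic would then be compared with the 2.5% critical value of the t distribution with n - 5 degrees of freedom, since (1) has five parameters.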
Now we are told that the ln-ln functional form (1) has certain theoretical drawbacks. In
addition to (1) we thus wish to analyse two forms which are linear in the relative price
and the real income.
(3) $QB_i = \beta_1 + \beta_2 (PL_i / PB_i) + \beta_3 (PR_i / PB_i) + \beta_4 (INC_i / PB_i) + u_i$
(4) $QB_i = \beta_0 (1 / PB_i) + \beta_1 + \beta_2 (PL_i / PB_i) + \beta_3 (PR_i / PB_i) + \beta_4 (INC_i / PB_i) + v_i$
where $u_i$ and $v_i$ denote random disturbances.
Use PcGive to estimate the regression equations (3) and (4).
Question E. Which of the two equations (3) or (4) do you think is appropriate for testing
the assumption of no money illusion? State the reason for your choice and explain how
you would test this hypothesis.
Question F. Assume that the number of household members ($N_i$) has been wrongly
excluded from the regressions (3) and (4). Choose one of these regressions for further
study, and explain formally how this misspecification affects the OLS estimates of the
$\beta$'s when:
(i) $N_i$ is uncorrelated with the explanatory variables already used in the equation.
(ii) $N_i$ is correlated with the explanatory variables already used in the equation.
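A compact way to organise the answer to Question F is the standard omitted-variable algebra, sketched here in matrix notation. The notation is an assumption, not taken from the problem set: $y$ is the vector of $QB_i$, $X$ holds the included regressors (with a column of ones), $N$ is the vector of household sizes, and $\delta$ is its coefficient.

```latex
\begin{align*}
\text{true model:}\quad & y = X\beta + \delta N + v \\
\text{OLS without } N:\quad & \hat\beta = (X'X)^{-1}X'y
      = \beta + \delta\,(X'X)^{-1}X'N + (X'X)^{-1}X'v \\
\text{so}\quad & E[\hat\beta \mid X, N] = \beta + \delta\,\hat\pi,
      \qquad \hat\pi = (X'X)^{-1}X'N
\end{align*}
```

Here $\hat\pi$ is the coefficient vector from regressing $N$ on the included variables. In case (i) its slope elements are (close to) zero, so only the intercept absorbs the effect of $N$. In case (ii) each slope estimate is biased by $\delta$ times the corresponding element of $\hat\pi$.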
Problem set 8. The next exercise illustrates the use of dummy variables in regression
analysis.
Gender and Wages
For a sample of Norwegian workers, consisting of 75 women and 75 men, we have
observed the variables: gender, wages per hour, years of education. It is often said that
women are underpaid in the labour market. We wish to investigate this assertion by
applying regression analysis to our data set. We use the following variables in our
analysis: $W_i$, the wage per hour paid to worker $i$; $E_i$, worker $i$'s number of years of
education; and the qualitative variable gender, represented by the dummy variables defined
by:
$DF(i) = \begin{cases} 1 & \text{if worker } i \text{ is a woman} \\ 0 & \text{if worker } i \text{ is a man} \end{cases}$
$DM(i) = \begin{cases} 1 & \text{if worker } i \text{ is a man} \\ 0 & \text{if worker } i \text{ is a woman} \end{cases}$
The first part of the regression analysis is based on the equations:
(1) $W_i = \alpha_F DF(i) + \alpha_M DM(i) + \beta_1 E_i + \varepsilon_i$, $i = 1, 2, \ldots, 150$
(2) $W_i = \alpha_0 + \alpha_1 DF(i) + \beta_1 E_i + \varepsilon_i$, $i = 1, 2, \ldots, 150$
The specification (1) includes both dummy variables besides education, but excludes the
intercept term. The regression equation (2) includes an intercept term, but excludes the
dummy variable for men, DM(i).
The results of these regressions applied to our data set are shown in output (1) and output
(2).
Question 1
Clarify the relations between the parameters $\alpha_F$, $\alpha_M$ in the regression equation (1) and
the parameters $\alpha_0$, $\alpha_1$ in the regression equation (2). Discuss the empirical results as they
are shown in output (1) and in output (2).
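A quick numerical check can support the discussion in Question 1. The sketch below sets DF = 0 (a man) and DF = 1 (a woman) in regression (2) and compares the implied intercepts with the DM and DF coefficients reported for regression (1). The variable names are illustrative; the numbers are copied from output (1) and output (2).

```python
# Estimates copied from output (1) and output (2)
coef_DF_1, coef_DM_1 = 82.3653, 124.549      # regression (1): DF and DM coefficients
const_2, coef_DF_2 = 124.549, -42.1839       # regression (2): Constant and DF coefficients

# In (2), a man has DF = 0 and a woman has DF = 1
implied_men = const_2
implied_women = const_2 + coef_DF_2

print(f"men:   {implied_men:.4f} (output 2) vs {coef_DM_1:.4f} (output 1)")
print(f"women: {implied_women:.4f} (output 2) vs {coef_DF_1:.4f} (output 1)")
```

The two sets of numbers agree up to rounding in the printed output, and the education coefficient, sigma, and RSS are identical in the two outputs.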
Question 2
In regression analyses where dummy variables enter the set of explanatory variables, one
usually prefers specification (2) to specification (1). Explain why.
In order to obtain a more complete picture of the importance of gender for salaries we
have also run the regressions:
(3) $W_i = \alpha_0 + \alpha_1 DF(i) + \beta_1 E_i + \beta_2 (DF(i) E_i) + \varepsilon_i$, $i = 1, 2, \ldots, 150$
where the variable $(DF(i) E_i)$ is the product of the variables $DF(i)$ and $E_i$.
(4) $W_i = \alpha_0 + \beta_1 E_i + \varepsilon_i$, $i = 1, 2, \ldots, 150$
The results of these regressions are shown in output (3) and output (4).
Question 3
Using the results of the regressions that have been run, do you agree or disagree with the assertion
that women are underpaid in the Norwegian labour market? Substantiate your answer.
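One way to substantiate an answer to Question 3 is to compare the nested regressions with F tests built from the reported residual sums of squares. The sketch below is a suggestion, not the only valid approach. The helper `f_stat` is a hypothetical name; the RSS values are copied from outputs (2), (3) and (4).

```python
def f_stat(rss_restricted, rss_unrestricted, n_restrictions, df_unrestricted):
    """F statistic for nested OLS models, computed from residual sums of squares."""
    numerator = (rss_restricted - rss_unrestricted) / n_restrictions
    denominator = rss_unrestricted / df_unrestricted
    return numerator / denominator

rss_2, rss_3, rss_4 = 365632.519, 365392.546, 432057.218   # outputs (2), (3), (4)
n = 150

# H0: gender plays no role (coefficients on DF and DF*E are both zero): (4) against (3)
f_gender = f_stat(rss_4, rss_3, 2, n - 4)
# H0: the return to education is the same for both genders (coefficient on DF*E is zero): (2) against (3)
f_slope = f_stat(rss_2, rss_3, 1, n - 4)

print(f"F(2,146) = {f_gender:.2f}, F(1,146) = {f_slope:.3f}")
```

With a single restriction the F statistic is the square of the corresponding t-value, so the second statistic can be checked against the t-value of (DF E) in output (3).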
Question 4
Use regression (2) (output (2)) to calculate the annual income for a man with 10
years’ education who works 1800 hours per year. Calculate the standard error of the
estimate as a measure of its uncertainty. You are informed that the
covariance between $\hat\alpha_0$ and $\hat\beta_1$ is $-29.082$.
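The calculation in Question 4 can be sketched as below. Two assumptions are made: the stated covariance is taken to be between the estimated constant and the estimated education coefficient (the two estimates a prediction for a man depends on), and the standard error refers to the estimated expected wage, not to the forecast error for an individual worker.

```python
import math

# From output (2): Constant and E coefficients with their standard errors
a0, se_a0 = 124.549, 19.80
b1, se_b1 = 2.65070, 1.535
cov_a0_b1 = -29.082          # covariance given in Question 4 (interpretation assumed)

years, hours = 10, 1800

# A man has DF = 0, so the estimated hourly wage is a0 + b1 * years
hourly = a0 + b1 * years
# Var(a0 + c*b1) = Var(a0) + c^2 * Var(b1) + 2 * c * Cov(a0, b1)
var_hourly = se_a0 ** 2 + years ** 2 * se_b1 ** 2 + 2 * years * cov_a0_b1
se_hourly = math.sqrt(var_hourly)

annual = hours * hourly
se_annual = hours * se_hourly
print(f"annual income: {annual:.0f}, standard error: {se_annual:.0f}")
```

The strong negative covariance matters here: ignoring it would overstate the variance of the estimated hourly wage considerably.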
Question 5
Assume that you exchange the roles of the dummy variables DF(i) and DM(i) in the
specification (2), so that instead of (2) you run the regression:
(2*) $W_i = \alpha_0 + \alpha_1 DM(i) + \beta_1 E_i + \varepsilon_i$
What do you think the output of regression (2*) will look like compared to the output of
regression (2) (output 2)?
Output 1 (Regression (1))
          Coefficient  Std.Error  t-value  t-prob
DF            82.3653      19.28     4.27   0.000
DM            124.549      19.80     6.29   0.000
E             2.65070      1.535     1.73   0.086

sigma                 49.8728   RSS                  365632.519
R^2                  0.174338   F(2,147) = 15.52 [0.000]**
no. of observations       150   no. of parameters             3
mean(timelønn)        135.707   var(timelønn)           2952.24
Output 2 (Regression (2))
          Coefficient  Std.Error  t-value  t-prob
Constant      124.549      19.80     6.29   0.000
DF           -42.1839      8.163    -5.17   0.000
E             2.65070      1.535     1.73   0.086

sigma                 49.8728   RSS                  365632.519
R^2                  0.174338   F(2,147) = 15.52 [0.000]**
no. of observations       150   no. of parameters             3
mean(timelønn)        135.707   var(timelønn)           2952.24
Output 3 (Regression (3))
          Coefficient  Std.Error  t-value  t-prob
Constant      129.266      25.03     5.16   0.000
DF           -54.0309      39.13    -1.38   0.169
E             2.26865      1.973     1.15   0.252
(DF E)       0.976874      3.155    0.310   0.757

sigma                 50.0269   RSS                  365392.546
R^2                   0.17488   F(3,146) = 10.31 [0.000]**
no. of observations       150   no. of parameters             4
mean(timelønn)        135.707   var(timelønn)           2952.24
Output 4 (Regression (4))
          Coefficient  Std.Error  t-value  t-prob
Constant      96.9259      20.66     4.69   0.000
E             3.18752      1.659     1.92   0.057

sigma                 54.0306   RSS                  432057.218
R^2                 0.0243395   F(1,148) = 3.692 [0.057]
no. of observations       150   no. of parameters             2
mean(timelønn)        135.707   var(timelønn)           2952.24
Problem set 9. Households’ expenditures on food.
Problem set 10. Exercises with dummy variables.
Solve the exercises: 9.6 using the data file tuna.xls, 9.8
Problem set 11. This is an exercise with non-linear models.
We are interested in investigating how households’ expenditures on food vary with their
income. We have observations on the households’ expenditures on food (Y) and incomes
(R) for 50 Norwegian households; both variables are measured in 1000 kroner. In
addition, we have observations on the number of members in the households.
For a given income R, we assume that the expected value of Y, denoted $\tilde{Y}$, is given
by the function:
(1) $\tilde{Y} = \frac{\alpha R}{\beta + R}$
where $\alpha$ and $\beta$ denote unknown, positive parameters.
(a) Discuss if, in your opinion, the function (1) gives a good description of the
relation between expenditures on food and income.
Since (1) is non-linear in the parameters, we are unable to estimate the parameters $\alpha$ and
$\beta$ with ordinary least squares (OLS) regression. We shall therefore approximate (1), first with a linear
and then with a quadratic function. The linear approximation of $Y_i$ is given by:
(2) $Y_i = \gamma_1 R_i + \varepsilon_i$, $i = 1, 2, \ldots, 50$
where $\gamma_1 = \alpha / \beta$ and $\varepsilon_i$ denotes random disturbances.
(b) Show that the OLS estimator of $\gamma_1$ is given by:
(3) $\hat\gamma_1 = \frac{\sum_{i=1}^{50} Y_i R_i}{\sum_{i=1}^{50} R_i^2}$
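The estimator in (b) is the OLS slope for a regression through the origin: the sum of the products of Y and R divided by the sum of squared R. The sketch below computes it and checks numerically that nearby slopes do not give a smaller sum of squared residuals. The data are invented for illustration and are not the 50 households of the exercise; the function names are hypothetical.

```python
def slope_through_origin(y, r):
    """OLS slope for the no-intercept model y_i = b * r_i + e_i."""
    return sum(yi * ri for yi, ri in zip(y, r)) / sum(ri * ri for ri in r)

def ssr(b, y, r):
    """Sum of squared residuals for a candidate slope b."""
    return sum((yi - b * ri) ** 2 for yi, ri in zip(y, r))

# Hypothetical incomes and food expenditures (1000 kroner), for illustration only
r = [20.0, 40.0, 60.0, 80.0]
y = [6.0, 10.0, 13.0, 15.0]

b_hat = slope_through_origin(y, r)
# Moving the slope in either direction should not lower the sum of squared residuals
assert ssr(b_hat, y, r) <= min(ssr(b_hat - 0.01, y, r), ssr(b_hat + 0.01, y, r))
print(f"slope = {b_hat:.4f}")
```

The formula itself follows from setting the derivative of the sum of squared residuals with respect to the slope equal to zero.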
Out-print 1 shows the results of OLS applied to regression (2) together with the
histogram of the residuals $\hat\varepsilon_i$.
(c) Comment on this regression, and calculate the Jarque-Bera statistic
when the skewness is S = 0.6332 and the kurtosis is k = 4.5288.
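The Jarque-Bera calculation in (c) can be sketched as follows, using the common textbook form JB = (n/6)(S^2 + (k - 3)^2/4) with n = 50. Some texts replace n by the residual degrees of freedom, so the version used in the course should be checked.

```python
n = 50                        # number of observations
skew, kurt = 0.6332, 4.5288   # skewness and kurtosis given in (c)

# JB = n/6 * (S^2 + (k - 3)^2 / 4); approximately chi-squared with 2 df under normality
jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
chi2_crit_5pct = 5.991        # 5% critical value of the chi-squared(2) distribution

print(f"JB = {jb:.3f}, reject normality at the 5% level: {jb > chi2_crit_5pct}")
```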
In order to improve the approximation of $\tilde{Y}$, we now approximate (1) by a quadratic
function. The quadratic approximation of $Y_i$ is given by:
(4) $Y_i = \gamma_1 R_i + \gamma_2 R_i^2 + \varepsilon_i$, $i = 1, 2, \ldots, 50$
where $\gamma_2 = -\alpha / \beta^2$ and $\varepsilon_i$ are random disturbances.
Out-print 2 shows the results of OLS applied to the regression (4) together with the histogram
of the residuals $\hat\varepsilon_i$. The variable RR in the out-print corresponds to $R^2$ in the regression
equation (4).
(d) Do you think that the results as they appear in this out-print are reasonable?
Substantiate your answer.
(e) Explain how you will use these results to deduce estimates of $\alpha$ and $\beta$, and
calculate $\hat\alpha$ and $\hat\beta$.
(f) Use the expenditure function (1) to derive the Engel elasticity. Use your results
above to calculate this elasticity for a household with income equal to 100 000 kroner.
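Parts (e) and (f) can be worked through numerically as sketched below. The sketch assumes that (1) has the form aR/(b + R), so that the coefficient on R in (4) equals a/b and the coefficient on RR equals -a/b^2, and that the elasticity of (1) with respect to R is therefore b/(b + R). The names `g1`, `g2` and `engel_elasticity` are illustrative.

```python
# Coefficients from out-print 2: R and RR (R squared)
g1, g2 = 0.371463, -0.00213237

# Assuming g1 = alpha/beta and g2 = -alpha/beta^2:
beta_hat = -g1 / g2           # dividing g1 by g2 gives -beta
alpha_hat = g1 * beta_hat     # alpha = g1 * beta

def engel_elasticity(income, beta):
    """Elasticity of alpha*R/(beta + R) with respect to R, which is beta/(beta + R)."""
    return beta / (beta + income)

# Income is measured in 1000 kroner, so 100 000 kroner corresponds to R = 100
elasticity = engel_elasticity(100.0, beta_hat)
print(f"alpha = {alpha_hat:.2f}, beta = {beta_hat:.2f}, elasticity = {elasticity:.3f}")
```

An elasticity below one at this income level is consistent with food being a necessity, which is also relevant for the discussion in (a).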
We suspect that the parameter $\gamma_1$ in equation (4) depends on the number of members in
the household, S, but we are uncertain how to specify this dependency. There are two
proposals:
(5) $\gamma_1 = \delta_0 + \delta_1 S$
or
(6) $\gamma_1 = \delta_0 + \delta_1 S + \eta$
where $\eta$ denotes random disturbances satisfying the usual conditions.
Out-print 3 shows the results of the regression:
(7) $Y_i = \delta_0 R_i + \delta_1 (RS)_i + \gamma_2 R_i^2 + u_i$, $i = 1, 2, \ldots, 50$
where $u_i$ denotes the random disturbances in this regression.
(g) Give your comments on this out-print.
(h) Explain which of (5) and (6) you would choose, and why.
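When commenting on out-print 3, it may help to compare regression (7) with regression (4), which is (7) with the coefficient on RS set to zero. The sketch below computes the F statistic for that single restriction from the reported residual sums of squares, and reproduces the reported sigma as a consistency check. The variable names are illustrative.

```python
import math

rss_4, rss_7 = 1851.05693, 1209.04536   # RSS from out-print 2 and out-print 3
n, k_7 = 50, 3                          # observations and parameters in (7)

# One restriction: the coefficient on RS is zero
f_rs = (rss_4 - rss_7) / (rss_7 / (n - k_7))
# sigma in the out-print is sqrt(RSS / (n - k))
sigma_7 = math.sqrt(rss_7 / (n - k_7))

print(f"F(1,{n - k_7}) = {f_rs:.2f}, sigma = {sigma_7:.5f}")
```

The F statistic is the square of the t-value reported for RS, up to rounding, so both lead to the same conclusion about household size.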
Out-print 1
EQ( 1) Modelling Y by OLS-CS (using data.eksoppg.h2003.xls)
The estimation sample is: 1 to 50
     Coefficient  Std.Error  t-value  t-prob
R       0.233338    0.01760     13.3   0.000

sigma                 6.55912   RSS              2108.08312
no. of observations        50   no. of parameters         1
mean(Y)               12.5025   var(Y)              37.0663

[Figure: density histogram of the residuals r:Y with the N(0,1) density]
Out-print 2
EQ( 2) Modelling Y by OLS-CS (using data.eksoppg.h2003.xls)
The estimation sample is: 1 to 50

     Coefficient  Std.Error  t-value  t-prob
R       0.371463    0.05604     6.63   0.000
RR   -0.00213237  0.0008260    -2.58   0.013

sigma                 6.20997   RSS              1851.05693
no. of observations        50   no. of parameters         2
mean(Y)               12.5025   var(Y)              37.0663

[Figure: density histogram of the residuals r:Y with the N(0,1) density]
Out-print 3
EQ( 3) Modelling Y by OLS-CS (using data.eksoppg.h2003.xls)
The estimation sample is: 1 to 50

     Coefficient  Std.Error  t-value  t-prob
R       0.170144    0.06098     2.79   0.008
RS     0.0623471    0.01248     5.00   0.000
RR   -0.00238358  0.0006765    -3.52   0.001

sigma                 5.07192   RSS              1209.04536
no. of observations        50   no. of parameters         3
mean(Y)               12.5025   var(Y)              37.0663

[Figure: density histogram of the residuals r:Y with the N(0,1) density]
Problem set 12. Regressions when the disturbances might be autocorrelated.
Exercise 12.1, Exercise 12.5, Exercise 12.6, Exercise 12.7