Econometrics

Stata Quiz (2/28/07)
Name:
You have a dataset that covers 300 students in three different high schools. The variables in your
dataset are as follows:
Yi = test score for student i
X1i = average class size for student i
X2i = parental income (in thousands of dollars) for student i
HSi = high school attended by student i
Your job is to figure out the true effect of class size on test scores in each high school using this
data. The relationship may be the same for different high schools or it may be different. For the
purposes of this quiz, suppose that class size and parental income are the only variables that can
affect a student’s test score.
1) Suppose you run the following regression:
. reg y x1
Is the estimated coefficient for x1 an unbiased estimate for the true effect of class size on test
scores? If so, why? If not, why not? Write down the model that this regression is assuming
describes the relationship between class size and the other variables.
Answer:
This regression assumes that the true model is:
$Y_i = \beta_0 + \beta_1 X_{1i} + u_i$
The estimated coefficient would be unbiased if class size were the only variable that affected test
scores, or if class size and parental income were uncorrelated with each other. Unfortunately, the
true model needs to include parental income and parental income is correlated with class size, so
this regression gives you a biased estimate of the effect of class size on test scores.
2) Now suppose you run the following regression:
. reg y x1 x2
Is the estimated coefficient for x1 an unbiased estimate for the true effect of class size on test
scores? If so, why? If not, why not? Write down the model that this regression is assuming
describes the relationship between class size and the other variables.
Answer:
This regression assumes that the true model is:
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i$
All the variables that could affect test scores are included in the model and so we get an unbiased
estimate of the effect of class size on test scores. The true effect of class size on test scores is -2
(and not the -5.4 you found in the previous problem). The regression is telling you that when
average class size goes up by 1, test scores go down by 2 points.
This effect is much smaller than the incorrect estimate you found in problem 1. In problem 1, we
assigned some of the effect of higher parental income to class size, leading us to overestimate the
importance of lower class size in causing better test scores.
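The gap between the two estimates can be written out with the standard omitted-variable decomposition. This is a sketch in notation that is not part of the quiz: $\hat\beta_1^{\text{short}}$ is the slope from problem 1, $\hat\beta_1$ and $\hat\beta_2$ are the slopes from problem 2, and $\hat\delta_1$ is the slope from a regression of $X_{2i}$ on $X_{1i}$.

```latex
\begin{aligned}
\hat\beta_1^{\text{short}} &= \hat\beta_1 + \hat\beta_2\,\hat\delta_1 \\
-5.4 &= -2 + \hat\beta_2\,\hat\delta_1
\quad\Longrightarrow\quad
\hat\beta_2\,\hat\delta_1 = -3.4
\end{aligned}
```

The extra -3.4 is the part of the income effect that problem 1 attributes to class size. It is negative because higher parental income raises test scores while being associated with smaller classes.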
3) Explain how you could figure out the relationship between class size and parental income.
What is the effect of parental income on a student’s average class size?
Answer:
You could run a regression of class size on parental income. The coefficient on parental income in
that regression is an estimate of this effect: when parental income goes up by 1 (which represents
$1,000), average class size goes down by 0.8.
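As a sketch of how this might look in Stata, the do-file below runs the regression on simulated data rather than the quiz dataset. Everything in the data step is an assumption made for illustration (the income range, the noise levels, the seed), and the scalar names are hypothetical. The coefficients 14, -2, 3 and -0.8 are borrowed from the quiz answers. It assumes a Stata version that provides `runiform()` and `rnormal()`.

```stata
* Do-file sketch on simulated data; every number in the data step is an assumption
clear
set seed 20070228
set obs 300
generate x2 = 30 + 40*runiform()             // parental income in thousands of dollars (assumed range)
generate x1 = 70 - 0.8*x2 + rnormal(0, 3)    // class size, built to fall by 0.8 per unit of x2
generate y  = 14 - 2*x1 + 3*x2 + rnormal(0, 5)

* Problem 3: regress class size on parental income
regress x1 x2
scalar b_income = _b[x2]
display "change in class size per unit of income: " b_income

* Link to problems 1 and 2:
* short slope = long slope + (income coefficient) * (slope of x2 on x1)
regress y x1
scalar b_short = _b[x1]
regress y x1 x2
scalar b_long = _b[x1]
scalar b2 = _b[x2]
regress x2 x1
scalar delta = _b[x1]
display "short: " b_short "   long + b2*delta: " (b_long + b2*delta)
```

With the real quiz data only the `regress x1 x2` line is needed. The last display line should show two matching numbers, because that decomposition holds exactly in any sample.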
4) You can use an if statement to look at the variables for any given high school. If you wanted to
graph test scores against class size for just high school 1, you could type
. graph y x1 if hs==1
You can also run a regression using just the data for any given high school. Your job: Run
regressions separately for the different high schools. Is the relationship between test scores and
the other variables the same for different high schools? Is it different? Explain.
Answer:
By running the following three regressions, you will notice that the relationship between test
scores and the other variables is the same for all three high schools.
. reg y x1 x2 if hs==1
. reg y x1 x2 if hs==2
. reg y x1 x2 if hs==3
You should notice that, for each regression, $\hat\beta_0 = 14$, $\hat\beta_1 = -2$, $\hat\beta_2 = 3$.
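The three regressions can also be run with a single `bysort` prefix, and one pooled regression with school interactions gives a formal test of whether the coefficients differ. The do-file below is a sketch under two assumptions: it uses simulated stand-in data (all numbers in the data step are made up for illustration, with the same model in every school), and it uses factor-variable notation, which is not available in older Stata releases.

```stata
* Simulated stand-in data (assumed values): the same model holds in every school
clear
set seed 12345
set obs 300
generate hs = ceil(_n/100)                   // three schools of 100 students each
generate x2 = 30 + 40*runiform()             // parental income in thousands of dollars
generate x1 = 70 - 0.8*x2 + rnormal(0, 3)    // average class size
generate y  = 14 - 2*x1 + 3*x2 + rnormal(0, 5)

* One regression per school without repeating the if condition
bysort hs: regress y x1 x2

* Pooled regression: intercept and both slopes may differ by school
regress y i.hs c.x1 c.x2 i.hs#c.x1 i.hs#c.x2

* Joint test that all school-specific shifts in intercept and slopes are zero
testparm i.hs i.hs#c.x1 i.hs#c.x2
```

A large p-value from `testparm` is consistent with one relationship for all three schools. A small one suggests that at least one school has a different intercept or slope.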