Homework 3 Answer.docx

Q1. Use CPS08.xls for this problem. A detailed description of the data is given in
CPS08_Description.pdf. In this exercise, you will investigate the relationship between a
worker’s age and earnings. (Generally, older workers have more job experience, leading
to higher productivity and earnings.)
a. Report the descriptive statistics of the variables of interest.
b. Run a regression of average hourly earnings (AHE) on age (Age). What is the
estimated intercept? What is the estimated slope? Use the estimated regression
to answer this question: how much do earnings increase as workers age by 1
year?
c. Make a scatterplot of the data with the least-squares regression line.
d. Plot the residuals versus age.
e. If the relationship is linear, there should be no pattern in residuals. Is this true?
Explain why or why not.
f. Bob is a 26-year-old worker. Predict Bob’s earnings using the estimated
regression. Alexis is a 30-year-old worker. Predict Alexis’s earnings using the
estimated regression.
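The mechanics behind parts (b), (d), and (f) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the age and earnings values are made-up placeholders, not rows from CPS08.xls, and the helper names `ols_fit` and `predict` are hypothetical.

```python
# Minimal OLS sketch for Q1. The (age, ahe) pairs are made-up placeholder
# values, NOT data from CPS08.xls; replace them with the real columns.
from statistics import mean

def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

age = [25, 26, 28, 30, 31, 33, 34]                  # hypothetical ages
ahe = [15.0, 17.5, 16.0, 21.0, 19.5, 22.0, 24.5]    # hypothetical hourly earnings

b0, b1 = ols_fit(age, ahe)

def predict(a):
    """Predicted average hourly earnings at age a."""
    return b0 + b1 * a

# Residuals (actual minus fitted), the quantity plotted against age in part (d).
residuals = [y - predict(a) for a, y in zip(age, ahe)]

print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
print(f"predicted AHE at age 26: {predict(26):.2f}, at age 30: {predict(30):.2f}")
```

The slope is the estimated change in hourly earnings for one additional year of age, so the two predictions in part (f) differ by exactly four times the slope.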
Q2. Use COLLDIS.xls for this problem. A detailed description of the data is given in
COLLDIS_Description.pdf. This contains data from a random sample of high school
seniors interviewed in 1980 and re-interviewed in 1986. In this exercise, you will use
these data to investigate the relationship between the number of completed years of
education for young adults and the distance from each student’s high school to the
nearest four-year college. (Proximity to college lowers the cost of education, so that
students who live closer to a four-year college should, on average, complete more years
of higher education.)
a. Run a regression of years of completed education (ed) on distance to the nearest
college (dist), where dist is measured in tens of miles. (For example, dist = 2
means that the distance is 20 miles.) What is the estimated intercept? What is the
estimated slope? Use the estimated regression to answer this question: how
does the average value of years of completed schooling change when colleges
are built close to where students go to high school?
b. Bob’s high school was 20 miles from the nearest college. Predict Bob’s years of
completed education using the estimated regression. How would the prediction
change if Bob lived 10 miles from the nearest college?
c. What is the value of the standard error of the regression? What are the units for
the standard error (meter, grams, years, dollars, cents, or something else)?
d. Beware the confounding variable. List five possible confounding variables. Are
they all measurable?
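For parts (b) and (c), the sketch below shows how a prediction changes with distance and how the standard error of the regression (SER) is computed. The numbers are made-up placeholders rather than values from COLLDIS.xls, and the n - 2 degrees-of-freedom correction is the usual convention for a regression with a single regressor.

```python
# SER sketch for Q2. The dist/ed values are made-up placeholders,
# NOT data from COLLDIS.xls.
from math import sqrt
from statistics import mean

dist = [0.2, 0.5, 1.0, 1.5, 2.0, 3.0, 4.5, 6.0]   # hypothetical, in tens of miles
ed = [16, 14, 15, 13, 14, 12, 13, 12]             # hypothetical, in years

x_bar, y_bar = mean(dist), mean(ed)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(dist, ed))
         / sum((x - x_bar) ** 2 for x in dist))
intercept = y_bar - slope * x_bar

residuals = [y - (intercept + slope * x) for x, y in zip(dist, ed)]
n = len(dist)

# SER = sqrt(SSR / (n - 2)); it carries the units of the dependent
# variable, which here is years of completed education.
ser = sqrt(sum(e ** 2 for e in residuals) / (n - 2))

# Moving from 20 miles (dist = 2) to 10 miles (dist = 1) changes the
# prediction by minus one times the slope.
change = (intercept + slope * 1) - (intercept + slope * 2)

print(f"slope = {slope:.3f}, SER = {ser:.3f} years, change = {change:.3f} years")
```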
Q3. Dataset TUITION.xls shows the in-state undergraduate tuition and required fees for
33 public universities in 2008 and 2011.
a. Plot the data with the 2008 in-state tuition (ln08) on the x-axis and the 2011
tuition (ln11) on the y-axis. Does fitting a linear model seem reasonable?
b. Run the simple linear regression for the relationship described in part (a). Report
the output and state the least-squares regression line.
c. Interpret the slope coefficient.
d. Obtain the residuals and plot them versus the fitted value. Explain in one
sentence what you should look for in this plot and why.
e. Give the null and alternative hypothesis for examining the linear relationship
between 2008 and 2011 in-state tuition amount.
f. Write down the test statistic and P-value for the hypotheses stated in part (e).
State your conclusion.
g. What percent of the variability in 2011 tuition is explained by a linear regression
model using the 2008 tuition? Explain.
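The quantities asked for in parts (f) and (g) can be illustrated as follows. This is a sketch under stated assumptions: the tuition figures are made-up placeholders rather than values from TUITION.xls, the variable names `x08` and `y11` are hypothetical, and the slope standard error uses the homoskedasticity-only formula. The P-value would come from a t distribution with n - 2 degrees of freedom, which the Python standard library does not provide, so it is left to statistical software.

```python
# Slope t-statistic and R^2 sketch for Q3. The tuition figures are made-up
# placeholders, NOT data from TUITION.xls.
from math import sqrt
from statistics import mean

x08 = [5000, 6200, 7100, 8300, 9000, 10400, 11800]    # hypothetical 2008 tuition
y11 = [6100, 7000, 8600, 9500, 10900, 12000, 13900]   # hypothetical 2011 tuition

n = len(x08)
x_bar, y_bar = mean(x08), mean(y11)
sxx = sum((x - x_bar) ** 2 for x in x08)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(x08, y11)) / sxx
intercept = y_bar - slope * x_bar

ssr = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(x08, y11))  # residual SS
tss = sum((y - y_bar) ** 2 for y in y11)                                 # total SS

# Homoskedasticity-only standard error of the slope: SER / sqrt(Sxx).
se_slope = sqrt(ssr / (n - 2)) / sqrt(sxx)

# Test statistic for H0: slope = 0 against H1: slope != 0.
t_stat = slope / se_slope

# Share of the variation in 2011 tuition explained by the 2008 tuition.
r2 = 1 - ssr / tss

print(f"slope = {slope:.4f}, t = {t_stat:.2f}, R^2 = {r2:.4f}")
```

In a regression with a single regressor, R^2 equals the squared sample correlation between the two variables, which offers a quick check on the output.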
Q4. In the 1980s, Tennessee conducted an experiment in which kindergarten students
were randomly assigned to “regular” and “small” classes, and given standardized tests
at the end of the year. (Regular classes contained approximately 24 students and small
classes contained approximately 15 students.) Suppose that, in the population, the
standardized tests have a mean score of 925 points and a standard deviation of 75
points. Let SmallClass denote a binary variable equal to 1 if the student is assigned to a small
class and equal to 0 otherwise. A regression of TestScore on SmallClass yields:
$$\widehat{TestScore} = \underset{(1.6)}{918.0} + \underset{(2.5)}{13.9}\,SmallClass, \qquad R^2 = 0.01, \quad SER = 74.6, \quad \text{No. of obs.} = 3{,}743$$
a. Do small classes improve test scores? By how much? How large is the effect with
respect to the standard deviation? Is the effect large?
b. Is the estimated effect of the class size on test scores statistically significant?
Carry out a test at the 5% level.
c. What is the mean test score for students in the small class? For students in the
large class?
d. Another researcher uses the same data, but regresses TestScore on LargeClass,
a variable that is equal to 1 if the student is assigned to a large class and equal
to 0 otherwise. What are the regression estimates from this regression?
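The arithmetic behind parts (a) through (d) uses only the numbers reported in the regression output. The sketch below assumes the figures in parentheses are standard errors, treats "large class" as the regular class, and uses a normal approximation for the P-value, which seems reasonable with 3,743 observations.

```python
# Arithmetic sketch for Q4, using only the figures reported in the question.
from statistics import NormalDist

b0 = 918.0       # estimated intercept
b1 = 13.9        # estimated coefficient on SmallClass
se_b1 = 2.5      # assumed: standard error of b1 (second figure in parentheses)
pop_sd = 75.0    # population standard deviation of test scores

# (a) Estimated effect of a small class, in points and in standard deviations.
effect_in_sd = b1 / pop_sd

# (b) Two-sided test of H0: coefficient = 0 at the 5% level
#     (normal approximation, critical value 1.96).
t_stat = b1 / se_b1
p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
reject_at_5pct = abs(t_stat) > 1.96

# (c) Group means implied by a regression on a single binary regressor.
mean_regular = b0        # SmallClass = 0
mean_small = b0 + b1     # SmallClass = 1

# (d) With LargeClass = 1 - SmallClass, the intercept becomes the small-class
#     mean and the slope flips sign.
large_intercept = b0 + b1
large_slope = -b1

print(f"effect = {effect_in_sd:.3f} SD, t = {t_stat:.2f}, reject H0: {reject_at_5pct}")
print(f"means: regular = {mean_regular:.1f}, small = {mean_small:.1f}")
print(f"TestScore = {large_intercept:.1f} + ({large_slope:.1f}) LargeClass")
```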