Problem set #2

Multiple Choice
1) When the estimated slope coefficient in the simple regression model, $\hat{\beta}_1$, is zero, then
a. There is a strong relationship between X and Y
b. There is a moderate relationship between X and Y
c. Y cannot be explained by X.
d. The intercept is zero
2) In the simple linear regression model, the regression slope
a. indicates by how many percent Y increases, given a one percent increase in X.
b. when multiplied with the explanatory variable will give you the predicted Y.
c. indicates by how many units Y increases, given a one unit increase in X.
d. represents the elasticity of Y on X.
3) The OLS estimator is derived by
a. connecting the Yi corresponding to the lowest Xi observation with the Yi corresponding to the
highest Xi observation.
b. making sure that the standard error of the regression equals the standard error of the slope
estimator.
c. minimizing the sum of absolute residuals.
d. minimizing the sum of squared residuals.
4) Interpreting the intercept in a sample regression function is
a. not reasonable because you never observe values of the explanatory variables around the
origin (x = 0).
b. reasonable because under certain conditions the estimator is BLUE.
c. reasonable if your sample contains values of Xi around the origin.
d. not reasonable because economists are interested in the effect of a change in X on the change
in Y.
5) The OLS residuals
a. can be calculated using the coefficient of determination from the regression function.
b. can be calculated by subtracting the fitted values from the actual values.
c. are unknown since we do not know the population regression function.
d. should not be used in practice since they indicate that your regression does not run through
all your observations
6) When the estimated slope coefficient in the simple regression model, $\hat{\beta}_1$, is zero, then
a. $R^2 = \bar{Y}$.
b. $0 < R^2 < 1$.
c. $R^2 = 0$.
d. $R^2 > (SSR/TSS)$.
7) The regression R2 is defined as follows:
a. USS/TSS
b. ESS/TSS
c. 1 - USS/TSS
d. two of the above
8) Which of the following statements is correct?
a. TSS = ESS + USS
b. TSS = ESS + SST
c. ESS > TSS
d. R2 = 1 - (ESS/USS)
9) The regression R2 is a measure of
a. whether or not X causes Y.
b. the goodness of fit of your regression line.
c. whether or not ESS > TSS.
d. the square of the determinant of R.
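The quantities that questions 3 and 5 to 9 refer to can be computed by hand or in a few lines of code. The following is a minimal sketch on made-up data: the lists `x` and `y` and the helper name `ols_simple` are hypothetical and not part of this problem set. It follows the notation used here, with USS for the unexplained (residual) sum of squares.

```python
# Hypothetical data, for illustration only (not from the problem set).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def ols_simple(x, y):
    """Return (intercept, slope) that minimize the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0, b1 = ols_simple(x, y)

# Fitted values and residuals (actual minus fitted).
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Sums of squares in the notation of this problem set.
y_bar = sum(y) / len(y)
tss = sum((yi - y_bar) ** 2 for yi in y)       # total
ess = sum((fi - y_bar) ** 2 for fi in fitted)  # explained
uss = sum(e ** 2 for e in residuals)           # unexplained

r2 = ess / tss
print(f"slope={b1:.3f} intercept={b0:.3f} R2={r2:.3f}")
print(f"TSS={tss:.3f} ESS={ess:.3f} USS={uss:.3f}")
```

With an intercept in the model, the two expressions ESS/TSS and 1 - USS/TSS give the same number, which is a quick way to check a hand calculation.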
Analytical questions
1. Consider the regression model $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, where $Y_i$ denotes average test scores from
fifth-grade classes (TS, measured in percentages) and $X_i$ denotes the data on fifth-grade class
size (CS). You collected data on 25 classes with class sizes ranging from 25 to 36
and obtained the following measures:
$\sum X = 75$, $\sum Y = 50$, $\sum X^2 = 625$, $\sum Y^2 = 228$, $\sum XY = 30$
a. What does the term $\varepsilon_i$ represent?
b. Find the estimated regression equation and interpret it fully.
c. Find the R2 and interpret the value. What are the units of measurement for the R2?
d. A classroom has 22 students. What is the regression prediction for the classroom’s average test
score? Is this prediction reliable? Why or why not?
e. The sample average class size across the 25 classrooms is 5. What is the average test score
across the 25 classrooms?
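When only summary sums are available, as in question 1, the OLS estimates can be obtained without the raw data. The sketch below is one way to organize that arithmetic. The function name `ols_from_sums` and its parameter names are hypothetical; the inputs in the usage line are the sums stated in question 1.

```python
def ols_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Simple-regression estimates computed from summary sums.

    Returns (intercept, slope, r2).
    """
    x_bar = sum_x / n
    y_bar = sum_y / n
    # Sums of squares and cross-products in deviation form.
    s_xx = sum_x2 - n * x_bar ** 2
    s_yy = sum_y2 - n * y_bar ** 2  # this is TSS
    s_xy = sum_xy - n * x_bar * y_bar
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    r2 = s_xy ** 2 / (s_xx * s_yy)
    return intercept, slope, r2


# Usage with the sums given in question 1 (n = 25 classes).
intercept, slope, r2 = ols_from_sums(
    n=25, sum_x=75, sum_y=50, sum_x2=625, sum_y2=228, sum_xy=30
)
print(intercept, slope, r2)
```

The deviation-form identities used here, such as s_xx = ΣX² - n·X̄², are algebraically the same as summing squared deviations from the mean.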
2. Sir Francis Galton, a cousin of Charles Darwin, examined the relationship between the height of
children and their parents towards the end of the 19th century. It is from this study that the name
“regression” originated. You decide to update his findings by collecting data from 110 college
students, and estimate the following relationship:
Studenth = 19.6 + 0.73×Midparh, R2 = 0.45, Se = 2.0
where Studenth is the height of students in inches, and Midparh is the average of the parental heights.
(Following Galton’s methodology, both variables were adjusted so that the average female height was
equal to the average male height.) Se is the standard error of the regression (SER).
(a) Interpret the estimated equation. Is the estimated intercept meaningful? Why or why not?
(b) What is the meaning of the R-squared value in this problem?
(c) Given the positive intercept and the fact that the slope lies between zero and one, what can you say
about the height of students who have quite tall parents? Who have quite short parents?
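To explore part (c) numerically, the estimated equation can be evaluated at a few parental heights. This is a rough sketch: the function name `predict_student_height` and the three trial heights are hypothetical choices for illustration, while the coefficients 19.6 and 0.73 come from the estimated equation above.

```python
INTERCEPT = 19.6  # estimated intercept, in inches
SLOPE = 0.73      # estimated slope


def predict_student_height(midparh):
    """Predicted student height (inches) for a given midparent height."""
    return INTERCEPT + SLOPE * midparh


# Midparent height at which the prediction equals the parents' own height.
crossover = INTERCEPT / (1 - SLOPE)

# Hypothetical midparent heights: short, middling, and tall.
for midparh in (64.0, 70.0, 76.0):
    predicted = predict_student_height(midparh)
    print(f"Midparh={midparh:.1f}  predicted Studenth={predicted:.2f}")

print(f"crossover height = {crossover:.2f} inches")
```

Comparing each prediction with the corresponding midparent height, above and below the crossover, shows the pattern that part (c) asks about.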
Q.3
In a simple regression and correlation analysis based on 72 observations, we find r = 0.8 and Se = 10.
(a) Find the amount of unexplained variation (Residual Sum of Squares).
(b) Find the proportion of unexplained variation to the total variation.
(c) Find the total variation of the dependent variable (TSS).
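The three parts of Q.3 are linked by two identities. The sketch below assumes that Se is computed with n - 2 degrees of freedom, so that Se² = USS/(n - 2), and that R2 equals r² in a simple regression. The function name `variation_from_r_and_se` is hypothetical.

```python
def variation_from_r_and_se(n, r, se):
    """Recover USS, the unexplained share, and TSS from n, r and Se.

    Assumes Se**2 = USS / (n - 2), the usual simple-regression convention.
    """
    uss = se ** 2 * (n - 2)            # unexplained variation
    unexplained_share = 1 - r ** 2     # USS / TSS = 1 - R2
    tss = uss / unexplained_share      # total variation
    return uss, unexplained_share, tss


# Usage with the values stated in Q.3.
uss, unexplained_share, tss = variation_from_r_and_se(n=72, r=0.8, se=10)
print(uss, unexplained_share, tss)
```

If a course or textbook defines Se with a different divisor, the first line of the function would change accordingly and the other two steps would stay the same.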
Q.4. From the text book
Question 4.3 (a) and (d)
Question 4.4
Q.5
SCENARIO: Suppose you are interested in learning about the determinants of college GPA for VIU students.
Being aware of the common factors, you wish to determine the effect of skipping classes on GPA. You have
collected a random sample of 100 VIU students and have recorded their ages (Age_i, in years), college GPA
(colgpa_i), high school GPA (hsgpa), provincial achievement exam score (Prov), average lectures missed per
week (Skipped), and the average number of days per week of alcohol consumption (Alcohol).
The statistical analyses you perform are given in the Exhibits below.
Based on the regression results,
a. What is the verbal interpretation of the estimated regression equation?
b. What is the predicted college GPA of a randomly selected student who skipped 6 classes? Is this prediction reliable?
Why or why not?
c. Estimate R2 for the regression and interpret your result.
d. Based on the computer output, find TSS, ESS and USS.
DESCRIPTIVE STATISTICS
                    AGE        ALCOHOL    COLGPA     HSGPA      PROV       SKIPPED
Mean                20.88652   1.901064   3.056738   3.402128   24.15603   1.076241
Median              21.00000   2.000000   3.000000   3.400000   24.00000   1.000000
Maximum             30.00000   7.000000   4.000000   4.000000   33.00000   5.000000
Minimum             19.00000   0.000000   2.200000   2.400000   16.00000   0.000000
Std. Dev.           1.271064   1.374701   0.372310   0.319926   2.844252   1.088882
Skewness            3.229636   0.981049   0.324620   -0.310957  0.062098   1.234972
Sum                 2945.000   268.0500   431.0000   479.7000   3406.000   151.7500
Sum Sq. Deviation   226.1844   264.5723   19.40610   14.32936   1132.567   165.9929
Observations        100        100        100        100        100        100
CORRELATION MATRIX
           AGE        ALCOHOL    COLGPA     HSGPA      PROV       SKIPPED
AGE        1.000000
ALCOHOL    -0.022005  1.000000
COLGPA     -0.019504  0.017187   1.000000
HSGPA      -0.259368  -0.045805  0.414555   1.000000
PROV       -0.082002  0.169029   0.206754   0.345806   1.000000
SKIPPED    -0.077569  0.337670   -0.261820  -0.089662  0.115485   1.000000
REGRESSION OUTPUT
Dependent Variable: COLGPA
Method: Least Squares
Date: 01/18/06 Time: 11:51
Sample: 1: 100
Included observations: 100

Variable   Coefficient  Std. Error  t-Statistic  Prob.
C          3.153084     0.042775    73.71302     0.0000
SKIPPED    -0.089521    0.027990    -3.198383    0.0017

R-squared                         Mean dependent var      3.056738
Adjusted R-squared    0.061849    S.D. dependent var      0.372310
S.E. of regression                Akaike info criterion   0.812061
Sum squared resid     18.07582    Schwarz criterion       0.853887
Log likelihood        -55.25029   F-statistic             10.22965
Prob(F-statistic)     0.001712    Durbin-Watson stat      1.983121
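Parts b to d of Q.5 can be worked from the exhibits alone. The following sketch shows one way to do so, under two assumptions: that "Sum squared resid" in the regression output is USS, and that the "Sum Sq. Deviation" entry for COLGPA in the descriptive statistics is TSS. The constant and function names are hypothetical; the numbers are copied from the exhibits.

```python
# Values copied from the exhibits.
B0 = 3.153084                 # coefficient on C (intercept)
B1 = -0.089521                # coefficient on SKIPPED
USS = 18.07582                # "Sum squared resid" (assumed to be USS)
TSS = 19.40610                # COLGPA "Sum Sq. Deviation" (assumed to be TSS)
MAX_SKIPPED_IN_SAMPLE = 5.0   # maximum of SKIPPED in the descriptive statistics


def predict_colgpa(skipped):
    """Predicted college GPA for a given number of lectures skipped per week."""
    return B0 + B1 * skipped


ESS = TSS - USS  # explained variation
R2 = ESS / TSS   # equivalently 1 - USS / TSS

skipped = 6.0
prediction = predict_colgpa(skipped)
outside_sample_range = skipped > MAX_SKIPPED_IN_SAMPLE

print(f"predicted COLGPA at SKIPPED={skipped}: {prediction:.4f}")
print(f"outside observed range of SKIPPED: {outside_sample_range}")
print(f"ESS={ESS:.5f}  R2={R2:.5f}")
```

As a cross-check, in a simple regression R2 should equal the squared correlation between COLGPA and SKIPPED from the correlation matrix.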