Problem set #2B

PROBLEM SET #2
1.
In simple linear regression analysis, the dependent variable
a) is the variable that changes in response to changes in an independent variable.
b) is the variable whose changes affect the dependent variable.
c) is the right-hand side variable.
d) can be either variable.
2.
In simple linear regression analysis, the independent variable is
a) the variable that changes in response to changes in an independent variable.
b) the variable whose changes affect the dependent variable.
c) the left-hand side variable.
d) either variable.
3.
The predicted value of 𝑦𝑖 is
a) the value that 𝑦𝑖 takes on when 𝑥𝑖 equals 0.
b) the effect that a one-unit change in the dependent variable is expected to have on the
independent variable, holding all else constant.
c) the value of 𝑦𝑖 when the slope is multiplied by a specific 𝑥𝑖 and then that value is added to
the intercept.
d) the observed value of the dependent variable that is associated with a specific value of
the independent variable.
4.
The error term includes all of the following except
a) omitted variables.
b) deterministic relationships.
c) incorrect functional form.
d) measurement error.
5.
The estimated slope coefficient is
a) the estimated marginal effect of 𝑥 on 𝑦.
b) the estimated value of 𝑦 when 𝑥 equals 0.
c) equal to the population slope coefficient.
d) 𝛽̂0.
6.
The residual is
a) the difference between the observed value of the dependent variable and the observed
value of the independent variable.
b) the difference between the observed value of the dependent variable and the predicted
value of the dependent variable.
c) the difference between the predicted value of the dependent variable and the predicted
value of the independent variable.
d) the difference between the predicted value of the independent variable and the observed
value of the independent variable.
7.
Suppose you determine the estimated sample regression function to be 𝑦̂𝑖 = 1,016.82 +
473.65 ∙ 𝑥𝑖 . You would conclude that
a) 𝑦𝑖 is estimated to equal 473.65 when 𝑥𝑖 = 0.
b) 𝑦𝑖 is estimated to increase by 473.65 for every one unit increase in 𝑥𝑖 .
c) 𝑦𝑖 is estimated to decrease by 473.65 for every one unit increase in 𝑥𝑖 .
d) 𝑦𝑖 is estimated to increase by 1,016.82 for every one unit increase in 𝑥𝑖 .
8.
The term “goodness-of-fit” refers to
a) the accuracy of the estimated sample regression function.
b) whether or not the estimated sample regression function is correct.
c) the method by which we determine the best-fit line.
d) the extent to which observed data match the values expected by theory.
9.
The explained variation in 𝑦 is
a) the distance between the best-fit line and the data points.
b) the distance between the mean and the predicted value of 𝑦.
c) the distance between the mean and the data points.
d) the distance between the observed and predicted values of 𝑦.
10. The unexplained variation in 𝑦 is
a) the distance between the observed and predicted values of 𝑦.
b) the distance between the mean and the best-fit line.
c) the distance between the mean and the data points.
d) the distance between the mean and the predicted value of 𝑦.
11. The coefficient of determination (𝑅²) is
a) the ratio of the unexplained variation in 𝑦 to the total variation in 𝑦.
b) the ratio of the explained variation in 𝑦 to the total variation in 𝑦.
c) the ratio of the explained variation in 𝑦 to the unexplained variation in 𝑦.
d) the ratio of the unexplained variation in 𝑦 to the explained variation in 𝑦.
12. In general, a larger 𝑅² tends to suggest that
a) the estimated sample regression function explains a greater percentage of the total
variation in 𝑦.
b) the estimated sample regression function is more accurate.
c) the estimated sample regression function explains a greater percentage of the explained
variation in 𝑦.
d) the estimated slope coefficient is more likely to equal the population slope coefficient.
13. The standard error of the estimated sample regression function (𝑠𝑦|𝑥 ) is
a) the square root of the unexplained sum of squares.
b) the square root of the explained sum of squares.
c) the square root of the unexplained sum of squares divided by the degrees of freedom of
the regression.
d) the square root of the explained sum of squares divided by the degrees of freedom.
14. In general, a larger 𝑠𝑦|𝑥 tends to suggest that
a) the estimated sample regression function explains a greater percentage of the total
variation in 𝑦.
b) the estimated sample regression function is more accurate.
c) the data points fall closer to the best-fit line.
d) the data points fall further from the best-fit line.
15. Suppose you wish to determine the degree to which annual earnings of PGA tour players
(Earnings) are related to driving distance (Yards Per Drive). In such a case,
a) you should define Yards Per Drive as the dependent variable.
b) you should define Earnings as the dependent variable.
c) it does not matter which variable you define as the dependent variable.
d) you should define Earnings as the independent variable.
16. Suppose you are given the Excel output in Figure 4.1. You would conclude that each
additional Yard Per Drive is estimated to be associated with
a) a $7,773,135.558 decrease in annual earnings.
b) a $2,867,254.773 increase in annual earnings.
c) a $30,737.523 increase in annual earnings.
d) a $9,823.548 increase in annual earnings.
17. Suppose you are given the Excel output in Figure 4.1. You would conclude that the estimated
sample regression function explains
a) 21.65 percent of the total variation in annual earnings.
b) 4.69 percent of the total variation in annual earnings.
c) 4.21 percent of the total variation in annual earnings.
d) 0.02 percent of the total variation in annual earnings.
18. Suppose you are given the Excel output in Figure 4.1. You would conclude that the standard error
of the estimated sample regression function (𝑠𝑦|𝑥 ) is
a) 0.2165.
b) 0.0469.
c) 1,155,621.367.
d) 201.
19. Suppose you are given the Excel output in Figure 4.1. You would conclude that the explained sum of
squares is
a) 1.307E+13.
b) 2.657E+14.
c) 1.200E+13.
d) you cannot tell from the information given.
20. Suppose you are given the Excel output in Figure 4.1. You would conclude that the number of
golfers in the sample is
a) 1.
b) 199.
c) 200.
d) 201.
21. Suppose you are given the Excel output in Figure 4.1. You would conclude that the number of degrees
of freedom of the regression is
a) 1.
b) 199.
c) 200.
d) 201.
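The quantities named in the questions above (predicted values, residuals, explained and unexplained variation, 𝑅², and 𝑠𝑦|𝑥) can be traced through a small numerical sketch. The data below are hypothetical and chosen only for illustration; they are not the data behind Figure 4.1. The variable names are likewise illustrative, and the divisor 𝑛 − 2 is the usual convention for a regression with one slope and one intercept.

```python
# Hypothetical data, chosen only for illustration (not from the problem set).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope and intercept.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Predicted values and residuals (observed y minus predicted y).
y_hat = [intercept + slope * xi for xi in x]
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

# Total, explained, and unexplained sums of squares.
tss = sum((yi - y_bar) ** 2 for yi in y)
ess = sum((yh - y_bar) ** 2 for yh in y_hat)
uss = sum(e ** 2 for e in residuals)

# Coefficient of determination and standard error of the regression.
r_squared = ess / tss
s_y_given_x = (uss / (n - 2)) ** 0.5  # n - 2: one slope, one intercept

print(f"slope={slope:.4f}, intercept={intercept:.4f}")
print(f"TSS={tss:.4f}, ESS={ess:.4f}, USS={uss:.4f}")
print(f"R^2={r_squared:.4f}, s_y|x={s_y_given_x:.4f}")
```

In this sketch the total variation splits exactly into the explained and unexplained parts, so 𝑅² can be read as the explained share of the total.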
Calculations
1.
A counselor working with teenagers is interested in the relationship between anxiety and
depression. The counselor administers a depression test and an anxiety test to each teenager
in a randomly selected sample. The scores obtained from the administration of the two
inventories are summarized below.
The summary statistics are

              Sample mean    Standard deviation
Anxiety       25.2632        21.9895
Depression    13.7895        9.8464

∑ (𝑥𝑖 − 𝑥̅)(𝑦𝑖 − 𝑦̅) = 3828.0526, where the sum runs over 𝑖 = 1, …, 19.
a. If anxiety is the independent variable and depression is the dependent variable, what is the
sample regression function? What do the estimated slope and intercept mean in context of this
problem? Is the intercept meaningful? Why or why not?
b. What is R-squared and what does it mean?
c. If an individual had an anxiety score of 40, what is their predicted level of depression?
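One possible way to set up parts a through c from the summary statistics alone is sketched below. It rests on two assumptions that the problem does not state outright: that the sample size is 19 (read from the upper limit of the sum) and that the reported standard deviations are sample standard deviations with an 𝑛 − 1 divisor. The variable names are illustrative.

```python
# Summary statistics as given in the problem.
n = 19                         # assumption: taken from the upper limit of the sum
x_bar, s_x = 25.2632, 21.9895  # anxiety: sample mean, standard deviation
y_bar, s_y = 13.7895, 9.8464   # depression: sample mean, standard deviation
sum_cross = 3828.0526          # sum of (x_i - x_bar) * (y_i - y_bar)

# Assumption: s_x and s_y use the n - 1 divisor, so the sums of squared
# deviations can be recovered as (n - 1) * s**2.
sum_sq_x = (n - 1) * s_x ** 2
sum_sq_y = (n - 1) * s_y ** 2

# Part a: least squares slope and intercept.
slope = sum_cross / sum_sq_x
intercept = y_bar - slope * x_bar

# Part b: in simple regression, R^2 is the squared sample correlation.
r_squared = sum_cross ** 2 / (sum_sq_x * sum_sq_y)

# Part c: predicted depression score at an anxiety score of 40.
predicted_at_40 = intercept + slope * 40

print(f"slope={slope:.4f}, intercept={intercept:.4f}")
print(f"R^2={r_squared:.4f}")
print(f"predicted depression at anxiety 40: {predicted_at_40:.2f}")
```

If the standard deviations were instead computed with an 𝑛 divisor, the factor (n - 1) would become n and the numbers would shift slightly.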
2.
Consider the regression model Y = A + BX + ε, where Y is the Total Cost of publishing a book and X
is the number of pages in the book. You wish to estimate the regression based on 500 different
books. A spreadsheet calculation with these 500 observations gave the following results:
∑Xi = 1500
∑Yi = 700
∑XiYi = 9000
∑Xi² = 33000
∑Yi² = 3600
a. Determine and interpret the least squares regression equation.
b. Find the following: ESS, TSS, USS.
c. Find the estimate of the standard error of the regression (Se).
d. Find and interpret the R². What are the units of measurement for the R²?
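The arithmetic for this problem can be organized as in the sketch below. It assumes the five spreadsheet results are the sums of X, Y, XY, X squared, and Y squared over the 500 books, which is a reading of the figures rather than something the problem spells out. It also assumes that ESS denotes the explained sum of squares, USS the unexplained sum of squares, and that Se divides by 𝑛 − 2. The variable names are illustrative.

```python
n = 500

# Assumption: the reported figures are these five raw sums.
sum_x, sum_y = 1500.0, 700.0
sum_xy, sum_x2, sum_y2 = 9000.0, 33000.0, 3600.0

x_bar = sum_x / n
y_bar = sum_y / n

# Sums of deviations recovered from the raw sums.
s_xy = sum_xy - n * x_bar * y_bar  # cross-deviations of X and Y
s_xx = sum_x2 - n * x_bar ** 2     # squared deviations of X
tss = sum_y2 - n * y_bar ** 2      # total sum of squares of Y

# Part a: least squares estimates of B (slope) and A (intercept).
b = s_xy / s_xx
a = y_bar - b * x_bar

# Part b: explained and unexplained sums of squares.
ess = b * s_xy
uss = tss - ess

# Part c: standard error of the regression, with n - 2 degrees of freedom.
se = (uss / (n - 2)) ** 0.5

# Part d: R^2 as the explained share of total variation.
r_squared = ess / tss

print(f"A={a:.4f}, B={b:.4f}")
print(f"TSS={tss:.2f}, ESS={ess:.2f}, USS={uss:.2f}")
print(f"Se={se:.4f}, R^2={r_squared:.4f}")
```

Because 𝑅² is a ratio of two sums of squares measured in the same units, the units cancel in the last step.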