Multiple Linear Regression

QUESTIONS 1-7 DEAL WITH THE FOLLOWING SITUATION: Stock prices, Y, are assumed to be
affected by the annual dividend rate, X1, and the annual return on equity, X2. A first order regression was fit to the data
and the following analysis resulted.
                        Sum of        Mean
Source            DF    Squares       Square       F Value    Prob
Model              2    1148.64291    574.32145    57.887     0.0001
Error             25     248.03566      9.92143
Total (Adjusted)  27    1396.67857

Square Root MSE     3.14983    R-square    0.8224
Dep Mean           22.10714

Parameter Estimates
             Coefficient    Standard      T for H0:
Variable     Estimate       Error         B=0          Prob
INTERCEPT    -15.830443     4.67316808    -3.388       0.0023
X1            12.276010     1.19702953    10.255       0.0001
X2             0.609433     0.28306125     2.153       0.0412

                      Actual    Predicted    95% LCL    95% UCL    95% LCL       95% UCL
Obs    X1      X2     Y         Value        Mean       Mean       Individual    Individual
29     2.96    17.1   .         30.9279      28.4359    33.4198    23.9785       37.8772
1. I am 95% confident that the average stock price for all stocks with a dividend rate of 2.96 and an equity return of
17.1 falls in the range
A. 35.00 to 30.92
B. 20.95 to 23.74
C. 28.44 to 33.42
D. 15.71 to 28.98
E. 23.98 to 37.88
2. What is the p-value for testing for the effect of equity?
A. 2.153
B. 0.0001
C. 0.0412
D. 0.05
E. 0.8224
3. What percent of the sample variability in stock prices can be attributed to variation in the dividend rate and the
return on equity?
A. 78.14
B. 82.24
C. 78.95
D. 80.82
E. 89.27
4. The rejection region for testing that equity and/or dividend rate is useful for estimating the mean stock price is
A. F > F(2, 25, 0.05)
B. |t| > t(2, 25, 0.025)
C. t > t(25, 0.025) or t < -t(25, 0.025)
D. F < F(25, 0.05)
E. F > F(25, 0.05)
5. If there is interaction between annual rate of dividend and return on equity then
A. the stock price interacts with annual rate of dividend
B. the change in the mean stock price associated with each additional point in the dividend rate is a linear function
of the annual return on equity.
C. the slope of annual rate of dividend is a function of the stock price
D. the mean stock price depends on annual rate of dividend and/or the return on equity (only true for interaction
model)
E. annual rate of dividend is correlated with the return on equity.
6. What is your conclusion after testing β2 = 0 versus β2 ≠ 0?
A. At alpha=0.05, we can say that after adjusting for the dividend rate, the return on equity does help predict the
stock price.
B. At alpha=0.05, we can not say that the dividend rate does help predict the stock price.
C. At alpha=0.05, we can say that the dividend rate does help predict the stock price.
D. At alpha=0.05, we can say that after adjusting for the dividend rate, the return on equity does not help predict the
stock price.
E. At alpha=0.05, we can not say that after adjusting for the dividend rate, the return on equity does help predict the
stock price.
7. What is the estimate of the typical sample error when trying to predict stock price with the dividend rate and return
on equity?
A. 3.14983
B. 2.31635
C. 0.82240
D. 4.67317
E. 1.19703
Answers:
1. C 95% LCL Mean and 95% UCL Mean for the row with X1 = 2.96 and X2 = 17.1
2. C p-value in the X2 row of the parameter estimates
3. B definition of R-square
4. A F(k, n-k-1)
5. B interaction interpretation
6. A testing equity: p-value = 0.0412 < 0.05, reject H0
7. A root MSE is the standard error of y given x1 and x2
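The summary numbers in the stock-price report can be re-derived from its sums of squares, which is a useful check when reading output of this kind. The following is a minimal Python sketch, not part of the original exam; the variable names are illustrative and every number is copied from the report above.

```python
import math

# Values copied from the stock-price report (Questions 1-7).
ss_model, df_model = 1148.64291, 2
ss_error, df_error = 248.03566, 25
ss_total = 1396.67857

ms_model = ss_model / df_model      # mean square for the model
mse = ss_error / df_error           # mean square error
f_value = ms_model / mse            # global F statistic (Question 4)
r_square = ss_model / ss_total      # share of variation explained (Question 3)
root_mse = math.sqrt(mse)           # typical prediction error (Question 7)

# t statistic for X2 is estimate / standard error (Questions 2 and 6).
t_x2 = 0.609433 / 0.28306125

# Point prediction at X1 = 2.96, X2 = 17.1 (Question 1).
y_hat = -15.830443 + 12.276010 * 2.96 + 0.609433 * 17.1

print(round(f_value, 3), round(r_square, 4), round(root_mse, 5))
print(round(t_x2, 3), round(y_hat, 4))
```

Up to rounding, the printed values should agree with the F Value, R-square, Square Root MSE, T for H0 and Predicted Value entries in the report.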
QUESTIONS 8-14 DEAL WITH THE FOLLOWING SITUATION: A collector of antique grandfather clocks
believes that the price received for the clocks, Y, at an antique auction increases with the age of the clocks, X1, and
with the number of bidders, X2. A first order regression was fit to a random sample of 32 clocks, and the following
analysis resulted.
Analysis of Variance
                  Sum of          Mean
Source      DF    Squares         Square          F Value    Prob
Model        2    4277159.7034    2138579.8517    120.651    0.0
Error       29    514034.51534    17725.32812
C Total     31    4791194.2188

Root MSE     133.13650    R-square    0.8927
Dep Mean    1327.15625    Adj R-sq    0.8853

Parameter Estimates
            Coefficient     Standard        T for H0:
Variable    Estimate        Error           Parameter=0    Prob
INTERCEP    -1336.722052    173.35612607    -7.711         0.0001
X1             12.736199      0.90238049    14.114         0.0001
X2             85.815133      8.70575681     9.857         0.0001
8. The estimated slope of number of bidders is:
A 85.815133
B 0.90238049
C 0.0001.
D 12.736199
E 14.114
9. The true slope parameter of age is interpreted as:
A The change in the mean auction price for each additional year of age.
B The change in the auction price for each additional year of age.
C The change in the mean auction price for each additional year of age when the number of bidders is held
constant,
D The mean auction price given the age, holding the number of bidders constant.
E The change in the estimated mean auction price for each additional year of age.
10. The test statistic value for testing the utility of the model is:
A 3.33
B 120.651
C 14.114.
D 133.1365
E 0.8927
11. The rejection region for testing that increases in the mean auction price of the clock will be associated
with increases in the number of bidders ( age held constant) is
A t > 2.042 or t < -2.042
B t > 2.045
C t > 1.699 or t < -1.699
D t > 1.699
E t > 1.697
12. The probability of saying that "the mean auction price is different for clocks one year apart in age
(holding number of bidders constant)" when actually there is no difference is called
A a Type II error
B the p-value
C the slope.
D the power of the test
E alpha
13. The degrees of freedom of the estimated variation in the auction prices for all clocks of the same age and
number of bidders is:
A 1
B 31
C 30
D 29
E 2
14. If there is interaction between age and number of bidders then
A the auction price interacts with age
B the change in the mean auction price associated with each additional bidder is a linear function of the
age.
C the slope of age is a function of the auction price.
D the mean auction price depends on age and/or the number of bidders (only true for interaction model)
E age is correlated with the number of bidders.
ANSWERS
8. A
9. C
10. B
11. D
12. E
13. D
14. B
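The degrees of freedom and the adjusted R-square in the clock report follow directly from the sample size and the sums of squares. The sketch below is an illustration only, not part of the original exam; the names are hypothetical and the numbers are copied from the report above.

```python
# Values copied from the clock-auction report (Questions 8-14).
n, k = 32, 2                     # sample size and number of predictors
ss_error = 514034.51534
ss_total = 4791194.2188

df_error = n - k - 1             # degrees of freedom behind MSE (Question 13)
df_total = n - 1

r_square = 1 - ss_error / ss_total
# Adjusted R-square compares mean squares, so it penalizes extra predictors.
adj_r_square = 1 - (ss_error / df_error) / (ss_total / df_total)

# t statistic for the number of bidders, X2; its reference distribution
# has df_error degrees of freedom (Question 11).
t_bidders = 85.815133 / 8.70575681

print(df_error, round(r_square, 4), round(adj_r_square, 4), round(t_bidders, 3))
```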
QUESTIONS 15-23 DEAL WITH THE FOLLOWING SITUATION: The expected sales of a product, Y,
in a city are assumed to be affected by the per capita discretionary income (PCDI), X1, and the population
of the city, X2. A first order model was fit to a random sample of 15 cities.
Analysis of Variance
                Sum of         Mean
Source    DF    Squares        Square         F Value     Prob>F
Model      2    53844.71643    26922.35822    5679.466    0.0001
Error     12       56.88357        4.74030
Total     14    53901.60000

Root MSE      2.17722    R-square    0.9989
Dep Mean    150.60000    Adj R-sq    0.9988

Parameter Estimates
            Parameter    Standard    T for H0:
Variable    Estimate     Error       Parameter=0    Prob
INTERCEP    3.452        2.430        1.420         0.1809
POP         0.496        0.006       81.924         0.0001
INCOME      0.009        0.001        9.502         0.0001

                                         Lower95%    Upper95%    Lower95%    Upper95%
Obs    POP    INCOME    Actual    Pred     Mean        Mean        Predict     Predict
16     220    2500      .         135.6    134.1       137.1       130.6       140.5
17     375    3500      .         221.7    219.8       223.6       216.5       226.8
15. The null hypothesis for the test of model usefulness is interpreted as
A. PCDI and Population size are not linearly related.
B. Mean sales is not a linear function of PCDI and Population size.
C. PCDI and Population size do not vary.
D. PCDI and Population size do not help predict the sales.
E. PCDI and Population size do help predict the sales.
16. What is the test statistic value when testing that both coefficients
are equal to zero?
A. 2.17722
B. 81.924 + 9.502 = 90.426
C. 0.1809
D. 0.9988
E. 5679.466
17. What is the interpretation of the prediction interval for observation
17?
A. With 95% confidence we can say that a city with 375,000 people and a
PCDI of $3,500 would have sales between $219,800 and $223,600.
B. With 95% confidence we can say that all cities with 375,000 people and
a PCDI of $3,500 would have mean sales between $219,800 and $223,600.
C. With 95% confidence we can say that a city with 375,000 people and a
PCDI of $3,500 would have sales between $67,300 and $292,300.
D. With 95% confidence we can say that all cities with 375,000 people and
a PCDI of $3,500 would have mean sales between $216,500 and $226,800.
E. With 95% confidence we can say that a city with 375,000 people and a
PCDI of $3,500 would have sales between $216,500 and $226,800.
18. When testing the alternative hypothesis that β1 is less than zero,
what is your conclusion?
A. Since p-value =0.0001, we can say that when holding PCDI constant
increases in population size is associated with decreases in mean sales.
B. Since p-value > 0.05, we can say that increases in population size is
associated with decreases in mean sales.
C. Since the test statistic value is 9.502, we can not say that when
holding PCDI constant increases in population size is associated with
increases in mean sales.
D. Since p-value < 0.05, we can say that when holding city size constant,
PCDI does help estimate mean sales.
E. Since p-value > 0.05, we can not say that when holding PCDI constant
increases in population size is associated with decreases in mean sales.
19. What would be the rejection region value when testing that the change
in the mean sales with each one dollar increase in the PCDI depends on the
number of people in the city. Reject Ho if
A. |t| > t(12, 0.025)
B. F > F(2, 12, 0.05)
C. chi-squared > chi-squared( 12, 0.05)
D. t > t(11, 0.025) or t < -t(11, 0.025)
E. F > F(3, 11, 0.05)
20. What is the meaning of the confidence interval for the mean value of Y
given X1=375 and X2=3500?
A. With 95% confidence we can say that a city with 375,000 people and a
PCDI of $3,500 would have sales between $67,300 and $292,300.
B. With 95% confidence we can say that all cities with 375,000 people and
a PCDI of $3,500 would have mean sales between $216,500 and $226,800.
C. With 95% confidence we can say that a city with 375,000 people and a
PCDI of $3,500 would have sales between $216,500 and $226,800.
D. With 95% confidence we can say that a city with 375,000 people and a
PCDI of $3,500 would have sales between $219,800 and $223,600.
E. With 95% confidence we can say that all cities with 375,000 people and
a PCDI of $3,500 would have mean sales between $219,800 and $223,600.
21. For all cities with the same PCDI, what is the estimated change in the
expected sales when the population of the city increases by one?
A. 99.89% increase
B. 99.88% increase
C. $ 2.177
D. $ 9
E. $ 496
22. Does interaction of X1 and X2 imply that there is correlation between
X1 and X2?
a. Yes.
b. No.
23. What is the value for the multiple coefficient of determination?
A. 2.177
B. 0.0001
C. 9.502
D. 1.4457
E. 99.89%
Answers
----------------------
15. d
16. e
17. e
18. e
19. d
20. e
21. e
22. b
23. e
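Questions 17 and 20 turn on the difference between a confidence interval for the mean and a prediction interval for an individual city. For a least-squares fit, the prediction variance equals the variance of the estimated mean plus the error variance (MSE). The Python sketch below illustrates this with the reported limits for observation 17. It is not part of the original exam, and the critical value t(12, 0.025) = 2.179 is an assumption taken from a standard t table, not from the report.

```python
import math

# Values copied from the sales report, observation 17 (POP = 375, INCOME = 3500).
mse = 4.74030
mean_lcl, mean_ucl = 219.8, 223.6    # 95% limits for the mean
pred_lcl, pred_ucl = 216.5, 226.8    # 95% limits for an individual city

# Assumption: t(12, 0.025) = 2.179, read from a standard t table.
t_crit = 2.179

# Back out the standard error of the estimated mean from its interval width.
se_mean = (mean_ucl - mean_lcl) / 2 / t_crit

# A prediction adds the error variance (MSE) to the variance of the mean.
se_pred = math.sqrt(se_mean ** 2 + mse)
pred_half_width = t_crit * se_pred

reported_half_width = (pred_ucl - pred_lcl) / 2
print(round(pred_half_width, 2), round(reported_half_width, 2))
```

Because the report rounds the limits to one decimal place, the reconstructed and reported half-widths should agree only approximately.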
Questions 24-27 deal with the following situation: An instructor of BUSA 5325
wants to know if the exam scores of the third exam can be predicted from
the exam scores of the first two exams. The numeric scores of the second
and third exam are available. However, the first exam seems not to have a
linear relationship with exam-3 and has been changed to a categorical
variable. The variables are
Y = Exam score on the third exam
X1a = 1 if student made an A on exam 1
0 if not
X1b = 1 if student made a B on exam 1
0 if not
X2 = Exam score on the second exam
The following model is proposed:
E(Y) = β0 + β1X1a + β2X1b + β3X2
A random sample of 32 of the students was selected and the analysis of
variance report is below.
PARALLEL (NO-INTERACTION) MODEL - Analysis of Variance
                  Sum of        Mean
Source      DF    Squares       Square       F Value    Prob>F
Model        3    2347.91005    782.63668    7.308      0.0009
Error       28    2998.55870    107.09138
C Total     31    5346.46875

Root MSE    10.34850    R-square    0.4392
Dep Mean    77.28125

Parameter Estimates
                  Parameter    Standard       T for H0:
Variable    DF    Estimate     Error          Parameter=0    Prob > |T|
INTERCEP     1    47.043655    12.71357992     3.700         0.0009
X1a          1     9.801066     4.17220766     2.349         0.0261
X1b          1    -5.364949     4.02638082    -1.332         0.1935
X2           1     0.356597     0.15862042     2.248         0.0326

                  Standardized                 Variance
Variable    DF    Estimate       Tolerance     Inflation
INTERCEP     1     0.00000000    .             0.00000000
X1a          1     0.37838587    0.77202700    1.29529148
X1b          1    -0.20590050    0.83883060    1.19213582
X2           1     0.36907595    0.74317755    1.34557349
24. Based on the above F value 7.308 and its p-value, the following conclusion
can be made: At alpha = 0.05
A. We can say that the average exam-3 scores is affected by either the
exam-2 scores or the exam-1 categories.
B. We can say that the average exam-3 scores differ among the exam 1
categories after adjusting for the exam-2 scores
C. We can not say that exam-2 grades can help predict exam-3 scores after
adjusting for the exam 1 categories.
D. I can say that I haven't the foggiest idea what you are talking about
(Hint: this is a wrong answer).
E. We can not say that the average exam-3 scores differ among the exam 1
categories after adjusting for the exam-2 scores.
25. For students with the same exam score on the second exam, what is the
estimated difference in the average exam-3 grades for students who made an A on
exam 1 minus the average exam-3 grades for students who made lower than a B?
A. 9.80
B. 4.03
C. -5.36
D. 0.36
E. 4.91
26. For the inferences in the report to be valid, certain assumptions must
hold. Which of the following is not an assumption?
A. The grades on the third exam are normally distributed for any grade on
the second exam and any category of the first exam.
B. For any grade on the second exam and for any category of the first
exam, the grades on the third exam have the same variation.
C. The difference between a student's third exam score and their expected
grade on the third exam (given their grade on the second exam and their
category on the first exam) are independent from student to student.
D. The letter grade on the first exam is independent of the numeric grade
on the second exam.
E. The expected grade on the third exam is a linear function (as specified
above) of the second exam grade for any category of the first exam.
27. What is the rejection region for testing that the second exam grade is
useful for predicting the third exam grade when the grade category of the
first exam is held constant? Reject Ho if
A. F > F (3, 28, 0.05)
B. F < F (3, 28, 0.05)
C. F > F (2, 28, 0.05)
D. | t | > t (28, 0.025)
E. t > t (31, 0.05)
Questions 28-31 deal with the same situation as in questions 24-27 but use
the interaction model:
E(Y) = β0 + β1X1a + β2X1b + β3X2 + β4X1aX2 + β5X1bX2
where X1aX2 = EXAM2 score times(dummy variable for Exam 1 A students) and
X1bX2 = EXAM2 score times(dummy variable for Exam 1 B students)
The analysis of variance report is below.
INTERACTION MODEL - Analysis of Variance
                  Sum of        Mean
Source      DF    Squares       Square       F Value    Prob>F
Model        5    2882.16100    576.43220    6.082      0.0007
Error       26    2464.30775     94.78107
C Total     31    5346.46875

Root MSE     9.73556    R-square    0.5391
Dep Mean    77.28125

Parameter Estimates
                  Parameter      Standard       T for H0:
Variable    DF    Estimate       Error          Parameter=0    Prob > |T|
INTERCEP     1     61.952569     16.63008526     3.725         0.0010
X1a          1    -48.877273     25.07804018    -1.949         0.0622
X1b          1      4.914131     22.73269519     0.216         0.8305
X2           1      0.126128      0.21473082     0.587         0.5620
X1aX2        1      0.725660      0.30587356     2.372         0.0254
X1bX2        1     -0.083436      0.27919460    -0.299         0.7674
                  Standardized                 Variance
Variable    DF    Estimate       Tolerance     Inflation
INTERCEP     1     0.00000000    .              0.00000000
X1a          1    -1.88698547    0.01891229    52.87565911
X1b          1     0.18859862    0.02328998    42.93691371
X2           1     0.13054158    0.35891291     2.78619126
X1aX2        1     2.38205577    0.01758463    56.86784101
X1bX2        1    -0.24771236    0.02580162    38.75725377
28. Based on the F value 6.082 and its p-value, we can make the following
conclusion: At alpha = 0.05
A. We can not say that the average exam-3 scores is affected by either the
exam-2 scores or the exam-1 categories.
B. We can say that either the exam2 grade or the exam 1 letter grade help
predict the exam 3 scores.
C. We can not say that exam-2 grades can help predict exam-3 scores after
adjusting for the exam 1 categories.
D. We can say that exam-2 grades can help predict exam-3 scores after
adjusting for the exam 1 categories.
E. We can say that exam-3 grades can help predict exam-2 scores after
adjusting for the exam 1 categories.
29. In an interaction model the difference in means is a function of
another variable. Using the interaction model, what is the average
exam-3 grade for category B students (students who made a
B on exam 1) minus the average exam-3 grade for category C students
(students who made neither an A nor a B)?
A. β2
B. β3 + β2*X1b
C. β3
D. β2 + β5 * X2
E. β1 + β4 * X2
30. Which of the following is not a solution to some problems caused by
multicollinearity?
A. Use other procedures than least squares.
B. Use a designed experiment.
C. Drop one or more of the correlated variables.
D. Increase your sample size.
31. Which of the following is not a result of multicollinearity?
A. The t values of important independent variables are small.
B. The predictions of the dependent variable become very poor.
C. Coefficients of variables could have signs that conflict with
theoretical expectations.
D. The standard errors of important independent variables are large
E. The experimental region becomes elliptical and narrow.
32. Is there multicollinearity among the variables of the interaction
model?
A. No because the F-test of interaction is not significant: p-value of
0.0780 > 0.05
B. Yes because the largest VIF (56.87) > 10
C. No because the highest tolerance (.36) < 10
D. No because the standardized estimate (-.24) is less than zero.
E. Yes because R-squared (0.5391) is large.
33
34. When you can not control the values of the independent variables, the
data is said to be
A. Latent variable data
B. observational data
C. experimental data
D. continuous data
E. a random sample
ANSWERS
24. A
25. A
26. D
27. D
28. B
29. D
30. D
31. B
32. B
33.
34. B
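The two exam reports are nested models, so the two interaction terms can be tested together with a partial F statistic built from the error sums of squares. This is presumably the test behind the p-value quoted in option A of Question 32, although the reports do not show it. The sketch below is an illustration only, with hypothetical names and with numbers copied from the two reports. It also evaluates the B-minus-C difference from Question 29 at an arbitrary exam-2 score.

```python
# Error sums of squares and degrees of freedom copied from the two reports.
sse_reduced, df_reduced = 2998.55870, 28   # parallel (no-interaction) model
sse_full, df_full = 2464.30775, 26         # interaction model

# Number of extra terms (X1aX2 and X1bX2) added by the full model.
extra_terms = df_reduced - df_full

# Partial (nested-model) F statistic for the two interaction terms together.
mse_full = sse_full / df_full
partial_f = ((sse_reduced - sse_full) / extra_terms) / mse_full


def b_minus_c(exam2_score):
    """Estimated mean exam-3 difference, B students minus C students.

    Uses the fitted estimates of beta2 and beta5 from the interaction model.
    """
    return 4.914131 + (-0.083436) * exam2_score


# 80 is a hypothetical exam-2 score chosen only for illustration.
print(round(partial_f, 3), round(b_minus_c(80), 3))
```

The statistic would be compared with an F distribution on 2 and 26 degrees of freedom.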
Questions 35-38 deal with the following situation: A 10-speed bicycle shop is
located near a large southern university. The owner of the shop is having
difficulty determining the quantity of bicycles to order each month from the
manufacturer. To solve the owner's problem, it is essential that the owner be
able to predict the monthly demand for the bikes. For the last 15 months, the
following variables are available:
Y = the monthly demand for the bikes,
X1 = the average price of lead-free gasoline for the month
X2 = 1 if fall quarter (September-November)
0 if not
X3 = 1 if winter quarter (December-February)
0 if not
X4 = 1 if spring quarter (March - July)
0 if not.
See attached pages for statistics.
Questions 36-39 use the model:
E(Y) = β0 + β1X1 + β2X2 + β3X3 + β4X4.
ANALYSIS OF VARIANCE REPORT
Dependent Variable: Y : Monthly Demand for 10-speed bicycles
                Sums of
Source    df    Squares     Mean Square    F-Ratio    Prob > F
Model      4    3861.859    965.4647       2.80       0.085
Error     10    3451.741    345.1741
Total     14    7313.6      522.4

Root Mean Square Error        18.57886    R Squared    0.5280
Mean of Dependent Variable    40.6

             Parameter     Standard     t-value    Prob.
Variable     Estimate      Error        (B=0)      > |t|     VIF      TOL
Intercept     482.7626     183.3842      2.63      0.0251
X1           -382.0742     152.1359     -2.51      0.0308    1.165    .8584
X2             33.11131     13.632       2.43      0.0355    1.795    .5572
X3             17           15.16958     1.12      0.2886    1.600    .6250
X4             28.09902     14.98537     1.88      0.0902    1.908    .5240
35. The mean sales for the winter months fall on the line:
A E(Y) = β0 + β1 X1 + β2 X2 + β3 X3
B E(Y) = β0 + β2 + β1 X1
C E(Y) = β0 + β3 + β1 X1
D E(Y) = β0 + β4 + β1 X1
E E(Y) = β0 + β1 X1
36. Holding the average price of lead-free gasoline constant, the estimated
difference in the mean sales for the winter months minus the mean sales for the
summer months is
A 482.7626
B -382.0742
C 33.11131
D 17
E 28.09902
37. Using alpha of 0.05, what will be the rejection region for testing the
alternative hypothesis that " the quarter of the year and/or the average price
of lead-free gasoline help predict the monthly demand for 10-speed bicycles"?
A Reject H0 if t > 2.228 or t < -2.228
B Reject H0 if F > 3.11
C Reject H0 if F > 3.48
D Reject H0 if t >1.812
E Reject H0 if t > 2.776 or t < -2.776
38. The β4 parameter in the model is interpreted as:
A The mean sales for the spring quarter.
B The mean sales for the spring quarter holding the average price of lead-free gas constant.
C The estimated difference in the mean sales for the spring months minus
the mean sales for the summer months, holding the average price of lead-free
gas constant
D The difference in the mean sales for the spring months minus the mean
sales for the summer months, holding the average price of lead-free gas
constant
E The difference in the mean sales when the average price of lead-free gas
increases by $1, holding the quarter constant.
39. Since there are 15 consecutive months of data, what is often a problem with
this type of data?
A nonlinearity,
B unequal variance,
C correlated errors
D non-normality
40. Multicollinearity says that
A the dependent variable is a linear function of the independent
variables.
B at least one independent variable is (statistically) linearly related to
or correlated with the other independent variables.
C the slope of one independent variable is a linear function of other
independent variables.
D outliers exist in the data.
E the variance inflation factor will be small.
41. In the no-interaction model, there is
A no evidence of multicollinearity because all VIFs are < 10.
B no evidence of multicollinearity because all variables are significant.
C evidence of multicollinearity because X1 is significant.
D evidence of multicollinearity because all VIFs are < 10.
E no evidence of unequal variance because all VIFs are < 10.
42. Which of the following is not a result of multicollinearity?
A the t values of important independent variables are small.
B the standard errors of important independent variables are large
C coefficients of variables could have signs that conflict with
theoretical expectations.
D the predictions of the dependent variable become very poor.
E the experimental region becomes elliptical and narrow.
43. If a residual plot shows all assumptions appear to be satisfied except for
linearity, which variable(s) should you transform?
A the dependent variable
B one or more of the independent variables
C both the dependent and independent variables
44. The residual plot for the interaction model is on the last page. The most
obvious violation is:
A nonlinearity,
B unequal variance,
C correlated errors
D non-normality
ANSWERS
35. C
36. D
37. C
38. D
39. C
40. B
41. A
42. D
43. B
44. B
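With dummy variables for the quarter, the fitted model is a set of parallel lines in the gasoline price, one per quarter, and each dummy coefficient is the vertical shift from the baseline (summer) line. The Python sketch below illustrates Questions 35 and 36 with the estimates from the report. It is not part of the original exam; the function name and the example gasoline price are hypothetical.

```python
# Fitted coefficients copied from the bicycle-demand report.
b0, b_gas = 482.7626, -382.0742
quarter_shift = {
    "fall": 33.11131,     # X2
    "winter": 17.0,       # X3
    "spring": 28.09902,   # X4
    "summer": 0.0,        # baseline: all three dummies are 0
}


def estimated_demand(gas_price, quarter):
    """Estimated mean monthly demand for a quarter at a given gasoline price."""
    return b0 + quarter_shift[quarter] + b_gas * gas_price


# Winter minus summer at the same price is just the X3 estimate.
price = 1.20  # hypothetical gasoline price, for illustration only
winter_minus_summer = (
    estimated_demand(price, "winter") - estimated_demand(price, "summer")
)
print(round(winter_minus_summer, 5))
```

The difference does not depend on the price chosen, which is what "holding the average price of lead-free gasoline constant" means in a no-interaction model.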
Questions 45-48 deal with the following situation: A firm wishes to compare
the costs among three couriers DFW, Carborne, and Metro. The measured
variables are
Y = cost of delivery
x1 = 1 if courier is DFW,
0 if not
x2 = 1 if courier is Carborne, 0 if not
total = pickup plus delivery time
The following model is proposed:
E(Y) = B0 + B1 * X1 + B2 * X2 + B3 * TOTAL
A random sample of 83 deliveries was selected and the analysis of variance
report is attached.
Analysis of Variance
                  Sum of        Mean
Source      DF    Squares       Square       F Value    Prob>F
Model        3    1003.99331    334.66444    98.961     0.0001
Error       79     267.16091      3.38178
C Total     82    1271.15422

Root MSE     1.83896    R-square    0.7898
Dep Mean    19.96265    Adj R-sq    0.7818

Parameter Estimates
                  Parameter    Standard    T for H0:                               Variance
Variable    DF    Estimate     Error       Para=0       Prob > |T|    Tolerance    Inflation
INTERCEP     1    11.539       0.747       15.445       0.0001        .            0.000
X1           1    -1.086       0.537       -2.023       0.0465        0.687        1.455
X2           1    -1.793       0.480       -3.733       0.0004        0.754        1.325
TOTAL        1     0.194       0.012       15.622       0.0001        0.898        1.113

Variable    Label
INTERCEP    Intercept
X1          1 if DFW, 0 if not
X2          1 if Carborne, 0 if not
TOTAL       delivery time plus pickup time
45. What would be the estimated mean cost for Carborne?
A. yhat = (11.539 - 1.086) + 0.194*total
B. yhat = (11.539 - 1.793) + 0.194*total
C. yhat = (11.539) + 0.194*total
D. E(Y) = B0 + B2 * X2 + B3 * TOTAL
E. yhat = 11.539 - 1.086*X1 - 1.793*x2 + 0.194*total
46. What would be the null hypothesis when testing "the difference in mean
costs between DFW and Metro couriers is zero, after adjusting for the
total time that it takes to pickup and deliver the package"?
A. H0: B1=0
B. H0: B2=0
C. H0: B3=0
D. H0: B1=B2=B3=0
E. H0: B1=B2=0
47. The test statistic value for TOTAL is 15.622. From this we can ______.
A. not say the mean cost differs among the couriers, after adjusting for
the total time to pickup and deliver.
B. not say that the mean cost changes with changes in the total time to
pickup and deliver, adjusting for the courier.
C. both courier and total help predict the cost.
D. say that the mean cost changes with changes in the total time to pickup
and deliver, adjusting for the courier.
E. say the mean cost differs among the couriers, after adjusting for the
total time to pickup and deliver.
48. What table value would be used in a 95% confidence interval
for the mean value of y conditional on the x values?
A. t(82, 0.025)
B. F(2, 80, 0.025)
C. t(3, 0.05)
D. F(3, 79, 0.05)
E. t(79, 0.025)
49. If a linear combination of the independent variables is highly
correlated with another independent variable, then this is called:
A. multicollinearity
B. a violation of the independence assumption
C. a violation of the linearity assumption
D. nonnormality
E. interaction
50. What does NOT happen as more redundant variables are added. Redundant
variables would be ones that measure the same thing as variables already
in the model. (Assume no missing values.)
A. MSE would increase.
B. Important variables would become insignificant.
C. The correlation between two independent variables would increase.
D. VIFs would increase.
51. In the model, what is the VIF for X1?
A. 14.792
B. 1.694
C. 1.080
D. 1.455
E. 0.537
ANSWERS
----------------------------------
45. B
46. A
47. D
48. E
49. A
50. C
51. D
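Two of the courier questions reduce to simple arithmetic on the report: the variance inflation factor is the reciprocal of the tolerance (Question 51), and the Carborne line is obtained by setting X1 = 0 and X2 = 1 in the fitted equation (Question 45). The sketch below is an illustration only, with hypothetical names and with numbers copied from the report.

```python
# Tolerances copied from the courier report.
tolerance = {"X1": 0.687, "X2": 0.754, "TOTAL": 0.898}

# VIF is the reciprocal of tolerance, where tolerance is 1 minus the
# R-square of that predictor regressed on the other predictors.
vif = {name: 1 / tol for name, tol in tolerance.items()}


def carborne_cost(total_time):
    """Estimated mean cost for Carborne (X1 = 0, X2 = 1) at a given total time."""
    return (11.539 - 1.793) + 0.194 * total_time


print({name: round(value, 3) for name, value in vif.items()})
# 30 is a hypothetical total time chosen only for illustration.
print(round(carborne_cost(30), 3))
```

Since the report rounds the tolerances to three decimals, the reciprocals should match the Variance Inflation column only to about two decimals.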