Final Exam, Form A
Spring 2002
Economics 173
Instructor: Petry
Name____________
SSN_____________
Exam Instructions: Be sure you have all 17 pages with 47 questions. Select the best
answer from those provided. To receive credit for the exam, you must properly fill out
your bubble sheets with your name, SSN and net-id. Turn in your signed exam packet
with your bubble sheet. GOOD LUCK!
1. The measurements of monthly rainfall in Aspen in 2001 were as follows: 20, 18, 30,
35, 38, 37, 36, 37, 39, 35, 32, 21. What are the mean and the median of the data?
a. Mean=30, median=31.5
b. Mean=31.5, median=35
c. Mean=35, median=35
d. Mean=32, median=30
e. Mean=32.5, median=35.5
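A quick way to check the arithmetic in question 1 is sketched below. It uses only Python's standard `statistics` module; the variable names are illustrative and not part of the exam.

```python
from statistics import mean, median

# Monthly rainfall values as listed in question 1.
rainfall = [20, 18, 30, 35, 38, 37, 36, 37, 39, 35, 32, 21]

sample_mean = mean(rainfall)      # sum of the values divided by 12
sample_median = median(rainfall)  # average of the two middle sorted values

print(sample_mean, sample_median)
```

A mean below the median, as here, is the pattern usually associated with a negatively skewed distribution, which is the comparison question 2 asks about.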
2. Based on the relationship between the mean and the median in the above question we
would conclude that the distribution of the monthly rainfall in Aspen in 2001 is:
a. Negatively skewed
b. Positively skewed
c. Symmetric
d. Normal
e. Not enough information to determine the shape of the distribution
3. We collected data from 5 different people on the number of jobs they have had and
the change of their salary (in thousand dollars) last year. The results were put in the
table below:
Person                        1     2     3     4     5
# of previous jobs            4     4     4     4     4
Change in salary last year    6.8   4.9   -2.8  1.7   9
Based on this sample, which of the following is true?
a. The number of previous jobs and the change in salary are positively
correlated
b. The number of previous jobs and the change in salary are negatively
correlated
c. The number of previous jobs and the change in salary are not correlated
d. The covariance between the number of jobs and the change in salary is a
large positive number
e. The covariance between the number of jobs and the change in salary is a
large negative number
4. To estimate the amount of fish that can be harvested in a pond, the mean weight of
the fish in the pond must be estimated within +/-1 ounce with 99% confidence. What
sample size should be taken if the weight of the fish is normally distributed with σ = 5
ounces? (z0.05 = 1.645, z0.025 = 1.96, z0.005 = 2.575)
a. 166
b. 65
c. 66
d. 42
e. 97
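The sample-size calculation in question 4 can be sketched as follows, assuming the usual formula n = (z·σ/W)² rounded up, with σ = 5 and a half-width W of 1 ounce. The function name is hypothetical.

```python
import math

def sample_size_for_mean(z, sigma, half_width):
    """Smallest n for which z * sigma / sqrt(n) does not exceed half_width."""
    return math.ceil((z * sigma / half_width) ** 2)

# 99% confidence uses z0.005 = 2.575 from the values given in the question.
n = sample_size_for_mean(z=2.575, sigma=5, half_width=1)
print(n)
```

Rounding up rather than to the nearest integer keeps the interval at least as narrow as required.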
5. When conducting a hypothesis test, a type II error occurs when
a. we reject H0 when H0 is false
b. we fail to reject H0 when H0 is true
c. we reject H0 when H0 is true
d. we fail to reject H0 when H0 is false
e. none of the above
6. Suppose NASA wants to test the claim that the average time an astronaut spends in
space is less than 6 days. They took a sample of 100 astronauts and found that the
sample mean is 5.8. If we assume that the average time spent in space is normally
distributed with σ = 1.2, then the p-value of our test statistic is:
(z0.05 = 1.645, z0.025 = 1.96, z0.005 = 2.575)
a. bigger than 0.05
b. smaller than 0.025 but bigger than 0.01
c. smaller than 0.01
d. smaller than 0.05 but bigger than 0.025
e. can’t be determined based on this information
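For question 6, a minimal sketch of the one-sided z-test is given below, assuming H0: μ = 6 against H1: μ < 6 with known σ = 1.2. It uses `statistics.NormalDist` from the standard library; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu0, xbar, sigma, n = 6, 5.8, 1.2, 100

# Test statistic for a mean with known population standard deviation.
z = (xbar - mu0) / (sigma / math.sqrt(n))

# Lower-tail p-value, because the alternative is "less than 6 days".
p_value = NormalDist().cdf(z)

print(round(z, 3), round(p_value, 4))
```

Without software, the same conclusion follows from the table values: |z| lies between 1.645 and 1.96, so the p-value falls between 0.025 and 0.05.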
7. A widely acclaimed group of international experts found that a 95% confidence
interval for the average minutes a married couple spends on the phone per month is
[120, 186]. The number of minutes per month is normally distributed with known
standard deviation σ. If we carried out a hypothesis test about μ and obtained a test
statistic of z=3.504, what was our null hypothesis?
(z0.05 = 1.645, z0.025 = 1.96, z0.005 = 2.575)
a. H0: μ = 150
b. H0: μ = 16.836
c. H0: μ = 153
d. H0: μ = 186
e. H0: μ = 94
8. A factory owner wants to prove that the average output of workers in her factory
(measured in boxes per hour) is more than 100. She wants to be 95% confident, and
takes a sample of 50 workers. The sample mean turns out to be 105, and the sample
standard deviation s=15. If the relevant critical value is 1.676, we would
a. fail to reject H0 and conclude that the average output of workers is
less than 100
b. reject H0 and conclude that the average output of workers is greater
than 100
c. fail to reject H0 and conclude that the average output of workers is
greater than 100
d. reject H0 and conclude that the average output of workers is less than
100
e. we can’t decide because the test statistic is the same as the critical
value
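The test in question 8 can be sketched as below, assuming H0: μ = 100 against H1: μ > 100 and the critical value 1.676 quoted in the question. Variable names are illustrative.

```python
import math

mu0, xbar, s, n = 100, 105, 15, 50
t_critical = 1.676  # critical value given in the question

# t statistic for a mean when sigma is estimated by the sample s.
t_stat = (xbar - mu0) / (s / math.sqrt(n))

# Upper-tail test: reject H0 when the statistic exceeds the critical value.
reject_h0 = t_stat > t_critical

print(round(t_stat, 3), reject_h0)
```

If σ were known, as in question 9, the statistic would be unchanged and only the critical value would switch from the t to the z table.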
9. Suppose that in the above question we actually knew that the average output is
normally distributed with σ = 15. The result of the same hypothesis test in this case
would be:
a. the same as before because the relevant critical value is now bigger than
2.4
b. the same as before because the relevant critical value is now smaller than
1.676
c. fail to reject H0
d. reject H0 and conclude that the average output of men is less than 100
e. There is not enough information to decide.
10. By looking at the rate of return of Intel and the S&P 500 for the last four years, we
can conclude that:
Period   S&P 500   Intel
1        6         2
2        1         4
3        5         1
4        2         5
a) for every additional unit of return of S&P500, Intel’s return will increase by 0.85
unit.
b) for every additional unit of return of S&P500, Intel’s return will decrease by 0.65
unit.
c) for every additional unit of return of S&P500, Intel’s return will increase by 1.05
unit.
d) for every additional unit of return of S&P500, Intel’s return will decrease by 1.05
unit.
e) for every additional unit of return of S&P500, Intel’s return will increase by 0.25
unit.
11. In the last question, what is the sum of the differences between actual Y’s and the
predicted ones?
a) 2.8823
b) 3.6625
c) 1.8952
d) 7.1176
e) 0
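Questions 10 and 11 can be checked with a small least-squares sketch. It assumes the flattened table reads as four periods with S&P 500 returns 6, 1, 5, 2 and Intel returns 2, 4, 1, 5, which is an interpretation of the original layout; the variable names are illustrative.

```python
# Assumed reading of the table: x = S&P 500 return, y = Intel return.
sp500 = [6, 1, 5, 2]
intel = [2, 4, 1, 5]

n = len(sp500)
x_bar = sum(sp500) / n
y_bar = sum(intel) / n

# Slope = sum of cross-deviations / sum of squared x-deviations.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sp500, intel))
sxx = sum((x - x_bar) ** 2 for x in sp500)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residuals: actual y minus the fitted value on the regression line.
residuals = [y - (intercept + slope * x) for x, y in zip(sp500, intel)]

print(round(slope, 4), round(sum(residuals), 10))
```

The residuals of a least-squares line with an intercept always sum to zero (up to floating-point error), whatever the data.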
12. Which of the following statements violates the “required conditions” for a regression
model:
I. The distribution of ε looks like chi-square distribution
II. The standard deviation of ε for high value X’s is the same as low value
X’s.
III. The mean of ε is 2.02: E(ε)=2.02
IV. Errors associated with different values of Y are all independent.
a) I and II
b) I and III
c) III and IV
d) I and II and III
e) II and III and IV
Use the following information to answer questions 13-16.
The regression below shows the relationship between the market return (S&P 500) and
the return of the Novell Company.
SUMMARY OUTPUT
Regression Statistics
Multiple R           0.335274
R Square
Adjusted R Square    0.097105
Standard Error       279.2267
Observations         60

ANOVA
             df    SS         MS          F          Significance F
Regression   1     572702     572702      7.345388   0.008828
Residual           4522119    77967.57
Total              5094821

             Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept    -15.3611       37.20357         -0.41289   0.681207   -89.8321    59.10988
S & P 500    1.810597       0.668058         2.710238   0.008828   0.473334    3.147859
13. What is the proportion of the variability of the return of the Novell Company
explained by the variability in S&P500 returns?
a) 0.8876
b) 0.1266
c) 0.1124
d) 0.8733
e) Not enough information to compute
14. What is the value of the statistic for testing the following hypothesis?
H0: β1 = 1
H1: β1 ≠ 1
a) 1.2134
b) 2.7102
c) –1.2134
d) –2.7102
e) 1.8162
15. What is the p-value for testing whether there is a positive linear relation between the
return of the Novell Company and S&P 500:
a) 0.0177
b) 0.0088
c) 0.0044
d) 0.9956
e) 0.9912
16. What are the residual degrees of freedom and total degrees of freedom respectively?
a) 57,59
b) 58,60
c) 57,58
d) 58,59
e) 57,60
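The quantities asked for in questions 13 to 16 can all be derived from the Novell regression output. The sketch below uses the printed SS values, slope, standard error, and two-sided p-value; the variable names are illustrative, and halving the p-value assumes the estimated slope has the sign named in the one-sided alternative.

```python
ss_regression = 572702
ss_total = 5094821
slope, slope_se = 1.810597, 0.668058
two_sided_p = 0.008828
n_obs, n_predictors = 60, 1

# Q13: share of the variation in Y explained by the regression.
r_squared = ss_regression / ss_total

# Q14: t statistic for H0: beta1 = 1 (not the default H0: beta1 = 0).
t_beta_equals_one = (slope - 1) / slope_se

# Q15: one-sided p-value for a positive relation, since the slope is positive.
one_sided_p = two_sided_p / 2

# Q16: degrees of freedom for the residual and total rows.
df_residual = n_obs - n_predictors - 1
df_total = n_obs - 1

print(round(r_squared, 4), round(t_beta_equals_one, 4),
      round(one_sided_p, 4), df_residual, df_total)
```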
17. Going from any other line fitted to a series of X and Y to a “Least Squares
Regression Line”:
a) SSE increases but SSR decreases
b) Both SSE and SSR decrease
c) SSE decreases but SSR increases
d) SSE increases but SST decreases
e) SSR decreases but SST increases
Use the following information to answer question 18.
Below is the summary of 5 regression outputs for the relationship of the market (S&P
500) and 5 different companies.
EATN         Coefficients   Standard Error   t Stat       P-value
Intercept    0.02777373     0.116423452      0.23855784   0.812289
S&P          0.47612071     0.20905961       2.27743996   0.026464

CAT          Coefficients   Standard Error   t Stat       P-value
Intercept    0.05203058     0.146670497      0.35474469   0.724068
S&P          0.7339081      0.263373712      2.78656548   0.007188

GM           Coefficients   Standard Error   t Stat       P-value
Intercept    -0.0579397     0.149770652      -0.3868561   0.700279
S&P          0.91671769     0.268940608      3.40862503   0.001193

COKE         Coefficients   Standard Error   t Stat       P-value
Intercept    -0.0655282     0.132785514      -0.493489    0.623531
S&P          0.88480059     0.238440685      3.71077859   0.000465

NOVELL       Coefficients   Standard Error   t Stat       P-value
Intercept    -0.1536111     0.372035737      -0.4128933   0.681207
S&P          1.81059653     0.668058235      2.71023757   0.008828
18. Which company or companies do you suggest for an aggressive investment assuming
the market is expected to rise?
a) 50% CAT and 50% EATN
b) EATN
c) 50% GM and 50% CAT
d) NOVELL
e) COKE
19. According to the following correlation matrix, which companies would you select as
a pair to minimize your risk?
          EATN         CAT           GM           COKE       NOVELL
EATN      1
CAT       0.62617225   1
GM        0.21937638   0.218434208   1
COKE      0.13324676   0.141381763   0.05954184   1
NOVELL    -0.0018949   0.050441165   0.33289566   -0.16053   1
a) EATN and NOVELL
b) CAT and COKE
c) EATN and CAT
d) GM and COKE
e) COKE and NOVELL
20. A random sample of 40 observations was taken out of a normal population. The
sample variance turned out to be 11. Test the claim that the population variance is less
than 15 knowing that χ2(0.95, 39) = 25.7 and χ2(0.05, 39) = 54.57. Your conclusion using a 5%
significance level is:
a) Reject the null hypothesis and conclude that σ2  15
b) Fail to reject the null hypothesis and conclude that there is insufficient evidence to
claim that σ2  15
c) Fail to reject the null hypothesis and conclude that there is insufficient evidence to
claim that σ2  15
d) Reject the null hypothesis and conclude that σ2  15
e) There is insufficient evidence to conclude anything
21. Given that p̂ =.2 and n=250, estimate the 95% confidence interval for p.
Z0.025=1.96, Z0.05=1.645
a) [0.165 0.235]
b) [0.150 0.250]
c) [0.175 0.225]
d) [0.003 0.397]
e) [0.103 0.297]
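The interval in question 21 can be sketched with the usual large-sample formula p̂ ± z·sqrt(p̂(1−p̂)/n). Variable names are illustrative.

```python
import math

p_hat, n, z = 0.2, 250, 1.96

# Standard error of a sample proportion.
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
margin = z * standard_error

lower, upper = p_hat - margin, p_hat + margin
print(round(lower, 3), round(upper, 3))
```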
22. To test whether more than 20% of the UIUC students think that the US soccer team
will win 50% or more of their games at the upcoming World Cup 2002 in Japan and
South Korea, your null hypothesis would be:
a) H0: p  0.50
b) H0: p  20
c) H0: p  0.20
d) H0: p =0.50
e) H0: p  0.20
23. If you need to estimate the proportion of the UIUC students who will watch the 2002
World Cup this summer within .04 units with 90% confidence, how big of a sample do
you need given that Z0.05=1.645, Z0.10=1.28.
a) 21
b) 406
c) 422
d) 752
e) 423
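For question 23, a minimal sketch of the sample-size formula for a proportion follows. Because no prior estimate of p is given, it assumes the conservative value p = 0.5, which maximizes p(1−p); the function name is hypothetical.

```python
import math

def sample_size_for_proportion(z, half_width, p=0.5):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= half_width."""
    return math.ceil(z ** 2 * p * (1 - p) / half_width ** 2)

# 90% confidence leaves 5% in each tail, so z0.05 = 1.645 applies.
n_required = sample_size_for_proportion(z=1.645, half_width=0.04)
print(n_required)
```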
Use the following information to answer questions 24-25.
Suppose Kraft Foods only produces regular butter, fat-reduced butter, fat-free butter and
margarine. The management will terminate production of any product whose proportion
of the total sales volume is less than 10%. Assuming people buy only one of the 4, a
survey of 400 people was conducted and the responses are:
The Number of Consumers of Each Product
Regular butter:      80
Fat-reduced butter:  100
Fat-free butter:     200
Margarine:           20
24. To make the correct decision, management should construct the following null and
the alternative hypotheses:
a) H0: p  0.10     Ha: p  0.10
b) H0: p  10       Ha: p  10
c) H0: p  0.10     Ha: p  0.10
d) H0: p =0.10     Ha: p  0.10
e) H0: p  0.10     Ha: p  0.10
25. Regarding the question above, what would the management’s decision be regarding
margarine production given that Z0.05=1.665: (assume 5% significance level)
a) Fail to reject the null hypothesis and discontinue producing margarine
b) Reject the null hypothesis and discontinue producing margarine
c) Fail to reject the null hypothesis and conclude that there is not enough evidence to
continue producing margarine
d) Reject the null hypothesis and continue producing margarine
e) Fail to reject the null hypothesis and conclude that there is not enough evidence to
discontinue producing margarine.
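The margarine decision in question 25 can be sketched as a one-sided z-test for a proportion. The relational symbols in question 24 did not survive in this copy, so the sketch assumes H0: p ≥ 0.10 against Ha: p < 0.10, and it uses the critical value 1.665 exactly as printed in the question. Variable names are illustrative.

```python
import math

margarine_buyers, respondents = 20, 400
p0 = 0.10            # threshold share below which production stops
z_critical = 1.665   # value printed in the question

p_hat = margarine_buyers / respondents

# Standard error is computed under the null value p0.
z_stat = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / respondents)

# Lower-tail test: reject H0 when the statistic is below -z_critical.
reject_h0 = z_stat < -z_critical

print(round(z_stat, 3), reject_h0)
```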
26. The owner of a travel agency is deciding where to open a new office. She believes
that people whose income is over $40,000 will use the agency services. If more than 60%
of the households within a radius of 5 miles have an income of $40,000 or more, then the
owner will open a new agency in the area. Suppose you have data on all household
incomes in the area. What kind of technique would you use to determine whether the
agency will enter a given area?
a) Single population quantitative Z-test
b) Single population qualitative t-test
c) Two population difference in proportions
d) Single population quantitative t-test
e) Single population qualitative Z-test
27. Suppose that you want to compare the means of two populations, but you are not
sure whether you should use EQUAL or UNEQUAL variances when constructing your
test statistic. How do you proceed?
a) construct an F-test and decide whether variances are equal or unequal and proceed
with a χ2 - test
b) assume the variances are equal and do a t - test
c) construct an F-test and determine to use unequal variances and do a t - test
d) look at the variances and eyeball whether they are the same or not and then use a
χ2 - test
e) construct an F-test and decide whether variances are equal or unequal and proceed
with a t-test
Use the following information to answer questions 28-31.
In this regression the dependent variable is salary and the independent variable is work
experience.
SUMMARY OUTPUT
Regression Statistics
Multiple R
R Square
Adjusted R Square
Standard Error
Observations         14

ANOVA
             df    SS             MS             F
Regression   1     571.9287593    571.9287593
Residual     12    292.1736322    24.34780268
Total        13

             Coefficients    Standard Error   T Stat         P-value
Intercept    -7.442922509    2.607462884      -2.854469206   0.014501968
Experience   1.613399136     0.332889992                     0.000400514
28. The coefficient of correlation is equal to_____and the coefficient of determination is
equal to______:
a. 0.814 and 0.662
b. 0.662 and 0.814
c. 0.510 and 0.715
d. 0.715 and 0.510
e. None of the above
29. The P-value for the test for whether there is significant correlation between salary
and work experience is equal to:
a. 0.000400514
b. 0.000200257
c. 0.000801028
d. 0.014501968
e. None of the above
30. The test statistic for testing the significance of the coefficient of work experience in
this regression is equal to:
a. 2.855
b. –0.847
c. –2.866
d. 4.847
e. None of the above
31. The standard error of the estimate is equal to:
a. 4.934
b. 24.348
c. 292.174
d. 0.333
e. None of the above
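The blanks in the salary regression output (questions 28 to 31) can be filled in from the ANOVA and coefficient rows. The sketch below uses only the printed values; variable names are illustrative, and the correlation is taken as positive because the estimated slope is positive.

```python
import math

ss_regression = 571.9287593
ss_residual = 292.1736322
ms_residual = 24.34780268
slope, slope_se = 1.613399136, 0.332889992

# Q28: coefficient of determination and coefficient of correlation.
r_squared = ss_regression / (ss_regression + ss_residual)
r = math.sqrt(r_squared)  # positive root, matching the sign of the slope

# Q30: t statistic for H0: beta1 = 0.
t_slope = slope / slope_se

# Q31: standard error of the estimate is the square root of MSE.
std_error_estimate = math.sqrt(ms_residual)

print(round(r, 3), round(r_squared, 3),
      round(t_slope, 3), round(std_error_estimate, 3))
```

In a simple regression, the test for significant correlation and the t-test on the slope are the same test, so the p-value printed in the Experience row applies to both.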
32. The following graph is obtained from a regression analysis. In this regression which
of the following could likely be a problem?
[Figure: residuals plotted against time]
a. Positively autocorrelated errors
b. Heteroskedasticity
c. Negatively autocorrelated errors
d. a and c
e. b and c
33. The following graph is obtained from a regression analysis. In this regression, which
of the following is likely to be a problem?
a. Positively autocorrelated errors
b. Heteroskedasticity
c. Negatively autocorrelated errors
d. a and c
e. b and c
34. It has been computed that the 95% confidence interval is [144.4, 154.2] for the
average exam score when a student spent 10 hours on average per week studying for the
class. The 99% prediction interval for a student who spent 10 hours on average per week
studying for the class will be
a. of the same width
b. narrower
c. cannot be determined based on the provided information.
d. wider
e. None of the above
35. Which of the following is only possible in time-series data?
a. Autocorrelation
b. Heteroskedasticity
c. Non-normality of error terms
d. Multicollinearity
e. Both A and B
36. The Durbin-Watson test statistic is designed to test _____and it will vary
between_____:
a. Heteroskedasticity, -4 and 4
b. Autocorrelation, 0 and 4
c. Non normality assumption, 2 and 4
d. Multicollinearity, 4 and 0
e. None of the above
37. The overall significance F-test is used to
a. Test if all independent variables are significant
b. Test if the model explains all the variation in the dependent variable
c. Test if at least one of the independent variables is significant
d. Give the same information as R^2
e. Select between 2 different models
38. A multiple regression was performed with 2 independent variables and 50
observations on time series data. You suspect that the regression error terms may be
correlated, therefore violating a key regression assumption. What test would you perform
to verify this?
a. the partial F test
b. the Durbin-Watson test
c. the F test for absolute significance
d. the t-test for individual significance
e. the t-test for mean difference
39. Upon performing the correct test for the problem above, you got a test statistic equal
to 3.15. Your critical values are as follows:
dL = 1.46
dU = 1.63
You would then conclude that:
a. there is no autocorrelation
b. there is significant positive autocorrelation
c. there is significant negative autocorrelation
d. the model is not significant as a whole
e. the means of your independent variables are not equal
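The decision rule behind question 39 can be sketched as a small function. It follows the standard Durbin-Watson regions (below dL, above 4 − dL, between dU and 4 − dU, and inconclusive otherwise); the function name and return strings are hypothetical.

```python
def durbin_watson_decision(d, d_lower, d_upper):
    """Classify a Durbin-Watson statistic d using critical values dL and dU."""
    if d < d_lower:
        return "positive autocorrelation"
    if d > 4 - d_lower:
        return "negative autocorrelation"
    if d_upper <= d <= 4 - d_upper:
        return "no autocorrelation"
    return "inconclusive"

# Values from question 39: d = 3.15, dL = 1.46, dU = 1.63.
decision = durbin_watson_decision(3.15, d_lower=1.46, d_upper=1.63)
print(decision)
```

Values near 2 indicate no autocorrelation, values near 0 indicate positive autocorrelation, and values near 4 indicate negative autocorrelation.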
40. In Project II, you were asked to create a number of variables. The following text is
taken directly from the Project II description, and provides the context for this question.
“They believe that employees who foresee a bright future in the company will have better
motivation and, hence, will generate in higher sales. So they want to use the three
questions about future employment possibilities (ADVNCOK, PROMFAIR and
FUTOPPS) as predictors. Rather than including their results separately, they believe that
they should form a scale by adding the results of the three questions together to form a
Guttman Scale.” For this scale to be reliable,
a. the multiple R statistic for autocorrelated errors measured across
independent variables should be significantly negative
b. the multiple R statistic for autocorrelated errors measured across
independent variables should be significantly positive
c. the components should be negatively correlated with each other
d. the components should be positively correlated with each other
e. both b and d
41. Recall that for Project II, you identified variables that were not individually
significant. Then, to see if they might be significant as a group, you performed a partial
F-test. Suppose that the test resulted in a rejection of the null hypothesis. Under these
circumstances, your FINAL MODEL would be:
a. the full model
b. the reduced model
c. a model with completely new independent variables
d. used for prediction
e. both a and d
42. When computing a seven period moving average, the number of observations lost
is:
a. 2
b. 3
c. 4
d. 5
e. 6
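The count in question 42 can be illustrated with a short moving-average sketch. The 20-point series is made-up illustration data, and the function name is hypothetical.

```python
def moving_average(values, window):
    """Return the average of every full run of `window` consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

series = list(range(1, 21))           # hypothetical 20-period series
smoothed = moving_average(series, 7)  # seven period moving average

# A window of k periods produces k - 1 fewer values than the input.
observations_lost = len(series) - len(smoothed)
print(observations_lost)
```

When the average is centered, the lost observations split evenly: three at the start and three at the end of the series.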
43. The time series of U.S. Gross Domestic Product (GDP), when plotted, reveals many
ups and downs, but in general, shows a tendency to increase with time. Such a general
tendency is best picked up by:
a. the cyclical component
b. the seasonal component
c. the trend component
d. the random component
e. none of the above
Use the following information to answer questions 44-47.
A manufacturer of ski equipment has noticed that there appears to be a seasonal pattern to
her sales data. She therefore undertakes the computation of seasonal indices from the
quarterly data she has from the 1st quarter of 1996 until the last quarter of 1999. The
residuals and the predicted y values from the linear trend model are provided below:
Observation   Predicted Y    Residuals
1             92.44852941    13.55147
2             94.47205882    -2.47206
3             96.49558824    -31.4956
4             98.51911765    22.48088
5             100.5426471    14.45735
6             102.5661765    -2.56618
7             104.5897059    -31.5897
8             106.6132353    28.38676
9             108.6367647    5.363235
10            110.6602941    -5.66029
11            112.6838235    -33.6838
12            114.7073529    25.29265
13            116.7308824    4.269118
14            118.7544118    -7.75441
15            120.7779412    -38.7779
16            122.8014706    40.19853
44. The percentage trend for the 3rd quarter of 1997 should be:
a. 114.4%
b. 97.5%
c. 69.8%
d. 126.6%
e. cannot be computed from the information provided
45. Suppose the seasonal indices the ski equipment manufacturer came up with are:
Q1 = 1.094
Q2 = 0.958
Q3 = 0.688
Q4 = 1.26
Then the seasonally adjusted value of the original series for the 3rd quarter of 1997 should
be:
a. 106.105
b. 50.216
c. 196.25
d. 92.87
e. cannot be computed from the information provided
46. And the seasonally adjusted forecast for the 3rd quarter of 1997 is:
a. 69.17
b. 70.57
c. 71.96
d. 73.35
e. cannot be computed from the information provided
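Questions 44 to 46 all concern the 3rd quarter of 1997, which is observation 7 when observation 1 is the 1st quarter of 1996. The sketch below assumes the residual is defined as actual minus predicted, so the actual value is recovered by adding the two; variable names are illustrative.

```python
predicted_trend = 104.5897059   # Predicted Y for observation 7
residual = -31.5897             # Residual for observation 7
seasonal_index_q3 = 0.688       # Q3 index given in question 45

# Assumption: residual = actual - predicted.
actual = predicted_trend + residual

# Q44: actual value as a percentage of the trend value.
percent_of_trend = actual / predicted_trend * 100

# Q45: deseasonalize the actual value by dividing by the seasonal index.
seasonally_adjusted = actual / seasonal_index_q3

# Q46: reseasonalize the trend value by multiplying by the seasonal index.
forecast = predicted_trend * seasonal_index_q3

print(round(percent_of_trend, 1), round(seasonally_adjusted, 3),
      round(forecast, 2))
```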
47. When selecting a model from competing forecasting techniques, a way to choose is
on the basis of:
a. MAD
b. SSE
c. DW
d. Both a and b
e. None of the above
Answer Key:
1. b
2. a
3. c
4. a
5. d
6. d
7. e
8. b
9. b
10. b
11. e
12. b
13. c
14. a
15. c
16. d
17. c
18. d
19. e
20. b
21. b
22. c
23. e
24. c
25. b
26. e
27. e
28. a
29. a
30. d
31. a
32. c
33. b
34. d
35. a
36. b
37. c
38. b
39. c
40. d
41. e
42. e
43. c
44. c
45. a
46. c
47. d