252x0541 4/22/05 1 ECO252 QBA2

252x0541 4/22/05
ECO252 QBA2
Final EXAM
May 2-6, 2005
TAKE HOME SECTION
Name: _________________________
Student Number: _________________________
Class days and time : _________________________
1) Please note: computer problems 2, 3 and 4 should be turned in with the exam (2). In problem 2, the
2-way ANOVA table should be checked. The three F tests should be done at a 5% significance level, and
you should note whether there was (i) a significant difference between drivers, (ii) a significant difference
between cars and (iii) significant interaction. In problem 3, you should show on your third graph where the
regression line is. Check what your text says about normal probability plots and analyze the plot you did.
Explain the results of the t and F tests using a 5% significance level. (2)
2) 4th computer problem (4+)
This is an internet project. You are trying to answer the question, ‘how well does manufacturing explain
differences in income?’ You should use some measure of income per person or family in each state as your
dependent variable and try to explain it as a function of (to start with) percent of output or labor force in
manufacturing. This should start out as a simple regression. Then you should try to see whether there are
other variables that explain the differences as well. One possibility is the per cent of the adult population
with college or high school diplomas. Possible sources of data are below, but think about what you use, and
try to find some other sources. Total income of a state, for example, is a very poor choice compared with a
per capita measure, because it is simply going to be high for places with a lot of people without indicating
how well off they are. Similarly, the fraction of the workforce with a certain education level is far better than
the number. For instructions on how to do a regression, try the material in Doing a Regression.
http://www.nam.org/s_nam/sec.asp?CID=5&DID=3 Manufacturing share in state economies
(http://www.nam.org/Docs/IEA/26767_20002001ManufacturingShareandChangeinStateEconomies.pdf?DocTypeID=9&TrackID=&Param=@CategoryID=1156@TPT=2002-2001+Manufacturing+Share+and+Change+in+State+Economics)
http://www.nemw.org/data.htm Per capita income by state.
http://www.nemw.org/data.htm State personal income per capita.
http://www.bea.doc.gov/bea/regional/data.htm Personal income per capita by state.
http://www.census.gov/statab/www/ Many state statistics, including persons with bachelor’s degrees.
http://www.epinet.org/content.cfm/datazone_index Income inequality, median income, unemployment rates.
Anyway, your job is to add whatever variable you think ought to explain your income measure. Consider all
50 states your sample. Your report should tell what numbers you used, from where and from what years.
Which coefficients were significant, and do you think on the basis of your results that manufacturing is an
important predictor of a state’s prosperity? Mark all significant F and t statistics using a 5% significance
level. Explain the VIFs.
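As a rough aid for the VIF part of the report, here is a minimal Python sketch for the special case of exactly two predictors, where both predictors share the value VIF = 1 / (1 - r^2) and r is the correlation between them. The function names and the sample numbers are illustrative assumptions, not real state data; Minitab's own VIF output remains the thing to report.

```python
import math


def correlation(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the regression has exactly two."""
    r = correlation(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical placeholder values, NOT real state data.
manufacturing_share = [12.0, 18.5, 9.3, 21.0, 15.2, 7.8]
college_share = [30.1, 22.4, 33.0, 20.5, 25.7, 35.2]
print(round(vif_two_predictors(manufacturing_share, college_share), 2))
```

A VIF near 1 suggests the predictors are nearly uncorrelated; a common rule of thumb treats values above about 5 as a sign of troublesome collinearity. With more than two predictors each VIF comes from regressing that predictor on all the others, which this sketch does not do.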
Of course, if you don’t like this assignment, get approval to research something else on the internet. For
example, does the per cent of the population in prison affect the crime rate (maybe with a few years’ lag)?
Or are there better predictors? And get out the Durbin-Watson: prison vs. crime rate is a time-series project.
[8]
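For the time-series alternative, the Durbin-Watson statistic can be checked by hand from the regression residuals. The sketch below assumes the residuals are already in time order; the function name and the sample residuals are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson d: squared successive differences over squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Hypothetical residuals in time order, for illustration only.
example_residuals = [0.5, 0.7, 0.2, -0.4, -0.6, -0.1, 0.3]
print(round(durbin_watson(example_residuals), 3))
```

The statistic lies between 0 and 4. Values near 2 suggest no first-order autocorrelation, and the decision still has to be made against the tabulated lower and upper bounds for your sample size and number of predictors.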
3) Hotshot Associates is afraid of sex discrimination charges and collects the data below. The dependent
variable is income in thousands of dollars and the two independent variables are education in years and a
dummy variable indicating sex (1 means a female). The lines in the middle are missing because the totals
are reliable and you don’t need them. The only thing that is missing is you. Add yourself to the sample as a
21st observation with 12 years of education and an income of 100.0 (thousand) plus the last two digits of
your student number as hundreds. For example Roland Dough’s student number is 123689, so he adds
$8900 to $100000 to get 108900, which he records as 108.9.
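The personalization rule can be written as a one-line calculation. This is only a sketch of the arithmetic described above; the function name is an assumption.

```python
def personal_income(student_number):
    """Income in thousands: 100.0 plus the last two digits taken as hundreds."""
    last_two_digits = student_number % 100
    return round(100.0 + last_two_digits / 10, 1)


print(personal_income(123689))  # Roland Dough's example from the text
```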
          y      x1     x2
Row     INC    EDUC    SEX    x1²    x2²         y²      x1·y    x2·y   x1·x2
  1    39.0       2      0      4      0    1521.00      78.0     0.0       0
  2    43.7       4      1     16      1    1909.69     174.8    43.7       4
  3    62.6       8      0     64      0    3918.76     500.8     0.0       0
  4    42.8       8      1     64      1    1831.84     342.4    42.8       8
  5    55.0       8      0     64      0    3025.00     440.0     0.0       0
...
 17    72.9      16      0    256      0    5314.41    1166.4     0.0       0
 18    56.1      16      1    256      1    3147.21     897.6    56.1      16
 19    67.1      17      0    289      0    4502.41    1140.7     0.0       0
 20    82.3      21      0    441      0    6773.29    1728.3     0.0       0
Total 1168.5    241      7   3285      7   70091.67   14783.9   370.6      81
a. Compute the regression equation Ŷ = b0 + b1 x1 to predict salaries on the basis of education only.
(2)
b. Compute R². (2)
c. Compute s_e. (2)
d. Compute s_b1 and do a significance test on b1. (1.5)
e. Compute s_b0 and do a confidence interval for b0. (1.5)
f. You are about to hire your nephew for the summer and want to know how much to pay him. He
has 14 years of education. Using this, create a prediction interval for his salary. Explain why a
confidence interval for the salary is inappropriate. (3)
g. Do an ANOVA table for the regression. What conclusion can you draw from the hypothesis test
in the ANOVA? (2)
[22]
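Parts a through g can all be computed from column totals alone, which is why the middle rows of the table are not needed. The sketch below is one way to check hand work. It uses the printed 20-row totals, so its numbers will not match your own answers until your 21st observation is added. The class, function and dictionary key names are illustrative assumptions, and the critical t value must be looked up in a table and passed in.

```python
import math
from dataclasses import dataclass


@dataclass
class Sums:
    n: int      # number of observations
    x: float    # sum of x1 (EDUC)
    y: float    # sum of y (INC)
    xx: float   # sum of x1 squared
    yy: float   # sum of y squared
    xy: float   # sum of x1 * y


def add_observation(s, x, y):
    """Return new totals after appending one (x, y) pair."""
    return Sums(s.n + 1, s.x + x, s.y + y, s.xx + x * x, s.yy + y * y, s.xy + x * y)


def simple_regression(s):
    """Least-squares fit of y on x computed from column totals only."""
    x_bar, y_bar = s.x / s.n, s.y / s.n
    sxx = s.xx - s.n * x_bar ** 2          # corrected sum of squares of x
    sxy = s.xy - s.n * x_bar * y_bar       # corrected cross product
    sst = s.yy - s.n * y_bar ** 2          # total sum of squares
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ssr = b1 * sxy
    sse = sst - ssr
    s_e = math.sqrt(sse / (s.n - 2))
    return {
        "b0": b0, "b1": b1, "r2": ssr / sst, "s_e": s_e,
        "s_b1": s_e / math.sqrt(sxx),
        "s_b0": s_e * math.sqrt(1 / s.n + x_bar ** 2 / sxx),
        "f": ssr / (sse / (s.n - 2)),
        "x_bar": x_bar, "sxx": sxx,
    }


def prediction_interval(s, fit, x0, t_crit):
    """Interval for one new y at x0; t_crit comes from a t table (n - 2 df)."""
    y_hat = fit["b0"] + fit["b1"] * x0
    half_width = t_crit * fit["s_e"] * math.sqrt(
        1 + 1 / s.n + (x0 - fit["x_bar"]) ** 2 / fit["sxx"]
    )
    return y_hat - half_width, y_hat + half_width


# Totals row of the printed table (20 observations, before personalization).
totals_20 = Sums(n=20, x=241, y=1168.5, xx=3285, yy=70091.67, xy=14783.9)
fit_20 = simple_regression(totals_20)
print({name: round(value, 4) for name, value in fit_20.items()})

# Roland Dough's example row: 12 years of education, income 108.9.
totals_21 = add_observation(totals_20, 12, 108.9)
print(round(simple_regression(totals_21)["b1"], 4))
```

The prediction interval has the extra 1 under the square root because it covers a single new salary rather than the mean salary at that education level.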
Extra credit from here on.
h. Do a multiple regression of income against education and sex. (5)
i. Compute R-squared and R-squared adjusted for degrees of freedom for this regression and
compare them with the values for the previous problem. (2)
j. Using either R-squared or SST, SSR and SSE, do F tests (ANOVA). First check the usefulness
of the simple regression and then the value of ‘sex’ as an improvement to the regression. How
should this impact Hotshot Associates’ discrimination problem? (Don’t say a word without
referring to a statistical test.) (3)
k. Predict what you will pay your nephew now. How much change is there from your last
prediction? (2)
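The extra-credit regression can also be done from totals by solving the normal equations (X'X)b = X'y. The following sketch uses the printed 20-row totals, so results will differ once your own observation is added. The helper name `solve_linear` and the variable names are assumptions. It also computes the partial F statistic for adding sex to the education-only model; the critical value has to come from an F table.

```python
def solve_linear(a, b):
    """Solve a small square system a x = b by Gaussian elimination."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest entry in this column to the top.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


# Totals from the printed table (20 observations, before personalization).
n = 20
xtx = [[n, 241, 7],          # n,       sum x1,    sum x2
       [241, 3285, 81],      # sum x1,  sum x1^2,  sum x1*x2
       [7, 81, 7]]           # sum x2,  sum x1*x2, sum x2^2
xty = [1168.5, 14783.9, 370.6]   # sum y, sum x1*y, sum x2*y
sum_yy = 70091.67

b = solve_linear(xtx, xty)       # [b0, b1, b2]
y_bar = xty[0] / n
sst = sum_yy - n * y_bar ** 2
sse_full = sum_yy - sum(bi * ci for bi, ci in zip(b, xty))
k = 2                            # number of predictors in the full model
r2 = 1 - sse_full / sst
r2_adj = 1 - (sse_full / (n - k - 1)) / (sst / (n - 1))

# Reduced model (education only), needed for the partial F test on sex.
sxx = 3285 - 241 ** 2 / n
sxy = 14783.9 - 241 * 1168.5 / n
sse_reduced = sst - sxy ** 2 / sxx
f_sex = (sse_reduced - sse_full) / (sse_full / (n - k - 1))

print([round(v, 4) for v in b], round(r2, 4), round(r2_adj, 4), round(f_sex, 2))
```

Here `f_sex` has 1 and n - k - 1 degrees of freedom, which matches the second F test asked for in part j.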
4) An airport authority wants to compare training of air traffic controllers at three locations. Data is on the
next page. To personalize these data, add the last two digits of your student number as a 9th number to
column C.
a. Compare the performance of locations A, B, and C assuming that the underlying distribution is non-Normal. (4)
[26]
b. Use a one-way ANOVA to test the hypothesis of equal means. (5) It is legitimate to check your results by
computer, but I expect to see hand computations every step of the way.
[31]
c. (Extra Credit) Decide between the methods that you used in a) and b). To do this test for equal variances
and for Normality on the computer. What is your decision? Why?
[4]
You can do most of this with the following commands in Minitab if you put your data in 3 columns of
Minitab with A, B, and C above them.
MTB > AOVOneway A B C          #Does a 1-way ANOVA
MTB > stack A B C c11;         #Stacks the data in c11, col.no. in c12.
SUBC>   subscripts c12;
SUBC>   UseNames.
MTB > rank c11 c13             #Puts the ranks of the stacked data in c13
MTB > vartest c11 c12          #Does a bunch of tests, including Levene’s
                               # on stacked data in c11 with IDs in c12.
MTB > Unstack (c13);           #Unstacks the ranks in the next 5 available
SUBC>   Subscripts c12;        # columns. Uses IDs in c12.
SUBC>   After;
SUBC>   VarNames.
MTB > NormTest 'A';            #Does a test (apparently Lilliefors) for Normality
SUBC>   KSTest.                # on column A.
Data for Problem 4
Row    A    B    C
  1   96   65   60
  2   82   74   73
  3   88   72   85
  4   70   66   61
  5   90   79   79
  6   91   82   85
  7   87   73   88
  8   88        79
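For part a, one rank-based option is the Kruskal-Wallis test, which ranks all observations together and compares the rank sums of the columns. The sketch below is a hand-check under that assumption and uses no tie correction. The function names are illustrative, and column C is shown without the personalized 9th value.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1     # mean of positions i+1 .. j+1
        for position in range(i, j + 1):
            ranks[order[position]] = shared_rank
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H, rank sums) for a list of samples, without tie correction."""
    pooled = [value for group in groups for value in group]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    rank_sums = []
    start = 0
    for group in groups:
        rank_sums.append(sum(ranks[start:start + len(group)]))
        start += len(group)
    weighted = sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    return h, rank_sums


a = [96, 82, 88, 70, 90, 91, 87, 88]
b = [65, 74, 72, 66, 79, 82, 73]
c = [60, 73, 85, 61, 79, 85, 88, 79]   # append your own 9th value here

h_stat, sums = kruskal_wallis([a, b, c])
print(round(h_stat, 3), sums)
```

With samples this large, H is usually compared with a chi-squared critical value on (number of columns - 1) degrees of freedom.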
This might help.
MTB > sum c1
Sum of A
Sum of A = 692
MTB > ssq c1
Sum of Squares of A
Sum of squares (uncorrected) of A = 60278
MTB > sum c2
Sum of B
Sum of B = 511
MTB > ssq c2
Sum of Squares of B
Sum of squares (uncorrected) of B = 37535
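The sums and uncorrected sums of squares above are exactly what the hand computation in part b needs. Here is a minimal sketch of the one-way ANOVA arithmetic built from per-column counts, sums and sums of squares. The function and key names are assumptions, and column C is shown without the personalized 9th value, so the resulting table is not the one to hand in.

```python
def group_summary(values):
    """Return (count, sum, uncorrected sum of squares) for one column."""
    return len(values), sum(values), sum(v * v for v in values)


def one_way_anova(summaries):
    """One-way ANOVA table entries from per-group (n, sum, sum of squares)."""
    k = len(summaries)
    n_total = sum(n for n, _, _ in summaries)
    grand_sum = sum(total for _, total, _ in summaries)
    correction = grand_sum ** 2 / n_total
    ss_total = sum(ssq for _, _, ssq in summaries) - correction
    ss_between = sum(total * total / n for n, total, _ in summaries) - correction
    ss_within = ss_total - ss_between
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return {
        "ss_between": ss_between, "df_between": k - 1,
        "ss_within": ss_within, "df_within": n_total - k,
        "ss_total": ss_total, "f": ms_between / ms_within,
    }


a = [96, 82, 88, 70, 90, 91, 87, 88]
b = [65, 74, 72, 66, 79, 82, 73]
c = [60, 73, 85, 61, 79, 85, 88, 79]   # append your own 9th value here

print(group_summary(a), group_summary(b))   # compare with the Minitab output
table = one_way_anova([group_summary(g) for g in (a, b, c)])
print({name: round(value, 3) for name, value in table.items()})
```

The F statistic is then compared with the tabulated F value for `df_between` and `df_within` degrees of freedom at the 5% level.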