252x0761 11/16/07
ECO252 QBA2
THIRD EXAM
April 16, 2007
TAKE HOME SECTION
Name: _________________________
Student Number: _________________________
Class days and time : _________________________
Please Note: Computer problems 2 and 3 should be turned in with the exam (2). In problem 2, the 2 way
ANOVA table should be checked. The three F tests should be done with a 1% significance level and you
should note whether there was (i) a significant difference between drivers, (ii) a significant difference
between cars and (iii) significant interaction. In problem 3, you should show on your third graph where the
regression line is. You should explain whether the coefficients are significant at the 1% level. Check what
your text says about normal probability plots and analyze the plot you did. Explain the results of the t and F
tests using a 5% significance level. (3)
III Do the following. (22+ points) Note: Look at 252thngs on the syllabus supplement part of
the website before you start (and before you take exams). Show your work! State $H_0$ and $H_1$ where
appropriate. You have not done a hypothesis test unless you have stated your hypotheses, run the
numbers and stated your conclusion. (Use a 95% confidence level unless another level is specified.)
Answers without reasons or accompanying calculations usually are not acceptable. Neatness and
clarity of explanation are expected. This must be turned in when you take the in-class exam. Note
that from now on neatness means paper neatly trimmed on the left side if it has been torn, multiple
pages stapled and paper written on only one side. Show your work!
1) The Lees, in their book on statistics for Finance majors, ask about the relationship of gasoline prices $(y)$
in cents per gallon to crude oil prices $(x_1)$ in dollars per barrel and present the data for the years 1975–1988. I have obtained most of the data for the years 1980 – 2007. It is presented below.
Row  Year  GasPrice  CrudePrice  Yr-1979
  1  1980      1.25       26.07        1
  2  1981      1.38       35.24        2
  3  1982      1.30       31.87        3
  4  1983      1.24       26.99        4
  5  1984      1.21       28.63        5
  6  1985      1.20       26.25        6
  7  1986      0.93       14.55        7
  8  1987      0.95       17.90        8
  9  1988      0.96       14.67        9
 10  1989      1.02       17.97       10
 11  1990      1.16       22.22       11
 12  1991      1.14       19.06       12
 13  1992      1.13       18.43       13
 14  1993      1.11       16.41       14
 15  1994      1.11       15.59       15
 16  1995      1.15       17.23       16
 17  1996      1.23       20.71       17
 18  1997      1.23       19.04       18
 19  1998      1.06       12.52       19
 20  1999      1.17       17.51       20
 21  2000      1.51       28.26       21
 22  2001      1.46       22.95       22
 23  2002      1.36       24.10       23
 24  2003      1.59       28.53       24
 25  2004      1.88       36.98       25
 26  2005      2.30       50.23       26
 27  2006         *           *       27
 28  2007      3.10       90.00       28
This data set also contains the year with 1979 subtracted from it $(x_2)$. You may need to use this later.
Ignore it in Problem 1. Note that the numbers for 2006 have not yet been published in my source, Statistical
Abstract of the United States, and that the numbers for 2007 are my estimates for third quarter prices. These
are unleaded prices, which the Lees did not use. You are supposed to use only the numbers for 1990
through 2006 and one other observation for your data. Because 2006 is missing, the years 1990 through 2006
supply 16 observations, so you will have $n = 17$ observations in all. The other
observation is the value for the year $1980 + a$, where $a$ is the second to last digit of your student number. If
you are unsure of the data that you are using or if you want help with the sums that you need to do the
regression go to 3takehome072a.
Show your work – it is legitimate to check your results by running the problem on the computer. (In fact, I
will give you 2 points extra credit for checking it and annotating the output for significance tests etc.) But I
expect to see hand computations for every part of this problem.
a. Compute the regression equation $\hat{Y} = b_0 + b_1 x$ to predict the price of gasoline on the basis of
crude oil prices. (3)
b. Compute $R^2$. (2)
c. Compute $s_e$. (2)
d. Compute $s_{b_1}$ and do a significance test on $b_1$. (2)
e. Compute a confidence interval for $b_0$. (2)
f. You have a crude price for 2007. Using this, predict the gasoline price for 2007 and create a
prediction interval for the price of gasoline for that year. Explain why a confidence interval for the
price is inappropriate and check to see if my estimated price is in the interval. (3)
g. Do an ANOVA for this regression. (3)
h. Make a graph of the data. Show the trend line and the data points clearly. If you are not willing
to do this neatly and accurately, don’t bother. (2)
[19]
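The hand computations in parts a through g can be checked with a short script. The following is only a sketch under stated assumptions, not a model answer: it takes the 16 published rows for 1990 through 2005 from the table and adds 1980 as the seventeenth observation, which assumes $a = 0$; the variable names are illustrative, and the critical value 2.131 is the two-sided 5% t value for 15 degrees of freedom read from a t table.

```python
from math import sqrt

# (crude price x, gas price y) for 1990-2005 from the table; 2006 is "*".
# The 17th pair is 1980, which ASSUMES a = 0; substitute your own year.
crude = [22.22, 19.06, 18.43, 16.41, 15.59, 17.23, 20.71, 19.04,
         12.52, 17.51, 28.26, 22.95, 24.10, 28.53, 36.98, 50.23, 26.07]
gas = [1.16, 1.14, 1.13, 1.11, 1.11, 1.15, 1.23, 1.23,
       1.06, 1.17, 1.51, 1.46, 1.36, 1.59, 1.88, 2.30, 1.25]

n = len(crude)
x_bar = sum(crude) / n
y_bar = sum(gas) / n

# Corrected sums of squares and cross products.
ss_x = sum(x * x for x in crude) - n * x_bar ** 2
ss_y = sum(y * y for y in gas) - n * y_bar ** 2
s_xy = sum(x * y for x, y in zip(crude, gas)) - n * x_bar * y_bar

# a. slope and intercept
b1 = s_xy / ss_x
b0 = y_bar - b1 * x_bar

# b, c, g. sums of squares, R^2, standard error, ANOVA F
ssr = b1 * s_xy
sse = ss_y - ssr
r_sq = ssr / ss_y
s_e = sqrt(sse / (n - 2))
f_stat = ssr / (sse / (n - 2))

# d, e. standard errors of the coefficients and the t ratio for b1
s_b1 = s_e / sqrt(ss_x)
s_b0 = s_e * sqrt(1 / n + x_bar ** 2 / ss_x)
t_b1 = b1 / s_b1

T_CRIT = 2.131  # t table value, 15 df, two-sided 5% (assumed lookup)
b0_interval = (b0 - T_CRIT * s_b0, b0 + T_CRIT * s_b0)

# f. prediction interval at the 2007 crude price from the table
x0 = 90.00
y0_hat = b0 + b1 * x0
s_pred = s_e * sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / ss_x)
pred_interval = (y0_hat - T_CRIT * s_pred, y0_hat + T_CRIT * s_pred)

print(f"b0={b0:.4f} b1={b1:.4f} R2={r_sq:.4f} s_e={s_e:.4f}")
print(f"t(b1)={t_b1:.3f} F={f_stat:.3f} b0 interval={b0_interval}")
print(f"2007 prediction={y0_hat:.3f} interval={pred_interval}")
```

In simple regression the ANOVA F equals the square of the t ratio for the slope, which gives a quick consistency check on the hand work.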
2) Now we can use the date to see if there is a trend line in addition to the effect of crude oil.
a. Do a multiple regression of the price of gasoline against crude prices and the date variable,
which has been massaged to make 1980 year 1. This involves a simultaneous equation solution.
Attempting to recycle $b_1$ from the previous page won’t work. (7)
b. Compute the regression sum of squares and use it in an ANOVA F test to test the usefulness of
this regression. (4)
c. Compute $R^2$ and $R^2$ adjusted for degrees of freedom for both this and the previous problem.
Compare the values of $R^2$ adjusted between this and the previous problem. Use an F test to
compare $R^2$ here with the $R^2$ from the previous problem. The F test here is one to see if adding a
new independent variable improves the regression. This can also be done by modifying the
ANOVAs in b. (4)
d. Use your regression to predict the price of gasoline in 2007. Is this closer to the estimated
gasoline price? Do a confidence interval and a prediction interval. (3)
[37]
e. Again there is extra credit for checking your results on the computer. Use the pull-down menu or
try
Regress GasPrice on 2 CrudePrice Yr-1979 (2)
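The simultaneous equation solution in part a, together with the F tests and adjusted $R^2$, can be sketched as follows. This is an illustration under assumptions rather than the required layout: it again uses 1990 through 2005 plus 1980 (that is, $a = 0$), solves the two centered normal equations by Cramer's rule, and uses variable names invented for this sketch.

```python
# Observations: 1990-2005 from the table plus 1980 (ASSUMES a = 0).
crude = [22.22, 19.06, 18.43, 16.41, 15.59, 17.23, 20.71, 19.04,
         12.52, 17.51, 28.26, 22.95, 24.10, 28.53, 36.98, 50.23, 26.07]
year_idx = list(range(11, 27)) + [1]   # Yr-1979 for 1990..2005, then 1980
gas = [1.16, 1.14, 1.13, 1.11, 1.11, 1.15, 1.23, 1.23,
       1.06, 1.17, 1.51, 1.46, 1.36, 1.59, 1.88, 2.30, 1.25]

n = len(gas)
k = 2  # number of independent variables


def mean(values):
    return sum(values) / len(values)


def corrected(u, v):
    """Corrected sum of cross products: sum(u*v) - n * mean(u) * mean(v)."""
    return sum(a * b for a, b in zip(u, v)) - len(u) * mean(u) * mean(v)


s11 = corrected(crude, crude)
s22 = corrected(year_idx, year_idx)
s12 = corrected(crude, year_idx)
s1y = corrected(crude, gas)
s2y = corrected(year_idx, gas)
syy = corrected(gas, gas)

# Normal equations in centered form:
#   s1y = b1*s11 + b2*s12
#   s2y = b1*s12 + b2*s22
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s2y * s12) / det
b2 = (s2y * s11 - s1y * s12) / det
b0 = mean(gas) - b1 * mean(crude) - b2 * mean(year_idx)

# ANOVA quantities for the two-variable regression.
ssr = b1 * s1y + b2 * s2y
sse = syy - ssr
r_sq = ssr / syy
r_sq_adj = 1 - (1 - r_sq) * (n - 1) / (n - k - 1)
f_overall = (ssr / k) / (sse / (n - k - 1))

# F test for adding the year variable to the crude-only regression.
ssr_simple = s1y ** 2 / s11
f_added = (ssr - ssr_simple) / (sse / (n - k - 1))

# Point prediction for 2007 (crude 90.00, Yr-1979 = 28).
y_2007 = b0 + b1 * 90.00 + b2 * 28

print(f"b0={b0:.4f} b1={b1:.4f} b2={b2:.4f}")
print(f"R2={r_sq:.4f} adj R2={r_sq_adj:.4f} F={f_overall:.3f}")
print(f"F for added variable={f_added:.3f} 2007 prediction={y_2007:.3f}")
```

The sketch stops at the point prediction; the confidence and prediction intervals in part d need the standard error of the prediction, which is not computed here.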
3) According to Russell Langley, three sopranos were discussing their recent performances. Fifi noted that
she got 36 curtain calls at La Scala last week, but Adelina put her down with the fact that she got 39. Could
one of the singers really say that she had more curtain calls than another or could the differences just be due
to chance?
Personalize the data below by adding the last digit of your student number to each number in the
first row. Use a 10% significance level throughout this question.
Row  Fifi  Adelina  Maria
  1    36       39     21
  2    22       14     32
  3    19       20     28
  4    16       18     22
a) State your hypothesis and use a method to compare means assuming that each column represents a
random sample of curtain calls at La Scala. (4)
b) Still assuming that these are random samples, use a method that compares medians instead. (3)
c) Actually, these were not random samples. Though row 1 represents curtain calls at La Scala (Milan), row
2 was in Venice, row 3 in Naples and row 4 in Rome. Will this affect our results? Does this show anything
about audiences in the four cities? Use an appropriate method to compare means. (5)
d) Do two different types of confidence intervals between Milan and the least enthusiastic opera house.
Explain the difference between the intervals. (2)
e) Assume that we want to compare medians instead. How does the fact that these data were collected at
four opera houses affect the results? (3)
f) Do you prefer the methods that compare medians or means? Don’t answer this unless you can
demonstrate an informed opinion. (1)
g) (Extra credit) Do a Levene test on these data and explain what it tests and shows. (3)
h) (Extra credit) Check your work on the computer. This is pretty easy to do. Use the same format as in
Computer Problem 2, but instead of car and driver numbers use the singers’ and cities’ names. You can use
the stat and ANOVA pull-down menus for One-way ANOVA, two-way ANOVA and comparison of
variances of the columns. You can use the stat and the non-parametrics pull-down menu for Friedman and
Kruskal-Wallis. You also probably ought to test columns for Normality. Use the Statistics pull-down menu
and basic statistics to find the normality tests. The Kolmogorov-Smirnov option is actually Lilliefors. The
ANOVA menu can check for equality of variances. In light of these tests was ANOVA appropriate? You
can get descriptions of unfamiliar tests by using the Help menu and the alphabetic command list or the Stat
guide. (Up to 7) [58]
You should note conclusions on the printout – tell what was tested and what your conclusions are using a
10% significance level.
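The test statistics for the one-way ANOVA, Kruskal-Wallis and Friedman procedures named above can be sketched in a few lines. This is a hedged illustration, not a solution: it assumes the last digit of the student number is 0, so the first row is left unchanged, and the helper `midranks` and all variable names are invented for this sketch. Each statistic still has to be compared with the appropriate 10% table value.

```python
calls = {
    "Fifi": [36, 22, 19, 16],
    "Adelina": [39, 14, 20, 18],
    "Maria": [21, 32, 28, 22],
}
DIGIT = 0  # ASSUMED last digit of the student number; change to your own
for column in calls.values():
    column[0] += DIGIT  # personalize the first row (Milan)

groups = list(calls.values())
all_vals = [v for g in groups for v in g]
N = len(all_vals)
k = len(groups)


def midranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    order = sorted(values)
    return [(2 * order.index(v) + 1 + order.count(v)) / 2 for v in values]


# One-way ANOVA (columns treated as independent random samples).
grand = sum(all_vals) / N
ssb = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
ssw = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
f_oneway = (ssb / (k - 1)) / (ssw / (N - k))

# Kruskal-Wallis H (no tie correction), ranking all 12 values together.
ranks = midranks(all_vals)
rank_sums = []
pos = 0
for g in groups:
    rank_sums.append(sum(ranks[pos:pos + len(g)]))
    pos += len(g)
h_kw = (12 / (N * (N + 1))
        * sum(r * r / len(g) for r, g in zip(rank_sums, groups))
        - 3 * (N + 1))

# Friedman statistic: rank the singers within each city (row).
rows = list(zip(*groups))
r = len(rows)
col_rank_sums = [0.0] * k
for row in rows:
    for j, rank in enumerate(midranks(row)):
        col_rank_sums[j] += rank
chi_friedman = (12 / (r * k * (k + 1)) * sum(s * s for s in col_rank_sums)
                - 3 * r * (k + 1))

print(f"one-way F = {f_oneway:.4f} with ({k - 1}, {N - k}) df")
print(f"Kruskal-Wallis rank sums = {rank_sums}, H = {h_kw:.4f}")
print(f"Friedman rank sums = {col_rank_sums}, chi-square = {chi_friedman:.4f}")
```

The two-way ANOVA, the Levene test and the confidence intervals are not covered by this sketch.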