4/16/03 252x0332
ECO252 QBA2 Name
THIRD HOUR EXAM Hour of Class Registered (Circle)
April 21 - 22, 2003
I. (30+ points) Do all the following (2 points each unless noted otherwise).
1.
Which of the following components in an ANOVA table are not additive? a) Sum of squares. b) Degrees of freedom. c) Mean squares. d) It is not possible to tell.
TABLE 11-4
A campus researcher wanted to investigate the factors that affect visitor travel time in a complex, multilevel building on campus. Specifically, he wanted to determine whether different building signs (building maps versus wall signage) affect the total amount of time visitors require to reach their destination and whether that time depends on whether the starting location is inside or outside the building. Three subjects were assigned to each of the combinations of signs and starting locations, and travel time in seconds from beginning to destination was recorded. An Excel output of the appropriate analysis is given below:
ANOVA
Source of Variation    SS          df   MS          F          P-value    F crit
Signs                  14008.33     1   14008.33               0.11267    5.317645
Starting Location      12288        1   12288       2.784395   0.13374    5.317645
Interaction            48           1   48          0.0109     0.919506   5.317645
Within                 35305.33     8   4413.167
Total                  61649.67    11
2.
Referring to Table 11-4, at 1% level of significance, a) there is insufficient evidence to conclude that the difference between the average traveling time for the different starting locations depends on the types of signs. b) there is insufficient evidence to conclude that the difference between the average traveling time for the different types of signs depends on the starting locations. c) there is insufficient evidence to conclude that the relationship between traveling time and the types of signs depends on the starting locations. d) All of the above.
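As a study aid, the arithmetic behind Table 11-4 can be checked numerically. The Python sketch below is only an illustration: it uses the values printed in the Excel output, and the variable names are made up for this example. It checks that the sums of squares and the degrees of freedom add to their totals, and it recomputes each F ratio as an effect's mean square divided by the Within mean square.

```python
# Values copied from the Table 11-4 Excel output; names are illustrative.
ss = {"Signs": 14008.33, "Starting Location": 12288.0,
      "Interaction": 48.0, "Within": 35305.33}
df = {"Signs": 1, "Starting Location": 1, "Interaction": 1, "Within": 8}
ss_total, df_total = 61649.67, 11

# Mean square = SS / df for every source of variation.
ms = {k: ss[k] / df[k] for k in ss}

# Sums of squares and degrees of freedom add up to the totals (within rounding).
ss_sum = sum(ss.values())
df_sum = sum(df.values())

# Each F ratio divides an effect's mean square by the Within mean square.
f_ratio = {k: ms[k] / ms["Within"] for k in ss if k != "Within"}

print(f"SS sum = {ss_sum:.2f} (table total {ss_total})")
print(f"df sum = {df_sum} (table total {df_total})")
for k, v in f_ratio.items():
    print(f"F({k}) = {v:.4f}")
```

Each F ratio is then compared with the F crit column, or equivalently each P-value is compared with the chosen significance level.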
TABLE 11-2
An airline wants to select a computer software package for its reservation system.
Four software packages (1, 2, 3, and 4) are commercially available. The airline will choose the package that bumps as few passengers, on the average, as possible during a month. An experiment is set up in which each package is used to make reservations for
5 randomly selected weeks. (A total of 20 weeks was included in the experiment.) The number of passengers bumped each week is obtained, which gives rise to the following
Excel output:
ANOVA
Source of Variation    SS       df   MS       F          P-value    F crit
Between Groups         212.4     3            8.304985   0.001474   3.238867
Within Groups          136.4         8.525
Total                  348.8
3.
Referring to Table 11-2, the between group mean squares is a) 70.8 b) 212.4 c) 637.2 d) 8.525
4.
Referring to Table 11-2, the within groups degrees of freedom is a) 3 b) 4 c) 16 d) 19
5.
Referring to Table 11-2, at a significance level of 1%, a) there is insufficient evidence to conclude that the average numbers of customers bumped by the 4 packages are not all the same. b) there is insufficient evidence to conclude that the average numbers of customers bumped by the 4 packages are all the same. c) there is sufficient evidence to conclude that the average numbers of customers bumped by the 4 packages are not all the same. d) there is sufficient evidence to conclude that the average numbers of customers bumped by the 4 packages are all the same.
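The blank cells of a one-way ANOVA table such as Table 11-2 follow from the printed ones. The Python sketch below is a minimal illustration, assuming the design described in the text (4 packages, 5 weeks each); the variable names are hypothetical.

```python
# Values from the Table 11-2 Excel output; the blank cells are recomputed.
ss_between, ss_within, ss_total = 212.4, 136.4, 348.8
groups, n_total = 4, 20          # 4 packages, 5 weeks each

df_between = groups - 1          # 3, as printed in the table
df_within = n_total - groups     # observations minus number of groups
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

print(df_within, round(ms_between, 3), round(ms_within, 3), round(f_stat, 6))
```

The recomputed Within mean square and F ratio can be compared with the 8.525 and 8.304985 printed in the output as a consistency check.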
6.
The Journal of Business Venturing reported on the activities of entrepreneurs during the organization creation process. As part of a designed study, a total of 71 entrepreneurs were interviewed and divided into 3 groups: those that were successful in founding a new firm ($n_1 = 34$), those still actively trying to establish a firm ($n_2 = 21$), and those who tried to start a new firm but eventually gave up ($n_3 = 16$). The total number of activities undertaken (e.g., developed a business plan, sought funding, looked for facilities) by each group over a specified time period during organization creation was measured. The objective is to compare the mean or median number of activities of the 3 groups of entrepreneurs. The underlying distribution is not known to be Normal, nor is it likely that the columns have similar variances. Identify the method that would be used to analyze the data. a) Friedman Test for differences in medians. b) Kruskal-Wallis Rank Test for Differences in Medians c) One-way ANOVA F test d) Two-way ANOVA
7.
The slope ($b_1$) represents a) predicted value of Y when X = 0. b) the estimated average change in Y per unit change in X . c) the predicted value of Y . d) variation around the line of regression.
8.
The least squares method minimizes which of the following? a) SSR b) SSE c) SST d) All of the above
TABLE 13-2
A large mail order house weighs its mail upon arrival and would like to be able to estimate the number of orders it contains. It has data on 25 shipments giving the weight of the mail in pounds (in column 1) and the number of orders in thousands (in column 2).
The data are not shown here but you may want to know that the largest weight on the list was 652 pounds and the largest number of orders was 20.2 (thousand). Since they didn’t know what they were doing, they did 2 regressions, but only one is correct.
Welcome to Minitab, press F1 for help.
MTB > Retrieve "C:\Documents and Settings\RBOVE\My Documents\Drive D\MINITAB\2x0331-8.MTW".
Retrieving worksheet from file: C:\Documents and Settings\RBOVE\My Documents\Drive D\MINITAB\2x0331-8.MTW
# Worksheet was saved on Wed Apr 16 2003
Results for: 2x0331-8.MTW
MTB > regress c1 1 c2
Regression Analysis: Weight versus Orders
The regression equation is
Weight = 5.6 + 32.8 Orders
Predictor Coef SE Coef T P
Constant 5.55 15.78 0.35 0.728
Orders 32.760 1.137 28.82 0.000
S = 24.10 R-Sq = 97.3% R-Sq(adj) = 97.2%
Analysis of Variance
Source DF SS MS F P
Regression 1 482693 482693 830.82 0.000
Residual Error 23 13363 581
Total 24 496056
Unusual Observations
Obs Orders Weight Fit SE Fit Residual St Resid
4 7.5 203.00 251.25 8.09 -48.25 -2.13R
9 9.2 365.00 306.94 6.64 58.06 2.51R
R denotes an observation with a large standardized residual
MTB > regress c2 1 c1
Regression Analysis: Orders versus Weight
The regression equation is
Orders = 0.191 + 0.0297 Weight
Predictor Coef SE Coef T P
Constant 0.1912 0.4747 0.40 0.691
Weight 0.029703 0.001030 28.82 0.000
S = 0.7258 R-Sq = 97.3% R-Sq(adj) = 97.2%
Analysis of Variance
Source DF SS MS F P
Regression 1 437.64 437.64 830.82 0.000
Residual Error 23 12.12 0.53
Total 24 449.76
Unusual Observations
Obs Weight Orders Fit SE Fit Residual St Resid
9 365 9.200 11.033 0.164 -1.833 -2.59R
R denotes an observation with a large standardized residual
9.
Referring to Table 13-2, what is the number of orders you would expect when the mail weighs 500 pounds?
10.
Referring to Table 13-2, what percentage of the total variation in orders sold is explained by weight?
11.
Referring to Table 13-2, interpret the p value for testing whether $\beta_1$ exceeds 0. a) There is insufficient evidence (at $\alpha = 0.10$) to conclude that weight ( X ) is a useful linear predictor of orders received ( Y ). b) Weight ( X ) is a poor predictor of orders received ( Y ). c) For every 1 pound increase in weight, we expect the number of orders sold to increase by 0. d) There is sufficient evidence (at $\alpha = 0.05$) to conclude that weight ( X ) is a useful linear predictor of orders received ( Y ).
12.
Referring to Table 13-2, give a 99% confidence interval for $\beta_1$ and interpret the interval.
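The quantities asked for in the Table 13-2 questions can be read off the Minitab output with a little arithmetic. The Python sketch below is an illustration, not a model answer. It assumes the "Orders versus Weight" regression is the relevant one, since the stated goal is to estimate orders from weight. The constant `T_CRIT` is a table value supplied for this example (two-tailed 1% point of t with 23 degrees of freedom) and does not appear in the exam.

```python
# Coefficients copied from the "Orders versus Weight" Minitab output.
b0, b1, se_b1 = 0.1912, 0.029703, 0.001030
ssr, sst, n = 437.64, 449.76, 25

# Point prediction of orders (in thousands) for a 500-pound shipment.
orders_at_500 = b0 + b1 * 500

# Share of the variation in orders explained by weight.
r_squared = ssr / sst

# 99% confidence interval for the slope. T_CRIT is an assumed table value:
# the two-tailed 1% critical value of t with n - 2 = 23 degrees of freedom.
T_CRIT = 2.807
ci_low, ci_high = b1 - T_CRIT * se_b1, b1 + T_CRIT * se_b1

print(f"predicted orders at 500 lb: {orders_at_500:.3f} thousand")
print(f"R-squared: {r_squared:.3f}")
print(f"99% CI for slope: ({ci_low:.5f}, {ci_high:.5f})")
```

The ratio SSR/SST should agree with the R-Sq figure Minitab prints, which is a quick check that the right ANOVA block was used.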
13.
A hospital does a test of goodness of fit to see if arrivals per hour follow a Poisson distribution with a mean of 2. The data are below. The f column has been copied from the Poisson table.
The O and the E columns both add to 480.
x    O     E         F_o       F_e       D           O/n        f
0    65    64.961    0.13542   0.13534   0.0000817   0.135417   0.135335
1    130   129.922   0.40625   0.40601   0.0002440   0.270833   0.270671
2    125   129.922   0.66667   0.67668   0.0100103   0.260417   0.270671
3    96    86.615    0.86667   0.85712   0.0095427   0.200000   0.180447
4    37    43.308    0.94375   0.94735   0.0035980   0.077083   0.090224
5    11    17.323    0.96667   0.98344   0.0167703   0.022917   0.036089
6    0     5.774     0.96667   0.99547   0.0288003   0.000000   0.012030
7    0     1.650     0.96667   0.99890   0.0322373   0.000000   0.003437
8    0     0.412     0.96667   0.99976   0.0330963   0.000000   0.000859
9+   16    0.116     1.00000   1.00000   0.0000040   0.033333   0.000241
a) What method is the hospital using to check goodness of fit? (1)
b) What is the critical value it uses if $\alpha = .10$? (2)
c) Does it accept the null hypothesis $H_0$: Poisson? Why? (1)
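The cumulative columns in this table can be rebuilt from the observed counts alone. The Python sketch below is a hedged illustration of that calculation. It assumes the table is set up for a Kolmogorov-Smirnov style comparison of cumulative relative frequencies, and it uses the common large-sample approximation 1.22/sqrt(n) for the 10% critical value. Both are assumptions of this example, not statements from the exam.

```python
import math

observed = [65, 130, 125, 96, 37, 11, 0, 0, 0, 16]   # O column, x = 0..8 and 9+
n = sum(observed)                                     # 480
mean = 2.0

# Poisson(2) probabilities for x = 0..8; the last cell holds the 9+ tail.
f = [math.exp(-mean) * mean**k / math.factorial(k) for k in range(9)]
f.append(1.0 - sum(f))

# Cumulative observed and expected relative frequencies, and their gaps.
cum_o, cum_e, gaps = 0.0, 0.0, []
for o, p in zip(observed, f):
    cum_o += o / n
    cum_e += p
    gaps.append(abs(cum_o - cum_e))

d_max = max(gaps)
# Assumed large-sample approximation for the 10% critical value.
d_crit = 1.22 / math.sqrt(n)
print(f"max D = {d_max:.4f}, critical value ~ {d_crit:.4f}")
```

The largest gap is the test statistic, and the null hypothesis is rejected only if it exceeds the critical value.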
14.
Since the administrator mistrusts the results, the analysis is redone. The data are below.
x    O     E         E - O      (E - O)^2   (E - O)^2/E   O^2/E
0    65    64.961    -0.03920   0.0015      0.00002       65.039
1    130   129.922   -0.07792   0.0061      0.00005       130.078
2    125   129.922   4.92208    24.2269     0.18647       120.264
3    96    86.615    -9.38544   88.0865     1.01699       106.402
4    37    43.308    6.30752    39.7848     0.91866       31.611
5    11    17.323    6.32272    39.9768     2.30777       6.985
6+   16    7.952     -8.04784   64.7677     8.14467       32.193
a) What is the value of the test statistic this time? (2)
b) What is the table value against which we test the test statistic ($\alpha = .10$)? (1)
c) Do we accept or reject the null hypothesis this time? Why? (1)
d) Why are there three fewer rows this time? (1)
e) The first method is supposedly more powerful than the second method. Do these results illustrate this fact? Why? (1)
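The last two columns of this table correspond to two equivalent ways of computing a chi-squared statistic. The Python sketch below illustrates both forms. It recomputes the expected counts from the Poisson(2) formula rather than copying the rounded E column, so its figures may differ from the table in the last digits. `CHI_CRIT` is a table value assumed for this example (upper 10% point with 6 degrees of freedom).

```python
import math

observed = [65, 130, 125, 96, 37, 11, 16]   # x = 0..5 and the pooled 6+ cell
n = sum(observed)                            # 480
mean = 2.0

# Expected counts under Poisson(2); the last cell pools everything from 6 up.
probs = [math.exp(-mean) * mean**k / math.factorial(k) for k in range(6)]
probs.append(1.0 - sum(probs))
expected = [n * p for p in probs]

# Two equivalent forms of the chi-squared statistic.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
chi_sq_shortcut = sum(o * o / e for o, e in zip(observed, expected)) - n

dof = len(observed) - 1   # no parameter was estimated from the data
CHI_CRIT = 10.6446        # assumed table value: upper 10% point, 6 d.f.

print(f"chi-squared = {chi_sq:.3f} (shortcut {chi_sq_shortcut:.3f}), d.f. = {dof}")
print("reject H0" if chi_sq > CHI_CRIT else "do not reject H0")
```

Pooling the upper cells keeps every expected count at 5 or more, which is the usual rule of thumb for this test.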
15. Turn in your computer problems 2 and 3 marked as requested in the Take-home. (5 points, 2 point penalty for not doing.)
ECO252 QBA2
Third EXAM
April 21, 22 2003
TAKE HOME SECTION
Name: _________________________
Social Security Number: _________________________
Please Note: computer problems 2 and 3 should be turned in with the exam. In problem 2, the 2-way ANOVA table should be completed. The three F tests should be done with a 5% significance level and you should note whether there was (i) a significant difference between drivers, (ii) a significant difference between cars and (iii) significant interaction. In problem 3, you should show on your third graph where the regression line is.
II. Do the following: (22+ points) Assume a 5% significance level. Show your work!
1. The Lees, in their book on statistics for Finance majors, ask about the relationship of gasoline prices to crude oil prices and present the following data for the years 1979 - 1988. (To get you started, the sum of the crude price column is 211.16 and the sum of the numbers squared in the crude price column is 4936.3.)
Obs No   Gas Price (cents/gal)   Crude Price (dollars/barrel)
1        86                      12.64
2        119                     21.59
3        133                     31.77
4        122                     28.52
5        116                     26.19
6        113                     25.88
7        112                     24.09
8        86                      12.51
9        90                      15.40
10       90                      12.57
Just to make things interesting, change the tenth number in the Gas Price column by adding the third digit of your Social Security number to it. For example, Seymour Butz’s SS number is 123456789 and he will change 90 to 93. This should not change the results by much.
Show your work – it is legitimate to check your results by running the problem on the computer, but I expect to see hand computations for every part of this problem.
a. Compute the regression equation $\hat{Y} = b_0 + b_1 x$ to predict the price of gasoline on the basis of crude oil prices. (3)
b. Compute $R^2$. (2)
c. Compute $s_e$. (2)
d. Compute $s_{b_1}$ and do a significance test on $b_1$. (2)
e. In 1978, the price of crude oil was $9.00 per barrel. Using this, create a prediction interval for the price of gasoline for that year. Explain why a confidence interval for the price is inappropriate. (3)
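Since the problem allows checking hand computations on the computer, the Python sketch below shows one way such a check might look. It is an illustration under stated assumptions: it uses the unmodified tenth gas price (90), the usual computational formulas for simple regression, and a table value `T_CRIT` (two-tailed 5% point of t with 8 degrees of freedom) that is supplied for this example. All variable names are hypothetical.

```python
import math

crude = [12.64, 21.59, 31.77, 28.52, 26.19, 25.88, 24.09, 12.51, 15.40, 12.57]
gas = [86, 119, 133, 122, 116, 113, 112, 86, 90, 90]   # unmodified 10th value
n = len(crude)

sum_x, sum_y = sum(crude), sum(gas)
x_bar, y_bar = sum_x / n, sum_y / n
s_xx = sum(x * x for x in crude) - n * x_bar**2
s_yy = sum(y * y for y in gas) - n * y_bar**2
s_xy = sum(x * y for x, y in zip(crude, gas)) - n * x_bar * y_bar

b1 = s_xy / s_xx                       # slope
b0 = y_bar - b1 * x_bar                # intercept
r_squared = s_xy**2 / (s_xx * s_yy)
s_e = math.sqrt((s_yy - b1 * s_xy) / (n - 2))   # standard error of estimate
s_b1 = s_e / math.sqrt(s_xx)           # standard error of the slope
t_b1 = b1 / s_b1                       # t ratio for H0: beta_1 = 0

# Prediction interval at crude = 9.00. T_CRIT is an assumed table value
# (two-tailed 5%, n - 2 = 8 degrees of freedom).
T_CRIT = 2.306
x0 = 9.0
y_hat = b0 + b1 * x0
s_pred = s_e * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / s_xx)
interval = (y_hat - T_CRIT * s_pred, y_hat + T_CRIT * s_pred)

print(f"Y-hat = {b0:.3f} + {b1:.4f} x, R^2 = {r_squared:.3f}")
print(f"s_e = {s_e:.3f}, s_b1 = {s_b1:.4f}, t = {t_b1:.2f}")
print(f"prediction interval at x = 9: ({interval[0]:.2f}, {interval[1]:.2f})")
```

The leading 1 under the square root is what distinguishes a prediction interval for a single year from a confidence interval for the mean price.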
2. According to the Lees, the daily rate of return for a stock in percent is summarized in the following table.
Add the second to last digit of your Social Security number to the 50. For example, if Seymour Butz’s SS number is 123456789, he will change the 50 to 58 and the total to 203.
x interval   z interval   O     O/n   F_o   F_e   f_e   E   D
below -3                  20
-3 to -2                  25
-2 to -1                  30
-1 to 0                   50
0 to 1                    40
1 to 2                    25
above 2                   5
Total                     195
From the data we find that $\bar{x} = 0$ and $s = 1.6$.
On the basis of this, test to see if the data follow a Normal distribution by a) a chi-squared test (5) and b) a Lilliefors test. (5)
Hint: To find the probability of being on a given interval, you need values of z . You must use the sample mean and variance I gave you in place of $\mu$ and $\sigma$. Once you find the values of z you need, put the probability in the $f_e$ column. (You will have to round the values of z to numbers like 1.25 to use the Normal table – Round cliff-hangers like 1.875 to 1.87.) I showed you in class how to do this using the $F_e$ column, but in any case, for example, the item in the first row of the $f_e$ column is $P(x < -3)$ and the second is $P(-3 < x < -2)$. If you have the $f_e$ column, you should be able to get E and do a chi-squared test, remembering that we lost degrees of freedom using the data to estimate the mean and variance. You will probably need to fill in the entire table to do the Lilliefors test. Explain why this has to be a Lilliefors test rather than a K-S test.
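The steps in the hint can be sketched in Python as a way of checking a hand-filled table. This is an illustration only, with several assumptions. It takes the sample mean as 0 and the sample standard deviation as 1.6, as the problem statement appears to give them. It uses the unmodified count of 50. It evaluates the Normal cumulative distribution exactly with the standard library's `statistics.NormalDist` instead of rounding z for a printed table, so the figures will differ slightly from table-based hand work.

```python
from statistics import NormalDist

# Interval boundaries and observed counts from the table (unmodified 50).
bounds = [-3, -2, -1, 0, 1, 2]
observed = [20, 25, 30, 50, 40, 25, 5]
n = sum(observed)                      # 195
x_bar, s = 0.0, 1.6                    # sample mean and standard deviation

# Cumulative expected probability F_e at each upper boundary (last one is 1).
std_normal = NormalDist()
cum_e = [std_normal.cdf((b - x_bar) / s) for b in bounds] + [1.0]

# Interval probabilities f_e are differences of consecutive F_e values.
f_e = [cum_e[0]] + [cum_e[i] - cum_e[i - 1] for i in range(1, len(cum_e))]
expected = [n * p for p in f_e]

# Chi-squared statistic; d.f. = cells - 1 - 2 estimated parameters.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - 1 - 2

# Lilliefors statistic: largest gap between cumulative O/n and F_e.
cum_o, running = [], 0
for o in observed:
    running += o
    cum_o.append(running / n)
d_max = max(abs(fo - fe) for fo, fe in zip(cum_o, cum_e))

print(f"chi-squared = {chi_sq:.3f} with {dof} d.f.")
print(f"Lilliefors D = {d_max:.4f}")
```

Both statistics still have to be compared with the appropriate table values, which this sketch deliberately leaves out.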
3) (Extra credit) The Lees present the following data.
             Years of Work Experience
Region       1       2       3
1           16      19      24
2           21      20      21
3           18      21      22
4           13      20      25
To vary the results, change the 25 by adding 1/10 of the third digit of your SS number. For example, if Seymour Butz’s SS number is 123456789, he will change the 25 to 25.3.
a) Do a 2-way ANOVA on these data and explain what hypotheses you test and what the conclusions are. (6)
b) What other method could we use on these data to see if years of experience makes a difference while allowing for cross-classification? Under what circumstances would we use it? Try it and tell what it tests and what it shows.
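For part a), the sums of squares of a two-way ANOVA with one observation per cell can be checked with a few lines of Python. The sketch below is an illustration under stated assumptions. It reads the data as 4 regions (rows) by 3 levels of years of work experience (columns), which is an interpretation of the printed layout. It leaves the 25 unmodified, and its variable names are hypothetical.

```python
# Rows are regions 1-4, columns are years-of-experience levels 1-3
# (assumed layout; the 25 is left unmodified).
data = [
    [16, 19, 24],
    [21, 20, 21],
    [18, 21, 22],
    [13, 20, 25],
]
r, c = len(data), len(data[0])
grand_total = sum(sum(row) for row in data)
correction = grand_total**2 / (r * c)

ss_total = sum(x * x for row in data for x in row) - correction
ss_rows = sum(sum(row) ** 2 for row in data) / c - correction
col_sums = [sum(row[j] for row in data) for j in range(c)]
ss_cols = sum(t * t for t in col_sums) / r - correction
ss_error = ss_total - ss_rows - ss_cols

df_rows, df_cols = r - 1, c - 1
df_error = df_rows * df_cols
f_rows = (ss_rows / df_rows) / (ss_error / df_error)
f_cols = (ss_cols / df_cols) / (ss_error / df_error)

print(f"SS rows = {ss_rows:.3f}, SS cols = {ss_cols:.3f}, SS error = {ss_error:.3f}")
print(f"F rows = {f_rows:.3f}, F cols = {f_cols:.3f}")
```

With one observation per cell there is no separate interaction term, so the residual sum of squares serves as the error term for both F tests.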