Class 28 Assignment with Answers - Darden Faculty

Class 28 Assignment Answers
These questions refer to EMBS Case Problem 2, “Alumni Giving,” which concerns data for 48 US national
universities (America’s Best Colleges, Year 2000 Edition). Both the University of Notre Dame and the
University of Virginia are included. The following five variables are in the data set.
Variable                 Description
School                   The name of the University
Graduation Rate          Percentage of enrollees who graduate
% of Classes Under 20    Percentage of classes offered with <= 20 students
Student/Faculty Ratio    Number of students enrolled divided by total number of faculty
Alumni Giving Rate       Percentage of living alumni who gave to the University in 2000

                      Graduation   % of Classes   Student/Faculty   Alumni Giving
                      Rate         Under 20       Ratio             Rate
Mean                  83.042       55.729         11.542            29.271
Median                83.5         59.5           10.5              29
Mode                  92           65             13                13
Standard Deviation    8.607        13.194         4.851             13.441
Skewness              -0.282       -0.501         0.582             0.370
Minimum               66           29             3                 7
Maximum               97           77             23                67
Count                 48           48             48                48
1. Test the hypothesis that graduation rate and alumni giving rate are (linearly) independent. We expect
universities with higher graduation rates to have higher mean giving rates. [15 points]
A regression of giving rate on graduation rate shows a positive linear relationship with
reported p-value of 5.24E-10. For Ha: b>0, the p-value is half that, or 2.62E-10. We reject H0
in favor of Ha. The results are statistically significant.
                   Coefficients   Standard Error   t Stat   P-value
Intercept          -68.76         12.58            -5.46    1.82E-06
Graduation Rate    1.18           0.15             7.83     5.24E-10
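As a quick check on the arithmetic above, the following minimal Python sketch recomputes the slope's t statistic from the table and halves the reported two-sided p-value. The variable names are illustrative. Because the coefficient and standard error are rounded to two decimals in the output, the recomputed t statistic only approximately matches the reported 7.83.

```python
# Values copied from the regression output (rounded as reported).
slope = 1.18            # coefficient on Graduation Rate
slope_se = 0.15         # its standard error
p_two_sided = 5.24e-10  # p-value reported for the slope

# t statistic = coefficient / standard error.
t_stat = slope / slope_se

# For Ha: b > 0 with a positive estimated slope, the one-sided
# p-value is half the reported two-sided value.
p_one_sided = p_two_sided / 2

print(f"t = {t_stat:.2f}, one-sided p = {p_one_sided:.2e}")
```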
2. If the graduation rate of school A is 5 percentage points higher than that of school B, how much
higher do we expect school A’s giving rate to be? [10 points]
Using the above regression (graduation rate is all we know), the expected giving rate will be
1.18*5 = 5.9 percentage points higher for school A.
3. If you learn that A and B above have identical student to faculty ratios, what is your revised answer to
question 2? Be certain to explain why it went up (if it went up) or why it went down (if it went down) or
why it stayed the same. Direct your response to a university administrator. [15 points]
For this question, we know both graduation rate and student/faculty ratio. Since the latter is
also predictive of giving rate, we will use a multiple regression to answer this question.
                         Coefficients   Standard Error   t Stat     P-value
Intercept                -19.10631      15.55006         -1.22870   0.22557
Graduation Rate          0.75574        0.16023          4.71669    0.00002
Student/Faculty Ratio    -1.24595       0.28430          -4.38250   0.00007
(Note that the p-value associated with student/faculty ratio is very low. Student/faculty ratio is an
important variable that should not be ignored.) The 5-point higher graduation rate leads us
to expect a 0.756*5 = 3.8 percentage point higher giving rate for A. Our answer went down (from 5.9
to 3.8) because graduation rates and student/faculty ratios are negatively correlated in the
sample. (Schools with higher graduation rates tend to have lower student/faculty
ratios, which in turn are associated with higher giving rates.) The answer to question 2 reflected this reality:
the higher graduation rate for A also implied a lower student/faculty ratio, and the
combination led us to expect 5.9 more percentage points in giving rate. Once we
learn that A does NOT have a lower student/faculty ratio than B, our expectation for its
giving rate goes down, and we expect a smaller giving rate gap between the two schools.
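The drop from 5.9 to 3.8 can be tied together with a standard least-squares identity: the simple-regression slope equals the multiple-regression slope on graduation rate plus the student/faculty coefficient times the slope from regressing student/faculty ratio on graduation rate. The sketch below assumes that identity and uses the rounded coefficients reported above to back out the implied change in student/faculty ratio per graduation-rate point. That auxiliary slope is not reported in the output, so the figure is approximate, and the variable names are illustrative.

```python
simple_slope = 1.18     # giving rate on graduation rate alone (question 2)
multi_slope = 0.75574   # graduation rate coefficient with S/F ratio held fixed (question 3)
sf_coef = -1.24595      # student/faculty ratio coefficient (question 3)
grad_gap = 5            # school A's graduation rate minus school B's

# Expected giving-rate gap under each model.
gap_q2 = simple_slope * grad_gap  # S/F ratio free to differ between A and B
gap_q3 = multi_slope * grad_gap   # S/F ratio known to be identical

# Identity: simple_slope = multi_slope + sf_coef * aux_slope, where
# aux_slope is the slope of S/F ratio regressed on graduation rate.
aux_slope = (simple_slope - multi_slope) / sf_coef

# Portion of the question 2 gap that came from the implied lower S/F ratio.
indirect = sf_coef * aux_slope * grad_gap

print(f"Q2 gap: {gap_q2:.1f}, Q3 gap: {gap_q3:.1f}")
print(f"implied S/F change per graduation point: {aux_slope:.2f}")
print(f"indirect part of Q2 gap: {indirect:.1f}")
```

The negative implied slope is the numerical counterpart of the statement that schools with higher graduation rates tend to have lower student/faculty ratios.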
4. Provide a point forecast of alumni giving rate for a university with graduation rate of 80, 65 percent of
its classes with 20 or fewer students, and a student/faculty ratio of 20. [25 points] (To answer this
question, I expect you will build a linear regression model. Do not try anything fancy. Just pick which
subset of the three numerically scaled variables you think comprise the best model.)
From a modeling standpoint, the question is whether percent under 20 is needed. Does it add
predictive power to the model given we have both grad rate and student/faculty ratio? To see,
we try the three-variable model.
                         Coefficients   Standard Error   t Stat    P-value
Intercept                -20.7201       17.5214          -1.1826   0.2433
Graduation Rate          0.7482         0.1660           4.5082    0.0000
% of Classes Under 20    0.0290         0.1393           0.2084    0.8358
Student/Faculty Ratio    -1.1920        0.3867           -3.0823   0.0035
The p-value associated with % of Classes Under 20 is 0.84, which is not significant. We do not need and should
not use all three variables. The model used to answer Q3 should be used to come up with the
point forecast. Using a sumproduct to perform the calculation results in a point forecast of
16.4 for the alumni giving rate of the school in question. See below.
                         Coefficients   Value
Intercept                -19.10631      1
Graduation Rate          0.75574        80
Student/Faculty Ratio    -1.24595       20
POINT FORECAST                          16.43
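The sumproduct above can be reproduced outside Excel. This is a minimal Python sketch using the coefficients and inputs from the table; the dictionary layout and names are illustrative choices. The 65 percent figure for classes under 20 is deliberately unused because that variable was dropped from the model.

```python
# Coefficients of the two-variable model from question 3.
coefficients = {
    "Intercept": -19.10631,
    "Graduation Rate": 0.75574,
    "Student/Faculty Ratio": -1.24595,
}

# Inputs for the school in question; the intercept is multiplied by 1.
inputs = {
    "Intercept": 1,
    "Graduation Rate": 80,
    "Student/Faculty Ratio": 20,
}

# Equivalent of Excel's SUMPRODUCT over the two columns.
point_forecast = sum(coefficients[name] * inputs[name] for name in coefficients)

print(round(point_forecast, 2))  # 16.43
```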
5. Of the 48 universities in the data set, which one has the most surprisingly low alumni giving rate? [10
points] (Hint: The answer is not U. of California-Davis. Its last-place giving rate is explained by its
relatively low graduation rate and large classes.)
I will use our 2-variable regression to calculate predictions (expectations) for each of the 48
schools and then identify the school with actual giving rate most below the prediction. This is
the same thing as finding the school with the most negative residual.
[Scatter plot: ERRORS or RESIDUALS (vertical axis, -15 to 25) versus PREDICTED VALUES (horizontal axis, 0 to 50)]
In the scatter plot of errors versus predicted values, the circled point is the one with the most negative
error. It is school 35 (U. of Michigan-Ann Arbor), for which the regression prediction was 24.9
but the actual giving rate was 13, a full 11.9 points below expectation. I will leave it to you
Notre Dame readers to draw your own conclusions. (You can also identify the most negative
residual by asking Excel to give you the residuals and then either eyeballing or sorting them.)
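The residual search described above could be sketched in Python as follows. The function names and the tuple layout are assumptions made for illustration, and the three example rows are hypothetical values, not schools from the data set. With the real 48 rows substituted in, the same call would return the school with the most negative residual.

```python
def predict(grad_rate, sf_ratio):
    """Predicted giving rate from the two-variable model in question 3."""
    return -19.10631 + 0.75574 * grad_rate - 1.24595 * sf_ratio


def most_negative_residual(schools):
    """Return (name, residual) for the school furthest below its prediction.

    `schools` is a list of (name, grad_rate, sf_ratio, giving_rate) tuples.
    """
    residuals = [
        (name, giving - predict(grad, sf)) for name, grad, sf, giving in schools
    ]
    return min(residuals, key=lambda pair: pair[1])


# HYPOTHETICAL rows for illustration only; these are NOT from the data set.
example = [
    ("School X", 90, 10, 40),
    ("School Y", 85, 12, 15),
    ("School Z", 70, 18, 10),
]
name, residual = most_negative_residual(example)
print(name, round(residual, 1))
```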
6. Bo notices that some of the 48 have “university” in their names, some have “college” and the rest
have “institute”. Bo wonders whether these names are predictive of student/faculty ratio. (Formulate
and test a relevant hypothesis.) [25 points]
Let us use H0: mean S/F ratio is equal for the three names. Ha will be not all equal. We can
use either ANOVA single factor or regression with 2 dummies to test this hypothesis.
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.306267658
R Square             0.093799878
Adjusted R Square    0.053524317
Standard Error       4.719185001
Observations         48

ANOVA
              df    SS           MS        F        Significance F
Regression    2     103.7348     51.8674   2.3290   0.1090
Residual      45    1002.1818    22.2707
Total         47    1105.9167

              Coefficients   Standard Error   t Stat    P-value
Intercept     11.8636        0.7114           16.6754   0.0000
Dcollege      -0.3636        3.4120           -0.1066   0.9156
Dinstitute    -7.3636        3.4120           -2.1582   0.0363
Although the mean S/F ratio for institutes is significantly lower than for universities
(the group not included in the model), with a p-value of 0.036, overall we CANNOT reject
H0 (the p-value for our H0 is 0.1090). The differences among the three sample means are not
statistically significant. Part of the reason is that there are only 2 colleges and 2
institutes, which makes our estimates of their means highly uncertain, a fact accounted for
in our p-value.
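The overall test can be re-derived from the ANOVA table as a sanity check. The sketch below recomputes the F statistic and R Square from the sums of squares, then obtains the p-value from a closed form that holds only when the numerator has exactly 2 degrees of freedom, as it does here with two dummies. For other designs a statistics library would be needed, and the variable names are illustrative.

```python
# Values copied from the ANOVA table above.
ss_regression, df_regression = 103.7348, 2
ss_residual, df_residual = 1002.1818, 45

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual
r_squared = ss_regression / (ss_regression + ss_residual)

# With exactly 2 numerator degrees of freedom, the upper-tail
# probability of the F distribution has a closed form:
#   P(F > f) = (1 + 2 * f / df_residual) ** (-df_residual / 2)
p_value = (1 + 2 * f_stat / df_residual) ** (-df_residual / 2)

print(f"F = {f_stat:.4f}, R^2 = {r_squared:.4f}, p = {p_value:.4f}")
```

The result agrees with the Significance F of 0.1090 up to rounding of the table inputs, which is the p-value for H0 that the three group means are equal.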