LAST NAME (Please Print):
FIRST NAME (Please Print):
HONOR PLEDGE (Please Sign):
Statistics 111
Midterm 4
• This is a closed book exam.
• You may use your calculator and a single page of notes.
• The room is crowded. Please be careful to look only at your own exam. Try to sit
one seat apart; the proctors may ask you to randomize your seating a bit.
• Report all numerical answers to at least two correct decimal places or (when appropriate) write them as a fraction.
• All question parts count for 1 point.
1. Captain Hornblower is sailing from Bombay to London. He can either sail through the
Suez Canal or around the Cape of Good Hope. The Suez route risks capture by Somali
pirates; this happens with probability 1/80 and will cost a random ransom that has
a uniform distribution between $2 million and $4 million. If he travels around the
Cape, the longer trip will cost him $30,000.
What is the expected loss from taking the Suez route?
What is his best choice?
2. Suppose the duration of a summer romance has density
f(x) = θ x^(θ−1) for 0 < x < 1 and 0 < θ
and is 0 otherwise. (Here, time is scaled so that a full summer has duration 1.)
What is the survival function?
What is the hazard function?
What is the shape of the failure rate?
You observe three summer romances, which lasted 1/2, 3/4, and 2/3 of
the summer. What is your maximum likelihood estimate for θ?
3. You can invest d dollars in fixing up a house to sell. You believe that the sale price of
the fixed-up house has an exponential distribution with parameter λ = (5d)^(−1), where
d is the amount you invest.
What is your net expected profit from investing $10K?
If your utility for money is linear and you have $30K in the bank, how
much should you invest?
4. Describe how you would use 2-fold cross-validation to assess the predictive accuracy
of a nonparametric regression. (Anthony: Please print neatly.) (6 points)
5. You want to compare salaries for new graduates among Harvard, Yale, Princeton, and
Duke. To control for effects due to major, you pick from each school one person who
majored in each of Statistics, Economics, Computer Science, English, and Biology.
Source    df    SS    MS    F
School
Major
50
Error
30
Total
600
Complete the table above (10 points).
In words specific to the problem, what is the appropriate null hypothesis regarding
school?
What is the critical value for a 0.05 level test on school?
In words specific to the problem, what is the conclusion for a 0.05 level test regarding
school?
Was it useful to use blocks in this problem?
Assume that one should not have used blocks. Write the one-way ANOVA table for
the situation above without blocking (12 points).
Source    df    SS    MS    F
6. Which model is better for the lifespan of a rabbit in the wild: (A) a competing risks model or (B) a Cox proportional hazards model?
7. You fit a Cox proportional hazards model to the lifespans of companies.
The two covariates (measured in millions) are Liquidity, with coefficient 10, and CEO
Pay, with coefficient -12. Suppose Exxon (in the numerator) has $8M and $11M for
Liquidity and CEO Pay, respectively, while Walmart has $6M and $12M. What is the
hazard ratio?
Which company will probably last longer?
8. You observe the following IQs for random Duke students in three different majors.
Statistics
Economics
English
120
122
121
115
115
113
110
105
107
110
Suppose you do an analysis of variance and that the mean squared error is 5 and
that the test is significant at the 0.05 level. Now you want to find which majors are
significantly different.
What is your critical value?
What is the value of your test statistic for comparing Statistics and
English majors?
9. In order to pass this test, you need to know probability and statistics and
have a working calculator. The chance that you know probability is 0.8, the chance
that you know statistics is 0.9, and you brought two calculators to class: one fails
with probability 0.7 and the other fails with probability 0.6. What is the probability
that you pass?
10. List all, and only, the true statements (10 pts.)
A. Multicollinearity occurs when two explanatory variables are strongly correlated.
B. Interactions in regression are handled by taking the product of explanatory variables.
C. Increasing failure rates describe human lifespans.
D. Including irrelevant explanatory variables reduces predictive accuracy.
E. Interpolation is less reliable than extrapolation.
F. A running line smoother is more accurate than bin smoothing.
G. In high dimensions, it becomes hard for statistical tests to select a good model.
H. People underestimate common risks.
I. Humans frame risk perception in terms of dread and controllability.
J. People tend to have linear utility for money.