Spring 2009 - Exam 3 (No solutions will be posted)

STA 6126 – Exam 3 – Spring ‘09 PRINT Name ____________
True or False:
In linear regression, an observation with a ‘high’ leverage value is one whose dependent
variable (Y) is ‘far’ away from the other observations.
In simple logistic regression, erepresents how much the odds change multiplicatively
when X increases by 1 unit.
When we plot the residuals from a linear regression model versus X, and observe a ‘U’ shape, then there is evidence that the relationship between Y and X is not linear.
If we fit Backward Elimination with Significance Level to Stay SLS=0.05 and Forward
Selection with Significance Level to Enter SLE=0.05, we will always obtain the same
selected regression model.
When we add new variable(s) to a regression model that do not improve the fit beyond the
existing variables, it is possible for MSE to increase and Adjusted-R² to decrease.
In a simple logistic regression, our estimated probability is 0.5 when X = -a/b
Problems:
A county’s population is currently 100,000. If it grows 10% annually, what will be its
population in 5 years?
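One way to set up the arithmetic, assuming the 10% growth compounds once per year (the function name `project_population` is illustrative and not part of the exam):

```python
def project_population(current, annual_rate, years):
    """Population after `years` of growth compounded once per year."""
    return current * (1 + annual_rate) ** years


# Values taken from the problem statement: 100,000 people, 10% per year, 5 years.
population_in_5_years = project_population(100_000, 0.10, 5)
print(round(population_in_5_years))
```

The same function covers other horizons by changing `years`; a continuously compounded reading of "grows 10% annually" would use an exponential instead and give a different figure.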
The following models are fit, relating per capita retail sales (Y) to 4 predictors: per
capita establishments (estabs), income, federal expenditures (fedexp), and males per 100
females (m100f), for a random sample of n=40 counties in SMSAs.
Model with estabs, income, fedexp, m100f:

ANOVA
              df    SS          MS         F         Significance F
Regression     4     95.1186    23.7797    7.2264    0.0002
Residual      35    115.1728     3.2907
Total         39    210.2915

              Coefficients    Standard Error    t Stat     P-value
Intercept     11.9435         6.7627             1.7661    0.0861
estabs         6.2938         1.8903             3.3296    0.0021
income         0.0160         0.0530             0.3023    0.7642
fedexp         0.3807         0.1461             2.6055    0.0134
m100f         -0.1324         0.0600            -2.2066    0.0340

Model with income, fedexp, m100f:

ANOVA
              df    SS          MS         F         Significance F
Regression     3     58.6387    19.5462    4.6400    0.0076
Residual      36    151.6527     4.2126
Total         39    210.2915

              Coefficients    Standard Error    t Stat     P-value
Intercept     26.6642         5.7900             4.6052    0.0000
income         0.0048         0.0598             0.0802    0.9365
fedexp         0.2965         0.1628             1.8210    0.0769
m100f         -0.2136         0.0620            -3.4432    0.0015

Model with fedexp, m100f, estabs:

ANOVA
              df    SS          MS         F         Significance F
Regression     3     94.8180    31.6060    9.8535    0.0001
Residual      36    115.4735     3.2076
Total         39    210.2915

              Coefficients    Standard Error    t Stat     P-value
Intercept     12.0517         6.6675             1.8075    0.0790
fedexp         0.3806         0.1443             2.6379    0.0122
m100f         -0.1300         0.0587            -2.2139    0.0333
estabs         6.2574         1.8625             3.3597    0.0019
Compute Cp for each model. Based on this criterion, which is best?
If we used Backward Elimination with SLS=.10, what is the selected model?
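For the Cp question, a minimal sketch of the calculation, assuming the usual form Cp = SSE_p / MSE_full - (n - 2p'), where p' counts the intercept and MSE_full comes from the 4-predictor model. The dictionary labels and the function name `mallows_cp` are illustrative, and treating the smallest Cp as "best" is one common reading (another is Cp close to p'):

```python
n_obs = 40          # counties in the sample
mse_full = 3.2907   # residual MS of the model with all 4 predictors

# model label -> (residual SS, number of parameters including the intercept)
models = {
    "estabs, income, fedexp, m100f": (115.1728, 5),
    "income, fedexp, m100f": (151.6527, 4),
    "fedexp, m100f, estabs": (115.4735, 4),
}


def mallows_cp(sse, n_params, n_obs, mse_full):
    """Mallows' Cp for a candidate model, scaled by the full-model MSE."""
    return sse / mse_full - (n_obs - 2 * n_params)


cp = {name: mallows_cp(sse, p, n_obs, mse_full) for name, (sse, p) in models.items()}
smallest_cp_model = min(cp, key=cp.get)

for name, value in cp.items():
    print(f"{name}: Cp = {value:.3f}")
print("Smallest Cp:", smallest_cp_model)
```

By construction the full model's Cp is essentially equal to its own parameter count, so the comparison is really between the two 3-predictor models and the full model.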
A study was conducted to determine whether the size of deposit (X, in cents) is related to
whether a bottle is returned (Y=1 if returned, 0 if not). Samples of 500 bottles were sold
at each of 6 deposit levels (X=2,5,10,20,25,30). The following table gives the results.
A logistic regression model is fit:
$\pi(X) = \dfrac{e^{\alpha + \beta X}}{1 + e^{\alpha + \beta X}}$
Deposit    NumSold    Returned    Prob(Retrn)    Predicted
2          500        72          _____          _____
5          500        103         0.206          #N/A
10         500        170         0.340          #N/A
20         500        296         0.592          #N/A
25         500        406         0.812          #N/A
30         500        449         _____          _____
Complete the table above.
Test whether the probability of returning a bottle is related to the amount of deposit.
H0:
HA:
Test Statistic:
P-value:
By how much do the estimated odds of returning the bottle increase when we increase the
deposit by 1 cent?
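The exam does not print the fitted coefficients, so the following is only a sketch of how the bottle-return quantities could be obtained from the table: sample proportions, a maximum-likelihood fit of the grouped logistic model by Newton-Raphson, fitted probabilities, and the multiplicative change in odds per 1-cent increase. The function name `fit_grouped_logistic`, the starting values, and the fixed iteration count are assumptions made for illustration:

```python
import math

deposits = [2, 5, 10, 20, 25, 30]           # X, in cents
sold = [500] * 6                            # bottles sold at each level
returned = [72, 103, 170, 296, 406, 449]    # bottles returned at each level


def fit_grouped_logistic(x, n, y, iterations=25):
    """Newton-Raphson for logit(p) = a + b*x with binomial counts y out of n."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        u0 = u1 = i00 = i01 = i11 = 0.0
        for xi, ni, yi in zip(x, n, y):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            resid = yi - ni * p            # observed minus expected returns
            w = ni * p * (1.0 - p)         # binomial variance weight
            u0 += resid
            u1 += xi * resid
            i00 += w
            i01 += xi * w
            i11 += xi * xi * w
        det = i00 * i11 - i01 * i01
        # Solve the 2x2 system (information matrix) * step = score.
        a += (i11 * u0 - i01 * u1) / det
        b += (i00 * u1 - i01 * u0) / det
    return a, b


observed = [y / n for y, n in zip(returned, sold)]
a_hat, b_hat = fit_grouped_logistic(deposits, sold, returned)
predicted = [1.0 / (1.0 + math.exp(-(a_hat + b_hat * x))) for x in deposits]
odds_multiplier = math.exp(b_hat)   # change in odds per 1-cent increase

for x, obs, pred in zip(deposits, observed, predicted):
    print(f"X={x:2d}  observed={obs:.3f}  predicted={pred:.3f}")
print(f"a={a_hat:.4f}  b={b_hat:.4f}  odds multiplier per cent={odds_multiplier:.4f}")
```

A statistical package would normally report the same estimates along with standard errors, which are what the Wald test of the slope requires.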
A study was conducted to observe the effect of ginkgo on acute mountain sickness
(AMS) in Himalayan trekkers. Trekkers were given either acetazolamide (ACET=1) or
placebo (ACET=0) and either ginkgo biloba (Ginkgo=1) or placebo (Ginkgo=0). Further,
a cross-product term was created: Acetgink=ACET*GINKGO. Three models are fit:

Model 1: $\pi(\text{ACET}) = \dfrac{e^{\alpha + \beta_A A}}{1 + e^{\alpha + \beta_A A}}$
Model 2: $\pi(\text{ACET}, \text{Ginkgo}) = \dfrac{e^{\alpha + \beta_A A + \beta_G G}}{1 + e^{\alpha + \beta_A A + \beta_G G}}$
Model 3: $\pi(\text{ACET}, \text{Ginkgo}) = \dfrac{e^{\alpha + \beta_A A + \beta_G G + \beta_{AG} A*G}}{1 + e^{\alpha + \beta_A A + \beta_G G + \beta_{AG} A*G}}$

Model            -2ln(L)
Null (No IVs)    532.378
Acet             501.663
Acet,Ginkgo      501.444
A,G,A*G          501.318
Test whether there is a significant Acetazolamide effect at the 0.05 significance level
(Model 1):
H0:
HA:
Likelihood-Ratio Test Statistic: ________________ Rejection Region: ___________
Wald Test Statistic: ________________ Rejection Region: ___________
Test whether there is a Ginkgo main effect and/or an ACET*GINKGO interaction
($\alpha$ = 0.05)
H0:
HA:
Likelihood-Ratio Test Statistic: ________________
Rejection Region: ___________
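Both likelihood-ratio tests above compare the -2ln(L) values of nested models, so they can be sketched as differences referred to a chi-square critical value. The dictionary keys are illustrative labels; the critical values rely on two standard facts, that a chi-square with 1 df is a squared standard normal and that a chi-square with 2 df is exponential with mean 2:

```python
import math
from statistics import NormalDist

# -2ln(L) values as listed in the model table.
neg2lnL = {
    "null": 532.378,
    "acet": 501.663,
    "acet_ginkgo": 501.444,
    "acet_ginkgo_interaction": 501.318,
}
alpha = 0.05

# ACET effect: null model vs. the ACET-only model (1 parameter dropped).
lr_acet = neg2lnL["null"] - neg2lnL["acet"]
crit_df1 = NormalDist().inv_cdf(1 - alpha / 2) ** 2   # chi-square(1) upper 5% point
reject_acet = lr_acet >= crit_df1

# Ginkgo main effect and/or interaction: ACET-only model vs. the model
# with ACET, Ginkgo and ACET*Ginkgo (2 parameters dropped).
lr_ginkgo = neg2lnL["acet"] - neg2lnL["acet_ginkgo_interaction"]
crit_df2 = -2 * math.log(alpha)                       # chi-square(2) upper 5% point
reject_ginkgo = lr_ginkgo >= crit_df2

print(f"ACET:   LR = {lr_acet:.3f}, reject if >= {crit_df1:.3f} -> {reject_acet}")
print(f"Ginkgo: LR = {lr_ginkgo:.3f}, reject if >= {crit_df2:.3f} -> {reject_ginkgo}")
```

The Wald statistic for the ACET effect cannot be reproduced this way, since it needs the coefficient estimate and its standard error, which are not shown here.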
Based on Model 1, give the predicted probabilities of suffering from AMS for the
acetazolamide and non-acetazolamide users:
Acetazolamide:
Non-Acetazolamide:
Have a Great Summer!