Regression Analysis

Beasley
BST 612 Assignment 2 – 150 points
Regression Analysis
Frigerio et al. measured the energy intake (EI) of 32 Gambian women. Sixteen of the subjects were
lactating (X=1) and the remainder were non-pregnant non-lactating (X=0). The values of EI were also
converted to ranks (REI). Import the data in BST612ASSIGN2-EI.XLS into SPSS, JMP, or SAS.
Use SPSS (Analyze-Regression-Linear), SAS (PROC REG), or JMP (Analyze-Fit Y by X) to
compute the regression solution (Y = b0 + b1X) for regressing EI and REI (Dependent, Response, Y
variable) onto X (Independent, Factor, X variable).
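As a way to sanity-check package output by hand, the following is a minimal Python sketch of the least-squares solution for Y = b0 + b1X when X is coded 0/1. The data values are hypothetical (they are not the values in the assignment file), and the function name simple_ols is an illustrative choice, not part of SPSS, SAS, or JMP.

```python
from statistics import mean

def simple_ols(x, y):
    """Least-squares intercept b0, slope b1, and R^2 for Y = b0 + b1*X."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    b1 = sxy / sxx
    b0 = my - b1 * mx
    r2 = sxy ** 2 / (sxx * syy)
    return b0, b1, r2

# Hypothetical values, NOT the assignment data
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1800, 2100, 1950, 2150, 2300, 2500, 2250, 2550]

b0, b1, r2 = simple_ols(x, y)

# Group means, for comparison with the fitted coefficients
mean0 = mean(yi for xi, yi in zip(x, y) if xi == 0)
mean1 = mean(yi for xi, yi in zip(x, y) if xi == 1)

print("b0 =", b0, " b1 =", b1, " R^2 =", round(r2, 4))
print("mean(X=0) =", mean0, " mean(X=1) - mean(X=0) =", mean1 - mean0)
print("predicted Y at X=1:", b0 + b1 * 1)
```

Comparing the two printed lines for these hypothetical values is a quick check on how the coefficients behave with a dichotomous predictor.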
1. What is the Model R²?
EI: R² = _______
REI: R² = _______
(2 points).
2. What are the results of the Regression Models?
EI: F-ratio = _______ p-value = _______
REI: F-ratio = _______ p-value = _______
(3 points)
3. For EI, What is the value of b0? __________
(1 point)
4. What is the value of b1? ________
(1 point)
5. For REI, What is the value of b0? __________
(1 point)
6. What is the value of b1? ________
(1 point)
7. What is the predicted value for X = 1 (Lact group)?
EI: _______ REI: _______
(2 points)
8. What is the predicted value for X = 0 (NPNL group)?
EI: _______ REI: _______
(2 points)
9. For EI, How would you interpret the regression intercept (b0)?
(3 points).
10. For EI, How would you interpret the regression slope (b1)?
(3 points).
11. For EI, what is the symbolic notation for the null hypothesis for this REGRESSION model?
(3 points).
12. What does this null hypothesis mean in words?
(2 points)
13. Use SPSS (Analyze-Correlate-Bivariate), JMP (Analyze-Multivariate Methods-Multivariate), or SAS (PROC CORR) to compute the Pearson and Spearman correlations.
Pearson r = ___________
Spearman r = ___________
Pearson r² = ___________
Spearman r² = ___________
(4 points)
14. How do these values in 13 relate to the values in question 1?
(3 points)
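For readers who want to see what the two correlation options compute, here is a small Python sketch under stated assumptions: the data are hypothetical, there are no tied Y values, and the helper names pearson_r and ranks_no_ties are illustrative. The Spearman coefficient is obtained as the Pearson correlation of X with the ranks of Y. Ranking a 0/1 variable only rescales it linearly, so X is left as coded.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks_no_ties(values):
    """Ranks 1..N; assumes all values are distinct."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

# Hypothetical values, NOT the assignment data
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1800, 2100, 1950, 2150, 2300, 2500, 2050, 2550]

r_pearson = pearson_r(x, y)
r_spearman = pearson_r(x, ranks_no_ties(y))  # Pearson r applied to ranks

print("Pearson r  =", round(r_pearson, 4), " r^2 =", round(r_pearson ** 2, 4))
print("Spearman r =", round(r_spearman, 4), " r^2 =", round(r_spearman ** 2, 4))
```

The squared values can then be set beside the R² from a regression of the same Y (raw or ranked) on X.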
Bivariate data with a Dichotomous variable
15. Use either SPSS (Graph-Scatter-Simple); JMP (Fit Y by X); or SAS (PROC CORR or GPLOT) to
create a scatterplot with EI on the Y axis and X on the X axis.
(3 points).
16. By examining the scatterplot, do the data appear to be homoscedastic or heteroscedastic?
Explain.
(3 points).
SPSS: Use Analyze-Compare Means-Independent Samples T-Test and
Use Analyze-Compare Means-One Way ANOVA
JMP: Change the X variable to be Nominal, then Use Analyze-Fit Y by X
Under the Oneway Analysis Banner select the Means/Anova/Pooled t and Means and Std Dev options
SAS: Use PROC TTEST; CLASS X; VAR EI REI; RUN; and
Use PROC GLM; CLASS X; MODEL EI REI = X / solution; MEANS X / t; RUN;
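17. Enter the following results for EI.
(10 points total)
            NPNL (X=0)    Lact (X=1)
Mean        ________      ________
SD          ________      ________
n           ________      ________
Mean Diff = ________   SE(MDiff) = ________
95% CI: Lower Bound = ________   Upper Bound = ________
t = ________   df = ________   p-value = ________
F = ________   Model R² = _____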
18. In symbolic notation, what was the null hypothesis for the previous analysis?
(2 points)
19. Explain how the H0 in (18) and the H0 in (11) are equivalent.
(3 points).
20. Do you think there is a causal relationship between these variables? Explain.
(3 points).
Nonparametric Alternative
JMP: Change the X variable to be Nominal, then Use Analyze-Fit Y by X
Under the Oneway Analysis Banner select the Nonparametric Wilcoxon Test
SPSS: Use Analyze-Nonparametric Tests-K Independent Samples. Place EI and REI in the Test
Variables List and X as the Grouping variable and Define the range from 0 to 1. Select the
Descriptives option.
SAS: Use PROC NPAR1WAY; CLASS X; VAR EI REI; RUN;
21. What are the results of the Kruskal-Wallis Test?
EI: χ² = _______ p-value = _______
REI: χ² = _______ p-value = _______
(4 points)
22. Divide the χ² value by (N-1): χ²/(N-1) = ____________
(2 points)
23. How does the value from 22 relate to the previous results in question 1?
(3 points)
24. Write a brief interpretation of these results.
(6 points)
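The division in question 22 can be explored numerically. The sketch below, in Python with hypothetical data and no tied values, computes the Kruskal-Wallis statistic from rank sums using the textbook formula H = 12/(N(N+1)) Σ Rj²/nj - 3(N+1), and separately computes SS Between / SS Total on the pooled ranks. The function names are illustrative, and statistical packages may additionally apply a tie correction that this sketch omits.

```python
def ranks_no_ties(values):
    """Ranks 1..N; assumes all values are distinct."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H from rank sums (no tie correction)."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    r = ranks_no_ties(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(r[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)

def r_squared_of_ranks(groups):
    """SS Between / SS Total computed on the pooled ranks."""
    pooled = [v for g in groups for v in g]
    r = ranks_no_ties(pooled)
    grand = sum(r) / len(r)
    ss_total = sum((ri - grand) ** 2 for ri in r)
    ss_between, start = 0.0, 0
    for g in groups:
        gr = r[start:start + len(g)]
        ss_between += len(gr) * (sum(gr) / len(gr) - grand) ** 2
        start += len(g)
    return ss_between / ss_total

# Hypothetical values, NOT the assignment data
groups = [[1800, 2100, 1950, 2150], [2300, 2500, 2050, 2550]]
n_total = sum(len(g) for g in groups)

h = kruskal_wallis_h(groups)
print("H =", round(h, 4), " H/(N-1) =", round(h / (n_total - 1), 4))
print("R^2 on ranks =", round(r_squared_of_ranks(groups), 4))
```

The same functions accept three or more groups, so the sketch also applies to the staffing-model comparison later in the assignment.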
Based on Gravely and Littlefield, a researcher conducted a study to determine the cost of J = 3 prenatal
clinical staffing models: (j = 1) Physician-based; (j = 2) Mixed (M.D., R.N.) staffing; and (j = 3)
Clinical Nurse Specialist with physician available for consultation. The subjects were women who
were 18 years old or older, obtained prenatal care at one of the three facilities, and who had delivered
within 48 hours of the interview. The cost was defined as the amount of money billed over and above
the amount covered by the patient’s health insurance (AMNT). These values were also converted to
ranks (RAMNT). There is a set of dummy-coded variables (Xj) that represent group membership. The
data are in file BST612-ASSN2-AMNT.xls.
Use SPSS (Analyze-Regression-Linear), SAS (PROC REG), or JMP (Fit Model) to compute the
regression solution (Y = b0 + b1X1 + b2X2) for regressing AMNT and RAMNT (Dependent, Response,
Y variable) onto X1 and X2 (Independent, Factor, X variables).
25. What is the Model R²?
AMNT: R² = _______
RAMNT: R² = _______
(2 points).
26. What are the results of the Regression Models?
AMNT: F-ratio = _______ p-value = _______
RAMNT: F-ratio = _______ p-value = _______
(3 points)
27. For AMNT, what are the values of b0? ________
(1 point)
b1? ________
(1 point)
b2? ________
(1 point)
28. For RAMNT, what are the values of b0? ________
(1 point)
b1? ________
(1 point)
b2? ________
(1 point)
29. What is the predicted value for X1 = 1?
AMNT: _______ RAMNT: _______
(2 points)
30. What is the predicted value for X2 = 1?
AMNT: _______ RAMNT: _______
(2 points)
31. How would you interpret the regression intercepts (b0)?
(3 points).
32. How would you interpret the regression slopes (b1 and b2)?
(3 points).
33. For AMNT, what is the symbolic notation for the null hypothesis for the FULL REGRESSION
model?
(3 points).
34. What does this null hypothesis mean in words?
(2 points).
SPSS: Use Analyze-General Linear Model-Univariate and
Select AMNT as the Dependent Variable and Group as Fixed Factor
Select Options – Descriptive statistics, Estimates of Effect Size, Observed Power,
Parameter Estimates, Residual Plot
Select Post-Hoc and Move Group to the Post Hoc Tests for box and Select the Tukey option
JMP: Change the GROUP variable to be Nominal, then Use Analyze-Fit Y by X
Under the Oneway Analysis Banner select the Means/Anova/Pooled t,
Means and Std Dev, and Compare Means – All Pairs, Tukey HSD options
SAS: Use PROC GLM; CLASS group; model AMNT RAMNT = group / solution; MEANS group /
tukey; RUN;
35. Complete the ANOVA Source Table with SS, df, MS, and F, and test the null hypothesis in (1) at
the α = .05 level of significance for AMNT.
(5 points)
Source     SS          df     MS       F       p-value
Between    ______      ___    _____    ____    _____
Within     _______     ___    _____
___________________________________________________________________________
Total      ________    ___
36. State the null hypotheses for the ONE-WAY ANOVA.
(3 points).
37. Explain how the H0 in (36) and the H0 in (33) are equivalent.
(4 points).
44. What is the FULL Model R²? R² = _______
(2 points)
45. Compute Tukey HSD for each pairwise comparison.
(6 points).
           Mean Diff    Lower Bound    Upper Bound
1 vs 2     ________     ________       ________
1 vs 3     ________     ________       ________
2 vs 3     ________     ________       ________
46. Complete the ANOVA Source Table with SS, df, MS, and F, and test the null hypothesis in (1) at
the α = .05 level of significance for RAMNT.
(5 points)
Source     SS          df     MS       F       p-value
Between    ______      ___    _____    ____    _____
Within     _______     ___    _____
___________________________________________________________________________
Total      ________    ___
47. What is the Model R²? R² = _______
(2 points)
48. Compute Tukey HSD for each pairwise comparison.
(6 points).
           Mean Diff    Lower Bound    Upper Bound
1 vs 2     ________     ________       ________
1 vs 3     ________     ________       ________
2 vs 3     ________     ________       ________
49. What are the results of the Kruskal-Wallis Test?
AMNT: χ² = _______ p-value = _______
RAMNT: χ² = _______ p-value = _______
(4 points)
50. Divide the χ² value by (N-1): χ²/(N-1) = ____________
(2 points)
51. How does this value relate to previous results?
(3 points)
52. Based on these results explain which Staffing Models are significantly different and if any should
be preferred over another.
(3 points).
53. Do you think there is a causal relationship between these variables? Explain.
(3 points).
54. Write a brief Results section based on your answers. The Results section should usually report
inferential tests and effect magnitudes in the text, while descriptive statistics should be reported in a
table or graph.
(7 points)
Multiple Group Comparisons (ANOVA Models) EXTRA CREDIT (up to 20 points)
EC1. Determine the F-ratio which results from the given one-way ANOVA data.
Source     df      SS       MS       F
Between    4       30.5     _____    ______
Within     ____    _____    _____
________________________________________________
Total      99      165.0
(4 points)
EC2. For the data in question EC1, the estimated percent variance in the dependent variable accounted
for by the independent variable is:
η² = R² = _______________
(2 points)
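A partially filled source table can be completed with a few arithmetic identities: df and SS are additive across Between and Within, each MS is SS/df, F is MS Between / MS Within, and R² is SS Between / SS Total. The Python sketch below illustrates these steps with hypothetical entries (not the EC1 values). The function name complete_anova_table is an illustrative choice.

```python
def complete_anova_table(df_between, ss_between, df_total, ss_total):
    """Fill in the remaining one-way ANOVA entries from the given ones."""
    df_within = df_total - df_between      # degrees of freedom are additive
    ss_within = ss_total - ss_between      # sums of squares are additive
    ms_between = ss_between / df_between   # mean square = SS / df
    ms_within = ss_within / df_within
    return {
        "df_within": df_within,
        "ss_within": ss_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
        "R2": ss_between / ss_total,       # proportion of variance accounted for
    }

# Hypothetical entries, NOT the values from EC1
table = complete_anova_table(df_between=2, ss_between=40.0,
                             df_total=29, ss_total=310.0)
for name, value in table.items():
    print(f"{name}: {value:.4f}")
```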
In a one-factor ANOVA with J = 4 groups and nj = 5 subjects per group:
EC3. Ȳ1 = 22, Ȳ2 = 24, Ȳ3 = 20, Ȳ4 = 26
What is the value for SS Between? _________________
(4 points)
EC4. s1 = 2.0, s2 = 2.2, s3 = 2.1, s4 = 2.3
What is the value for SS Within? _________________
(4 points)
EC5. Reconstruct the ANOVA Source Table
Source     df      SS       MS        F
Between    ____    _____    ______    _____
Within     ____    _____    ______
________________________________________________
Total      _____
(4 points)
EC6. For the data in question EC5, the estimated percent variance in the dependent variable accounted
for by the independent variable is:
η² = R² = _______________
(2 points)
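When only group means, standard deviations, and sample sizes are available, the sums of squares can be recovered from the standard definitions SS Between = Σ nj(Ȳj - Ȳ)² and SS Within = Σ (nj - 1)sj². The Python sketch below applies them to hypothetical summary statistics (three groups of four, not the values in EC3 and EC4). The function name ss_from_summary is illustrative.

```python
def ss_from_summary(means, sds, n_per_group):
    """SS Between, SS Within, and their df from group means, SDs, and sizes."""
    n_total = sum(n_per_group)
    # Weighted grand mean (reduces to the simple average when group sizes are equal)
    grand_mean = sum(n * m for n, m in zip(n_per_group, means)) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(n_per_group, means))
    ss_within = sum((n - 1) * s ** 2 for n, s in zip(n_per_group, sds))
    df_between = len(means) - 1
    df_within = n_total - len(means)
    return ss_between, ss_within, df_between, df_within

# Hypothetical summary statistics, NOT the EC3/EC4 values
ss_b, ss_w, df_b, df_w = ss_from_summary(
    means=[10.0, 12.0, 14.0], sds=[1.0, 2.0, 1.5], n_per_group=[4, 4, 4]
)
f_ratio = (ss_b / df_b) / (ss_w / df_w)
r_squared = ss_b / (ss_b + ss_w)

print("SS Between =", ss_b, " df =", df_b)
print("SS Within  =", ss_w, " df =", df_w)
print("F =", round(f_ratio, 4), " R^2 =", round(r_squared, 4))
```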