Assignment 4: Due Monday, 4/18/2016

STA 6166 – Spring 2016
Project 4 – Due Monday 4/18/16
Part 1: Comparing 2 Proportions
Part 1a) Independent Samples
Description: A retrospective study of treatment with Ribavirin and the survival of patients with SARS. Of 97
SARS patients given Ribavirin, 10 died. Of 132 SARS patients not given Ribavirin, 17 died.


• Test whether the probability of death in the population of SARS patients is the same, whether or not
the patient receives Ribavirin. H0: π_R - π_NR = 0 HA: π_R - π_NR ≠ 0
• Obtain a 95% Confidence Interval for π_R - π_NR
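A minimal sketch of the large-sample calculations for Part 1a, using only the counts stated above. It assumes the usual z-test with a pooled standard error under H0 and a Wald interval with an unpooled standard error; the variable names are illustrative, and the course may prefer a different method or software.

```python
import math

# Counts from the problem statement: deaths / patients in each group.
deaths_r, n_r = 10, 97       # given Ribavirin
deaths_nr, n_nr = 17, 132    # not given Ribavirin

p_r = deaths_r / n_r
p_nr = deaths_nr / n_nr
diff = p_r - p_nr

# Test of H0: use the pooled proportion in the standard error.
p_pool = (deaths_r + deaths_nr) / (n_r + n_nr)
se_pooled = math.sqrt(p_pool * (1 - p_pool) * (1 / n_r + 1 / n_nr))
z = diff / se_pooled
p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided standard normal p-value

# 95% confidence interval: unpooled standard error, z_.025 = 1.96.
se_unpooled = math.sqrt(p_r * (1 - p_r) / n_r + p_nr * (1 - p_nr) / n_nr)
ci = (diff - 1.96 * se_unpooled, diff + 1.96 * se_unpooled)

print(f"difference = {diff:.4f}, z = {z:.3f}, two-sided p = {p_value:.3f}")
print(f"95% CI: ({ci[0]:.4f}, {ci[1]:.4f})")
```

The pooled standard error is used only for the test because it is computed under H0; the interval does not assume the two proportions are equal.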
Part 1b) Dependent Samples
A study compared using genital swab versus bedside smear slide in detecting sperm in sexual assault victims.
Both methods were used on n = 724 cases. For external tests, 199 cases tested positive on both genital swab and
slide smear, 69 tested positive on genital swab and negative on slide smear, 31 tested negative on genital swab
and positive on slide smear, and 425 tested negative on both genital swab and slide smear. Test whether there is
a significant difference in the proportions of all possible cases testing positive on the 2 methods of detecting
sperm.
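Because both methods were applied to the same 724 cases, the comparison depends only on the discordant pairs. The sketch below assumes the intended procedure is the large-sample McNemar test without a continuity correction; that choice, and the variable names, are assumptions rather than something the assignment states.

```python
import math

# Paired counts from the problem statement.
both_pos = 199
swab_pos_smear_neg = 69
swab_neg_smear_pos = 31
both_neg = 425
n = both_pos + swab_pos_smear_neg + swab_neg_smear_pos + both_neg

# Marginal proportions testing positive on each method.
p_swab = (both_pos + swab_pos_smear_neg) / n
p_smear = (both_pos + swab_neg_smear_pos) / n

# McNemar's statistic uses only the two discordant counts.
z = (swab_pos_smear_neg - swab_neg_smear_pos) / math.sqrt(
    swab_pos_smear_neg + swab_neg_smear_pos
)
chi_sq = z ** 2                               # chi-square form, 1 df
p_value = math.erfc(abs(z) / math.sqrt(2))    # two-sided standard normal p-value

print(f"swab positive: {p_swab:.4f}, smear positive: {p_smear:.4f}")
print(f"z = {z:.2f}, chi-square = {chi_sq:.2f}, p = {p_value:.5f}")
```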
Part 2: Chi-Square Test for Association
A study was conducted, taking a sample of homes in Philadelphia from the 18th Century, and classifying them
based on the home value (6 categories) and whether or not they had table furnishings (Yes/No). Test whether or
not there is an association between home value category and presence/absence of table furnishings.
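The counts for this part are not reproduced here, so the following is only a sketch of the Pearson chi-square computation for an r x c table. The function name and the small demo table are hypothetical; the demo counts are made up for illustration and are not the Philadelphia data.

```python
def chi_square_independence(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / n.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Made-up 2 x 2 counts, for illustration only (NOT the assignment's data).
demo_table = [[10, 20], [20, 10]]
demo_stat, demo_df = chi_square_independence(demo_table)
print(f"chi-square = {demo_stat:.3f} on {demo_df} df")
```

For the 6 x 2 table in this part the test has (6 - 1)(2 - 1) = 5 degrees of freedom, so at the 0.05 level the statistic would be compared with the chi-square critical value 11.07.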
Part 3: Relative Risk and Odds Ratio
Typhoon Saomei caused high rates of injuries and deaths in the Longhua Village in China in 2006. The
following table gives the incidence of injury for several risk factors. For each risk factor, give the relative risk
and odds ratio (and 95% Confidence Intervals for each) for the “Risk” group relative to the “Reference” or
“baseline” group.
Risk Factor   Risk Group/Ref Group            # Injured   # Not Injured   Total
Gender        Risk=Male                       85                          1543
              Reference=Female                44                          1459
Occupation    Risk=Fisherman                  30                          164
              Reference=Other                 99                          2838
Education     Risk=Illiterate/ElemSchool      105                         1994
              Reference=At least Jr. High     24                          1008
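A sketch of the relative risk and odds ratio calculations, reading the table as (# injured, total) for the risk group followed by the reference group within each factor, with # not injured taken as total minus injured. It assumes the standard large-sample intervals built on the log scale; the function and dictionary names are illustrative.

```python
import math


def risk_measures(inj_risk, n_risk, inj_ref, n_ref, z=1.96):
    """Relative risk and odds ratio with large-sample 95% CIs (log scale)."""
    a, b = inj_risk, n_risk - inj_risk   # risk group: injured, not injured
    c, d = inj_ref, n_ref - inj_ref      # reference group: injured, not injured

    rr = (a / n_risk) / (c / n_ref)
    se_log_rr = math.sqrt(1 / a - 1 / n_risk + 1 / c - 1 / n_ref)
    rr_ci = (rr * math.exp(-z * se_log_rr), rr * math.exp(z * se_log_rr))

    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    or_ci = (odds_ratio * math.exp(-z * se_log_or),
             odds_ratio * math.exp(z * se_log_or))

    return {"RR": rr, "RR_CI": rr_ci, "OR": odds_ratio, "OR_CI": or_ci}


# (injured, total) for the risk group, then the reference group.
factors = {
    "Gender (Male vs Female)": (85, 1543, 44, 1459),
    "Occupation (Fisherman vs Other)": (30, 164, 99, 2838),
    "Education (Illiterate/Elem vs Jr. High+)": (105, 1994, 24, 1008),
}

results = {name: risk_measures(*counts) for name, counts in factors.items()}
for name, res in results.items():
    print(f"{name}: RR = {res['RR']:.3f} "
          f"({res['RR_CI'][0]:.3f}, {res['RR_CI'][1]:.3f}), "
          f"OR = {res['OR']:.3f} "
          f"({res['OR_CI'][0]:.3f}, {res['OR_CI'][1]:.3f})")
```

An interval that excludes 1 indicates a significant association between the risk factor and injury at the 0.05 level.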
Part 4: Simple Linear Regression
A researcher is interested in the effect of different levels of a nutrient in the feed of mice on
weight gain. She samples 30 mice of a particular breed and assigns them randomly to one of 6
levels of the nutrient (0, 20, 40, 60, 80, 100). There are 5 mice per level. The datasets are
micegrow.xls and micegrow.dat. The response (dependent) variable is weight change over a
3-week period.




• Obtain a scatterplot of weight change versus nutrient level
• Fit a simple linear regression, relating weight change to nutrient level
• Test whether there is a positive association between weight change and nutrient level
• Give a 95% confidence interval for the mean change in weight as nutrient level is increased
by 1 unit
• Obtain the analysis of variance table and coefficients of correlation and determination
• Conduct the F-test for Lack of fit
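The numerical steps in the list above can be sketched as follows. The layout of micegrow.dat is not described here, so the functions take plain lists of x (nutrient level) and y (weight change); the function names are hypothetical and the tiny demo data are made up, not the mice data.

```python
import math
from collections import defaultdict


def simple_linear_regression(x, y):
    """Least-squares fit of y = b0 + b1*x with the usual ANOVA quantities."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)
    se_b1 = math.sqrt(mse / sxx)
    return {
        "b0": b0, "b1": b1, "se_b1": se_b1,
        "t_b1": b1 / se_b1,               # compare with t on n-2 df (one-sided for HA: b1 > 0)
        "SSR": sst - sse, "SSE": sse, "SST": sst,
        "r": sxy / math.sqrt(sxx * sst),  # coefficient of correlation
        "R2": 1 - sse / sst,              # coefficient of determination
    }


def lack_of_fit_f(x, y, sse):
    """F statistic for lack of fit; requires replicate y values at the x levels."""
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    ss_pure_error = sum(
        sum((yi - sum(ys) / len(ys)) ** 2 for yi in ys) for ys in groups.values()
    )
    n, c = len(x), len(groups)            # c = number of distinct x levels
    ss_lack_of_fit = sse - ss_pure_error
    f_stat = (ss_lack_of_fit / (c - 2)) / (ss_pure_error / (n - c))
    return f_stat, c - 2, n - c


# Made-up demo data with replicates at each level (NOT the mice data).
demo_x = [0, 0, 1, 1, 2, 2]
demo_y = [1, 3, 3, 5, 5, 7]
fit = simple_linear_regression(demo_x, demo_y)
f_lof, df_num, df_den = lack_of_fit_f(demo_x, demo_y, fit["SSE"])
print(fit["b0"], fit["b1"], f_lof, df_num, df_den)
```

With 30 mice at 6 nutrient levels, the slope tests use 28 degrees of freedom (the 95% interval is b1 ± 2.048·se_b1) and the lack-of-fit F-test has 4 and 24 degrees of freedom.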
Part 5: Multiple Linear Regression
Description: Regression models for adjusted total costs (Y, millions of $HK)
and average floor area (m^2), total floor area (m^2), average storey height (m)
for 14 Reinforced Concrete (RC) and 23 steel buildings in Hong Kong.
Variables/Columns
Building ID (within type)     7-8
Building Type                 16      /* 1=RC, 2=Steel */
Average floor area            18-24
Total Floor Area              26-32
Average storey height         36-40
Adjusted Construction Cost    42-48
• Fit a multiple linear regression model, relating cost Y to the 3 numeric predictors (average
floor area, total floor area, and average storey height) and a dummy variable for Steel
buildings. Give the estimated regression equation.
• Obtain the actual and predicted cost for each building.
• Obtain the analysis of variance and test whether any of the predictors are associated with
cost (α=0.05):
H0: β_1=…=β_4 = 0 HA: Not all β_j are 0
• State which (if any) of the individual partial regression coefficients are significant at the
α=0.05 significance level (controlling for all other variables).
• Fit a model with all interactions between the Steel dummy variable and each of the 3 numeric
predictors. Test whether the interaction effects are all 0 simultaneously at the α=0.05
significance level using the method COMPARING REGRESSION MODELS on slides 10 and 11 of Chapter 12.
• What proportion of the variation in cost is “explained” by each model?
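The course slides are not reproduced here, so the sketch below assumes that COMPARING REGRESSION MODELS refers to the standard nested-model (partial) F-test, which compares the error sums of squares of the reduced model (no interactions) and the full model (with interactions). The function name is hypothetical, and the SSE values must come from your own fitted models.

```python
def nested_f_statistic(sse_reduced, df_reduced, sse_full, df_full):
    """Partial F statistic for comparing a reduced model with a full model.

    Returns (F, numerator df, denominator df).
    """
    df_num = df_reduced - df_full
    f_stat = ((sse_reduced - sse_full) / df_num) / (sse_full / df_full)
    return f_stat, df_num, df_full


# Error degrees of freedom for this data set (14 RC + 23 steel buildings).
n = 14 + 23
df_reduced = n - 5   # intercept + 3 numeric predictors + Steel dummy
df_full = n - 8      # reduced model + 3 Steel-by-predictor interactions

print(f"error df: reduced = {df_reduced}, full = {df_full}")
```

Under the null hypothesis that all three interaction coefficients are 0, the statistic follows an F distribution with 3 and 29 degrees of freedom.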