Stat 301 – Exam 1 Name: ________________________ October 1, 2013

INSTRUCTIONS: Read the questions carefully and completely. Answer each question
and show work in the space provided. Partial credit will not be given if work is not
shown. Use the JMP output. It is not necessary to calculate something by hand that JMP
has already calculated for you. When asked to explain, describe, or comment, do so
within the context of the problem and support statements with statistical summaries. Be
sure to include units of measurement when discussing quantitative variables.
A person’s percentage body fat is determined from a person’s density. The density is
obtained from the displacement of water in a large tub. In this exam we will look at
men’s percentage body fat.
1. [27 pts] The American Council on Exercise (ACE) has a chart that describes different
levels of percentage body fat. For male athletes the range of body fat is 6 to 13%. A
random sample of 44 men had their percentage body fat determined by water
displacement. The JMP analysis of the data is given below.
[Figure: histogram and box plot of Percentage Body Fat (axis from 0 to 50), with Count axis marked 5, 10, 15]

100.0%   maximum    40.1
 75.0%   quartile   25.0
 50.0%   median     17.1
 25.0%   quartile   11.7
  0.0%   minimum     3.7

Mean              18.78
Std Dev            9.318
Std Err Mean       1.405
Upper 95% Mean    21.61
Lower 95% Mean    15.95
N                 44

Hypothesized Value   13
DF                   43

t Test
Test Statistic    4.1145
Prob > |t|        0.0002*
Prob > t          <.0001*
Prob < t          0.9999
a) [4] Looking at the histogram, describe the distribution of percentage body fat. Be
sure to comment on shape, center and variability.
b) [3] Looking at the box plot, are there any potential outliers? How do you know
this? If so, what is (are) the associated percentage body fat(s)?
c) [8] Could this sample be from a population of athletes? Test the hypothesis that
the population mean percentage body fat is 13% versus an alternative that the
population mean percentage body fat is greater than 13%. Be sure to give the null and
alternative hypotheses using appropriate statistical notation, value of the test
statistic, P-value, decision and reason for reaching that decision and a conclusion
in the context of the problem. For this problem use the usual significance level of
0.05.
d) [4] Give the values for the 95% confidence interval for the population mean
percentage body fat. Explain briefly why this confidence interval is consistent
with the test of hypothesis you did in c).
e) [4] Construct a 95% prediction interval for the percentage body fat of a randomly
selected man. Note: the appropriate value of t* is 2.0167.
f) [4] What is the difference in interpretation between the confidence interval in d)
and the prediction interval in e)?
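Note: the quantities in parts c) and e) can be reproduced from the summary statistics in the JMP output. The following is a minimal sketch in Python, assuming the rounded values printed above and the usual formulas for a one-sample t statistic and for a prediction interval for a single new observation. The variable names are illustrative and are not part of the JMP output.

```python
import math

# Summary statistics copied from the JMP output for Problem 1.
n = 44            # sample size
mean = 18.78      # sample mean percentage body fat (%)
std_dev = 9.318   # sample standard deviation (%)
mu_0 = 13.0       # hypothesized population mean (%)
t_star = 2.0167   # critical value quoted in part e)

# Part c): standard error of the mean and one-sample t statistic.
std_err = std_dev / math.sqrt(n)
t_stat = (mean - mu_0) / std_err
df = n - 1

# Part e): prediction interval for one new observation,
# mean +/- t* * s * sqrt(1 + 1/n).
margin = t_star * std_dev * math.sqrt(1 + 1 / n)
pred_lower, pred_upper = mean - margin, mean + margin

print(f"std err = {std_err:.3f}, t = {t_stat:.2f}, df = {df}")
print(f"95% prediction interval: ({pred_lower:.2f}, {pred_upper:.2f})")
```

Because the inputs are rounded, the computed values may differ from the JMP output in the last digit or two.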
2. [33 pts] The random sample of 44 men includes 24 men who are under 40 years of
age and 20 men who are 40 to 55 years of age. The JMP analysis appears below.
[Figure: plot of BodyFat (vertical axis, 0 to 50) by Age Group (40 to 55, under 40)]

Rsquare                       0.264098
Adj Rsquare                   0.246576
Root Mean Square Error        8.087709
Mean of Response             18.77955
Observations (or Sum Wgts)   44

t Test
Assuming equal variances
Difference       9.507     t Ratio      3.88237
Std Err Dif      2.449     DF           42
Upper CL Dif    14.448     Prob > |t|   0.0004*
Lower CL Dif     4.565     Prob > t     0.0002*
Confidence       0.95      Prob < t     0.9898

Level      Number   Mean      Std Dev   Std Err Mean   Lower 95%   Upper 95%
40 to 55   20       23.9650   10.2158   2.2843         19.184      28.746
under 40   24       14.4583    5.7648   1.1767         12.024      16.893
a) [5] Compare the percentage body fat of 40 to 55 year old men to that of the men
under 40. Be sure to compare centers, variability and mention if there are any
potential outliers in either group.
b) [3] What is the value of sp, the pooled estimate of the common standard deviation,
σ?
c) [8] Test the hypothesis that there is no difference between the population mean
percentage body fat of 40 to 55 year old men and the population mean percentage
body fat of men under 40 years old. Be sure to give the null and alternative
hypotheses using clearly understood statistical symbols, value of the test statistic,
P-value, decision and reason for reaching that decision and a conclusion in the
context of the problem.
d) [5] Give the 95% confidence interval for the difference in population mean
percentage body fat. What does this say about how much the population mean
percentage body fat of men 40 to 55 years old differs from that of men under 40
years old?
e) [4] Is the condition of equal population standard deviations, σ, satisfied for these
data? Support your answer.
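Note: the pooled standard deviation in part b) and the equal-variance t ratio in part c) are linked to the group summaries by standard formulas. The following is a minimal sketch in Python, assuming the rounded group sizes, means and standard deviations printed in the JMP output. The variable names are illustrative only.

```python
import math

# Group summaries copied from the JMP output for Problem 2.
n_older, mean_older, sd_older = 20, 23.9650, 10.2158      # 40 to 55
n_younger, mean_younger, sd_younger = 24, 14.4583, 5.7648  # under 40

# Pooled variance: weighted average of the two sample variances,
# with weights equal to each group's degrees of freedom.
df = n_older + n_younger - 2
pooled_var = ((n_older - 1) * sd_older**2
              + (n_younger - 1) * sd_younger**2) / df
s_p = math.sqrt(pooled_var)

# Standard error of the difference in means and the pooled t ratio.
difference = mean_older - mean_younger
std_err_dif = s_p * math.sqrt(1 / n_older + 1 / n_younger)
t_ratio = difference / std_err_dif

print(f"s_p = {s_p:.4f}, std err dif = {std_err_dif:.3f}, "
      f"t = {t_ratio:.3f}, df = {df}")
```

The computed s_p should agree with the Root Mean Square Error in the JMP output, up to rounding of the inputs.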
Below is the JMP output for the distribution of the two-sample residuals.
[Figure: normal quantile plot, box plot and histogram of Residual (axis from -20 to 20), with Count axis marked 5, 10]
f) [3] Looking at the normal quantile plot, describe what you see and what this tells
you about the condition that random errors are normally distributed.
g) [3] Looking at the box plot, compare the median to the mean. What does this
comparison indicate about the shape of the distribution of residuals?
h) [2] Looking at the histogram, where is the mound? How would you describe the
shape of the distribution of residuals?
3. [40 pts] Measuring percentage body fat by displacement of water is a time-consuming
process that requires the individual to be naked. Could a less time-consuming and
less invasive measurement, like the circumference of a man’s abdomen (cm), be used to
predict the percentage body fat? Below is JMP output looking at the relationship
between abdomen circumference and percentage body fat.
[Figure: scatterplot of BodyFat (vertical axis, 0 to 50) versus Abdomen (horizontal axis)]

Summary of Fit
RSquare                       0.706645
RSquare Adj                   0.699660
Root Mean Square Error        5.10637
Mean of Response             18.77955
Observations (or Sum Wgts)   44

Parameter Estimates
Term           Estimate   Std Error   t Ratio   Prob>|t|   Lower 95%   Upper 95%
Intercept      -41.19     6.012       -6.85     <.0001*    -53.32      -29.06
Abdomen (cm)     0.648    0.0644      10.06     <.0001*      0.518       0.778
a) [3] Describe the general relationship between abdomen circumference and
percentage body fat. Use complete sentences and say something about direction,
form, strength and unusual values.
b) [3] Give the equation of the least squares regression line relating percentage body
fat to the abdomen circumference.
c) [2] Use the equation in b) to predict the percentage body fat for a man with
abdomen circumference of 110 cm.
d) [5] Calculate a 95% prediction interval for a man with abdomen circumference of
110 cm. Note: t* = 2.01081, the sample mean abdomen circumference is
92.6 cm, and the sample variance of abdomen circumference is 146.21 cm².
e) [5] Give an interpretation of the estimated slope coefficient within the context of
the problem.
f) [3] Why doesn’t the intercept have an interpretation within the context of the
problem?
g) [3] Give the value of R² and an interpretation of that value within the context of
the problem.
h) [2] Give the value of the estimate of the random error standard deviation, σ.
i) [6] Report the 95% confidence interval for the slope. Use this interval to test for a
statistically significant relationship between percentage body fat and abdomen
circumference.
j) [4] Describe what you see in the plot of residuals versus predicted body fat.
What does this plot tell you about the adequacy of the linear model?
[Figure: plot of BodyFat Residual (vertical axis, -10 to 10) versus BodyFat Predicted (horizontal axis, 0 to 50)]
k) [4] Comment on the condition of normally distributed random errors. Be sure to
support your comments by referring to the normal quantile plot of residuals.
[Figure: normal quantile plot of BodyFat Residual (residual axis from -10 to 10)]
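Note: the prediction in part c) and the prediction interval in part d) of Problem 3 combine the parameter estimates, the Root Mean Square Error and the values given in the note to part d). The following is a minimal sketch in Python, assuming the standard simple linear regression prediction interval, y-hat ± t* · s · sqrt(1 + 1/n + (x − x-bar)² / ((n − 1) · s_x²)), and the rounded values printed in the exam. The variable names are illustrative only.

```python
import math

# Values copied from the JMP output and from the note in part d).
n = 44              # number of men in the sample
intercept = -41.19  # estimated intercept (% body fat)
slope = 0.648       # estimated slope (% body fat per cm)
rmse = 5.10637      # Root Mean Square Error (% body fat)
t_star = 2.01081    # critical value quoted in part d)
x_bar = 92.6        # sample mean abdomen circumference (cm)
x_var = 146.21      # sample variance of abdomen circumference (cm^2)
x_new = 110.0       # abdomen circumference to predict for (cm)

# Part c): point prediction from the least squares line.
y_hat = intercept + slope * x_new

# Part d): sum of squared deviations of x, then the standard error
# for predicting a single new response at x_new.
s_xx = (n - 1) * x_var
se_pred = rmse * math.sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / s_xx)
margin = t_star * se_pred
pred_lower, pred_upper = y_hat - margin, y_hat + margin

print(f"predicted body fat = {y_hat:.2f}%")
print(f"95% prediction interval: ({pred_lower:.2f}%, {pred_upper:.2f}%)")
```

Because the estimates are rounded, the results may differ slightly from values computed with full precision.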