Stat 401B – Exam 1
September 30, 2008
Name: ________________________
INSTRUCTIONS: Read the questions carefully and completely. Answer each question
and show work in the space provided. Partial credit will not be given if work is not
shown. Use the JMP output. It is not necessary to calculate something by hand that JMP
has already calculated for you. When asked to explain, describe, or comment, do so
within the context of the problem. Be sure to include units when discussing quantitative
variables.
Believe it or not, statistics and beer go together. The statistician responsible for creating
the t-table was William Sealy Gosset, the master brewer for Guinness in Dublin, Ireland
from 1932 to 1937. This exam will look at data for a random sample of 40 beers brewed
in the United States. Of particular interest is the alcohol content (%). This will be the
response variable for the various analyses that quantify and try to explain variation.
1. [27 pts] Refer to the JMP output Distribution of % Alcohol in answering the
following questions.
a) [2] Looking at the histogram, describe the shape of the distribution of %
Alcohol.
b) [4] Give the values of the sample mean and sample median % Alcohol.
What does the comparison of these values indicate about the shape of the
distribution? Explain briefly.
c) [2] Looking at the box plot, are there any potential outliers? If so, what is
(are) the associated % Alcohol value(s)?
d) [8] Is the population mean % Alcohol for beers brewed in the U.S. 5% or
is it something lower than 5%? Perform the appropriate test of
hypotheses. Be sure to give the null and alternative hypotheses, value of
the test statistic, P-value, decision and reason for reaching that decision
and a conclusion in the context of the problem. For this problem use the
usual significance level of 0.05.
e) [4] Give a 90% confidence interval for the population mean % Alcohol.
Note: the appropriate value of t* is 1.685. Explain briefly why this
confidence interval is consistent with the test of hypothesis you did in d).
f) [7] Comment on the distribution of the residuals for this one sample
problem. Indicate what you see in the plots and what this tells you about
the conditions on the random error in our model.
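Note: for readers working without the JMP output, the calculations behind parts d) and e) can be sketched as below. This is only an illustration. The sample mean and standard deviation are hypothetical placeholders, not the values from the exam's output, and the function names are made up for this sketch. The value t* = 1.685 is the one given in part e). Because a lower-tailed test at the 0.05 level and a 90% confidence interval both use the same t*, the test rejects exactly when the upper limit of the interval falls below 5.

```python
import math

def one_sample_t(n, mean, sd, mu0):
    """Return (standard error, t statistic) for testing H0: mu = mu0."""
    se = sd / math.sqrt(n)
    return se, (mean - mu0) / se

def t_interval(mean, se, t_star):
    """Return the (lower, upper) limits of mean +/- t* x SE."""
    return mean - t_star * se, mean + t_star * se

# Hypothetical summary statistics, NOT taken from the JMP output.
n, mean, sd = 40, 4.7, 0.9
t_star = 1.685  # given in part e), 39 degrees of freedom

se, t_stat = one_sample_t(n, mean, sd, mu0=5.0)
low, high = t_interval(mean, se, t_star)

# Lower-tailed test at the 0.05 level: reject H0 when t < -t*.
reject_h0 = t_stat < -t_star

print(f"t = {t_stat:.3f}, 90% CI = ({low:.3f}, {high:.3f}), reject H0: {reject_h0}")
```

With the real output, the P-value reported by JMP would replace the comparison against -t*.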
2. [33 pts] It turns out that 15 of the beers in our random sample are designated as
light beers. Light beers are brewed to have fewer calories than regular beers. The
categorical variable Light is Y if the beer is a light beer and N if the beer is a
regular beer. Refer to the JMP output Comparison of Light and Regular Beers.
a) [4] Report the sample means and sample standard deviations of % Alcohol
for Light and Regular beers. Be sure to label these clearly.
b) [4] What is the value of sp, the pooled estimate of the common standard
deviation, σ?
c) [8] Test the hypothesis that there is no difference between the population
mean % Alcohol of light and regular beers against the alternative that there is
a difference. Be sure to give the null and alternative hypotheses, value of
the test statistic, P-value, decision and reason for reaching that decision
and a conclusion in the context of the problem.
d) [5] Give the 95% confidence interval for the difference in population
mean % Alcohol (Regular – Light). What does this say about how much
light beer differs from regular beer?
e) [5] Is the condition of equal population standard deviations, σ, satisfied for
these data? Support your answer two different ways.
f) [7] Looking at the analysis of the distribution of the two-sample residuals,
describe what you see in each of the three plots (histogram, box plot and
normal quantile plot). Based on your descriptions, are the normality
condition and/or the identically distributed condition satisfied for these
data? Explain briefly.
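Note: the pooled two-sample calculations in parts b) through d) can be sketched as follows. The group sizes (15 light, 25 regular) come from the problem statement. The group means and standard deviations are hypothetical placeholders, not the JMP values, and the function names are made up for this sketch. The multiplier 2.024 is the 95% t* for 15 + 25 - 2 = 38 degrees of freedom.

```python
import math

def pooled_sd(n1, s1, n2, s2):
    """Pooled estimate of the common standard deviation sigma."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def pooled_t(n1, mean1, s1, n2, mean2, s2):
    """Return (difference, standard error, t statistic) for mean1 - mean2."""
    sp = pooled_sd(n1, s1, n2, s2)
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    diff = mean1 - mean2
    return diff, se, diff / se

# Group sizes are from the exam; means and SDs are hypothetical.
n_reg, mean_reg, sd_reg = 25, 5.0, 0.8
n_light, mean_light, sd_light = 15, 4.1, 0.6
t_star = 2.024  # 95% confidence, 38 degrees of freedom

sp = pooled_sd(n_reg, sd_reg, n_light, sd_light)
diff, se, t_stat = pooled_t(n_reg, mean_reg, sd_reg, n_light, mean_light, sd_light)
ci = (diff - t_star * se, diff + t_star * se)  # Regular - Light

print(f"sp = {sp:.4f}, t = {t_stat:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

The difference is taken as Regular minus Light to match the order requested in part d).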
3. [40 pts] In addition to the % Alcohol, the caloric content (Calories) is recorded for
each beer in the random sample of 40 beers. Refer to the JMP output
Relationship between % Alcohol and Calories.
a) [2] Describe the general relationship between Calories and % Alcohol.
Use complete sentences.
b) [3] Give the equation of the least squares regression line predicting %
Alcohol from Calories.
c) [5] Give an interpretation of the estimated slope coefficient within the
context of the problem.
d) [3] Why doesn’t the estimated intercept have a meaningful interpretation
within the context of the problem?
e) [3] Give the percentage of the variation in % Alcohol that is explained by
the linear relationship with Calories.
f) [2] Give the value of the estimate of the error standard deviation, σ.
g) [6] Is there a statistically significant linear relationship between Calories
and % Alcohol? Test the appropriate hypothesis and include all relevant
steps.
h) [3] Describe what you see in the plot of residuals versus Calories. What
does this plot tell you about the linear model?
i) [2] There appear to be two potential outlying residuals. Identify the
beers, % Alcohol values and Calories that correspond to these residuals.
j) [6] Comment on the condition of normally distributed random errors. Be
sure to support your comments by referring to the histogram, box plot and
normal quantile plot.
k) [5] Give the predicted % Alcohol and a 95% prediction interval for a beer
with 160 calories. Note t* = 2.024.
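Note: the regression quantities asked for in Question 3 (slope, intercept, R-squared, the estimate of σ, the slope's standard error, and a prediction interval) can be sketched as below. The six data points are invented for illustration and are not the exam's beers. The function names are hypothetical. Because the toy sample has 6 - 2 = 4 degrees of freedom, the sketch uses t* = 2.776 rather than the 2.024 that applies to the exam's 38 degrees of freedom.

```python
import math

def fit_line(x, y):
    """Least squares fit of y on x, with the summary values needed later."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    rmse = math.sqrt(sse / (n - 2))  # estimate of the error SD, sigma
    return {"n": n, "x_bar": x_bar, "sxx": sxx, "slope": slope,
            "intercept": intercept, "r_squared": 1 - sse / sst,
            "rmse": rmse, "slope_se": rmse / math.sqrt(sxx)}

def prediction_interval(fit, x0, t_star):
    """Predicted y at x0 and the limits of the prediction interval."""
    y_hat = fit["intercept"] + fit["slope"] * x0
    se_pred = fit["rmse"] * math.sqrt(
        1 + 1 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"])
    return y_hat, y_hat - t_star * se_pred, y_hat + t_star * se_pred

# Invented data for illustration only (Calories, % Alcohol).
calories = [95, 110, 140, 150, 165, 180]
alcohol = [3.9, 4.2, 4.7, 5.0, 5.4, 5.9]

fit = fit_line(calories, alcohol)
t_slope = fit["slope"] / fit["slope_se"]  # tests H0: slope = 0
y_hat, low, high = prediction_interval(fit, x0=160, t_star=2.776)

print(f"slope = {fit['slope']:.4f}, R^2 = {fit['r_squared']:.3f}, t = {t_slope:.2f}")
print(f"prediction at 160 calories: {y_hat:.2f} ({low:.2f}, {high:.2f})")
```

With the JMP output in hand, only the final prediction interval needs hand calculation; the other quantities are reported directly.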