Stat 401B – Exam 1
September 30, 2008
Name: ________________________

INSTRUCTIONS: Read the questions carefully and completely. Answer each question and show work in the space provided. Partial credit will not be given if work is not shown. Use the JMP output. It is not necessary to calculate something by hand that JMP has already calculated for you. When asked to explain, describe, or comment, do so within the context of the problem. Be sure to include units when discussing quantitative variables.

Believe it or not, statistics and beer go together. The statistician responsible for creating the t-table was William Sealy Gosset, the master brewer for Guinness in Dublin, Ireland from 1932 to 1937. This exam will look at data for a random sample of 40 beers brewed in the United States. Of particular interest is the alcohol content (%). This will be the response variable for the various analyses that quantify and try to explain variation.

1. [27 pts] Refer to the JMP output Distribution of % Alcohol in answering the following questions.

a) [2] Looking at the histogram, describe the shape of the distribution of % Alcohol.

b) [4] Give the values of the sample mean and sample median % Alcohol. What does the comparison of these values indicate about the shape of the distribution? Explain briefly.

c) [2] Looking at the box plot, are there any potential outliers? If so, what is (are) the associated % Alcohol value(s)?

d) [8] Is the population mean % Alcohol for beers brewed in the U.S. 5% or is it something lower than 5%? Perform the appropriate test of hypotheses. Be sure to give the null and alternative hypotheses, the value of the test statistic, the P-value, your decision and the reason for reaching that decision, and a conclusion in the context of the problem. For this problem use the usual significance level of 0.05.

e) [4] Give a 90% confidence interval for the population mean % Alcohol. Note: the appropriate value of t* is 1.685.
Explain briefly why this confidence interval is consistent with the test of hypotheses you did in d).

f) [7] Comment on the distribution of the residuals for this one sample problem. Indicate what you see in the plots and what this tells you about the conditions on the random error in our model.

2. [33 pts] It turns out that 15 of the beers in our random sample are designated as light beers. Light beers are brewed to have fewer calories than regular beers. The categorical variable Light is Y if the beer is a light beer and N if the beer is a regular beer. Refer to the JMP output Comparison of Light and Regular Beers.

a) [4] Report the sample means and sample standard deviations of % Alcohol for Light and Regular beers. Be sure to label these clearly.

b) [4] What is the value of sp, the pooled estimate of the common standard deviation, σ?

c) [8] Test the hypothesis that there is no difference between the population mean % Alcohol of light and regular beers against the alternative that there is a difference. Be sure to give the null and alternative hypotheses, the value of the test statistic, the P-value, your decision and the reason for reaching that decision, and a conclusion in the context of the problem.

d) [5] Give the 95% confidence interval for the difference in population mean % Alcohol (Regular – Light). What does this say about how much light beer differs from regular beer?

e) [5] Is the condition of equal population standard deviations, σ, satisfied for these data? Support your answer two different ways.

f) [7] Looking at the analysis of the distribution of the two-sample residuals, describe what you see in each of the three plots (histogram, box plot and normal quantile plot). Based on your descriptions, are the normality condition and/or the identically distributed condition satisfied for these data? Explain briefly.

3. [40 pts] In addition to the % Alcohol, the caloric content (Calories) is recorded for each beer in the random sample of 40 beers.
Refer to the JMP output Relationship between % Alcohol and Calories.

a) [2] Describe the general relationship between Calories and % Alcohol. Use complete sentences.

b) [3] Give the equation of the least squares regression line predicting % Alcohol from Calories.

c) [5] Give an interpretation of the estimated slope coefficient within the context of the problem.

d) [3] Why doesn’t the estimated intercept have a meaningful interpretation within the context of the problem?

e) [3] Give the percentage of the variation in % Alcohol that is explained by the linear relationship with Calories.

f) [2] Give the value of the estimate of the error standard deviation, σ.

g) [6] Is there a statistically significant linear relationship between Calories and % Alcohol? Test the appropriate hypothesis and include all relevant steps.

h) [3] Describe what you see in the plot of residuals versus Calories. What does this plot tell you about the linear model?

i) [2] There appear to be two potential outlying residuals. Identify the beers, and their % Alcohol and Calories values, that correspond to these residuals.

j) [6] Comment on the condition of normally distributed random errors. Be sure to support your comments by referring to the histogram, box plot and normal quantile plot.

k) [5] Give the predicted % Alcohol and a 95% prediction interval for a beer with 160 calories. Note: t* = 2.024.
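
Supplementary note: the few hand calculations this exam asks for (the test statistic in 1d, the confidence interval in 1e, the pooled standard deviation in 2b, and the prediction interval in 3k) all follow from summary statistics that JMP reports. The Python sketch below shows one way to organize them using the standard textbook formulas. The function names are hypothetical, and every number in the demonstration is a placeholder, not a value from the JMP output; only n = 40 and the t* values 1.685 and 2.024 come from the exam. The prediction interval assumes the usual simple linear regression form with Sxx, the sum of squared deviations of Calories from its mean. If JMP already reports a standard error for the prediction, use that instead.

```python
import math


def one_sample_t(mean, sd, n, mu0):
    """t statistic for H0: mu = mu0, computed from summary statistics."""
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error


def t_interval(mean, sd, n, t_star):
    """Confidence interval mean +/- t* * sd / sqrt(n)."""
    margin = t_star * sd / math.sqrt(n)
    return (mean - margin, mean + margin)


def pooled_sd(s1, n1, s2, n2):
    """Pooled estimate of a common standard deviation from two samples."""
    pooled_variance = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return math.sqrt(pooled_variance)


def prediction_interval(y_hat, s, n, x0, x_bar, sxx, t_star):
    """Prediction interval for one new response at x0 in simple linear regression.

    s is the estimated error standard deviation (root mean square error),
    x_bar is the mean of the explanatory variable, and sxx is the sum of
    squared deviations of the explanatory variable from x_bar.
    """
    se_pred = s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)
    margin = t_star * se_pred
    return (y_hat - margin, y_hat + margin)


if __name__ == "__main__":
    # PLACEHOLDER summary statistics, NOT taken from the JMP output.
    placeholder_mean = 4.0
    placeholder_sd = 1.0

    # n = 40 and t* = 1.685 are the values stated in the exam.
    print(one_sample_t(placeholder_mean, placeholder_sd, n=40, mu0=5.0))
    print(t_interval(placeholder_mean, placeholder_sd, n=40, t_star=1.685))
```

To use the sketch, replace the placeholder mean and standard deviation with the values read from the relevant JMP report. The same pattern applies to `pooled_sd` (15 light and 25 regular beers) and to `prediction_interval` (x0 = 160 calories, t* = 2.024).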