Statistics 401C
October 2, 2001
Exam 1
Name:

INSTRUCTIONS: Read the questions carefully and completely. Answer each question and show work in the space provided. Partial credit will not be given if work is not shown. When asked to explain, describe, or comment, do so within the context of the problem.

1. [20 pts] Data is collected on the average life expectancy for males and females for 35 of the most populated countries of the world, together with the average number of televisions per 1000 people. Below is a plot of average female life expectancy, Y, versus the average number of TVs per 1000 people, X.
(a) [2] Describe the relationship between female life expectancy and number of TVs.
(b) [4] Below are summaries of the data.
    n = 35
    ΣX = 5,378.6
    ΣY = 2,431
    Σ(X − X̄)(Y − Ȳ) = 31,191.6
    Σ(X − X̄)² = 555,476.0
    Σ(Y − Ȳ)² = 2,524.7
Calculate the correlation coefficient between female life expectancy and number of TVs.
(c) [6] Test to see if the correlation is different from zero. Be sure to report the value of the test statistic, P-value, a decision, reason for reaching the decision and a conclusion.
(d) [4] A charitable organization looks at these data and decides to buy TVs and ship them to countries with low life expectancies. Will this help? Explain briefly.
(e) [4] In these same 35 countries there is a strong positive correlation between the number of physicians per 1000 people and the number of TVs per 1000 people. Does this mean that physicians own most of the TVs? Explain briefly.

2. [20 pts] For a Stat 101 group project one group went to the Lied Recreational Facility and timed individuals from the time they entered the room with the weights and exercise equipment until they left. Workout times for 30 individuals were obtained. Refer to the JMP output and answer the following questions about the analysis of these workout times.
(a) [3] Describe the shape of the histogram.
(b) [4] What are the mean and median workout times?
How do they support your description in (a)?
(c) [8] Test the hypothesis that the population mean workout time is 60 minutes against the alternative that it is greater than 60 minutes. Use a significance level of α = 0.05.
(d) [5] Is the condition of normality satisfied for these data? Support your answer by referring to the JMP output.

3. [30 pts] Manatees are large, gentle sea mammals that live in the warm waters of Florida. Because these creatures tend to float just below the surface of the water, they are subject to injury and sometimes death from propellers on motor boats. We wish to investigate whether there is a linear relationship between the number of motor boats registered in Florida and the number of manatees killed each year. Refer to the JMP output for the number of motor boats registered (thousands) and the number of manatees killed.
(a) [3] Give the equation of the least squares regression line.
(b) [4] Give an interpretation of the slope within the context of the problem.
(c) [5] Is the slope parameter zero? Support your answer by referring to the JMP output.
(d) [3] What is the predicted value and residual for 1985, when there were 585,000 motor boats registered?
(e) [5] Give the values for and an interpretation of a 95% prediction interval for a year that has 585,000 motor boats registered.
(f) [3] Would you use the equation of the least squares line in (a) to predict the number of manatees killed in a year that had only 300,000 motor boats registered? Explain briefly.
(g) [4] Give the value of r² and an interpretation of this value.
(h) [3] Comment on the plot of residuals versus motor boats.

4. [30 pts] There is some indication that consumption of moderate amounts of wine can help prevent heart attacks. Below is a plot and summary of yearly wine consumption (liters of alcohol from drinking wine, per person) and yearly death rates from heart disease (deaths per 100,000 people) for 9 European countries.
    Variable     N    Mean     StDev
    Wine         9    3.97     2.26
    DeathRate    9    161.1    61.85
(a) [3] The least squares regression line is Pred Death Rate = 246.0 - 21.4*Wine. Graph this line on the plot above. It must be clear to me that you have used the equation to plot the line.
(b) [2] Give an interpretation of the intercept within the context of the problem.
(c) [6] The estimate of the error standard deviation is S_Y|X = 41.215. Construct a 95% confidence interval for the slope parameter.
(d) [3] Give an interpretation of this confidence interval.
(e) [3] Based on your confidence interval in (c) is there a significant linear relationship between wine consumption and death rate?
(f) [2] Use the least squares regression line to predict the death rate for another European country, France, that has a wine consumption of 9.1.
(g) [2] The actual death rate for France is 71. What is the residual for France?
(h) [5] What is the value of r² for these data?
(i) [4] Below is a plot of residuals. Describe what you see in the plot and what this tells you about the relationship between wine consumption and death rates from heart disease.
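
The arithmetic in Problems 1(b)-(c) and 4(c), (f)-(h) can be checked with a few lines of code. What follows is a minimal Python sketch, not part of the exam or an official key. It uses only the summary values printed in those problems. The variable names are illustrative, the critical value 2.365 is the 0.975 quantile of the t distribution with 7 degrees of freedom taken from a standard t table, and the r² in 4(h) assumes the simple-linear-regression identity r = b1 * sX / sY.

```python
import math

# --- Problem 1(b)-(c): correlation from the summary sums ---
n1 = 35
sxy = 31191.6     # sum of (X - Xbar)(Y - Ybar)
sxx = 555476.0    # sum of (X - Xbar)^2
syy = 2524.7      # sum of (Y - Ybar)^2

r = sxy / math.sqrt(sxx * syy)
# t statistic for H0: rho = 0, with n - 2 = 33 degrees of freedom
t_r = r * math.sqrt(n1 - 2) / math.sqrt(1 - r ** 2)

# --- Problem 4(c), (f), (g), (h): wine and heart-disease death rate ---
n4 = 9
b0, b1 = 246.0, -21.4               # given least squares line
sd_wine, sd_death = 2.26, 61.85     # given sample standard deviations
s_yx = 41.215                       # given error standard deviation
T_CRIT_7DF = 2.365                  # assumption: t table, 0.975 quantile, df = 7

sxx_wine = (n4 - 1) * sd_wine ** 2  # sum of squares of Wine about its mean
se_b1 = s_yx / math.sqrt(sxx_wine)  # standard error of the slope
ci_low = b1 - T_CRIT_7DF * se_b1
ci_high = b1 + T_CRIT_7DF * se_b1

pred_france = b0 + b1 * 9.1         # predicted death rate at Wine = 9.1
resid_france = 71 - pred_france     # observed minus predicted

r2_wine = (b1 * sd_wine / sd_death) ** 2  # uses r = b1 * sX / sY

print(f"1(b) r = {r:.4f}, 1(c) t = {t_r:.3f} on {n1 - 2} df")
print(f"4(c) SE(b1) = {se_b1:.3f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
print(f"4(f) prediction = {pred_france:.2f}, 4(g) residual = {resid_france:.2f}")
print(f"4(h) r^2 = {r2_wine:.4f}")
```

The P-value in 1(c) is left out on purpose, because computing it needs a t-distribution function that the Python standard library does not provide. It can be read from a t table using the statistic and degrees of freedom printed above.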