Chapter 14 Write-ups

Problem 10.
Step 1: Linear regression t-interval
x = speed
y = steps per second
Step 2: There is a strong positive linear relationship between speed and steps per second of competitive runners. By linear regression: ŷ = 1.76608 + 0.080284x
The residual plot has a strong pattern, calling the linear model into question. The residuals are quite small, however, ranging only from -0.01 to 0.01, so we may be able to tolerate that amount of pattern. The standard deviation is relatively constant throughout. The data appear to be independent, and ordinary measurement procedures should ensure that. The normal quantile plot (normal probability plot) of the residuals is roughly linear, suggesting a normal model for the residuals.
Step 3: b ± t* · SE_b, df = 5
0.080284 ± 4.032(0.0016)
(0.0738, 0.0867)
Step 4: We are 99% confident that the slope of the linear relationship between speed and number of steps per second falls within (0.0738, 0.0867).

Now for a test of significance:
Step 1: H0: β = 0
Ha: β > 0
β is the true slope of the line between running speed and number of steps per second for competitive runners.
Step 2: Same as above.
Step 3: t = b / SE_b = 0.080284 / 0.0016 ≈ 49.7, df = 5 (0.0016 is SE_b rounded)
Step 4: The shaded region is too small to see.
Step 5: P(t > 49.7) = 3.12 × 10^-8
Step 6: Reject H0; a test statistic this large will rarely happen by chance alone. p = 3.12 × 10^-8 < 0.01 = α
Step 7: We have strong evidence of a positive linear relationship between speed and number of steps per second for competitive runners.

Problem 11
Step 1: Linear regression t-interval
x = year
y = lean measured by tenths of mm over 2.9 m
Step 2: There is a strong positive linear relationship between year and amount of lean of the Leaning Tower of Pisa. Linear regression produces: ŷ = -61.1209 + 9.31868x
The residual plot is patternless, confirming the linear model.
It also shows that the standard deviation remains about the same throughout. The normal probability plot of the residuals is fairly nonlinear, calling into question whether the residuals vary normally. This data collected over time must give independent measures.
Step 3: b ± t* · SE_b, df = 11
9.31868 ± 2.201(0.3099)
(8.636, 10.000)
Step 4: We are 95% confident that the slope of the linear relationship between year and lean of the Tower of Pisa falls within (8.636, 10.000). (There was a problem meeting the requirement that the residuals vary normally.)

Now for a test of significance:
Step 1: H0: β = 0
Ha: β > 0
β is the true slope of the line between the year and the amount of lean of the Tower of Pisa.
Step 2: Same as above.
Step 3: t = b / SE_b = 9.31868 / 0.3099 = 30.1, df = 11
Step 4: The shaded region is too small to see.
Step 5: P(t > 30.1) < 0.0001
Step 6: Reject H0; a test statistic this large will rarely happen by chance alone. p < 10^-4 < 0.01 = α
Step 7: We have strong evidence of a positive linear relationship between the year and the amount of lean of the Tower of Pisa. (There was a problem meeting the requirement that the residuals vary normally.)

Problem 15
Step 1: Linear regression t-interval for slope
Slope = Δ metabolic rate / Δ mass
Step 2: This scatterplot of mass vs. metabolic rate shows a strong positive linear relationship. The least squares line is ŷ = 113.165 + 26.878x, where x is mass in kg and y is metabolic rate in calories.
The residual plot of mass vs. residuals is patternless, confirming the linear model. The standard deviation appears to be approximately the same throughout. The normal probability plot of the residuals is only very roughly linear, so there is uncertainty about normality, giving a possible assumption violation. Ordinary experimental procedures will give independent data, so this assumption appears to be met.
Step 3: b ± t* · SE_b, df = 17
26.878 ± 1.740(3.786)
(20.29, 33.46)
Step 4: We are 90% confident that the true slope of the regression line between mass and metabolic rate is between 20.29 and 33.46 calories per kg. There was an assumption violation, however. In repeated random sampling this method captures the true slope 90% of the time.
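
The interval and test-statistic arithmetic used in these write-ups can be checked with a short Python sketch. The function names slope_interval and slope_t_statistic are illustrative only and do not come from the write-ups. The critical value t* is entered by hand from the values quoted above rather than computed, and because the reported standard errors are rounded, the last digit may differ slightly from the calculator results.

```python
# Sketch: recompute a slope interval and a t statistic from reported values.
# Function names are hypothetical; t* is taken from the write-up, not computed.

def slope_interval(b, se_b, t_star):
    """Return (lower, upper) for the interval b ± t* · SE_b."""
    margin = t_star * se_b
    return b - margin, b + margin


def slope_t_statistic(b, se_b):
    """Return t = b / SE_b, the statistic for testing H0: slope = 0."""
    return b / se_b


# Problem 15: b = 26.878, SE_b = 3.786, t* = 1.740 (90% confidence, df = 17)
lower, upper = slope_interval(26.878, 3.786, 1.740)
print(f"90% interval for slope: ({lower:.2f}, {upper:.2f})")

# Problem 11: b = 9.31868, SE_b = 0.3099 (df = 11)
t = slope_t_statistic(9.31868, 0.3099)
print(f"t statistic: {t:.1f}")
```

The same two functions apply to Problem 10 by substituting b = 0.080284, SE_b = 0.0016, and t* = 4.032.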