AP Statistics
Linear Regression t Test NOTES

Body weights and backpack weights were collected for eight students.

| Weight (lbs) | Backpack weight (lbs) |
|---|---|
| 120 | 26 |
| 187 | 30 |
| 109 | 26 |
| 103 | 24 |
| 131 | 29 |
| 165 | 35 |
| 158 | 31 |
| 116 | 28 |

These data were entered into a statistics package, and a least-squares regression of backpack weight on body weight was requested. Here are the results:

| Predictor | Coef | Stdev | t-ratio | P |
|---|---|---|---|---|
| Constant | 16.265 | 3.937 | 4.13 | 0.006 |
| Backpack | 0.0908 | 0.02831 | 3.21 | 0.018 |

S = 2.270   R-sq = 63.2%   R-sq(adj) = 57%

What is the equation of the least-squares line?

Interpret r-squared in the context of this problem.

If a scatterplot, residual plot, and r do not provide convincing evidence for a useful linear relationship, we can use the following model utility test for simple linear regression to help us assess the situation more definitively.

Our data are a sample; we have not taken a census of all students' body weights and backpack weights. So the a and b we have been calculating are statistics: they are based on a sample. But there is some "true" value out there for the slope of the regression line.

Let $\beta$ represent the true slope of the linear regression line. This is a parameter. In the same way that $\bar{x}$ is the sample mean used to estimate the population parameter $\mu$, b is the sample slope statistic used to estimate the true slope $\beta$ of the regression line (from $\hat{y} = a + bx$).

Hypotheses about $\beta$ can be tested using a t-test very similar to the t-tests discussed for other situations. The null hypothesis specifies that there is no useful linear relationship between x and y, whereas the alternative hypothesis specifies that there is a useful linear relationship between x and y. If Ho is rejected, we conclude that the simple linear regression model is useful for predicting y. This means that knowledge of x is useful for predicting y.

Ho: $\beta = 0$ (there is no useful linear relationship)
Ha: $\beta \neq 0$ (there is a useful linear relationship) ….
or $\beta > 0$ or $\beta < 0$ (if needed)

Test statistic:

$$t = \frac{b - \text{hypothesized value}}{s_b}$$

where $df = n - 2$ and $s_b = \dfrac{s_e}{\sqrt{S_{xx}}}$

Conditions:
a) The scatterplot indicates a reasonable linear relationship.
b) The set of observations represents the population and was randomly selected.
c) The residual plot does not show any curved pattern and is reasonably scattered, with little skewness and no extreme outliers.
d) The errors around the regression line at each value of x follow a normal distribution.

CALCULATOR TEST: LinRegTTest

When p is small, we reject Ho and conclude that knowing x does give us information about y.

What you must know (or have) for the following HT and CI:
$b_1$ = slope estimate (average) based on sample data
$\beta$ = hypothesized population parameter for slope
$s_{b_1}$ = standard deviation of slope estimate based on sample data
n = sample size (number of sampled paired data)
s.e.($s_{b_1}$) = standard error of slope estimate = $\dfrac{s_{b_1}}{\sqrt{n}}$ : This is what appears on computer outputs.
t* = critical test value based on confidence level and degrees of freedom

Run LinRegTTest on the calculator and compare with the computer output above.

Use the data from example 1 to determine if there is a significant relationship between body weight and backpack weight.

The Fish and Wildlife Agency is interested in being able to estimate the weight of bears based on their length. Data were collected from a random sample of 143 bears, and a least-squares regression line was estimated. The output from this model is provided below.

| Predictor | Coef | Stdev | t-ratio | P |
|---|---|---|---|---|
| Constant | -422.49 | 31.19 | -13.55 | 0.000 |
| Length | 10.1487 | 0.5031 | 20.17 | 0.000 |

S = 56.07   R-sq = 74.3%   R-sq(adj) = 74.1%

(Analysis of variance information may also be included, but is not tested on the AP test.)
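For checking the backpack example (example 1) without a calculator, the slope test can be sketched in a few lines of Python. This is only an illustration, not the statistics package's method: the function names `slope_t_test` and `two_sided_p` are made up for this sketch, and the P-value is approximated by integrating the t density with Simpson's rule as a stand-in for a t table or LinRegTTest. It uses the formulas above with a hypothesized slope of 0, $s_b = s_e/\sqrt{S_{xx}}$, and df = n - 2.

```python
import math

# Data from example 1 (body weight in lbs, backpack weight in lbs).
body = [120, 187, 109, 103, 131, 165, 158, 116]
pack = [26, 30, 26, 24, 29, 35, 31, 28]

def slope_t_test(x, y, hypothesized=0.0):
    """Return (a, b, s_e, s_b, t, df) for the least-squares line y-hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                      # sample slope
    a = y_bar - b * x_bar              # sample intercept
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    s_e = math.sqrt(sse / df)          # standard deviation of the residuals (S on the output)
    s_b = s_e / math.sqrt(sxx)         # standard error of the slope (Stdev on the output)
    t = (b - hypothesized) / s_b
    return a, b, s_e, s_b, t, df

def two_sided_p(t, df, steps=2000):
    """Approximate two-sided P-value: integrate the t density from 0 to |t| (Simpson's rule)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda u: c * (1 + u * u / df) ** (-(df + 1) / 2)
    h = abs(t) / steps                 # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3               # area under the density between 0 and |t|
    return 1 - 2 * area                # both tails beyond |t|

a, b, s_e, s_b, t, df = slope_t_test(body, pack)
p = two_sided_p(t, df)
print(f"y-hat = {a:.3f} + {b:.4f} x")
print(f"s = {s_e:.3f}, s_b = {s_b:.5f}, t = {t:.2f}, df = {df}, P = {p:.3f}")
```

The printed values should agree, up to rounding, with the Coef, Stdev, S, t-ratio, and P entries in the computer output for example 1. A small P leads to rejecting Ho.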