AP Statistics Section 3.2 B Residuals

One of the purposes of drawing a regression line is to predict y based on x. Since any prediction errors we make are errors in y, we would like to find the line that makes the vertical distances from our data points to the line as small as possible.

The predicted response will usually not be exactly the same as the actual observed response. One of the first principles of data analysis is to look for an overall pattern and for striking deviations from that pattern. Residuals help determine how well the least-squares line (LSL) fits the data.

A residual is the difference between an observed value of the response variable and the value predicted by the regression line.

$\text{residual} = y - \hat{y}$

Example: Refer back to Section 3.2 A and find the residual for the subject whose NEA rose by 135 calories.

$\hat{y} = 3.505 - 0.0034(135) = 3.046$

$\text{residual} = 2.7 - 3.046 = -0.346$

The residual is negative because the data point lies below the line. The sum of the least-squares residuals is always 0.

The graph at the right below is the residual plot of the NEA vs. Fat Gain example in Section 3.2 A. A residual plot makes it easier to study the residuals by plotting them against the explanatory variable. Because the mean of the residuals is always 0, the horizontal line at zero helps orient us. This "residual = 0" line corresponds to the regression line we drew in the Section 3.2 A notes. The residual plot magnifies the deviations from the line to make patterns easier to see. If the regression line captures the overall pattern of the data, there should be no pattern in the residuals, such as in the graph to the left.

CALCULATOR:
Put data in 2 lists and find the LSL.
2nd Y= (STATPLOT)
Xlist: Explanatory variable
Ylist: RESID (2nd STAT (List), NAMES, scroll down to RESID, ENTER)
GRAPH
ZOOM 9

Here are two important things to look for when you examine a residual plot.
1. The residual plot should show no obvious patterns, such as a fan shape.
2. The residuals should be relatively small in size.
Since smallness is relative, we could find the standard deviation of the residuals, which is given by the equation

$s = \sqrt{\dfrac{\sum \text{residuals}^2}{n - 2}}$

The standard deviation of the residuals represents the amount of error that could "consistently" occur using the LSL to make predictions.
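
The residual and standard deviation calculations in these notes can also be checked outside the calculator. The following is a minimal Python sketch, not part of the original notes. The function names are hypothetical, the line 3.505 - 0.0034x and the point (135, 2.7) come from the example in these notes, and the small toy data set is made up purely for illustration (it is not the NEA data).

```python
import math


def predict(x, intercept=3.505, slope=-0.0034):
    """Predicted response from the line y-hat = 3.505 - 0.0034x used in the example."""
    return intercept + slope * x


def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residual_std_dev(residuals):
    """Standard deviation of the residuals: sqrt(sum of squared residuals / (n - 2))."""
    n = len(residuals)
    return math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))


# Residual for the subject whose NEA rose by 135 calories (observed y = 2.7).
nea_residual = 2.7 - predict(135)
print(round(nea_residual, 3))  # -0.346

# Made-up toy data (NOT the NEA data), used only to illustrate the formulas.
toy_x = [1, 2, 3, 4, 5]
toy_y = [2, 4, 5, 4, 5]
a, b = least_squares_line(toy_x, toy_y)
toy_residuals = [y - (a + b * x) for x, y in zip(toy_x, toy_y)]

# The least-squares residuals sum to 0 (up to floating-point rounding).
print(abs(sum(toy_residuals)) < 1e-9)  # True
print(round(residual_std_dev(toy_residuals), 3))  # 0.894
```

For the toy data the fitted line is y-hat = 2.2 + 0.6x, the residuals are -0.8, 0.6, 1.0, -0.6, and -0.2, and their squares add to 2.4, so s is the square root of 2.4/3, about 0.894.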