Name:          Date:          Block:

AP Statistics – 3.3 Regression Wisdom – Class Activity – The Wandering Point¹

1. The scatterplot shows the four points (1,2), (2,6), (4,2), and (5,6) plotted in a 10x10 graphing window. Find the correlation and the equation of the line of best fit.

y = ______________________    r = _______________

2. Now investigate the influence of one more point on the correlation and the slope. Try each of these as the fifth point and record the new correlation and slope. Also note whether the new point has a small or large residual. (Note: Add each point to the original four, one at a time, see what happens, and then remove that point. There are never more than 5 points in the plot!)

| Fifth Point | Description | Correlation | Slope of LSRL | Size of Residual |
|---|---|---|---|---|
| None | the original four points | | | N/A |
| (3,4) | right in the center of the given points | | | |
| (8,6) | also on the line, but far from the other points | | | |
| (10,7) | only close to the line, but much farther away | | | |
| (3,8) | above the center of the original cluster | | | |
| (1,7) | nearby, but not consistent with the apparent pattern | | | |
| (8,9) | farther away, and also not consistent | | | |
| (10,0) | farther and stranger … | | | |

3. A point that dramatically changes the apparent slope of the regression line is called an influential point. You need to be able to spot potentially influential points in a scatterplot. What should you look for?

4. Originally there were only four points here. Suppose instead that we had started with 50 points clustered in essentially the same region and displaying an association of roughly the same strength and direction. Would our fifty-first point still be as influential? Where would you locate one additional point so influential that it changed the line as dramatically as (10, 0) did above?

¹ Activity taken from Stats: Modeling the World by BVD.

AP Statistics – 3.3 Regression Wisdom – Class Activity – The Wandering Point²

1. The scatterplot shows the four points (1,2), (2,6), (4,2), and (5,6) plotted in a 10x10 graphing window.
Find the correlation and the equation of the line of best fit.

y = 2.8 + 0.4x    r = 0.3162

2. Now investigate the influence of one more point on the correlation and the slope. Try each of these as the fifth point and record the new correlation and slope. Also note whether the new point has a small or large residual. (Note: Add each point to the original four, one at a time, see what happens, and then remove that point. There are never more than 5 points in the plot!)

| Fifth Point | Description | Correlation | Slope of LSRL | Size of Residual |
|---|---|---|---|---|
| None | the original four points | 0.3162 | 0.4 | N/A |
| (3,4) | right in the center of the given points | 0.3162 | 0.4 | 0 |
| (8,6) | also on the line, but far from the other points | 0.5 | 0.4 | 0 |
| (10,7) | only close to the line, but much farther away | 0.6157 | 0.4228 | Small |
| (3,8) | above the center of the original cluster | 0.2357 | 0.4 | Large |
| (1,7) | nearby, but not consistent with the apparent pattern | -0.0457 | -0.0606 | Medium? |
| (8,9) | farther away, and also not consistent | 0.7303 | 0.8 | Small |
| (10,0) | farther and stranger … | -0.4888 | -0.3740 | Small |

3. A point that dramatically changes the apparent slope of the regression line is called an influential point. You need to be able to spot potentially influential points in a scatterplot. What should you look for?

To spot influential points in a scatterplot, look for points that depart from the overall pattern, especially points that are outliers in the explanatory (x) direction. Do NOT rely on large residuals: a point with a large residual is an outlier in the y direction, but it is not necessarily influential. For example, (3,8) has a large residual yet leaves the slope at 0.4. Influential points pull the regression line toward themselves, so they often have small residuals.

4. Originally there were only four points here. Suppose instead that we had started with 50 points clustered in essentially the same region and displaying an association of roughly the same strength and direction. Would our fifty-first point still be as influential? Where would you locate one additional point so influential that it changed the line as dramatically as (10, 0) did above?
The fifty-first point would not be as influential, since the pattern is more firmly established by 50 points than it was by 4 points. In order to be as influential as (10, 0), an additional point would have to be much farther out along the x-axis, maybe (100,0).

² Activity taken from Stats: Modeling the World by BVD.
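The correlation and slope entries in the table can be checked with a short script. The following is a minimal sketch in plain Python, not part of the original activity: the names `fit`, `wandering_point_table`, `BASE`, and `FIFTH` are illustrative, and it assumes each residual is measured from the line refitted to all five points (the worksheet does not say which line to use).

```python
from math import sqrt

# The original four points and the candidate fifth points from the activity.
BASE = [(1, 2), (2, 6), (4, 2), (5, 6)]
FIFTH = [None, (3, 4), (8, 6), (10, 7), (3, 8), (1, 7), (8, 9), (10, 0)]


def fit(points):
    """Return (intercept, slope, r) of the least-squares line for (x, y) pairs."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    syy = sum((y - y_bar) ** 2 for _, y in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / sqrt(sxx * syy)
    return intercept, slope, r


def wandering_point_table(base=BASE, candidates=FIFTH):
    """Refit the line once per candidate; never more than five points at a time."""
    rows = []
    for point in candidates:
        pts = base if point is None else base + [point]
        intercept, slope, r = fit(pts)
        # Assumption: residual = observed y minus the prediction of the
        # refitted five-point line at the new point's x.
        if point is None:
            resid = None
        else:
            resid = round(point[1] - (intercept + slope * point[0]), 4)
        rows.append((point, round(r, 4), round(slope, 4), resid))
    return rows


if __name__ == "__main__":
    for point, r, slope, resid in wandering_point_table():
        print(point, r, slope, resid)
```

Note that the sign of the correlation always matches the sign of the slope, which is a quick sanity check on any row of the table.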