Top 10 Things to Remember about Summarizing Bivariate Data

10. Always make a picture. (scatterplot of data, residual plot)
9. Identify the explanatory and response variables. (either in your equation or define the variables separately)
True or False: Pearson’s correlation coefficient, r, does not depend on the units of measurement of the two variables.
True or False: The value of Pearson’s correlation coefficient, r, is always between 0 and 1.
8. If the goal is to describe the strength of the relationship, report the correlation coefficient. (strength, direction, form, and unusual features, always in context)
7. If the goal is to predict, report the LSRL and the coefficient of determination.
True or False: The slope of the least squares line is the average amount by which y increases as x increases by one unit.
True or False: The slopes of the LSRL for predicting y from x, and the LSRL for predicting x from y, are equal.
6. Explain what the slope b and the y-intercept a mean in CONTEXT in the regression line predicted y = a + bx.
5. Beware of extrapolation. (do not assume that a linear model is valid over a wider range of x values)
True or False: The LSRL passes through the point (x̄, ȳ).
True or False: The coefficient of determination is equal to the positive square root of Pearson’s r.
4. A correlation coefficient of 0 does not necessarily imply that there is no relationship between two variables. (the relationship could be strong but not linear)
3. Watch out for influential observations. (an influential observation pulls the LSRL toward itself, so its residual will be close to 0 in the residual plot)
True or False: If |r| = 1, the standard deviation of y is equal to the standard deviation of the residuals.
True or False: The standard deviation about the LSRL is roughly the typical amount by which an observation deviates from the least squares line.
True or False: A transformation (or reexpression) of a variable is accomplished by substituting a function of the variable in place of the variable for further analysis.
True or False: The higher the value of the coefficient of determination, the greater the evidence for a causal relationship between x and y.
2. Correlation does not imply causation. A strong correlation implies only that the two variables tend to vary together in a predictable way.
And the #1 thing to remember…
1. Only use QUANTITATIVE data when comparing bivariate data.

Plot your data. (scatterplot)
Interpret what you see. (direction, form, strength, outliers)
Numerical summary? x̄, ȳ, s_x, s_y, and r
Mathematical model? (Regression line)
How well does it fit? (Residuals and r²)

1) Given this residual plot, which of the following is not a correct conclusion?
[Residual plot: residuals vs. fitted values, with one point labeled A]
a) The pattern in the residuals indicates the regression line does not fit the data well.
b) Point A is a candidate as an outlier.
c) Point A is a candidate as an influential point.
d) The relationship between the variables is positive.
e) All of these are correct.

2) Which of the following residual plots indicates a reasonable fit to a given set of data?
a) [Residual plot]
b) [Residual plot]
c) [Residual plot]
d) [Residual plot]
e) None of these indicates a reasonable fit.

3) Which of the following is a correct conclusion based on the residual plot displayed?
[Residual plot: residuals vs. fitted values]
a) The line overestimates the data.
b) The line underestimates the data.
c) It is not appropriate to fit a line to these data since there is clearly no correlation between the variables.
d) The data is not related.
e) None of these choices is correct.

5) You are given the regression equation: temperature = 30.4 - 0.72(distance), where temperature is the temperature displayed on a sensor in °C and distance is the distance in centimeters from the sensor to a heat source. Which of the following is not a reasonable conclusion?
a) The temperature of the heat source is approximately 30.4°C.
b) The temperature decreases approximately 0.72°C for each centimeter the sensor is moved away from the heat source.
c) We can predict that the sensor displays a temperature of 21.76°C when the sensor is 12 centimeters away from the heat source.
d) The correlation coefficient between temperature and distance indicates a negative relationship.
e) All of these are reasonable.

7) If the correlation coefficient of a bivariate set of data {(x,y)} is r, then which of the following is true?
a) The variables x and y are linearly related.
b) The correlation coefficient of the set {(y,x)} is also r.
c) The correlation coefficient of the set {(x,ay)} is a·r.
d) The correlation coefficient of the set {(ax,ay)} is a·r.
e) None of these is true.
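
The arithmetic in question 5 and the behavior of r in question 7 can be checked numerically. The following is a minimal sketch, not part of the original handout: the data values are made up for illustration, and the names pearson_r and predicted_temperature are hypothetical helpers. Only the regression equation temperature = 30.4 - 0.72(distance) comes from the question itself.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def predicted_temperature(distance_cm):
    """Regression equation from question 5: temperature = 30.4 - 0.72(distance)."""
    return 30.4 - 0.72 * distance_cm


# Hypothetical sensor readings, invented for illustration only.
distances = [2.0, 5.0, 8.0, 12.0, 15.0]        # centimeters
temperatures = [29.1, 26.5, 24.9, 21.6, 19.8]  # degrees Celsius

# r for {(x, y)} and for {(y, x)}: swapping the roles of the variables.
r_xy = pearson_r(distances, temperatures)
r_yx = pearson_r(temperatures, distances)

# r after multiplying both variables by the same positive constant a.
a = 3.0
r_scaled = pearson_r([a * x for x in distances],
                     [a * y for y in temperatures])

# r after multiplying y by a negative constant: only the sign changes.
r_flipped = pearson_r(distances, [-2.0 * y for y in temperatures])

print("r(x, y)     =", round(r_xy, 4))
print("r(y, x)     =", round(r_yx, 4))
print("r(ax, ay)   =", round(r_scaled, 4))
print("r(x, -2y)   =", round(r_flipped, 4))
print("prediction at 12 cm =", round(predicted_temperature(12), 2))
```

On this made-up data, r is unchanged when x and y are swapped or when both are rescaled by a positive constant, and it changes sign (not size) when one variable is multiplied by a negative constant. The prediction at 12 cm works out to about 21.76, the value quoted in option c of question 5.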