CHAPTER 4 – SCATTER PLOTS AND CORRELATION

TOPICS COVERED - Sections shown with numbers as in e-book
Any topic listed on this document and not covered in class must be studied “On Your Own” (OYO)

Section 4.1 – EXPLANATORY AND RESPONSE VARIABLES (pg. 96)
• Measuring two quantitative variables on the same individual
• Explanatory variable (plotted on the horizontal axis, x; the independent variable)
• Response variable (plotted on the vertical axis, y; the dependent variable)
   o In some cases the choice is important
   o In others, either variable can be used as x or y

Section 4.2 – DISPLAYING RELATIONSHIPS: SCATTER PLOTS (pg. 97)
• Graph to display the relationship between two quantitative variables measured on the same individual
   o Scatter plots
      ▪ State the problem
      ▪ Decide which is the explanatory and which is the response variable (if any)
      ▪ Label axes and plot the points
   o On Your Own: construct scatter plots with the calculator. If you do not know how to do this, go to the MSC or watch the video: http://student.ccbcmd.edu/elmo/math141s/TIVideo/Section9_3.html

Section 4.3 – INTERPRETING SCATTER PLOTS (pg. 99)
• Look for
   o The overall pattern
   o Striking deviations
      ▪ Outliers that fall outside the overall pattern
• Describe the pattern by
   o Direction
      ▪ Positive association
      ▪ Negative association
      ▪ No clear direction
   o Form
      ▪ Straight line
      ▪ Another shape
   o Strength
      ▪ Strong: points close to the line or other shape
      ▪ Weak: points widely scattered around the line or other shape

Section 4.4 – ADDING CATEGORICAL VARIABLES TO SCATTER PLOTS (pg. 103)
• Use a different color/symbol for each category

Section 4.5 – MEASURING LINEAR ASSOCIATION: CORRELATION (pg. 104)
• Correlation coefficient r (we’ll use the calculator to find it)
   o Number that measures the direction and strength of a linear relationship between the two quantitative variables
   o To get r from the calculator:
      ▪ Turn diagnostics on: press 2nd 0 [CATALOG], scroll down to DiagnosticOn, and press ENTER twice; the screen will say Done
      ▪ Then press STAT, choose CALC, and select 4:LinReg(ax+b) L1, L2, Y1 (Y1 is under VARS, Y-VARS, Function)

Section 4.6 – FACTS ABOUT CORRELATION (pg. 106)
• Facts about the correlation coefficient r:
   o Both variables should be quantitative
   o Only used to measure linear relationships
      ▪ It does not describe curved relationships, no matter how strong they are
   o It has no units
   o Has the same value if you swap x and y
   o Has the same value if you change the units of the variables
   o It is always between -1 and 1 (inclusive)
      ▪ r close to zero indicates a very weak linear relationship
      ▪ r close to 1 or -1 indicates a strong linear relationship (points close to a straight line)
      ▪ r = 1 or r = -1 indicates that all points are exactly on a straight line
   o It is affected by outliers (not resistant to extreme values, like the mean and the standard deviation)
   o It is not a complete summary of two-variable data; also include the means and standard deviations of both x and y

Summary

To study relationships between variables, we must measure the variables on the same group of individuals.

If we think that a variable x may explain or even cause changes in another variable y, we call x an explanatory variable and y a response variable.

A scatterplot displays the relationship between two quantitative variables measured on the same individuals. Mark values of one variable on the horizontal axis (x axis) and values of the other variable on the vertical axis (y axis). Plot each individual’s data as a point on the graph. Always plot the explanatory variable, if there is one, on the x axis of a scatterplot.

Plot points with different colors or symbols to see the effect of a categorical variable in a scatterplot.
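
For anyone who prefers a computer to the calculator, the plotting steps summarized so far might be sketched in Python as follows. This is only an illustration: the data values, the variable names (`hours`, `score`, `section`), and the helper functions `group_points` and `plot_groups` are all made up for this example, and the plotting step assumes the third-party matplotlib library is installed.

```python
from collections import defaultdict

# Made-up illustration data (not from the e-book): hours studied is the
# explanatory variable, exam score is the response variable, and class
# section is a categorical variable.
records = [
    {"hours": 1.0, "score": 52, "section": "morning"},
    {"hours": 1.5, "score": 50, "section": "evening"},
    {"hours": 2.0, "score": 60, "section": "morning"},
    {"hours": 3.0, "score": 64, "section": "evening"},
    {"hours": 3.5, "score": 71, "section": "morning"},
    {"hours": 4.0, "score": 78, "section": "evening"},
]


def group_points(rows, x_key, y_key, category_key):
    """Split the paired (x, y) values into one pair of lists per category."""
    groups = defaultdict(lambda: ([], []))
    for row in rows:
        xs, ys = groups[row[category_key]]
        xs.append(row[x_key])
        ys.append(row[y_key])
    return dict(groups)


def plot_groups(groups, x_label, y_label):
    """Draw one scatter plot with a different marker for each category."""
    import matplotlib.pyplot as plt  # third-party; assumed to be installed

    markers = ["o", "s", "^", "D", "x"]
    fig, ax = plt.subplots()
    for i, (category, (xs, ys)) in enumerate(sorted(groups.items())):
        ax.scatter(xs, ys, marker=markers[i % len(markers)], label=category)
    ax.set_xlabel(x_label)  # explanatory variable on the horizontal axis
    ax.set_ylabel(y_label)  # response variable on the vertical axis
    ax.legend(title="Category")
    return fig


groups = group_points(records, "hours", "score", "section")
```

Calling `fig = plot_groups(groups, "Hours studied", "Exam score")` and then `fig.savefig("scatter.png")` should write the plot to a file. Each individual becomes one point, and the marker shape shows its category.
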

In examining a scatterplot, look for an overall pattern showing the direction, form, and strength of the relationship, and then for outliers or other deviations from this pattern.

Direction: If the relationship has a clear direction, we speak of either positive association (high values of the two variables tend to occur together) or negative association (high values of one variable tend to occur with low values of the other variable).

Form: Linear relationships, where the points show a straight-line pattern, are an important form of relationship between two variables. Curved relationships and clusters are other forms to watch for.

Strength: The strength of a relationship is determined by how close the points in the scatterplot lie to a simple form such as a line.

The correlation r measures the direction and strength of the linear association between two quantitative variables x and y. Although you can calculate a correlation for any scatterplot, r measures only straight-line relationships.

Correlation indicates the direction of a linear relationship by its sign: r > 0 for a positive association and r < 0 for a negative association. Correlation always satisfies −1 ≤ r ≤ 1 and indicates the strength of a relationship by how close it is to −1 or 1. Perfect correlation, r = ±1, occurs only when the points on a scatterplot lie exactly on a straight line.

Correlation ignores the distinction between explanatory and response variables. The value of r is not affected by changes in the unit of measurement of either variable. Correlation is not resistant, so outliers can greatly change the value of r.
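
The facts about r listed above can be checked numerically. The sketch below is not the calculator procedure from Section 4.5. It assumes the standard textbook definition of r, the sum of the products of standardized x and y values divided by n − 1, and uses only Python's standard library. The function name `correlation` and all data values are made up for illustration.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Sample correlation r: sum of products of standardized values over n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    # Note: if either variable is constant, its standard deviation is 0
    # and r is undefined (this sketch would raise ZeroDivisionError).
    products = (((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return sum(products) / (n - 1)


# Made-up data with a strong positive, roughly linear pattern.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r_xy = correlation(xs, ys)

# Swapping x and y does not change r.
r_yx = correlation(ys, xs)

# Changing units (here, multiplying x by 2.54) does not change r.
r_units = correlation([x * 2.54 for x in xs], ys)

# Points exactly on a straight line give r = 1 (up to rounding error).
r_perfect = correlation([1, 2, 3], [2, 4, 6])

# A single outlier far from the pattern changes r a great deal.
r_outlier = correlation(xs + [20], ys + [0])

print(r_xy, r_yx, r_units, r_perfect, r_outlier)
```

With this data, the first three printed values should agree up to rounding, while the value with the added outlier is very different. That is the sense in which r is not resistant.
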