UNIT 4
Bivariate Data
Scatter Plots and Regression

What is Bivariate Data?
• Bivariate data consist of two quantitative variables.
Example: The percentage of people who would vote for a woman president over the last century

Displaying Bivariate Data
• Bivariate data is typically displayed with a scatterplot.
• Scatterplots may be the most common and most effective display for data.
• Scatterplots are the best way to start observing the relationship and the ideal way to picture associations between two quantitative variables.
• X axis – the explanatory (or predictor) variable.
• Y axis – the response variable.

Examples: For each of the following, state the explanatory variable and the response variable.
• Does the amount of homework assigned affect how well students learn?
• Does the weight of a car affect the miles per gallon that the car gets?
• Are SAT math scores and GPA related?
• Is there an association between a person’s speed and the amount of weight they can squat?

What should you look for in a scatterplot?
• Direction – which way are the points going? Positive, negative, or neither.
• Form – linear, quadratic, exponential, logarithmic
• Strength – how much scatter is there in the plot? Weak, moderate, or strong.
Example: Peak Period Freeway Speed and cost per person
What is the direction, form, and strength?

Outliers
• You can describe the overall pattern of a scatterplot by the Direction, Form, and Strength of the relationship.
• An important kind of deviation is an outlier, an individual value that falls outside the overall pattern of the relationship.

A scatterplot is a picture of the relationship between two quantitative variables. If a linear relationship exists between two variables, the scatterplot will show a swarm of points stretched out in a generally consistent, straight pattern. If the relationship isn’t straight, we can find ways to make it more nearly straight.
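One common way to make a curved relationship "more nearly straight" is to re-express one of the variables, for example by taking the logarithm of y when the form looks exponential. The sketch below illustrates the idea only; the data (y = 2 · 3^x) and the helper name successive_differences are made up for this example and do not come from the slides.

```python
import math

# Hypothetical data following y = 2 * 3**x (an exponential form).
xs = [0, 1, 2, 3, 4, 5]
ys = [2 * 3**x for x in xs]

def successive_differences(values):
    """Change in the value for each one-unit step in x."""
    return [b - a for a, b in zip(values, values[1:])]

# Raw y: the step sizes keep growing, so the scatterplot bends upward.
raw_steps = successive_differences(ys)

# log(y): the step sizes are essentially constant, so the points fall on a line.
log_steps = successive_differences([math.log(y) for y in ys])

print(raw_steps)                               # [4, 12, 36, 108, 324]
print([round(step, 4) for step in log_steps])  # every step is about 1.0986
```

Equal steps in log(y) for equal steps in x are what a straight line looks like, so a scatterplot of log(y) against x would satisfy a "straight enough" check even though the plot of y against x would not.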
Correlation Calculation
The correlation coefficient (r) gives us a numerical measurement of the strength of the linear relationship between the explanatory and response variables.

$$ r = \frac{\sum z_x z_y}{n - 1} $$

Here z_x and z_y are the standardized values (z-scores) of the two variables and n is the number of data pairs. The calculator will do the work.

Correlation Conditions
• Correlation measures the strength of the linear association between two quantitative variables.
• Before you use correlation, you must check several conditions:
– Quantitative Variables Condition – Correlation is only used for quantitative variables.
– Straight Enough Condition – Correlation measures the strength only of the linear association, and will be misleading if the relationship is not linear.
– Outlier Condition – Outliers can distort the correlation. When you see an outlier, it’s often a good idea to report the correlations with and without the point.

Correlation
• Correlation describes the direction and strength of a linear relationship
• -1 ≤ r ≤ 1
• r > 0 positive linear association
• r = 0 no linear association
• r < 0 negative linear association

Strength of the relationship
• |r| > 0.8 strong
• 0.5 < |r| < 0.8 moderate
• |r| < 0.5 weak
• Examples: r = -0.35, r = 0.98, r = 0.67

Correlation and Causation
• We need to be careful about interpreting correlation coefficients. Just because two variables are highly correlated does not mean that one causes the other.
• In statistical terms we say that correlation does not imply causation.
• Examples:
• The number of ice cream sales and the number of shark attacks on swimmers are highly correlated.
• The increase in stock prices and the length of women’s skirts are highly correlated.
• The number of cavities in elementary school children and vocabulary size have a strong positive correlation.

Three relationships which can be taken (or mistaken) for causation
• Causation – Changes in X cause changes in Y
• Common response – Both X and Y respond to some unobserved variable
• Confounding – The effect of X on Y is hopelessly mixed up with the effects of other variables on Y.
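The correlation formula from the Correlation Calculation slide (the sum of the products of the z-scores, divided by n − 1) can be worked by hand in a few lines. This is a minimal sketch using only the Python standard library; the function names correlation and strength and the data are illustrative assumptions, and the handling of the exact cutoffs 0.5 and 0.8 is a choice made here because the slides do not say which label those values get.

```python
import statistics

def correlation(xs, ys):
    """r = sum(z_x * z_y) / (n - 1), with sample standard deviations."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    z_products = [
        ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
        for x, y in zip(xs, ys)
    ]
    return sum(z_products) / (n - 1)

def strength(r):
    """Label |r| using the cutoffs from the slides.

    Assumption: exactly 0.8 counts as moderate and exactly 0.5 as weak.
    """
    size = abs(r)
    if size > 0.8:
        return "strong"
    if size > 0.5:
        return "moderate"
    return "weak"

# Hypothetical data: five (x, y) pairs invented for this example.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = correlation(hours, scores)
print(round(r, 3), strength(r))  # 0.775 moderate
```

The sign of r gives the direction and |r| gives the strength, so strength(-0.35) is "weak" even though -0.35 is less than 0.5 for a different reason than 0.35 is.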
Correlation and Causation
• Association, relationship, and correlation do not imply causation.
• Causation does imply association, relationship, and correlation.
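A small made-up data set can show how a common response produces a high correlation without causation. In the sketch below, temperature is the unobserved variable; ice cream sales and shark attacks are each built from temperature plus a little scatter, and neither is computed from the other. All numbers and names are hypothetical and serve only to illustrate the idea.

```python
import statistics

def correlation(xs, ys):
    """r = sum(z_x * z_y) / (n - 1), with sample standard deviations."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    return sum(
        ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
        for x, y in zip(xs, ys)
    ) / (n - 1)

# Hypothetical lurking variable: daily high temperature.
temps = [20, 22, 24, 26, 28, 30, 32, 34]

# Each response depends on temperature plus its own fixed scatter.
# Neither list is calculated from the other.
ice_cream = [5 * t + e for t, e in zip(temps, [3, -2, 4, -3, 2, -4, 3, -2])]
attacks = [t // 2 + e for t, e in zip(temps, [1, 0, -1, 1, 0, 1, -1, 0])]

r_ice_attacks = correlation(ice_cream, attacks)
print(round(r_ice_attacks, 2))  # strong positive, although neither causes the other
print(round(correlation(temps, ice_cream), 2))
print(round(correlation(temps, attacks), 2))
```

Both variables track temperature closely, which is why they also track each other. Banning ice cream in this toy setup would change nothing about the attacks list.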