Chapter 7: Scatterplots, Associations, and Correlations
A.P. Statistics
Scatterplots
Scatterplots are the best way to start observing the relationship between two quantitative variables.
Describing Scatterplots
• Direction:
Positive, negative, none
• Form:
Linear, curved, clustered, etc.
• Strength:
At this point: strong, moderate, weak
• Unusual Features:
Outliers, clusters, etc.
Variables
• Explanatory or Predictor:
Attempts to explain the observed outcome
Placed on x-axis
• Response:
Measures an outcome
Placed on y-axis
Be logical when deciding which variable plays which role.
Correlation
• Measures the direction and strength of the linear relationship between the two quantitative variables
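• Given as r:
$$ r = \frac{1}{n-1} \sum \left( \frac{x - \bar{x}}{s_x} \right) \left( \frac{y - \bar{y}}{s_y} \right) $$
$$ r = \frac{\sum z_x z_y}{n-1} $$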
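The z-score form of the correlation formula can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `correlation` and the `hours`/`scores` data are made up for the example, and the sample standard deviation (dividing by n - 1) is assumed.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r: sum of z-score products divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1)
    # Standardize each observation, then multiply the paired z-scores.
    z_products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                  for x, y in zip(xs, ys)]
    return sum(z_products) / (len(xs) - 1)

# Hypothetical data, invented for illustration only.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = correlation(hours, scores)
print(round(r, 3))
```

For this small made-up data set, r comes out to roughly 0.77, a moderately strong positive linear association.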
Correlation: Graphical
[Figure: scatterplot of the original data next to a scatterplot of the standardized data]
Correlation Conditions
• Quantitative Variables Condition
• Straight Enough Condition
• Outlier Condition
Report correlation with and without outlier
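To see why reporting both values matters, here is a small sketch that computes r with and without a single outlier. The data and the names `correlation`, `xs`, and `ys` are invented for illustration; the last point is deliberately placed far from the pattern of the others.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r: sum of z-score products divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (len(xs) - 1)

# Hypothetical data: five points close to a line, plus one outlier at the end.
xs = [1, 2, 3, 4, 5, 10]
ys = [2.1, 3.9, 6.2, 8.0, 9.8, 1.0]

r_with = correlation(xs, ys)               # all six points
r_without = correlation(xs[:-1], ys[:-1])  # outlier removed

print(f"r with outlier:    {r_with:.3f}")
print(f"r without outlier: {r_without:.3f}")
```

With these made-up values, one point is enough to turn a very strong positive correlation into a negative one, which is why both numbers belong in the report.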
Correlation Properties
1. Correlation makes no distinction between explanatory and response variables.
2. Correlation requires both variables to be quantitative.
3. Because r uses standardized values of the observations, r does not change when we change the units of measurements of x, y, or both.
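Properties 1 and 3 can be checked numerically. The sketch below uses invented height and weight data and a hypothetical `correlation` helper; the unit conversions (inches to centimeters, pounds to kilograms) are just one example of a change of units.

```python
import math
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r: sum of z-score products divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (len(xs) - 1)

# Hypothetical data, invented for illustration only.
heights_in = [60, 62, 65, 68, 70, 72]
weights_lb = [110, 125, 140, 155, 160, 180]

r_original = correlation(heights_in, weights_lb)

# Property 1: swapping x and y leaves r unchanged.
r_swapped = correlation(weights_lb, heights_in)

# Property 3: changing units (in -> cm, lb -> kg) leaves r unchanged.
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.4536 for w in weights_lb]
r_converted = correlation(heights_cm, weights_kg)

print(math.isclose(r_original, r_swapped))
print(math.isclose(r_original, r_converted))
```

The comparisons use `math.isclose` rather than `==` because floating-point rounding can make the values differ in the last few digits.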
Correlation Properties
4. Positive r indicates positive association between the variables and a negative r indicates negative association.
5. The correlation is always a number between -1 and 1. The strength of the linear relationship increases as r moves away from 0 toward either -1 or 1.
Correlation Properties
6. Correlation measures the strength of only a linear relationship.
7. Like mean and standard deviation, r is not resistant: it is strongly affected by a few outliers.
8. Correlation has no units. It should not be expressed as a percentage.
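Property 6 is easy to underestimate, so here is a small sketch with made-up data: y is completely determined by x (y = x squared), yet the correlation is essentially zero because the relationship is curved, not linear. The helper name `correlation` is hypothetical.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r: sum of z-score products divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (len(xs) - 1)

# A perfect but curved relationship: every y is exactly x squared.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curved = correlation(xs, ys)
print(f"r for y = x^2 on symmetric x values: {r_curved:.6f}")
```

An r near 0 here does not mean "no association"; it means no linear association, which is why the scatterplot should always be checked first.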
Other Information
• Correlation is not a complete description of two-variable data, even when the relationship is linear.
• You should give the means and standard deviations of both x and y.
Straightening Scatterplots
• If a relationship between two quantitative variables is not linear, we can re-express it and straighten the form. Then we can apply the strengths of using correlation and the other measures that come from a linear relationship.
• We will go into more depth about this in Chapter 10.
• See page 154 in your text for graphic
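As a rough sketch of what re-expression can do, the example below takes made-up data that grows exponentially and compares r before and after taking the logarithm of y. The choice of a base-10 log is an assumption made for illustration; the slides do not say which re-expression is used, and the helper name `correlation` is hypothetical.

```python
import math
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r: sum of z-score products divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (len(xs) - 1)

# Hypothetical data: y doubles every time x increases by 1 (a curved pattern).
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [2 ** x for x in xs]

r_raw = correlation(xs, ys)

# Re-express y with a logarithm (an assumed choice) to straighten the form.
log_ys = [math.log10(y) for y in ys]
r_log = correlation(xs, log_ys)

print(f"r before re-expression: {r_raw:.3f}")
print(f"r after log(y):         {r_log:.3f}")
```

After the re-expression the points fall on a straight line, so r moves to essentially 1 and the linear tools apply to x versus log(y).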
Problems?
• Don’t say “correlation” when you mean “association”.
• Don’t correlate categorical data.
• Be sure the association is linear.
• Beware of outliers.
• Don’t confuse correlation with causation.
• Watch out for lurking variables.
Correlation vs. Causation
• Scatterplots and correlations never prove causation.
• The only thing that can show causation is a randomized controlled experiment.