Chapter 7: Scatterplots, Associations, and Correlations

A.P. Statistics

Scatterplots

Scatterplots are the best way to start observing the relationship between two quantitative variables.

Describing Scatterplots

• Direction:

Positive, negative, or none

• Form:

Linear, curved, clustered, etc.

• Strength:

At this point: strong, moderate, or weak

• Unusual Features: outliers, clusters, etc.

Variables

• Explanatory or Predictor:

Attempts to explain the observed outcome

Placed on x-axis

• Response:

Measures an outcome

Placed on y-axis

BE LOGICAL

Correlation

• Measures the direction and strength of the linear relationship between two quantitative variables

• Given as r:

$$r = \frac{1}{n-1} \sum \left( \frac{x - \bar{x}}{s_x} \right) \left( \frac{y - \bar{y}}{s_y} \right) = \frac{\sum z_x z_y}{n-1}$$
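The formula for r can be sketched in Python as follows. This is a minimal illustration using only the standard library; the function name `correlation` and the study-hours data are made up for the example and do not come from the text.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Return r: the sum of z-score products divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (divide by n - 1)
    z_x = [(x - x_bar) / s_x for x in xs]  # standardized x values
    z_y = [(y - y_bar) / s_y for y in ys]  # standardized y values
    return sum(zx * zy for zx, zy in zip(z_x, z_y)) / (len(xs) - 1)

# Hypothetical data: hours studied vs. quiz score
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
r = correlation(hours, scores)
print(round(r, 2))  # about 0.98: a strong, positive linear association
```

Standardizing first is what makes r unitless: each z-score says how many standard deviations a point lies from its mean.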

Correlation: Graphical

[Figure: scatterplots of the original data and the standardized data]

Correlation Conditions

• Quantitative Variables Condition

• Straight Enough Condition

• Outlier Condition

Report correlation with and without outlier

Correlation Properties

1. Correlation makes no distinction between explanatory and response variables.

2. Correlation requires both variables to be quantitative.

3. Because r uses standardized values of the observations, r does not change when we change the units of measurement of x, y, or both.
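Properties 1 and 3 can be checked numerically. The sketch below uses hypothetical height and weight values (not from the text) and a small helper, `correlation`, written for this illustration: swapping x and y, or converting inches to centimeters and pounds to kilograms, should leave r unchanged up to rounding error.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Return r computed from standardized values."""
    x_bar, y_bar, s_x, s_y = mean(xs), mean(ys), stdev(xs), stdev(ys)
    products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)

heights_in = [60, 62, 65, 68, 71]        # hypothetical heights, inches
weights_lb = [110, 120, 140, 155, 170]   # hypothetical weights, pounds

# Change of units: inches -> centimeters, pounds -> kilograms
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.45359237 for w in weights_lb]

r_original = correlation(heights_in, weights_lb)
r_swapped = correlation(weights_lb, heights_in)   # property 1: roles swapped
r_metric = correlation(heights_cm, weights_kg)    # property 3: units changed

print(r_original, r_swapped, r_metric)  # all three agree up to rounding
```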

Correlation Properties

4. Positive r indicates positive association between the variables and a negative r indicates negative association.

5. The correlation is always a number between -1 and 1. The strength of the linear relationship increases as r moves away from 0 towards either -1 or 1.

Correlation Properties

6. Correlation measures the strength of only a linear relationship.

7. Like mean and standard deviation, r is not resistant: it is strongly affected by a few outliers.

8. Correlation has no units. It should not be expressed as a percentage.
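Properties 6 and 7 are easy to see with small made-up data sets. In the sketch below (the data and the helper `correlation` are assumptions for illustration), a perfect curved relationship gives r of essentially 0, and a single outlier pulls r from 1 down to 0.25, which is why the correlation should be reported both with and without the outlier.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Return r computed from standardized values."""
    x_bar, y_bar, s_x, s_y = mean(xs), mean(ys), stdev(xs), stdev(ys)
    products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)

# Property 6: y = x^2 is a perfect relationship, but it is not linear
xs_curve = [-3, -2, -1, 0, 1, 2, 3]
ys_curve = [x ** 2 for x in xs_curve]
r_curve = correlation(xs_curve, ys_curve)

# Property 7: one outlier changes r dramatically
xs_line = [1, 2, 3, 4, 5, 6]
ys_line = [2, 4, 6, 8, 10, 12]
r_without = correlation(xs_line, ys_line)
r_with = correlation(xs_line + [7], ys_line + [0])  # add the outlier (7, 0)

print(round(r_curve, 2))    # 0.0 (possibly shown as -0.0)
print(round(r_without, 2))  # 1.0
print(round(r_with, 2))     # 0.25
```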

Other Information

• Correlation is not a complete description of two-variable data, even when the relationship is linear.

• You should give the means and standard deviations of both x and y.

Straightening Scatterplots

• If a relationship between two quantitative variables is not linear, we can re-express it and straighten the form. Then we can apply the strengths of using correlation and the other measures that come from a linear relationship.

• We will go into more depth about this in Chapter 10.

• See page 154 in your text for graphic
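As a rough illustration of re-expression, the sketch below takes made-up data that grows exponentially and replaces y with log(y). The choice of the logarithm is an assumption for this example; which re-expression works depends on the data. After the change, the points fall on a line and r rises to 1.

```python
import math
from statistics import mean, stdev

def correlation(xs, ys):
    """Return r computed from standardized values."""
    x_bar, y_bar, s_x, s_y = mean(xs), mean(ys), stdev(xs), stdev(ys)
    products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)

# Hypothetical curved data: y doubles each time x increases by 1
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2 ** x for x in xs]

r_raw = correlation(xs, ys)             # curved form, so r understates the pattern

log_ys = [math.log(y) for y in ys]      # re-express y on a log scale
r_straight = correlation(xs, log_ys)    # the form is now linear

print(round(r_raw, 2))       # about 0.85
print(round(r_straight, 2))  # 1.0
```

Always check a scatterplot of the re-expressed data to confirm that the form really is straight before interpreting r.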

Problems?

• Don’t say “correlation” when you mean “association”.

• Don’t correlate categorical data.

• Be sure the association is linear.

• Beware of outliers.

• Don’t confuse correlation with causation.

• Watch out for lurking variables.

Correlation vs. Causation

• Scatterplots and correlations never prove causation.

• The only thing that can show causation is a randomized controlled experiment.
