Correlation
Describes the linear relationship between two interval/ratio variables
What is a linear relationship?
A relationship between two interval/ratio variables whose observations, when plotted on a scatterplot, can be approximated by a straight line
What is a scatterplot?
Visual method used to display a relationship between two interval/ratio variables
What are the 3 reasons scatterplots are useful?
1) Tells us if we have a linear relationship
2) Tells us the directionality
3) Makes us aware of any outliers
4) (Optional) Tells us whether a relationship exists at all
Ordinary least squares (OLS) regression
A method that uses a straight line to estimate the relationship between two interval/ratio variables
What does the 'line of best fit' mean?
The line that approximates a relationship between two I/R variables with the smallest total error (the smallest sum of squared errors)
What is an error in terms of OLS regression?
The vertical distance between a given data point (dot) and the line of best fit (also called a residual)
What is the 'point' of OLS regression?
To find the line of best fit, which is the line that minimizes the sum of squared errors
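As a rough illustration of what "minimizing the sum of squared errors" means, the sketch below computes the OLS slope and intercept with the standard closed-form formulas and compares the result against a slightly different line. The function names and the small hours/scores data set are made up for this example.

```python
def ols_fit(xs, ys):
    """Return (b, a): slope and intercept of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares around the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # y-intercept
    return b, a


def sum_squared_errors(xs, ys, b, a):
    """Sum of squared vertical distances between each point and the line."""
    return sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data: hours studied (IV) and quiz score (DV)
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

b, a = ols_fit(hours, scores)
best_sse = sum_squared_errors(hours, scores, b, a)
other_sse = sum_squared_errors(hours, scores, 0.5, 2.2)  # a different line
print(b, a, best_sse, other_sse)
```

Any other line tried on the same data, such as the second one here, should give a larger sum of squared errors than the OLS line.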
What are the limitations of OLS regression?
1) Must be I/R data to use OLS regression
2) Relationship has to already be fairly linear in the first place
Explain what each value represents in the regression formula y^ = bx + a
y^ (y hat): The value of the DV predicted by the regression line
b (slope): How much we expect y (the DV) to change for every unit change in x (the IV)
x: The actual score of the independent variable
a (y-intercept): The point at which the regression line crosses the y-axis (when x = 0)
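The formula can be tried out with a few lines of code. This is a minimal sketch; the slope and intercept values (b = 0.6, a = 2.2) and the function name are assumptions chosen only for illustration.

```python
def predict(x, b, a):
    """Return y^ (the predicted DV score) for an IV score x."""
    return b * x + a


b, a = 0.6, 2.2  # hypothetical slope and y-intercept

at_zero = predict(0, b, a)  # equals a: where the line crosses the y-axis
at_three = predict(3, b, a)
at_four = predict(4, b, a)

# Moving x up by one unit changes the prediction by b
change_per_unit = at_four - at_three
print(at_zero, at_three, at_four, change_per_unit)
```

The prediction at x = 0 is the intercept a, and the difference between predictions one unit apart is the slope b.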
What is the 'coefficient of determination (r^2)' and what are its characteristics?
The proportion of variation in the DV that is predictable from the IV
1) Ranges from 0 to 1 (cannot indicate directionality)
2) The closer r^2 is to 1, the better the line fits the data
What is 'Pearson's correlation coefficient (r)' and what are its characteristics?
Measures the strength and direction of a linear relationship between two I/R variables and returns the measure of association (MOA) to its original metric
1) Ranges from -1 to 1 (can indicate directionality)
2) Should always have the same sign as the covariance
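One common way to compute r is to divide the covariance by the product of the two standard deviations, which is why r and the covariance share a sign. The sketch below assumes sample (n - 1) formulas and uses made-up data; the function and variable names are illustrative only.

```python
from math import sqrt


def covariance(xs, ys):
    """Sample covariance of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def pearson_r(xs, ys):
    """Pearson's r: covariance divided by the product of the standard deviations."""
    sd_x = sqrt(covariance(xs, xs))  # the covariance of x with itself is its variance
    sd_y = sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)


# Hypothetical data: one rising and one falling relationship
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 5]
falling = [5, 4, 5, 4, 2]

r_up = pearson_r(hours, rising)
r_down = pearson_r(hours, falling)
print(r_up, covariance(hours, rising))
print(r_down, covariance(hours, falling))
```

Squaring r for the rising data gives the r^2 for the same pair of variables.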
What 3 things does r^2 tell us?
1) The goodness of the fit (of the line of best fit)
2) The amount of variance in the DV that can be accounted for by the IV
3) The extent to which knowing the IV reduces our error in predicting the DV (is a PRE measure)
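The PRE (proportional reduction in error) reading of r^2 can be sketched as follows: E1 is the squared error when every DV score is predicted with the mean of the DV, E2 is the squared error when the regression line is used, and r^2 = (E1 - E2) / E1. The data and names below are assumptions for illustration.

```python
def pre_r_squared(xs, ys):
    """Return (e1, e2, r_squared) using the PRE logic for OLS regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # OLS slope and intercept
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    a = mean_y - b * mean_x
    # E1: error when the IV is ignored and the mean of the DV is the prediction
    e1 = sum((y - mean_y) ** 2 for y in ys)
    # E2: error when the regression line is used to predict
    e2 = sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))
    return e1, e2, (e1 - e2) / e1


# Hypothetical data: hours studied (IV) and quiz score (DV)
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

e1, e2, r_squared = pre_r_squared(hours, scores)
print(e1, e2, r_squared)
```

For this made-up data, knowing the IV cuts the squared prediction error from about 6 to about 2.4, so r^2 is about 0.6, which would be read as a 60% reduction in error.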
When used together, what do r^2 and r tell us overall?
They indicate the strength of the relationship between two I/R variables as well as how well a given line fits its data
How do you INTERPRET r^2?
1) Knowing about (X variable) decreases our error in predicting (Y variable) by xx%
2) Knowing about (X variable) increases our ability to predict (Y variable) by xx%
3) xx% of the variability of (Y variable) can be explained by (X variable)
How do you INTERPRET r?
r indicates the strength and direction of a relationship, where
1.0 = perfect positive relationship
0.0 = no linear relationship
-1.0 = perfect negative relationship