# Psychology 202a Advanced Psychological Statistics October 22, 2015 ```Psychology 202a
Statistics
October 22, 2015
The plan for today
• Conditioning on a continuous variable
• Introducing correlation and regression
Continuous conditional distributions
• The scatterplot.
• Focusing on conditional center.
• Two natural questions:
– How strong is the relationship?
– What is the relationship?
• Correlation and regression.
How strong is the relationship?
• Pearson product-moment correlation
coefficient
• Population: ρ (rho)
• Sample: r
• We will develop three ways to understand
the correlation coefficient
First way to understand correlation
• A scale-free covariance
• Covariance:
$$\mathrm{Cov}(X, Y) = \frac{\sum (X - M_X)(Y - M_Y)}{N - 1}.$$
Covariance (continued)
• Problem: magnitude depends on scale of X
and Y
• Solution: remove the scale by standardizing
• Pearson’s r:
$$r(X, Y) = \frac{\mathrm{Cov}(X, Y)}{s_X s_Y}.$$
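As a rough illustration of the two formulas above, the sketch below computes the covariance and Pearson's r by hand. The data (`hours`, `score`) and the function names are made up for this example and are not from the lecture. Rescaling X from hours to minutes changes the covariance but leaves r unchanged, which is the sense in which r is "scale-free."

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(x, y):
    # Sum of cross-products of deviations, divided by N - 1
    mx, my = mean(x), mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)

def sd(values):
    # Sample standard deviation: square root of the covariance of a variable with itself
    return math.sqrt(covariance(values, values))

def pearson_r(x, y):
    # Covariance standardized by both standard deviations
    return covariance(x, y) / (sd(x) * sd(y))

# Hypothetical data, invented for illustration only
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
minutes = [h * 60 for h in hours]  # same variable on a different scale

print(covariance(hours, score), covariance(minutes, score))  # covariance depends on scale
print(pearson_r(hours, score), pearson_r(minutes, score))    # r does not
```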
Problem with that way of understanding r
• Does not make it absolutely clear that the
relationship must be linear in order for r to
make sense as a measure of strength of
association.
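A small sketch of why linearity matters, using invented numbers rather than anything from the lecture: below, Y is completely determined by X in both cases, yet r is 0 for the curved (quadratic) relationship and 1 for the linear one. The function name `pearson_r` is just a label chosen for this example.

```python
import math

def pearson_r(x, y):
    # Equivalent to Cov(X, Y) / (s_X * s_Y); the N - 1 terms cancel
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: X is symmetric around zero
x = [-2, -1, 0, 1, 2]
y_curved = [xi ** 2 for xi in x]      # perfect but nonlinear relationship
y_linear = [2 * xi + 1 for xi in x]   # perfect linear relationship

r_curved = pearson_r(x, y_curved)
r_linear = pearson_r(x, y_linear)
print(r_curved, r_linear)
```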
What is the relationship?
• Linear regression:
$$Y = \beta_0 + \beta_1 X.$$
Estimation of regression parameters
• Slope estimate:
$$\hat{\beta}_1 = \frac{\sum (X - M_X)(Y - M_Y)}{\sum (X - M_X)^2}.$$
• Intercept estimate:
$$\hat{\beta}_0 = M_Y - \hat{\beta}_1 M_X.$$
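The slope and intercept estimates above can be sketched in a few lines. The data and the name `fit_line` are assumptions made for this illustration, not part of the lecture.

```python
def fit_line(x, y):
    # Returns (intercept, slope) from the estimation formulas for simple regression
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))  # sum of cross-products
    sxx = sum((xi - mx) ** 2 for xi in x)                     # sum of squares of X
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

# Hypothetical data, invented for illustration only
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

b0, b1 = fit_line(hours, score)
print(b0, b1)
```

Because the intercept is defined as the mean of Y minus the slope times the mean of X, the fitted line always passes through the point of means.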
Regression as a Model
• Regression as a model for conditional
mean
• What about all those other aspects of a
distribution?
Estimating Regression
• Why are the estimates what they are?
• Definition: residual is an estimate of the error
component of the model:
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
$$\varepsilon_i = Y_i - (\beta_0 + \beta_1 X_i)$$
$$e_i = Y_i - (\hat{\beta}_0 + \hat{\beta}_1 X_i)$$
Estimating Regression
• The line that fits best is the one that
minimizes the residuals.
• Once again, negative residuals balance
positive residuals…
• …so we make the residuals positive by
squaring them.
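To make the squaring step concrete, here is a minimal sketch with made-up data. It compares the line from the estimation formulas with a flat line at the mean of Y. For both lines the raw residuals sum to (essentially) zero, so the plain sum cannot tell them apart, while the sum of squared residuals can. The helper names and the data are assumptions for this example.

```python
def residuals(x, y, intercept, slope):
    # Observed Y minus the value predicted by the line
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def sum_of_squares(values):
    return sum(v ** 2 for v in values)

# Hypothetical data, invented for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Slope and intercept from the estimation formulas
n = len(x)
mx, my = sum(x) / n, sum(y) / n
slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
intercept = my - slope * mx

res_fit = residuals(x, y, intercept, slope)  # fitted line
res_flat = residuals(x, y, my, 0.0)          # flat line at the mean of Y

print(sum(res_fit), sum(res_flat))           # both are about 0: positives cancel negatives
sse_fit = sum_of_squares(res_fit)
sse_flat = sum_of_squares(res_flat)
print(sse_fit, sse_flat)                     # squared residuals distinguish the two lines
```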
The Principle of Least Squares
• This criterion for best fit is known as the
principle of least squares.
• You will also see it referred to as “ordinary
least squares” …
• …or as “OLS” for short.
• See me if you are interested in why the
OLS estimates are what they are.
Decomposing the sum of squares
• Recall that the model can be broken down
into two components:
– the part we do understand
– the part we don’t understand
• The sum of squares can be broken down
into corresponding components.
• These components have the same