September 15, 2004

Clinical Research Training Program 2021
CORRELATIONS
Fall 2004
www.edc.gsph.pitt.edu/faculty/dodge/clres2021.html
1
OUTLINE
Correlations
Multiple Correlation Coefficients
Partial Correlation Coefficients
Multiple Partial Correlation
2
Multiple Correlation Coefficient
The multiple correlation coefficient,
denoted $R_{y|x_1,\ldots,x_p}$, is a measure of the
overall linear association of one
dependent variable y with p independent
variables x1,…, xp.
The least-squares solution $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \cdots + \hat{\beta}_p x_p$
has the largest value of $R_{y|x_1,\ldots,x_p}$: among all linear
combinations of x1,…, xp, it is the one most highly correlated with y.
When p = 1, R = |r|.
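The p = 1 case can be checked numerically. The sketch below is an illustration only: the data are simulated, and the seed, sample size, and coefficients are arbitrary assumptions rather than anything from the course. A negative slope is used so that r and R differ in sign.

```python
import numpy as np

rng = np.random.default_rng(0)            # simulated data, illustration only
x = rng.normal(size=50)
y = 2.0 - 1.5 * x + rng.normal(size=50)   # negative slope, so r < 0

# Least-squares fit of y on x (intercept plus one slope).
A = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ beta

r = np.corrcoef(x, y)[0, 1]       # simple correlation of x and y
R = np.corrcoef(y, y_hat)[0, 1]   # multiple correlation with p = 1

print(f"r = {r:.4f}, R = {R:.4f}, |r| = {abs(r):.4f}")
```

Here R is computed as the correlation between y and the fitted values, so it comes out nonnegative even though r is negative.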
3
Multiple Correlation Coefficient
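Mathematical formula for $R_{y|x_1,\ldots,x_p}$:

$$R_{y|x_1,\ldots,x_p} = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2 \, \sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}} = r_{y,\hat{y}}$$

Slide callout on this formula: absolute value.

Its square:

$$R^2_{y|x_1,\ldots,x_p} = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2 - \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} = \frac{SSR}{SST}$$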
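As a rough numerical check of the two formulas, the sketch below fits a regression with p = 3 predictors and compares the correlation of y with the fitted values against SSR/SST. The data are simulated, and the coefficients and seed are arbitrary assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)   # simulated data, illustration only
n, p = 100, 3
X = rng.normal(size=(n, p))
y = 1.0 + X @ np.array([0.5, -1.0, 0.25]) + rng.normal(size=n)

# Least-squares fit with an intercept column.
A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ beta

sst = np.sum((y - y.mean()) ** 2)   # total sum of squares
sse = np.sum((y - y_hat) ** 2)      # residual sum of squares
ssr = sst - sse                     # regression sum of squares

R = np.corrcoef(y, y_hat)[0, 1]     # correlation of observed and predicted y
R2 = ssr / sst                      # proportionate reduction in SS

print(f"R = {R:.4f}, R^2 from correlation = {R**2:.4f}, SSR/SST = {R2:.4f}")
```

The two routes to R² agree up to floating-point error because the model includes an intercept.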
4
Multiple Correlation Coefficient
The quantity $R^2_{y|x_1,\ldots,x_p}$ measures the
proportionate reduction in the total sum of
squares, from $\sum (y_i - \bar{y})^2$ to $\sum (y_i - \hat{y}_i)^2$, due to the
multiple linear regression of y on x1,…, xp.
The quantity $R_{y|x_1,\ldots,x_p}$ is the correlation
of the observed value y with the predicted
value $\hat{y}$, and this correlation is always
nonnegative.
5
Partial Correlation Coefficient
The partial correlation coefficient,
denoted $r_{yx^*|x_1,\ldots,x_p}$, is a measure of the
strength of the linear relationship
between the dependent variable y and one
independent variable, say x*, after we
control for the effects of p other
independent variables x1,…, xp.
6
Partial Correlation Coefficient
The order of the partial correlation
coefficient is the number of
variables that are being controlled for.
First-order partials: $r_{yx^*|x_1}$
Second-order partials: $r_{yx^*|x_1,x_2}$
Third-order partials: $r_{yx^*|x_1,x_2,x_3}$
…etc.
7
Partial Correlation Coefficient
 Mathematical
formula for $r^2_{yx^*|x_1,\ldots,x_p}$:

$$r^2_{yx^*|x_1,\ldots,x_p} = \frac{\text{Extra SS due to adding } x^* \text{ to the model, given that } x_1,\ldots,x_p \text{ are already in the model}}{\text{Residual SS using only } x_1,\ldots,x_p \text{ in the model}}$$

Or, written with two predictors $X_1$ and $X_2$:

$$r^2_{YX_1|X_2} = \frac{R^2_{Y|X_1X_2} - R^2_{Y|X_2}}{1 - R^2_{Y|X_2}}$$
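The sums-of-squares form and the R² form of the squared partial correlation can be compared numerically. The sketch below also computes the partial correlation a third way, as the correlation between the residuals of y and of x* after each is regressed on the control variable. The data are simulated, and the helper name `fitted` and all coefficients are assumptions made for illustration.

```python
import numpy as np

def fitted(X, y):
    """Fitted values from least squares of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ beta

rng = np.random.default_rng(2)   # simulated data, illustration only
n = 150
x1 = rng.normal(size=n)                      # variable controlled for
x_star = 0.6 * x1 + rng.normal(size=n)       # variable of interest
y = 1.0 + 0.8 * x1 + 0.5 * x_star + rng.normal(size=n)

sst = np.sum((y - y.mean()) ** 2)
sse_reduced = np.sum((y - fitted(x1, y)) ** 2)                          # x1 only
sse_full = np.sum((y - fitted(np.column_stack([x1, x_star]), y)) ** 2)  # x1 and x*

# (1) Extra SS due to x*, divided by residual SS of the x1-only model.
r2_ss = (sse_reduced - sse_full) / sse_reduced

# (2) R-squared form.
R2_reduced = 1 - sse_reduced / sst
R2_full = 1 - sse_full / sst
r2_R = (R2_full - R2_reduced) / (1 - R2_reduced)

# (3) Correlation of the two sets of residuals after removing x1.
e_y = y - fitted(x1, y)
e_x = x_star - fitted(x1, x_star)
r_resid = np.corrcoef(e_y, e_x)[0, 1]

print(f"{r2_ss:.4f} {r2_R:.4f} {r_resid**2:.4f}")
```

Unlike the squared forms, the residual-correlation route also gives the sign of the partial correlation.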
8
Partial Correlation Coefficient
The quantity $r^2_{yx^*|x_1,\ldots,x_p}$ measures the
proportion of the residual sum of squares
that is accounted for by the addition of x*
to a regression model already involving
x1,…, xp.
The partial F statistic F(x*| x1,…, xp) is
used to test $H_0: \rho_{yx^*|x_1,\ldots,x_p} = 0$.
9
Partial Correlation Coefficient
Hypothesis: $H_0: \rho_{yx^*|x_1,\ldots,x_p} = 0$
Test statistic:

$$F(x^* \mid x_1,\ldots,x_p) = \frac{\left[ SSR(x^*, x_1,\ldots,x_p) - SSR(x_1,\ldots,x_p) \right] / 1}{MSE(x^*, x_1,\ldots,x_p)} \sim F_{1,\,n-p-2}$$
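A minimal sketch of this test statistic follows, assuming simulated data with p = 2 control variables. The helper `regression_ss` is a hypothetical name introduced here. The sketch also checks the algebraic link between the partial F and the squared partial correlation, F = r² (n − p − 2) / (1 − r²), which follows from the definitions above.

```python
import numpy as np

def regression_ss(X, y):
    """Return (SSR, SSE) for least squares of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    sse = np.sum((y - A @ beta) ** 2)
    ssr = np.sum((y - y.mean()) ** 2) - sse
    return ssr, sse

rng = np.random.default_rng(3)   # simulated data, illustration only
n, p = 80, 2
X = rng.normal(size=(n, p))                    # x1, ..., xp
x_star = 0.5 * X[:, 0] + rng.normal(size=n)    # candidate variable x*
y = 1.0 + X @ np.array([1.0, -0.5]) + 0.4 * x_star + rng.normal(size=n)

ssr_reduced, sse_reduced = regression_ss(X, y)
ssr_full, sse_full = regression_ss(np.column_stack([x_star, X]), y)

df_error = n - p - 2                 # n - (p + 1) predictors - 1 intercept
mse_full = sse_full / df_error
F = ((ssr_full - ssr_reduced) / 1) / mse_full

# Same statistic written in terms of the squared partial correlation.
partial_r2 = (ssr_full - ssr_reduced) / sse_reduced
F_from_r2 = partial_r2 / (1 - partial_r2) * df_error

print(f"F = {F:.4f} on (1, {df_error}) df; from r^2: {F_from_r2:.4f}")
```

To complete the test, F would be compared with a critical value of the F distribution with 1 and n − p − 2 degrees of freedom.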
10
Partial Correlation Coefficient
Compare two models:
$y = \beta_0 + \beta_1 x_1 + \varepsilon$
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
$H_0: \rho_{yx_2|x_1} = 0$ is equivalent to $H_0: \beta_2 = 0$.
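One way to see why the two hypotheses are tested by the same statistic: for a single added variable, the partial F equals the square of the usual t statistic for its coefficient in the larger model. The sketch below checks this on simulated data. The seed, coefficients, and the helper name `sse_and_beta` are assumptions made for illustration.

```python
import numpy as np

def sse_and_beta(A, y):
    """Residual sum of squares and coefficients for design matrix A."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.sum((y - A @ beta) ** 2), beta

rng = np.random.default_rng(5)   # simulated data, illustration only
n = 60
x1 = rng.normal(size=n)
x2 = 0.3 * x1 + rng.normal(size=n)
y = 2.0 + 1.0 * x1 + 0.5 * x2 + rng.normal(size=n)

A_reduced = np.column_stack([np.ones(n), x1])       # y = b0 + b1 x1
A_full = np.column_stack([np.ones(n), x1, x2])      # y = b0 + b1 x1 + b2 x2
sse_reduced, _ = sse_and_beta(A_reduced, y)
sse_full, beta_full = sse_and_beta(A_full, y)

# Partial F for x2 given x1 (extra SS has 1 df, error has n - 3 df).
mse_full = sse_full / (n - 3)
F = (sse_reduced - sse_full) / mse_full

# t statistic for beta_2 in the full model.
cov_beta = mse_full * np.linalg.inv(A_full.T @ A_full)
t = beta_full[2] / np.sqrt(cov_beta[2, 2])

print(f"partial F = {F:.4f}, t^2 = {t**2:.4f}")
```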
11
Multiple Partial Correlation
The multiple partial correlation coefficient,
denoted $r_{y(z_1,\ldots,z_q)|x_1,\ldots,x_p}$, is a measure of the
strength of the linear relationship between
the dependent variable y and a set of
independent variables, say z1,…, zq, after we
control for the effects of p other independent
variables x1,…, xp.
12
Multiple Partial Correlation
Mathematical formula for $r^2_{y(z_1,\ldots,z_q)|x_1,\ldots,x_p}$:

$$r^2_{y(z_1,\ldots,z_q)|x_1,\ldots,x_p} = \frac{\text{Extra SS due to adding } z_1,\ldots,z_q \text{ to the model, given that } x_1,\ldots,x_p \text{ are already in the model}}{\text{Residual SS using only } x_1,\ldots,x_p \text{ in the model}}$$

Hypothesis: $H_0: \rho_{y(z_1,\ldots,z_q)|x_1,\ldots,x_p} = 0$
Test statistic:

$$F(z_1,\ldots,z_q \mid x_1,\ldots,x_p) = \frac{\left[ SSR(z_1,\ldots,z_q, x_1,\ldots,x_p) - SSR(x_1,\ldots,x_p) \right] / q}{MSE(z_1,\ldots,z_q, x_1,\ldots,x_p)} \sim F_{q,\,n-p-q-1}$$
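The multiple partial case can be sketched the same way, with q variables added at once. The example below assumes simulated data with p = 2 and q = 3. The helper `sse` and all coefficients are hypothetical choices made for illustration. It uses the fact that the extra regression SS equals the drop in residual SS, since both models share the same total SS.

```python
import numpy as np

def sse(X, y):
    """Residual sum of squares from least squares of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.sum((y - A @ beta) ** 2)

rng = np.random.default_rng(6)   # simulated data, illustration only
n, p, q = 120, 2, 3
X = rng.normal(size=(n, p))      # x1, ..., xp (already in the model)
Z = rng.normal(size=(n, q))      # z1, ..., zq (added as a set)
y = (1.0 + X @ np.array([0.8, -0.6])
     + Z @ np.array([0.5, 0.0, -0.3]) + rng.normal(size=n))

sse_reduced = sse(X, y)
sse_full = sse(np.column_stack([Z, X]), y)
extra_ss = sse_reduced - sse_full          # = SSR(full) - SSR(reduced)

mp_r2 = extra_ss / sse_reduced             # squared multiple partial correlation
df_error = n - p - q - 1
F = (extra_ss / q) / (sse_full / df_error)

print(f"r^2 = {mp_r2:.4f}, F = {F:.4f} on ({q}, {df_error}) df")
```

Setting q = 1 reduces this to the single-variable partial F statistic with n − p − 2 error degrees of freedom.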
13
Multiple Partial Correlation
The quantity $r^2_{y(z_1,\ldots,z_q)|x_1,\ldots,x_p}$ measures the
proportion of the residual sum of squares
that is accounted for by the addition of
z1,…, zq to a regression model already
involving x1,…, xp.
The partial F statistic F(z1,…, zq| x1,…, xp)
is used to test $H_0: \rho_{y(z_1,\ldots,z_q)|x_1,\ldots,x_p} = 0$.
14
Multiple Partial Correlation
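Compare two models:
$y = \beta_0 + \beta_1 x_1 + \varepsilon$
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$
$H_0: \rho_{y(x_2,x_3)|x_1} = 0$ is equivalent to $H_0: \beta_2 = \beta_3 = 0$.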
15
Partial Correlation Coefficient
y = SBP

Variable    Type I Partial Corr.    Type II Partial Corr.
Intercept   -                       -
Weight      rSBP,WGT                rSBP,WGT|AGE,HGT
Age         rSBP,AGE|WGT            rSBP,AGE|WGT,HGT
Height      rSBP,HGT|WGT,AGE        rSBP,HGT|WGT,AGE
16
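The pattern in the table can be sketched in code: Type I partials control only for the variables entered earlier (here in the order Weight, Age, Height), while Type II partials control for all the other variables. The data below are simulated stand-ins, not the course data set. Every coefficient is an arbitrary assumption, and `residuals` and `partial_r` are hypothetical helper names.

```python
import numpy as np

def residuals(v, controls):
    """Residuals of v after least squares on an intercept plus the controls."""
    A = np.column_stack([np.ones(len(v))] + list(controls))
    beta, *_ = np.linalg.lstsq(A, v, rcond=None)
    return v - A @ beta

def partial_r(y, x, controls):
    """Partial correlation of y and x given the control variables."""
    return np.corrcoef(residuals(y, controls), residuals(x, controls))[0, 1]

rng = np.random.default_rng(4)   # simulated stand-in data, not real SBP data
n = 200
age = rng.uniform(20, 70, n)
height = rng.normal(170, 10, n)
weight = 0.9 * (height - 100) + 0.1 * age + rng.normal(0, 8, n)
sbp = 90 + 0.5 * age + 0.3 * weight + rng.normal(0, 10, n)

predictors = {"Weight": weight, "Age": age, "Height": height}  # entry order
names = list(predictors)

# Type I: control for the variables entered before this one.
type1 = {name: partial_r(sbp, predictors[name],
                         [predictors[m] for m in names[:i]])
         for i, name in enumerate(names)}

# Type II: control for all the other variables.
type2 = {name: partial_r(sbp, predictors[name],
                         [predictors[m] for m in names if m != name])
         for name in names}

for name in names:
    print(f"{name:8s} Type I = {type1[name]: .4f}   Type II = {type2[name]: .4f}")
```

For the last variable entered, the two types coincide, which matches the repeated entry in the Height row of the table.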