Covariance and Correlation
Professor Richard A. Levine
San Diego State University

Definitions
Relationship between two variables; joint distributions
• µ_X = E(X), µ_Y = E(Y)
• σ_X² = VAR(X) = E{(X − µ_X)²}; σ_Y² = VAR(Y) = E{(Y − µ_Y)²}
• Covariance: σ_XY = COV(X, Y) = E{(X − µ_X) · (Y − µ_Y)}
• Correlation: ρ = COV(X, Y) / (σ_X σ_Y)

Covariance
The sign of COV(X, Y) provides information on the X, Y relationship:
• Large values of X tend to be observed with large values of Y: COV(X, Y) positive
  • If X > µ_X, then Y > µ_Y is likely to be true, so the product of deviations will be positive
  • If X < µ_X, then Y < µ_Y is likely to be true, so the product of deviations will again be positive
• Large values of X tend to be observed with small values of Y: COV(X, Y) negative
• Small values of X tend to be observed with large values of Y: COV(X, Y) negative

Consequences
COV(X, Y) = E{(X − µ_X) · (Y − µ_Y)}
          = E(XY − µ_X Y − µ_Y X + µ_X µ_Y)
          = E(XY) − µ_X µ_Y
⇒ E(XY) = ρ σ_X σ_Y + µ_X µ_Y

X and Y independent
• E(XY) = E(X) · E(Y)
• COV(X, Y) = E(XY) − µ_X µ_Y = 0
• ρ = 0: no (linear) relationship
• VAR(X + Y) = VAR(X) + VAR(Y)

Variance of a sum, VAR(X + Y)
VAR(X + Y) = E{(X + Y)²} − {E(X + Y)}²
           = E{(X + Y) · (X + Y)} − E(X + Y) · E(X + Y)
           = E(X² + 2XY + Y²) − {E(X)}² − {E(Y)}² − 2E(X)E(Y)
           = VAR(X) + VAR(Y) + 2 COV(X, Y)

|ρ| ≤ 1
Why? Consider the quadratic
h(b) = E[{(Y − µ_Y) − b(X − µ_X)}²]
     = b² E{(X − µ_X)²} − 2b E{(X − µ_X)(Y − µ_Y)} + E{(Y − µ_Y)²}
     = b² σ_X² − 2b COV(X, Y) + σ_Y²
     ≥ 0, for every b
Since h(b) ≥ 0 for every b, the quadratic has at most one real root, so the discriminant b² − 4ac must be non-positive:
⇒ {2 COV(X, Y)}² − 4 σ_X² σ_Y² ≤ 0
⇒ −σ_X σ_Y ≤ COV(X, Y) ≤ σ_X σ_Y
⇒ −1 ≤ ρ ≤ 1

ρ = ±1 iff P(Y = c + dX) = 1
Why? |ρ| = 1 iff the discriminant equals 0, i.e., h(b) has a single (double) root b*.
h(b*) = 0 iff P[{(Y − µ_Y) − b*(X − µ_X)}² = 0] = 1
       iff P{(Y − µ_Y) − b*(X − µ_X) = 0} = 1
       iff P(Y = c + dX) = 1,
where d = b* and c = µ_Y − b* µ_X, with b* the root of h(b).
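Before turning to the regression line, a quick numerical sanity check of the identities so far: a minimal sketch assuming only NumPy, where the means, standard deviations, and ρ are illustrative choices rather than values from the slides.

    # Monte Carlo check of the covariance/correlation identities above.
    # All parameter values are illustrative choices.
    import numpy as np

    rng = np.random.default_rng(0)
    mu_x, mu_y = 1.0, -2.0
    sigma_x, sigma_y, rho = 2.0, 0.5, 0.7
    cov_xy = rho * sigma_x * sigma_y

    # Draw (X, Y) pairs from a bivariate normal with these moments.
    xy = rng.multivariate_normal(
        [mu_x, mu_y],
        [[sigma_x**2, cov_xy], [cov_xy, sigma_y**2]],
        size=1_000_000,
    )
    x, y = xy[:, 0], xy[:, 1]

    # COV(X, Y) = E{(X - mu_X)(Y - mu_Y)}; rho = COV(X, Y)/(sigma_X sigma_Y)
    print(np.mean((x - mu_x) * (y - mu_y)), cov_xy)
    print(np.corrcoef(x, y)[0, 1], rho)

    # E(XY) = rho sigma_X sigma_Y + mu_X mu_Y
    print(np.mean(x * y), cov_xy + mu_x * mu_y)

    # VAR(X + Y) = VAR(X) + VAR(Y) + 2 COV(X, Y)
    print(np.var(x + y), sigma_x**2 + sigma_y**2 + 2 * cov_xy)

With a million draws, each printed pair should agree to roughly two or three decimal places.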
Regression line
Least squares: find the line (slope b) that minimizes h(b).
h(b) = b² σ_X² − 2b ρ σ_X σ_Y + σ_Y²
Set h′(b) = 2b σ_X² − 2ρ σ_X σ_Y = 0 ⇒ b = ρ σ_Y / σ_X
h″(b) = 2σ_X² > 0, so this b is a minimizer.
So the least squares regression line (the best fit according to h(b)) is
y = µ_Y + ρ (σ_Y / σ_X)(x − µ_X)

Regression line and ρ
y = µ_Y + ρ (σ_Y / σ_X)(x − µ_X)
• If ρ > 0, the slope is positive
• If ρ < 0, the slope is negative
• If ρ = 0, h(ρ σ_Y / σ_X) = σ_Y², a constant
At the minimizer, h(ρ σ_Y / σ_X) = σ_Y²(1 − ρ²), so if ρ is close to +1 or −1, h(ρ σ_Y / σ_X) is relatively small. Vertical distances of the points from the line are then small, since h is the expected value of the square of those distances! In all, ρ measures the amount of linearity in the distribution of points. (A numerical check of the slope appears after the covariance inequality below.)

Uncorrelated ⇏ Independent
Independent random variables: f(x, y) = f_X(x) · f_Y(y)
Let Y have a density symmetric about zero and let X = SY, where S is independent of Y and takes the values +1 and −1 with probability 1/2 each. This means f_X(x) = (1/2) f_Y(x) + (1/2) f_Y(−x).
E(S) = 1 · P(S = 1) + (−1) · P(S = −1) = 0.5 − 0.5 = 0
COV(X, Y) = COV(SY, Y)
          = E(SY · Y) − E(SY) · E(Y)
          = E(S) · E(Y²) − E(S) · {E(Y)}²
          = 0
But X = SY, so |X| = |Y|: knowing Y determines X up to sign, so X and Y are not independent (simulated after the covariance inequality below).

Covariance inequality
From ρ = COV(X, Y)/(σ_X σ_Y) and |ρ| ≤ 1,
{COV(X, Y)}² ≤ VAR(X) · VAR(Y)
(For those who have taken more analysis, this is a version of the Cauchy–Schwarz inequality.)
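Returning to the regression line: a short sketch, under the same assumptions as the earlier block (NumPy, illustrative data-generating values), checking that the least-squares slope COV(X, Y)/VAR(X) matches ρ̂ s_Y/s_X on simulated data.

    # Empirical check that the least-squares slope equals rho * sigma_Y / sigma_X,
    # equivalently COV(X, Y)/VAR(X). Data-generating values are illustrative.
    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.normal(1.0, 2.0, size=500_000)
    y = -2.0 + 0.25 * (x - 1.0) + rng.normal(0.0, 0.3, size=x.size)

    slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)   # COV(X, Y)/VAR(X)
    rho_hat = np.corrcoef(x, y)[0, 1]
    print(slope, rho_hat * np.std(y, ddof=1) / np.std(x, ddof=1))  # the two agree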
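Finally, a minimal simulation of the X = SY counterexample above, again assuming NumPy: the sample correlation of X and Y is near zero, while |X| = |Y| makes the dependence plain.

    # Simulate X = S*Y with S = +/-1 independent of Y (Y symmetric about 0).
    import numpy as np

    rng = np.random.default_rng(2)
    y = rng.normal(0.0, 1.0, size=1_000_000)   # density symmetric about zero
    s = rng.choice([-1.0, 1.0], size=y.size)   # P(S = +1) = P(S = -1) = 1/2
    x = s * y

    print(np.corrcoef(x, y)[0, 1])                   # ~ 0: uncorrelated
    print(np.corrcoef(np.abs(x), np.abs(y))[0, 1])   # 1: |X| = |Y| always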