Pearson’s Product Moment Correlation Coefficient (PPMCC)
Sanjay Singh, PhD | sanjay.singh3210@gmail.com

Pearson Product Moment Correlation Coefficient (PPMCC)
• Used when X and Y are both metric variables
• Introduced by Karl Pearson in 1896 in a paper published in ‘Philosophical Transactions of the Royal Society of London’
• Denoted by the symbol ‘r’

Why it is called Product Moment Correlation
• ‘Moment’ refers to the average of a set of products.

The Pearson correlation is given by the following formula:

\[ r = \frac{\frac{1}{n}\sum xy}{\sqrt{\frac{1}{n}\sum x^{2}}\,\sqrt{\frac{1}{n}\sum y^{2}}} \]

x = deviation of X scores from the mean (x = X - X̄)
y = deviation of Y scores from the mean (y = Y - Ȳ)
n = total number of pairs

Example data:

Student   Score in Maths (X)   Score in Stats (Y)
A         9                    8
B         7                    6
C         10                   11
D         6                    18
E         8                    7
F         14                   9
G         8                    3
H         8                    6
I         8                    8
Total     78                   76

Moment: If X is a random variable, its moments are the expected values of the powers of X, that is E(X), E(X²), E(X³), and so on. They are known as the first moment, second moment, third moment, etc. The first moment is the mean. When the population mean (μ) is subtracted from X before the power is taken, the result is a central moment, E[(X - μ)^k]. The second central moment is the variance (the square of the standard deviation), and the third and fourth central moments, once standardized, give skewness and kurtosis. The Pearson correlation coefficient is built on the product moment of X and Y: the products of the deviations (X - μ_X)(Y - μ_Y) are summed and divided by the total number of pairs, and the result is then divided by the product of the two standard deviations.

Assumptions of PPMCC
• Quantitative measures: Both variables (IV and DV) should be quantitative (interval or ratio scale).
• Linearity: X and Y should be linearly related. PPMCC should not be applied when the relationship is curvilinear or when there is no relationship.
• Absence of outliers: There should be no significant outliers, because outliers distort the linear trend and the value of r.
• Normality: The variables should be normally distributed in their respective populations. If one variable is the DV (dependent variable), then at least the DV must be normally distributed. The Pearson correlation is, however, reasonably robust to departures from normality (Sprinthall, 1987, cited from Martin et al. (1993); Havlicek and Peterson, 1977).
• Minimum 30 observations: As a rule of thumb, there should be at least 30 observations for the assumption of normality to hold.

Formulas for PPMCC

Formula 1: Deviation Score Formula

\[ r = \frac{\sum xy}{\sqrt{\sum x^{2} \, \sum y^{2}}} \]

Formula 2: Z score formula

\[ r = \frac{\sum z_X z_Y}{n} \]

Formula 3: Raw Score Formula/Machine Formula

\[ r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^{2} - \left(\sum X\right)^{2}\right]\left[n\sum Y^{2} - \left(\sum Y\right)^{2}\right]}} \]

Formula 4: Covariance Formula

\[ r = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y}, \qquad \mathrm{cov}(X, Y) = \frac{\sum xy}{n} \]

The covariance formula with n in the denominator is for a population. For a sample, the denominator is (n - 1). This is because the sample and population standard deviation formulas differ.

Calculation of PPMCC
Using the Maths (X) and Stats (Y) scores of students A to I, compute the following columns for each student and then total each column:
• x = X - X̄
• y = Y - Ȳ
• xy
• X²
• Y²

Importance
• Explains variability: the square of r (r²) is the proportion of variance in one variable that is explained by the other.
• A correlation can be a basis for investigating causation, but it does not guarantee it.

References
• http://ww2.amstat.org/publications/jse/v9n3/stanton.html
• Allen, M. (Ed.). The SAGE Encyclopedia of Communication Research Methods.
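
Worked example: the four PPMCC formulas named in this deck (deviation score, Z score, raw score and covariance) can be sketched in Python for the Maths and Stats scores of students A to I. This is a minimal illustration using the standard textbook forms of each formula. The function names and the `sample` flag are assumptions made for this sketch and do not come from the slides.

```python
import math

# Scores of students A to I from the example table.
X = [9, 7, 10, 6, 8, 14, 8, 8, 8]   # Score in Maths
Y = [8, 6, 11, 18, 7, 9, 3, 6, 8]   # Score in Stats


def deviations(values):
    """Return each value minus the mean of the list."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def r_deviation(xs, ys):
    """Formula 1: sum(xy) / sqrt(sum(x^2) * sum(y^2)), x and y being deviations."""
    dx, dy = deviations(xs), deviations(ys)
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    return sum_xy / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def r_zscore(xs, ys):
    """Formula 2: mean of the products of z-scores (population SD, divisor n)."""
    n = len(xs)
    dx, dy = deviations(xs), deviations(ys)
    sd_x = math.sqrt(sum(a * a for a in dx) / n)
    sd_y = math.sqrt(sum(b * b for b in dy) / n)
    return sum((a / sd_x) * (b / sd_y) for a, b in zip(dx, dy)) / n


def r_raw(xs, ys):
    """Formula 3: raw score (machine) formula, using only sums of raw scores."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    sum_x2 = sum(a * a for a in xs)
    sum_y2 = sum(b * b for b in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def r_covariance(xs, ys, sample=False):
    """Formula 4: covariance divided by the product of the standard deviations.

    sample=False divides by n (population); sample=True divides by n - 1.
    The divisor cancels out, so r is the same in both cases.
    """
    n = len(xs)
    divisor = n - 1 if sample else n
    dx, dy = deviations(xs), deviations(ys)
    cov = sum(a * b for a, b in zip(dx, dy)) / divisor
    sd_x = math.sqrt(sum(a * a for a in dx) / divisor)
    sd_y = math.sqrt(sum(b * b for b in dy) / divisor)
    return cov / (sd_x * sd_y)


results = {
    "deviation score": r_deviation(X, Y),
    "z score": r_zscore(X, Y),
    "raw score": r_raw(X, Y),
    "covariance (population)": r_covariance(X, Y),
    "covariance (sample)": r_covariance(X, Y, sample=True),
}

for name, r in results.items():
    print(f"{name}: r = {r:.4f}, r^2 = {r * r:.4f}")
```

All four forms are algebraically equivalent, so they should agree up to floating-point rounding. For these nine pairs r is about -0.112, so r² is about 0.013: a weak negative linear relationship that explains only around 1% of the variance.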