CORRELATION

The statistical concept most closely tied to this type of analysis is the correlation coefficient. The correlation coefficient measures the strength and direction of the linear relationship between the two variables in a correlation analysis. It is denoted by the symbol r and is a unitless value that lies between -1 and +1. In other words, correlation examines the relationship between pairs of scores. The stronger the association between the two variables, the more accurately one variable can be predicted from the other.

Following are a few methods to calculate correlation:
1. Pearson’s product moment (r)
2. Rank-order correlation (Spearman’s ρ, or ‘rho’)
3. Biserial correlation
4. Tetrachoric correlation
5. Multiple correlation

The correlation between two variables can be positive, negative, or zero (no correlation):

• Positive correlation: A positive correlation between two variables means that both variables move in the same direction. An increase in one variable is accompanied by an increase in the other, and a decrease by a decrease. For example, greater happiness goes with greater psychological well-being. A positive correlation is denoted by + (plus).

• Negative correlation: A negative correlation between two variables means that the variables move in opposite directions. An increase in one variable is accompanied by a decrease in the other, and vice versa. For example, higher stress goes with lower performance. A negative correlation is denoted by - (minus).

• Weak/Zero correlation: No correlation exists when the two variables are unrelated, so that knowing one tells us nothing about the other. For example, there is no correlation between the number of years of school a person has attended and the number of letters in his/her name.
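As a rough illustration of the three cases above, the following Python sketch reads off the direction of a relationship from the sign of the summed products of deviations from the mean, which is the same sign that r takes. The function name `direction` and all three data sets are made up for illustration and do not come from the text.

```python
def direction(xs, ys):
    """Return the direction of the linear relationship between xs and ys."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of paired deviations; its sign is the sign of r.
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sum_xy > 0:
        return "positive"
    if sum_xy < 0:
        return "negative"
    return "zero"


# Hypothetical scores, chosen only to show each case.
hours_studied = [1, 2, 3, 4, 5]
marks = [35, 45, 50, 65, 70]    # rises as hours studied rise
errors = [9, 7, 6, 4, 2]        # falls as hours studied rise
name_length = [1, 3, 2, 3, 1]   # no linear trend

print(direction(hours_studied, marks))        # positive
print(direction(hours_studied, errors))       # negative
print(direction(hours_studied, name_length))  # zero
```

With real data the sum is almost never exactly zero, so in practice the size of r, not just its sign, decides whether a relationship is treated as negligible.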
The value of r is interpreted as follows:

r                   Interpretation
±0.00 to ±0.20      Indifferent or negligible relationship
±0.20 to ±0.40      Low correlation or very slight relationship
±0.40 to ±0.70      Substantial or marked relationship
±0.70 to ±1.00      High correlation

PEARSON’S PRODUCT MOMENT METHOD:

In 1896, Karl Pearson, a scientist associated with Galton’s laboratory, developed a rigorous mathematical treatment of these matters. Galton was the first person to use the symbol ‘r’ for a simple correlation coefficient. The term product moment is borrowed from physics. The product moment coefficient, denoted by r, measures the strength of the linear association between two variables.

Formula:

$$ r = \frac{\sum xy}{\sqrt{\sum x^2 \cdot \sum y^2}} $$

where
x  = deviation of each score from the mean on test X, i.e. (X - M)
y  = deviation of each score from the mean on test Y, i.e. (Y - M)
x² = squared deviation scores on test X
y² = squared deviation scores on test Y
xy = product of the paired deviation scores

For example: Following are the marks of 5 students obtained in test X and test Y. Determine whether a correlation exists between the marks on test X and test Y.

Subject   Test X   Test Y
A         50       40
B         60       60
C         30       40
D         40       30
E         20       30

Subject   Test X   Test Y   x (X-M)   y (Y-M)   x²     y²    xy
A         50       40       10        0         100    0     0
B         60       60       20        20        400    400   400
C         30       40       -10       0         100    0     0
D         40       30       0         -10       0      100   0
E         20       30       -20       -10       400    100   200
Total     200      200                          1000   600   600
Mean      40       40

STEP 1: Subtract the mean of test X from each test X score to get x (e.g., 50 - 40 = 10)
STEP 2: Subtract the mean of test Y from each test Y score to get y (e.g., 40 - 40 = 0)
STEP 3: Square the x values to get x² (e.g., 10 * 10 = 100)
STEP 4: Square the y values to get y² (e.g., 0 * 0 = 0)
STEP 5: Multiply each x value by its paired y value to get xy (e.g., 10 * 0 = 0)
STEP 6: Apply the Pearson correlation formula and substitute the totals.
STEP 7: $r = \frac{600}{\sqrt{1000 \times 600}} = \frac{600}{774.60}$
STEP 8: r = .77 (high correlation)
STEP 9: Interpretation: The obtained value of .77 indicates a high correlation between Test X and Test Y.
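The steps of the worked example can also be traced in code. The Python sketch below follows Steps 1 to 6 on the marks of students A to E and then looks the result up in the interpretation bands. The function names `pearson_r` and `interpret` are illustrative, and because the bands share their end points, treating each boundary value as belonging to the higher band is an assumption made here.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r by the deviation-score method (Steps 1 to 6)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]                    # Step 1: x = X - M
    dev_y = [y - mean_y for y in ys]                    # Step 2: y = Y - M
    sum_x2 = sum(d * d for d in dev_x)                  # Step 3: sum of x^2
    sum_y2 = sum(d * d for d in dev_y)                  # Step 4: sum of y^2
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))   # Step 5: sum of xy
    r = sum_xy / math.sqrt(sum_x2 * sum_y2)             # Step 6: formula
    return r, sum_x2, sum_y2, sum_xy


def interpret(r):
    """Map r to the verbal bands; boundary handling is an assumption."""
    size = abs(r)
    if size < 0.20:
        return "indifferent or negligible relationship"
    if size < 0.40:
        return "low correlation or very slight relationship"
    if size < 0.70:
        return "substantial or marked relationship"
    return "high correlation"


test_x = [50, 60, 30, 40, 20]   # marks of students A to E on test X
test_y = [40, 60, 40, 30, 30]   # marks of students A to E on test Y

r, sum_x2, sum_y2, sum_xy = pearson_r(test_x, test_y)
print(sum_x2, sum_y2, sum_xy)   # 1000.0 600.0 600.0
print(round(r, 2))              # 0.77
print(interpret(r))             # high correlation
```

If every score on one test is identical, the sum of squared deviations is zero and the division fails; r is undefined in that case, so the sketch does not try to handle it.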
Sums for practice

A)
Students   Score in English   Score in Maths
A          40                 78
B          21                 70
C          25                 60
D          31                 55
E          38                 80
F          47                 66

B)
Sl. no.   Stock A   Stock B
1         45        9
2         50        8
3         53        8
4         58        7
5         60        5

C)
Sl. no.   College A   College B
1         10          5
2         13          10
3         15          15
4         17          20
5         19          25

D)
Sl. no.   Group A   Group B
1         40        99
2         25        79
3         22        69
4         54        89

Answers
A   r = 0.35
B   r = -0.91
C   r = 0.996
D   r = 0.73
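To check hand calculations on the practice sets, a short script can help. The Python sketch below uses the raw-score form of Pearson's formula, which is algebraically equivalent to the deviation-score form but is not the form given in the text; it is swapped in here because it avoids computing the deviations first. The function name `pearson_r_raw` is illustrative, and the data shown is practice set A.

```python
import math


def pearson_r_raw(xs, ys):
    """Pearson's r from raw scores (equivalent to the deviation-score form)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Practice set A: scores of students A to F.
english = [40, 21, 25, 31, 38, 47]
maths = [78, 70, 60, 55, 80, 66]

print(round(pearson_r_raw(english, maths), 2))  # 0.35
```

The other sets can be checked by replacing the two lists with the corresponding columns.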