Correlation

Correlation refers to a relationship that exists between pairs of measures. Knowledge of the strength and direction of the relationship allows us to predict one variable from the other with an accuracy greater than chance.

For example, you can guess someone’s weight more accurately if you know how tall they are, because height and weight are positively correlated. When two variables are positively correlated, they tend to both move higher or both move lower at the same time. Generally, taller people weigh more than shorter people.

When two variables are negatively correlated, they change in value inversely: higher scores on one generally go with lower scores on the other. For example, the outdoor temperature and the weight of one’s clothing are negatively correlated. The higher the temperature, the less clothing we wear; the lower the temperature, the more clothing we wear. Another example: how old you are and how much longer you will live are negatively correlated.

The existence of a negative correlation does not mean the absence of a relationship. It means that the variables tend to move in opposite directions, not that the variables are unrelated.

A near-zero correlation is said to exist when scores on the two variables are unrelated. Higher scores on one variable are just as likely to be accompanied by higher scores as by lower scores on the other variable. An example: the street number on your house and the odometer reading of your car.

For each of the following examples, state from your general knowledge whether the correlation between the two variables is likely to be positive, negative, or near zero, and explain why.

Average number of calories eaten per day and body weight
Positive: caloric intake is one of the major determinants of body weight.

Golf scores and the number of years of golfing experience
Negative: golfers improve with experience and hence would be expected to get better (lower) scores.

Length of hair and shoe size in adult males
Near zero: it’s doubtful that these two measures could be influenced by common factors.

Amount of formal education one has received and the time spent collecting public assistance (welfare)
Negative: educated individuals are more likely to be employable and hence less likely to need welfare.

Per capita consumption of alcohol in a group of cities and suicide rates in those cities
Positive: a common set of stresses and other factors is likely to influence rates of both alcoholism and suicide in a given community.

Number correct on a current events test and time spent reading the newspaper
Positive: the newspapers are full of stories concerning current events around the world.

Strength of traditional religious beliefs and favorableness of attitude toward abortion on demand
Negative: members of traditional religious groups are more likely than others to regard abortion as immoral.

Height and political conservatism
Near zero.

Correlations can be represented either graphically, by constructing a special type of graph called a scatter-plot diagram, or numerically, by computing an index called a coefficient of correlation. Most of the seminal work that went into the development of these representations was done by the early statistician Karl Pearson, a professor at University College London.

Construction of a scatter-plot diagram requires acquiring pairs of scores from each subject. For example, suppose we wanted to look at the relationship between height and self-esteem in men. Perhaps we have a hypothesis that how tall you are affects your level of self-esteem. So we collect pairs of scores from twenty male individuals: height, measured in inches, and self-esteem, based on a self-rating scale (where higher scores mean higher self-esteem).

Man   Height (in)   Self-Esteem
  1       68            4.1
  2       71            4.6
  3       62            3.8
  4       75            4.4
  5       58            3.2
  6       60            3.1
  7       67            3.8
  8       68            4.1
  9       71            4.3
 10       69            3.7
 11       68            3.5
 12       67            3.2
 13       63            3.7
 14       62            3.3
 15       60            3.4
 16       63            4.0
 17       65            4.1
 18       67            3.8
 19       63            3.4
 20       61            3.6

Linear Regression

[Figure: scatter plot of self-esteem against height for the twenty men, with the linear regression line (the best-fitting line, or line of prediction) drawn through the points.]

Coefficient of Correlation

A statistical computation that indicates the strength and direction of an underlying correlation.

It always results in a signed number in the range from -1.00 to +1.00.
If the sign is positive, the underlying relationship is a positive correlation.
If the sign is negative, the underlying relationship is a negative correlation.
The closer the value is to 1 in absolute terms (that is, to +1.00 or -1.00), the stronger the indicated underlying relationship.

In r = +.86, the sign (+) gives the direction of the relationship and the magnitude (.86) gives its strength.

Consider the following coefficients:

r = +.86
r = +.31
r = -.96    strongest
r = +.04    weakest
r = +1.02   computational error (outside the range -1.00 to +1.00)

For each coefficient, consider: Direction? Strength? Prediction?
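As a rough illustration of how such a coefficient is obtained, the sketch below computes Pearson's r and the least-squares line of prediction for the twenty height and self-esteem pairs in the table. The function names `pearson_r` and `least_squares_line` are hypothetical helpers written for this example, not part of the original slides. On these data the computation gives r of about +.73, a fairly strong positive correlation.

```python
import math

# (height in inches, self-esteem rating) for the twenty men in the table
data = [
    (68, 4.1), (71, 4.6), (62, 3.8), (75, 4.4), (58, 3.2),
    (60, 3.1), (67, 3.8), (68, 4.1), (71, 4.3), (69, 3.7),
    (68, 3.5), (67, 3.2), (63, 3.7), (62, 3.3), (60, 3.4),
    (63, 4.0), (65, 4.1), (67, 3.8), (63, 3.4), (61, 3.6),
]


def pearson_r(xs, ys):
    """Pearson coefficient of correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Slope and intercept of the best-fitting line for predicting y from x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


heights = [h for h, _ in data]
esteem = [s for _, s in data]

r = pearson_r(heights, esteem)
slope, intercept = least_squares_line(heights, esteem)

print(f"r = {r:+.2f}")
print(f"predicted self-esteem = {intercept:.3f} + {slope:.4f} * height")
```

The sign of r gives the direction and its magnitude gives the strength, while the fitted line is what turns the correlation into an actual prediction of self-esteem for a given height.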

Correlation Does Not Imply Causation

The Third Variable Problem
Refers to the possibility that two variables are correlated with each other, not because one causes the other, but because both are effects of some third unidentified cause.

The Problem of Directionality
Even when two variables are correlated because of a causal relationship between them, we cannot tell from the correlational data alone which is the cause and which is the effect.

Why Do Correlational Research?

When we are dealing with variables that we have not yet learned to directly control
When we are dealing with variables that it would not be ethical to directly control
Reasons of economy (cheaper, faster, easier), because we are analyzing data that already exists rather than creating data through our experimentation
As a prelude to experimentation
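
Returning to the third variable problem described earlier, a small simulation can make the idea concrete. This is only a sketch under invented assumptions: the variables `x`, `y`, and `z`, the noise level, and the sample size are all hypothetical and do not come from any real study. Here `z` causes both `x` and `y`, and `x` and `y` have no causal link to each other.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson coefficient of correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

# z is the unidentified third variable; x and y each depend on z plus
# independent noise, and neither one influences the other.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 0.5) for zi in z]
y = [zi + rng.gauss(0, 0.5) for zi in z]

# x and y are strongly correlated even though neither causes the other.
r_xy = pearson_r(x, y)

# Removing the contribution of z (known exactly here, by construction)
# leaves only the independent noise, which is essentially uncorrelated.
x_resid = [xi - zi for xi, zi in zip(x, z)]
y_resid = [yi - zi for yi, zi in zip(y, z)]
r_resid = pearson_r(x_resid, y_resid)

print(f"r(x, y)               = {r_xy:+.2f}")
print(f"r(x, y) with z removed = {r_resid:+.2f}")
```

In real correlational data the third variable is usually unmeasured, so it cannot simply be subtracted out as it is here. That is why a correlation alone cannot establish that one variable causes the other.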