Example 3.6 Measures of Association: Covariance and Correlation

EXPENSES.XLS
A survey questions members of 100 households about their spending habits. The data in this file represent the salary, the expense for cultural activities, the expense for sports-related activities, and the expense for dining out for each household over the past year. Do these variables appear to be related linearly?

3.1 | 3.2 | 3.3 | 3.4 | 3.5 | 3.7 | 3.8 | 3.9 | 3.10 | 3.11

Covariance and Correlation
When we need to summarize the relationship between two variables, we can use the measures covariance and correlation. Both summarize the type of behavior observed in a scatterplot. Each measures the strength (and direction) of a linear relationship between two numerical variables. The relationship is “strong” if the points in a scatterplot cluster tightly around some straight line. If this line rises from left to right, the relationship is “positive”. If it falls from left to right, the relationship is “negative”.

Determining Linear Relationships
Scatterplots of each variable versus each other variable would answer the question, but with four variables six scatterplots would be required, one for each pair. To get a quick indication of possible linear relationships, we can use StatPro to obtain a table of correlations and/or covariances.

Table of Correlations and Covariances
To get the table, place the cursor anywhere in the data set, select the StatPro/Summary Stats/Correlations, Covariances menu item, and then follow the prompts.

Relationships
The only relationships that stand out are the positive relationships between salary and cultural expenses and between salary and dining expenses, and the negative relationship between cultural and sports-related expenses.
To confirm these graphically, we show scatterplots of Salary versus Culture and Culture versus Sports.

Scatterplot Indicating Positive Relationship (Salary versus Culture)

Scatterplot Indicating Negative Relationship (Culture versus Sports)

Correlation and Covariance Properties
In general, the following properties are evident from the table of correlations and covariances.
– The correlation between a variable and itself is 1.
– The correlation between X and Y is the same as the correlation between Y and X. Therefore, it is sufficient to list the correlations below (or above) the diagonal in the table. (The same is true for the covariances.)
– The covariance between a variable and itself is the variance of the variable. We indicate this in the heading of the covariance table.
– It is difficult to interpret the magnitudes of covariances. They depend on the units of measurement: here the data are measured in dollars rather than, say, thousands of dollars. It is much easier to interpret the magnitudes of the correlations because they are scaled to be between -1 and +1.
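
The properties listed above can be checked numerically. The following is a minimal Python sketch, not the StatPro procedure. The salary and culture values are made up for illustration and are not taken from EXPENSES.XLS. The helper names (`covariance`, `correlation`) are hypothetical, and the sample (n - 1) form of the covariance is assumed. Rescaling both variables from dollars to thousands of dollars changes the covariance by a factor of one million but leaves the correlation unchanged.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def covariance(x, y):
    """Sample covariance of x and y (assumes division by n - 1)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def correlation(x, y):
    """Covariance rescaled by both standard deviations; lies in [-1, +1]."""
    return covariance(x, y) / sqrt(covariance(x, x) * covariance(y, y))


# Made-up illustrative values in dollars; NOT taken from EXPENSES.XLS.
salary = [30000, 45000, 52000, 61000, 78000]
culture = [900, 1500, 1400, 2100, 2600]

# The same data expressed in thousands of dollars.
salary_k = [s / 1000 for s in salary]
culture_k = [c / 1000 for c in culture]

cov_dollars = covariance(salary, culture)
cov_thousands = covariance(salary_k, culture_k)
corr_dollars = correlation(salary, culture)
corr_thousands = correlation(salary_k, culture_k)

print("covariance (dollars):     ", cov_dollars)
print("covariance (thousands):   ", cov_thousands)
print("correlation (dollars):    ", corr_dollars)
print("correlation (thousands):  ", corr_thousands)
print("correlation with itself:  ", correlation(salary, salary))
print("covariance with itself:   ", covariance(salary, salary))
```

The last line prints the variance of the salary values, since the covariance of a variable with itself is its variance. Swapping the two arguments of either function gives the same result, which is why listing only the entries below (or above) the diagonal of the table is enough.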