The Statistical Imagination
Chapter 15. Correlation and Regression Part 2: Hypothesis Testing and Aspects of a Relationship

When to Test a Hypothesis Using Correlation and Regression
1) There is one representative sample from a single population
2) There are two interval/ratio or interval-like ordinal variables
3) There are no restrictions on sample size, but generally, the larger the n, the better
4) A scatterplot of the coordinates of the two variables fits a linear pattern

Test Preparation
• Before proceeding with the hypothesis test, check the scatterplot for a linear pattern
• Calculate Pearson’s r correlation coefficient and the regression coefficient, b
• Compute the means of X and Y, and use them together with b to compute a, the Y-intercept
• Specify the regression equation, insert values of X, solve for Ŷ, and plot the line on the scatterplot
• Provide a conceptual diagram

Features of the Hypothesis Test
• Step 1. Stat. H: ρ = 0
• That is, there is no relationship between X and Y
• The Greek letter rho (ρ) is the correlation coefficient that would be obtained if Pearson’s correlation coefficient were computed for the population
• A ρ of zero asserts that there is no correlation in the population and that the regression line has no slope

Features of the Hypothesis Test (Continued)
• Step 2. The sampling distribution is the t-distribution with df = n - 2
• When the Stat. H is true, sample Pearson’s r’s will center around zero
• This test does not require a direct calculation of a standard error

Features of the Hypothesis Test (Continued)
• Step 4. The test effect is the value of Pearson’s r
• The test statistic is t_r
• The p-value is estimated from the t-distribution table, Statistical Table C in Appendix B

Four Aspects of a Relationship
• With correlation and regression analysis, because both variables are of interval/ratio level, the analysis is mathematically rich
• All four aspects of a relationship apply

Existence of a Relationship
• Test the Stat. H that ρ = 0, that there is no relationship between X and Y
• If the Stat. H is rejected, a relationship exists

Direction of a Relationship
• Direction is indicated by the sign of r and b, and by observing the slope of the pattern of coordinates in a scatterplot
• A positive relationship is revealed by an upward slope, and r and b will be positive
• A negative relationship is revealed by a downward slope, and r and b will be negative

Strength of a Relationship
• Strength is determined by the proportion of the total variation in Y explained by X
• This proportion is quickly obtained by squaring Pearson’s r correlation coefficient (r²)

Nature of a Relationship
1) Interpret the regression coefficient, b, the slope of the regression line. State the effect on Y of a one-unit change in X
2) Provide best estimates using the regression line equation. Insert chosen values of X, compute the Ŷ’s, and interpret them in everyday language

Careful Interpretation of Findings
• A correlation applies to a population, not to an individual
• E.g., predictions of Y for a value of X provide the best estimate of the mean of Y for all subjects with that X-score
• A statistical relationship may exist but not mean much. It is important to distinguish statistical significance (i.e., the existence of a relationship) from practical significance (i.e., the strength of the relationship)

Spurious Correlation
• A spurious correlation is one that is conceptually false, nonsensical, or theoretically meaningless
• E.g., in the 1990s there was a positive correlation between the amount of carbon dioxide released into the atmosphere and the level of the Dow Jones stock index
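
The test preparation and hypothesis-test steps outlined in this chapter can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's own procedure. The function names, the dictionary keys, and the five pairs of scores are made up for the example. It assumes the usual formulas: r and b computed from sums of squared deviations and cross-products, a = mean of Y minus b times mean of X, and t_r = r × sqrt((n - 2) / (1 - r²)) with df = n - 2.

```python
import math


def correlation_regression(x, y):
    """Compute r, b, a, r squared, t_r and df for paired X and Y scores."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squared deviations and of cross-products around the means
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if ss_x == 0 or ss_y == 0:
        raise ValueError("X and Y must each vary")

    r = sp_xy / math.sqrt(ss_x * ss_y)   # Pearson's r
    b = sp_xy / ss_x                     # regression coefficient (slope)
    a = mean_y - b * mean_x              # Y-intercept
    df = n - 2
    if abs(r) >= 1:
        raise ValueError("t_r is undefined for a perfect correlation")
    t_r = r * math.sqrt(df / (1 - r ** 2))  # test statistic

    return {"r": r, "b": b, "a": a, "r_squared": r ** 2, "t_r": t_r, "df": df}


def predict(result, x_value):
    """Best estimate of Y (Y-hat) for a chosen X, from Y-hat = a + bX."""
    return result["a"] + result["b"] * x_value


# Hypothetical scores, invented purely for illustration
x_scores = [1, 2, 3, 4, 5]
y_scores = [2, 4, 5, 4, 5]

result = correlation_regression(x_scores, y_scores)
print("r   =", round(result["r"], 4))
print("b   =", round(result["b"], 4))
print("a   =", round(result["a"], 4))
print("r^2 =", round(result["r_squared"], 4))
print("t_r =", round(result["t_r"], 4), "with df =", result["df"])
print("Y-hat at X = 6:", round(predict(result, 6), 4))
```

The sign of r and b gives the direction of the relationship, r² gives its strength, and b together with the predicted Ŷ values describes its nature. To judge existence, t_r would still be compared against the t-distribution table at df = n - 2, as described in the chapter.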