Statistics - Healey Chapter 13-14

Week 12
Chapter 13: Association Between Variables Measured at the Ordinal Level
Chapter 14: Association Between Variables Measured at the Interval-Ratio Level

Chapter 13: Association Between Variables Measured at the Ordinal Level

This Presentation
• Two Types of Ordinal Variables
• Gamma
• Spearman's Rho
• Hypothesis Tests for Gamma and Rho

Two Types of Ordinal Variables
1. Continuous ordinal variables:
   • Have many possible scores
   • Resemble interval-ratio level variables
   • Use Spearman's rho (rs)
   • Example: a scale measuring attitudes toward handgun control with scores ranging from 0 to 20
2. Collapsed ordinal variables:
   • Have just a few values or scores
   • Use gamma (G)
   • Can also use Somers' d and Kendall's tau-b (see text website)
   • Example: social class measured as lower, middle, upper

Gamma
• Gamma measures the strength and direction of the relationship between two ordinal-level variables that have been arrayed in a bivariate table
• Before computing and interpreting gamma, it is always useful to find and interpret the column percentages

Gamma: Interpretation
• Use the table below as a guide to interpret the strength of gamma in overall terms
[Table: guide to interpreting the strength of gamma]

Gamma: Direction
• In addition to strength, gamma also identifies the direction of the relationship
• In a negative relationship, the variables change in opposite directions
  Example: as age increases, income decreases (or, as age decreases, income increases)
• In a positive relationship, the variables change in the same direction
  Example: as education increases, income increases (or, as education decreases, income decreases)

Gamma: Hypothesis Test
• In addition to strength and direction, a hypothesis test of gamma can indicate whether the two variables are related in the population, that is, whether the relationship is statistically significant

Hypothesis Test of Gamma
Step 1: Make Assumptions and Meet Test Requirements
• Random sampling
• Ordinal level of measurement
• Normal sampling distribution

Step 2: State the Null Hypothesis
• H0: γ = 0 (no relationship exists between the variables in the population)
• H1: γ ≠ 0 (a relationship exists between the variables in the population)

Step 3: Select the Sampling Distribution and Establish the Critical Region
• Sampling distribution = Z distribution
• Set alpha (two-tailed)
• Look up Z(critical) in Appendix A

Step 4: Compute the Test Statistic
$Z(\text{obtained}) = G \sqrt{\dfrac{N_s + N_d}{N (1 - G^2)}}$
where
$G = \dfrac{N_s - N_d}{N_s + N_d}$

Step 5: Make a Decision and Interpret the Results
• Compare Z(obtained) to Z(critical)
• If Z(obtained) falls in the critical region, reject H0
• If Z(obtained) does not fall in the critical region, fail to reject H0
• Interpret results

Spearman's Rho (rs)
• Measure of association for ordinal-level variables with a broad range of different scores and few ties between cases on either variable

Computing Spearman's Rho
1. Rank cases from high to low on each variable
2. Use the ranks, not the scores, to calculate rho

Spearman's Rho (rs): Interpretation
• Rho is positive, therefore jogging and self-image share a positive relationship: as jogging rank increases, self-image rank also increases
• On its own, rho does not have a good strength interpretation
• But rho squared (rs²) is a PRE (proportional reduction in error) measure
• For this example, rs² = (0.86)² = 0.74
• Therefore, we would make 74% fewer errors if we used the rank on jogging to predict the rank on self-image than if we ignored the rank on jogging

Spearman's Rho (rs): Hypothesis Test
• In addition to strength and direction, a hypothesis test of rho can indicate whether the two variables are related in the population, that is, whether the relationship is statistically significant

Hypothesis Test of Spearman's Rho
Step 1: Make Assumptions and Meet Test Requirements
• Random sampling
• Ordinal level of measurement
• Normal sampling distribution

Step 2: State the Null Hypothesis
• H0: ρs = 0 (no relationship exists between the variables in the population)
• H1: ρs ≠ 0 (a relationship exists between the variables in the population)

Step 3: Select the Sampling Distribution and Establish the Critical Region
• Sampling distribution = Student's t
• Alpha = 0.05 (two-tailed)
• Degrees of freedom = N − 2 = 8
• t(critical) = ±2.306

Step 4: Compute the Test Statistic

Step 5: Make a Decision and Interpret the Results
• t(obtained) = 4.77
• t(critical) = ±2.306
• t(obtained) falls in the critical region, so reject H0
• Jogging and self-image are related in the population from which the sample was drawn

Chapter 14: Association Between Variables Measured at the Interval-Ratio Level

This Presentation
• Scattergrams: graphs that display relationships between two interval-ratio variables
• Regression coefficients and the regression line
  • The regression line summarizes the linear relationship between X and Y
  • Regression coefficients predict scores on Y from scores on X
• Pearson's r: the preferred measure of association for two interval-ratio variables
• Coefficient of determination: r²
• Correlation matrix

Scattergrams
• Scattergrams have two dimensions:
  • The X (independent) variable is arrayed along the horizontal axis
  • The Y (dependent) variable is arrayed along the vertical axis
• Each dot on a scattergram is a case
• The dot is placed at the intersection of the case's scores on X and Y

Scattergrams: The Regression Line
• A regression line, which summarizes the linear relationship between X and Y, is added to the graph
• "Eyeball" a straight line that connects all of the dots or comes as close as possible to connecting all of the dots
• To be more precise: calculate the conditional mean of Y for each value of X, plot those values, and connect the dots
• Inspection of a scattergram should always be the first step in assessing the relationship between two interval-ratio level variables

Scattergrams: Linearity
• A key assumption of scattergrams and regression analysis is that X and Y share a linear relationship
• In a linear relationship, the dots of a scattergram form a straight-line pattern
[Figure: Linear Relationship: Example]
• In a nonlinear relationship, the dots do not form a straight-line pattern

Scattergrams: Three Questions
1. Does a relationship exist?
   • A relationship exists if the conditional means of Y change across values of X
   • As long as the regression line lies at an angle to the X axis (and is not parallel to the X axis), we can conclude that a relationship exists between the two variables
2. How strong is the relationship?
   • Strength of the relationship is determined by the spread of the dots around the regression line
   • In a perfect association, all dots fall on the regression line
   • In a stronger association, the dots fall close to (are clustered tightly around) the regression line
   • In a weaker association, the dots are spread out relatively far from the regression line
3. What is the direction of the relationship?
   • Direction of association is determined by the angle of the regression line
[Figure: What is the Direction of the Relationship?]

Scattergrams: Example
Based on this scattergram for percent college educated (X) and voter turnout (Y) on election day for 50 states:
• Does a relationship exist?
• How strong is the relationship?
• What is the direction of the relationship?
• Is the relationship linear?

Does a relationship exist?
• The regression line falls at an angle to the X axis (it is not parallel), therefore we can conclude that an association exists between voter turnout and college education

How strong is the relationship?
• The more tightly the dots are clustered around the regression line, the stronger the relationship
• This relationship is weak to moderate in strength

What is the direction of the relationship?
• Positive: the regression line rises from lower-left to upper-right
• Negative: the regression line falls from upper-left to lower-right
• This is a positive relationship: as percent college educated increases, voter turnout increases

Is the relationship linear?
• The conditional means of Y form a straight line, as demonstrated by the regression line
• Therefore, the relationship is linear

Pearson's r
• Pearson's r is a measure of association for interval-ratio level variables
• Pearson's r can indicate the direction of association, but it does not have an acceptable strength interpretation
• But, by squaring r, we obtain a PRE measure called the coefficient of determination
• The coefficient of determination indicates the percentage of the variation in Y that is explained by X

Pearson's r: Example
• r = 0.50
• r is positive, therefore the relationship between X and Y is positive
• As the number of children in dual-career families increases, husbands' hours of housework per week also increase
• r² = (0.50)² = 0.25
• r² is 0.25, therefore the number of children in dual-career families explains 25% of the variation in husbands' hours of housework per week

Hypothesis Test of Pearson's r
Step 1: Make Assumptions and Meet Test Requirements
• Random sample
• Interval-ratio level measurement
• Bivariate normal distributions
• Linear relationship
• Homoscedasticity
• Normal sampling distribution

Step 2: State the Null Hypothesis
• H0: ρ = 0
• H1: ρ ≠ 0

Step 3: Select the Sampling Distribution and Establish the Critical Region
• Sampling distribution = Student's t
• Alpha = 0.05 (two-tailed)
• Degrees of freedom = N − 2 = 10
• t(critical) = ±2.228

Step 4: Compute the Test Statistic

Step 5: Make a Decision and Interpret the Results
• t(critical) = ±2.228
• t(obtained) = 1.83
• t(obtained) does not fall in the critical region, so we fail to reject H0
• We cannot conclude that the two variables are related in the population

Correlation Matrix
• A correlation matrix is a table that shows the relationships between all possible pairs of variables

Using the matrix below:
[Table: correlation matrix]
• What is the correlation between GDP and inequality?
• Of all the variables correlated with inequality, which has the strongest relationship? The weakest?
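
Worked sketch for Chapter 13, gamma: the Step 4 formulas for G and Z(obtained) can be traced in a few lines of Python. The 2×2 table below is made-up illustration data, not a table from the slides, and the function names are hypothetical. The table is assumed to have rows and columns both ordered from low to high.

```python
import math

def gamma_counts(table):
    """Return (Ns, Nd): pairs ranked in the same order and in different order.

    table[i][j] is the cell frequency for row category i and column
    category j; both rows and columns are assumed ordered low to high.
    """
    ns = nd = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            # Pair each cell with every cell in a later row.
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    if l > j:      # higher on both variables: same order
                        ns += table[i][j] * table[k][l]
                    elif l < j:    # higher on one, lower on the other
                        nd += table[i][j] * table[k][l]
    return ns, nd

def gamma_test(table):
    """Return (G, Z obtained) using the formulas from Step 4.

    Not defined when G is exactly +1 or -1 (division by zero).
    """
    ns, nd = gamma_counts(table)
    n = sum(sum(row) for row in table)          # total number of cases
    g = (ns - nd) / (ns + nd)
    z = g * math.sqrt((ns + nd) / (n * (1 - g ** 2)))
    return g, z

# Made-up table for illustration only.
example_table = [[20, 10],
                 [10, 20]]
ns, nd = gamma_counts(example_table)   # Ns = 400, Nd = 100
g, z = gamma_test(example_table)       # G = 0.6, Z is roughly 2.17
```

With alpha = 0.05 (two-tailed), Z(critical) is ±1.96, so a Z(obtained) of about 2.17 would fall in the critical region for this made-up table.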
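
Worked sketch for Chapter 13, Spearman's rho: the slides give the results (rs = 0.86, t(obtained) = 4.77, N = 10) but the formulas did not survive. The sketch below assumes the usual textbook forms, rs = 1 − 6ΣD² / (N(N² − 1)) and t = rs·sqrt((N − 2) / (1 − rs²)), which reproduce the reported t value. The scores are made up, and the function names are hypothetical.

```python
import math

def ranks_high_to_low(scores):
    """Rank cases so the highest score gets rank 1; ties share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next case has the same score (a tie).
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

def spearman_rho(x, y):
    """rs computed from the differences between ranks, not from the scores."""
    n = len(x)
    rx, ry = ranks_high_to_low(x), ranks_high_to_low(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

def t_obtained(r, n):
    """t statistic with N - 2 degrees of freedom (assumed form, see lead-in)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Made-up scores for illustration only.
x = [10, 20, 30, 40, 50]
y = [1, 3, 2, 5, 4]
rs = spearman_rho(x, y)            # sum of D squared is 4, so rs = 0.8

# Check against the values reported on the slides.
t_slides = t_obtained(0.86, 10)    # about 4.77
```

The rank-difference formula is exact only when there are no ties, which fits the slides' note that rho suits variables with few ties.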
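
Worked sketch for Chapter 14, the regression line: the slides say that regression coefficients predict scores on Y from scores on X but do not show how they are computed. The sketch below assumes the ordinary least-squares line Y = a + bX, with slope b = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)² and intercept a = Ȳ − bX̄. The data and function names are made up for illustration.

```python
def regression_line(x, y):
    """Return (a, b): the intercept and slope of the least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariation of X and Y around their means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Variation of X around its mean.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def predict(a, b, x_value):
    """Predicted score on Y for a given score on X."""
    return a + b * x_value

# Made-up scores for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = regression_line(x, y)     # b = 6 / 10 = 0.6, a = 4 - 0.6 * 3 = 2.2
y_hat = predict(a, b, 6)         # 2.2 + 0.6 * 6 = 5.8
```

A positive slope b corresponds to a line rising from lower-left to upper-right, and a slope of zero corresponds to a line parallel to the X axis, meaning no linear relationship.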
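
Worked sketch for Chapter 14, Pearson's r: the calculation slide and the Step 4 formula were lost, so the sketch below assumes the definitional formula r = Σ(X − X̄)(Y − Ȳ) / sqrt(Σ(X − X̄)² · Σ(Y − Ȳ)²) and the test statistic t = r·sqrt((N − 2) / (1 − r²)). With r = 0.50 and N = 12 (so that N − 2 = 10), this reproduces the reported t(obtained) = 1.83. The raw scores and function names are made up.

```python
import math

def pearson_r(x, y):
    """Pearson's r from deviations around the means of X and Y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def t_obtained(r, n):
    """t statistic with N - 2 degrees of freedom (assumed form, see lead-in)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Made-up scores for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)              # 6 / sqrt(60), about 0.77
r_squared = r ** 2               # coefficient of determination, 0.6

# Check against the values reported on the slides.
t_slides = t_obtained(0.50, 12)  # about 1.83, inside ±2.228
r2_slides = 0.50 ** 2            # 0.25, i.e. 25% of the variation in Y
```

Because 1.83 lies between −2.228 and +2.228, it does not fall in the critical region, which matches the slide's decision to fail to reject H0.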
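
Worked sketch for Chapter 14, the correlation matrix: one way to build such a table is to compute Pearson's r for every pair of variables. The variable names and scores below are made up (they are not the GDP and inequality data from the slides), and the helper names are hypothetical. Strength is judged by the absolute value of r, so a strong negative correlation counts as strong.

```python
import math

def pearson_r(x, y):
    """Pearson's r from deviations around the means (needs nonzero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(data):
    """Return a nested dict: matrix[v1][v2] is r between v1 and v2."""
    names = list(data)
    return {v1: {v2: pearson_r(data[v1], data[v2]) for v2 in names}
            for v1 in names}

def strongest_and_weakest(matrix, target):
    """Variables most and least strongly correlated with `target`, by |r|."""
    others = {v: abs(r) for v, r in matrix[target].items() if v != target}
    return max(others, key=others.get), min(others, key=others.get)

# Made-up variables for illustration only.
data = {
    "var_a": [1, 2, 3, 4, 5],
    "var_b": [2, 4, 5, 4, 5],
    "var_c": [5, 4, 3, 2, 1],
}
matrix = correlation_matrix(data)
strongest, weakest = strongest_and_weakest(matrix, "var_a")
# var_c is a perfect negative correlate of var_a (r = -1), so it is the
# strongest; var_b (r about 0.77) is the weakest of the two.
```

Each variable correlates perfectly with itself, so the diagonal of the matrix is always 1 and is ignored when comparing strengths.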
