Three-Way Hierarchical Log-Linear Analysis: Positive Assortative Mating

We shall learn how to do the three-way analysis using data collected at East Carolina University by Jay Gammon. He was testing the prediction that persons should desire mates that are similar to themselves (should desire "positive assortative mating"). Three of the categorical variables were Religion, Hair Color, and Eye Color. It was also predicted that women would be stronger in their preference for positive assortative mating, so we have a three-way analysis, Self x Mate x Gender.

The Data

The data are in the file Loglin3h.sav, which you can download from my SPSS Data Page. Each row in the data file represents one cell in the 3 x 3 x 2 contingency table, with the freq variable already set as the weighting variable. Variable Relig_S is the participant’s religion, Relig_M is the religion the participant wanted their mate to have, and Gender is the participant’s gender. There are three values of Religion: Catholic, Protestant, and None. To avoid very small expected frequencies I excluded data from participants who indicated that they were, or desired their mate to be, Jewish or of Eastern religion.

The SPSS Actions to Do the Analysis

1. Analyze, Loglinear, Model Selection. Select "Use Backward Elimination." Move Relig_S, Relig_M, and Gender into the Factors box, defining the range for the Religion variables as 1,3 and for Gender as 1,2. Click Model and select Saturated. Click Options and select Display Frequencies, Residuals, Parameter Estimates, and Association Table. Let Delta be .5 (there are cells with zero frequencies). Click "OK" to run the analysis.
2. Analyze, Descriptive Statistics, Crosstabs. Move Relig_S into the Rows box and Relig_M into the Columns box. Click Statistics and select Chi-square. Click Cells and select Observed, Expected, and Row Percentages.
3. Analyze, Descriptive Statistics, Crosstabs. Keep Relig_S in the Rows box, but replace Relig_M with Gender in the Columns box.
In Statistics keep Chi-square. In Cells keep Observed, but replace Row Percentages with Column Percentages.

The Saturated Model

The first model evaluated is the saturated model, which includes all effects and thus perfectly fits the data. The “Tests that K-Way ……” show us that we could delete the three-way interaction with little effect, but that dropping all of the two-way interactions would significantly reduce the goodness of fit between model and data. The “Tests of PARTIAL associations” (which are adjusted for all other effects in the model) indicate a significant association between respondents' own religion and that desired in their mates, as well as main effects of all variables. “Estimates for Parameters” shows high values for the four parameters (one per df) in the Relig_S x Relig_M effect. The parameter estimates for the main effects reflect the fact that we did not have equal numbers of respondents in the Catholic, Protestant, and None groups and that we had more female respondents than male respondents.

Backward Elimination

After evaluating the full model, HILOGLINEAR attempts to remove effects, starting with the highest-order effect. It removes that effect and tests whether the removal significantly (.05 default alpha) increased the χ² (made the fit worse). If removing that effect has no significant effect, then that effect remains out and HILOGLINEAR moves down to the next level (in our case, the 2-way effects). The effect at that level whose removal would least increase the χ² is removed, unless its removal would cause a significant increase of the χ². Then the next least important effect at that level is evaluated for removal, etc., etc., until all effects at a level are retained or all effects have been tested and removed. When an interaction effect is retained, lower-order effects for all combinations of the variables in the higher-order effect must also be retained.
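
The hierarchy rule just stated can be made concrete by listing every lower-order effect that a retained term implies. What follows is a minimal Python sketch, not a description of how HILOGLINEAR is implemented: it assumes a model is described by its highest-order terms, and the function name `implied_effects` is hypothetical.

```python
from itertools import combinations


def implied_effects(generating_class):
    """Return all effects contained in a hierarchical model.

    generating_class: iterable of tuples of variable names, one tuple per
    highest-order term retained in the model (hypothetical representation).
    """
    effects = set()
    for term in generating_class:
        variables = sorted(term)  # canonical order so duplicates collapse
        for order in range(1, len(variables) + 1):
            for subset in combinations(variables, order):
                effects.add(subset)
    return effects


# The saturated model has one generating term: the three-way interaction.
saturated = implied_effects([("Relig_S", "Relig_M", "Gender")])
for effect in sorted(saturated, key=lambda e: (len(e), e)):
    print(" x ".join(effect))
```

For the saturated model the sketch lists the three main effects, the three two-way interactions, and the three-way interaction, which is why that model "includes all effects."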
With luck (not always) this method will lead you to the same final model that the tests of partial associations would suggest.

With our data, deleting the triple interaction does not significantly increase the goodness-of-fit χ², so the triple interaction is removed. The two-way effects are then evaluated. Deletion of Relig_S x Gender or Relig_M x Gender would not have a significant effect, so Relig_M x Gender (which produces the smaller change) is deleted. Note that with Relig_M x Gender out, Relig_S x Gender is now significant, so the backwards elimination stops. We are left with a model which contains Relig_S x Relig_M and Relig_S x Gender. Since a hierarchical analysis always retains lower-order effects contained within retained higher-order effects, our model also includes the main effects of Relig_S, Relig_M, and Gender. The model fits the data well -- the goodness-of-fit chi-square has a nice, high p of .954, and all of the residuals are small.

Crosstabs was used to obtain unpartialled tests of the two-way associations that our hierarchical analysis indicated were significant. Crosstabs' tests totally ignore variables not included in the effect being tested. For the Relig_S x Relig_M analysis we obtain an enormous χ².

Positive Assortative Mating on the Main Diagonal

Our research hypothesis was that individuals would prefer to mate with others similar to themselves (in this case, of the same religion). Look at the main diagonal (upper left cell, middle cell, lower right cell) of the Relig_S x Relig_M table. Most of the counts are in that diagonal, which represents individuals who want mates of the same religion as their own. If we sum the counts on the main diagonal, we see that 185 (or 92.5%) of our respondents said they want their mates to be of the same religion as their religion. How many respondents would we expect to be on this main diagonal if there was no correlation between Relig_S and Relig_M?
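
One way to work through this question by hand is to compute, for each diagonal cell, row total x column total / N, and add the three results. The Python sketch below is only an illustration and is not part of the SPSS or SAS analysis. It assumes the cell counts are the values in the SAS CATMOD data step of this handout with the 0.5 delta subtracted, and it uses the numeric codes 1 to 3 without assuming which religion each code stands for.

```python
# Cell counts keyed by (Relig_Self, Relig_Mate, Sex).
# ASSUMPTION: taken from the handout's SAS CATMOD data step after
# subtracting the 0.5 delta that was added to every cell.
counts = {
    (1, 1, 1): 20, (1, 1, 2): 7,
    (1, 2, 1): 0,  (1, 2, 2): 0,
    (1, 3, 1): 1,  (1, 3, 2): 1,
    (2, 1, 1): 1,  (2, 1, 2): 1,
    (2, 2, 1): 86, (2, 2, 2): 49,
    (2, 3, 1): 3,  (2, 3, 2): 2,
    (3, 1, 1): 0,  (3, 1, 2): 1,
    (3, 2, 1): 2,  (3, 2, 2): 3,
    (3, 3, 1): 8,  (3, 3, 2): 15,
}
levels = (1, 2, 3)

# Collapse over Sex to get the two-way Self x Mate table.
table = {(s, m): 0 for s in levels for m in levels}
for (s, m, _sex), freq in counts.items():
    table[(s, m)] += freq

n = sum(table.values())
row_totals = {s: sum(table[(s, m)] for m in levels) for s in levels}
col_totals = {m: sum(table[(s, m)] for s in levels) for m in levels}

# Observed diagonal count, and the count expected under independence.
observed_diag = sum(table[(k, k)] for k in levels)
expected_diag = sum(row_totals[k] * col_totals[k] / n for k in levels)

print("N =", n)
print("Observed on main diagonal:", observed_diag)
print("Expected on main diagonal under independence:", round(expected_diag, 1))
```

With these counts the sketch gives 185 observed and about 108.1 expected on the diagonal out of 200 respondents, in line with the figures quoted in this handout.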
The answer to that question is simple: Just sum the expected frequencies in that main diagonal -- in the absence of any correlation, we expect 108 (54%) of our respondents to be on that main diagonal.

Now we can employ an exact binomial test of the null hypothesis that the proportion of respondents desiring mates with the same religion as their own is what would be expected given independence of self religion and ideal mate religion (binomial p = .54). The one-tailed p is the P(Y ≥ 185 | n = 200, p = .54). Back in PSYC 6430 you learned how to use SAS to get binomial probabilities. In the little program below, I obtained the P(Y ≤ 184 | n = 200, p = .54), subtracted that from 1 to get the P(Y ≥ 185 | n = 200, p = .54), and then doubled that to get the two-tailed significance level. The SAS output shows that p < .001.

    data p;
      LE184 = PROBBNML(.54, 200, 184);
      GE185 = 1 - LE184;
      p = 2*GE185;
    proc print;
    run;

    Obs    LE184    GE185    p
      1        1        0    0

You can also use SPSS to get an exact binomial probability. See my document Obtaining Significance Levels with SPSS.

The unpartialled χ² on Relig_S x Gender is also significant. The column percentages in the table make it fairly clear that this effect is due to men being much more likely than women to have no religion.

SAS Catmod

    options pageno=min nodate formdlim='-';
    data Religion;
      input Relig_Self Relig_Mate Sex count;
    cards;
    1 1 1 20.5
    1 1 2 7.5
    1 2 1 0.5
    1 2 2 0.5
    1 3 1 1.5
    1 3 2 1.5
    2 1 1 1.5
    2 1 2 1.5
    2 2 1 86.5
    2 2 2 49.5
    2 3 1 3.5
    2 3 2 2.5
    3 1 1 0.5
    3 1 2 1.5
    3 2 1 2.5
    3 2 2 3.5
    3 3 1 8.5
    3 3 2 15.5
    proc catmod;
      weight count;
      model Relig_Self*Relig_Mate*Sex = _response_;
      Loglin Relig_Self|Relig_Mate|Sex;
    run;

Submit this code to obtain the analysis of the saturated model in SAS.

Karl L. Wuensch, Dept. of Psychology, East Carolina University, Greenville, NC 27858 USA
March, 2012

Links

Return to Wuensch’s Statistics Lessons Page
Download the SPSS Output
Log-Linear Contingency Table Analysis, Two-Way
Three-Way Nonhierarchical Log-Linear Analysis: Escalators and Obesity
Four Variable LOGIT Analysis: The 1989 Sexual Harassment Study
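
The exact binomial test described in the section on the main diagonal can also be cross-checked outside SAS and SPSS. The Python sketch below is an illustration under stated assumptions, not the author's procedure: the function name `binom_upper_tail` is hypothetical, and it sums the binomial probabilities for Y = 185 through 200 directly, using only the standard library (`math.comb`, Python 3.8 or later). The doubling for a two-tailed p follows the handout's SAS program.

```python
from math import comb


def binom_upper_tail(k, n, p):
    """P(Y >= k) for Y ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, y) * p**y * (1 - p) ** (n - y) for y in range(k, n + 1))


# Values from the handout: 185 of 200 on the diagonal, null proportion .54.
upper = binom_upper_tail(185, 200, 0.54)  # one-tailed P(Y >= 185)
two_tailed = 2 * upper                    # doubled, as in the SAS program

print("One-tailed p:", upper)
print("Two-tailed p:", two_tailed)
print("p < .001:", two_tailed < 0.001)
```

Summing the upper tail directly, rather than computing 1 minus the lower tail, is a deliberate choice here: the tail probability is so small that the subtraction rounds to 0 in double precision, which is consistent with the printed SAS values of 1, 0, and 0. Either route supports the conclusion that p < .001.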