11/27/6 Estimating and Comparing Means

Estimating and Comparing Means

SEM has been primarily used as a set of tools to investigate relationships. As we know from our study of regression analysis, relationships can be used to compare means. So, since SEM is a generalization of regression techniques, SEM can also be used to compare means through relationships.

The relationship method of comparing means

Model: G:\MdbT\P595\AmosMeansEstimated\IndtExample
Data: G:\MdbT\P595\AmosMeansEstimated\IndtExample.sav

Simplest Example: Comparing two independent means.

Mean performance of Men vs. Women in an intro statistics course. Men are Group=1, Women are Group=2.

[Figure: Scatterplot of the relationship.]

[Figure: Data matrix.]

The SEM

[Figure: Raw coefficients. Annotations: variance of Group scores; raw regression coefficient (the difference between means when there are only two groups); variance of residuals.]

[Figure: Standardized coefficients. Annotations: standardized regression coefficient (r in simple regression); r-squared.]

FYI- The critical ratio for the regression coefficient is -.55, p = .583. So performance is not related to Gender. That is, there is no significant difference between Group means.

The “relationship” method for comparing means, taken from regression analysis, is also called the “Covariate” method. It’s called this because the variable representing group membership is treated as a predictor, or “covariate” in SEM terminology. Personally, I think the term “relationship” is a better description.

The Separate Groups method.

Model: G:\MdbT\P595\AmosMeansEstimated\IndtExampleSeparateMeans
Data: G:\MdbT\P595\AmosMeansEstimated\IndtExample.sav

Amos and most other SEM programs provide another way of comparing means.

Overview . . .

1) A multiple-group model is created, in which a path diagram is specified separately for each group.
2) A general model is applied in which a separate mean value for each group is estimated.
3) A special model is applied in which the same mean value is estimated for each group.
4) The chi-square difference test between the general and special models is used to test the significance of the difference. If the chi-square is significant, the general model allowing separate means fits significantly better than the special model requiring the means to be equal, leading to the conclusion that the means differ significantly.

Independent Groups t Example: A two-group model comparing Performance between the two groups.

1. With a blank Amos Input field, double-click on “Group number 1”. This opens a “Manage Groups” dialog box.
   a. Type the name of the first group in the box.
   b. For each subsequent group, click on “New” and type the name of the group.
   c. When done specifying group names, click on the “Close” button.
2. File -> Data Files . . . -> File Name. (Follow the steps below carefully.)
   a. Click on [File Name] and identify the data file.
   b. Click on the [Grouping Variable] button and click on the name of the variable identifying the groups.
   c. Click on the [Group Value] button and identify the value for the first group.
   d. Highlight each subsequent group in the Data Files window, and repeat a, b, and c.
3. Draw the path diagram. It will be the same for each group. For an independent samples t, the path diagram will simply be a single rectangle.
4. View -> Analysis Properties -> Estimation -> Estimate Means and Intercepts
5. Right-click on an object -> Object Properties.
   a. Click on the “Parameters” tab and uncheck the [All Groups] button. VERY IMPORTANT.
   b. Click on the “Males” group name.
   c. Enter a name for the mean of the Male group, e.g., malemean.
   d. Click on the “Females” group name, then enter a name for the mean of the Female group, e.g., femmean.
   The path diagram should look like the following:
   [Figure: path diagram.]
6. Save the file.
7. Double-click on “Default Model”.
   a. Type “Means Separate” in the Model Name field.
   b. Click on the New button, then type “Means Equal” in the Model Name field.
   c. Type malemean = femmean in the Parameter Constraints field.
   d. Click on the “Close” button.
8. Run the model.

Output from Amos

Note that the chi-square is not significant, indicating that the means are not significantly different. From the Text output, Chi-square = .260, p = .610. Note that the test was conducted allowing the variances to be unequal.

Why, Why, Why?

Why go to all this trouble to compare two means? The Separate means method was much more difficult to carry out than the Relationship method. Why would anyone do it?

1. The separate means method is much more flexible. It can be conducted assuming variances are equal or assuming they’re unequal.
2. The separate means method extends quite easily (once you get the hang of specifying the models) to multiple groups.

Example 2: Analysis of variance. (This is designed as an in-class exercise.)

Comparing mean performance in P511 for 3 years. Boring example.

SPSS Output. Kept for completeness.

Descriptives: p511g

                                                 95% Confidence Interval for Mean
Year    N    Mean    Std. Deviation  Std. Error  Lower Bound  Upper Bound  Minimum  Maximum
2003    18   .8456   .08298          .01956      .8043        .8868        .70      .98
2004    21   .8529   .08939          .01951      .8122        .8935        .69      1.01
2005    36   .8444   .07696          .01283      .8184        .8705        .65      .99
Total   75   .8471   .08097          .00935      .8284        .8657        .65      1.01

Test of Homogeneity of Variances: p511g

Levene Statistic  df1  df2  Sig.
.401              2    72   .671

ANOVA: p511g

                 Sum of Squares  df  Mean Square  F     Sig.
Between Groups   .001            2   .000         .074  .929
Within Groups    .484            72  .007
Total            .485            74

So, there were no significant differences in mean performance across the 3 years.

The Amos output

The Amos chi-square was .0.1 with df=2, also not significant.

Example 3.
Extending the multiple groups conceptualization to comparison of correlations.

This example is from research we’ve been conducting on respondent inconsistency, the tendency of persons to give different self-reports to items from the same personality dimension. Some persons are quite consistent from item to item within the same personality dimension. Others are more inconsistent, giving different responses to items even though all the items represent the same personality dimension.

Although we’d expect some differences in responses because, after all, the items are different, we’ve found that there are reliable differences in the amount of inconsistency shown by people. If a person is inconsistent in responding to one personality questionnaire, he/she’ll be inconsistent in responding to other personality scales. Inconsistency in self-report appears to be a personality characteristic, one that cuts across different questionnaires. (Need an MS thesis on whether or not it cuts across time periods.)

Comparing correlations across groups.

A. Comparing convergent validities between corresponding measures of the Big 5

From Biderman, M. D., & Reddock, C. M. (2012). The relationship of scale reliability and validity to respondent inconsistency. Personality and Individual Differences, 52, 647-651.

To assess the extent to which convergent validity was related to inconsistency group, correlations between corresponding Big Five dimensions in the two questionnaires not used to define inconsistency were computed. Those correlations are presented in Table 4. Mean convergent validity across the five dimensions between the two questionnaires decreased as average inconsistency defined using Questionnaire 1 increased. The statistical significance of differences across groups was assessed using multigroup correlation analyses in Amos (Arbuckle, 1983/2010). Specifically a three-group model was created for Questionnaire 2 and Mini-Marker domain scores in which covariances between all domain scores in the two questionnaires were estimated separately for each group. This was a completely saturated model with degrees of freedom equal to 0. Then a second model was applied with covariances between the three groups constrained to equality. This constraint created a special model with 10 degrees of freedom. The chi-square difference between the two models was 20.30 (p < .05), suggesting that the covariances between corresponding domain scores were related to inconsistency. (phrase in red was omitted from the MS.)

The convergent validity of Big 5 dimensions measured using two different questionnaires was compared across 3 groups differing in Inconsistency of responding. The hypothesis was that convergent validity would be greatest for less inconsistent responders and least for most inconsistent responders.

The Amos file is “MDBR\1BalancedScale\Inconsistency II\Inconsistency II Amos\Table 5 convergent validity Revised 3.amw”

The Input model

[Figure: the input model.]

The parameters circled in red are the convergent validities. It was assumed that they would be largest for inconsistency group 1 (least inconsistent respondents) and smallest for group 3.

The constrained model was

[Figure: the constrained model.]

Group 1 results. Mean of convergent validities = .784.

Group 2 results. Mean of convergent validities = .720.

Group 3 results. Mean of convergent validities = .680.

Convergent Validities constrained model. Chi-square difference (10 df) = 20.81. p < .05.

B. Comparing criterion-related validities between corresponding measures of the Big 5

From the article . . .
Criterion-related validity defined as correlations of GPA with measures of conscientiousness from Big Five Questionnaire 2 and the Mini-Marker questionnaire were computed for each group. Table 5 presents the validity coefficients. Inspection of the table shows that validity was roughly the same for both the most consistent group and the middle group but fell off dramatically for the most inconsistent group for both scales. To provide some evidence of the significance of differences across inconsistency groups, two multigroup regression models were formed using Amos. In the first, GPA was regressed simultaneously onto Questionnaire 2 and Mini-Marker Conscientiousness domain scores allowing regression coefficients and variances to be unique within each inconsistency group, creating a saturated model. In the second, restricted model, variances and regression weights were constrained to be equal across groups, yielding eight degrees of freedom. The chi-square difference statistic was 18.10 (p < .05), suggesting that criterion-related validity was related to inconsistency.

For this analysis, the criterion-related validities of two measures of conscientiousness were compared across three groups defined by inconsistency of respondents.

The input model is

[Figure: the input model.]

The constraints are

[Figure: the parameter constraints.]

Note that the variances of the predictors are constrained to be equal across inconsistency groups, as are the unstandardized regression coefficients (slopes).

The models

Group 1 results (p values not printed because Chi-square = 0 and p is undefined.)

Group 2 results

Group 3 results. (Worst validity.)

The constrained parameters result

Constraining the parameters to be equal across groups resulted in a significantly poorer fitting model, suggesting that there were differences in criterion-related validity across groups. (Although I have to admit, the differences were not huge.)
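
Every comparison in these notes comes down to the same step: take the chi-square difference between the general and the constrained model and look up its p value on the difference in degrees of freedom. The sketch below shows that lookup for the chi-square values reported above. It is only an illustration in plain Python, not what Amos does internally. The names chi_square_sf and reported are made up for this example. The df of 1 for the two-means test is an assumption (one equality constraint, malemean = femmean); the other df values are the ones stated in the notes.

```python
import math


def chi_square_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df.

    Uses the closed-form series for integer degrees of freedom
    (Abramowitz & Stegun 26.4.4 and 26.4.5), so only the standard library is needed.
    """
    if x < 0 or df < 1:
        raise ValueError("x must be >= 0 and df must be a positive integer")
    if x == 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term = 1.0
        total = 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_r x^(r-1) / (1*3*...*(2r-1))
    total = 0.0
    term = 1.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total


# Chi-square differences reported in these notes. When the general model is
# saturated (chi-square = 0, df = 0), the difference is simply the constrained
# model's chi-square on its own df.
reported = [
    ("Two means, separate groups method (df = 1 assumed)", 0.260, 1),
    ("Convergent validities, value quoted from the article", 20.30, 10),
    ("Convergent validities, value from the Amos output", 20.81, 10),
    ("Criterion-related validities", 18.10, 8),
]

for label, chi_sq_diff, df_diff in reported:
    p = chi_square_sf(chi_sq_diff, df_diff)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{label}: chi-square = {chi_sq_diff}, df = {df_diff}, p = {p:.3f} ({verdict} at .05)")
```

To check a comparison of your own, call chi_square_sf with the chi-square difference and the df difference between your constrained and general models.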