MRA Part 3-Applications (2007)

Multiple Regression Analysis: Part 3
Use of categorical variables in MRA

Changing Gears

- What if we wish to include categorical variables in our regression equation?
- For instance, we have two categorical variables (say, gender and ethnic group) and two continuous variables (say, reading comprehension and visual processing speed) to predict performance.

Regression and Mean Comparisons

- Independent samples t-test: comparing two means
- Tests the null hypothesis of…
- Accomplished via the usual t-statistic formula:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$$

           scores                  mean   std. dev.
group 1    5 5 4 8 6 2 5 6 5 4     5.00   1.56
group 2    8 9 4 7 7 8 10 9 9 9    8.00   1.70

Calculating t…

Pooled variance estimate:

$$s_p^2 = \frac{9(1.56^2) + 9(1.70^2)}{10 + 10 - 2} = 2.667$$

t-ratio:

$$t = \frac{5 - 8}{\sqrt{\dfrac{2.667}{10} + \dfrac{2.667}{10}}} = \frac{-3}{0.7303} = -4.108$$

Effect size estimate (variance accounted for):

$$r^2 = \frac{t^2}{t^2 + df} = \frac{4.108^2}{4.108^2 + 18} = 0.484$$

Our usual approach in SPSS

- Code groups & enter associated score
- Run an independent samples t-test to get the following…

gpid     1 1 1 1 1 1 1 1 1 1    2 2 2 2 2 2 2 2 2 2
score    5 5 4 8 6 2 5 6 5 4    8 9 4 7 7 8 10 9 9 9

t-test for Equality of Means
                                                                       95% CI of difference
t         df    Sig. (2-tailed)   Mean Difference   Std. Error Difference   Lower     Upper
-4.108    18    0.001             -3                0.730                   -4.534    -1.466

What if we coded groups as 0's & 1's?

- Could construct a Point-Biserial correlation:

$$r_{pb} = \frac{(M_{Y1} - M_{Y0})\sqrt{pq}}{\sigma_y}$$

$$r_{pb} = \frac{(3)(.5)}{2.156} = 0.696$$

Can you guess what $0.696^2$ equals?

gpid     1 1 1 1 1 1 1 1 1 1    2 2 2 2 2 2 2 2 2 2
score    5 5 4 8 6 2 5 6 5 4    8 9 4 7 7 8 10 9 9 9
dc       0 0 0 0 0 0 0 0 0 0    1 1 1 1 1 1 1 1 1 1

Pearson, r_pb and regression

- Taking advantage of the fact that r_pb is merely a Pearson product-moment correlation in disguise…
- Let's regress y onto our binary variable to get the following:

Model Summary
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .696a   .484       .455                1.63299
a. Predictors: (Constant), dummyc

Coefficients(a)
                   Unstandardized Coefficients   Standardized Coefficients
Model 1            B        Std. Error           Beta                        t       Sig.
(Constant)         5.000    .516                                             9.682   .000
dummyc             3.000    .730                 .696                        4.108   .001
a. Dependent Variable: score

Do these look familiar?

This can be greatly expanded…

- ANOVA can be run using…
  - Dummy Coding
  - Effect Coding
  - Orthogonal Coding
- Multiple categories can be modeled
- N-Way designs can be accommodated
- ANCOVA
- Repeated Measures

Adding a 3rd group…

          Method 1   Method 2   Method 3
          3          4          6
          5          4          7
          2          3          8
          4          8          6
          8          7          7
          4          4          9
          3          2          10
          9          5          9
Means:    4.75       4.625      7.75

Dummy Coding

- Series of binary variables
- One group receives all zeros
- Other two groups are differentiated by 1's
- Characteristics…

Method   Score   d1   d2
1        3       0    0
1        5       0    0
…        …       0    0
1        9       0    0
2        4       0    1
2        4       0    1
…        …       0    1
2        5       0    1
3        6       1    0
3        7       1    0
…        …       1    0
3        9       1    0

Submitting this to an MRA…

ANOVA(b)
Model 1       Sum of Squares   df   Mean Square   F       Sig.
Regression    50.083           2    25.042        6.053   .008a
Residual      86.875           21   4.137
Total         136.958          23
a. Predictors: (Constant), Dummy Code separating 3 from 1 & 2, Dummy Code separating 2 from 1 & 3
b. Dependent Variable: Exam Score

Coefficients(a)
                                      Unstandardized Coefficients   Standardized Coefficients
Model 1                               B        Std. Error           Beta                        t       Sig.
(Constant)                            4.750    .719                                             6.605   .000
Dummy Code separating 2 from 1 & 3    -.125    1.017                -.025                       -.123   .903
Dummy Code separating 3 from 1 & 2    3.000    1.017                .592                        2.950   .008
a. Dependent Variable: Exam Score

Comparing group means

- All groups explicitly compared to the "uncoded" group.
- To make other comparisons, either
  1. re-run the analysis with a different coding scheme, or
  2. use the following equation:

$$t = \frac{b_i - b_j}{SE_{y - y'}\sqrt{\dfrac{1}{n_i} + \dfrac{1}{n_j}}}$$

- Concerns over type 1 error in such comparisons remain

Effect Coding

- Series of variables (vectors) having values of -1, 0, 1
- One group receives all -1's
- Other two groups differentiated by 0's & 1's
- Characteristics…

Method   Score   e1   e2
1        3       1    0
1        5       1    0
…        …       1    0
1        9       1    0
2        4       0    1
2        4       0    1
…        …       0    1
2        5       0    1
3        6       -1   -1
3        7       -1   -1
…        …       -1   -1
3        9       -1   -1

Solution

Coefficients(a)
              Unstandardized Coefficients   Standardized Coefficients
Model 1       B         Std. Error          Beta                        t        Sig.
(Constant)    5.708     .415                                            13.749   .000
e1            -.958     .587                -.328                       -1.632   .118
e2            -1.083    .587                -.370                       -1.845   .079
a. Dependent Variable: Exam Score

Characteristics:

- Intercept = __________
- -0.958 represents ___________
- -1.083 represents ____________

Recovering cell means

Y' = 5.708 + e1(-0.958) + e2(-1.083)

Thus, someone in group 1…
Y' = 5.708 + 1(-0.958) + 0(-1.083) = 4.75

For group 2…
Y' = 5.708 + 0(-0.958) + 1(-1.083) = 4.625

For group 3…
Y' = 5.708 + (-1)(-0.958) + (-1)(-1.083) = 7.749

Recovering the missing coefficients

- If e1 gives us the effect for being in group 1, and…
- e2 gives us the effect for being in group 2…
- What is the effect for being in group 3?
- How do we get it?
  - Method 1: recode the cells so that a different cell gets all -1's.
  - Method 2: take advantage of the fact that all b's must sum to zero*.
Thus…

e3 + (-0.958) + (-1.083) = 0
e3 = 0.958 + 1.083 = 2.041

Unequal group sizes due to population differences

- Unequal group sizes
  - May represent attrition in the study, or other problems
  - May reflect existing group size differences in the population
- If we wish to preserve information about unequal population sizes…
  - Use weighted effect coding
  - Instead of one group getting all -1's…
    - The group gets a weighted code of -n2/n1, where n1 is the size of the "baseline" group and n2 is the size of the group identified by the vector

Two-Way ANOVA Revisited

                             Factor B: Anxiety Level
Factor A: Task Difficulty    Low          Medium       High
Easy                         3 1 1 6 4    2 5 9 7 7    9 9 13 6 8
Difficult                    0 2 0 0 3    3 8 3 3 3    0 0 0 5 0

Recall the cell means

                             Factor B: Anxiety Level
Factor A: Task Difficulty    Low     Medium   High    Total
Easy                         3.00    6.00     9.00    6.00
Difficult                    1.00    4.00     1.00    2.00
Total                        2.00    5.00     5.00    4.00

Dummy Coding / Effect Coding

Dummy coding

- Task difficulty: Easy = 0, Difficult = 1
- Anxiety
  - Vector 1: Low = 0, Medium = 1, High = 0
  - Vector 2: Low = 0, Medium = 0, High = 1
- Vector 3 = TD x Vector 1
- Vector 4 = TD x Vector 2

Effect coding

- Task difficulty: Easy = -1, Difficult = 1
- Anxiety
  - Vector 1: Low = 1, Medium = 0, High = -1
  - Vector 2: Low = 0, Medium = 1, High = -1
- Vector 3 = TD x Vector 1
- Vector 4 = TD x Vector 2

Results using dummy coding…

Coefficients(a)
                               Unstandardized Coefficients   Standardized Coefficients
Model 1                        B           Std. Error        Beta                        t        Sig.
(Constant)                     3.000       1.000                                         3.000    .006
Dummy Code for difficulty      -2.000      1.414             -.289                       -1.414   .170
Dummy Code 1 for anxiety       3.000       1.414             .408                        2.121    .044
Dummy Code 2 for anxiety       6.000       1.414             .816                        4.243    .000
Dummy Code Interaction 1       -2.4E-015   2.000             .000                        .000     1.000
Dummy Code Interaction 2       -6.000      2.000             -.645                       -3.000   .006
a. Dependent Variable: Performance

Results of Effect Coding

ANOVA(b)
Model 1       Sum of Squares   df   Mean Square   F       Sig.
Regression    240.000          5    48.000        9.600   .000a
Residual      120.000          24   5.000
Total         360.000          29
a. Predictors: (Constant), int2, ecanx2, ecdiff, int1, ecanx1
b. Dependent Variable: Performance

Coefficients(a)
              Unstandardized Coefficients   Standardized Coefficients
Model 1       B         Std. Error          Beta                        t        Sig.
(Constant)    4.000     .408                                            9.798    .000
ecdiff        -2.000    .408                -.577                       -4.899   .000
ecanx1        -2.000    .577                -.471                       -3.464   .002
ecanx2        1.000     .577                .236                        1.732    .096
int1          1.000     .577                .236                        1.732    .096
int2          1.000     .577                .236                        1.732    .096
a. Dependent Variable: Performance

Combining Categorical and Continuous Variables

- Type of Treatment by Pre-treatment functioning to predict Outcome
- Race/Ethnicity by Attitudes toward health care to predict wellness visits
- Recent vs. non-recent hire by openness to new experience to predict likelihood of change

Example: Workplace Deviance & Moral Reasoning

- Research Question: will scores on a moral reasoning measure that reflect "maintaining norms" (~Kohlberg's conventional level) interact with organizational injustice to produce workplace deviance?
- Continuous Measure: Maintaining Norms
- Categorical Measure: High vs. Low Organizational Injustice

Results

Coefficients(a)
              Unstandardized Coefficients   Standardized Coefficients                      Collinearity Statistics
Model 1       B        Std. Error           Beta                        t        Sig.      Tolerance   VIF
(Constant)    2.569    .144                                             17.828   .000
Condition     .391     .205                 .186                        1.909    .059      .998        1.002
MNxCon        -.100    .014                 -.007                       -7.143   .001      .475        2.107
MNCENT        -.019    .010                 -.266                       -1.886   .062      .475        2.105
a. Dependent Variable: Average Workplace Deviance Score

ANOVA(b)
Model 1       Sum of Squares   df   Mean Square   F       Sig.
Regression    11.491           3    3.830         3.689   .015a
Residual      98.641           95   1.038
Total         110.132          98
a. Predictors: (Constant), MNCENT, Condition, MNxCon
b. Dependent Variable: Average Workplace Deviance Score

R² = .104

Note: data based on an actual study; the interaction effect is manufactured for the purpose of illustration.
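To see what a dummy x continuous interaction implies, it can help to turn the reported coefficients into predicted values for each group. The sketch below is only an illustration: it takes the B values from the Results table at face value, and the "low" and "high" points of -10 and +10 on the centered Maintaining Norms score are hypothetical, since the slides do not report its scale. Which injustice level is coded 1 is also not stated.

```python
# Unstandardized coefficients as reported in the Results table.
B_CONSTANT = 2.569
B_CONDITION = 0.391  # dummy code for injustice condition (0 or 1)
B_MNXCON = -0.100    # product term: centered Maintaining Norms x Condition
B_MNCENT = -0.019    # centered Maintaining Norms score


def predicted_deviance(mn_centered, condition):
    """Predicted workplace deviance from the fitted regression equation."""
    return (B_CONSTANT
            + B_CONDITION * condition
            + B_MNCENT * mn_centered
            + B_MNXCON * mn_centered * condition)


def simple_slope(condition):
    """Slope of deviance on Maintaining Norms within one dummy-coded group."""
    return B_MNCENT + B_MNXCON * condition


# Hypothetical low/high sample points (an assumption, not from the slides).
LOW, HIGH = -10.0, 10.0

for condition in (0, 1):
    print(f"dummy code={condition}: slope={simple_slope(condition):.3f}, "
          f"low={predicted_deviance(LOW, condition):.3f}, "
          f"high={predicted_deviance(HIGH, condition):.3f}")
```

Under these assumptions the slope for the group coded 0 is just the MNCENT coefficient, while the slope for the group coded 1 adds the product-term coefficient, which is why the two lines are not parallel.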
Graph of Interaction

[Figure: Interaction Chart for Dummy x Continuous Variable Interaction. Y-axis: Predicted Value (0 to 6). X-axis: Two Sample Points (Low, High). Separate lines for dummy code=0 and dummy code=1.]

And then there's contrast coding

- Recall our one-way teaching example
- Two orthogonal contrasts:
  - M1 & M2 vs. M3
  - M1 vs. M2

          Method 1   Method 2   Method 3
          3          4          6
          5          4          7
          2          3          8
          4          8          6
          8          7          7
          4          4          9
          3          2          10
          9          5          9
Means:    4.75       4.625      7.75

Accomplished by Contrast Coding

- C1: -0.5M1 + -0.5M2 + 1M3
- C2: 1M1 + -1M2 + 0M3

ANOVA(b)
Model 1       Sum of Squares   df   Mean Square   F       Sig.
Regression    50.083           2    25.042        6.053   .008a
Residual      86.875           21   4.137
Total         136.958          23
a. Predictors: (Constant), Orthogonal comparison 2 (g1=g2), Orthogonal comparison 1 .5(1)+.5(2)=3
b. Dependent Variable: Exam Score

Coefficients(a)
                                          Unstandardized Coefficients   Standardized Coefficients
Model 1                                   B        Std. Error           Beta                        t        Sig.
(Constant)                                5.708    .415                                             13.749   .000
Orthogonal comparison 1 .5(1)+.5(2)=3     2.042    .587                 .604                        3.477    .002
Orthogonal comparison 2 (g1=g2)           .063     .508                 .021                        .123     .903

                                          95% Confidence Interval for B   Correlations                   Collinearity Statistics
Model 1                                   Lower Bound   Upper Bound       Zero-order   Partial   Part    Tolerance   VIF
(Constant)                                4.845         6.572
Orthogonal comparison 1 .5(1)+.5(2)=3     .821          3.263             .604         .604      .604    1.000       1.000
Orthogonal comparison 2 (g1=g2)           -.995         1.120             .021         .027      .021    1.000       1.000
a. Dependent Variable: Exam Score
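Because the one-way teaching example is small, the three coding schemes in this deck can be checked side by side with an ordinary least-squares fit. The following is a minimal sketch using NumPy rather than SPSS; the names `scores`, `schemes`, and `fit` are made up for this illustration. It fits the same 24 exam scores under dummy, effect, and contrast coding and prints the intercept and the two slopes for each.

```python
import numpy as np

# Exam scores from the three-method teaching example (8 per method).
scores = {
    1: [3, 5, 2, 4, 8, 4, 3, 9],
    2: [4, 4, 3, 8, 7, 4, 2, 5],
    3: [6, 7, 8, 6, 7, 9, 10, 9],
}

# Each scheme maps a method to its pair of codes, as laid out on the slides.
schemes = {
    "dummy":    {1: (0, 0),    2: (0, 1),     3: (1, 0)},    # d1, d2
    "effect":   {1: (1, 0),    2: (0, 1),     3: (-1, -1)},  # e1, e2
    "contrast": {1: (-0.5, 1), 2: (-0.5, -1), 3: (1, 0)},    # C1, C2
}


def fit(codes):
    """Least-squares fit of score on an intercept plus two code vectors."""
    rows, y = [], []
    for method, values in scores.items():
        for value in values:
            rows.append((1.0, *codes[method]))
            y.append(value)
    X = np.array(rows, dtype=float)
    b, *_ = np.linalg.lstsq(X, np.array(y, dtype=float), rcond=None)
    return b  # [intercept, first vector, second vector]


coefs = {name: fit(codes) for name, codes in schemes.items()}
for name, b in coefs.items():
    print(f"{name:9s}", np.round(b, 3))
```

All three schemes span the same group-membership information, so they give the same fitted group means and the same regression sum of squares; only the meaning of the individual coefficients changes.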
