ANCOVA Examples Using SPSS

IMPORT FILE='e:\510\data\htwt.por'.
DESCRIPTIVES VARIABLES=age height weight
  /STATISTICS=MEAN STDDEV MIN MAX .

Descriptive Statistics

                      N     Minimum   Maximum   Mean       Std. Deviation
age                   237   13.90     25.00     16.4430    1.84258
height                237   50.50     72.00     61.3646    3.94540
weight                237   50.50     171.50    101.3080   19.44070
Valid N (listwise)    237

/*Create new variables and interactions*/
RECODE sex ('f'=1) ('m'=0) INTO female .
EXECUTE .
VALUE LABELS female 1 '1:Female' 0 '0:Male'.
COMPUTE centage = age - 16.5.
COMPUTE fem_age = female * age.
COMPUTE fem_centage = female * centage.
EXECUTE.

/*Select Cases with Age < 19*/
USE ALL.
COMPUTE filter_$=(age < 19).
VARIABLE LABEL filter_$ 'age < 19 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE .

/*Scatter Plot of Height vs Age for Males and Females*/
GRAPH
  /SCATTERPLOT(BIVAR)= age WITH height BY sex
  /MISSING=LISTWISE .

[Figure: scatterplot of height (vertical axis, 50.00 to 75.00) against age (horizontal axis, 13.00 to 19.00), with separate markers and fit lines for sex = f and sex = m. Fit-line annotations: R Sq Linear = 0.547 and R Sq Linear = 0.291.]

REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT height
  /METHOD=ENTER female age fem_age
  /SCATTERPLOT=(*SDRESID ,*ZPRED )
  /RESIDUALS HIST(ZRESID) NORM(ZRESID) .

Regression with Female, Age, and Interaction Term: fem_age

Variables Entered/Removed(b)

Model   Variables Entered          Variables Removed   Method
1       fem_age, age, female(a)    .                   Enter

a. All requested variables entered.
b. Dependent Variable: height

Model Summary(b)

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .678(a)   .460       .452                2.79947

a. Predictors: (Constant), fem_age, age, female
b. Dependent Variable: height

ANOVA(b)

Model 1      Sum of Squares   df    Mean Square   F        Sig.
Regression   1432.638         3     477.546       60.935   .000(a)
Residual     1684.957         215   7.837
Total        3117.595         218

a. Predictors: (Constant), fem_age, age, female
b. Dependent Variable: height

Coefficients(a)

             Unstandardized Coefficients   Standardized Coefficients
Model 1      B         Std. Error          Beta                        t        Sig.
(Constant)   28.883    2.873                                           10.052   .000
female       13.612    4.019               1.801                       3.387    .001
age          2.031     .178                .822                        11.435   .000
fem_age      -.929     .248                -2.008                      -3.750   .000

a. Dependent Variable: height

/*ANCOVA with FEMALE, CENTAGE, FEM_CENTAGE interaction (Centered Age)*/
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT height
  /METHOD=ENTER female centage fem_centage
  /SCATTERPLOT=(*SDRESID ,*ZPRED )
  /RESIDUALS HIST(ZRESID) NORM(ZRESID) .

Variables Entered/Removed(b)

Model   Variables Entered                  Variables Removed   Method
1       fem_centage, female, centage(a)    .                   Enter

a. All requested variables entered.
b. Dependent Variable: height

Model Summary(b)

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .678(a)   .460       .452                2.79947

a. Predictors: (Constant), fem_centage, female, centage
b. Dependent Variable: height

ANOVA(b)

Model 1      Sum of Squares   df    Mean Square   F        Sig.
Regression   1432.638         3     477.546       60.935   .000(a)
Residual     1684.957         215   7.837
Total        3117.595         218

a. Predictors: (Constant), fem_centage, female, centage
b. Dependent Variable: height

Coefficients(a)

              Unstandardized Coefficients   Standardized Coefficients
Model 1       B         Std. Error          Beta                        t         Sig.
(Constant)    62.399    .269                                            231.949   .000
female        -1.723    .389                -.228                       -4.428    .000
centage       2.031     .178                .822                        11.435    .000
fem_centage   -.929     .248                -.272                       -3.750    .000

a. Dependent Variable: height

/*ANCOVA model using GLM*/
UNIANOVA height BY sex WITH age
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /EMMEANS = TABLES(sex) WITH(age=MEAN) COMPARE ADJ(LSD)
  /PRINT = PARAMETER
  /CRITERIA = ALPHA(.05)
  /DESIGN = sex age age*sex .

Univariate Analysis of Variance: GLM on SEX, Age, and their Interaction

Between-Subjects Factors

         N
sex  f   103
     m   116

Tests of Between-Subjects Effects
Dependent Variable: height

Source            Type III Sum of Squares   df    Mean Square   F         Sig.
Corrected Model   1432.638(a)               3     477.546       60.935    .000
Intercept         2471.764                  1     2471.764      315.396   .000
sex               89.897                    1     89.897        11.471    .001
age               1252.650                  1     1252.650      159.838   .000
sex * age         110.230                   1     110.230       14.065    .000
Error             1684.957                  215   7.837
Total             818138.600                219
Corrected Total   3117.595                  218

a. R Squared = .460 (Adjusted R Squared = .452)

Parameter Estimates
Dependent Variable: height

                                                         95% Confidence Interval
Parameter         B        Std. Error   t        Sig.    Lower Bound   Upper Bound
Intercept         28.883   2.873        10.052   .000    23.219        34.547
[sex=f]           13.612   4.019        3.387    .001    5.690         21.534
[sex=m]           0(a)     .            .        .       .             .
age               2.031    .178         11.435   .000    1.681         2.381
[sex=f] * age     -.929    .248         -3.750   .000    -1.418        -.441
[sex=m] * age     0(a)     .            .        .       .             .

a. This parameter is set to zero because it is redundant.

Estimated Marginal Means: By default compared at Mean of other covariates

Estimates
Dependent Variable: height

                               95% Confidence Interval
sex   Mean        Std. Error   Lower Bound   Upper Bound
f     60.284(a)   .276         59.740        60.828
m     61.677(a)   .260         61.164        62.189

a. Covariates appearing in the model are evaluated at the following values: age = 16.1443.

Pairwise Comparisons
Dependent Variable: height

                                                              95% Confidence Interval for Difference(a)
(I) sex   (J) sex   Mean Difference (I-J)   Std. Error   Sig.(a)   Lower Bound   Upper Bound
f         m         -1.393*                 .379         .000      -2.140        -.645
m         f         1.393*                  .379         .000      .645          2.140

Based on estimated marginal means
*. The mean difference is significant at the .05 level.
a. Adjustment for multiple comparisons: Least Significant Difference (equivalent to no adjustments).

Univariate Tests
Dependent Variable: height

           Sum of Squares   df    Mean Square   F        Sig.
Contrast   105.766          1     105.766       13.496   .000
Error      1684.957         215   7.837

The F tests the effect of sex.
This test is based on the linearly independent pairwise comparisons among the estimated marginal means.

/*ANCOVA model on centered age values using GLM*/
UNIANOVA height BY sex WITH centage
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /EMMEANS = TABLES(sex) WITH(centage=0) COMPARE ADJ(LSD)
  /PRINT = PARAMETER
  /CRITERIA = ALPHA(.05)
  /DESIGN = sex centage centage*sex .

Univariate Analysis of Variance: Analysis on Centered Age. Syntax is modified (EMMEANS subcommand altered to compare means for SEX at Centage=0).

Between-Subjects Factors

         N
sex  f   103
     m   116

Tests of Between-Subjects Effects
Dependent Variable: height

Source            Type III Sum of Squares   df    Mean Square   F          Sig.
Corrected Model   1432.638(a)               3     477.546       60.935     .000
Intercept         783822.643                1     783822.643    100015.5   .000
sex               153.684                   1     153.684       19.610     .000
centage           1252.650                  1     1252.650      159.838    .000
sex * centage     110.230                   1     110.230       14.065     .000
Error             1684.957                  215   7.837
Total             818138.600                219
Corrected Total   3117.595                  218

a. R Squared = .460 (Adjusted R Squared = .452)

Parameter Estimates
Dependent Variable: height

                                                            95% Confidence Interval
Parameter           B        Std. Error   t         Sig.    Lower Bound   Upper Bound
Intercept           62.399   .269         231.949   .000    61.869        62.930
[sex=f]             -1.723   .389         -4.428    .000    -2.490        -.956
[sex=m]             0(a)     .            .         .       .             .
centage             2.031    .178         11.435    .000    1.681         2.381
[sex=f] * centage   -.929    .248         -3.750    .000    -1.418        -.441
[sex=m] * centage   0(a)     .            .         .       .             .

a. This parameter is set to zero because it is redundant.

Estimated Marginal Means: The means for SEX are compared at Centage=0, because syntax was modified

Estimates
Dependent Variable: height

                               95% Confidence Interval
sex   Mean        Std. Error   Lower Bound   Upper Bound
f     60.676(a)   .281         60.122        61.230
m     62.399(a)   .269         61.869        62.930

a. Covariates appearing in the model are evaluated at the following values: centage = .00.

Pairwise Comparisons
Dependent Variable: height

                                                              95% Confidence Interval for Difference(a)
(I) sex   (J) sex   Mean Difference (I-J)   Std. Error   Sig.(a)   Lower Bound   Upper Bound
f         m         -1.723*                 .389         .000      -2.490        -.956
m         f         1.723*                  .389         .000      .956          2.490

Based on estimated marginal means
*. The mean difference is significant at the .05 level.
a. Adjustment for multiple comparisons: Least Significant Difference (equivalent to no adjustments).

Univariate Tests
Dependent Variable: height

           Sum of Squares   df    Mean Square   F        Sig.
Contrast   153.684          1     153.684       19.610   .000
Error      1684.957         215   7.837

The F tests the effect of sex. This test is based on the linearly independent pairwise comparisons among the estimated marginal means.

/*Separate regressions for males and for females*/
SORT CASES BY sex .
SPLIT FILE SEPARATE BY sex .
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT height
  /METHOD=ENTER centage
  /SCATTERPLOT=(*SDRESID ,*ZPRED )
  /RESIDUALS HIST(ZRESID) NORM(ZRESID) .

Regression: Separate Regression Model for each sex.

sex = f

Variables Entered/Removed(b,c)

Model   Variables Entered   Variables Removed   Method
1       centage(a)          .                   Enter

a. All requested variables entered.
b. Dependent Variable: height
c. sex = f

Model Summary(b,c)

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .525(a)   .276       .269                2.87848

a. Predictors: (Constant), centage
b. Dependent Variable: height
c. sex = f

ANOVA(b,c)

Model 1      Sum of Squares   df    Mean Square   F        Sig.
Regression   318.634          1     318.634       38.456   .000(a)
Residual     836.850          101   8.286
Total        1155.484         102

a. Predictors: (Constant), centage
b. Dependent Variable: height
c. sex = f

Coefficients(a,b)

             Unstandardized Coefficients   Standardized Coefficients
Model 1      B         Std. Error          Beta                        t         Sig.
(Constant)   60.676    .289                                            209.845   .000
centage      1.102     .178                .525                        6.201     .000

a. Dependent Variable: height
b. sex = f

sex = m

Variables Entered/Removed(b,c)

Model   Variables Entered   Variables Removed   Method
1       centage(a)          .                   Enter

a. All requested variables entered.
b. Dependent Variable: height
c. sex = m

Model Summary(b,c)

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .740(a)   .547       .543                2.72755

a. Predictors: (Constant), centage
b. Dependent Variable: height
c. sex = m

ANOVA(b,c)

Model 1      Sum of Squares   df    Mean Square   F         Sig.
Regression   1024.779         1     1024.779      137.748   .000(a)
Residual     848.107          114   7.440
Total        1872.886         115

a. Predictors: (Constant), centage
b. Dependent Variable: height
c. sex = m

Coefficients(a,b)

             Unstandardized Coefficients   Standardized Coefficients
Model 1      B         Std. Error          Beta                        t         Sig.
(Constant)   62.399    .262                                            238.064   .000
centage      2.031     .173                .740                        11.737    .000

a. Dependent Variable: height
b. sex = m

For this example, we use the cars.sav dataset.

/*Create new variables*/
COMPUTE c_year = year-75 .
EXECUTE .
COMPUTE American = origin=1 .
EXECUTE .
COMPUTE European = origin=2 .
EXECUTE .
COMPUTE Japanese = origin=3 .
EXECUTE .
COMPUTE Amer_year = American*year.
COMPUTE Euro_year = European*year.
COMPUTE Japan_year = Japanese*year.
EXECUTE.

/*Check frequencies of dummy variables*/
FREQUENCIES VARIABLES=origin American European Japanese
  /ORDER= ANALYSIS .

Frequencies

Statistics

               origin Country of Origin   American   European   Japanese
N   Valid      405                        405        405        405
    Missing    1                          1          1          1

origin Country of Origin

                       Frequency   Percent   Valid Percent   Cumulative Percent
Valid     1 American   253         62.3      62.5            62.5
          2 European   73          18.0      18.0            80.5
          3 Japanese   79          19.5      19.5            100.0
          Total        405         99.8      100.0
Missing   System       1           .2
Total                  406         100.0

American

                    Frequency   Percent   Valid Percent   Cumulative Percent
Valid     .00       152         37.4      37.5            37.5
          1.00      253         62.3      62.5            100.0
          Total     405         99.8      100.0
Missing   System    1           .2
Total               406         100.0

European

                    Frequency   Percent   Valid Percent   Cumulative Percent
Valid     .00       332         81.8      82.0            82.0
          1.00      73          18.0      18.0            100.0
          Total     405         99.8      100.0
Missing   System    1           .2
Total               406         100.0

Japanese

                    Frequency   Percent   Valid Percent   Cumulative Percent
Valid     .00       326         80.3      80.5            80.5
          1.00      79          19.5      19.5            100.0
          Total     405         99.8      100.0
Missing   System    1           .2
Total               406         100.0

/*Get a scatterplot*/
GRAPH
  /SCATTERPLOT(BIVAR)=year WITH mpg BY origin
  /MISSING=LISTWISE .
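The dummy and interaction variables created for the cars example can also be written out in Python as a cross-check on the coding. This is only a minimal sketch: the function name code_origin and the sample car are hypothetical, the origin codes (1 = American, 2 = European, 3 = Japanese) are taken from the frequency table, and it assumes both origin and year are present (missing values are not handled).

```python
def code_origin(origin, year, center=75):
    """Build the indicator and interaction variables for one car.

    origin: 1 = American, 2 = European, 3 = Japanese (codes from the handout).
    year:   model year modulo 100.
    Assumes both values are present; missing data is not handled here.
    """
    american = int(origin == 1)
    european = int(origin == 2)
    japanese = int(origin == 3)
    return {
        "c_year": year - center,        # COMPUTE c_year = year-75
        "American": american,           # COMPUTE American = origin=1
        "European": european,           # COMPUTE European = origin=2
        "Japanese": japanese,           # COMPUTE Japanese = origin=3
        "Amer_year": american * year,   # COMPUTE Amer_year = American*year
        "Euro_year": european * year,   # COMPUTE Euro_year = European*year
        "Japan_year": japanese * year,  # COMPUTE Japan_year = Japanese*year
    }


# Hypothetical example: a European car from model year 79.
print(code_origin(2, 79))
```

Exactly one of the three indicators is 1 for any car, so each interaction term equals year for cars of that origin and 0 otherwise.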
/*Select cases with 4, 6, or 8 cylinders*/
USE ALL.
COMPUTE filter_$=(cylinder=4 or cylinder=6 or cylinder=8).
VARIABLE LABEL filter_$ 'cylinder=4 or cylinder=6 or cylinder=8 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE .

/*Regression model with uncentered year, dummy variables for Origin, and interactions*/
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT mpg
  /METHOD=ENTER American European Year Amer_year Euro_year
  /SCATTERPLOT=(*SDRESID ,*ZPRED )
  /RESIDUALS HIST(ZRESID) NORM(ZRESID) .

Regression model with uncentered year, dummy variables for Origin, and interaction terms

Regression

Variables Entered/Removed(b)

Model   Variables Entered                                                             Variables Removed   Method
1       Euro_year, year Model Year (modulo 100), Amer_year, American, European(a)     .                   Enter

a. All requested variables entered.
b. Dependent Variable: mpg Miles per Gallon

Model Summary(b)

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .769(a)   .591       .586                5.036

a. Predictors: (Constant), Euro_year, year Model Year (modulo 100), Amer_year, American, European
b. Dependent Variable: mpg Miles per Gallon

ANOVA(b)

Model 1      Sum of Squares   df    Mean Square   F         Sig.
Regression   14065.999        5     2813.200      110.904   .000(a)
Residual     9740.534         384   25.366
Total        23806.532        389

a. Predictors: (Constant), Euro_year, year Model Year (modulo 100), Amer_year, American, European
b. Dependent Variable: mpg Miles per Gallon

Coefficients(a)

                                 Unstandardized Coefficients   Standardized Coefficients
Model 1                          B          Std. Error         Beta                        t        Sig.
(Constant)                       -38.288    12.473                                         -3.070   .002
American                         -25.349    14.120             -1.561                      -1.795   .073
European                         -13.709    18.392             -.662                       -.745    .456
year Model Year (modulo 100)     .893       .161               .422                        5.559    .000
Amer_year                        .214       .183               1.001                       1.172    .242
Euro_year                        .163       .240               .596                        .678     .498

a. Dependent Variable: mpg Miles per Gallon

COMPUTE Amer_cyear = American*c_year.
COMPUTE Euro_cyear = European*c_year.
COMPUTE Japan_cyear = Japanese*c_year.
EXECUTE.

/*Regression model with centered year, dummy variables for Origin, and interactions*/
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT mpg
  /METHOD=ENTER American European c_year Amer_cyear Euro_cyear
  /SCATTERPLOT=(*SDRESID ,*ZPRED )
  /RESIDUALS HIST(ZRESID) NORM(ZRESID) .

Regression with centered year, dummy variables for origin, and interaction terms

Variables Entered/Removed(b)

Model   Variables Entered                                        Variables Removed   Method
1       Euro_cyear, Amer_cyear, American, European, c_year(a)    .                   Enter

a. All requested variables entered.
b. Dependent Variable: mpg Miles per Gallon

Model Summary(b)

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .769(a)   .591       .586                5.036

a. Predictors: (Constant), Euro_cyear, Amer_cyear, American, European, c_year
b. Dependent Variable: mpg Miles per Gallon

ANOVA(b)

Model 1      Sum of Squares   df    Mean Square   F         Sig.
Regression   14065.999        5     2813.200      110.904   .000(a)
Residual     9740.534         384   25.366
Total        23806.532        389

a. Predictors: (Constant), Euro_cyear, Amer_cyear, American, European, c_year
b. Dependent Variable: mpg Miles per Gallon

Coefficients(a)

             Unstandardized Coefficients   Standardized Coefficients
Model 1      B         Std. Error          Beta                        t         Sig.
(Constant)   28.704    .711                                            40.366    .000
American     -9.277    .782                -.571                       -11.868   .000
European     -1.498    .948                -.072                       -1.581    .115
c_year       .893      .161                .422                        5.559     .000
Amer_cyear   .214      .183                .080                        1.172     .242
Euro_cyear   .163      .240                .030                        .678      .498

a. Dependent Variable: mpg Miles per Gallon

/*ANCOVA model with year, dummy variables for Origin, and interactions. EMMEANS compared at year=75*/
UNIANOVA mpg BY origin WITH year
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /EMMEANS = TABLES(origin) WITH(year=75)
  /PRINT = PARAMETER
  /CRITERIA = ALPHA(.05)
  /DESIGN = origin year origin*year .

Univariate Analysis of Variance: Comparing Mean MPG at each origin with year=75.
Between-Subjects Factors

                              Value Label   N
origin Country of Origin  1   American      248
                          2   European      67
                          3   Japanese      75

Tests of Between-Subjects Effects
Dependent Variable: mpg Miles per Gallon

Source            Type III Sum of Squares   df    Mean Square   F         Sig.
Corrected Model   14065.999(a)              5     2813.200      110.904   .000
Intercept         1573.067                  1     1573.067      62.015    .000
origin            86.071                    2     43.035        1.697     .185
year              3630.376                  1     3630.376      143.120   .000
origin * year     34.827                    2     17.414        .686      .504
Error             9740.534                  384   25.366
Total             240148.610                390
Corrected Total   23806.532                 389

a. R Squared = .591 (Adjusted R Squared = .586)

Parameter Estimates
Dependent Variable: mpg Miles per Gallon

                                                              95% Confidence Interval
Parameter           B          Std. Error   t        Sig.    Lower Bound   Upper Bound
Intercept           -38.288    12.473       -3.070   .002    -62.812       -13.764
[origin=1]          -25.349    14.120       -1.795   .073    -53.111       2.412
[origin=2]          -13.709    18.392       -.745    .456    -49.870       22.452
[origin=3]          0(a)       .            .        .       .             .
year                .893       .161         5.559    .000    .577          1.209
[origin=1] * year   .214       .183         1.172    .242    -.145         .574
[origin=2] * year   .163       .240         .678     .498    -.309         .635
[origin=3] * year   0(a)       .            .        .       .             .

a. This parameter is set to zero because it is redundant.

Estimated Marginal Means Calculated at Year=75

Country of Origin
Dependent Variable: mpg Miles per Gallon

                                             95% Confidence Interval
Country of Origin   Mean        Std. Error   Lower Bound   Upper Bound
1 American          19.427(a)   .325         18.789        20.065
2 European          27.206(a)   .627         25.973        28.438
3 Japanese          28.704(a)   .711         27.306        30.102

a. Covariates appearing in the model are evaluated at the following values: year Model Year (modulo 100) = 75.

/*ANCOVA model with year and dummy variables for Origin. Remove the non-significant interaction*/
UNIANOVA mpg BY origin WITH year
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /EMMEANS = TABLES(origin) WITH(year=MEAN)
  /PRINT = PARAMETER
  /CRITERIA = ALPHA(.05)
  /DESIGN = origin year .
Univariate Analysis of Variance: GLM model without interaction

Between-Subjects Factors

                              Value Label   N
origin Country of Origin  1   American      248
                          2   European      67
                          3   Japanese      75

Tests of Between-Subjects Effects
Dependent Variable: mpg Miles per Gallon

Source            Type III Sum of Squares   df    Mean Square   F         Sig.
Corrected Model   14031.171(a)              3     4677.057      184.683   .000
Intercept         2584.372                  1     2584.372      102.049   .000
origin            6152.002                  2     3076.001      121.462   .000
year              5712.052                  1     5712.052      225.552   .000
Error             9775.361                  386   25.325
Total             240148.610                390
Corrected Total   23806.532                 389

a. R Squared = .589 (Adjusted R Squared = .586)

Parameter Estimates
Dependent Variable: mpg Miles per Gallon

                                                        95% Confidence Interval
Parameter    B          Std. Error   t         Sig.    Lower Bound   Upper Bound
Intercept    -51.082    5.495        -9.296    .000    -61.885       -40.278
[origin=1]   -8.825     .677         -13.041   .000    -10.156       -7.495
[origin=2]   -1.080     .856         -1.261    .208    -2.763        .604
[origin=3]   0(a)       .            .         .       .             .
year         1.058      .070         15.018    .000    .920          1.197

a. This parameter is set to zero because it is redundant.

Estimated Marginal Means

Country of Origin
Dependent Variable: mpg Miles per Gallon

                                             95% Confidence Interval
Country of Origin   Mean        Std. Error   Lower Bound   Upper Bound
1 American          20.525(a)   .321         19.894        21.155
2 European          28.271(a)   .615         27.061        29.480
3 Japanese          29.350(a)   .591         28.188        30.512

a. Covariates appearing in the model are evaluated at the following values: year Model Year (modulo 100) = 76.01.
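
The estimated marginal means in the last table are the model's predicted mpg for each origin at the mean model year. A minimal Python sketch of that arithmetic follows, using the parameter estimates printed above. The names adjusted_mean, ORIGIN_EFFECT, and MEAN_YEAR are illustrative, not part of SPSS. Because the printed coefficients and the mean year are rounded, the results should agree with the table only to within a few hundredths.

```python
# Parameter estimates from the no-interaction model above
# (Japanese, origin=3, is the reference category with effect 0).
INTERCEPT = -51.082
YEAR_SLOPE = 1.058
ORIGIN_EFFECT = {"American": -8.825, "European": -1.080, "Japanese": 0.0}

# Covariate value reported under the Estimated Marginal Means table.
MEAN_YEAR = 76.01


def adjusted_mean(origin, year):
    """Predicted mpg for one origin at the given model year."""
    return INTERCEPT + ORIGIN_EFFECT[origin] + YEAR_SLOPE * year


for origin in ORIGIN_EFFECT:
    # Compare with the reported means: 20.525, 28.271, 29.350.
    print(origin, round(adjusted_mean(origin, MEAN_YEAR), 3))
```

Without the interaction term the three fitted lines are parallel, so the differences between these adjusted means equal the [origin=1] and [origin=2] coefficients at every model year.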