spss_ancova

ANCOVA Examples Using SPSS
IMPORT
FILE='e:\510\data\htwt.por'.
DESCRIPTIVES
VARIABLES=age height weight
/STATISTICS=MEAN STDDEV MIN MAX .
Descriptive Statistics
                       N   Minimum   Maximum       Mean   Std. Deviation
age                  237     13.90     25.00    16.4430          1.84258
height               237     50.50     72.00    61.3646          3.94540
weight               237     50.50    171.50   101.3080         19.44070
Valid N (listwise)   237
/*Create new variables and interactions*/
RECODE
sex
('f'=1) ('m'=0) INTO female .
EXECUTE .
VALUE LABELS female 1 '1:Female' 0 '0:Male'.
Compute centage = age - 16.5.
Compute fem_age = female* age.
Compute fem_centage = female * centage.
EXECUTE.
/*Select Cases with Age < 19*/
USE ALL.
COMPUTE filter_$=(age < 19).
VARIABLE LABEL filter_$ 'age < 19 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE .
/*Scatter Plot of Height vs Age for Males and Females*/
GRAPH
/SCATTERPLOT(BIVAR)= age WITH height BY sex
/MISSING=LISTWISE .
[Figure: scatterplot of height (vertical axis, 50.00 to 75.00) against age (horizontal axis, 13.00 to 19.00), with separate markers and linear fit lines for sex = f and sex = m. The fit lines are annotated "R Sq Linear = 0.547" and "R Sq Linear = 0.291".]
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT height
/METHOD=ENTER female age fem_age
/SCATTERPLOT=(*SDRESID ,*ZPRED )
/RESIDUALS HIST(ZRESID) NORM(ZRESID) .
Regression with Female, Age, and Interaction Term: Femage
Variables Entered/Removed (b)
Model   Variables Entered           Variables Removed   Method
1       fem_age, age, female (a)    .                   Enter
a. All requested variables entered.
b. Dependent Variable: height
Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .678 (a)   .460       .452                2.79947
a. Predictors: (Constant), fem_age, age, female
b. Dependent Variable: height
ANOVA (b)
Model 1      Sum of Squares    df   Mean Square        F   Sig.
Regression         1432.638     3       477.546   60.935   .000 (a)
Residual           1684.957   215         7.837
Total              3117.595   218
a. Predictors: (Constant), fem_age, age, female
b. Dependent Variable: height
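The derived columns of the ANOVA table follow from the sums of squares and degrees of freedom alone. The short Python sketch below is not part of the SPSS run; it only re-derives the mean squares, F, and R Square from the printed values, and the function name `anova_summary` is made up for illustration.

```python
# Values copied from the ANOVA table above (regression of height on
# female, age, and fem_age).
SS_REGRESSION, DF_REGRESSION = 1432.638, 3
SS_RESIDUAL, DF_RESIDUAL = 1684.957, 215


def anova_summary(ss_reg, df_reg, ss_res, df_res):
    """Return (MS regression, MS residual, F, R squared)."""
    ms_reg = ss_reg / df_reg            # mean square for the model
    ms_res = ss_res / df_res            # mean square error
    f_stat = ms_reg / ms_res            # overall F test
    r_squared = ss_reg / (ss_reg + ss_res)
    return ms_reg, ms_res, f_stat, r_squared


ms_reg, ms_res, f_stat, r_squared = anova_summary(
    SS_REGRESSION, DF_REGRESSION, SS_RESIDUAL, DF_RESIDUAL
)
print(f"MS regression = {ms_reg:.3f}")
print(f"MS residual   = {ms_res:.3f}")
print(f"F             = {f_stat:.3f}")
print(f"R squared     = {r_squared:.3f}")
```

The results agree with the table to the printed precision (F of about 60.93, R Square of about .460).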
Coefficients (a)
              Unstandardized Coefficients   Standardized Coefficients
Model 1            B   Std. Error                 Beta                      t   Sig.
(Constant)    28.883        2.873                                      10.052   .000
female        13.612        4.019                1.801                  3.387   .001
age            2.031         .178                 .822                 11.435   .000
fem_age        -.929         .248               -2.008                 -3.750   .000
a. Dependent Variable: height
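Because female is a 0/1 indicator, the coefficients of this interaction model imply one fitted line per sex. The Python sketch below (illustrative only; `line_for` is a hypothetical helper, not SPSS output) combines the unstandardized coefficients into the intercept and slope for each group.

```python
# Unstandardized coefficients from the Coefficients table above.
B_CONSTANT = 28.883
B_FEMALE = 13.612
B_AGE = 2.031
B_FEM_AGE = -0.929


def line_for(female):
    """Return (intercept, slope) of height on age for female = 0 or 1."""
    intercept = B_CONSTANT + B_FEMALE * female
    slope = B_AGE + B_FEM_AGE * female
    return intercept, slope


male_intercept, male_slope = line_for(0)
female_intercept, female_slope = line_for(1)
print(f"males:   height = {male_intercept:.3f} + {male_slope:.3f} * age")
print(f"females: height = {female_intercept:.3f} + {female_slope:.3f} * age")
```

The female slope obtained this way, 2.031 - .929 = 1.102, is the same slope that the separate regression for sex = f reports later in this handout.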
/*ANCOVA with FEMALE, CENTAGE, FEM_CENTAGE interaction (Centered Age)*/
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT height
/METHOD=ENTER female centage fem_centage
/SCATTERPLOT=(*SDRESID ,*ZPRED )
/RESIDUALS HIST(ZRESID) NORM(ZRESID) .
Variables Entered/Removed (b)
Model   Variables Entered                   Variables Removed   Method
1       fem_centage, female, centage (a)    .                   Enter
a. All requested variables entered.
b. Dependent Variable: height
Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .678 (a)   .460       .452                2.79947
a. Predictors: (Constant), fem_centage, female, centage
b. Dependent Variable: height
ANOVA (b)
Model 1      Sum of Squares    df   Mean Square        F   Sig.
Regression         1432.638     3       477.546   60.935   .000 (a)
Residual           1684.957   215         7.837
Total              3117.595   218
a. Predictors: (Constant), fem_centage, female, centage
b. Dependent Variable: height
Coefficients (a)
              Unstandardized Coefficients   Standardized Coefficients
Model 1            B   Std. Error                 Beta                      t   Sig.
(Constant)    62.399         .269                                     231.949   .000
female        -1.723         .389                -.228                 -4.428   .000
centage        2.031         .178                 .822                 11.435   .000
fem_centage    -.929         .248                -.272                 -3.750   .000
a. Dependent Variable: height
/*ANCOVA model using GLM*/
UNIANOVA
height BY sex WITH age
/METHOD = SSTYPE(3)
/INTERCEPT = INCLUDE
/EMMEANS = TABLES(sex) COMPARE ADJ(LSD)
/PRINT = PARAMETER
/CRITERIA = ALPHA(.05)
/DESIGN = sex age age*sex .
Univariate Analysis of Variance: GLM on SEX, Age, and their
Interaction
Between-Subjects Factors
         N
sex  f   103
     m   116
Tests of Between-Subjects Effects
Dependent Variable: height
Source            Type III Sum of Squares    df   Mean Square         F   Sig.
Corrected Model              1432.638 (a)     3       477.546    60.935   .000
Intercept                    2471.764         1      2471.764   315.396   .000
sex                            89.897         1        89.897    11.471   .001
age                          1252.650         1      1252.650   159.838   .000
sex * age                     110.230         1       110.230    14.065   .000
Error                        1684.957       215         7.837
Total                      818138.600       219
Corrected Total              3117.595       218
a. R Squared = .460 (Adjusted R Squared = .452)
Parameter Estimates
Dependent Variable: height
                                                         95% Confidence Interval
Parameter            B        Std. Error        t   Sig.   Lower Bound   Upper Bound
Intercept        28.883            2.873   10.052   .000        23.219        34.547
[sex=f]          13.612            4.019    3.387   .001         5.690        21.534
[sex=m]           0 (a)                .        .      .             .             .
age               2.031             .178   11.435   .000         1.681         2.381
[sex=f] * age     -.929             .248   -3.750   .000        -1.418         -.441
[sex=m] * age     0 (a)                .        .      .             .             .
a. This parameter is set to zero because it is redundant.
Estimated Marginal Means: By default compared at Mean of other
covariates
Estimates
Dependent Variable: height
                                 95% Confidence Interval
sex   Mean         Std. Error    Lower Bound   Upper Bound
f     60.284 (a)         .276         59.740        60.828
m     61.677 (a)         .260         61.164        62.189
a. Covariates appearing in the model are evaluated at the following values: age = 16.1443.
Pairwise Comparisons
Dependent Variable: height
                                                                    95% Confidence Interval for Difference (a)
(I) sex   (J) sex   Mean Difference (I-J)   Std. Error   Sig. (a)   Lower Bound   Upper Bound
f         m                       -1.393*         .379       .000        -2.140         -.645
m         f                        1.393*         .379       .000          .645         2.140
Based on estimated marginal means
*. The mean difference is significant at the .05 level.
a. Adjustment for multiple comparisons: Least Significant Difference (equivalent to no adjustments).
Univariate Tests
Dependent Variable: height
           Sum of Squares    df   Mean Square        F   Sig.
Contrast          105.766     1       105.766   13.496   .000
Error            1684.957   215         7.837
The F tests the effect of sex. This test is based on the linearly independent pairwise comparisons among the estimated marginal means.
/*ANCOVA model on centered age values using GLM*/
UNIANOVA
height BY sex WITH centage
/METHOD = SSTYPE(3)
/INTERCEPT = INCLUDE
/EMMEANS = TABLES(sex) WITH(centage=0) COMPARE ADJ(LSD)
/PRINT = PARAMETER
/CRITERIA = ALPHA(.05)
/DESIGN = sex centage centage*sex .
Univariate Analysis of Variance: Analysis on Centered Age, Syntax
is modified (EMMEANS subcommand altered to compare means for
SEX at Centage=0).
Between-Subjects Factors
         N
sex  f   103
     m   116
Tests of Between-Subjects Effects
Dependent Variable: height
Source            Type III Sum of Squares    df   Mean Square          F   Sig.
Corrected Model              1432.638 (a)     3       477.546     60.935   .000
Intercept                  783822.643         1    783822.643   100015.5   .000
sex                           153.684         1       153.684     19.610   .000
centage                      1252.650         1      1252.650    159.838   .000
sex * centage                 110.230         1       110.230     14.065   .000
Error                        1684.957       215         7.837
Total                      818138.600       219
Corrected Total              3117.595       218
a. R Squared = .460 (Adjusted R Squared = .452)
Parameter Estimates
Dependent Variable: height
                                                             95% Confidence Interval
Parameter                B        Std. Error         t   Sig.   Lower Bound   Upper Bound
Intercept            62.399             .269   231.949   .000        61.869        62.930
[sex=f]              -1.723             .389    -4.428   .000        -2.490         -.956
[sex=m]               0 (a)                .         .      .             .             .
centage               2.031             .178    11.435   .000         1.681         2.381
[sex=f] * centage     -.929             .248    -3.750   .000        -1.418         -.441
[sex=m] * centage     0 (a)                .         .      .             .             .
a. This parameter is set to zero because it is redundant.
Estimated Marginal Means: The means for SEX are compared at
Centage=0, because syntax was modified
Estimates
Dependent Variable: height
                                 95% Confidence Interval
sex   Mean         Std. Error    Lower Bound   Upper Bound
f     60.676 (a)         .281         60.122        61.230
m     62.399 (a)         .269         61.869        62.930
a. Covariates appearing in the model are evaluated at the following values: centage = .00.
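The estimated marginal means can be checked by hand from the Parameter Estimates tables: each mean is the fitted value for that sex at the covariate value named in the footnote. The Python sketch below is an illustrative check, not SPSS output, and `fitted_mean` is a hypothetical helper. It evaluates the centered model at centage = 0 and the earlier uncentered model at age = 16.1443.

```python
def fitted_mean(intercept, sex_offset, slope, slope_offset, covariate):
    """Fitted height for one sex at a given covariate value.

    sex_offset and slope_offset are the [sex=f] and [sex=f] * covariate
    estimates for females, and 0 for males (the reference category).
    """
    return (intercept + sex_offset) + (slope + slope_offset) * covariate


# Centered model (covariate = centage), evaluated at centage = 0.
f_centered = fitted_mean(62.399, -1.723, 2.031, -0.929, 0.0)
m_centered = fitted_mean(62.399, 0.0, 2.031, 0.0, 0.0)

# Uncentered model (covariate = age), evaluated at the default age = 16.1443.
f_at_mean_age = fitted_mean(28.883, 13.612, 2.031, -0.929, 16.1443)
m_at_mean_age = fitted_mean(28.883, 0.0, 2.031, 0.0, 16.1443)

print(f"centage = 0:   f = {f_centered:.3f}, m = {m_centered:.3f}")
print(f"age = 16.1443: f = {f_at_mean_age:.3f}, m = {m_at_mean_age:.3f}")
```

At centage = 0 the values reproduce 60.676 and 62.399 exactly, and their difference is the [sex=f] estimate of -1.723. At age = 16.1443 the hand-computed values match the printed 60.284 and 61.677 to within about .005, the gap coming from the three-decimal rounding of the printed coefficients.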
Pairwise Comparisons
Dependent Variable: height
                                                                    95% Confidence Interval for Difference (a)
(I) sex   (J) sex   Mean Difference (I-J)   Std. Error   Sig. (a)   Lower Bound   Upper Bound
f         m                       -1.723*         .389       .000        -2.490         -.956
m         f                        1.723*         .389       .000          .956         2.490
Based on estimated marginal means
*. The mean difference is significant at the .05 level.
a. Adjustment for multiple comparisons: Least Significant Difference (equivalent to no adjustments).
Univariate Tests
Dependent Variable: height
           Sum of Squares    df   Mean Square        F   Sig.
Contrast          153.684     1       153.684   19.610   .000
Error            1684.957   215         7.837
The F tests the effect of sex. This test is based on the linearly independent pairwise comparisons among the estimated marginal means.
/*Separate regressions for males and for females*/
SORT CASES BY sex .
SPLIT FILE
SEPARATE BY sex .
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT height
/METHOD=ENTER centage
/SCATTERPLOT=(*SDRESID ,*ZPRED )
/RESIDUALS HIST(ZRESID) NORM(ZRESID) .
Regression: Separate Regression Model for each sex.
sex = f
Variables Entered/Removed (b,c)
Model   Variables Entered   Variables Removed   Method
1       centage (a)         .                   Enter
a. All requested variables entered.
b. Dependent Variable: height
c. sex = f
Model Summary (b,c)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .525 (a)   .276       .269                2.87848
a. Predictors: (Constant), centage
b. Dependent Variable: height
c. sex = f
ANOVA (b,c)
Model 1      Sum of Squares    df   Mean Square        F   Sig.
Regression          318.634     1       318.634   38.456   .000 (a)
Residual            836.850   101         8.286
Total              1155.484   102
a. Predictors: (Constant), centage
b. Dependent Variable: height
c. sex = f
Coefficients (a,b)
              Unstandardized Coefficients   Standardized Coefficients
Model 1            B   Std. Error                 Beta                      t   Sig.
(Constant)    60.676         .289                                     209.845   .000
centage        1.102         .178                 .525                  6.201   .000
a. Dependent Variable: height
b. sex = f
sex = m
Variables Entered/Removed (b,c)
Model   Variables Entered   Variables Removed   Method
1       centage (a)         .                   Enter
a. All requested variables entered.
b. Dependent Variable: height
c. sex = m
Model Summary (b,c)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .740 (a)   .547       .543                2.72755
a. Predictors: (Constant), centage
b. Dependent Variable: height
c. sex = m
ANOVA (b,c)
Model 1      Sum of Squares    df   Mean Square         F   Sig.
Regression         1024.779     1      1024.779   137.748   .000 (a)
Residual            848.107   114         7.440
Total              1872.886   115
a. Predictors: (Constant), centage
b. Dependent Variable: height
c. sex = m
Coefficients (a,b)
              Unstandardized Coefficients   Standardized Coefficients
Model 1            B   Std. Error                 Beta                      t   Sig.
(Constant)    62.399         .262                                     238.064   .000
centage        2.031         .173                 .740                 11.737   .000
a. Dependent Variable: height
b. sex = m
For this example, we use the cars.sav dataset.
/*Create new variables*/
COMPUTE c_year = year-75 .
EXECUTE .
COMPUTE American = origin=1 .
EXECUTE .
COMPUTE European = origin=2 .
EXECUTE .
COMPUTE Japanese = origin=3 .
EXECUTE .
COMPUTE Amer_year = American*year.
COMPUTE Euro_year = European*year.
COMPUTE Japan_year=Japanese*year.
EXECUTE.
/*Check frequencies of dummy variables*/
FREQUENCIES
VARIABLES=origin American European Japanese
/ORDER= ANALYSIS .
Frequencies
Statistics
              origin Country of Origin   American   European   Japanese
N   Valid                          405        405        405        405
    Missing                          1          1          1          1
origin Country of Origin
                      Frequency   Percent   Valid Percent   Cumulative Percent
Valid     1 American        253      62.3            62.5                 62.5
          2 European         73      18.0            18.0                 80.5
          3 Japanese         79      19.5            19.5                100.0
          Total             405      99.8           100.0
Missing   System              1        .2
Total                       406     100.0
American
                   Frequency   Percent   Valid Percent   Cumulative Percent
Valid     .00            152      37.4            37.5                 37.5
          1.00           253      62.3            62.5                100.0
          Total          405      99.8           100.0
Missing   System           1        .2
Total                    406     100.0
European
                   Frequency   Percent   Valid Percent   Cumulative Percent
Valid     .00            332      81.8            82.0                 82.0
          1.00            73      18.0            18.0                100.0
          Total          405      99.8           100.0
Missing   System           1        .2
Total                    406     100.0
Japanese
                   Frequency   Percent   Valid Percent   Cumulative Percent
Valid     .00            326      80.3            80.5                 80.5
          1.00            79      19.5            19.5                100.0
          Total          405      99.8           100.0
Missing   System           1        .2
Total                    406     100.0
/*Get a scatterplot*/
GRAPH
/SCATTERPLOT(BIVAR)=year WITH mpg BY origin
/MISSING=LISTWISE .
/*Select cases with 4, 6, or 8 cylinders*/
USE ALL.
COMPUTE filter_$=(cylinder=4 or cylinder=6 or cylinder=8).
VARIABLE LABEL filter_$ 'cylinder=4 or cylinder=6 or cylinder=8 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE .
/*Regression model with uncentered year, dummy variables for Origin, and interactions*/
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT mpg
/METHOD=ENTER American European Year Amer_year Euro_year
/SCATTERPLOT=(*SDRESID ,*ZPRED )
/RESIDUALS HIST(ZRESID) NORM(ZRESID) .
Regression model with uncentered year, dummy variables for
Origin, and interaction terms
Regression
Variables Entered/Removed (b)
Model   Variables Entered                                                             Variables Removed   Method
1       Euro_year, year Model Year (modulo 100), Amer_year, American, European (a)    .                   Enter
a. All requested variables entered.
b. Dependent Variable: mpg Miles per Gallon
Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .769 (a)   .591       .586                5.036
a. Predictors: (Constant), Euro_year, year Model Year (modulo 100), Amer_year, American, European
b. Dependent Variable: mpg Miles per Gallon
ANOVA (b)
Model 1      Sum of Squares    df   Mean Square         F   Sig.
Regression        14065.999     5      2813.200   110.904   .000 (a)
Residual           9740.534   384        25.366
Total             23806.532   389
a. Predictors: (Constant), Euro_year, year Model Year (modulo 100), Amer_year, American, European
b. Dependent Variable: mpg Miles per Gallon
Coefficients (a)
                               Unstandardized Coefficients   Standardized Coefficients
Model 1                              B   Std. Error                 Beta                     t   Sig.
(Constant)                     -38.288       12.473                                     -3.070   .002
American                       -25.349       14.120               -1.561                -1.795   .073
European                       -13.709       18.392                -.662                 -.745   .456
year Model Year (modulo 100)      .893         .161                 .422                 5.559   .000
Amer_year                         .214         .183                1.001                 1.172   .242
Euro_year                         .163         .240                 .596                  .678   .498
a. Dependent Variable: mpg Miles per Gallon
COMPUTE Amer_cyear = American*c_year.
COMPUTE Euro_cyear = European*c_year.
COMPUTE Japan_cyear=Japanese*c_year.
EXECUTE.
/*Regression model with centered year, dummy variables for Origin, and interactions*/
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT mpg
/METHOD=ENTER American European c_year Amer_cyear Euro_cyear
/SCATTERPLOT=(*SDRESID ,*ZPRED )
/RESIDUALS HIST(ZRESID) NORM(ZRESID) .
Regression with centered year, dummy variables for origin, and
interaction terms
Variables Entered/Removed (b)
Model   Variables Entered                                          Variables Removed   Method
1       Euro_cyear, Amer_cyear, American, European, c_year (a)     .                   Enter
a. All requested variables entered.
b. Dependent Variable: mpg Miles per Gallon
Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .769 (a)   .591       .586                5.036
a. Predictors: (Constant), Euro_cyear, Amer_cyear, American, European, c_year
b. Dependent Variable: mpg Miles per Gallon
ANOVA (b)
Model 1      Sum of Squares    df   Mean Square         F   Sig.
Regression        14065.999     5      2813.200   110.904   .000 (a)
Residual           9740.534   384        25.366
Total             23806.532   389
a. Predictors: (Constant), Euro_cyear, Amer_cyear, American, European, c_year
b. Dependent Variable: mpg Miles per Gallon
Coefficients (a)
              Unstandardized Coefficients   Standardized Coefficients
Model 1            B   Std. Error                 Beta                      t   Sig.
(Constant)    28.704         .711                                      40.366   .000
American      -9.277         .782                -.571                -11.868   .000
European      -1.498         .948                -.072                 -1.581   .115
c_year          .893         .161                 .422                  5.559   .000
Amer_cyear      .214         .183                 .080                  1.172   .242
Euro_cyear      .163         .240                 .030                   .678   .498
a. Dependent Variable: mpg Miles per Gallon
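Centering year at 75 leaves the three slope terms unchanged and only shifts the constant and the origin dummy coefficients, which become predicted values and group differences at year = 75 rather than at year = 0. The Python sketch below is an illustrative check with a hypothetical helper name, `recenter`; it applies that shift to the printed uncentered coefficients.

```python
CENTER = 75  # c_year = year - 75

# Unstandardized coefficients from the uncentered regression output.
uncentered = {
    "constant": -38.288,
    "American": -25.349,
    "European": -13.709,
    "year": 0.893,
    "Amer_year": 0.214,
    "Euro_year": 0.163,
}


def recenter(coefs, center):
    """Coefficients implied for the model that uses (year - center)."""
    return {
        "constant": coefs["constant"] + coefs["year"] * center,
        "American": coefs["American"] + coefs["Amer_year"] * center,
        "European": coefs["European"] + coefs["Euro_year"] * center,
        # Slopes are unaffected by shifting the covariate.
        "c_year": coefs["year"],
        "Amer_cyear": coefs["Amer_year"],
        "Euro_cyear": coefs["Euro_year"],
    }


centered = recenter(uncentered, CENTER)
for name, value in centered.items():
    print(f"{name:>10}: {value:8.3f}")
```

The recomputed constant and dummy coefficients land within about .02 of the printed 28.704, -9.277, and -1.498. The small gaps are expected, because the slopes are printed to three decimals and any rounding error is multiplied by 75.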
/*ANCOVA model with year, dummy variables for Origin, and interactions. EMMEANS compared
at year=75*/
UNIANOVA
mpg BY origin WITH year
/METHOD = SSTYPE(3)
/INTERCEPT = INCLUDE
/EMMEANS = TABLES(origin) WITH(year=75)
/PRINT = PARAMETER
/CRITERIA = ALPHA(.05)
/DESIGN = origin year origin*year .
Univariate Analysis of Variance: Comparing Mean MPG at each
origin with year=75.
Between-Subjects Factors
                                Value Label     N
origin Country of Origin   1    American      248
                           2    European       67
                           3    Japanese       75
Tests of Between-Subjects Effects
Dependent Variable: mpg Miles per Gallon
Source            Type III Sum of Squares    df   Mean Square         F   Sig.
Corrected Model             14065.999 (a)     5      2813.200   110.904   .000
Intercept                    1573.067         1      1573.067    62.015   .000
origin                         86.071         2        43.035     1.697   .185
year                         3630.376         1      3630.376   143.120   .000
origin * year                  34.827         2        17.414      .686   .504
Error                        9740.534       384        25.366
Total                      240148.610       390
Corrected Total             23806.532       389
a. R Squared = .591 (Adjusted R Squared = .586)
Parameter Estimates
Dependent Variable: mpg Miles per Gallon
                                                             95% Confidence Interval
Parameter                 B        Std. Error        t   Sig.   Lower Bound   Upper Bound
Intercept            -38.288           12.473   -3.070   .002       -62.812       -13.764
[origin=1]           -25.349           14.120   -1.795   .073       -53.111         2.412
[origin=2]           -13.709           18.392    -.745   .456       -49.870        22.452
[origin=3]             0 (a)                .        .      .             .             .
year                    .893             .161    5.559   .000          .577         1.209
[origin=1] * year       .214             .183    1.172   .242         -.145          .574
[origin=2] * year       .163             .240     .678   .498         -.309          .635
[origin=3] * year      0 (a)                .        .      .             .             .
a. This parameter is set to zero because it is redundant.
Estimated Marginal Means Calculated at Year=75
Country of Origin
Dependent Variable: mpg Miles per Gallon
                                               95% Confidence Interval
Country of Origin   Mean         Std. Error    Lower Bound   Upper Bound
1 American          19.427 (a)         .325         18.789        20.065
2 European          27.206 (a)         .627         25.973        28.438
3 Japanese          28.704 (a)         .711         27.306        30.102
a. Covariates appearing in the model are evaluated at the following values: year Model Year (modulo 100) = 75.
/*ANCOVA model with year and dummy variables for Origin (non-significant interaction removed)*/
UNIANOVA
mpg BY origin WITH year
/METHOD = SSTYPE(3)
/INTERCEPT = INCLUDE
/EMMEANS = TABLES(origin) WITH(year=MEAN)
/PRINT = PARAMETER
/CRITERIA = ALPHA(.05)
/DESIGN = origin year .
Univariate Analysis of Variance: GLM model without interaction
Between-Subjects Factors
                                Value Label     N
origin Country of Origin   1    American      248
                           2    European       67
                           3    Japanese       75
Tests of Between-Subjects Effects
Dependent Variable: mpg Miles per Gallon
Source            Type III Sum of Squares    df   Mean Square         F   Sig.
Corrected Model             14031.171 (a)     3      4677.057   184.683   .000
Intercept                    2584.372         1      2584.372   102.049   .000
origin                       6152.002         2      3076.001   121.462   .000
year                         5712.052         1      5712.052   225.552   .000
Error                        9775.361       386        25.325
Total                      240148.610       390
Corrected Total             23806.532       389
a. R Squared = .589 (Adjusted R Squared = .586)
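Dropping the origin * year interaction can be tied back to the two Error lines: the increase in the error sum of squares when the interaction is removed, divided by its 2 degrees of freedom and by the full model's mean square error, gives the same F that SPSS printed for origin * year. The Python sketch below is an illustrative check only (the helper name `partial_f` is made up), and it does not compute a p-value.

```python
# Error sums of squares and degrees of freedom from the two GLM runs.
SSE_FULL, DF_FULL = 9740.534, 384        # model with origin * year
SSE_REDUCED, DF_REDUCED = 9775.361, 386  # model without the interaction


def partial_f(sse_reduced, df_reduced, sse_full, df_full):
    """F statistic for the terms dropped from the full model."""
    extra_ss = sse_reduced - sse_full
    extra_df = df_reduced - df_full
    mse_full = sse_full / df_full
    return (extra_ss / extra_df) / mse_full


f_interaction = partial_f(SSE_REDUCED, DF_REDUCED, SSE_FULL, DF_FULL)
print(f"extra SS = {SSE_REDUCED - SSE_FULL:.3f}")
print(f"partial F for origin * year = {f_interaction:.3f}")
```

The extra sum of squares is 34.827 and the resulting F is about .686, matching the origin * year row (Sig. = .504) of the model that included the interaction.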
Parameter Estimates
Dependent Variable: mpg Miles per Gallon
                                                      95% Confidence Interval
Parameter          B        Std. Error         t   Sig.   Lower Bound   Upper Bound
Intercept     -51.082            5.495    -9.296   .000       -61.885       -40.278
[origin=1]     -8.825             .677   -13.041   .000       -10.156        -7.495
[origin=2]     -1.080             .856    -1.261   .208        -2.763          .604
[origin=3]      0 (a)                .         .      .             .             .
year            1.058             .070    15.018   .000          .920         1.197
a. This parameter is set to zero because it is redundant.
Estimated Marginal Means
Country of Origin
Dependent Variable: mpg Miles per Gallon
                                               95% Confidence Interval
Country of Origin   Mean         Std. Error    Lower Bound   Upper Bound
1 American          20.525 (a)         .321         19.894        21.155
2 European          28.271 (a)         .615         27.061        29.480
3 Japanese          29.350 (a)         .591         28.188        30.512
a. Covariates appearing in the model are evaluated at the following values: year Model Year (modulo 100) = 76.01.