Analyze, Missing Values Analysis
MVA VARIABLES=Gender Ideal Statoph Nucoph SATM Year Eye
/MAXCAT=25
/CATEGORICAL=Eye
/TTEST PROB PERCENT=5
/MPATTERN
/EM(TOLERANCE=0.001 CONVERGENCE=0.0001 ITERATIONS=25).
[DataSet] C:\Users\Vati\Documents\StatData\IntroQ\IntroQ.sav
See IntroQ Questionnaire for a description of the questions asked.
Univariate Statistics

                                               Missing          No. of Extremes (a)
Variable      N       Mean   Std. Deviation   Count   Percent      Low     High
Gender      667       1.26             .438       0        .0        0        0
Ideal       662      70.32            3.854       5        .7        8        0
Statoph     658       6.29            2.315       9       1.3       11        0
Nucoph      665      58.04           22.423       2        .3       26        0
SATM        529     505.29           96.080     138      20.7        1        2
Year        667    1997.52            8.621       0        .0        0        0
Eye         666                                   1        .1

a. Number of cases outside the range (Q1 - 1.5*IQR, Q3 + 1.5*IQR).
Here we see that almost 21% of the cases are missing data on the SATM variable.
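The columns of the Univariate Statistics table can be reproduced by hand for any one variable. Below is a minimal Python sketch, assuming missing responses are coded as None. The function name summarize and the toy column are hypothetical, and Python's default quartile method may not match the one SPSS uses, so counts of extremes could differ slightly from SPSS output.

```python
import statistics

def summarize(values):
    """Return N, missing count/percent, and counts of extremes for one variable.

    `values` is a list in which None marks a missing response.
    """
    present = [v for v in values if v is not None]
    n_missing = len(values) - len(present)
    pct_missing = 100.0 * n_missing / len(values)

    # statistics.quantiles uses the "exclusive" method by default;
    # SPSS may compute quartiles differently (assumption noted above).
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    return {
        "n": len(present),
        "missing": n_missing,
        "pct_missing": round(pct_missing, 1),
        "low_extremes": sum(v < low_fence for v in present),
        "high_extremes": sum(v > high_fence for v in present),
    }

# Hypothetical toy column (not the IntroQ data).
toy = [10, 12, None, 13, 14, 15, 16, 17, 18, 19, 60, None]
print(summarize(toy))

# The SATM percentage in the table: 138 missing out of 667 cases.
print(round(100 * 138 / 667, 1))
```

The last line repeats the arithmetic behind the "almost 21%" figure for SATM.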
Summary of Estimated Means

              Gender    Ideal   Statoph   Nucoph     SATM      Year
All Values      1.26    70.32      6.29    58.04   505.29   1997.52
EM              1.26    70.32      6.29    58.02   504.60   1997.51
Please read David Howell’s document on the Expectation-Maximization algorithm. The table above and the one below show the results of SPSS’ EM procedure. The algorithm leads to an estimated mean of 504.60 and standard deviation of 95.77 for SATM, not much different from the observed mean (505.29) and standard deviation (96.08) for those cases on which we do have data.
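To make the idea of EM concrete, here is a minimal Python sketch for the simplest case: two variables, x complete and y missing for some cases, assumed bivariate normal. It is an illustration only and not SPSS' implementation, which works on all the variables jointly. The function name em_bivariate and the toy data are hypothetical.

```python
def em_bivariate(x, y, tol=1e-10, max_iter=1000):
    """EM estimates of the mean and SD of y and cov(x, y).

    x is complete; y uses None for missing values.
    """
    n = len(x)
    mx = sum(x) / n
    sxx = sum((xi - mx) ** 2 for xi in x) / n  # ML variance of the complete variable

    # Starting values: complete-case mean and variance of y, zero covariance.
    y_obs = [yi for yi in y if yi is not None]
    my = sum(y_obs) / len(y_obs)
    syy = sum((yi - my) ** 2 for yi in y_obs) / len(y_obs)
    sxy = 0.0

    for _ in range(max_iter):
        slope = sxy / sxx
        resid_var = syy - sxy ** 2 / sxx
        # E-step: expected y and y^2 for every case, given x and current estimates.
        ey, ey2 = [], []
        for xi, yi in zip(x, y):
            if yi is None:
                pred = my + slope * (xi - mx)
                ey.append(pred)
                ey2.append(pred ** 2 + resid_var)
            else:
                ey.append(yi)
                ey2.append(yi ** 2)
        # M-step: re-estimate the moments from the completed sufficient statistics.
        new_my = sum(ey) / n
        new_syy = sum(ey2) / n - new_my ** 2
        new_sxy = sum(xi * e for xi, e in zip(x, ey)) / n - mx * new_my
        change = max(abs(new_my - my), abs(new_syy - syy), abs(new_sxy - sxy))
        my, syy, sxy = new_my, new_syy, new_sxy
        if change < tol:
            break
    return my, syy ** 0.5, sxy

# Hypothetical toy data (not the IntroQ data): y is missing for three cases.
toy_x = [1, 2, 3, 4, 5, 6, 7, 8]
toy_y = [2.1, 3.9, None, 8.2, None, 12.1, 13.8, None]
em_mean, em_sd, em_cov = em_bivariate(toy_x, toy_y)
print(round(em_mean, 3), round(em_sd, 3), round(em_cov, 3))
```

In this two-variable case the EM mean of y should agree with the regression-adjusted mean: the complete-case mean of y plus the complete-case slope times the difference between the all-case and complete-case means of x.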
Summary of Estimated Standard Deviations

              Gender    Ideal   Statoph   Nucoph     SATM      Year
All Values      .438    3.854     2.315   22.423   96.080     8.621
EM              .437    3.851     2.308   22.431   95.770     8.620
Separate Variance t Tests (a)
Groups are formed by the SATM indicator variable (SATM present vs. SATM missing).

                  Gender    Ideal   Statoph   Nucoph     SATM      Year
t                    1.5       .3      -2.7      -.1        .      -1.8
df                 228.9    209.3     221.6    222.7        .     234.1
P(2-tail)           .132     .787      .007     .952        .      .066
# Present            529      527       525      527      529       529
# Missing            138      135       133      138        0       138
Mean(Present)       1.27    70.34      6.17    58.01   505.29   1997.23
Mean(Missing)       1.21    70.24      6.75    58.14        .   1998.65

For each quantitative variable, pairs of groups are formed by indicator variables (present, missing).
a. Indicator variables with less than 5% missing are not displayed.
These t tests compare the group of cases with data on SATM to the group of cases without data on SATM. Notice that those who did not answer the SATM question scored significantly higher on the Statophobia item (M = 6.75) than did those who did answer the SATM question (M = 6.17), t(221.6) = -2.7, p = .007. There is also a hint (p = .066) that the frequency of failure to answer the SATM question has increased over the years: the mean Year is 1998.65 for those missing SATM and 1997.23 for those with SATM.
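The separate variance t is computed without pooling the two group variances, which is why the df are fractional. Here is a minimal Python sketch of that computation (Welch's t with Welch-Satterthwaite df), splitting cases on whether SATM is missing. The function name welch_t and the records are hypothetical, not the IntroQ data.

```python
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Separate-variance (Welch) t and Welch-Satterthwaite df for two samples."""
    na, nb = len(group_a), len(group_b)
    # Squared standard error of each group mean (sample variance / n).
    va, vb = variance(group_a) / na, variance(group_b) / nb
    t = (mean(group_a) - mean(group_b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical records: (SATM, Statoph); None marks a missing SATM.
records = [(500, 1), (None, 2), (620, 2), (None, 4), (480, 3),
           (None, 6), (550, 4), (None, 8), (510, 5), (None, 10)]
present = [statoph for satm, statoph in records if satm is not None]
missing = [statoph for satm, statoph in records if satm is None]

t, df = welch_t(present, missing)
print(round(t, 2), round(df, 1))
```

As in the SPSS table, a negative t here means the group missing SATM has the higher mean on the tested variable.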
I have cut out most of the table below, but left in enough to show you how SPSS groups cases by the pattern of missing values. The most frequent pattern was missing data on SATM but not on any other variable. Cases 626 through 629 and case 631 were missing data on SATM and Statophobia. Case 630 (and others) was missing data only on Statophobia, and so on.
Missing Patterns (cases with missing values)

Columns of the original table: Case, # Missing, % Missing, and Missing and Extreme Value Patterns (a) for Gender, Year, Eye, Nucoph, Ideal, Statoph, SATM.

Case                        # Missing   % Missing   Missing on (S)
626, 627, 628, 629, 631             2        28.6   Statoph, SATM
630                                 1        14.3   Statoph
632, 633, 660, 661, 662             1        14.3   [column not recoverable]

Cases 78, 194, 221, 311, 444, 552, 558, and 665 are also shown: four with 1 missing value (14.3%) and four with 2 missing values (28.6%). [Their case-by-case patterns, and the positions of the two "-" flags in the table, are not recoverable.]

- indicates an extreme low value, while + indicates an extreme high value. The range used is (Q1 - 1.5*IQR, Q3 + 1.5*IQR).
a. Cases and variables are sorted on missing patterns.
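Grouping cases by pattern of missing values amounts to recording, for each case, which variables have no data and then tallying identical patterns. A minimal Python sketch follows, with None assumed to mark a missing value. The case numbers and values are hypothetical, and the helper missing_pattern is not an SPSS function.

```python
from collections import Counter

VARIABLES = ["Gender", "Year", "Eye", "Nucoph", "Ideal", "Statoph", "SATM"]

def missing_pattern(case):
    """Tuple of the variable names on which this case has no data."""
    return tuple(v for v in VARIABLES if case.get(v) is None)

# Hypothetical cases keyed by case number (values are invented).
complete = {"Gender": 1, "Year": 1999, "Eye": 2, "Nucoph": 50,
            "Ideal": 70, "Statoph": 6, "SATM": 510}
cases = {
    1: dict(complete),
    2: dict(complete, SATM=None),
    3: dict(complete, Statoph=None, SATM=None),
    4: dict(complete, SATM=None),
    5: dict(complete, Statoph=None),
}

patterns = {num: missing_pattern(c) for num, c in cases.items()}
incomplete = {num: p for num, p in patterns.items() if p}
counts = Counter(incomplete.values())

# List incomplete cases sorted on their pattern, with # and % missing.
for num, p in sorted(incomplete.items(), key=lambda item: item[1]):
    pct = 100 * len(p) / len(VARIABLES)
    print(num, len(p), round(pct, 1), ", ".join(p))
print(counts.most_common(1))
```

With seven variables, one missing value is 1/7 = 14.3% and two are 2/7 = 28.6%, which are the two percentages that appear in the SPSS table.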
EM Estimated Statistics
Here we have estimated means, covariances, and correlation coefficients. Little’s MCAR test tests the null hypothesis that the missing data are Missing Completely At Random. Since it is significant (Chi-Square = 61.477, DF = 32, Sig. = .001), we conclude that the data are NOT missing completely at random. The majority opinion is that EM estimates are not trustworthy when the data are not missing completely at random.
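The significance level reported for Little's test is the upper-tail area of a chi-square distribution. As a check on the reported Sig., here is a minimal Python sketch that uses the closed-form tail probability available when the df are even (32 here). The function name chi2_sf_even_df is hypothetical; a statistics package would normally supply this value.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variate; even df only."""
    if df % 2 != 0:
        raise ValueError("this closed form requires an even df")
    half = x / 2.0
    # exp(-x/2) * sum over k = 0 .. df/2 - 1 of (x/2)**k / k!
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Values taken from the Little's MCAR footnote in the SPSS output.
p = chi2_sf_even_df(61.477, 32)
print(round(p, 3))
```

Rounded to three decimals, the result should agree with the Sig. value SPSS prints.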
EM Means (a)

Gender    Ideal   Statoph   Nucoph     SATM      Year
  1.26    70.32      6.29    58.02   504.60   1997.51

a. Little's MCAR test: Chi-Square = 61.477, DF = 32, Sig. = .001
EM Covariances (a)

           Gender     Ideal   Statoph    Nucoph       SATM     Year
Gender       .191
Ideal       -.949    14.833
Statoph     -.146      .698     5.326
Nucoph      -.815     8.702     1.749   503.155
SATM        2.480   -18.295   -73.667    63.812   9171.847
Year         .032     -.483    -3.288     1.114    259.774   74.301

a. Little's MCAR test: Chi-Square = 61.477, DF = 32, Sig. = .001
EM Correlations (a)

           Gender    Ideal   Statoph   Nucoph    SATM   Year
Gender          1
Ideal       -.563        1
Statoph     -.145     .079         1
Nucoph      -.083     .101      .034        1
SATM         .059    -.050     -.333     .030       1
Year         .008    -.015     -.165     .006    .315      1

a. Little's MCAR test: Chi-Square = 61.477, DF = 32, Sig. = .001
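The EM correlations follow directly from the EM covariances: each covariance is divided by the product of the two standard deviations (the square roots of the diagonal entries). Below is a minimal Python sketch using three variances and two covariances copied from the EM Covariances table. The dictionary layout and the function name correlation are illustrative assumptions.

```python
import math

# Diagonal entries (variances) and two covariances from the EM Covariances table.
variances = {"Statoph": 5.326, "SATM": 9171.847, "Year": 74.301}
covariances = {("SATM", "Statoph"): -73.667, ("Year", "SATM"): 259.774}

def correlation(a, b):
    """Correlation = covariance / (SD of a * SD of b)."""
    return covariances[(a, b)] / math.sqrt(variances[a] * variances[b])

for pair in covariances:
    print(pair, round(correlation(*pair), 3))

# The EM standard deviation of SATM is the square root of its EM variance.
print(round(math.sqrt(variances["SATM"]), 2))
```

Because the table entries are themselves rounded, a correlation recomputed this way can occasionally differ from the printed value in the third decimal place.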
ECU Users: Curiously, the SPSS (20) installation provided for ECU faculty to use on campus does not contain the missing values and multiple imputation modules, but that provided for use off campus does. Go figure.
Karl L. Wuensch, December 2012