“A User-Friendly Demonstration of Principal Components Analysis as a Data Reduction Method”
R. Michael Haynes, PhD
Assistant Vice President
Student Life Studies
Tarleton State University
Keith Lamb, MBA
Associate Vice President
Student Affairs
Midwestern State University
What is Principal Components Analysis (PCA)?
• A member of the general linear model (GLM) family, in which all analyses are correlational
• A term often used interchangeably with “factor analysis”, although there are slight differences between the two
• A method of reducing large data sets into more manageable “factors” or “components”
• A method of identifying the most useful variables in a dataset
• A method of identifying and classifying variables across common themes, or the constructs that they represent
Before we get started, a GLOSSARY of terms we’ll be using today:
• Bartlett’s Test of Sphericity
• Communality coefficients
• Construct
• Correlation matrix
• Cronbach’s alpha coefficient
• Effect sizes (variance accounted for)
• Eigenvalues
• Extraction
• Factor or component
• Kaiser criterion for retaining factors
• Kaiser-Meyer-Olkin Measure of Sampling Adequacy
• Latent
• Reliability
• Rotation
• Scree plot
• Split-half reliability
• Structure coefficients
Desired outcomes from today’s session
• Understand:
  • The terminology associated with principal components analysis (PCA)
  • When using PCA is appropriate
  • How to conduct PCA using SPSS 17.0
  • How to interpret a correlation matrix
  • How to interpret a communality matrix
  • How to interpret a components matrix, and the methods used in determining how many components to retain
  • How to analyze a component to determine which variables to include and why
  • The concept of reliability and why it is important in survey research
LET’S GET STARTED!
When is using PCA appropriate?
• When your data are interval or ratio level
• When you have at least 5 observations per variable and at least 100 observations overall (e.g., 20 variables require at least 100 observations)
• When trying to reduce the number of variables to be used in another GLM technique (e.g., regression, MANOVA)
• When attempting to identify latent constructs that are being measured by observed variables in the absence of a priori theory
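The sample-size rule of thumb above can be checked with a few lines of Python. This is only a sketch outside the SPSS workflow; the function name and its defaults are illustrative assumptions, not part of any library.

```python
def meets_sample_size_rule(n_observations, n_variables,
                           per_variable=5, minimum=100):
    """Return True when both rule-of-thumb conditions hold.

    Assumed reading of the rule: at least `per_variable` observations
    per variable AND at least `minimum` observations overall.
    """
    required = max(minimum, per_variable * n_variables)
    return n_observations >= required


# The heuristic data in this session: 998 participants, 45 questions.
print(meets_sample_size_rule(998, 45))   # requirement is max(100, 225)
print(meets_sample_size_rule(150, 45))   # 150 falls short of 225
```

With 45 variables the per-variable condition (225 observations) is the binding one, so the 998 responses used here comfortably satisfy the rule.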
HEURISTIC DATA
• Responses to the Developing Purpose Inventory (DPI) collected at a large, metropolitan university between 2004 and 2006 (IRB approval received)
• 45 questions related to Chickering’s developing purpose stage
• Responses on a 5-point interval scale; 1 = “always true” to 5 = “never true”
• Sample size = 998 participants
• SUGGESTION: always visually inspect data for missing cases and potential outliers! (APA Task Force on Statistical Inference, 1999)
• There are multiple ways of dealing with missing data, but that’s for another day!
SPSS 17.0
• Make sure your set-up in “Variable View” is complete to accommodate your data
  • Names, labels, possible values of the data, and type of measure
SPSS 17.0
• Analyze > Dimension Reduction > Factor
SPSS 17.0 SYNTAX
Orange indicates sections specific to your analysis!
DATASET ACTIVATE DataSet1.
FACTOR
/VARIABLES question1 question2 question3 question4 question5 question6 question7 question8
question9 question10 question11 question12 question13 question14 question15 question16
question17 question18 question19 question20 question21 question22 question23 question24
question25 question26 question27 question28 question29 question30 question31 question32
question33 question34 question35 question36 question37 question38 question39 question40
question41 question42 question43 question44 question45
/MISSING LISTWISE
/ANALYSIS question1 question2 question3 question4 question5 question6 question7 question8
question9 question10 question11 question12 question13 question14 question15 question16
question17 question18 question19 question20 question21 question22 question23 question24
question25 question26 question27 question28 question29 question30 question31 question32
question33 question34 question35 question36 question37 question38 question39 question40
question41 question42 question43 question44 question45
/PRINT INITIAL CORRELATION SIG KMO EXTRACTION ROTATION FSCORE
/FORMAT SORT BLANK(.000)
/PLOT EIGEN
/CRITERIA MINEIGEN(1) ITERATE(25)
/EXTRACTION PC
/CRITERIA ITERATE(25)
/ROTATION VARIMAX
/SAVE AR(ALL)
/METHOD=CORRELATION.
OUTPUT COMPONENTS
• Correlation Matrix
  • Pearson r between the individual variables
  • Coefficients range from -1.0 to +1.0; strong, modest, weak; positive, negative
  • Correlations of 1.000 on the diagonal; every variable is “perfectly and positively” correlated with itself!
  • It is this information that is the basis for PCA! In other words, if you have only a correlation matrix, you can conduct PCA!

                    Q1 - ARI   Q2 - VI   Q3 - SL   Q4 - ARI   Q5 - VI
Question 1 - ARI       1.000      .157      .077       .165      .069
Question 2 - VI         .157     1.000      .261       .109      .211
Question 3 - SL         .077      .261     1.000       .157      .017
Question 4 - ARI        .165      .109      .157      1.000      .098
Question 5 - VI         .069      .211      .017       .098     1.000
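To illustrate the point that a correlation matrix alone is enough to conduct PCA, here is a minimal Python sketch that extracts the first principal component from the five-question correlation matrix shown above. It uses power iteration, chosen here only because it needs nothing beyond the standard library; it is not the extraction algorithm SPSS uses, and the function name is an assumption for illustration.

```python
import math

# Correlations among Questions 1-5, copied from the matrix above.
R = [
    [1.000, 0.157, 0.077, 0.165, 0.069],
    [0.157, 1.000, 0.261, 0.109, 0.211],
    [0.077, 0.261, 1.000, 0.157, 0.017],
    [0.165, 0.109, 0.157, 1.000, 0.098],
    [0.069, 0.211, 0.017, 0.098, 1.000],
]


def first_component(matrix, iterations=500):
    """Power iteration: return (largest eigenvalue, unit-length eigenvector)."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v'Rv gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * matrix[i][j] * v[j]
                     for i in range(n) for j in range(n))
    return eigenvalue, v


eigenvalue, weights = first_component(R)
# Unrotated structure coefficients: eigenvector weights times sqrt(eigenvalue).
loadings = [w * math.sqrt(eigenvalue) for w in weights]
print(round(eigenvalue, 3), [round(x, 3) for x in loadings])
```

Because only five of the 45 questions are included, the eigenvalue printed here will not match any value in the full SPSS output; the sketch shows the mechanics only.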
OUTPUT COMPONENTS
• KMO Measure of Sampling Adequacy and Bartlett’s Test of Sphericity
  • KMO values closer to 1.0 are better
  • Kaiser (1970, 1975; as cited by Meyers, Gamst, & Guarino, 2006) states that a value of .70 is considered adequate.
  • Bartlett’s Test: you want a statistically significant value
    • Reject the null hypothesis of a lack of sufficient correlation between the variables.

Kaiser-Meyer-Olkin Measure of Sampling Adequacy        .861
Bartlett's Test of Sphericity    Approx. Chi-Square    9193.879
                                 df                    990
                                 Sig.                  .000
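The degrees of freedom in the Bartlett output can be reproduced by hand. The Python sketch below assumes the standard textbook form of Bartlett's chi-square approximation; the function names are illustrative, and the determinant of the 45-item correlation matrix is not reported in the output, so the chi-square value itself is not recomputed.

```python
def bartlett_df(n_variables):
    """Degrees of freedom: the number of distinct off-diagonal correlations."""
    return n_variables * (n_variables - 1) // 2


def bartlett_chi_square(n_observations, n_variables, log_determinant):
    """Chi-square approximation: -(n - 1 - (2p + 5) / 6) * ln|R|.

    log_determinant is the natural log of the determinant of the
    correlation matrix R.
    """
    return -(n_observations - 1 - (2 * n_variables + 5) / 6) * log_determinant


print(bartlett_df(45))  # 990, matching the df in the output above
```

An identity correlation matrix has a determinant of 1 (log of 0), so the statistic is zero; the more the variables correlate, the smaller the determinant and the larger the chi-square.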
OUTPUT COMPONENTS
• Communality Coefficients
  • amount of variance in the variable accounted for by the components
  • higher coefficients = stronger variables
  • lower coefficients = weaker variables

                     Initial   Extraction
Question 1 - ARI       1.000         .560
Question 2 - VI        1.000         .446
Question 3 - SL        1.000         .773
Question 4 - ARI       1.000         .519
Question 5 - VI        1.000         .539
Question 6 - SL        1.000         .439
Question 7 - ARI       1.000         .605
Question 8 - VI        1.000         .527
Question 9 - SL        1.000         .537
Question 10 - ARI      1.000         .775
Question 11 - VI       1.000         .635
Question 12 - SL       1.000         .476
Question 13 - ARI      1.000         .542
Question 14 - VI       1.000         .435
Question 15 - SL       1.000         .426
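A communality coefficient is the sum of an item's squared structure coefficients across the retained components. The Python sketch below checks this for Question 1, using its twelve coefficients from the rotated component matrix reported in this deck. Varimax is an orthogonal rotation, so the rotated values give the same sum as the unrotated ones. The function name is an assumption for illustration.

```python
# Structure coefficients for Question 1 - ARI on the 12 retained components,
# copied from the rotated component matrix in this deck.
question1_loadings = [-.002, -.062, .082, .722, .055, -.018,
                      .008, -.014, .039, .132, .015, -.075]


def communality(loadings):
    """Sum of squared structure coefficients across retained components."""
    return sum(value ** 2 for value in loadings)


h2 = communality(question1_loadings)
# Compare with the Extraction value of .560 reported for Question 1.
print(round(h2, 3))
```

Small differences in the third decimal place can appear for other items because the coefficients in the output are already rounded.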
OUTPUT COMPONENTS
• Total Variance Explained Table
  • Lists the individual components (remember, you have as many components as you have variables) by eigenvalue and variance accounted for
  • How do we determine how many components to retain?

            Initial Eigenvalues           Extraction Sums of Squared Loadings   Rotation Sums of Squared Loadings
Component   Total   % of Var.   Cum. %    Total   % of Var.   Cum. %            Total   % of Var.   Cum. %
 1          7.216     16.035    16.035    7.216     16.035    16.035            3.666      8.147     8.147
 2          3.107      6.904    22.938    3.107      6.904    22.938            2.649      5.887    14.034
 3          2.455      5.456    28.395    2.455      5.456    28.395            2.597      5.771    19.806
 4          1.846      4.103    32.498    1.846      4.103    32.498            2.555      5.677    25.482
 5          1.690      3.755    36.253    1.690      3.755    36.253            2.243      4.984    30.466
 6          1.458      3.239    39.493    1.458      3.239    39.493            2.189      4.865    35.331
 7          1.307      2.906    42.398    1.307      2.906    42.398            1.746      3.880    39.212
 8          1.180      2.623    45.021    1.180      2.623    45.021            1.578      3.507    42.719
 9          1.107      2.461    47.482    1.107      2.461    47.482            1.555      3.455    46.174
10          1.064      2.364    49.846    1.064      2.364    49.846            1.314      2.919    49.093
11          1.024      2.275    52.121    1.024      2.275    52.121            1.221      2.712    51.805
12          1.014      2.253    54.374    1.014      2.253    54.374            1.156      2.569    54.374
13           .976      2.170    56.544
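The “% of Variance” column follows directly from the eigenvalues: with a correlation matrix, each of the 45 standardized variables contributes one unit of variance, so a component's share is its eigenvalue divided by 45. A minimal Python sketch, with illustrative names, reproduces the columns from the thirteen eigenvalues listed above.

```python
# Initial eigenvalues for components 1-13, copied from the table above.
eigenvalues = [7.216, 3.107, 2.455, 1.846, 1.690, 1.458, 1.307,
               1.180, 1.107, 1.064, 1.024, 1.014, 0.976]
N_VARIABLES = 45  # total variance equals the number of variables


def percent_of_variance(eigenvalue, n_variables=N_VARIABLES):
    """Share of total variance accounted for by one component."""
    return 100.0 * eigenvalue / n_variables


running = 0.0
for number, value in enumerate(eigenvalues, start=1):
    pct = percent_of_variance(value)
    running += pct
    print(f"{number:>2}  {value:6.3f}  {pct:7.3f}  {running:7.3f}")
```

The printed values can differ from the SPSS table in the last decimal place because the eigenvalues used here are already rounded to three decimals.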
OUTPUT COMPONENTS
• Total Variance Explained Table
  • Kaiser Criterion (K1 Rule): retain only those components with an eigenvalue greater than 1; can lead to retaining more components than necessary
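Applied to the eigenvalues in the Total Variance Explained table, the Kaiser criterion is a simple filter. The Python sketch below uses an illustrative function name and only the thirteen eigenvalues shown in the output.

```python
# Initial eigenvalues for components 1-13 from the Total Variance Explained table.
eigenvalues = [7.216, 3.107, 2.455, 1.846, 1.690, 1.458, 1.307,
               1.180, 1.107, 1.064, 1.024, 1.014, 0.976]


def kaiser_retained(values, cutoff=1.0):
    """Return the component numbers whose eigenvalue exceeds the cutoff."""
    return [number for number, value in enumerate(values, start=1)
            if value > cutoff]


retained = kaiser_retained(eigenvalues)
print(len(retained))  # 12 components have an eigenvalue greater than 1
```

Components 10 through 12 clear the cutoff only narrowly (1.064, 1.024, 1.014), which illustrates why the rule can retain more components than necessary.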
OUTPUT COMPONENTS
• Total Variance Explained Table
  • Retain as many components as will account for a pre-determined amount of variance, say 70%; can lead to retention of components that are variable specific (Stevens, 2002)
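The variance-threshold approach amounts to walking down the cumulative percentage column until the target is reached. The Python sketch below uses an illustrative function name and only the thirteen eigenvalues reported in the output, so it returns None when the target cannot be reached with them.

```python
# Initial eigenvalues for components 1-13 from the Total Variance Explained table.
eigenvalues = [7.216, 3.107, 2.455, 1.846, 1.690, 1.458, 1.307,
               1.180, 1.107, 1.064, 1.024, 1.014, 0.976]


def components_needed(values, n_variables, target_percent):
    """Smallest number of components whose cumulative variance meets the target.

    Returns None if the listed eigenvalues never reach the target.
    """
    cumulative = 0.0
    for number, value in enumerate(values, start=1):
        cumulative += 100.0 * value / n_variables
        if cumulative >= target_percent:
            return number
    return None


print(components_needed(eigenvalues, 45, 50))  # 11
print(components_needed(eigenvalues, 45, 70))  # None: 13 components reach only about 56.5%
```

In these data a 70% target would need well over 13 components, which is where the risk of variable-specific components comes from.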
OUTPUT COMPONENTS
• Scree Plot
  • Plots eigenvalues on the Y axis and component number on the X axis
  • Recommendation is to retain all components in the descent before the first one on the line where it levels off (Cattell, 1966; as cited by Stevens, 2002).
Other Retention Methods
• Velicer’s Minimum Average Partial (MAP) test
  • Seeks to determine what components are common
  • Does not seek a “cut-off” point, but rather to find a more “comprehensive” solution
  • Components that have a high number of highly correlated variables are retained
  • However, variable-based decisions can result in underestimating the number of components to retain
(Ledesma & Valero-Mora, 2007)
Other Retention Methods
• Horn’s Parallel Analysis (PA)
  • Compares observed eigenvalues with “simulated” eigenvalues
  • Retain all components with an eigenvalue greater than the “mean” of the simulated eigenvalues
  • Considered highly accurate and exempt from extraneous factors
(Ledesma & Valero-Mora, 2007)
OUTPUT COMPONENTS
• Component Matrix
  • Column values are structure coefficients, or the correlation between the test question and the synthetic component; REMEMBER: squared structure coefficients inform us of how well the item can reproduce the effect in the component!

Rotated Component Matrix
                   Component
Item                    1      2      3      4      5      6      7      8      9     10     11     12
Question 42 - SL     .781  -.060   .000   .117   .034   .071   .055  -.062   .093  -.002   .032   .025
Question 39 - SL     .778  -.132   .107   .109   .008   .024  -.025   .018   .044  -.010   .022  -.025
Question 33 - SL     .765  -.042   .115   .098   .034   .090  -.035  -.035   .011   .013  -.012   .020
Question 9 - SL      .672  -.103   .127   .092   .050   .126   .005  -.119  -.002  -.063  -.034  -.114
Question 37 - ARI    .462  -.173   .193  -.103   .075   .197   .345  -.018   .024   .232   .009   .119
Question 15 - SL     .406  -.002   .340   .038   .050   .091   .120  -.007   .067  -.152  -.127  -.273
Question 36 - SL     .395  -.067   .212  -.104   .225   .125   .365  -.089   .110   .168  -.037   .221
Question 44 - VI     .375  -.033   .360   .128   .175   .091   .221  -.023   .177  -.035  -.027  -.001
Question 26 - VI    -.022   .660  -.113   .009   .021  -.063  -.096   .089   .044   .034  -.060   .174
Question 27 - SL    -.158   .652  -.088   .032   .069  -.091   .040   .193  -.032  -.150  -.019   .003
Question 38 - VI    -.058   .501  -.109  -.171   .032  -.276  -.051   .078  -.042   .255  -.016  -.097
Question 20 - VI    -.240   .489   .016   .076   .036  -.092  -.052   .434  -.102   .071  -.079   .056
Question 32 - VI    -.101   .488  -.134   .084  -.074  -.415  -.010   .046   .025  -.057  -.050   .020
Question 45 - SL    -.144   .443  -.049  -.097  -.105  -.026  -.097   .078  -.031   .057   .421  -.013
Question 29 - VI    -.006   .439   .154  -.114   .007   .231   .238  -.196   .145  -.098   .089  -.138
Question 41 - VI    -.019   .421  -.087  -.210   .006  -.107   .333  -.005   .125   .091   .300  -.082
Question 24 - SL     .129  -.067   .720   .101   .147   .119  -.003   .011   .005   .011  -.012   .203
Question 21 - SL     .125  -.164   .676  -.056   .161   .047   .160  -.044  -.012   .137  -.006   .029
Question 23 - VI     .313  -.164   .537   .286   .063   .007   .076  -.094   .119   .049   .123   .031
Question 17 - VI     .076  -.050   .459   .187   .040   .136   .314   .048   .120  -.212   .083  -.140
Question 30 - SL     .120   .114   .420   .287  -.081   .309  -.109  -.165   .061   .328  -.107   .161
Question 22 - ARI    .042   .075   .364   .045   .087  -.081  -.135  -.353   .324   .216   .016  -.188
Question 34 - ARI    .187   .042   .067   .791  -.002   .075  -.031  -.019   .012   .063  -.050  -.036
Question 1 - ARI    -.002  -.062   .082   .722   .055  -.018   .008  -.014   .039   .132   .015  -.075
Question 35 - VI     .113   .077   .015   .569   .030   .439  -.053  -.140   .067  -.089   .095   .105
Question 40 - ARI    .194  -.161   .176   .553   .033   .057  -.041   .016   .186  -.086   .216   .147
Question 10 - ARI    .029   .016   .144   .033   .860   .036   .010  -.074   .032   .063  -.010   .006
Question 3 - SL      .069  -.015   .197   .050   .848  -.029   .025  -.011   .067  -.026  -.003   .004
Question 12 - SL     .297   .069   .072   .000   .488   .137   .282   .024   .033   .091   .082   .158
Question 13 - ARI   -.046   .058  -.118   .045   .447  -.102   .321   .069   .128   .368  -.222  -.033
Question 11 - VI     .151  -.021   .024   .361   .115   .663   .000  -.006  -.124  -.028   .021   .104
Question 5 - VI      .154  -.134   .201   .042  -.057   .652   .020   .028  -.019   .124   .039  -.092
Question 8 - VI     -.090   .250  -.017   .010   .000  -.623  -.034   .115  -.105   .141   .120   .088
Question 18 - SL     .034   .003   .095  -.055   .092  -.039   .686  -.026   .015   .006  -.024   .036
Question 14 - VI     .241  -.157   .289  -.007   .132   .221   .418   .061  -.057  -.006   .122  -.080
Question 28 - ARI   -.232   .248   .051   .181  -.128  -.237   .357  -.112   .043   .074  -.144   .240
Question 16 - ARI   -.069   .213  -.008   .062  -.006  -.075   .033   .678  -.051  -.101  -.103   .023
Question 19 - ARI    .001   .054  -.042  -.241  -.033  -.010  -.112   .630   .147  -.010   .127   .036
Question 43 - ARI    .138  -.011   .067   .255   .017   .045  -.091   .086   .756   .024  -.074   .075
Question 31 - ARI    .062   .045   .069  -.048   .122  -.040   .186  -.053   .721   .140  -.077   .033
Question 4 - ARI     .023  -.057   .119   .100   .132   .007   .034  -.131   .184   .643   .020  -.088
Question 6 - SL     -.186   .177  -.039   .065  -.051  -.066   .087   .372  -.059   .390   .230  -.080
Question 7 - ARI     .024  -.059   .047   .149   .010   .005   .016  -.017  -.133   .008   .736   .126
Question 2 - VI      .234  -.198   .246   .175   .233   .094   .203   .086   .179  -.161   .254  -.162
Question 25 - ARI   -.048   .063   .119   .021   .073  -.049   .064   .085   .078  -.123   .108   .767
Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
OUTPUT COMPONENTS
• Component Matrix
  • Column values are structure coefficients, or the correlation between the test question and the synthetic component; REMEMBER: squared structure coefficients inform us of how well the item can reproduce the effect in the component!
  • Rule of thumb: include all items with structure coefficients with an absolute value of .300 or greater
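The .300 rule of thumb can be applied mechanically to one column of the rotated component matrix. The Python sketch below uses a handful of Component 1 coefficients copied from the output (a subset chosen for illustration, not the full column); the function name is an assumption.

```python
# A subset of structure coefficients on Component 1, copied from the
# rotated component matrix (only six of the 45 items are shown here).
component1 = {
    "Question 42 - SL": .781,
    "Question 37 - ARI": .462,
    "Question 23 - VI": .313,
    "Question 12 - SL": .297,
    "Question 20 - VI": -.240,
    "Question 26 - VI": -.022,
}


def salient_items(loadings, cutoff=0.300):
    """Keep items whose absolute structure coefficient meets the cutoff."""
    return {item: value for item, value in loadings.items()
            if abs(value) >= cutoff}


kept = salient_items(component1)
print(list(kept))

# Squared structure coefficient: share of item variance tied to the component.
print(round(component1["Question 42 - SL"] ** 2, 3))
```

Under this rule Question 12 (.297) just misses inclusion on Component 1, which shows how sensitive a fixed cutoff can be near the boundary.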
OUTPUT COMPONENTS
• Component Matrix
  • For heuristic purposes, we’re retaining the first X components; what variables should we include in the components?
  • Column values are structure coefficients, or the correlation between the test question and the synthetic component; REMEMBER: squared structure coefficients inform us of how well the item can reproduce the effect in the component!
  • Rule of thumb: include all items with structure coefficients with an absolute value of .300 or greater
  • Stevens recommends a better way!

Critical Values for a Correlation Coefficient at α = .01 for a Two-Tailed Test
   n     CV        n     CV        n     CV
  50   .361      180   .192      400   .129
  80   .286      200   .182      600   .105
 100   .256      250   .163      800   .091
 140   .217      300   .149     1000   .081
(Stevens, 2002, p. 394)

• Test each structure coefficient for statistical significance against the tabled two-tailed critical value (CV) for the sample size, doubled; for our sample size of 998 (nearest tabled n = 1000), the CV is .081, which doubles to .162 in absolute value.
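The doubled-critical-value approach can be sketched in Python as a table lookup followed by a filter. Picking the nearest tabled n is an assumption made here to match the slide's use of the n = 1000 row for 998 participants; the function name and the six-item subset of Component 1 coefficients are illustrative.

```python
# Critical values at alpha = .01 (two-tailed), as tabled on the slide
# (Stevens, 2002, p. 394).
CRITICAL_VALUES = {50: .361, 80: .286, 100: .256, 140: .217,
                   180: .192, 200: .182, 250: .163, 300: .149,
                   400: .129, 600: .105, 800: .091, 1000: .081}


def doubled_critical_value(sample_size):
    """Double the critical value of the nearest tabled sample size (assumption)."""
    nearest_n = min(CRITICAL_VALUES, key=lambda n: abs(n - sample_size))
    return 2 * CRITICAL_VALUES[nearest_n]


cutoff = doubled_critical_value(998)
print(round(cutoff, 3))  # 0.162

# A subset of Component 1 structure coefficients from the rotated matrix.
component1 = {
    "Question 42 - SL": .781,
    "Question 37 - ARI": .462,
    "Question 23 - VI": .313,
    "Question 12 - SL": .297,
    "Question 20 - VI": -.240,
    "Question 26 - VI": -.022,
}
significant = [item for item, value in component1.items()
               if abs(value) >= cutoff]
print(significant)
```

With 998 participants the doubled critical value (.162) is lower than the .300 rule of thumb, so items such as Question 12 (.297) and Question 20 (-.240) now qualify for Component 1.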
Obtaining Continuous Component Values for Use in Further Analysis
• Sum the interval values for the responses of all questions included in the retained component
• Obtain mean values for the responses of all questions included in the retained component…hint…you’ll get the same R, R², ß, and structure coefficients as with the sums!
• Use SPSS to obtain factor scores for the component
  • Choose the “Scores” button when setting up your PCA
  • Options include calculating scores based on regression, Bartlett, or Anderson-Rubin methodologies…be sure to check “Save as Variables”
  • Factor scores will appear in your data set and can be used as variables in other GLM analyses
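The hint about sums and means holds because a mean is just the sum divided by a constant, and correlations are unaffected by rescaling. The Python sketch below demonstrates this with made-up numbers: the responses, the outcome variable, and the function name are all hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical responses (1-5 scale) of six participants to the three
# questions assumed to belong to one retained component.
responses = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 3, 5], [5, 5, 4], [2, 1, 2]]
outcome = [10, 14, 19, 22, 27, 12]  # hypothetical criterion variable

sums = [sum(person) for person in responses]
means = [sum(person) / len(person) for person in responses]

r_sum = pearson_r(sums, outcome)
r_mean = pearson_r(means, outcome)
print(round(r_sum, 6), round(r_mean, 6))  # the two correlations agree
```

Unstandardized weights (b) would differ between the two scorings by the constant factor, but standardized results do not.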
RELIABILITY
• The extent to which scores on a test are consistent across multiple administrations of the test; the amount of measurement error in the scores yielded by a test (Gall, Gall, & Borg, 2003).
• Validity, in contrast, is about ensuring our tests are really measuring what we intended to measure: “You wouldn’t administer an English literature test to assess math competency, would you?”
• Can be measured several ways using SPSS 17.0
A Visual Explanation of Reliability and Validity
[Figures not reproduced in this text version]
RELIABILITY
Cronbach’s Alpha Coefficient
RELIABILITY
/VARIABLES=question1 question2 question3 question4 question5 question6 question7 question8
question9 question10 question11 question12 question13 question14 question15 question16 question17
question18 question19 question20 question21 question22 question23 question24 question25 question26
question27 question28 question29 question30 question31 question32 question33 question34 question35
question36 question37 question38 question39 question40 question41 question42 question43 question44
question45
/SCALE('ALL VARIABLES') ALL
/MODEL=ALPHA.
Split-Half Coefficient
RELIABILITY
/VARIABLES=question1 question2 question3 question4 question5 question6 question7 question8
question9 question10 question11 question12 question13 question14 question15 question16 question17
question18 question19 question20 question21 question22 question23 question24 question25 question26
question27 question28 question29 question30 question31 question32 question33 question34 question35
question36 question37 question38 question39 question40 question41 question42 question43 question44
question45
/SCALE('ALL VARIABLES') ALL
/MODEL=SPLIT.
RELIABILITY
Cronbach’s Alpha Coefficient

Reliability Statistics
Cronbach's Alpha   N of Items
.749               45

Benchmarks for Alpha
• .9 & up = very good
• .8 to .9 = good
• .7 to .8 = acceptable
• .7 & below = suspect

“… don’t refer to the test as ‘reliable’, but scores from this administration of the test yielded reliable results” … Kyle Roberts
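For readers who want to see what sits behind the alpha value, the coefficient can be computed from item variances and the variance of the total score. The Python sketch below assumes the standard alpha formula and uses made-up responses; it is not the SPSS procedure, and the function name is illustrative.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha from a list of items.

    item_scores holds one list of responses per item, with participants
    in the same order in every list.
    """
    k = len(item_scores)
    totals = [sum(person) for person in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses of five participants to three items.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

Alpha rises as items covary more strongly relative to their individual variances; items that are perfect copies of one another give an alpha of 1.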
RELIABILITY
Split-Half Coefficient

Reliability Statistics
Cronbach's Alpha                 Part 1    Value          .620
                                           N of Items     23 (a)
                                 Part 2    Value          .623
                                           N of Items     22 (b)
                                 Total N of Items         45
Correlation Between Forms                                 .518
Spearman-Brown Coefficient       Equal Length             .683
                                 Unequal Length           .683
Guttman Split-Half Coefficient                            .683
a. The items are: Question 1 - ARI, Question 2 - VI, Question 3 - SL, Question 4 - ARI, Question 5 - VI, Question 6 - SL, Question 7 - ARI, Question 8 - VI, Question 9 - SL, Question 10 - ARI, Question 11 - VI, Question 12 - SL, Question 13 - ARI, Question 14 - VI, Question 15 - SL, Question 16 - ARI, Question 17 - VI, Question 18 - SL, Question 19 - ARI, Question 20 - VI, Question 21 - SL, Question 22 - ARI, Question 23 - VI.
b. The items are: Question 23 - VI, Question 24 - SL, Question 25 - ARI, Question 26 - VI, Question 27 - SL, Question 28 - ARI, Question 29 - VI, Question 30 - SL, Question 31 - ARI, Question 32 - VI, Question 33 - SL, Question 34 - ARI, Question 35 - VI, Question 36 - SL, Question 37 - ARI, Question 38 - VI, Question 39 - SL, Question 40 - ARI, Question 41 - VI, Question 42 - SL, Question 43 - ARI, Question 44 - VI, Question 45 - SL.
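The equal-length Spearman-Brown coefficient in the output follows from the correlation between the two halves. The Python sketch below assumes the standard prophecy formula for doubling test length; the function name is illustrative.

```python
def spearman_brown(half_correlation):
    """Equal-length Spearman-Brown coefficient: 2r / (1 + r)."""
    return 2 * half_correlation / (1 + half_correlation)


# Correlation between forms reported in the output above.
coefficient = spearman_brown(.518)
print(coefficient)
```

The result agrees with the reported .683 to within rounding; SPSS works from the unrounded correlation, so the third decimal can differ when starting from .518.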
RELATED LINKS
• http://faculty.chass.ncsu.edu/garson/PA765/factor.htm
• http://www.uic.edu/classes/epsy/epsy546/Lecture%204%20--%20notes%20on%20PRINCIPAL%20COMPONENTS%20ANALYSIS%20AND%20FACTOR%20ANALYSIS1.pdf
• http://www.ats.ucla.edu/stat/Spss/output/factor1.htm
• http://www.statsoft.com/textbook/principalcomponents-factor-analysis/
REFERENCES
Gall, M.D., Gall, J.P., & Borg, W.R. (2003). Educational research: An introduction (7th ed.). Boston: Allyn and Bacon.
Ledesma, R.D., & Valero-Mora, P. (2007). Determining the number of factors to retain in EFA: An easy-to-use computer program for carrying out parallel analysis. Practical Assessment, Research, & Evaluation, 12(2).
Meyers, L.S., Gamst, G., & Guarino, A.J. (2006). Applied multivariate research: Design and interpretation. Thousand Oaks, CA: Sage.
Stevens, J.P. (2002). Applied multivariate statistics for the social sciences (4th ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
University of California at Los Angeles Academic Technology Services (2009). Annotated SPSS output: Factor analysis. Retrieved January 11, 2010 from http://www.ats.ucla.edu/stat/Spss/output/factor1.htm
University of Illinois at Chicago (2009). Principal components analysis and factor analysis. Retrieved January 11, 2010 from http://www.uic.edu/classes/epsy/epsy546/Lecture%204%20--%20notes%20on%20PRINCIPAL%20COMPONENTS%20ANALYSIS%20AND%20FACTOR%20ANALYSIS1.pdf
Wilkinson, L., & Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanation. American Psychologist, 54, 594-604.