Guidelines for Reliability, Confirmatory and Exploratory Factor Analysis
Diana Suhr, Ph.D., University of Northern Colorado, Greeley CO
Mary Shay, Cherry Creek Schools, Denver CO
Abstract
Reliability refers to accuracy and precision of a measurement instrument. Confirmatory factor analysis (CFA) is a
statistical technique used to verify the factor structure of a measurement instrument. Exploratory factor analysis (EFA) is traditionally used to
explore the possible underlying factor structure of a measurement instrument.
Guidelines for reliability, confirmatory and exploratory factor analysis will be discussed. Examples using the Race and
Schooling Instrument (Revised, Shay 2008) and PROC CORR, PROC CALIS and PROC FACTOR illustrate
reliability, CFA and EFA statistical techniques.
Introduction
CFA and EFA are powerful statistical techniques. An example of CFA and EFA could occur with the development of
measurement instruments, e.g. a satisfaction scale, attitudes toward health, customer service questionnaire. During
the process, a blueprint is developed, questions written, a response scale determined, the instrument pilot tested,
data collected, and CFA completed. The blueprint identifies the factor structure or what we think it is. If questions do
not measure what we thought they should, the factor structure does not follow our blueprint, the factor structure is not
confirmed, and EFA is the next step. EFA helps us determine what the factor structure looks like according to
participant responses. Exploratory factor analysis is essential to determine underlying constructs for a set of
measured variables.
Measurement Instrument
The example for this presentation analyzes data collected with the Race and Schooling School-Based Instrument
(Shay, 2008). The 23-item instrument measures the essential principles of culturally responsive teaching and
learning. Each question, aligning to a principle and category, provides insight into participants’ levels of involvement
in culturally responsive teaching practices and attitudes and beliefs toward confronting institutional bias and
discrimination in schools (Shay, in press).
The Race and Schooling School-Based Instrument was developed by Singleton (2003) to include seven categories.
Permission to use and adjust the instrument for research purposes was granted by Singleton to Shay (2008). See
Appendix A to review the survey instrument.
The original survey instrument had a 5-point Likert scale with (1) Rarely, (3) Sometimes, and (5) Often as response
choices to survey questions. In the revised version, responses were changed to (1) Almost Never; (2) Seldom; (3)
Sometimes; (4) Frequently; and (5) Almost Always. Additional questions were added to allow respondents to provide
information on demographics, e.g., years of experience in the district and in the field of education.
Data were collected in a metropolitan school district in the Midwestern United States. Of the 282 respondents, 16% were
male (n=42) and 84% were female (n=220). The majority of respondents were white (89.1%, n=229), with 3.5%
Hispanic (n=9), 3.5% Black (n=9), 2.0% Asian (n=5), and 2.0% identifying their race as Other.
Reliability
Reliability refers to the accuracy and precision of a measurement procedure (Thorndike, Cunningham, Thorndike, &
Hagen, 1991). Reliability may be viewed as an instrument’s relative lack of error. In addition, reliability is a function of
properties of the underlying construct being measured, the test itself, the groups being assessed, the testing
environment, and the purpose of assessment. Reliability answers the question, How well does the instrument
measure what it purports to measure?
Some degree of inconsistency is present in all measurement procedures. The variability in a set of item scores is due
to the actual variation across individuals in the phenomenon that the scale measures, made up of true score and
error. Therefore, each observation of a measurement (X) is equal to true score (T) plus measurement error (e), or X =
T + e. Another way to think about total variation is that it has two components: “signal” (i.e., true differences in the
latent construct) and “noise” (i.e., differences caused by everything but true differences).
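The signal and noise view can be written as a worked equation. The sketch below assumes, as classical test theory usually does, that true scores and errors are uncorrelated; that assumption and the variance symbols are not stated in the text above and are introduced only for illustration.

```latex
\sigma_X^2 = \sigma_T^2 + \sigma_e^2,
\qquad
\text{reliability} = \frac{\sigma_T^2}{\sigma_X^2} = 1 - \frac{\sigma_e^2}{\sigma_X^2}
```

Read this way, reliability is the share of observed score variance that is signal; it approaches 1 as the error variance shrinks.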
Sources of measurement inconsistency could be due to
1) a person changing from one testing to the next
a. the amount of time between tests may have resulted in growth or change
b. motivation to perform may be different at each testing
c. the individual may be physically better able to perform, e.g., more rested
d. the individual may have received tutoring between testings
2) the task being different from one testing to the next
a. different environment
b. different administrator
c. different test items on parallel forms
3) the sample of behavior resulting in an unstable or undependable score
a. the sample of behavior and evaluation of it are subject to chance or random influences
b. a small sample of behavior does not provide a stable and dependable characterization of an individual
c. for example, the average distance of 100 throws of a football would provide a more stable and accurate
index of ability than a single throw.
Reliability may be expressed
1) as an individual’s position in a group (correlation between first and second measurements; the more nearly
individuals are ranked in the same order, the higher the correlation and the more reliable the test)
2) within a set of repeated measures for an individual (internal consistency, i.e., how consistently items are answered)
Reliability can be assessed by
1) repeating the same test or measure (test-retest)
2) administering an equivalent form (parallel test forms)
3) using single-administration methods
a. subdividing the test into two or more equivalent parts
b. internal consistency, measured with Cronbach’s coefficient alpha.
Internal Consistency
Internal consistency is a procedure to estimate the reliability of a test from a single administration of a single form.
Internal consistency depends on the individual’s performance from item to item based on the standard deviation of
the test and the standard deviations of the items.
$$\alpha = \left(\frac{n}{n-1}\right)\left(\frac{SD_t^2 - \sum SD_i^2}{SD_t^2}\right) \tag{1}$$
where
α is the estimate of reliability,
n is the number of items in the test,
SD_t is the standard deviation of the test scores,
Σ means “take the sum” and covers the n items,
SD_i is the standard deviation of the scores from a group of individuals on an item.
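Equation (1) can be checked by hand in a short DATA step. The numbers below are hypothetical and chosen only to make the arithmetic easy to follow; the data set and variable names are illustrative and do not come from the study.

```sas
/* Hypothetical values, chosen only to illustrate Equation (1) */
data alpha_sketch;
   n         = 5;       /* number of items in the test                 */
   sum_var_i = 5.0;     /* sum of the squared item standard deviations */
   var_t     = 12.5;    /* squared standard deviation of test scores   */
   alpha = (n / (n - 1)) * ((var_t - sum_var_i) / var_t);
   put alpha=;          /* writes alpha=0.75 to the log */
run;
```

In practice the ALPHA option of PROC CORR performs this calculation from the raw item responses, so a hand calculation like this is mainly useful for seeing how the item and test variances enter the estimate.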
KR20, Kuder-Richardson Formula 20, is a special form of coefficient alpha that applies when items are dichotomous
(e.g., yes/no, true/false) or are scored as right or wrong.
Factors Influencing Reliability
Many factors can affect the reliability of a measurement instrument. They are the
1) range of the group
a. pooling a wider range of grades or ages produces a reliability coefficient of higher magnitude
b. take into account the sample on which the reliability coefficient is based when comparing
instruments
2) level of ability in the group
a. precision of measurement can be related to ability level of the people being measured
b. report the standard error of measurement for different score levels
3) methods used for estimating reliability
a. amount of time between administrations
b. method of calculating reliability
4) length of the test
a. when reliability is moderately high, it takes a considerable increase in test length to increase
reliability
b. relationship of reliability to length of test can be expressed with
$$r_{kk} = \frac{k\,r_{tt}}{1 + (k - 1)\,r_{tt}} \tag{2}$$
where r_kk is the reliability of the test k times as long as the original test,
r_tt is the reliability of the original test, and
k is the factor by which the length of the test is changed.
For example, if reliability is .60 for a 10-item instrument, what is the reliability for a 20-item instrument? The test is doubled in length, so k = 2 and
rkk = 2(.60) / (1 + (2 - 1)(.60)) = 1.20 / 1.60 = 0.75
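The same calculation can be tabulated for several length factors. This is a minimal sketch of Equation (2) that starts from the .60 reliability used in the example; the data set name and the range of k are arbitrary choices for illustration.

```sas
/* Equation (2) for a test made k times as long */
data spearman_brown;
   rtt = 0.60;                        /* reliability of the original test */
   do k = 1 to 4;                     /* factor by which length changes   */
      rkk = (k * rtt) / (1 + (k - 1) * rtt);
      output;
   end;
run;

proc print data=spearman_brown noobs;
   format rkk 6.4;
run;
```

The row for k = 2 reproduces the 0.75 worked out above, and the later rows show the diminishing return described in item 4a.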
Levels of Reliability
Acceptable levels of reliability depend on the purpose of the instrument. Acceptable reliability of instruments
developed for research purposes can be as low as 0.60. An acceptable reliability level of a diagnostic instrument
used for making decisions about individuals (e.g., a psychological measure) should be much higher, e.g., 0.95.
Comparisons
The reliability coefficient provides a basis for assessment instrument comparison when measurement is expressed in
different scales. The assessment with the higher reliability coefficient could provide a more consistent measurement
of individuals.
Statistical Power
An often overlooked benefit of more reliable scales is that they increase statistical power for a given sample size (or
allow smaller sample size to yield equivalent power), relative to less reliable measures. A reliable measure, like a
larger sample, contributes relatively less error to the statistical analysis. The power gained from improving reliability
depends on a number of factors including
1) the initial sample size
2) the probability level set for detecting a Type I error
3) the effect size (e.g., mean difference) that is considered significant
4) the proportion of error variance that is attributable to measure unreliability rather than sample heterogeneity
or other sources.
To raise the power, substitute a highly reliable scale for a substantially poorer one. For example, when n = 50, two
scales with reliabilities of 0.38 have a correlation of 0.24, p < 0.10, and would be significant at p < 0.01 if their
reliabilities were increased to 0.90 or if the sample was more than twice as large (n > 100).
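The text does not say how these figures were derived. One reading that is consistent with them uses the classical attenuation relationship, assumed here rather than taken from the source: the observed correlation equals the correlation between true scores multiplied by the square root of the product of the two reliabilities. The symbols below are introduced only for this illustration.

```latex
r_{xy} = \rho_{T_x T_y}\sqrt{r_{xx}\,r_{yy}}
\quad\Rightarrow\quad
\rho_{T_x T_y} = \frac{0.24}{\sqrt{(0.38)(0.38)}} \approx 0.63,
\qquad
r_{xy}^{\text{new}} \approx 0.63\,\sqrt{(0.90)(0.90)} \approx 0.57
```

Under that assumption, raising both reliabilities from 0.38 to 0.90 moves the expected observed correlation from 0.24 to roughly 0.57 with the same 50 cases.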
PROC CORR and options for Reliability
DATA=       input data set
OUTP=       output data set with Pearson correlation statistics
ALPHA       computes Cronbach’s coefficient alpha
NOMISS      excludes observations with missing analysis values
NOCORR      suppresses printing of Pearson correlations
NOSIMPLE    suppresses printing of descriptive statistics
Confirmatory Factor Analysis
CFA allows the researcher to test the hypothesis that a relationship between the observed variables and their
underlying latent construct(s) exists. The researcher uses knowledge of the theory, empirical research, or both,
postulates the relationship pattern a priori and then tests the hypothesis statistically.
PROC CALIS and options for CFA
DATA=           specifies the data set to be analyzed
OUTSTAT=        output statistics
COV             analyzes the covariance matrix
CORR            analyzes the correlation matrix
METHOD=         estimation method
MAXITER=        maximum number of iterations
KURTOSIS        computes and displays kurtosis
MODIFICATION    modification indices
Exploratory Factor Analysis
The search by psychologists for a neat and tidy description of human intellectual abilities led to the development of
factor analytic methods. Galton, a scientist during the 19th and 20th centuries, laid the foundations for factor analytic
methods by developing quantitative methods to determine the interdependence between two variables. Karl Pearson
was the first to explicitly define factor analysis. In 1902, Macdonnell was the first to publish an application of factor
analysis. His study compared physical characteristics of 3000 criminals and 1000 Cambridge undergraduates.
Factor analysis could be described as orderly simplification of interrelated measures. Traditionally factor analysis has
been used to explore the possible underlying structure of a set of interrelated variables without imposing any
preconceived structure on the outcome (Child, 1990). By performing exploratory factor analysis (EFA), the number of
constructs and the underlying factor structure are identified.
Goals of factor analysis are
1) to help an investigator determine the number of latent constructs underlying a set of items (variables)
2) to provide a means of explaining variation among variables (items) using few newly created variables (factors),
e.g., condensing information
3) to define the content or meaning of factors, e.g., latent constructs
Assumptions underlying EFA are
• Interval or ratio level of measurement
• Random sampling
• Relationship between observed variables is linear
• A normal distribution (each observed variable)
• A bivariate normal distribution (each pair of observed variables)
• Multivariate normality
Limitations of EFA are
• the correlations, the basis of factor analysis, describe relationships. No causal inferences can be made from
correlations alone.
• the reliability of the measurement instrument (avoid an instrument with low reliability)
• sample size (larger sample → larger correlation)
o minimal number for reliable results is greater than 100 and 5 times the number of items
o since some subjects may not answer every item, a larger sample is desirable, e.g., for 30 items, at least
150 subjects (5*30); a sample of 200 subjects would allow for missing data.
• sample selection
o Representative of population
o Do not pool populations
• variables could be sample specific, e.g., a unique quality possessed by a group does not generalize to the
population
• nonnormal distribution of data
Factor Extraction
Factor analysis seeks to discover common factors. The technique for extracting factors attempts to take out as much
common variance as possible in the first factor. Subsequent factors are, in turn, intended to account for the maximum
amount of the remaining common variance until, hopefully, no common variance remains.
Direct extraction methods obtain the factor matrix directly from the correlation matrix by application of specified
mathematical models. Most factor analysts agree that direct solutions are not sufficient. Adjustment to the frames of
reference by rotation methods improves the interpretation of factor loadings by reducing some of the ambiguities
which accompany the preliminary analysis (Child, 1990). The process of manipulating the reference axes is known as
rotation. The results of rotation methods are sometimes referred to as derived solutions because they are obtained as
a second stage from the results of direct solutions.
Rotation applied to the reference axes means the axes are turned about the origin until some alternative position has
been reached. The simplest case is when the axes are held at 90° to each other, orthogonal rotation. Rotating the
axes through different angles gives an oblique rotation (not at 90° to each other).
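For two factors, an orthogonal rotation can be written out explicitly. In the sketch below, A stands for the unrotated loading matrix (items by factors), T for the rotation matrix, and θ for the rotation angle; these symbols are assumptions made for illustration and do not appear in the source.

```latex
A^{*} = A\,T,
\qquad
T = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix},
\qquad
T\,T^{\top} = I
\;\Rightarrow\;
A^{*}A^{*\top} = A\,T\,T^{\top}A^{\top} = A\,A^{\top}
```

Because T is orthogonal, the rotated loadings reproduce the same common variance for every item; only the way that variance is divided among the factors changes, which is why rotation can aid interpretation without altering fit.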
Methods
As an aside, names given to factor extraction methods have some interesting origins.
• Procrustes was a highwayman who tied his victims to a bed and shaped them to its structure either by stretching
them or by cutting off their limbs. In factor analysis, the Procrustes technique/method involves testing data to see
how closely they fit a hypothesized factor structure.
• The plasmode method is taken from well-established areas (e.g., physics, chemistry) so that the factor structure
is predictable.
Criteria for Extracting Factors
Determining the number of factors to extract in a factor analytic procedure is dependent on meeting appropriate
criteria. They are
1) Kaiser’s criterion, suggested by Guttman and adapted by Kaiser, considers factors with an eigenvalue greater
than one as common factors (Nunnally, 1978)
2) Cattell’s (1966) scree test. The name is based on an analogy between the debris, called scree, that collects at
the bottom of a hill after a landslide, and the relatively meaningless factors that result from overextraction. On a
scree plot, because each factor explains less variance than the preceding factors, an imaginary line connecting
the markers for successive factors generally runs from top left of the graph to the bottom right. If there is a point
below which factors explain relatively little variance and above which they explain substantially more, this usually
appears as an “elbow” in the plot. This plot bears some physical resemblance to the profile of a hillside. The
portion beyond the elbow corresponds to the rubble, or scree, that gathers. Cattell’s guidelines call for retaining
factors above the elbow and rejecting those below it. This amounts to keeping the factors that contribute most to
the variance
3) Proportion of variance accounted for keeps a factor if it accounts for a predetermined amount of the variance
(e.g., 5%, 10%).
4) Interpretability criteria
a. Are there at least 3 items with significant loadings (>0.30)?
b. Do the variables that load on a factor share some conceptual meaning?
c. Do the variables that load on different factors seem to measure different constructs?
d. Does the rotated factor pattern demonstrate simple structure? Are there relatively
i. high loadings on one factor?
ii. low loadings on other factors?
Statistics
EFA decomposes an adjusted correlation matrix. Variables are standardized in EFA, e.g., mean=0, standard
deviation=1, diagonals are adjusted for unique factors, 1-u. The amount of variance explained is equal to the trace of
the matrix, the sum of the adjusted diagonals or communalities. Squared multiple correlations (SMC) are used as
communality estimates on the diagonals. Observed variables are a linear combination of the underlying and unique
factors. Factors are estimated (X1 = b1F1 + b2F2 + . . . + e1, where e1 is a unique factor).
Factors account for common variance in a data set. The amount of variance explained is the trace (sum of the
diagonals) of the decomposed adjusted correlation matrix. Eigenvalues indicate the amount of variance explained by
each factor. Eigenvectors are the weights that could be used to calculate factor scores. In common practice, factor
scores are calculated with a mean or sum of measured variables that “load” on a factor.
The EFA Model is Y = Xβ+ E
where Y is a matrix of measured variables
X is a matrix of common factors
β is a matrix of weights (factor loadings)
E is a matrix of unique factors, error variation
Communality is the variance of an observed variable accounted for by the common factors. A large communality value
indicates a strong influence by an underlying construct. Communality is computed by summing the squares of the factor
loadings.
d1² = 1 - communality = % variance accounted for by the unique factor
d1 = square root (1 - communality) = unique factor weight (parameter estimate)
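The communality arithmetic can be sketched in a DATA step. The three loadings below are hypothetical values for a single item, chosen only to show the calculation; they are not taken from the study.

```sas
/* Hypothetical loadings of one item on three common factors */
data communality_sketch;
   b1 = 0.80;  b2 = 0.30;  b3 = 0.10;
   communality = b1**2 + b2**2 + b3**2;   /* 0.64 + 0.09 + 0.01 = 0.74   */
   uniqueness  = 1 - communality;         /* d1 squared, here 0.26       */
   d1          = sqrt(uniqueness);        /* unique factor weight        */
   put communality= uniqueness= d1=;      /* results written to the log  */
run;
```

With these values about three quarters of the item's variance is shared with the common factors, and the unique factor weight is a little over 0.5.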
EFA Steps
1) initial extraction
• each factor accounts for a maximum amount of variance that has not previously been accounted for by the
other factors
• factors are uncorrelated
• eigenvalues represent the amount of variance accounted for by each factor
2) determine number of factors to retain
• scree test, look for elbow
• proportion of variance
• prior communality estimates are not perfectly accurate; the cumulative proportion must equal 100%, so some
eigenvalues will be negative after factors are extracted, e.g., if 2 factors are extracted and the cumulative proportion
equals 100% with 6 items, then 4 items have negative eigenvalues
• interpretability
o at least 3 observed variables per factor with significant loadings
o common conceptual meaning
o measure different constructs
o rotated factor pattern has simple structure (no cross loadings)
3) rotation, a transformation
4) interpret solution
5) calculate factor scores
6) results in a table
7) prepare results, paper
PROC FACTOR and options for EFA
DATA=             specifies the data set to be analyzed
PRIORS=SMC        squared multiple correlations used as adjusted diagonals of the correlation matrix
METHOD=ML | ULS   specifies the maximum likelihood or unweighted least squares method
ROTATE=           VARIMAX (orthogonal), PROMAX (oblique)
SCREE             requests a scree plot of the eigenvalues
N=                specifies the number of factors
MINEIGEN=1        retains factors with eigenvalues greater than 1
OUT=              output data set with the data and estimated factor scores; use raw data and N=
FLAG=             includes a flag (*) for factor loadings above a specified value
REORDER           arranges factor loadings from largest to smallest for each factor
An example of SAS code to run EFA. PRIORS= specifies the prior communality estimates.
proc factor method=ml priors=smc      maximum likelihood factor analysis
proc factor method=uls priors=smc     unweighted least squares factor analysis
proc factor method=prin priors=smc    principal factor analysis
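Several of the options listed above can be combined in one call. The sketch below reuses the rawsub2 data set and the q1-q23 items from the analyses in this paper; the choice of an oblique rotation and the 0.30 flag value are assumptions made for illustration, not settings the authors report using.

```sas
/* Illustrative combination of PROC FACTOR options */
proc factor data=rawsub2
            method=ml        /* maximum likelihood extraction                  */
            priors=smc       /* squared multiple correlations as priors        */
            rotate=promax    /* oblique rotation; VARIMAX would be orthogonal  */
            scree            /* scree plot of the eigenvalues                  */
            reorder          /* sort loadings within each factor               */
            flag=.30;        /* asterisk beside loadings above .30 (absolute)  */
   var q1-q23;
run;
```

Flagging at 0.30 matches the loading threshold mentioned under the interpretability criteria, which makes the simple-structure check easier to read off the output.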
Similarities between CFA and EFA
• Both techniques are based on linear statistical models.
• Statistical tests associated with both methods are valid if certain assumptions are met.
• Both techniques assume a normal distribution.
• Both incorporate measured variables and latent constructs.
Differences between CFA and EFA
CFA requires specification of
• a model a priori
• the number of factors
• which items load on each factor
• a model supported by theory or previous research
• error explicitly
EFA
• determines the factor structure (model)
• explains a maximum amount of variance
Statistical Analysis
With background knowledge of reliability, exploratory and confirmatory factor analysis, we’re ready to proceed to the
statistical analysis!
Reliability Analysis with PROC CORR (7 factors)
Data were analyzed with PROC CORR to determine the degree of internal consistency (reliability). A Cronbach alpha
statistic indicates the level of reliability. Values could range from 0.0 to 1.0 with a value closer to 1.0 indicating a
higher level of reliability.
Reliability ranged from 0.62 to 0.88 on the total and category subscales for the sample data on the Race and
Schooling School-Based Instrument. For research purposes, reliability was acceptable (see Table 1).
Table 1. The Race and Schooling School-Based Instrument - Reliability

Factor (category)           Items            Alpha (Standardized)
Professional Development    1, 2, 3, 4, 5    0.80
Relationships               6, 7, 8          0.63
Engagement                  9, 10, 11        0.70
Learning and Teaching       12, 13, 14       0.73
Achievement Results         15, 16, 17       0.62
Community                   18, 19, 20       0.63
School Culture              21, 22, 23       0.79
Total                       1-23             0.88
SAS Code
The CORR procedure calculates Cronbach’s alpha for the reliability analysis. The following code
calculates Cronbach’s alpha for the total scale (questions 1-23) and for each category subscale.
/* total scale */
proc corr data=rawsub2 nocorr alpha nomiss;
   var q1-q23;
/* one step per category subscale */
proc corr data=rawsub2 nocorr alpha nomiss;
   var q1-q5;
proc corr data=rawsub2 nocorr alpha nomiss;
   var q6-q8;
proc corr data=rawsub2 nocorr alpha nomiss;
   var q9-q11;
proc corr data=rawsub2 nocorr alpha nomiss;
   var q12-q14;
proc corr data=rawsub2 nocorr alpha nomiss;
   var q15-q17;
proc corr data=rawsub2 nocorr alpha nomiss;
   var q18-q20;
proc corr data=rawsub2 nocorr alpha nomiss;
   var q21-q23;
run;
Confirmatory Factor Analysis (7 factor)
Category subscale scores (factors) were created with the MEAN function. The factors were subjected to a
confirmatory factor analysis (CFA) with PROC CALIS. Figure 1 illustrates the CFA model. The SAS code below tests
the underlying factor structure.
proc calis data=rawsub2 kurtosis;
   lineqs
      Professional_Development = s1 F1 + e1,
      Relationships            = s2 F1 + e2,
      Engagement               = s3 F1 + e3,
      Learning_Teaching        = s4 F1 + e4,
      AchResults               = s5 F1 + e5,
      Community                = s6 F1 + e6,
      SchoolCulture            = s7 F1 + e7;
   std
      e1-e7 = vare1-vare7,
      F1 = 1;
   var Professional_Development Relationships Engagement Learning_Teaching
       AchResults Community SchoolCulture;
run;
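The paper does not show the step that creates the category subscale scores with the MEAN function. A minimal sketch of what that step might look like follows, assuming the item-to-category mapping in Table 1; the output data set name rawsub2_scored is hypothetical, and any additional rescaling the authors may have applied is unknown.

```sas
/* Hypothetical scoring step: one mean score per category */
data rawsub2_scored;
   set rawsub2;
   Professional_Development = mean(of q1-q5);
   Relationships            = mean(of q6-q8);
   Engagement               = mean(of q9-q11);
   Learning_Teaching        = mean(of q12-q14);
   AchResults               = mean(of q15-q17);
   Community                = mean(of q18-q20);
   SchoolCulture            = mean(of q21-q23);
run;
```

The MEAN function ignores missing arguments, so a respondent who skipped one item in a category still receives a score based on the remaining items.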
Another step that could be taken in the confirmatory approach is to test the structure for each of the 7 factors. The
following SAS code would test a model for the Professional Development factor.
proc calis data=rawsub2 kurtosis;   /* add MODIFICATION to request modification indices */
   lineqs
      q1 = p1 F1 + e1,
      q2 = p2 F1 + e2,
      q3 = p3 F1 + e3,
      q4 = p4 F1 + e4,
      q5 = p5 F1 + e5;
   std
      e1-e5 = vare1-vare5,
      F1 = 1;
   var q1-q5;
run;
If each factor can be confirmed, then a combined model is evaluated. If this model is confirmed, then a model
including one latent construct (race and schooling) with a direct relationship to each factor is evaluated. Each factor
influences responses to items. For example, the latent construct professional development influences responses to
items 1, 2, 3, 4 and 5.
[Figure 1. Confirmatory Factor Analysis. Path diagram linking the latent construct Race and Schooling to seven factors: Professional Development, Relationships, Engagement, Learning and Teaching, Achievement Results, Community, School Culture.]
Results
PROC CALIS reports the number of observations, variables, estimated parameters, and informations
(related to model specification), along with descriptive statistics and multivariate kurtosis. It also
identifies the observations with the largest contribution to kurtosis (see Figure 2).
Fit Statistics
Determine criteria a priori to assess model fit and confirm the factor structure. Some of the criteria indicate
acceptable model fit while others are close to meeting values for acceptable fit (see Figure 3).
• Chi-square describes the similarity of the observed and expected matrices. Acceptable model fit is indicated by a
chi-square probability greater than or equal to 0.05. For this CFA model, the chi-square value is not close to
zero (chi-square = 35.7367) and p = 0.0011, which is not greater than 0.05.
• RMSEA indicates the amount of unexplained variance, or residual. The RMSEA value of 0.0743 is larger than
the criterion of 0.06 or less.
• CFI (0.9472), NNI (0.9208), and NFI (0.9174) values meet the criterion (0.90 or larger) for acceptable model fit.
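The RMSEA estimate in Figure 3 can be reproduced from the chi-square, its degrees of freedom, and the sample size. The sketch below assumes the commonly used formula sqrt(max(chi-square - df, 0) / (df(N - 1))), which the paper does not state; the data set and variable names are illustrative only.

```sas
/* Recompute RMSEA from the values reported in Figure 3 */
data rmsea_check;
   chisq = 35.7367;   /* model chi-square       */
   df    = 14;        /* chi-square DF          */
   n     = 282;       /* number of observations */
   rmsea = sqrt(max(chisq - df, 0) / (df * (n - 1)));
   put rmsea= 6.4;    /* writes rmsea=0.0743 to the log */
run;
```

The result agrees with the reported estimate, which sits above the 0.06 criterion.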
The CALIS Procedure
Covariance Structure Analysis: Maximum Likelihood Estimation

Observations    282        Model Terms        1
Variables         7        Model Matrices     4
Informations     28        Parameters        14

Variable                       Mean     Std Dev    Skewness    Kurtosis
Professional_Development    6.50000     1.39712    -0.44553    -0.08904
Relationships               6.44681     1.48504    -0.22743    -0.18439
Engagement                  6.49291     1.54008    -0.23384     0.12440
Learning_Teaching           5.83688     1.51681    -0.07231    -0.00706
AchResults                  7.68440     1.43037    -0.34269    -0.01039
Community                   7.01418     1.40404    -0.02535    -0.06631
SchoolCulture               9.32979     1.04068    -1.47618     1.39821

Mardia's Multivariate Kurtosis               3.9089
Relative Multivariate Kurtosis               1.0620
Normalized Multivariate Kurtosis             2.9239
Mardia Based Kappa (Browne, 1982)            0.0620
Mean Scaled Univariate Kurtosis              0.0555
Adjusted Mean Scaled Univariate Kurtosis     0.0555

Observation Numbers with Largest Contribution to Kurtosis
     145         238          87          45          47
455.0658    290.2460    282.2438    270.1272    203.2169

Figure 2. Descriptives and Multivariate Kurtosis
The CALIS Procedure
Covariance Structure Analysis: Maximum Likelihood Estimation

Fit Function                                    0.1272
Goodness of Fit Index (GFI)                     0.9621
. . .
Chi-Square                                     35.7367
Chi-Square DF                                       14
Pr > Chi-Square                                 0.0011
. . .
RMSEA Estimate                                  0.0743
. . .
Bentler's Comparative Fit Index                 0.9472
. . .
Bentler & Bonett's (1980) Non-normed Index      0.9208
Bentler & Bonett's (1980) NFI                   0.9174
. . .

Figure 3. Fit Statistics
For purposes of this example, 3 fit statistics indicate acceptable fit and 2 fit statistics indicate unacceptable fit. The
CFA has not confirmed the factor structure. When the analysis indicates unacceptable model fit, the factor
structure cannot be confirmed, and an exploratory factor analysis is the next step.
Because the factor structure is not confirmed, no further investigation of the confirmatory model (parameter
estimates, variances, covariances) is necessary. Proceed with exploratory factor analysis to determine the factor structure.
Exploratory factor analysis with PROC FACTOR
• method is maximum likelihood
• diagonals of the correlation matrix are equal to squared multiple correlations
• criterion set a priori is that each factor retained will explain at least 10% of the variance
SAS Code
proc factor data=rawsub2 method=ml priors=smc rotate=varimax reorder;
var q1-q23;
Preliminary Eigenvalues: Total = 19.0588542  Average = 0.82864583

       Eigenvalue    Difference    Proportion    Cumulative
 1     11.3729772     7.7310207        0.5967        0.5967
 2      3.6419565     1.5094061        0.1911        0.7878
 3      2.1325504     0.4437191        0.1119        0.8997
 4      1.6888314     0.6648680        0.0886        0.9883
 5      1.0239633     0.2676242        0.0537        1.0420
 6      0.7563391     0.0995490        0.0397        1.0817
 7      0.6567901     0.2574806        0.0345        1.1162
 8      0.3993095     0.2098389        0.0210        1.1371
 9      0.1894706     0.0503316        0.0099        1.1471
10      0.1391390     0.0341571        0.0073        1.1544
11      0.1049819     0.0930199        0.0055        1.1599
12      0.0119620     0.0365710        0.0006        1.1605
13     -0.0246090     0.0379670       -0.0013        1.1592
14     -0.0625760     0.0487740       -0.0033        1.1560
15     -0.1113500     0.1032831       -0.0058        1.1501
16     -0.2146331     0.0303684       -0.0113        1.1388
17     -0.2450015     0.0125781       -0.0129        1.1260
18     -0.2575795     0.0750623       -0.0135        1.1125
19     -0.3326418     0.0667751       -0.0175        1.0950
20     -0.3994169     0.0351375       -0.0210        1.0741
21     -0.4345544     0.0224600       -0.0228        1.0513
22     -0.4570144     0.0630259       -0.0240        1.0273
23     -0.5200403                     -0.0273        1.0000

5 factors will be retained by the PROPORTION criterion.

Figure 4. Eigenvalues
Results show 5 factors. Three factors each explain at least 10% of the variance (59.67%, 19.11%, 11.19%) while 2
factors each explain less than 10% of the variance (8.86%, 5.37%). According to the criterion set a priori, that each factor
retained will explain at least 10% of the variance, three factors will be retained. The analysis will be rerun specifying
three factors.
proc factor data=rawsub2 method=ml priors=smc rotate=varimax reorder n=3;
var q1-q23;
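The decision to keep three factors can be checked against the eigenvalues in Figure 4. The sketch below divides each of the first five preliminary eigenvalues by their reported total and marks those that meet the 10% criterion; the data set and variable names are illustrative only.

```sas
/* Apply the 10% proportion criterion to the first five eigenvalues */
data proportion_check;
   total = 19.0588542;   /* sum of all preliminary eigenvalues */
   array ev[5] _temporary_
      (11.3729772 3.6419565 2.1325504 1.6888314 1.0239633);
   do factor = 1 to 5;
      proportion  = ev[factor] / total;
      meets_10pct = (proportion >= 0.10);   /* 1 = retain, 0 = drop */
      output;
   end;
   drop total;
run;

proc print data=proportion_check noobs;
   format proportion 6.4;
run;
```

Only the first three rows meet the criterion, which is why the rerun specifies n=3.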
Figure 5 indicates that both null hypotheses are rejected: that there are no common factors, and that 3 factors are sufficient.
In practice, we want to reject the first hypothesis and fail to reject the second.
Tucker and Lewis’s Reliability Coefficient indicates good reliability. Reliability is a value between 0 and 1 with a
larger value indicating better reliability.
Significance Tests Based on 282 Observations

Test                                  DF    Chi-Square    Pr > ChiSq
H0: No common factors                253     2392.4292        <.0001
HA: At least one common factor
H0: 3 Factors are sufficient         187      522.6155        <.0001
HA: More factors are needed

Chi-Square without Bartlett's Correction       542.90188
Akaike's Information Criterion                 168.90188
Schwarz's Bayesian Criterion                  -512.13474
Tucker and Lewis's Reliability Coefficient       0.78776

Figure 5. Hypothesis Tests, Reliability
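The Tucker and Lewis coefficient in Figure 5 appears to follow from the two chi-square tests reported there. The sketch below assumes the usual ratio form, which compares chi-square per degree of freedom for the no-factor model and the 3-factor model; the paper does not give the formula, and the variable names are illustrative.

```sas
/* Recompute the Tucker-Lewis coefficient from Figure 5 */
data tucker_lewis_check;
   chisq0 = 2392.4292;  df0 = 253;   /* H0: no common factors        */
   chisqk = 522.6155;   dfk = 187;   /* H0: 3 factors are sufficient */
   tl = (chisq0/df0 - chisqk/dfk) / (chisq0/df0 - 1);
   put tl= 7.5;                      /* writes tl=0.78776 to the log */
run;
```

Seen this way, the coefficient measures how much of the misfit per degree of freedom in the no-factor model is removed by the three factors.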
Results show items 21, 22, 23 as a factor, items 1, 2, 3, 4 as a factor, and items 5-20 as a factor. Two of the factors
are the same or close to the original factor structure. Items 21, 22, 23 are the School Culture factor. Items 1, 2, 3, 4, 5
are the Professional Development factor in the original factor structure. Item 5 is not included in the revised factor
structure.
Interpretability
Is there some conceptual meaning for each factor? Could the factors be given a name?
Factor1 could be called Attitudes (items 5-20).
Factor2 could be called Professional Development (items 1, 2, 3, 4).
Factor3 could be called School Culture (items 21, 22, 23).
Rotated Factor Pattern

         Factor1     Factor2     Factor3
q10      0.72350     0.22819     0.05126
q12      0.63153     0.32713    -0.01767
q14      0.57995     0.18075     0.03112
q11      0.57515     0.14190    -0.06608
q7       0.56258     0.27168     0.04450
q13      0.55962     0.11049     0.02234
q9       0.54358     0.07180     0.08688
q16      0.54234     0.27697     0.04569
q20      0.49741     0.17016     0.12907
q8       0.49444     0.01762     0.02129
q15      0.47474     0.15822     0.16368
q6       0.45938     0.34502     0.08328
q19      0.43811     0.09902     0.12647
q5       0.42258     0.42156    -0.02872
q17      0.37788     0.10460     0.06759
q18      0.34111     0.08108     0.17346
q3       0.10059     0.86961    -0.06967
q4       0.16327     0.75458    -0.11327
q2       0.25783     0.64261     0.05330
q1       0.26039     0.41033     0.03917
q22      0.07401     0.01893     0.76583
q23      0.13707    -0.07665     0.75179
q21      0.07095    -0.03185     0.70232

Figure 6. Factor Loadings
Reliability for the revised factor structure was tested with PROC CORR. The SAS code for reliability is
proc corr data=rawsub2 nocorr alpha nomiss;
   var q1-q4;
proc corr data=rawsub2 nocorr alpha nomiss;
   var q5-q20;
proc corr data=rawsub2 nocorr alpha nomiss;
   var q21-q23;
run;
Cronbach’s alpha is 0.79 for items 21, 22, 23; 0.79 for items 1, 2, 3, 4; and 0.88 for items 5-20. The
range is 0.79 to 0.88 for the revised factor structure. The range for the original factor structure is 0.62 to 0.80.
Conclusion
Confirmatory and Exploratory Factor Analysis are powerful statistical techniques. The techniques have similarities
and differences. Determine the type of analysis a priori to answer research questions and maximize your knowledge.
WAM (Walk away message)
Select CFA to verify the factor structure and EFA to determine the factor structure.
References
Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1, 245-276.
Child, D. (1990). The essentials of factor analysis, second edition. London: Cassel Educational Limited.
DeVellis, R. F. (1991). Scale Development: Theory and Applications. Newbury Park, California: Sage Publications.
Hatcher, L. (1994). A step-by-step approach to using the SAS® System for factor analysis and structural equation
modeling. Cary, NC: SAS Institute Inc.
Hoyle, R. H. (1995). The structural equation modeling approach: Basic concepts and fundamental issues. In
Structural equation modeling: Concepts, issues, and applications, R. H. Hoyle (editor). Thousand Oaks, CA: Sage
Publications, Inc., pp. 1-15.
Hu, L. & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria
versus new alternatives. Structural Equation Modeling, 6(1), 1-55.
Jöreskog, K. G. (1969). A general approach to confirmatory maximum likelihood factor analysis. Psychometrika, 34,
183-202.
Kline, R. B. (1998). Principles and Practice of Structural Equation Modeling. New York: The Guilford Press.
Nunnally, J. C. (1978). Psychometric theory, 2nd edition. New York: McGraw-Hill.
SAS® Language and Procedures, Version 6, First Edition. Cary, N.C.: SAS Institute, 1989.
SAS® Online Doc 9. Cary, N.C.: SAS Institute.
SAS® Procedures, Version 6, Third Edition. Cary, N.C.: SAS Institute, 1990.
SAS/STAT® User’s Guide, Version 6, Fourth Edition, Volume 1. Cary, N.C.: SAS Institute, 1990.
SAS/STAT® User’s Guide, Version 6, Fourth Edition, Volume 2. Cary, N.C.: SAS Institute, 1990.
Schumacker, R. E. & Lomax, R. G. (1996). A Beginner’s Guide to Structural Equation Modeling. Mahwah, New
Jersey: Lawrence Erlbaum Associates, Publishers.
Shay, Mary. (in press). An Investigation of the Attitudes, Beliefs, and Values of Elementary School Teachers Toward
Race and Schooling. Greeley CO: University of Northern Colorado.
Shay, Mary (2008). Race and Schooling Instrument.
Thorndike, R. M., Cunningham, G. K., Thorndike, R. L., & Hagen, E. P. (1991). Measurement and evaluation in
psychology and education. New York: Macmillan Publishing Company.
Truxillo, Catherine. (2003). Multivariate Statistical Methods: Practical Research Applications Course Notes. Cary,
N.C.: SAS Institute.
Contact Information
Diana Suhr, Ph.D., Statistical Analyst
Office of Budget & Institutional Analysis
University of Northern Colorado
Greeley, CO 80639, diana.suhr@unco.edu
Mary Shay, Principal
Cottonwood Creek Elementary School
Cherry Creek School District
Denver CO 80237
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS
Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are
registered trademarks or trademarks of their respective companies.
Appendix A
Race and Schooling Instrument
Revised – Shay, (2008)
Response scale for every item: 1 = Almost Never, 2 = Seldom, 3 = Sometimes, 4 = Frequently, 5 = Almost Always.

Professional Staff Development
1. To what extent do teachers and administrators in your school talk openly and constructively about race with each other?
2. To what extent do staff development activities help educators understand the ways in which race influences student behavior?
3. To what extent do staff development activities help educators acquire knowledge about the history and culture of various racial groups?
4. To what extent do staff development activities help educators become knowledgeable about the diverse racial perspectives on historical and current events?
5. To what extent are successful efforts being made in your school to attract, retain, and advance teachers and administrators of color?

Relationships
6. To what extent do teachers in your school talk openly and constructively about race with students?
7. To what extent do teachers encourage students to have open and constructive conversations about being victims of racial discrimination and about possessing racial power and white privilege?
8. To what extent do teachers structure interracial cooperative groups that enable students of different racial groups to become acquainted with each other?

Engagement
9. To what extent is decision-making in the school widely shared among administrators, teachers, parents and students of a variety of racial groups?
10. To what extent do teachers encourage students of different racial groups to talk with each other openly and constructively about race?
11. To what extent are deliberate actions taken by administrators and teachers to insure that students of color are represented proportionately in extracurricular activities and leadership roles?

Learning and Teaching
12. To what extent do teachers help students acquire the knowledge and skills needed to have thoughtful, constructive, heartfelt discussions about race?
13. To what extent are students provided factual information in social studies and other subject areas that contradicts misconceptions about people of color?
14. To what extent does the school curriculum include a focus on racial power and white privilege through examples in history, art, science, and other disciplines?

Achievement Results
15. To what extent do teachers use a variety of assessment devices to ensure that students of all racial groups meet rigorous standards across the curriculum?
16. To what extent do teachers use a variety of assessment devices to measure improved race relations between students of different racial groups?
17. To what extent are students of all racial groups consistently exposed to and supported in the most rigorous curricular opportunities?

Community
18. To what extent do teachers have regular personal contact (e.g. phone, face to face) with Black and Hispanic families to advise them on the achievement of their children?
19. To what extent are Black and Hispanic children specifically invited to become a part of school-wide activities, committees, and councils?
20. To what extent are families of Black and Hispanic families visible in school and leadership positions?

School Culture
21. To what extent do you believe that Black and Hispanic students should perform at least as well as White students?
22. To what extent do you believe it is our responsibility as educators to make this level of achievement occur for our Black and Hispanic students?
23. To what extent do you believe that our Black and Hispanic students can actually reach this level of achievement?

Note: Information questions are not shown in Appendix A.