Multivariate Behavioral Research, 46:340–364, 2011
Copyright © Taylor & Francis Group, LLC
ISSN: 0027-3171 print/1532-7906 online
DOI: 10.1080/00273171.2011.564527
The Hull Method for Selecting the
Number of Common Factors
Urbano Lorenzo-Seva
Research Center for Behavior Assessment
Universitat Rovira i Virgili
Marieke E. Timmerman and Henk A. L. Kiers
University of Groningen
A common problem in exploratory factor analysis is how many factors need to
be extracted from a particular data set. We propose a new method for selecting
the number of major common factors: the Hull method, which aims to find a
model with an optimal balance between model fit and number of parameters. We
examine the performance of the method in an extensive simulation study in which
the simulated data are based on major and minor factors. The study compares the
method with four other methods, including parallel analysis and the minimum average
partial test, which were selected because they have been proven to perform well
and/or are frequently used in applied research. The Hull method outperformed
all four methods at recovering the correct number of major factors. Its usefulness
was further illustrated by its assessment of the dimensionality of the Five-Factor
Personality Inventory (Hendriks, Hofstee, & De Raad, 1999). This inventory has
100 items, and the typical methods for assessing dimensionality prove to be useless:
the large number of factors they suggest has no theoretical justification. The Hull
method, however, suggested retaining the number of factors that the theoretical
background to the inventory actually proposes.
To date, exploratory factor analysis (EFA) is still a widely applied statistical
technique in the social sciences. For example, Costello and Osborne (2005)
carried out a literature search in PsycINFO, and found over 1,700 studies that
Correspondence concerning this article should be addressed to Urbano Lorenzo-Seva, Centre
de Recerca en Avaluació i Mesura de la Conducta, Universitat Rovira i Virgili, Departament de
Psicologia, Carretera Valls s/n, 43007 Tarragona, Spain. E-mail: urbano.lorenzo@urv.cat
used some form of EFA over a 2-year period. A common problem in EFA is
to decide how many factors need to be extracted. This decision is not trivial
because an incorrect number of extracted factors can have substantial effects
on EFA results (see, e.g., Comrey, 1978; Fava & Velicer, 1992; Levonian &
Comrey, 1966; Wood, Tataryn, & Gorsuch, 1996). Underfactoring, which is an
underestimation of the number of factors, makes the factor loading patterns
artificially complex and, therefore, the factors difficult to interpret (see Comrey,
1978). On the other hand, overfactoring, which means overestimating the number
of factors, might suggest the existence of latent variables with little theoretical
or substantive meaning (Fabrigar, Wegener, MacCallum, & Strahan, 1999; Fava
& Velicer, 1992; Wood et al., 1996).
An objective procedure would be of great help in making the important decision about the number of factors. This sort of procedure has been available since
the early stages of factor analysis. In empirical practice, it is generally advised to
use an objective procedure combined with reasoned reflection (see, e.g., Henson
& Roberts, 2006; Kahn, 2006). This implies that objective procedures should be
viewed as tools that can help to find potentially interesting solution(s) and that
the final selection decision should be based on substantive information and the
interpretability of the model parameters.
In order to select a sensible procedure for identifying the number of common
factors, one first needs to decide what the optimal number of common factors
is. From a purely theoretical perspective of the common factor model, one
could argue that this number is the number of common factors required to
reproduce the reduced population correlation matrix (i.e., the correlation matrix
with communalities on its diagonal). However, in empirical practice, it is rather
unlikely that a common factor model with a limited number of factors (i.e.,
much fewer than the number of observed variables) would hold exactly at
the population level. To formalize this idea of model error, one could use
the framework based on major and minor factors, which was put forward by
MacCallum and Tucker (1991). That is, each observed variable is considered
to be composed of a part that is consistent with the common factor model
(with a limited number of factors) and a part that is not. The latter model error
can be represented by minor factors (based on the middle model of Tucker,
Koopman, & Linn, 1969). From this perspective, the optimal number of factors
to be extracted coincides with the number of major factors. This appears to be
highly reasonable because common factor models in the social sciences are an
approximation to reality (Browne & Cudeck, 1992), as is now widely recognized.
The task, therefore, seems to be to identify the number of major factors rather than
the number of major and minor factors.
In empirical data sets, minor factors are easily found because (small) subsets
of observed variables have common parts that are independent of the remaining
observed variables. Cattell and Tsujioka (1964) referred to this phenomenon
as bloated specifics. For example, consider two typical items in a personality
inventory: (a) “When travelling by train, I avoid speaking to unknown people”
and (b) “When traveling by train, I feel anxious.” Item (a) would typically be
related to an Extraversion dimension, and item (b) to an Emotional stability
dimension. The items pertain to different substantive dimensions, but they share
a common part because both items contain the same situation in which the
action happens. To fully account for this dependency in a common factor model,
one would need a “train factor” as well as the Extraversion and Emotional
stability factors. Commonly, only the latter two would be of substantive interest, and it would be desirable to ignore the train factor. If such a factor
indeed results from a relatively small subset of observed variables, the factor
accounts for a relatively small portion of the common variance, from which the
term “minor factor” stems. This property of minor factors enables them to be
identified.
In this article, we propose that the Hull method be used to select the number
of major factors. The method is based on the principle that a model should
have an optimal balance of fit and number of parameters. The performance
of the Hull method is assessed in a simulation study and compared with four
other procedures. In addition, its usefulness is illustrated with the assessment of
the dimensionality of the Five-Factor Personality Inventory (Hendriks, Hofstee,
& De Raad, 1999). This inventory has 100 items (20 per factor), and typical
methods to assess its dimensionality prove to be useless: they suggest retaining
a large number of factors that have no theoretical interpretation. The Hull
method, however, suggested retaining the number of factors that the theoretical
background of the inventory actually proposes.
REVIEW OF PROCEDURES FOR SELECTING THE
NUMBER OF FACTORS
Several procedures have been proposed for selecting the number of common
factors, based either on common factor models or on principal components
(PCs). From a theoretical perspective, PC-based methods should not be used in a
common factor analytic context. However, the fundamental differences between
PCs and common factors are often hidden in empirical analyses. This is so
because principal component analysis (PCA) and EFA can yield similar results,
for example, when the number of observed variables per factor (component) is large (Bentler & Kano, 1990) and communalities are high
(Widaman, 1993). In empirical practice, PC-based methods appear to be even
more popular for indicating the number of common factors than their common
factor-based counterparts (Costello & Osborne, 2005). Therefore, we review
both types of methods in the following sections.
Kaiser’s Rule
Kaiser (1960) proposed that only those PCs associated with eigenvalues larger
than one should be interpreted. Kaiser (1960) observed that, according to Guttman’s
(1954) proofs, the eigenvalues-greater-than-1.00 criterion ought to provide a
conservative estimate of the number of PCs. However, it is well known that
Kaiser’s rule tends to overestimate the number of PCs and common factors
quite severely (e.g., Fabrigar et al., 1999; Lance, Butts, & Michels, 2006;
Lawrence & Hancock, 1999; Velicer, Eaton, & Fava, 2000; Zwick & Velicer,
1986). This is due to sampling error because Kaiser’s rule assumes that the
population correlation matrix is available, which is only reasonable when the
sample size approaches infinity (Glorfeld, 1995). Moreover, when it is used to
determine the number of common factors, the additional, usually unreasonable,
assumption that the unique variances are zero must hold. The rule was expected
to systematically result in reliable components, but Cliff (1988) proved that this
was not true. It is therefore surprising that this criterion is quite popular in
applied research: Costello and Osborne (2005) found that most of the 1,700
studies that used some form of EFA applied Kaiser’s rule. As Thompson and
Daniel (1996) noted, “This extraction rule is the default option in most statistics
packages and therefore may be the most widely used decision rule, also by
default” (p. 200).
Parallel Analysis
Parallel Analysis (PA; Horn, 1965) can be regarded as a modification of Kaiser’s
(1960) rule that takes sampling variability into account. The central
idea of Horn’s PA is that meaningful PCs are associated with eigenvalues larger
than those derived from random data. PA requires a number of random data
matrices of the same order as the empirical data matrix to be generated, usually
by drawing from a normal distribution. Then, the
eigenvalues of the associated correlation matrices are computed. Finally, each
empirical eigenvalue is compared with the distribution of random eigenvalues
for the same position (i.e., the first empirical eigenvalue is compared with the
distribution of the first random eigenvalue, the second empirical eigenvalue is
compared with the distribution of the second random eigenvalue, etc.). PCs
related to empirical eigenvalues larger than a given threshold in the distribution of random eigenvalues should be extracted. Because the average of the
distribution (Horn, 1965) tends to overestimate the number of factors (Glorfeld,
1995; Harshman & Reddon, 1983), the more conservative 95th percentile is
more popular at present. Horn’s PA has been refined so that it can be used to
analyze multiple correlation matrices and average eigenvalues across matrices
(e.g., Hayton, Allen, & Scarpello, 2004). It can also be used with sampling
bootstrap techniques instead of Monte Carlo analyses (Lattin, Carroll, & Green,
2003, pp. 114–116).
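Because PA is fully algorithmic, it is easily expressed in code. The following Python fragment is a minimal sketch of Horn's PA with a percentile threshold; the function name and defaults are ours and are not taken from any published implementation.

```python
import numpy as np

def parallel_analysis(X, n_random=500, percentile=95, seed=0):
    """Horn's PA: retain components whose empirical eigenvalue exceeds
    the chosen percentile of eigenvalues from random normal data."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    emp = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]
    rand = np.empty((n_random, m))
    for r in range(n_random):
        R = np.corrcoef(rng.standard_normal((n, m)), rowvar=False)
        rand[r] = np.linalg.eigvalsh(R)[::-1]
    threshold = np.percentile(rand, percentile, axis=0)
    keep = emp > threshold
    # Count the leading run of positions where the data beat chance.
    return m if keep.all() else int(np.argmin(keep))
```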
Different variants of PA have been extensively studied. In general, Horn’s
(1965) PA seems to perform well (Peres-Neto, Jackson, & Somers, 2005; Velicer
et al., 2000; Zwick & Velicer, 1986) and is therefore a highly recommended
procedure in applied research (e.g., Hayton et al., 2004; Kahn, 2006; Lance et al.,
2006; Worthington & Whittaker, 2006). Even academic journals have editorialized in favor of PA (Educational and Psychological Measurement; Thompson
& Daniel, 1996).
From a technical point of view, it is not difficult to generalize PA to the context
of EFA. Humphreys and Ilgen (1969) proposed a variant of PA for assessing the
number of common factors. In this variant, eigenvalues of the reduced correlation
matrix are computed for the empirical and the randomly generated matrices,
where the squared multiple correlations are used as estimates of communalities.
However, although Humphreys and Montanelli (1975) reported that this variant
performed well in a simulation study, some authors have questioned its use (e.g.,
Crawford & Koopman, 1973). Steger (2006) concluded that not enough is known
about the PA of the common factor model to recommend it for general use. In
empirical practice, researchers also tend to apply Horn’s (1965) PA when they
intend to identify the number of common factors (e.g., Finch & West, 1997;
Reddon, Marceau, & Jackson, 1982).
Scree Test
The scree test (Cattell, 1966) is based on the principle that a model should have
an optimal balance of fit and number of parameters. It uses a so-called scree
plot, which plots the successive eigenvalues of the observed correlation
matrix against their rank order. As such, the fit is defined in terms of
percentage of explained variance of PCs and the number of parameters as the
number of PCs. The elbow in the scree plot indicates the optimal number of
components.
One of the main drawbacks of the scree test is that it is subjective, and
overestimation is frequent (Fava & Velicer, 1992). Furthermore, the optimal
number of PCs is sometimes not clearly indicated in empirical practice as scree
plots may lack a clear elbow (Joliffe, 1972, 1973). Finally, although the scree
test generally indicates the number of components accurately, its performance
is rather variable (Zwick & Velicer, 1986).
In empirical practice, the scree test is used to indicate the number of common
factors, even though it is not designed for this purpose. The scree test can
be adapted to the context of EFA by considering the eigenvalues from the
reduced correlation matrix. It is important for the reduced correlation matrix
to be nonnegative definite because otherwise the successive eigenvalues are not
directly related to the variance explained by the successive factors. To ensure a
nonnegative definite reduced correlation matrix, Minimum Rank Factor Analysis
(Ten Berge & Kiers, 1991; Ten Berge & Sočan, 2004) could be used.
Minimum Average Partial Test
The Minimum Average Partial (MAP) test (Velicer, 1976) is a statistical method
for determining the number of PCs. It uses the partial correlation matrix between
residuals after a particular number of PCs have been extracted. The main idea
is that if the number of extracted PCs is optimal, all nonresidual variation
is captured by the extracted PCs, and the average partial correlation among
residuals would be small. Therefore, Velicer proposed to extract the number of
PCs that gives the minimum average partial correlation. Although the MAP test
can only be used to identify the number of PCs, it is frequently used to identify
the number of common factors (e.g., Eklöf, 2006). In a simulation study, the
MAP test was generally good at indicating the number of components (Zwick
& Velicer, 1986).
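For concreteness, the MAP test can be sketched as follows. This follows Velicer's (1976) description with squared partial correlations, and all coding details (the names, the residual-standardization step) are our own.

```python
import numpy as np

def map_test(R):
    """Velicer's MAP: number of components minimizing the average
    squared partial correlation among the residuals."""
    m = R.shape[0]
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]      # components by decreasing eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    off = ~np.eye(m, dtype=bool)
    avg_sq = []
    for k in range(m - 1):
        if k == 0:
            C = R                          # no components partialed out yet
        else:
            A = eigvecs[:, :k] * np.sqrt(eigvals[:k])  # loadings of first k PCs
            C = R - A @ A.T                # residual (co)variances
        d = np.sqrt(np.diag(C))
        P = C / np.outer(d, d)             # partial correlations among residuals
        avg_sq.append((P[off] ** 2).mean())
    return int(np.argmin(avg_sq))
```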
Information Criteria
To determine the number of common factors in a maximum likelihood (ML)
factor analysis, Akaike (1973) proposed an Information Criterion-based method
(AIC). Because the number of factors appeared to be overestimated in large
samples, Schwarz (1978) proposed a corrected version known as the Bayesian
Information Criterion (BIC). Both the AIC and BIC are based on the likelihood
of the model, which expresses the goodness-of-fit, and the number of parameters.
The model associated with the lowest AIC or BIC is considered the optimal
model. In order to use AIC and BIC to select the number of factors, it is common
to fit a series of common factor models, starting with a number of factors equal
to one. The number of factors is successively increased with one until the value
of AIC or BIC starts to increase. The AIC and BIC can be obtained with any
ML extraction algorithm (e.g., Jennrich & Robinson, 1969; Jöreskog, 1977).
Multivariate normality is an essential assumption in ML analysis, so AIC and
BIC should not be applied to data if the assumption does not hold. BIC is
frequently used to assess the fit of the model in the context of confirmatory
factor analysis (see, e.g., Worthington & Whittaker, 2006). In a simulation
study of common pathway models, Markon and Krueger (2004) found that BIC
outperforms AIC in larger samples and in the comparison of more complex
models. To the best of our knowledge, to date no simulation study of EFA
has examined how well AIC and BIC can determine the number of common
factors.
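As an illustration of the selection loop, the sketch below uses scikit-learn's FactorAnalysis as a stand-in ML extraction routine (the article does not prescribe one), with mq + m − q(q − 1)/2 free parameters (loadings plus unique variances, corrected for rotational freedom); treat it as a sketch under these assumptions.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

def select_by_information_criterion(X, q_max, criterion="BIC"):
    """Increase q until AIC/BIC starts to rise; return the preceding q."""
    n, m = X.shape
    best_q, best_ic = None, np.inf
    for q in range(1, q_max + 1):
        fa = FactorAnalysis(n_components=q).fit(X)
        loglik = fa.score(X) * n               # total log-likelihood
        n_par = m * q + m - q * (q - 1) // 2   # loadings + uniquenesses
        ic = -2.0 * loglik + (2 * n_par if criterion == "AIC"
                              else n_par * np.log(n))
        if ic >= best_ic:                      # criterion started to increase
            break
        best_q, best_ic = q, ic
    return best_q
```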
HULL METHOD: CONVEX HULL-BASED HEURISTIC
FOR SELECTING THE NUMBER OF MAJOR FACTORS
None of the methods described so far appear to be generally good at determining the number of common factors. In this section, we propose to follow an
alternative approach, which has led to successful model selection methods in,
for instance, multiway data analysis and PCA. The approach, then, has been
applied before, but the application that we propose here in the context of EFA
is new.
A fundamental problem in multiway data analysis is the selection of an appropriate model and the associated number or numbers of components. For example,
in three-way component analysis, a number of components has to be selected
for each mode in the data array (e.g., pertaining to different subjects, measured
on different variables and in different conditions). To solve this intricate model
selection problem, Ceulemans and Kiers (2006) proposed a numerical convex
hull-based heuristic. The heuristic aims at identifying the multiway model with
the best balance between goodness-of-fit and the degrees of freedom. The name
hull was used because the method literally deals with the convex hull of a
series of points in a two-dimensional graph. In fact, the main idea is that the
fit values are plotted against the number of parameters (or components) in the
two-dimensional graph. This results in plots with many points not nicely placed
on a monotonically increasing curve. The idea is then first to find the convex
hull (i.e., the curve consisting of line segments touching the cloud of points
from above). The next step is to inspect the hull for a big jump followed by a
small jump (an elbow); the number of components to retain is indicated by the
sharpest such elbow. The hull heuristic has been shown to be a powerful tool
for solving various deterministic model selection problems (Ceulemans & Kiers,
2009; Ceulemans, Timmerman, & Kiers, 2011; Ceulemans & Van Mechelen,
2005; Schepers, Ceulemans, & Van Mechelen, 2008), suggesting that the hull
heuristic is a promising approach in the context of EFA as well.
Therefore, we propose a new method that uses the hull heuristic approach
to find an optimal balance between fit and degrees of freedom, combined with
a statistical approach to guarantee that no overextraction takes place. As we
explain shortly, the method is specifically designed to identify the number of
common factors. We denote this method Hull, after the hull heuristic.
The numerical convex hull-based heuristic (Ceulemans & Kiers, 2006) can
be regarded as a generalization of the scree test. Consider a screelike plot of
a goodness-of-fit measure (indicated as f value in what follows) versus the
degrees of freedom (df); this plot is denoted as the hull plot here. An example
of a hull plot can be found in Figure 1. Each point in the plot represents the
combination of goodness-of-fit and degrees of freedom obtained in the different
FIGURE 1 Heuristic of the numerical convex hull-based model for the five-factor
personality inventory (FFPI) in a Dutch sample.
possible factor solutions (i.e., the factor solution when a single factor is extracted,
when two factors are extracted, and so on until the maximum number of factors
considered). Solutions in the hull plot that are on or close to an elbow in the
higher boundary of the convex hull of the plot are favored.
The Hull analysis is performed in four main steps:

1. The range of factors to be considered is determined,
2. The goodness-of-fit of a series of factor solutions is assessed,
3. The degrees of freedom of the series of factor solutions is computed, and
4. The elbow is located in the higher boundary of the convex hull of the hull plot.
The number of factors extracted in the solution associated with the elbow is
considered the optimal number of common factors.
To implement the method, we have to select the range of numbers of common
factors to be considered, a goodness-of-fit measure, the degrees of freedom, and
a heuristic to locate the elbow (see the following section).
The Range of Number of Factors to Be Considered
In order to be able to detect an elbow in the hull plot it is necessary to have a
certain trend on either side of an elbow. For example, to detect that there is an
elbow for the one-factor solution, the one-factor solution must be compared with
the zero-factor and two-factor solutions. In addition, to guard against under- and
overextraction, it is necessary to consider only a convenient range of number of
factors; obviously the elbow identified on the convex hull depends on the range
of factor solutions considered.
Before defining the convenient range (i.e., the lowest and highest numbers of
factors to consider), we note that by definition the elbow cannot be located in
either the first or the last position of the boundary of the convex hull. Therefore,
the Hull method will never indicate the lowest and highest numbers of factors in
the range. Clearly, the lowest number of factors to consider must be zero, because
only then can the Hull method potentially indicate a single-factor solution.
To guard against overfactoring, we propose to identify the highest number
of factors that need to be considered based on Horn’s (1965) PA. The idea is
that, whatever the optimal number of factors is finally assumed to be, factors
associated with eigenvalues lower than random eigenvalues should never be retained. As in Reddon et al. (1982), this is done by taking the number of factors
suggested by Horn’s PA as the upper bound. Then, the highest number of factors
to consider in the hull plot is taken as the number of factors indicated by Horn’s
PA plus one.
Indices for Assessing the Goodness-of-Fit of a
Factor Solution
In EFA, a large number of fit indices may be considered when applying the Hull
method. Here, we discuss four indices associated with different factor extraction
methods and test them in a simulation study (see later). We selected three well-known fit indices based on the chi-square statistic, and we propose a fourth,
which can be used with any extraction method. Note that the Hull method is
not restricted to these measures, and alternatives can be considered.
Bentler (1990) proposed the comparative fit index (CFI) as a measure of the
improvement of fit by comparing the hypothesized model with a more restricted
baseline model. The commonly used baseline model is a null or independent
model in which the observed variables are uncorrelated. The values of CFI are
in the range [0, 1], where a value of 1 indicates a perfect fit. Steiger and Lind
(1980) proposed the root mean square error of approximation (RMSEA): it is
based on the analysis of residuals and reflects the degree of misfit in the proposed
model. The values of RMSEA range from zero to positive infinity, with a value
of zero indicating exact model fit and larger values indicating poorer model fit.
The computation of these indices is restricted to methods in which a chi-square
statistic is available, that is, when an ML or unweighted least squares (ULS)
factor extraction method is applied.
Another popular index is the standardized root mean square residual (SRMR;
see, e.g., Hu & Bentler, 1998). SRMR is a measure of the mean absolute correlation residual: the overall difference between the observed and the predicted
correlations. Ideally, these residuals should be zero for acceptable model fit.
The SRMR can be used with any factor extraction method, including ML, ULS,
principal axis factor analysis, and minimum rank factor analysis (Ten Berge &
Kiers, 1991; Ten Berge & Sočan, 2004).
We now propose an alternative goodness-of-fit index that can be used with
any extraction method. This index expresses the extent to which the common
variance in the data is captured in the common factor model. The index is
denoted as CAF (common part accounted for). To explain CAF, we first consider
the decomposition used in a common factor analysis. Let R be an observed correlation matrix of order m × m. A common factor analysis aims at decomposing R as

    R = A_q A_q′ + U + Θ̂_q = R̂_q + Θ̂_q,    (1)

where q is the number of factors extracted, A_q is a loading matrix of order
m × q, U is a diagonal matrix of unique variances, Θ̂_q is a residual matrix,
and R̂_q is the model-implied correlation matrix. The off-diagonal elements of
the residual matrix Θ̂_q can be regarded as an estimate of the model error at
the population level when q factors are extracted. At the population level, the
residual matrix will be free of common variance if q is larger than or equal to
the true number of factors. Therefore, a measure that expresses the amount of
common variance in the residual matrix reflects the fit of the common factor
model with q factors, and larger amounts of common variance indicate a
worse fit.
A measure that expresses the amount of common variance in a matrix is
found in the KMO (Kaiser, Meyer, Olkin) index (see Kaiser, 1970; Kaiser &
Rice, 1974). The KMO index is commonly used to assess whether a particular
correlation matrix R is suitable for common factor analysis (i.e., if there is
enough common variance to justify a factor analysis). The KMO index is given
as

    g(R) = Σ_i Σ_{j≠i} r_ij² / (Σ_i Σ_{j≠i} r_ij² + Σ_i Σ_{j≠i} a_ij²),    (2)

where the sums run over i, j = 1, …, m (i ≠ j), r_ij is the correlation between
variables i and j, and a_ij is their partial correlation. Hence, KMO expresses
the relative magnitude of the observed correlation coefficients and the partial
correlation coefficients. Values of g(R) close to one indicate that there is a
substantial amount of common variance in the correlation matrix, whereas values
equal to zero mean that there is no common variance.
Now, we propose to express the common part accounted for by a common
factor model with q common factors as 1 minus the KMO index of the estimated
residual matrix, namely,

    CAF_q = 1 − g(Θ̂_q).    (3)

The values of CAF_q are in the range [0, 1]; values close to zero mean that a
substantial amount of common variance is still present in the residual matrix
after the q factors have been extracted (implying that more factors should be
extracted). Values of CAF_q close to one mean that the residual matrix is free
of common variance after the q factors have been extracted (i.e., no more factors
should be extracted). A noteworthy characteristic of CAF_q is that it enables the
goodness-of-fit to be assessed when q = 0 (i.e., before any factors have been
extracted). In this situation the correlation matrix and the residual matrix
coincide, that is, R = 0 + Θ̂_0 = Θ̂_0.
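Equations 2 and 3 translate directly into code. In the sketch below the partial correlations are obtained from the inverse of the matrix (the anti-image approach), and placing ones on the diagonal of the residual matrix before computing g is our own convention; the authors' software may handle the residual diagonal differently.

```python
import numpy as np

def kmo_index(R):
    """g(R) of Equation 2: squared correlations relative to squared
    correlations plus squared partial correlations."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)      # anti-image (partial) correlations
    off = ~np.eye(R.shape[0], dtype=bool)
    r2 = (R[off] ** 2).sum()
    a2 = (partial[off] ** 2).sum()
    return r2 / (r2 + a2)

def caf(R, loadings, uniquenesses):
    """CAF_q of Equation 3 for a fitted q-factor solution (loadings and
    unique variances may come from any extraction method)."""
    residual = R - loadings @ loadings.T - np.diag(uniquenesses)
    np.fill_diagonal(residual, 1.0)      # treat residual as a correlation matrix
    return 1.0 - kmo_index(residual)
```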
Degrees of Freedom
The degrees of freedom in an EFA of a correlation matrix that involves m
variables and q factors is the number of unique variances plus the number of
entries in the pattern matrix minus the number of entries that are set to zero by
an orthogonal transformation of the factors (see, e.g., Basilevsky, 1994, p. 356).
The degrees of freedom when factor analyzing a correlation matrix is

    df = mq − q(q − 1)/2,    (4)

where the factors are assumed to be orthogonal. If a covariance matrix is
considered, the degrees of freedom would be df + m. However, as the constant
m would be added to all the possible factor solutions (i.e., for any value of q),
this difference would not affect the performance of Hull.
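Equation 4 reduces to a one-line helper (ours), used by the sketches that follow:

```python
def degrees_of_freedom(m, q):
    """Equation 4 for a correlation matrix; add m for a covariance matrix."""
    return m * q - q * (q - 1) // 2
```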
A Heuristic to Locate the Elbow
To select the elbow on the higher boundary of the convex hull, we follow the
proposal made by Ceulemans and Kiers (2006), namely, to maximize
    st_i = [(f_i − f_{i−1}) / (df_i − df_{i−1})] / [(f_{i+1} − f_i) / (df_{i+1} − df_i)].    (5)

The largest value of st indicates that allowing for df_i degrees of freedom (instead
of df_{i−1}) increases the fit of the model considerably, whereas allowing for more
than df_i hardly increases it at all. The i-th solution is at the elbow of the convex
hull because it is the solution after which the increase in fit levels off. The i-th
solution is related to a break or discontinuity in the convex hull and is detected
as an elbow.
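In code, Equation 5 compares the fit gain per degree of freedom before and after each interior point; the helper below is a minimal sketch with our naming.

```python
def scree_test_values(f, df):
    """st of Equation 5 for interior points of a solution list
    sorted by degrees of freedom."""
    st = {}
    for i in range(1, len(f) - 1):
        gain_before = (f[i] - f[i - 1]) / (df[i] - df[i - 1])
        gain_after = (f[i + 1] - f[i]) / (df[i + 1] - df[i])
        st[i] = gain_before / gain_after
    return st
```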
Stepwise Overview of the Procedure for Selecting the
Optimal Number of Factors
The numerical convex hull-based method for selection can be summarized in
eight steps:
1. A PA is performed to identify an upper bound for the number of factors J to be considered. To decrease the probability of underfactoring, we suggest a conservative threshold, such as the 95th percentile. Take J to be equal to the PA-indicated number plus 1.
2. Factor analyses are performed for the range of dimensions (j = 0, …, J), and the f_j and df_j values are computed for each factor solution, where f_j is a goodness-of-fit index and df_j the degrees of freedom.
3. The n solutions are sorted by their df values (this is equivalent to sorting the solutions by the number of extracted factors) and denoted by s_i (i = 1, …, n).
4. All solutions s_i are excluded for which a solution s_j (j < i) exists such that f_j > f_i. The aim is to eliminate the solutions that are not on the boundary of the convex hull.
5. Then all the triplets of adjacent solutions are considered consecutively. The middle solution is excluded if its point is below or on the line connecting its neighbors in a plot of the goodness-of-fit versus the degrees of freedom.
6. Step 5 is repeated until no solution can be excluded.
7. The st_i values of the “hull” solutions are determined.
8. The solution with the highest st_i value is selected.
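Steps 3–8 can be made concrete with a short sketch. Everything below (the dominance filter, the line test, the tie handling) is our reading of the steps above, not the authors' code; f and df are lists indexed by the number of factors (0, …, J), so they are already sorted as step 3 requires.

```python
def hull_select(f, df):
    """Return the number of factors at the sharpest elbow of the
    upper convex hull of the (df, f) points (steps 3-8)."""
    n = len(f)
    # Step 4: drop solutions whose fit is beaten by a smaller model.
    idx = [i for i in range(n) if all(f[j] <= f[i] for j in range(i))]
    # Steps 5-6: drop middle points on or below the line through their
    # neighbors until only the upper convex hull remains.
    changed = True
    while changed and len(idx) > 2:
        changed = False
        for k in range(1, len(idx) - 1):
            a, b, c = idx[k - 1], idx[k], idx[k + 1]
            slope = (f[c] - f[a]) / (df[c] - df[a])
            if f[b] <= f[a] + slope * (df[b] - df[a]):
                del idx[k]
                changed = True
                break
    # Steps 7-8: maximize st over the interior hull solutions.
    best, best_st = None, float("-inf")
    for k in range(1, len(idx) - 1):
        a, b, c = idx[k - 1], idx[k], idx[k + 1]
        gain_before = (f[b] - f[a]) / (df[b] - df[a])
        gain_after = (f[c] - f[b]) / (df[c] - df[b])
        st = gain_before / gain_after if gain_after > 0 else float("inf")
        if st > best_st:
            best, best_st = b, st
    return best
```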
SIMULATION STUDY
To compare the performance of our newly proposed Hull method with existing
methods, we performed a simulation study. We aim to control the amount of
common variance in the artificial data sets and introduce a certain amount of
model error due to the presence of minor dimensions.
Manipulation of Conditions in the Simulation Study
In this simulation study we manipulated five experimental factors: (a) the number
of major factors, (b) the number of measured variables, (c) the sample size,
(d) the level of interfactor correlations, and (e) the level of common variance.
The levels for each factor were the following:
(a) The number of major factors (q) ranged from 1 to 5.
(b) The number of measured variables (m) was related to the number of major factors. Two levels were considered: small (m/q = 5) and large (m/q = 20).
(c) Samples were of four sizes: very small (N = 100), small (N = 200), medium (N = 500), and large (N = 2,000).
(d) The interfactor correlations were of three sizes: no correlation (φ = .0), low correlation (φ = .20), and moderate correlation (φ = .40).
(e) The common variance of major factors had two levels: low (36% of the overall variance was common variance) and high (64% of the overall variance was common variance).
In the following section, we explain how the different characteristics were
introduced into the artificial data sets.
Data Construction
The simulation study constructs random data matrices X of N observations on
m variables, which are related to q major factors. The data were generated
according to the middle model of Tucker et al. (1969), which is rather popular
in simulation research (e.g., MacCallum & Tucker, 1991; MacCallum, Tucker,
& Briggs, 2001; Stuive, Kiers, & Timmerman, 2009; Zwick & Velicer, 1986).
The simulated data were constructed as

    X = w_major Λ_major Γ_major + w_minor Λ_minor Γ_minor + w_unicity Ψ,    (6)

where Λ_major and Λ_minor are loading matrices; Γ_major and Γ_minor are factor score
matrices; Ψ is a matrix containing unique scores; and, finally, w_major, w_minor,
and w_unicity are weighting values that are used to manipulate the variances of
the major, minor, and unique parts of the data. Matrix Λ_major (m × q) relates
the measured variables to the major factors and consists of only zeros and ones.
Each row contains a single value of one, denoting the relationship between
the observed variable and a single major factor. For reasons of simplicity, all
columns had the same number of ones, so each major factor was defined by
the same number of measured variables. Matrix Λ_minor (m × c), with c being
the number of minor factors, is the minor loading matrix. The number of minor
factors was taken as c = 3q. The elements of Λ_minor were randomly drawn
from a uniform distribution on the interval [−1, 1]. The variance related to the
minor factors was kept constant, and the minor factors accounted for 15% of the
overall variance.

The matrices Γ_major (q × N) and Γ_minor (c × N) contain the random scores on
the major and minor factors, respectively. As explained earlier, the interfactor
correlation was manipulated. For a given level of interfactor correlation, it
was kept constant across all factors. The matrix Ψ (m × N) contains the
random unique scores. All random scores were drawn from multivariate normal
distributions N(0, I).
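To make Equation 6 concrete, the sketch below generates one data matrix. The way the square-root weights impose the 36%/15% variance split is our simplification (the rows of Λ_minor, for instance, are not rescaled to unit variance), so it illustrates the structure of the middle model rather than reproducing the authors' generator exactly.

```python
import numpy as np

def simulate_middle_model(N, q, m_per_q=5, phi=0.20, common=0.36,
                          minor=0.15, seed=0):
    """One draw from X = w_major L_major G_major + w_minor L_minor G_minor
    + w_unicity Psi (Equation 6). Returns an N x m data matrix."""
    rng = np.random.default_rng(seed)
    m, c = m_per_q * q, 3 * q
    L_major = np.kron(np.eye(q), np.ones((m_per_q, 1)))  # zero/one pattern
    L_minor = rng.uniform(-1.0, 1.0, size=(m, c))
    Phi = np.full((q, q), phi)
    np.fill_diagonal(Phi, 1.0)                           # interfactor correlations
    G_major = rng.multivariate_normal(np.zeros(q), Phi, size=N).T
    G_minor = rng.standard_normal((c, N))
    Psi = rng.standard_normal((m, N))
    w_major, w_minor = np.sqrt(common), np.sqrt(minor)
    w_unicity = np.sqrt(1.0 - common - minor)
    X = (w_major * (L_major @ G_major)
         + w_minor * (L_minor @ G_minor)
         + w_unicity * Psi)
    return X.T
```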
The five conditions were manipulated in an incompletely crossed design,
with 500 replicates per cell. For the conditions m/q = 20 and N = 100, only
three major factor values were considered (q = 1, q = 2, and q = 3): the
conditions with q = 4 and q = 5 were excluded because the ratio between the
number of variables (minimally 80) and the sample size was so unfavorable that the
factor extraction method frequently failed to converge. The remaining conditions
were examined in a fully crossed design, which yielded (5 (number of major
factors) × 2 (number of observed variables) × 4 (sample sizes) × 3 (interfactor
correlations) × 2 (major factor variance) × 500 (replicates)) − (2 (number of
major factors) × 1 (number of observed variables) × 1 (sample size) × 3
(interfactor correlations) × 2 (major factor variance) × 500 (replicates)) =
120,000 − 6,000 = 114,000 simulated data matrices.
Methods for Determining the Number of Factors in Data
The aim of the study was to compare the performances of the Hull method
and existing methods in identifying the number of major factors. We selected
existing methods that are promising and/or frequently used in empirical practice.
We excluded those methods that had already been proven to perform poorly (i.e.,
Kaiser's [1960] rule) and those that require subjective judgment to be applied (i.e.,
the scree test). Each simulated data matrix was analyzed with the following
methods:
1. The Hull method, using the four goodness-of-fit indices presented earlier:
CFI (Hull-CFI), RMSEA (Hull-RMSEA), SRMR (Hull-SRMR), and CAF
(Hull-CAF).
2. The MAP test (Velicer, 1976). Although the MAP test is designed to
assess the number of PCs, it is frequently applied to assess the number
of common factors. To assess whether this practice is useful, we included
the MAP test in our comparison.
3. Horn’s (1965) PA. PA can be implemented in several different ways. As
not enough is known about PA of the common factor model to recommend
its general use (Steger, 2006), and PCA-related implementations are much
more popular, we used the latter. We considered a variant of PA that
appeared to be highly efficient in Peres-Neto et al. (2005) and which they
refer to as Rnd-Lambda. It is implemented as follows: (a) 999 random
correlation matrices are computed as the correlation matrices of the randomized
values within variables in the data matrix, (b) the p value for each
empirical eigenvalue is then estimated as (number of random values equal
to or larger than the observed + 1)/1,000, and (c) the number of eigenvalues
related to a p value lower than a given threshold is considered the optimal
number of dimensions to be retained. In our study, we used a threshold p
value of .05 (a code sketch of this scheme follows the list below).
4. AIC (Akaike, 1973) and BIC (Schwarz, 1978).
Some of the methods required an EFA to be carried out on the correlation
matrix and, to this end, we systematically used exploratory maximum likelihood
factor analysis.
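Steps (a)–(c) of the Rnd-Lambda variant translate almost line by line into code; the following is a minimal sketch of the permutation scheme as we read it, with our own names and defaults.

```python
import numpy as np

def rnd_lambda_pa(X, n_perm=999, alpha=0.05, seed=0):
    """Permutation-based PA: p values for eigenvalues from data with
    values randomized (permuted) within each variable."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    emp = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]
    count = np.zeros(m)                  # random eigenvalues >= observed
    for _ in range(n_perm):
        Xp = np.column_stack([rng.permutation(X[:, j]) for j in range(m)])
        rand = np.linalg.eigvalsh(np.corrcoef(Xp, rowvar=False))[::-1]
        count += rand >= emp
    p = (count + 1) / (n_perm + 1)       # (count + 1)/1,000 for 999 draws
    keep = p < alpha
    return m if keep.all() else int(np.argmin(keep))
```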
Results
For each simulated data matrix, we recorded whether each of the analysis
methods indicated the correct number of major factors (i.e., the value of q).
To clarify the presentation of the results, we first focus on the methods that
showed an overall performance that was moderate to good (that is, an overall
success rate higher than 70%). These methods appeared to be PA, BIC, HullCAF, Hull-CFI, and Hull-RMSEA. Table 1 shows the percentage of successes for
those five overall moderate-to-good performing methods within each condition
of the simulation experiment and across all conditions.
As can be seen in Table 1, the overall success rate of PA and BIC was
moderate: 81% and 76%, respectively. The success rates of PA were good
(between 92% and 98%) when m/q was low, the sample size was very small
or small (i.e., 100 or 200, respectively), and/or the communality of the major
factors was high. BIC performed well (i.e., 91% or more correct) when sample
sizes were small or medium (i.e., 200 and 500, respectively).
The initial inspection of the results revealed that the three Hull variants
(Hull-CAF, Hull-CFI, and Hull-RMSEA) performed relatively well, with overall
success rates between 89% and 96% (not shown in Table 1). Of the three, the
Hull-CFI was the most successful. A close inspection of the results revealed that
Hull variants sometimes indicated that 0 factors should be extracted. Hull-CAF
was the most affected variant and showed this outcome in 3,911 of the 114,000
data matrices analyzed. This was mainly observed when the m/q ratio was
small, and the number of major factors was 1 (3,414 times) or 2 (374 times). If
Hull variants were forced to indicate at least one factor, the overall success of
Hull-CAF improved from 89% to 92%. All the results in Tables 1 and 2 were
computed when Hull variants were forced to indicate at least one factor.
TABLE 1
Percentages of Correctly Indicated Number of Major Factors for Methods With an Overall
Success Rate Higher Than 70%

Condition    Level         PA    BIC    Hull-CAF    Hull-CFI    Hull-RMSEA
h² majors    Low           70    71     88          95          85
             High          92    81     96          100         98
N            Very small    93    79     85          93          77
             Small         93    92     92          97          90
             Medium        78    91     94          98          97
             Large         62    43     95          99          99
m/q          Low           98    82     88          95          85
             Large         62    69     97          100         99
φ            .00           82    79     94          99          95
             .20           82    77     93          99          94
             .40           79    72     89          93          86
q            1             83    73     96          100         100
             2             82    81     89          99          92
             3             82    80     92          98          92
             4             80    73     92          96          88
             5             79    72     91          93          84
Overall                    81    76     92          97          91

Note. PA = parallel analysis; BIC = Bayesian information criterion; Hull-CAF = Hull based
on the Common Part Accounted For index; Hull-CFI = Hull based on the Comparative Fit Index;
Hull-RMSEA = Hull based on the Root Mean Square Error of Approximation index.
As can be seen in Table 1, the experimental conditions appeared to affect
the three Hull variants in different ways. The performance of Hull-RMSEA was
particularly affected (success rates between 77% and 84%) when the sample
size was very low or when the number of major factors was large (five). The
performances of Hull-CFI and Hull-CAF also decreased when the number of
major factors and interfactor correlations increased, but they were affected to
a lesser extent than Hull-RMSEA. Hull-CFI was especially successful (100%)
when m/q was large, the communality of the major factors was large, and the
number of major factors was one.
One interesting outcome is that no method in the simulation study was
systematically superior across all the simulation conditions considered. For
example, even though the overall success of PA was limited, it was still the most
successful method when m/q was low or when the sample size was very small.
The sample size condition proved to have considerable impact on the performance of the methods. The performances of the three Hull variants improved as
the sample size increased, as would be expected of a well-behaved
statistic. The performance of BIC dropped dramatically from medium (500) to
TABLE 2
Percentages of Correctly Indicated Number of Major Factors for Methods With an Overall
Success Rate Higher Than 70%: Interaction Among Conditions

                                 m/q = 5                                      m/q = 20
N       q     PA    BIC   Hull-CAF  Hull-CFI  Hull-RMSEA    PA    BIC   Hull-CAF  Hull-CFI  Hull-RMSEA
100     1     100   100   100       100       100           93    100   94        99        98
        2     98    92    75        94        59            95    100   96        100       98
        3     94    68    77        90        59            97    100   94        99        98
        4     88    44    76        83        54            —     —     —         —         —
        5     78    31    71        76        47            —     —     —         —         —
200     1     100   100   100       100       100           80    97    93        100       99
        2     100   99    79        98        83            85    100   97        100       100
        3     100   90    86        96        82            88    100   99        100       100
        4     98    73    87        93        77            93    100   99        100       99
        5     96    60    84        87        71            94    100   98        100       94
500     1     100   97    100       100       100           55    33    93        100       100
        2     100   100   84        100       98            53    95    97        100       100
        3     100   99    90        98        96            55    100   98        100       100
        4     100   94    92        95        91            57    100   99        100       100
        5     100   93    90        92        86            61    100   100       100       99
2,000   1     100   56    100       100       100           32    0     90        99        100
        2     100   66    86        100       100           22    0     96        100       100
        3     100   86    93        100       99            19    0     98        100       100
        4     100   97    94        98        97            22    1     99        100       100
        5     100   100   92        95        92            23    22    100       100       100

Note. PA = parallel analysis; BIC = Bayesian information criterion; Hull-CAF = Hull based on the Common
Part Accounted For index; Hull-CFI = Hull based on the Comparative Fit Index; Hull-RMSEA = Hull based on
the Root Mean Square Error of Approximation index. Dashes indicate the conditions that were not computed in the
simulation study.
large (2,000) sample sizes. The performance of PA also consistently decreased
with increasing sample size.
The number of variables for each major factor (i.e., m/q) also had a considerable impact: when the proportion was low, PA was the most successful method.
However, when the proportion was large, only the Hull variants were successful.
The correlation among the major factors also affected method performance: when
the correlations were low or moderate, only the three Hull-based methods were
reasonably successful (i.e., success rates larger than 85%). Of the three, Hull-CFI
performed the best (i.e., success rates of 99% and 93%, respectively). Finally,
a large number of major factors had a negative impact on all the methods, and
only Hull-CFI and Hull-CAF were successful (i.e., success rates larger than
90%) when there were five major factors.
Applied researchers may be interested in knowing whether there is any
interaction between the sample size, the expected number of major factors,
and the proportion m/q. Table 2 shows the performance of PA, BIC, and the
three successful Hull variants. As the interfactor correlations and communality
of major factors cannot be manipulated for empirical data, we aggregated the
data across these experimental conditions. As can be observed in Table 2, the
five methods were affected quite differently by the experimental manipulations.
It is interesting, for example, to note that BIC performed well, except when the
sample was large (and much worse if the proportion m/q was also large). PA
performed very well when the m/q ratio was small. However, when the m/q
ratio was large, PA performed increasingly worse as the sample size increased.
Hull-CFI performed excellently (success rates between 99% and 100%) when
the m/q ratio was large and very well when the m/q ratio was small and the
sample size larger than 100 (success rates between 87% and 100%). Hull-CAF
performed very well (success rates between 90% and 100%) when the proportion
m/q was large and reasonably well when the m/q ratio was small and the sample size
was larger than 100 (success rates between 79% and 100%). Hull-RMSEA performed
very well (success rates between 94% and 100%) when the m/q ratio was large
but relatively badly when the m/q ratio was low, the sample size was lower than
500, and there was more than one major factor (success rates between 47% and
83%).
Finally, we studied the three methods that showed rather low overall success
rates: Hull-SRMR (55%), MAP (51%), and AIC (44%). Even though their
overall success rate was low, it is interesting that they still performed very
well in some of the experimental conditions. Hull-SRMR was very successful
(success rates between 84% and 100%) when m/q was large, the number of
major factors was below three, and the sample was large (N > 100). MAP was
particularly successful (success rates between 85% and 100%) when m/q was
low and the sample large (N > 200). AIC showed good success rates (success
rates between 94% and 97%) when m/q was low, the sample was very small,
and the number of major factors was between one and three.
ILLUSTRATIVE EXAMPLE
The Five-Factor Personality Inventory (FFPI) is a Dutch questionnaire developed
to assess the Big Five model of personality (Hendriks et al., 1999). The FFPI
consists of 100 Likert-type items rated on a 5-point scale. The FFPI assesses
a person’s scores on the following content dimensions: Extraversion, Agreeableness, Conscientiousness, Emotional Stability, and Autonomy. The FFPI is
a balanced questionnaire designed so that variance due to acquiescence response can be isolated in an independent dimension (Lorenzo-Seva & Rodríguez-Fornells, 2006). Overall, six dimensions are usually extracted, which have a clear
substantive meaning (one dimension related to acquiescence response and five
dimensions related to content factors). Hendriks et al. constructed the test using
a sample of 2,451 individuals. This data set is used here to test the Hull method
for selecting the number of factors.
We performed Horn’s (1965) PA with the 95th percentile threshold. The
PA analysis suggested that 11 dimensions were related to a variance larger
than random variance. As a result, we considered the solutions ranging from 0
common factors to 12 common factors. We analyzed the interitem correlation
matrices using minimum rank factor analysis (Ten Berge & Kiers, 1991). As a
goodness-of-fit index, we used the CAF.
Figure 1 (already introduced in the Hull Method section) shows the hull plot
related to this data set: the plot is based on the CAF versus the degrees of
freedom for the 13 solutions. When the numerical convex hull-based method for
selection was applied to this plot, five “hull” solutions were provided (i.e., five
solutions were placed on the convex hull), which are indicated by larger points
in Figure 1. The scree test values st_i for these “hull” solutions are given in
Table 3. The solution with six common factors is associated with the maximal
st_i value, indicating that this solution should be selected.
The Spanish version of the FFPI was validated using a sample of 567 undergraduate college students enrolled on an introductory course in psychology at a
Spanish university (Rodríguez-Fornells, Lorenzo-Seva, & Andrés-Pueyo, 2001).
We also applied the Hull method to this data set to assess whether the method
TABLE 3
Number of Factors q, Goodness-of-Fit Values f,
Degrees of Freedom df, and Scree Test Values st
for the 5 Solutions on the Higher Boundary
of the Convex Hull in Figure 1

q      f      df      st
0*     .052   0       —
1      .096   100     —
2      .129   199     —
3      .169   297     —
4      .223   394     —
5      .300   490     —
6*     .377   585     2.542
7*     .398   679     2.054
8      .408   772     —
9      .417   864     —
10     .427   955     —
11*    .437   1,045   1.755
12*    .443   1,134   —

Note. The scree test value (st) is only computed for the
solutions placed on the convex hull; the dashes indicate the
solutions that are not placed on the convex hull; * = solutions
on the higher boundary.
still suggested that 6 dimensions should be extracted if a different, smaller
sample was selected. With this data set, Horn’s (1965) PA suggested that 11
dimensions were related to a variance larger than random variance. Applying
the numerical convex hull-based selection method to the corresponding hull
plot yielded four “hull” solutions. The scree test values st indicated that the
six-factor solution should be selected.
DISCUSSION
We have proposed a new method for assessing the number of major factors
that underlie a data set. The Hull method is based on the numerical convex
hull-based heuristic (Ceulemans & Kiers, 2006). It uses Horn’s (1965) PA to
determine an upper bound for the number of factors to be considered as possible
optimal solutions. The Hull method can be based on different goodness-of-fit
indices, which yield different Hull variants. We have illustrated the Hull method
with an empirical example. To assess its comparative performance, we carried
out a simulation study in which both major and minor factors were considered.
In empirical research, it is reasonable to expect that data sets will contain these
two kinds of factors. Four variants of the Hull method were compared with four
other methods for indicating the number of common factors. Of these four, two
(MAP and PA) were designed to indicate the number of principal components.
They were included because they are frequently used in applied research in the
context of EFA.
The Hull variants were very successful and detected the number of major
factors on a high number of occasions (between 85% and 94%). Hull-CFI (i.e.,
Hull based on CFI) and, to a somewhat lesser extent, Hull-CAF (i.e., Hull based
on KMO) were the most successful methods. Hull-CFI can be applied only when
ML or ULS factor extraction is used, whereas Hull-CAF can be used with
any factor extraction method. In empirical practice, we would advise that Hull-CFI be applied with ML or ULS extraction methods and Hull-CAF with any
other extraction method to indicate the number of major factors, except when the
number of observed variables per factor is small (five in our simulation study).
In these cases, PA seems to be an excellent alternative.
As mentioned earlier, Hull assesses the goodness-of-fit of a series of factor
solutions. In principle, the Hull method is not necessarily tied to a particular
index, and different fit indices could be used. We considered four indices (CAF,
CFI, RMSEA, and SRMR). Our results indicate that the fit index used affects
the performance of the Hull method. The CFI and CAF appeared to be the
most successful indices, whereas the performance of SRMR was unacceptable
(55% of overall success). Our advice is to use Hull in conjunction with one
of these two indices mainly because in our intensive tests they performed well.
However, researchers could use other fit indices as long as they are properly
tested.
As researchers do not usually consider extracting as many factors as observed
variables, we propose that Hull should not do so either. Hull, therefore, takes as
an upper bound the number of factors suggested by PA plus one. To check
the appropriateness of this decision, in a previous simulation study we also
computed Hull-CAF with all the possible numbers of factors to be extracted (i.e.,
any value between 1 and m − 1). The overall success rate (defined as whether
the correct number of major factors was indicated) was extremely low (46%).
This outcome reinforces our suggestion that Hull must look for a relatively low
number of factors and that taking the number suggested by PA plus one as the
upper bound of the number of factors to be extracted is a useful decision.
Of the methods considered, Horn’s (1965) PA was also moderately successful
(81%), especially when the number of measured variables per factor was low.
It should be pointed out that although our implementation was related not to
the common factor model but to the principal component model, it performed
relatively well (especially when the number of variables per factor was low).
This is an interesting result because applied researchers assessing the number of
common factors usually use this type of implementation. In our simulation study,
we used equal communalities across variables in the population to ensure that
the major factors were well defined in the data set (i.e., we did not want major
factors to be easily confused with minor factors). Although a PCA solution yields
biased loadings (compared with EFA loadings), with bias increasing consistently
as communalities decrease, the overall loading pattern is the same as the one
obtained in EFA (when communalities are similar, as they were in our case).
In this situation, our implementation of PA based on PCA could in fact be
expected to perform well even if we proposed a common factor model. Further
research is needed to study the performance of this implementation of PA in
a common factor context when the measured variables have different levels of
communalities.
BIC was not very successful overall. This was due to the fact that it performed
very badly when the sample size was large (2,000 in our simulation study).
As AIC (Akaike, 1973) had been observed to overestimate the number of
dimensions in large samples, Schwarz (1978) proposed BIC as a refined version
of AIC. BIC, then, was expected to improve the performance of AIC (44%
of success in our simulation study) and, although in our study it was a clear
improvement on AIC, in large samples it still seems to have difficulties.
MAP (Velicer, 1976) performed badly in the overall results (51% of success).
It should be pointed out that MAP had not been previously tested in the context of
the common factor model or in situations in which major and minor dimensions
were present in the data. This method was mainly successful when m/q was
low and the sample was large (N > 200).
We do not wish to end without one final comment. We have proposed an
objective procedure for assessing the dimensionality of data sets and have
compared it with other such procedures. We believe that these objective criteria are
useful for inexperienced researchers and in the early stages of an exploratory factor
analysis. However, they should never be interpreted rigidly: they are helpful tools
for making an initial selection of interesting solutions. For the final selection
decision, substantive information and the interpretability of the results should
also be taken into account. Blindly retaining a particular number of factors
solely on the say-so of a criterion is bad practice and should always be avoided.
To facilitate the use of the four variants of Hull considered here in
empirical practice, we have developed a Windows program. A copy of this
program, a demo, and a short manual can be obtained from us free of charge or
downloaded from the following site: http://psico.fcep.urv.cat/utilitats/factor/
ACKNOWLEDGMENTS
Research was partially supported by a grant from the Catalan Ministry of
Universities, Research and the Information Society (2009SGR1549) and by a
grant from the Spanish Ministry of Education and Science (PSI2008-00236). We
are obliged to Jos ten Berge for his helpful comments on an earlier version
of this article.
REFERENCES
Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In
B. N. Petrov & F. Csáki (Eds.), Second International Symposium on Information Theory (pp. 267–281).
Budapest, Hungary: Akadémiai Kiadó.
Basilevsky, A. (1994). Statistical factor analysis and related methods: Theory and applications.
New York, NY: Wiley.
Bentler, P. M., & Kano, Y. (1990). On the equivalence of factors and components. Multivariate
Behavioral Research, 25, 67–74.
Browne, M. W., & Cudeck, R. (1992). Alternative ways of assessing model fit. Sociological Methods
& Research, 21, 230–258.
Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research,
1, 245–276.
Cattell, R. B., & Tsujioka, B. (1964). The importance of factor-trueness and validity, versus homogeneity and orthogonality, in test scales. Educational and Psychological Measurement, 24,
3–30.
Ceulemans, E., & Kiers, H. A. L. (2006). Selecting among three-mode principal component models
of different types and complexities: A numerical convex hull based method. British Journal of
Mathematical and Statistical Psychology, 59, 133–150.
Ceulemans, E., & Kiers, H. A. L. (2009). Discriminating between strong and weak structures in three-mode
principal component analysis. British Journal of Mathematical and Statistical Psychology,
62, 601–620.
Downloaded by [University of Groningen] at 06:45 17 November 2011
362
LORENZO-SEVA, TIMMERMAN, KIERS
Ceulemans, E., Timmerman, M. E., & Kiers, H. A. L. (2011). The CHull procedure for selecting
among multilevel component solutions. Chemometrics and Intelligent Laboratory Systems, 106,
12–20.
Ceulemans, E., & Van Mechelen, I. (2005). Hierarchical classes models for three-way three-mode
binary data: Interrelations and model selection. Psychometrika, 70, 461–480.
Cliff, N. (1988). The eigenvalues-greater-than-one rule and the reliability of components. Psychological Bulletin, 103, 276–279.
Comrey, A. L. (1978). Common methodological problems in factor analytic studies. Journal of
Consulting and Clinical Psychology, 46, 648–659.
Costello, A. B., & Osborne, J. W. (2005). Best practices in exploratory factor analysis: Four
recommendations for getting the most from your analysis. Practical Assessment, Research &
Evaluation, 10(7), 1–9.
Crawford, C. B., & Koopman, P. (1973). A note on Horn’s test for the number of factors in factor
analysis. Multivariate Behavioral Research, 8, 117–125.
Eklöf, H. (2006). Development and validation of scores from an instrument measuring student
test-taking motivation. Educational and Psychological Measurement, 66, 643–656.
Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of
exploratory factor analysis in psychological research. Psychological Methods, 4, 272–299.
Fava, J. L., & Velicer, W. F. (1992). The effects of overextraction on factor and component analysis.
Multivariate Behavioral Research, 27, 387–415.
Finch, J. F., & West, S. G. (1997). The investigation of personality structure: Statistical models.
Journal of Research in Personality, 31, 439–485.
Glorfeld, L. W. (1995). An improvement on Horn’s parallel analysis methodology for selecting the
correct number of factors to retain. Educational and Psychological Measurement, 55, 377–393.
Guttman, L. (1954). Some necessary conditions for common-factor analysis. Psychometrika, 19,
149–161.
Harshman, R. A., & Reddon, J. R. (1983). Determining the number of factors by comparing real
with random data: A serious flaw and some possible corrections. Proceedings of the Classification
Society of North America at Philadelphia, 14–15.
Hayton, J. C., Allen, D. G., & Scarpello, V. (2004). Factor retention decisions in exploratory factor
analysis: A tutorial on parallel analysis. Organizational Research Methods, 7, 191–205.
Hendriks, A. A. J., Hofstee, W. K. B., & De Raad, B. (1999). The Five-Factor Personality Inventory
(FFPI). Personality and Individual Differences, 27, 307–325.
Henson, R. K., & Roberts, J. K. (2006). Use of exploratory factor analysis in published research:
Common errors and some comment on improved practice. Educational and Psychological Measurement, 66, 393–416.
Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika,
30, 179–185.
Hu, L., & Bentler, P. M. (1998). Fit indices in covariance structure modeling: Sensitivity to underparameterized model misspecification. Psychological Methods, 3, 424–453.
Humphreys, L. G., & Ilgen, D. R. (1969). Note on a criterion for the number of common factors.
Educational and Psychological Measurement, 29, 571–578.
Humphreys, L. G., & Montanelli, R. G. (1975). An investigation of the parallel analysis criterion
for determining the number of common factors. Multivariate Behavioral Research, 10, 193–205.
Jennrich, R. I., & Robinson, S. M. (1969). A Newton-Raphson algorithm for maximum likelihood
factor analysis. Psychometrika, 34, 111–123.
Jolliffe, I. T. (1972). Discarding variables in a principal component analysis I: Artificial data. Applied
Statistics, 21, 160–173.
Jolliffe, I. T. (1973). Discarding variables in a principal component analysis II: Real data. Applied
Statistics, 22, 21–31.
Downloaded by [University of Groningen] at 06:45 17 November 2011
THE HULL METHOD
363
Jöreskog, K. G. (1977). Factor analysis by least-squares and maximum-likelihood methods. In K.
Enslein, A. Ralston, & H. S. Wilf (Eds.), Statistical methods for digital computers (Vol. 3, pp.
127–153). New York, NY: Wiley.
Kahn, J. H. (2006). Factor analysis in counseling psychology research, training, and practice. The
Counseling Psychologist, 34, 684–718.
Kaiser, H. F. (1960). The application of electronic computers to factor analysis. Educational and
Psychological Measurement, 20, 141–151.
Kaiser, H. F. (1970). A second generation of Little Jiffy. Psychometrika, 35, 401–415.
Kaiser, H. F., & Rice, J. (1974). Little Jiffy, Mark IV. Educational and Psychological Measurement,
34, 111–117.
Lance, C. E., Butts, M. M., & Michels, L. C. (2006). The sources of four commonly reported cutoff
criteria: What did they really say? Organizational Research Methods, 9, 202–220.
Lattin, J., Carroll, D. J., & Green, P. E. (2003). Analyzing multivariate data. Belmont, CA: Duxbury
Press.
Lawrence, F. R., & Hancock, G. R. (1999). Conditions affecting integrity of a factor solution under
varying degrees of overextraction. Educational and Psychological Measurement, 59, 549–578.
Levonian, E., & Comrey, A. L. (1966). Factorial stability as a function of the number of orthogonally
rotated factors. Behavioral Science, 11, 400–404.
Lorenzo-Seva, U., & Rodríguez-Fornells, A. (2006). Acquiescent responding in balanced multidimensional scales and exploratory factor analysis. Psychometrika, 71, 769–777.
MacCallum, R. C., & Tucker, L. R. (1991). Representing sources of error in the common-factor
mode: Implications for theory and practice. Psychological Bulletin, 109, 502–511.
MacCallum, R. C., Tucker, L. R., & Briggs, N. E. (2001). An alternative perspective on parameter
estimation in factor analysis and related methods. In R. Cudeck, S. du Toit, & D. Sörbom
(Eds.), Structural equation modeling: Present and future (pp. 39–57). Lincolnwood, IL: Scientific
Software International, Inc.
Markon, K. E., & Krueger, R. F. (2004). An empirical comparison of information-theoretic selection
criteria for multivariate behavior genetic models. Behavior Genetics, 34, 593–609.
Peres-Neto, P. R., Jackson, D. A., & Somers, K. M. (2005). How many principal components?
Stopping rules for determining the number of non-trivial axes revisited. Computational Statistics
& Data Analysis, 49, 974–997.
Reddon, J. R., Marceau, R., & Jackson, D. N. (1982). An application of singular value decomposition
to the factor analysis of MMPI items. Applied Psychological Measurement, 6, 275–283.
Rodríguez-Fornells, A., Lorenzo-Seva, U., & Andrés-Pueyo, A. (2001). Psychometric properties of
the Spanish adaptation of the Five-Factor Personality Inventory. European Journal of Psychological Assessment, 17, 133–145.
Schepers, J., Ceulemans, E., & Van Mechelen, I. (2008). Selecting among multi-mode partitioning
models of different complexities: A comparison of four model selection criteria. Journal of
Classification, 25, 67–85.
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461–464.
Steger, M. F. (2006). An illustration of issues in factor extraction and identification of dimensionality
in psychological assessment data. Journal of Personality Assessment, 86, 263–272.
Steiger, J. H., & Lind, J. M. (1980, June). Statistically based tests for the number of common factors.
Paper presented at the annual meeting of the Psychometric Society, Iowa City, IA.
Stuive, I., Kiers, H. A. L., & Timmerman, M. E. (2009). Comparison of methods for adjusting
incorrect assignments of items to subtests: Oblique multiple group method versus confirmatory
common factor method. Educational and Psychological Measurement, 69, 948–965. doi:10.1177/
0013164409332226
Ten Berge, J. M. F., & Kiers, H. A. L. (1991). A numerical approach to the approximate and the
exact minimum rank of a covariance matrix. Psychometrika, 56, 309–315.
Ten Berge, J. M. F., & Socan, G. (2004). The greatest lower bound to the reliability of a test and
the hypothesis of unidimensionality. Psychometrika, 69, 613–625.
Thompson, B., & Daniel, L. G. (1996). Factor analytic evidence for the construct validity of scores:
A historical overview and some guidelines. Educational and Psychological Measurement, 56(2),
197–208.
Tucker, L. R., Koopman, R. F., & Linn, R. L. (1969). Evaluation of factor analytic research
procedures by means of simulated correlation matrices. Psychometrika, 34, 421–459.
Velicer, W. F. (1976). Determining the number of components from the matrix of partial correlations.
Psychometrika, 41, 321–327.
Velicer, W. F., Eaton, C. A., & Fava, J. L. (2000). Construct explication through factor or component
analysis: A review and evaluation of alternative procedures for determining the number of factors
or components. In R. D. Goffin & E. Helmes (Eds.), Problems and solutions in human assessment:
Honoring Douglas N. Jackson at seventy (pp. 41–71). Boston, MA: Kluwer.
Widaman, K. F. (1993). Common factor analysis versus principal component analysis: Differential
bias in representing model parameters? Multivariate Behavioral Research, 28, 263–311.
Wood, J. M., Tataryn, D. J., & Gorsuch, R. L. (1996). Effects of under- and overextraction on
principal axis factor analysis with varimax rotation. Psychological Methods, 1, 354–365.
Worthington, R. L., & Whittaker, T. A. (2006). Scale development research: A content analysis and
recommendations for best practices. The Counseling Psychologist, 34, 806–838.
Zwick, W. R., & Velicer, W. F. (1986). Comparison of five rules for determining the number of
components to retain. Psychological Bulletin, 99, 432–442.