Running Head: Taxometrics
On the Detection of Latent Structures:
An Analytic Consideration of Meehl’s MAXCOV,
MAMBAC, and MAXSLOPE Procedures
Michael Maraun, Peter Halpin,
and Stephanie Gabriel
Department of Psychology
Simon Fraser University
Correspondence: Michael Maraun, Dept. of Psychology, Simon Fraser
University, Burnaby, B.C., Canada V5A 1S6 Fax: (604) 291-3427
Abstract
In a series of papers, psychometrician Paul Meehl and colleagues have developed
what he calls taxometrics, a set of procedures which, he claims, can be used to effectively
detect taxonic latent structures (TLSs; i.e., structures in which the latent variable θ is not
continuously, but rather Bernoulli, distributed) when, in fact, they do exist. Three of the
most widely used taxometric procedures are MAXCOV, MAMBAC, and MAXSLOPE.
As implied by a number of comments made by McDonald (2003) in his review of
Waller and Meehl’s (1998) Multivariate taxometric procedures: Distinguishing types
from continua, while Meehl’s taxometric procedures are the product of a true
spirit of inventiveness, the work that he has published in support of these
procedures is perhaps short on proofs of statistical assertions, and, more
generally, psychometric theory. Given that usage of these procedures is on the rise, it
is imperative that an analytical consideration of the claims that MAXCOV, MAMBAC,
and MAXSLOPE are detectors of the taxonic latent structure, be undertaken. This is the
task undertaken in the current work.
On the Detection of Latent Structures: An Analytic Consideration of Meehl’s
MAXCOV, MAMBAC, and MAXSLOPE Procedures
In a series of papers, psychometrician Paul Meehl and colleagues introduced to
the researcher what he calls taxometrics, a set of procedures which, he claims, can be
used to effectively detect taxonic latent structures (TLSs; i.e., structures in which the
latent variable θ is not continuously, but rather Bernoulli, distributed). Examples of
these procedures are MAXCOV (Meehl & Yonce, 1996), MAMBAC (Meehl & Yonce,
1994), MAXSLOPE (Grove & Meehl, 2003), and MAXEIG (Waller & Meehl, 1998). In
support of their claims that these procedures are detectors of latent taxa, Meehl and
colleagues have offered up a mixture of mathematical deductions, reasoned
arguments, and Monte Carlo simulations, and their generally positive
assessments of the performances of their procedures would appear to have been
corroborated by an ever increasing number of independent Monte Carlo studies.
On the other hand, as implied by a number of comments made by McDonald (2003)
in his review of Waller and Meehl’s (1998) Multivariate taxometric procedures:
Distinguishing types from continua, this support is perhaps short on proofs of
statistical assertions, and, more generally, psychometric theory.
In the current work, an analytical investigation is conducted into the claims that
MAXCOV, MAMBAC, and MAXSLOPE are detectors of the TLS.1 Such an undertaking
is destined to fail if the terms on which it is founded are not clearly defined. Thus, before
proceeding to the analysis, the concepts latent structure, defining characteristic and
manifest criterion of a latent structure, and assumption attendant to the employment of a
1
The MAXEIG procedure is handled in Maraun, Halpin, Gabriel, and Tkatchouk (2007).
tool of detection, and a review of the logic of employing manifest criteria as tools for the
detection of latent structures, are explicated.2 Finally, the support Meehl and his coauthors have provided for their candidate tools of detection, notably, the Monte Carlo
support that is a hallmark of their efforts, is considered.
1. Latent structures, manifest criteria, and assumptions
Let there be a set of p manifest variates (indicators), $X_j$, j=1..p, contained in
vector X, and $\theta$ be a latent or unobservable variate, and let all variates be jointly
distributed in a focal population P. A (unidimensional) latent structure, LS, is a
set of (latent) properties, $t_i$, each pertaining to either: (a) $f_{X \mid \theta}$, the joint density of
the manifest variates, conditional on $\theta$; or (b) $f_{\theta}$, the joint density of $\theta$ itself (see,
e.g., Lazarsfeld, 1950; Anderson, 1959; McDonald, 1962; Bartholomew & Knott,
1999). Thus, a particular latent structure LS* is specified unambiguously as the
intersection, $\bigcap_{i=1}^{k} t_i$, of its k defining characteristics, $t_i$, each of these being of either
type (a) or (b).
2
As will be seen, it has become distressingly commonplace, and damaging, in theoretical work on latent
variable structures to conflate the concepts of assumption and defining characteristic.
Whereas a latent property is a property of a latent structure and, hence, a
property of either type (a) or (b), a manifest property, $p_i$, is a property of the joint
density, $f_X$, of the indicators. A manifest criterion, $c_i$, of a latent structure
LS* ≡ $\bigcap_{i=1}^{k} t_i$ is a $p_i$ that is, additionally, either a necessary, sufficient, or necessary
and sufficient condition of $\bigcap_{i=1}^{k} t_i$. For example, the four defining characteristics of
one sub-species of the unidimensional, linear factor structure are:
1. the latent variate $\theta$ has a normal distribution;
2. the vector of manifest variates, X = [X1, X2, …, Xp]’, is multivariate
normal conditional on each value of $\theta$;
3. the Xj have a linear regression on $\theta$;
4. conditional on each value of $\theta$, the Xj are uncorrelated.
The specification of characteristics (1) - (4) settles what is meant by (one
sense of) unidimensional, linear factor structure, and allows for the possibility of
deducing manifest criteria of this latent structure. The most important manifest
criterion of (1) - (4) is, of course, that ΣX = ΛΛ' + Ψ, in which Λ is a vector of real
coefficients, and Ψ is a p by p diagonal and positive definite matrix.
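The role of ΣX = ΛΛ' + Ψ as a necessary condition can be illustrated numerically. The following is a minimal sketch in which the loadings and uniquenesses are hypothetical values chosen only for illustration, and in which vanishing tetrad differences (a classical consequence of the one-factor representation that is not discussed in the present paper) are used merely as a convenient check on the off-diagonal elements.

```python
from itertools import permutations


def implied_covariance(loadings, uniquenesses):
    """Return Sigma_X = Lambda Lambda' + Psi as a list of rows."""
    p = len(loadings)
    return [
        [loadings[i] * loadings[j] + (uniquenesses[i] if i == j else 0.0)
         for j in range(p)]
        for i in range(p)
    ]


def max_tetrad_difference(sigma):
    """Largest |s_ij * s_kl - s_il * s_kj| over distinct i, j, k, l."""
    p = len(sigma)
    return max(
        abs(sigma[i][j] * sigma[k][l] - sigma[i][l] * sigma[k][j])
        for i, j, k, l in permutations(range(p), 4)
    )


# Hypothetical parameter values (assumptions made for this illustration).
loadings = [0.8, 0.7, 0.6, 0.5]
uniquenesses = [0.36, 0.51, 0.64, 0.75]

sigma_x = implied_covariance(loadings, uniquenesses)
fits = max_tetrad_difference(sigma_x)      # zero up to rounding error

# Perturb a single covariance; the off-diagonal pattern is no longer rank one.
perturbed = [row[:] for row in sigma_x]
perturbed[0][1] = perturbed[1][0] = sigma_x[0][1] + 0.2
fails = max_tetrad_difference(perturbed)   # clearly nonzero
```

When the check fails, as for the perturbed matrix, the structure (1) - (4) is disqualified; when it passes, nothing beyond the necessary condition has been established.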
In contrast to both the defining characteristics {t1, t2…,tk} of a latent
structure LS* ≡ $\bigcap_{i=1}^{k} t_i$ whose detection is of interest, and the manifest criteria that
are logically related to $\bigcap_{i=1}^{k} t_i$, an assumption, $a_j$, attendant to the employment of a
$p_i$ to make decisions about whether LS* is a latent structure of a particular X, is a
quantitative feature of $f_X$ that is required to hold in order that $p_i$ be a $c_i$ of LS*.
It is standard within mainstream statistics to carefully distinguish between
targets of detection, on the one hand, and assumptions attendant to the
successful employment of tools of detection, on the other. Consider the situation
in which it is desired to detect whether or not the means of two populations, $\mu_1$
and $\mu_2$, are equal. As is well known, it is not true that
$\{\mu_1 = \mu_2\} \Rightarrow \sqrt{\frac{n_1 n_2}{n_1 + n_2}}\,\frac{\bar{x}_1 - \bar{x}_2}{s_p} \sim t_{(n_1 + n_2 - 2)}$.
It is true, however, that
$\{\{f(X \mid 1) \text{ and } f(X \mid 2) \text{ are normal distributions}\} \cap \{\sigma_1 = \sigma_2\} \cap \{\mu_1 = \mu_2\}\} \Rightarrow \sqrt{\frac{n_1 n_2}{n_1 + n_2}}\,\frac{\bar{x}_1 - \bar{x}_2}{s_p} \sim t_{(n_1 + n_2 - 2)}$.
Hence, conditional normality and homogeneity of variances are assumptions
attendant to the use of the two-sample t-test to make valid decisions about
whether, in fact, µ1 and µ 2 are equal (the magnitude of µ1 - µ 2 being the object of
detection). In the context of the detection of latent structures, an assumption is a
nuisance condition required in order for a given manifest property to be a
manifest criterion for LS (just as conditional normality and homogeneity of
variances are nuisance conditions that must be satisfied in order that the test
statistic employed in the two-sample t-test has a t-distribution).
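For concreteness, the pooled two-sample statistic referred to in this example can be computed as in the sketch below. The function name and the toy samples are hypothetical; the code only evaluates the statistic and its degrees of freedom, and whether the statistic actually follows a t distribution depends on the normality and equal-variance assumptions, which the code does not check.

```python
import math
import statistics


def pooled_t(sample1, sample2):
    """Pooled two-sample t statistic and its degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = statistics.fmean(sample1), statistics.fmean(sample2)
    var1, var2 = statistics.variance(sample1), statistics.variance(sample2)
    # Pooled standard deviation s_p, meaningful only if sigma_1 = sigma_2.
    s_p = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    t_stat = math.sqrt(n1 * n2 / (n1 + n2)) * (mean1 - mean2) / s_p
    return t_stat, n1 + n2 - 2


# Toy data, invented for illustration.
t_stat, df = pooled_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
```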
Given that manifest criteria, {c1,c2,...,ck}, have been deduced for a given LS,
LS*, a test of whether LS* is an LS of a particular X is a test of whether one or
more of these ci do, in fact, describe fX. What kind of test is a test of whether a
given ci describes fX depends upon the nature of the logical link of ci to LS*. If
property ci is a necessary condition of LS* ≡ $\bigcap_{i=1}^{k} t_i$, then it is true that
$\bigcap_{i=1}^{k} t_i \Rightarrow c_i$,
and if ∼ci is true, then LS* is not a latent structure of X. The fact that ΣX = ΛΛ' + Ψ
is a necessary condition of (1) - (4) means that if ΣX cannot be represented as ΛΛ'
+ Ψ, then it can be logically concluded that the latent structure whose defining
characteristics are (1) - (4) is not a latent structure of X. Clearly, there could not
have been developed such a test of whether (1) - (4) is a latent structure of a set of
indicators, if (1) - (4) had not been specifiable as the latent structure whose
detection was of interest.
If, on the other hand, ci is a sufficient condition of LS, then it is true that
$c_i \Rightarrow \bigcap_{i=1}^{k} t_i$, and if ci turns out to be a property of fX, one may validly (but
provisionally) conclude that LS is a latent structure of X. Note that
c=[$\Sigma_X = \Lambda_2\Lambda_2' + \Psi$] is not a sufficient condition of any of the two-dimensional linear
factor structures because it is not true that $\sim LS_{2lf} \Rightarrow \sim c$=[$\Sigma_X = \Lambda_2\Lambda_2' + \Psi$]. On the
contrary, if a latent structure of X is the unidimensional, quadratic factor
structure, then it is also the case that $\Sigma_X = \Lambda_2\Lambda_2' + \Psi$ (McDonald, 1967). Hence, the
manifest property $\Sigma_X = \Lambda_2\Lambda_2' + \Psi$ cannot be used to distinguish between the
2-dimensional linear, and 1-dimensional quadratic, factor structures, but only to
rule out both as possible latent structures of X. A manifest property that is both a
necessary and sufficient condition of LS can, of course, be used to both logically
disqualify and (provisionally) confirm LS as the latent structure of X.
Whether a manifest property ci is a manifest criterion of LS is established
by mathematical proof. There are no degrees of criterion-hood: A manifest
property either is or is not a criterion of a given LS. This degree of certainty is, of
course, not achievable when the researcher employs a particular criterion, ci, to
test whether X has some particular LS. Firstly, the researcher must make this
decision, not on the basis of knowledge of the properties of fX, but, rather, on the
basis of inferences about these properties based on a sample of size N drawn
from fX. Secondly, it is never the case that a manifest criterion ci holds exactly for
a given fX. Even in the population, the task is always one of judging whether ci
describes fX "well enough". Hence, the researcher must decide, on the basis of a
sample, whether ci is sufficiently close to being a property of fX. The existence of
approximation and sampling error renders sample based decisions made about
the existence of latent structures via the employment of criteria, inherently
tentative.
However, while the issues of sampling and approximation error are
certainly important, their fruitful consideration is contingent upon the
antecedent determination of whether a given pi is, in fact, a ci of a given latent
structure LS*. Because a property pi is a ci of LS* ≡ $\bigcap_{i=1}^{k} t_i$ only if it is a necessary,
sufficient, or necessary and sufficient condition of $\bigcap_{i=1}^{k} t_i$, to ascertain whether or
not it is so presupposes that the defining characteristics $\bigcap_{i=1}^{k} t_i$ of LS* can be
unambiguously specified.3
The foregoing can be summarized as follows:
i. A latent structure is simply the intersection of properties of either $f_{X \mid \theta}$
or $f_{\theta}$.
3
The truth value of a material implication, for example, cannot be established unless both its antecedent
and consequent can be unambiguously specified.
ii. There is no meaning to a statement of interest in “deciding whether
latent structure LS* is a latent structure of X” unless the full set of defining
characteristics of LS* can be unambiguously specified. If such a specification
cannot be provided, then the target of detection is unclear, and such detection
talk is vacuous.
iii. To build a tool of detection for particular latent structure LS*, whose
defining characteristics are specified as LS*={$\bigcap_i t_i$}, is to mathematically deduce
one or more manifest criteria of LS*. One establishes mathematically that c* is a
manifest criterion by proving that one of the following is true: {$\bigcap_i t_i$}⇒c* (i.e., c*
is a necessary condition of LS*, and, hence, can be employed to disqualify LS* as a
latent structure of a set of indicators); c*⇒{$\bigcap_i t_i$} (c* is a sufficient condition of LS*,
and, hence, can be employed to confirm that LS* is a latent structure of a set of
indicators); {$\bigcap_i t_i$} ⇔ c* (c* is a necessary and sufficient condition of LS*, and,
hence, can be used to both disconfirm LS* and provisionally confirm that LS* is a
latent structure of a set of indicators).
iv. The manifest criteria yielded by a given latent structure need not be
useful in testing for the existence of the latent structure. A latent structure must
be specified well enough, i.e., enough and the right combination of
characteristics specified, so as to yield manifest criteria that are unique to it, and,
hence, useful as input into the construction of a tool of detection.
v. Because to prove that manifest property c* is, in fact, a manifest criterion
of a latent structure LS* is to prove mathematically that c* is either a necessary,
sufficient, or necessary and sufficient condition of LS*, no such proof is possible,
and the status of c* must remain in question, unless the defining characteristics of
LS* can be unambiguously specified.
vi. A fruitful evaluation of whether property c* is a criterion of a given LS
not only requires that the defining characteristics of the LS be unambiguously
specified, but that these characteristics be carefully distinguished from any
assumptions that are necessary to bring about the criterion-hood of c*.
Assumptions and defining characteristics are fundamentally different types of
properties. The latter constitute the target of detection, while the former are
nuisance side-conditions that are required to make a tool of detection perform
adequately.
vii. Latent variable modelling technologies such as linear factor analysis,
linear structural equation modelling, item response theory, etc., are not general,
open-ended data analytic techniques, nor even open-ended tools for the
exploration of the “latent domain.” At root, each such name stands for the
linkage of a particular class of latent structures and associated, mathematically
deduced, manifest criteria. Thus, each “technique” is a tool for the detection of a
particular latent structure. Meehl’s appreciation of this point was the reason that
he rejected existing latent variable technologies, and began his development of
taxometric tools, when faced with the need to detect taxonic latent structures.
Thus, when one tests the core hypothesis of unidimensional factor analysis, i.e.,
that ΣX=ΛΛ'+Ψ, one is not testing a general claim about the existence of a latent
variate, nor conducting open-ended explorations of the latent domain, but,
rather, whether or not the hypothesis that a particular latent structure, the
unidimensional linear factor structure, underlies the indicators, can be
disconfirmed.
2. Meehl's taxometric procedures
Meehl invented his taxometric procedures with the aim of providing the
applied researcher with tools that would yield valid decisions about whether or
not a set of variates have a taxonic latent structure. MAXCOV, MAMBAC, and
MAXSLOPE, three of the more popular of these procedures, are each based on an
assertion that a particular manifest property necessarily obtains when the latent
structure of a set of manifest variates is taxonic. More particularly, each of these
procedures is based on the claim that a particular manifest property is a
necessary condition and, hence, manifest criterion of the TLS. However, it is not
possible to determine the truth value of material implications such as these
unless their antecedents and consequents are unambiguously specified. The
antecedent of each of the material implications of MAXCOV, MAMBAC, and
MAXSLOPE is the intersection of the defining characteristics of the TLS.
However, as will be seen, it is surprisingly difficult to ascertain precisely what
Meehl takes to be these defining characteristics. Part of the difficulty lies in the
fact that assumption and defining characteristic appear to have been conflated in
Meehl’s published work.4
The procedures contained in Meehl's taxometric suite require different
numbers of manifest variates (indicators). Let $X_j$, j=1..r be a set of r continuously
distributed indicators5 and place these indicators in the random vector X. For an
arbitrary member of the set of $\binom{r}{p}$ partitions of the r indicators into p “output”,
and q=r-p “input”, indicators, let vector X1 contain the output, and X2 the input,
indicators. The random variate t=1’X2 is the unweighted sum of the indicators
contained in X2, and, hence, is itself an input indicator. While in the treatments
of Meehl and his co-authors, X2 contains but a single indicator, it will turn out to
be worthwhile to consider the case in which q>1. Thus, X2 will contain q≥1
indicators, with Meehl’s standard treatment being the case in which t is the single
variate contained in X2. Finally, let $\theta$ be a random, latent variate, let X and $\theta$
have a joint distribution in some population under study, and let $f_{X \mid \theta = s}$,
s={T’,T}, be the densities of X conditional on each of T’ and T.
4
While Meehl has offered many expository accounts of what he means by the concept taxon, and
has argued that this concept is an open concept whose contour lines are still being worked out,
there is no place for such equivocation with respect to the defining characteristics of the TLS, at least
if mathematics is to be employed.
5
There has also existed in the literature controversy over whether it is appropriate to employ
dichotomous indicators as input into taxometric procedures (Maraun, Slaney, and Goddyn, 2003;
Meehl, 1995; Meehl & Yonce, 1996, p 1114). Meehl has indicated his preference for continuous
indicators, at least until this controversy is resolved (Meehl and Yonce, 1996, p.1114), and, hence,
unless otherwise noted, the Xj will be taken throughout the current work to be continuously
distributed random variates.
MAXCOV
In MAXCOV, p=2 and q≥1, so that there are $\binom{2+q}{2}$ partitions of the set
of indicators. Let Xi and Xj symbolize the indicators contained in X1. The focal
manifest property of MAXCOV is the conditional covariance function, C(Xi,Xj|t=x).
For MAXCOV to be a detector of TLSs, it must be the case that the following material
implication is true: “if X has a TLS, then C(Xi,Xj|t=x) is a single-peaked function of t.”
To determine whether this implication is true, its antecedent must be specifiable.
The specification of the antecedent in turn requires the specification of the
defining characteristics of the TLS. What then are the defining characteristics of
the TLS? Meehl and Yonce (1996, p. 1097) claim that the "conjectured structure
is...highly general, that of two overlapping unimodal frequency distributions." It
is clear from this that one characteristic of the TLS is:
1) $B_\theta$: The latent variate $\theta$ is Bernoulli distributed with $0 < P(\theta = T) = \Pi_T < 1$
and $P(\theta = T') = (1 - \Pi_T)$, in which T stands for the taxon, and T' the complement,
class.
This conclusion is further supported by the prominent place afforded to what
Meehl calls the "covariance mixture theorem", a decomposition of the covariance
matrix of Xi and Xj that is implied by every latent structure whose set of defining
characteristics includes $B_\theta$. However, the version of this theorem featured in
Meehl's papers is not a description of the conditioning situation relevant to
MAXCOV, the correct expression being (for a proof, see the appendix of Maraun
and Slaney, 2006):
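2) $C([X_i, X_j] \mid t = x) = \Pi_{Tx}\Sigma_{Tx} + (1 - \Pi_{Tx})\Sigma_{T'x} + \Pi_{Tx}(1 - \Pi_{Tx})(\mu_{Tx} - \mu_{T'x})(\mu_{Tx} - \mu_{T'x})'$,
in which $\Pi_{Tx} = P(\theta = T \mid t = x)$; $\Sigma_{Tx} = C([X_i, X_j] \mid t = x \cap \theta = T)$ and
$\Sigma_{T'x} = C([X_i, X_j] \mid t = x \cap \theta = T')$, each a 2 by 2 conditional covariance matrix of Xi and Xj;
$\mu_{Tx} = E(X_1 \mid t = x \cap \theta = T)$ and $\mu_{T'x} = E(X_1 \mid t = x \cap \theta = T')$.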
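The algebra behind a decomposition of this form can be checked numerically at a single fixed value x. The sketch below uses hypothetical class means, class covariance matrices, and mixing probability (all invented for illustration) and compares the covariance obtained directly from the mixture's first and second moments with the three-term mixture expression. It says nothing about how the quantities vary with x, which is the issue taken up in the text.

```python
def outer(u, v):
    """Outer product of two vectors, as a list of rows."""
    return [[a * b for b in v] for a in u]


def combine(*weighted):
    """Weighted sum of 2x2 matrices given as (weight, matrix) pairs."""
    return [[sum(w * m[r][c] for w, m in weighted) for c in range(2)]
            for r in range(2)]


def covariance_from_moments(pi_t, mean_t, cov_t, mean_c, cov_c):
    """Cov = E[XX'] - E[X]E[X]' for the two-class mixture."""
    second_t = combine((1.0, cov_t), (1.0, outer(mean_t, mean_t)))
    second_c = combine((1.0, cov_c), (1.0, outer(mean_c, mean_c)))
    second = combine((pi_t, second_t), (1.0 - pi_t, second_c))
    mean = [pi_t * a + (1.0 - pi_t) * b for a, b in zip(mean_t, mean_c)]
    return combine((1.0, second), (-1.0, outer(mean, mean)))


def covariance_from_mixture_formula(pi_t, mean_t, cov_t, mean_c, cov_c):
    """Weighted within-class covariances plus the between-class term."""
    diff = [a - b for a, b in zip(mean_t, mean_c)]
    return combine((pi_t, cov_t), (1.0 - pi_t, cov_c),
                   (pi_t * (1.0 - pi_t), outer(diff, diff)))


# Hypothetical values for the taxon (T) and complement (T') classes.
args = (0.3, [2.0, 1.5], [[1.0, 0.2], [0.2, 1.0]],
        [0.0, 0.0], [[1.5, -0.1], [-0.1, 0.8]])
direct = covariance_from_moments(*args)
formula = covariance_from_mixture_formula(*args)
```

The nonzero off-diagonal entries in the two class covariance matrices were chosen deliberately: the decomposition holds with or without within-class ("nuisance") covariance.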
The conditional covariance function C(Xi,Xj|t=x) is the (1,2) element of the
left member of (2). The deductions that Meehl makes about the functional
dependence of C(Xi,Xj|t=x) on x follow largely from properties he imputes to (2)
as following from a combination of defining characteristics of TLS and
assumptions attendant to attempts to test for TLS. The problem is that the
boundary between these two sets is, at times, unclear in Meehl's writing. Meehl
and Yonce (1996, p.1096), for example, state that "The core idea motivating the
procedure is that, if two observable variables ("indicators") tend to discriminate,
i.e., are valid for, a latent category ("taxon") and they do not covary otherwise (no
"nuisance covariance" within the latent classes), then any observed correlation is
due solely to category mixture." That the elements of X must discriminate T from
T' suggests that a second defining characteristic of the TLS is:
3) val: $E(X \mid \theta = T) - E(X \mid \theta = T') > 0$, when the indicators are appropriately
reflected.
However, in this quote, the following property is also mentioned:
4) lcu: $C(X \mid \theta = s)$ is a diagonal matrix for s={T’,T}.
It might then appear that this too should be taken as a defining characteristic of
TLS. However, one also reads that "If the latent structure is taxonic (and there is
sufficiently little or no nuisance covariance)..." (1996, p.1096). Because Meehl and
Yonce here entertain the possibility of taxonicity without lcu, it sounds as if lcu
should be taken to be merely an assumption, or side-condition, required for a
valid test of taxonicity. Thus, already, there has arisen uncertainty about the
antecedent of the material implication whose truth status is in question.
But the situation is even more uncertain than this, for it is clear from (2)
that it is not mere lcu that is relevant to MAXCOV, but, instead, the
uncorrelatedness of Xi and Xj conditional on $t = x \cap \theta = s$, s={T’,T}:
5) ldcu: $C(X \mid t = x \cap \theta = s)$ is a diagonal matrix for s={T’,T} and all values x.
It is not clear from their work how Meehl and Yonce see ldcu as arising. Is it a
defining characteristic of TLS? If so, then why do they discuss lcu? Perhaps
they envision ldcu as entailed by lcu. If so, they envision wrongly (Maraun and
Slaney, 2006).
Now, for a LS whose set of defining characteristics includes {$B_\theta$∩ldcu},
6) $C(X_i, X_j \mid t = x) = \Pi_{Tx}(1 - \Pi_{Tx})\,\Delta_{X_i \mid x}\,\Delta_{X_j \mid x}$,
in which $\Delta_{X_k \mid x} = (\mu_{X_k Tx} - \mu_{X_k T'x})$. An identity akin to (6) arises frequently in Meehl's
writings on MAXCOV, this perhaps suggesting that {$B_\theta$∩ldcu} should, indeed,
be contained within the set of defining characteristics of TLS. It must be noted,
however, that one cannot take as a defining characteristic of an LS a property
such as ldcu, this property involving the conditioning of manifest indicators Xi
and Xj on not only $\theta$, but on manifest indicator t. A conditional result of this sort
must arise as a mathematically deduced consequence of the defining
characteristics of a latent structure. The trouble is that in the absence of an
unambiguous characterization of TLS, it is not even possible to attempt a proof of
such a deduction. Moreover, Meehl claims that his characterization of the TLS
implies that C(Xi,Xj|t=x) is a function of only $\Pi_{Tx}$. However, neither
$B_\theta$∩val∩ldcu, nor $B_\theta$∩val∩lcu make this be the case by implying that
$\Delta_{X_i \mid x}\,\Delta_{X_j \mid x}$ is constant with respect to x.
As noted, Meehl and Yonce (1996, p.1096) claim that "... if two observable
variables ("indicators") tend to discriminate... and they do not covary otherwise...
then any observed correlation is due solely to category mixture." It is clear from
(6) that this is not true, as the final term $\Delta_{X_i \mid x}\,\Delta_{X_j \mid x}$, an unknown function of x, enters
the expression for C(Xi,Xj|t=x). This is perhaps why Meehl, elsewhere in his
work, mentions the requirement that
7) cm: $(\mu_{X_i Tx} - \mu_{X_i T'x})$ is a constant function of x.
But whether Meehl believes that cm is an assumption, a defining characteristic of
TLS, or, perhaps, a deduced consequence of TLS, is, similarly, unclear. Shortly,
an important candidate characterization of TLS, for which cm is a necessary
condition, will be considered.
Meehl and Yonce (1996) claim that "...The covariance mixture theorem is
general because it holds for situations when there is nuisance covariance..." This
is correct: decompositions of the form $\Sigma_X = C(E(X \mid Y)) + E(C(X \mid Y))$ exist regardless
of the form of the conditional covariance matrices C(X|Y = y). The problem is
that, contrary to the tone of the quote, this fact has little relevance to MAXCOV’s
performance. Any LS whose set of defining characteristics includes $B_\theta$ has (2) as
a necessary condition and this underlines the fact that (2) is of little use as a
manifest criterion. For any LS whose defining characteristics include
{$B_\theta$∩val∩ldcu∩cm}, C(Xi,Xj|t=x) is a function of x only through $\Pi_{Tx}$. However,
{$B_\theta$∩val∩ldcu∩cm} does not determine the behaviour of $\Pi_{Tx}$, because, while the
behaviour is determined by $f_{t \mid \theta = T}(x)$ and $f_t(x)$, these densities are not
determined by {$B_\theta$∩val∩ldcu∩cm}.
This is problematic, because the basis for Meehl's belief that C(Xi,Xj|t=x) is
single-peaked under TLS appears to follow from (6), his belief that cm holds, and
his belief that $\Pi_{Tx}$ is both a nondecreasing function of x, and crosses .5 (see
Maraun and Slaney, 2006, for a detailed analysis). The first requirement, that
$\Pi_{Tx}$ be nondecreasing in x, is the requirement that the density of $\theta$ and t be
monotone likelihood ratio dependent (hereafter, mlrd) (Maraun & Slaney, 2005).
However, mlrd is not even implied by the "stronger" (in the sense of implied
manifest properties) unidimensional monotone latent variable (umlv) structures
(Holland & Rosenbaum, 1986; Maraun & Slaney, 2006) whose defining
characteristics are:
8)
i. a single latent variate (i.e., unidimensionality);
ii. latent conditional independence (lci): $f_{X \mid \theta = s} = \prod_{j=1}^{r} f_{X_j \mid \theta = s}$, for all values s;
iii. latent monotonicity (lm): $P(X_j > x \mid \theta = T) > P(X_j > x \mid \theta = T')$ for j = 1, .. ,r, and
all values x.
Certainly, Meehl and colleagues do not provide a TLS whose defining
characteristics yield, as a necessary condition, mlrd of the distribution of $\theta$
and t.
Interestingly, the implications {$B_\theta$∩lci}⇒ldcu and {$B_\theta$∩lci}⇒cm are both
true (Maraun & Slaney, 2005). Thus, given that the defining characteristics of the
TLS include at least {$B_\theta$∩lci}, TLS implies ldcu and cm. A necessary condition
for latent structures whose defining characteristics include {$B_\theta$∩val∩lci} is
9) $C(X_i, X_j \mid t = x) = \Pi_{Tx}(1 - \Pi_{Tx})\,\Delta_{X_i}\,\Delta_{X_j} > 0$,
in which $\Delta_{X_k} = E(X_k \mid \theta = T) - E(X_k \mid \theta = T')$. However, for neither {$B_\theta$∩val∩lci},
nor the stronger umlv structure {$B_\theta$∩lm∩lci}, is the nature of the functional
dependency of $\Pi_{Tx}$ on x deducible (Holland & Rosenbaum, 1986; Hemker,
Sijtsma, Molenaar, & Junker, 1997; Maraun & Slaney, 2005). Thus, for these
structures, neither is the behaviour of C(Xi,Xj|t=x) deducible. Thus,
[single-peakedness of C(Xi,Xj|t=x)] could not possibly be a necessary condition of the
much "weaker" structure {$B_\theta$∩val∩lcu}.
What, then, could be the grounds for the belief of Meehl and Yonce that
TLS implies that [{$\Pi_{Tx}$ is nondecreasing in x}∩{$\Pi_{Tx}$ crosses .5}]? The answer may
well be found in a consideration of the supporting Monte Carlo study conducted
by Meehl and Yonce (1996). It is claimed by Meehl and Yonce that "Although
our Monte Carlo data were generated by a Gaussian algorithm assigning equal
variances $SD_t^2$, $SD_c^2$ to taxon and complement classes, none of the core
derivations underlying MAXCOV are thus restrictive. The conjectured
structure...is highly general, that of two overlapping unimodal frequency
distributions. The mathematics speaks for itself, and it was developed by Meehl
with psychopathology in mind, where skewness and heterogeneity of variance
are common" (Meehl and Yonce, 1996, p.1097). This is misleading. While
decomposition (2) does, indeed, follow directly from $B_\theta$, neither the behaviour of
$\Pi_{Tx}$, nor that of C(Xi,Xj|t=x), follows from (2), and the behaviour of these quantities is
the essential issue.
In the absence of an unambiguous characterization of TLS, Meehl and
Yonce cannot, in fact, know what role is played by the features that they assign
to their Monte Carlo data. In fact, Maraun and Slaney (2005) prove that the
latent structure whose defining characteristics are {$B_\theta$∩val∩lci} implies both
that $\Pi_{Tx}$ is nondecreasing and crosses .5, so long as: i) $f_{t \mid \theta = s}(x)$, s={T',T}, are each
normal densities (a condition called, herein, cn); ii) $\sigma^2_{t \mid T} = \sigma^2_{t \mid T'}$ (a condition
called, herein, hv). However, cn and hv are precisely the two features of the Monte
Carlo data simulations downplayed as unimportant by Meehl and Yonce. Thus, what
the Monte Carlo results of Meehl and Yonce (1996) actually illustrate is that
MAXCOV can be used to make valid decisions about whether
{$B_\theta$∩val∩lci∩cn∩hv} is a latent structure of a set of indicators. More will be
said on this point later in the paper.
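The dependence of the posterior probability P(θ=T|t=x) on cn and hv can be seen in a small numerical sketch. All parameter values below are hypothetical and chosen for illustration; the code evaluates the posterior by Bayes' rule for two normal conditional densities of t, once with equal and once with unequal conditional variances. It is an illustration of the role of these conditions, not a reproduction of the proof in Maraun and Slaney (2005).

```python
import math


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_taxon(x, base_rate, mean_t, sd_t, mean_c, sd_c):
    """P(theta = T | t = x) by Bayes' rule, normal conditional densities."""
    num = base_rate * normal_pdf(x, mean_t, sd_t)
    return num / (num + (1.0 - base_rate) * normal_pdf(x, mean_c, sd_c))


def is_nondecreasing(values, tol=1e-12):
    return all(b >= a - tol for a, b in zip(values, values[1:]))


grid = [i / 10 for i in range(-40, 81)]          # x from -4.0 to 8.0

# cn and hv both hold: equal conditional variances (hypothetical values).
equal = [posterior_taxon(x, 0.5, 2.0, 1.0, 0.0, 1.0) for x in grid]
# cn holds but hv fails: the taxon class has the smaller variance.
unequal = [posterior_taxon(x, 0.5, 2.0, 0.5, 0.0, 1.0) for x in grid]

# Under (9), C(Xi,Xj|t=x) is proportional to posterior * (1 - posterior).
mixture_term = [p * (1.0 - p) for p in equal]
peak_x = grid[mixture_term.index(max(mixture_term))]
```

In the equal-variance configuration the posterior rises monotonically through .5 and the mixture term has a single peak where the posterior equals .5. In the unequal-variance configuration the log likelihood ratio is quadratic in x, so the posterior rises and then falls, and the monotonicity on which the single-peakedness argument rests is lost.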
MAMBAC
In MAMBAC, p=1 and q≥1, so that there are $\binom{1+q}{1} = (q + 1)$ partitions of
the set of indicators. Let Xi symbolize the single output indicator in an arbitrary
partition. The focal manifest property of MAMBAC is d(x)=E(Xi| t≥x)- E(Xi|t≤x).
For MAMBAC to be a detector of TLSs, it must be the case that the following
material implication is true: “if X has a TLS, then d(x) is a single-peaked function
of x.”6 Once again, to determine the truth value of this material implication, its
antecedent must be specifiable, and this is equivalent to the requirement that the
defining characteristics of TLS be specifiable.
According to Meehl and Yonce, the inventors of MAMBAC, what then are
the defining characteristics of the latent structure that MAMBAC was designed
to detect? They (1994, pp.1065-1066) list properties $B_\theta$, val, and lcu ("...the
variables are uncorrelated within categories (no xy nuisance covariance", 1994,
p.1065), and two further restrictions on the conditional indicator densities:
10)
i. cu: The conditional densities, $f_{X_j \mid \theta = s}$, j=1..r, s={T’,T}, are unimodal;
ii. cl: $\lim_{x \to \infty} f_{X_j = x \mid \theta = s} = 0$, j=1..r, s={T’,T}, and $f_{X_j = k \mid \theta = T} > f_{X_j = k \mid \theta = T'}$, j=1..r, for k
suitably large.
6
Meehl and Yonce (1996) also claim that d(x) will necessarily be convex if the latent structure of
X features a continuously distributed $\theta$.
Property cl arises in the context of the following quote: "That hb → Q as x → ∞
holds for any pair of smooth distributions such that the taxon density function
ft(x) exceeds the complement density function fc(x) for all values x>K (large
enough) and both densities are asymptotic to the baseline (Fisher's "high
contact") as x → ∞ " (1994, p.1066). This is the claim that under cl,
$\lim_{x \to \infty} P(T' \mid X_i \le x) = (1 - \Pi_T)$. Because
11) $P(T' \mid X_i \le x) = \frac{P(X_i \le x \mid T')(1 - \Pi_T)}{P(X_i \le x)} = \left[\frac{P(X_i \le x \mid T')(1 - \Pi_T) + P(X_i \le x \mid T)\Pi_T}{P(X_i \le x \mid T')(1 - \Pi_T)}\right]^{-1} = \left[1 + \frac{\Pi_T P(X_i \le x \mid T)}{(1 - \Pi_T)P(X_i \le x \mid T')}\right]^{-1}$,
$\lim_{x \to \infty} P(T' \mid X_i \le x) = (1 - \Pi_T)$ if and only if
12) $\lim_{x \to \infty} \frac{P(X_i \le x \mid T)}{P(X_i \le x \mid T')} = 1$.
It is certainly not self-evident that cl implies this limit.
Meehl and Yonce (1994) claim that the set of properties
{$B_\theta$∩val∩lcu∩cu∩cl} jointly imply that d(x) is equal to
13) $\Delta_{X_i}\,(P(T \mid t \ge x) + P(T' \mid t \le x) - 1)$.
This then raises questions about the status of the set {$B_\theta$∩val∩lcu∩cu∩cl}. For
while Meehl and Yonce call the equality $d(x) = \Delta_{X_i}\,(P(T \mid t \ge x) + P(T' \mid t \le x) - 1)$ an
"algebraic identity" (1994, p.1066), it is not even a necessary condition of
{$B_\theta$,val,lcu,cu,cl}. Under $B_\theta$ and lci, the left member of E(Xi|t≥x)- E(Xi|t≤x) is
equal to
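14) $E(X_i \mid t \ge x) = \int_{-\infty}^{\infty} X_i\, f_{X_i \mid t \ge x}\, dX_i = \frac{\int_{x}^{\infty}\int_{-\infty}^{\infty} X_i\, f_{X_i,t}\, dX_i\, dt}{P(t \ge x)}$
$= \frac{\int_{x}^{\infty}\int_{-\infty}^{\infty} [X_i (1 - \Pi_T) f_{X_i,t \mid \theta = T'} + X_i \Pi_T f_{X_i,t \mid \theta = T}]\, dX_i\, dt}{\int_{x}^{\infty}\int_{-\infty}^{\infty} [(1 - \Pi_T) f_{X_i,t \mid \theta = T'} + \Pi_T f_{X_i,t \mid \theta = T}]\, dX_i\, dt}$ (under $B_\theta$)
$= \frac{\int_{x}^{\infty}\int_{-\infty}^{\infty} [X_i (1 - \Pi_T) f_{X_i \mid \theta = T'} f_{t \mid \theta = T'} + X_i \Pi_T f_{X_i \mid \theta = T} f_{t \mid \theta = T}]\, dX_i\, dt}{\int_{x}^{\infty}\int_{-\infty}^{\infty} [(1 - \Pi_T) f_{X_i \mid \theta = T'} f_{t \mid \theta = T'} + \Pi_T f_{X_i \mid \theta = T} f_{t \mid \theta = T}]\, dX_i\, dt}$ (under lci)
$= \frac{(1 - \Pi_T) P(t \ge x \mid \theta = T') E(X_i \mid \theta = T') + \Pi_T P(t \ge x \mid \theta = T) E(X_i \mid \theta = T)}{(1 - \Pi_T) P(t \ge x \mid \theta = T') + \Pi_T P(t \ge x \mid \theta = T)}$.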
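Similarly, under {$B_\theta$∩lci}, the right member is equal to
15) $E(X_i \mid t \le x) = \frac{(1 - \Pi_T) P(t \le x \mid \theta = T') E(X_i \mid \theta = T') + \Pi_T P(t \le x \mid \theta = T) E(X_i \mid \theta = T)}{(1 - \Pi_T) P(t \le x \mid \theta = T') + \Pi_T P(t \le x \mid \theta = T)}$.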
23
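Substituting the identity $P(t \ge x \mid T') = \frac{P(T' \mid t \ge x) P(t \ge x)}{(1 - \Pi_T)}$ into E(Xi|t≥x) and
simplifying, produces
\[
E(X_i \mid t \ge x) = E(X_i \mid \theta = T') + P(T \mid t \ge x)\, \Delta_{X_i},
\tag{16}
\]
in which ΔXi = E(Xi|θ=T) − E(Xi|θ=T'). Similarly, substituting $P(t \le x \mid T') = \frac{P(T' \mid t \le x) P(t \le x)}{(1 - \Pi_T)}$ into E(Xi|t≤x) produces
\[
E(X_i \mid t \le x) = E(X_i \mid \theta = T') + P(T \mid t \le x)\, \Delta_{X_i}.
\tag{17}
\]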
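From (16) and (17) follows the central identity of MAMBAC:
\[
\begin{aligned}
d(x) &= E(X_i \mid t \ge x) - E(X_i \mid t \le x) \\
&= \left[ E(X_i \mid \theta = T') + P(T \mid t \ge x) \Delta_{X_i} \right] - \left[ E(X_i \mid \theta = T') + P(T \mid t \le x) \Delta_{X_i} \right] \\
&= \Delta_{X_i} \left( P(T \mid t \ge x) - P(T \mid t \le x) \right)
 = \Delta_{X_i} \left( P(T' \mid t \le x) - P(T' \mid t \ge x) \right) \\
&= \Delta_{X_i} \left( P(T \mid t \ge x) + P(T' \mid t \le x) - 1 \right).
\end{aligned}
\tag{18}
\]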
Thus, the identity d(x) = ΔXi(P(T|t≥x) + P(T'|t≤x) − 1) is a necessary condition of
latent structures whose defining characteristics include B and lci. If, as claimed
by Meehl and Yonce (1994), this identity is a necessary condition of TLS, then the
defining characteristics of the TLS cannot be {B∩val∩lcu∩cu∩cl}. The
psychometrician who would like to deduce manifest criteria for the TLS cannot
do so for lack of an unambiguous characterization of this latent structure.
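The content of identity (18) can be illustrated by simulation. The following is a minimal sketch, not part of the original derivation: it assumes a Bernoulli θ and local independence of Xi and t, and every numerical value (base rate, class means, the deliberately non-normal within-class noise for Xi, the unequal within-class variances for t) is a hypothetical choice made only for illustration.

```python
import random


def mambac_identity_check(x, n=100_000, pi_t=0.3, seed=0):
    """Return (d(x), Delta * (P(T|t>=x) + P(T'|t<=x) - 1)) from simulated data."""
    rng = random.Random(seed)
    # Hypothetical class-conditional parameters; True = taxon T, False = complement T'.
    mean_x = {True: 2.0, False: 0.0}   # E(Xi | theta)
    mean_t = {True: 1.5, False: 0.0}   # E(t | theta)
    sd_t = {True: 2.0, False: 1.0}     # unequal within-class s.d. of t
    upper, lower = [], []              # cases with t >= x and with t < x
    for _ in range(n):
        taxon = rng.random() < pi_t    # Bernoulli theta (property B)
        # lci: Xi and t are drawn independently within each class.
        xi = mean_x[taxon] + rng.expovariate(1.0) - 1.0   # skewed, mean-zero noise
        t = rng.gauss(mean_t[taxon], sd_t[taxon])
        (upper if t >= x else lower).append((xi, taxon))

    def mean_xi(rows):
        return sum(r[0] for r in rows) / len(rows)

    def p_taxon(rows):
        return sum(r[1] for r in rows) / len(rows)

    d_x = mean_xi(upper) - mean_xi(lower)          # left member of (18)
    delta = mean_x[True] - mean_x[False]           # population Delta_Xi
    rhs = delta * (p_taxon(upper) + (1.0 - p_taxon(lower)) - 1.0)
    return d_x, rhs


if __name__ == "__main__":
    print(mambac_identity_check(1.0))
```

Under these assumptions the two returned values should agree up to sampling error, even though Xi is not conditionally normal and the within-class variances of t differ. The sketch says nothing about the shape of d(x) as a function of x.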
As in Meehl's treatment of MAXCOV, the reader is told that neither
conditional normality, nor conditional homogeneity of variances, data generation
features of the Monte Carlo support they offer, are required for the proper
functioning of the MAMBAC-based decision mechanism: "The analytical
theorems on which MAMBAC is based do not postulate normality or equality of
variance" (Meehl & Yonce, 1994, p.1066). As with MAXCOV, to evaluate the
authenticity of this claim, two issues must be distinguished. The first issue is
whether either one, or both, of conditional normality and conditional
homogeneity of variances is required to be either a defining characteristic of TLS,
or, alternatively, an attendant assumption, in order that (18) be a necessary
condition of TLS. As was already established, (18) is not a necessary condition
of a latent structure whose defining characteristics are {B∩val∩lcu∩cu∩cl}. It
is interesting to note, then, that given the joint normality of Xi and t, conditional
on θ, lcu and lci just happen to be equivalent conditions, and (18) is then a
necessary condition of {B∩val∩lcu∩cu∩cl}. Might, then, an unacknowledged
defining characteristic of TLS be conditional normality? Once again, Meehl and
Yonce are emphatic in rejecting this option, and so it remains a mystery as to
how they envision the required factorability of $f_{X_i,t \mid \theta = s}$, s={T',T}, as coming about.
The second issue is whether or not the material implication TLS⇒[d(x) is
single-peaked] is true. Even if it could be shown that (18) is entailed by TLS
(and, recall, (18) is certainly not entailed by a TLS whose defining characteristics
are {B∩val∩lcu∩cu∩cl}), it would still have to be proven, as a second step, that
the implication TLS⇒[ΔXi(P(T|t≥x) + P(T'|t≤x) − 1) is single-peaked] is true.
However, the truth value of this implication could only be evaluated given a
clear specification of the defining characteristics of TLS, and such a specification
is precisely what is missing from Meehl's work on MAMBAC.
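The focal manifest property ΔXi(P(T|t≥x) + P(T'|t≤x) − 1), a necessary
condition of all latent structures whose defining characteristics include {B∩lci},
is a function of x because P(T|t≥x) + P(T'|t≤x) is a function of x. Now,
\[
\begin{aligned}
P(T \mid t \ge x) - P(T \mid t \le x)
&= \frac{P(t \ge x \mid T)\Pi_T}{P(t \ge x)} - \frac{P(t \le x \mid T)\Pi_T}{P(t \le x)} \\
&= \frac{\Pi_T \left( P(t \ge x \mid T) - P(t \ge x) \right)}{P(t \ge x)\left( 1 - P(t \ge x) \right)}
 = \frac{\Pi_T (1 - \Pi_T) \left[ P(t \ge x \mid T) - P(t \ge x \mid T') \right]}{P(t \ge x)\left[ 1 - P(t \ge x) \right]},
\end{aligned}
\tag{19}
\]
from which it follows that
\[
E(X_i \mid t \ge x) - E(X_i \mid t \le x)
= \Delta_{X_i} \frac{\Pi_T \left[ P(t \ge x \mid T) - P(t \ge x) \right]}{P(t \ge x)\left[ 1 - P(t \ge x) \right]}.
\tag{20}
\]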
The monotone property of any distribution function (e.g., Fraser, 1976, p.43)
ensures the non-increasingness of P(t≥x) in x, and, hence, the single-peakedness
of the denominator. However, little can be said about the numerator of (20). Let
ΠT' = P(θ=T') so that P(t≥x) = (1 − ΠT')P(t≥x|θ=T) + ΠT'P(t≥x|θ=T'). Then
P(t≥x) − P(t≥x|θ=T) = −ΠT'(P(t≥x|θ=T) − P(t≥x|θ=T')) = −ΠT'f, in which
f = P(t≥x|θ=T) − P(t≥x|θ=T'). Thus, if TLS has, at least, the characteristics {B∩val∩lci}, then,
when the variates are appropriately reflected, P(t>x|θ=T) − P(t>x|θ=T') > 0,
making P(t≥x) − P(t≥x|θ=T) negative, and thus implying that P(t≥x|θ=T) − P(t≥x)
> 0. Nothing more detailed about the behaviour of P(t≥x|θ=T) − P(t≥x) is
implied by {B∩val∩lci}, and this is hardly surprising given that the behaviour
of this quantity is determined by $f_{t,\theta}$, a joint density that is determined neither by
{B∩val∩lci}, nor {B∩val∩lci} in conjunction with any other properties Meehl
and Yonce (1994) consider.
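The algebra linking the Bayes form of P(T|t≥x) − P(T|t≤x) to its expression in terms of tail probabilities can be checked numerically. The sketch below is an illustration only: it assumes normal within-class distributions for t purely so that the tails have a closed form, and the parameter values are hypothetical. Note that, written in terms of the two class-conditional tails, the numerator carries the factor ΠT(1 − ΠT).

```python
import math


def upper_tail(x, mu, sd):
    """P(t >= x) for a normal variate with mean mu and standard deviation sd."""
    return 0.5 * math.erfc((x - mu) / (sd * math.sqrt(2.0)))


def three_forms(x, pi_t=0.3, mu_t=1.5, sd_t=2.0, mu_c=0.0, sd_c=1.0):
    """Three algebraically equivalent expressions for P(T|t>=x) - P(T|t<=x)."""
    a = upper_tail(x, mu_t, sd_t)        # P(t >= x | T)
    a_c = upper_tail(x, mu_c, sd_c)      # P(t >= x | T')
    p = pi_t * a + (1 - pi_t) * a_c      # P(t >= x), the mixture tail
    bayes = pi_t * a / p - pi_t * (1 - a) / (1 - p)
    via_marginal = pi_t * (a - p) / (p * (1 - p))
    via_classes = pi_t * (1 - pi_t) * (a - a_c) / (p * (1 - p))
    return bayes, via_marginal, via_classes


if __name__ == "__main__":
    for x in (-1.0, 0.5, 2.0):
        print(x, three_forms(x))
```

Agreement of the three values at every x confirms the identity, but it carries no information about how the common value varies with x, which is the point at issue in the text.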
MAXSLOPE
In MAXSLOPE, p=1 and q≥1, so that there are $\binom{1+q}{1} = (q + 1)$ partitions
of the set of indicators. Let Xi symbolize the single output indicator in an
arbitrary partition. The focal manifest property of MAXSLOPE is
\[
\frac{d}{dx} E(X_i \mid t = x).
\]
MAXSLOPE can rightly be said to be a detector of TLSs if the following material
implication is true: “if X has a TLS, then $\frac{d}{dx} E(X_i \mid t = x)$ is a non-linear function of
x.” According to Grove and Meehl (1993), the ingredients of MAXSLOPE are the
following:
21)
i. cu (1993, p.709).
ii. The manifest variates can be non-normally distributed (1993, p.709).
iii. B (1993, p.709).
iv. val (1993, p.710).
v. ~lcu: The Xi are correlated within each latent population, i.e., (1993,
p.710).
In contrasting MAXSLOPE and MAXCOV, Grove and Meehl (1993, p.712)
make the observation that, with respect to MAXCOV, "Meehl (1973) assumed as a
testable conjecture and to simplify the mathematics, that the covariance of X and
Y was zero both in T and T'." This account is questionable. First, lcu is not a
testable condition, for only LS that possess particular defining characteristics in
addition to lcu yield testable implications for fX (Holland & Rosenbaum, 1986).
Let given latent structure LS* have defining characteristics {lcu∩t2.. ∩tr} and let
the implication {lcu∩t2.. ∩tr} ⇒ c be true. Then a test of whether or not fX satisfies
c is not a test of lcu per se, but rather the set {lcu∩t2.. ∩tr} that defines LS*. The
characteristic lcu cannot be tested independently of the other defining
characteristics of LS*. Second, Meehl suggests in his writings that one
interpretation of a taxon is that it has causal powers with respect to its indicators,
this suggesting that lcu (or lci), the standard and historically important
paraphrase of this notion, should be taken as a defining characteristic of TLS, not
merely as a simplifying convenience (as was seen previously, the central
identities of both MAXCOV and MAMBAC follow, in fact, from {B ∩lci}).
Regardless, the important point to note is that, according to Grove and
Meehl, a defining characteristic of the latent structure that MAXSLOPE was
designed to detect is ~lcu, the correlatedness of Xi and t conditional on θ. Given
that one of lcu, or lci, or ldcu is a defining characteristic of the TLS that
MAXCOV was designed to detect, and that lci is a defining characteristic of the
TLS that MAMBAC was designed to detect, it appears to be the case that the
latent structure that is the target of MAXSLOPE is different than the latent
structures that are the targets of MAXCOV and MAMBAC. It is hard to know
what to make of this, given that, throughout Meehl's work, these procedures are
treated as if they were designed to be detectors of the same latent structure.
Grove and Meehl (1993, p.710) discuss the covariance mixture theorem,
but neither provide an expression for E(Xi|t=x), nor $\frac{d}{dx} E(X_i \mid t = x)$, that is
deduced on the basis of a defining characterization of TLS. Consider the
following if-then linkages between particular latent structures and
$\frac{d}{dx} E(X_i \mid t = x)$. For latent structures whose sole defining characteristic is B,
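\[
\begin{aligned}
E(X_i \mid t = x)
&= E(X_i \mid t = x \mid \theta = T')(1 - \Pi_{Tx}) + E(X_i \mid t = x \mid \theta = T)\Pi_{Tx} \\
&= E(X_i \mid t = x \mid \theta = T') + \Delta_{X_i \mid x} \Pi_{Tx},
\end{aligned}
\tag{22}
\]
in which $\Delta_{X_i \mid x} = E(X_i \mid t = x \mid \theta = T) - E(X_i \mid t = x \mid \theta = T')$, and $E(X_i \mid t = x \mid \theta = T')$ and
$E(X_i \mid t = x \mid \theta = T)$ are the "within class" regression functions. It follows from (22)
that for such a latent structure,
\[
\frac{d}{dx} E(X_i \mid t = x)
= E(X_i \mid t = x \mid \theta = T')' + \left[ \Delta_{X_i \mid x} \Pi_{Tx}' + \Pi_{Tx} \Delta_{X_i \mid x}' \right],
\tag{23}
\]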
in which primes denote derivatives evaluated at x. Clearly, the behaviour of
$\frac{d}{dx} E(X_i \mid t = x)$ is determined by both the regression functions and ΠTx, and these,
in turn, are determined by the joint distribution of X and θ. However, this joint
distribution is not deducible for the latent structure whose sole defining
characteristic is B. Hence, neither is the nature of the functional dependency of
$\frac{d}{dx} E(X_i \mid t = x)$ on x.
For a latent structure whose defining characteristics are {B∩lci},
\[
E(X_i \mid t = x \mid \theta = T') = E(X_i \mid \theta = T')
\tag{24}
\]
and
\[
E(X_i \mid t = x \mid \theta = T) = E(X_i \mid \theta = T).
\tag{25}
\]
In other words, those latent structures whose set of defining characteristics
include lci, the standard quantitative translation of the causal interpretation of
latent taxa (Grove, 2004, p.5), and a defining characteristic of the latent structures
that are the targets of MAXCOV and MAMBAC, yield within-class regression
functions that do not depend on t, thus contradicting (21v). Once again, it appears
to be the case that MAXSLOPE has inadvertently been designed to detect a latent
structure that is different from those that are the targets of detection of MAXCOV
and MAMBAC. For a latent structure whose defining characteristics are
{B ∩lci},
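\[
\begin{aligned}
E(X_i \mid t = x)
&= (1 - \Pi_{Tx}) E(X_i \mid \theta = T') + E(X_i \mid \theta = T)\Pi_{Tx} \\
&= E(X_i \mid \theta = T') + \Delta_{X_i} \Pi_{Tx}.
\end{aligned}
\tag{26}
\]
From (26), it follows that
\[
\frac{d}{dx} E(X_i \mid t = x) = \Delta_{X_i} \Pi_{Tx}'.
\tag{27}
\]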
For latent structures whose defining characteristics are {B∩val∩lci}, it is,
additionally, the case that
\[
\frac{d}{dx} E(X_i \mid t = x) = \Delta_{X_i} \Pi_{Tx}' > 0.
\tag{28}
\]
Now, clearly, the shape of $\Pi_{Tx}'$ is determined by the joint density of θ and t.
But, once again, this density is neither determined by {B∩val∩lci}, nor any
conjunction of properties considered by Grove and Meehl (1993).
Grove and Meehl (1993, p.710) provide an example based on conditional
gaussian data and state that this choice is "...purely for convenience..." Their
example also involves the equality of the conditional covariance matrices
C([Xi,t]|θ=T') and C([Xi,t]|θ=T), and the equality of ΔXi and Δt, data
generation properties that Grove and Meehl (1993, p.710) characterize as follows:
“None of these assumptions is critical, and all can be dropped later on...” But, in
fact, it is not clear whether this is true. The reader will recall that these data
generation features ensure that θ and t are mlrd, and mlrd is equivalent to the
condition that P(θ=T|t=x) is nondecreasing and crosses .5.
In a later publication, Grove (2004) provides a more detailed mathematical
account of MAXSLOPE. He claims that $\frac{d}{dx} E(X_i \mid t = x)$ should be considered as
having a “taxon-indicating shape” when it has “…a particular nonlinear form,
generally a “slanted” ogive shape.” The characteristics he initially ascribes to the
TLS are:
29)
i) B;
ii) cu;
iii) io: $f_{X_j \mid \theta = T'}$ and $f_{X_j \mid \theta = T}$, j=1..r, intersect at most once;
iv) cl: $E(X_j \mid t = x \mid \theta = s)$, j=1..r and s={T’,T}, are linear.
Thus, Grove’s claim should be read as, “a necessary condition of {B∩cu∩io∩cl}
is that $\frac{d}{dx} E(X_i \mid t = x)$ is a nonlinear function of x.” He claims, furthermore, that
“There is no need for the variables to have any certain distributions (subject to
the constraints given in auxiliaries [1] and [2])…there is no need for the
within-population regressions to be zero, i.e., “local independence” is not assumed…in
fact, there is no need for the within-population regressions to have the same
slope” (2004, p.8).
Beginning on page 11, Grove provides what he calls a “…bridging
partial…” (p.11) definition of the taxon concept, but which is, in fact, a second list
of TLS related characteristics:
30)
i) B;
ii) cu;
iii) val;
iv) val2: M(Xj|θ=T) > M(Xj|θ=T'), in which
M(Z|θ=s) is the mode of the density of Z conditional on θ=s;
v) $(1 - \Pi_T) f_{X = M(X \mid \theta = T') \mid T'} > \Pi_T f_{X = M(X \mid \theta = T') \mid T}$, $(1 - \Pi_T) f_{X = M(X \mid \theta = T) \mid T'} < \Pi_T f_{X = M(X \mid \theta = T) \mid T}$,
$(1 - \Pi_T) f_{Y = M(X \mid \theta = T') \mid T'} > \Pi_T f_{Y = M(X \mid \theta = T') \mid T}$, $(1 - \Pi_T) f_{Y = M(X \mid \theta = T) \mid T'} < \Pi_T f_{Y = M(X \mid \theta = T) \mid T}$;
vi) cl;
vii) eb: The slopes of $E(X_j \mid t = x \mid \theta = T')$ and $E(X_j \mid t = x \mid \theta = T)$, by (vi),
each linear, are equal.
List (30) only further muddies the waters, for eb contradicts Grove’s earlier claim
that “…there is no need for the within–population regressions to have the same
slope…”, and, contrary to his claim on Page 37 of his manuscript, the properties
listed in (30) do not imply that the “…the within-class densities…intersect at
most once…” (see footnote 7).
Consider, once again, several linkages between latent structures and
$\frac{d}{dx} E(X_i \mid t = x)$. For a latent structure whose defining characteristics are {B∩cl},
\[
E(X_i \mid t = x)
= (1 - \Pi_{Tx}) \left[ E(X_i \mid \theta = T') + b_{T'} (x - E(t \mid \theta = T')) \right]
+ \Pi_{Tx} \left[ E(X_i \mid \theta = T) + b_{T} (x - E(t \mid \theta = T)) \right].
\tag{31}
\]
Footnote 7: For example, two normal densities with unequal means satisfy (iiib)-(iiid), but cross twice unless they
have equal variances.
Employing eb in conjunction with (31), leads to Grove’s (2004) equation (2):
\[
E(X_i \mid t = x) = E(X_i \mid \theta = T') + b_{T'} (x - E(t \mid \theta = T')) + \Pi_{Tx} \left[ \Delta_{X_i} - b_{T'} \Delta_t \right].
\tag{32}
\]
That is, the focal manifest property of MAXSLOPE is a necessary condition of
{B∩cl∩eb}.
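The step from (31) to (32) can be written out explicitly. The following worked derivation is a sketch that assumes eb in the form $b_T = b_{T'}$, and takes $\Delta_{X_i} = E(X_i \mid \theta = T) - E(X_i \mid \theta = T')$ and $\Delta_t = E(t \mid \theta = T) - E(t \mid \theta = T')$; the definition of $\Delta_t$ is inferred by analogy with $\Delta_{X_i}$ and is an assumption here.

```latex
\begin{align*}
E(X_i \mid t = x)
&= (1 - \Pi_{Tx}) \bigl[ E(X_i \mid \theta = T') + b_{T'} (x - E(t \mid \theta = T')) \bigr]
 + \Pi_{Tx} \bigl[ E(X_i \mid \theta = T) + b_{T'} (x - E(t \mid \theta = T)) \bigr] \\
&= E(X_i \mid \theta = T') + b_{T'} (x - E(t \mid \theta = T')) \\
&\quad + \Pi_{Tx} \bigl[ \bigl( E(X_i \mid \theta = T) - E(X_i \mid \theta = T') \bigr)
 - b_{T'} \bigl( E(t \mid \theta = T) - E(t \mid \theta = T') \bigr) \bigr] \\
&= E(X_i \mid \theta = T') + b_{T'} (x - E(t \mid \theta = T')) + \Pi_{Tx} \bigl[ \Delta_{X_i} - b_{T'} \Delta_t \bigr].
\end{align*}
```

The only place x enters nonlinearly is through $\Pi_{Tx}$, which is why the shape of the regression hinges entirely on that quantity.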
From (32), it follows that, for a latent structure whose defining
characteristics are {B∩cl∩eb},
\[
\frac{d}{dx} E(X_i \mid t = x) = b_{T'} + \Pi_{Tx}' \left[ \Delta_{X_i} - b_{T'} \Delta_t \right].
\tag{33}
\]
Once again, the behaviour of $\frac{d}{dx} E(X_i \mid t = x)$ and, in particular, the issue of
whether it has a “slanted ogive shape” rests on the behaviour of ΠTx. As was the
case for MAXCOV and MAMBAC, what would be required to determine the
behaviour of ΠTx is knowledge of the joint distribution of θ and t. However, the
latent structure whose defining characteristics are {B∩cl∩eb} does not
determine this density. Hence, even if, as Grove seems to suggest, the TLS has
defining characteristics {B∩cl∩eb}, the behaviour of the right member of (33)
could not possibly be a necessary condition of this latent structure.
What is known
Table 1 summarizes linkages between latent structures that could
reasonably be considered to be candidates for labeling as the TLS, and the focal
manifest properties of each of MAXCOV, MAMBAC, and MAXSLOPE. It
furthermore indicates those quantities whose functional dependency on x is not
determined by the given latent structure. As is evident, none of the latent
structures considered determines the functional dependency of the focal manifest
properties of MAXCOV, MAMBAC, or MAXSLOPE, on x. For each linkage, the
focal manifest property is a function of at least one quantity (listed in the far right
column) whose functional dependency on x remains undetermined by the
corresponding latent structure. In other words, as it stands, none of MAXCOV,
MAMBAC, and MAXSLOPE is a detector of any of these latent structures.
While it is possible that Meehl and colleagues would prefer to consider the
intersection of some other set of characteristics as constituting the TLS, their
published efforts, as has been shown in the present work, certainly do not settle
what this set of characteristics might be.
Table 1 shows that, for latent structure {B∩val∩lci}, the focal manifest
properties of MAXCOV, MAMBAC, and MAXSLOPE are functionally dependent
on x through only a single quantity: ΠTx (MAXCOV), P(T|t≥x) + P(T'|t≤x)
(MAMBAC), and $\Pi_{Tx}'$ (MAXSLOPE). We will now introduce a single
assumption: that q, the number of indicators contained in X2, is large. As q
becomes large, the density functions $f_{t \mid \theta = s}$, s = {T, T’}, will each converge to
normal densities (Basawa & Rao, 1980; Holland, 1990):
\[
\left( 2 \pi \sigma^2_{t|s} \right)^{-\frac{1}{2}} \exp \left( - \frac{(t - \mu_{t|s})^2}{2 \sigma^2_{t|s}} \right).
\tag{34}
\]
Given satisfaction of this assumption, the behaviour of the focal manifest
properties of each of MAXCOV, MAMBAC, and MAXSLOPE can now be
determined.
As proven in Maraun and Slaney (2005), for {B∩val∩lci}, as q becomes
large, ΠTx converges to
\[
\left( 1 + \frac{(1 - \Pi_T)}{\Pi_T} \exp(ax^2 + bx + c) \right)^{-1},
\tag{35}
\]
in which
\[
a = \frac{\sigma^2_{t|T'} - \sigma^2_{t|T}}{2 \sigma^2_{t|T} \sigma^2_{t|T'}}, \qquad
b = \frac{\mu_{t|T'} \sigma^2_{t|T} - \mu_{t|T} \sigma^2_{t|T'}}{\sigma^2_{t|T} \sigma^2_{t|T'}}, \qquad
c = \frac{\mu^2_{t|T}}{2 \sigma^2_{t|T}} - \frac{\mu^2_{t|T'}}{2 \sigma^2_{t|T'}} + \ln \frac{\sigma_{t|T}}{\sigma_{t|T'}}.
\]
The behaviour of (35) can be summarized as follows (see Maraun & Slaney, 2005):
If $\sigma^2_{t|T'} = \sigma^2_{t|T}$, then the joint distribution of θ and t is monotone likelihood ratio
dependent, and ΠTx is both nondecreasing and crosses .5;
If $\sigma^2_{t|T'} \ne \sigma^2_{t|T}$, then ΠTx depends on x through the quadratic $ax^2 + bx + c$, and, depending in a
complicated way on the values assumed by the parameters $\Pi_T$, $\mu_{t|T'}$, $\mu_{t|T}$, $\sigma^2_{t|T'}$,
and $\sigma^2_{t|T}$, either does or does not cross .5.
Thus, it can be concluded that, so long as the assumption of large q holds, it is a
necessary condition of {B∩val∩lci} that C(Xi,Xj|t=x) is either one- or
two-peaked. Thus, under the satisfaction of this single assumption, the behaviour of
C(Xi,Xj|t=x) can legitimately be used as a detector of the latent structure
{B∩val∩lci}.
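The logistic-quadratic form of the conditional taxon probability can be checked against a direct application of Bayes' rule to two normal densities. The sketch below is illustrative only: the function names and all parameter values are hypothetical, and normality of t within each class is simply assumed (the large-q condition).

```python
import math


def normal_pdf(x, mu, var):
    """Density of a normal variate with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def posterior_direct(x, pi_t, mu_t, var_t, mu_c, var_c):
    """P(theta = T | t = x) by Bayes' rule; the suffix _c refers to T'."""
    num = pi_t * normal_pdf(x, mu_t, var_t)
    return num / (num + (1 - pi_t) * normal_pdf(x, mu_c, var_c))


def posterior_quadratic(x, pi_t, mu_t, var_t, mu_c, var_c):
    """The same probability written as (1 + ((1 - pi)/pi) exp(ax^2 + bx + c))^-1."""
    a = (var_c - var_t) / (2 * var_t * var_c)
    b = (mu_c * var_t - mu_t * var_c) / (var_t * var_c)
    c = (mu_t ** 2 / (2 * var_t) - mu_c ** 2 / (2 * var_c)
         + 0.5 * math.log(var_t / var_c))   # 0.5*ln(var ratio) = ln(s.d. ratio)
    return 1.0 / (1.0 + (1 - pi_t) / pi_t * math.exp(a * x * x + b * x + c))


if __name__ == "__main__":
    unequal = dict(pi_t=0.3, mu_t=2.0, var_t=4.0, mu_c=0.0, var_c=1.0)
    for x in (-3.0, 0.0, 1.0, 4.0):
        print(x, posterior_direct(x, **unequal), posterior_quadratic(x, **unequal))
```

With equal within-class variances the coefficient a vanishes and the exponent is linear in x with negative slope (for a taxon mean above the complement mean), so the probability is monotone increasing; with unequal variances the quadratic term allows non-monotone behaviour.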
For {B∩val∩lci}, as q becomes large, it follows from (35), by differentiation with respect to x, that $\Pi_{Tx}'$
converges to
\[
- \frac{\frac{(1 - \Pi_T)}{\Pi_T} (2ax + b) \exp(ax^2 + bx + c)}
       {\left( 1 + \frac{(1 - \Pi_T)}{\Pi_T} \exp(ax^2 + bx + c) \right)^{2}}.
\tag{36}
\]
This function can be shown to possess either one or two critical points, and a
single peak, depending on the values assumed by $\Pi_T$, $\mu_{t|T'}$, $\mu_{t|T}$, $\sigma^2_{t|T'}$, and $\sigma^2_{t|T}$.
Thus, under the assumption of large q, the behaviour of $\frac{d}{dx} E(X_i \mid t = x)$ can
legitimately be used as a detector of the latent structure {B∩val∩lci}.
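The slope of the conditional taxon probability follows from differentiating the logistic-quadratic form, and the result can be verified against a finite-difference approximation. This is a minimal sketch under the same large-q normality assumption; the coefficient values passed in the example are hypothetical.

```python
import math


def posterior(x, a, b, c, pi_t):
    """(1 + ((1 - pi)/pi) exp(ax^2 + bx + c))^-1, the limiting P(theta = T | t = x)."""
    return 1.0 / (1.0 + (1 - pi_t) / pi_t * math.exp(a * x * x + b * x + c))


def posterior_slope(x, a, b, c, pi_t):
    """Analytic derivative of posterior() with respect to x (chain rule)."""
    k = (1 - pi_t) / pi_t
    g = math.exp(a * x * x + b * x + c)
    return -k * (2 * a * x + b) * g / (1.0 + k * g) ** 2


def central_difference(x, a, b, c, pi_t, h=1e-6):
    """Numerical derivative used only as a cross-check."""
    return (posterior(x + h, a, b, c, pi_t) - posterior(x - h, a, b, c, pi_t)) / (2 * h)


if __name__ == "__main__":
    coeffs = dict(a=-0.375, b=-0.5, c=1.2, pi_t=0.3)   # hypothetical values
    for x in (-2.0, 0.0, 1.5):
        print(x, posterior_slope(x, **coeffs), central_difference(x, **coeffs))
```

An equivalent and compact way to write the analytic slope is −(2ax + b)·Π(1 − Π), where Π is the value returned by `posterior`; this makes it clear that the slope changes sign exactly where 2ax + b does.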
Finally, for {B∩val∩lci}, as q becomes large, P(T|t≥x) + P(T'|t≤x) − 1 =
P(T|t≥x) − P(T|t≤x) converges to
\[
\frac{\Pi_T \int_{x}^{\infty} n(\mu_{t|T}; \sigma^2_{t|T}; t)\, dt}
     {\int_{x}^{\infty} \left[ \Pi_T\, n(\mu_{t|T}; \sigma^2_{t|T}; t) + (1 - \Pi_T)\, n(\mu_{t|T'}; \sigma^2_{t|T'}; t) \right] dt}
-
\frac{\Pi_T \int_{-\infty}^{x} n(\mu_{t|T}; \sigma^2_{t|T}; t)\, dt}
     {\int_{-\infty}^{x} \left[ \Pi_T\, n(\mu_{t|T}; \sigma^2_{t|T}; t) + (1 - \Pi_T)\, n(\mu_{t|T'}; \sigma^2_{t|T'}; t) \right] dt},
\tag{37}
\]
in which
\[
n(\mu_{t|s}; \sigma^2_{t|s}; t) = \left( 2 \pi \sigma^2_{t|s} \right)^{-\frac{1}{2}} \exp \left( - \frac{(t - \mu_{t|s})^2}{2 \sigma^2_{t|s}} \right).
\]
This function is more complicated than those of MAXCOV and MAXSLOPE because it involves the
reintroduction of the marginal density of t. Not surprisingly, then, its functional
dependency on x is far more varied, ranging from increasing, to single-peaked, to
convex, depending on the values assumed by the parameters $\Pi_T$, $\mu_{t|T'}$, $\mu_{t|T}$, $\sigma^2_{t|T'}$,
and $\sigma^2_{t|T}$. What can be concluded, then, is nothing more than that a necessary
condition of {B∩val∩lci}, under the assumption of large q, is that the focal
manifest property of MAMBAC is nonlinear. Thus, if it does not appear that
E(Xi|t≥x) − E(Xi|t≤x) is nonlinear, the hypothesis that {B∩val∩lci} is the latent
structure of a set of indicators should be rejected.
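The large-q limit of P(T|t≥x) − P(T|t≤x) can be traced over a grid of cut points using normal tail probabilities. The sketch below is an illustration under assumed values: the function names, the grid, and the parameter setting (base rate .5, equal within-class variances, means two standard deviations apart) are hypothetical, and only this one symmetric case is examined.

```python
import math


def upper_tail(x, mu, sd):
    """P(t >= x) for a normal variate with mean mu and standard deviation sd."""
    return 0.5 * math.erfc((x - mu) / (sd * math.sqrt(2.0)))


def mambac_core(x, pi_t, mu_t, sd_t, mu_c, sd_c):
    """P(T | t >= x) - P(T | t <= x) when t is normal within each class."""
    a = upper_tail(x, mu_t, sd_t)                          # taxon tail
    p = pi_t * a + (1 - pi_t) * upper_tail(x, mu_c, sd_c)  # mixture tail
    return pi_t * a / p - pi_t * (1 - a) / (1 - p)


def trace(grid, **params):
    """Evaluate the curve at every cut point in grid."""
    return [mambac_core(x, **params) for x in grid]


if __name__ == "__main__":
    grid = [-2.0 + 0.5 * k for k in range(13)]             # cuts from -2 to 4
    curve = trace(grid, pi_t=0.5, mu_t=2.0, sd_t=1.0, mu_c=0.0, sd_c=1.0)
    for x, v in zip(grid, curve):
        print(f"{x:5.1f}  {v:.4f}")
```

In this symmetric setting the curve is single-peaked with its maximum at the midpoint between the class means. Other base rates and variance ratios can be passed to `trace` to explore how the shape changes; no claim is made here about which shapes arise for which parameter values.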
Discussion
Throughout his work, Paul Meehl has argued forcefully for both the
empirical realist conception of science, and the central role of latent variable
modelling technology in the execution of the empirical realist program of
psychological research. His unique technical contribution was the development
of a class of tools, taxometric tools, three of which are MAXCOV, MAMBAC,
and MAXSLOPE, that he claimed could be effectively used to detect discrete
causal structures (latent taxa). Ambiguity inherent to his specification of the
defining properties of latent taxa has meant that it has not been possible to
evaluate the truth of these claims. In the current work, it has been proven that,
given satisfaction of a single assumption, i.e., that the number of indicators
contained in the conditioning set of a taxometric analysis be large, MAXCOV,
MAXSLOPE, and MAMBAC can legitimately be employed as detectors of the
latent structure {B ∩val∩lci}. If it is agreed on by researchers that this latent
structure is an acceptable paraphrase of what they mean by latent taxon, then
MAXCOV, MAXSLOPE, and MAMBAC can be employed as a disconfirmatory
tool for the hypothesis of latent taxonicity. To date, there does not appear to
exist a confirmatory test of the existence of this latent structure (i.e., one based on
the existence of a sufficient condition of {B ∩val∩lci}).
In a related line of work, Maraun, Halpin, Tkatchouk, and Gabriel (2007)
have proven that, under the same conditions, the taxometric technique MAXEIG
(Waller & Meehl, 1998) is a detector of {B ∩val∩lci}. Specifically, they prove
that, for large q, if the first-eigenvalue function is not one- or two-peaked, then it
is logically valid to conclude that the latent structure in play is not {B ∩val∩lci}.
Thus, each of the most notable of Meehl’s techniques can legitimately be
employed as a disconfirmatory tool of the hypothesis that the latent structure of
a set of indicators is {B ∩val∩lci} . “Consistency tests” (see Waller & Meehl,
1998) can, of course, be produced, if desired, by conducting analyses on all of the
$\binom{p+q}{p}$ partitions of the overall set of (p+q) indicators into p output indicators,
X1, and one input indicator, t=1’X2.
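Enumerating the partitions over which such consistency checks would be run is straightforward. The sketch below is only an illustration: the indicator labels are hypothetical, and the input indicator for each partition is represented simply by the list of indicators that would be summed to form t.

```python
from itertools import combinations
from math import comb


def partitions(indicators, p):
    """Yield (output indicators, indicators summed into t) for every partition."""
    for output in combinations(indicators, p):
        summed = [v for v in indicators if v not in output]
        yield output, summed


if __name__ == "__main__":
    labels = ["V1", "V2", "V3", "V4"]        # hypothetical indicator labels
    for output, summed in partitions(labels, 2):
        print(output, "| t = sum of", summed)
    print(len(list(partitions(labels, 2))), "partitions; binomial:", comb(4, 2))
```

With p = 1 the same function yields the (q + 1) MAXSLOPE and MAMBAC partitions, and with p = 2 the MAXCOV partitions.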
References
Basawa, I., & Rao, B. (1980). Statistical inference for stochastic processes. New
York:Academic Press.
Fleisher, E. & Baize, H.R. (1982). Self monitoring: A theoretical critique. Paper
presented at the annual convention of the American Psychological Association,
Washington, D.C.
Fraser, D.A.S. (1976). Probability and Statistics: Theory and Applications. Toronto:DAI
Press.
Grove, W.M. (2004). The MAXSLOPE Taxometric Procedure: Mathematical Derivation,
Parameter Estimation, Consistency Tests. Psychological Reports, 95(2), 517-550.
Grove, W., & Meehl, P. (1993). Simple regression-based procedures for taxometric
investigations. Psychological Reports, 73(3, Pt 1).
Guttman, L. (1977). What is not what in statistics. The Statistician, 26, 81-107.
Holland, P. (1990). The Dutch identity: A new tool for the study of item response
models. Psychometrika, 55(1), 5-18.
Holland, P., & Rosenbaum, P. (1986). Conditional association and
unidimensionality in monotone latent variable models. The Annals of
Statistics, 14(4), 1523-1543.
Lehmann, E. (1966). Some concepts of dependence. Annals of Mathematical
Statistics, 37, 1137-1153.
McDonald, R.P. (1967). Nonlinear factor analysis. Richmond, Va.: The William
Byrd Press, Inc.
Maraun, M., Halpin, P., Tkatchouk, M., & Gabriel, S. (2007). MAXEIG: An
Analytical Treatment. Unpublished manuscript.
Maraun, M., & Slaney, K. (2005) An analysis of Meehl’s MAXCOV-HITMAX
procedure for the case of continuous indicators. Multivariate Behavioral
Research, 40(4), 489-518.
Maraun, M., Slaney, K., & Goddyn, L. (2003). An analysis of Meehl's MAXCOV-HITMAX
procedure for the case of dichotomous items. Multivariate Behavioral
Research, 38(1), 81-112.
Meehl, P.E. (1995). Bootstraps taxometrics: Solving the classification problem in
psychopathology. American psychologist, 50(4), 266-275.
Meehl, P., & Yonce, L. (1996). Taxometric analysis: II. Detecting taxonicity using
covariance of two quantitative indicators in successive intervals of a third
indicator (MAXCOV PROCEDURE). Psychological Reports, 78(3, Pt 2), 1091-1227,
Monograph Supplement.
Meehl, P., & Yonce, L. (1994). Taxometric analysis: I. Detecting taxonicity with two
quantitative indicators using means above and below a sliding cut (MAMBAC
procedure). Psychological Reports, 74(3, Pt 2), 1059-1274
Snyder, Mark. (1974). Self-monitoring of expressive behavior. Journal of Personality
and Social Psychology, 30(4), 526-537.
Tukey, J. (1958). A problem of Berkson, and minimum variance orderly
estimators. Annals of Mathematical Statistics, 29, 588-592.
Waller, N. G., & Meehl, P. E. (1998) Multivariate taxometric procedures:
distinguishing types from continua. Thousand Oaks, CA: Sage.
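Table 1. Focal Manifest Properties under Selected Latent Structures

LS | Focal manifest property | Undetermined functions of x

MAXCOV: C(Xi,Xj|t=x)
B | $\Pi_{Tx} \sigma_{ij|Tx} + (1 - \Pi_{Tx}) \sigma_{ij|T'x} + \Pi_{Tx}(1 - \Pi_{Tx}) \Delta_{X_i|x} \Delta_{X_j|x}$ | $\Pi_{Tx}$, $\sigma_{ij|Tx}$, $\sigma_{ij|T'x}$, $\Delta_{X_i|x}$, $\Delta_{X_j|x}$
B∩lci | $\Pi_{Tx}(1 - \Pi_{Tx}) \Delta_{X_i} \Delta_{X_j}$ | $\Pi_{Tx}$
B∩val∩lci | $\Pi_{Tx}(1 - \Pi_{Tx}) \Delta_{X_i} \Delta_{X_j} > 0$ | $\Pi_{Tx}$

MAMBAC: d(x) = E(Xi|t≥x) − E(Xi|t≤x)
B∩lci | $\Delta_{X_i}(P(T \mid t \ge x) + P(T' \mid t \le x) - 1)$ | $P(T \mid t \ge x) + P(T' \mid t \le x)$
B∩val∩lci | $\Delta_{X_i}(P(T \mid t \ge x) + P(T' \mid t \le x) - 1) > 0$ | $P(T \mid t \ge x) + P(T' \mid t \le x)$

MAXSLOPE: $\frac{d}{dx} E(X_i \mid t = x)$
B | $E(X_i \mid t = x \mid \theta = T')' + [\Delta_{X_i|x} \Pi_{Tx}' + \Pi_{Tx} \Delta_{X_i|x}']$ | $E(X_i \mid t = x \mid \theta = T')'$, $\Delta_{X_i|x}$, $\Delta_{X_i|x}'$, $\Pi_{Tx}$, $\Pi_{Tx}'$
B∩lci | $\Delta_{X_i} \Pi_{Tx}'$ | $\Pi_{Tx}'$
B∩val∩lci | $\Delta_{X_i} \Pi_{Tx}' > 0$ | $\Pi_{Tx}'$
B∩cl∩eb | $b_{T'} + \Pi_{Tx}' [\Delta_{X_i} - b_{T'} \Delta_t]$ | $\Pi_{Tx}'$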