TESTING LATENT VARIABLE MODELS WITH SURVEY DATA
TABLE OF CONTENTS FOR THIS SECTION
As a compromise between formatting and download time, the chapters below are in
Microsoft Word. You may want to use "Find" to go to the chapters below by pasting the chapter
title into the "Find what" window.
FOREWORD
INTRODUCTION
THEORETICAL MODEL TESTING STUDIES
STEP I IN UNOBSERVED VARIABLE-SURVEY DATA (UV-SD) MODEL TESTING-DEFINING MODEL CONCEPTS
FIRST-ORDER CONSTRUCTS
SECOND-ORDER CONSTRUCTS
INTERACTIONS AND QUADRATICS
DEFINING CONCEPTS
IDENTIFYING IMPORTANT ANTECEDENTS
SUGGESTIONS FOR STEP I-- DEFINING MODEL CONCEPTS
TESTING LATENT VARIABLE MODELS WITH SURVEY DATA
FOREWORD
This book critically reviews the process of testing, or validating, as it is sometimes called,
theoretical models involving unobserved or latent variables and survey data, and selectively
suggests improvements in this process. Because I am a captive of my discipline, a branch of
social science research that investigates socio-economic exchanges between organizations (also
known as marketing channels research), the book's examples and many of its comments about
theoretical model testing practices using survey data involve research in Marketing. Nevertheless,
because research in Marketing follows the same conventions and practices for theoretical model
testing as the other branches of the social sciences, this book and its suggestions have application
across the social sciences. Thus, this book is intended for researchers who test theoretical models
involving latent variables1 and survey data, and its purpose is to help these researchers reliably
test these models with survey data.
I am a theory tester, and my first experience with covariant structure analysis or structural
equation analysis (I shall use the more popular term structural equation analysis) was while
testing a theoretical model with multiple dependent or endogenous variables and survey data.
After trying Ordinary Least Squares regression, then Canonical Correlation, to estimate the path
coefficients in my structural model, I settled on using structural equation analysis, specifically
LISREL, for its ability to jointly estimate simultaneous equations and its ability to model the
effects of measurement error. I was subsequently surprised by how difficult structural equation
analysis was to use. In those days LISREL was available only on a mainframe computer. Using it
to estimate path coefficients in a structural model required learning the LISREL programming
"language," then writing and debugging a comparatively lengthy computer program using this
language. LISREL is now available on PC computers, and programming has been somewhat
simplified (see Hayduk, 1996:xiii) with its SIMPLIS code (LISREL program) generator.
There were (and are) other structural equation analysis programs besides LISREL2 (e.g.,
EQS, AMOS, etc.), but in comparison to OLS regression, for example, structural equation
analysis still seems as much an "art" as an estimation technique. Thus, this book is also intended
to help make analyzing survey data using structural equation analysis a little easier.
The book is organized around the process of testing theoretical models involving latent variables and survey data (i.e., first define the model constructs; then state the relationships among these constructs; develop appropriate measures of the constructs; gather data using these measures; next, validate these measures; and finally test the stated relationships among the constructs). It brings together what is known about this process, and it selectively adds to this body of knowledge. The book begins with a discussion of the process of testing theoretical models involving latent variables and survey data, then it details each step in this process using several examples involving real-world survey data.
1. The book is restricted to models that have criterion variables rather than classification variables (i.e., latent class models are not discussed).
2. Available structural equation analysis software includes AMOS (Arbuckle, 1988), COSAN (Fraser, 1988), EQS (Bentler, 1985), LINCS (Schoenberger and Arminger, 1988), and LISCOMP (Muthén, 1984).
Although the book assumes the reader is familiar with the terminology of covariant
structure analysis or structural equation analysis, and a software package for the analysis of
structural equations (e.g., LISREL, EQS, AMOS, etc.), I have tried to make it as accessible as
possible.
The list of those I should thank in this book is long and certainly incomplete. My first
exposure to structural equation analysis was while working with Bob Dwyer at the University of
Cincinnati. Neil Ritchie, also at UC, helped refine that first exposure. My thinking about
structural equations has been heavily influenced by the writings of James Anderson, Richard
Bagozzi, David Gerbing and John Hunter; Peter Bentler; Kenneth Bollen; Michael Browne and
Robert Cudeck; John Fox and Michael Sobel; Leslie Hayduk; Karl Jöreskog and Dag Sörbom;
and John Kenny.
This book is on my web site for several reasons. It allows me to use my recent
experiences to periodically revise the book without the rigors of publishing a revised edition.
Because the book is searchable using standard "Find" functions, I did not have to spend time
building an index, and this searchability seems to make it more useful than a printed version.
However, for several reasons the book is in Microsoft Word, rather than Acrobat Reader (PDF)
or HTML, so it may download rather slowly. It was also "auto" formatted from WordPerfect to
Word, and so, in addition to my own errors of omission and commission, there are probably
reformatting errors.
If you see anything you like, or dislike, errors, etc., please e-mail me with the details.
Robert A. Ping, Jr.
Department of Marketing
Wright State University
Dayton, Ohio 45435-0001
rping@wright.edu
TESTING LATENT VARIABLE MODELS WITH SURVEY DATA
INTRODUCTION
This monograph selectively suggests improvements in the process of testing theoretical
models involving unobserved variables3 and survey data. Because my social science research
involves theoretical research in Marketing, the book's examples and much of its discussion of
theoretical model testing is focused there. Nevertheless, because the conventions and processes
of theoretical model testing using survey data are generally the same across the social sciences,
the book's suggestions have application throughout the social sciences. Thus, the book is
intended for researchers in the social sciences who test theoretical models involving latent
variables and survey data.
The suggestions in the monograph are prompted by a review of a sample of substantive
articles in the social sciences that qualitatively judged compliance with generally accepted
procedures for testing latent variables using survey data, and a review of the recent social science
methods literature.4 The book also selectively extends recent results in the methods literature, and
proposes novel applications of several others. It provides numerous explanations and examples,
and overall is intended as a contribution to continuous improvement in the use of generally
accepted procedures for theoretical model tests involving unobserved variables and survey data
in the social sciences.
THEORETICAL MODEL TESTING STUDIES
Perhaps Bollen (1989:268) best states the objective of theoretical model testing studies that involve unobserved variables and survey data: "In virtually all cases we do not expect to have a completely accurate description of reality. The goal is more modest. If the model... helps us to understand the relations between variables and does a reasonable job of matching (fitting) the data, we may judge it (the model) as partially validated. The assumption that we have identified the exact process generating the data would not be accepted."
3. The book assumes criterion dependent or endogenous variables rather than classification variables (i.e., latent class models are not discussed).
4. I reviewed articles reporting theoretical model tests involving survey data in the Journal of Consumer Research, the Journal of Marketing Research, the Journal of Marketing, Marketing Science, the Journal of the Academy of Marketing Science, the Journal of Retailing, the Journal of Personal Selling and Sales Management, and the Journal of Business Research from 1980 to the present. I also selectively reviewed methodological articles in the Psychological Bulletin/Psychological Methods, Psychometrika, Multivariate Behavioral Research, Sociological Methodology, Sociological Methods and Research, the Journal of Marketing Research, and Quality and Quantity for this same period. I had planned to complete similar reviews of the major journals in Psychology, Education, Political Science, Sociology, etc., but shortly after starting those reviews it became apparent that research in Marketing could be argued to be representative of the good and bad practices involved in testing theoretical models involving survey data. The resulting sample also may restrict any raised hackles resulting from my critical remarks about model testing practices in the sample to those in Marketing.
Reasonableness or adequacy in model testing studies involving unobserved variables and
survey data is addressed by first determining measure adequacy, then determining model
adequacy. Measure adequacy is typically determined using conceptual definitions of the
unobserved concepts, observed items that "tap into" or measure the unobserved concepts, and,
increasingly, model-to-data fit and parameter estimates from measurement models that utilize
structural equation analysis. Model adequacy is determined using hypotheses, and model-to-data
fit and parameter estimates from structural models that utilize structural equation analysis.
Specifically, social science researchers appear to agree that specifying and testing models
using unobserved variables with multiple item measures of these unobserved variables and
survey data (UV-SD models) involve i) defining model constructs, ii) stating relationships among
these constructs, iii) developing appropriate measures of these constructs, iv) gathering data using
these measures, v) validating these measures, and vi) validating the model (i.e., testing the stated
relationships among the constructs). However, based on the articles I reviewed, there also appears to
be considerable latitude in some cases, and confusion in others, regarding how these steps are
carried out in UV-SD model tests.
For example, in response to calls for increased psychometric attention to measures in
theoretical model tests in Marketing (e.g., Churchill, 1979; Churchill and Peter, 1984; Cote and
Buckley, 1987, 1988; Heeler and Ray, 1972; Peter, 1979, 1981; Peter and Churchill, 1986;
among others), reliability and validity now receive more attention in these tests, when compared
to the study results. However, the articles reviewed exhibited significant variation in what
constitutes an adequate demonstration of valid and reliable measures when unobserved variables
and survey data were involved. For example in some articles, steps v) (measure validation) and
vi) (model validation) involved separate data sets. In other articles a single data set was used to
validate both the measures and the model. Further, in some articles the reliabilities of measures
used in previous studies were reassessed. However in other articles, reliabilities were assumed to
be constants that, once assessed, should be invariant in subsequent studies. Similarly, in some
articles many facets of validity for each measure were examined, even for previously used
measures. In other articles few facets of measure validity were examined, and validities for
existing measures were also assumed to be constants (i.e., once judged acceptably valid a
measure should be acceptably valid in subsequent studies).
Further, methodologists in the social sciences have long warned about regression's potential for coefficient bias and sample-to-sample coefficient variation (inefficiency) because of measurement error (Bohrnstedt and Carter, 1971; see Rock, Werts, Linn and Jöreskog, 1977; Warren, White and Fuller, 1974; and demonstrations in Cohen and Cohen, 1983). Nevertheless, based on the articles I reviewed, regression still appears to be generally acceptable in some areas as an estimation technique for survey data with variables that contain measurement error. In addition, although many of the studies I reviewed acknowledged the risk of generalizing from a single study,5 in general there was little subsequent concern about the appropriateness or amount of generalizing from a single study. Because there are other examples, major and minor, such as little apparent concern about violations of the assumptions underlying the estimation techniques used in UV-SD model tests (e.g., the use of ordinal data with covariant structure analysis, which assumes continuous data), it seems fair to say there appear to be fewer generally accepted principles of model validation using unobserved variables and survey data than there could be.6
Fortunately there have been important advances in validating UV-SD models. These include
new results in developing, testing and evaluating multiple item measures, and estimating models
employing these measures. However, some of these developments have appeared in literatures
not widely read or easily understood by all substantive researchers in the social sciences.
5. Generalizing from a single study involves recommending interventions based on a single study, which ignores the possibility that "confirmed" hypotheses could be disconfirmed in a subsequent study, and disconfirmed associations could be confirmed in a future study. This can occur in many ways, including errors in the study (e.g., omission of important predictor/independent/exogenous variables, the use of ordinal data with structural equation analysis, which assumes continuous data, etc.), and the presence of unmodeled interactions or quadratic latent variables in the model.
6. A previous version of this monograph commented on the effects of this latitude in UV-SD studies. However, informal reviewers reacted negatively to statements such as "authors and reviewers... may apply idiosyncratic, rather than generally accepted, standards in evaluating these studies," and, "this can produce an unnecessarily prolonged and unpredictable review process... in which errors of acceptance and rejection can be higher than they ought to be." They stated that these problems were well known and readers did not need to be reminded of them.
Thus, this monograph is intended for these researchers, and one of its objectives is to
selectively identify areas for continuous improvement in the process of testing UV-SD models.
The book provides a qualitative review of UV-SD model testing practices.7 It provides selective
discussions of the errors of omission and commission in the process of testing theoretical models
involving latent variables and survey data, and it suggests, and selectively extends, remedies
from recent applicable methods research. For example, it suggests an additional procedure for
achieving model-to-data fit, or consistency in a measure, using covariant structure analysis (I will
use the more popular term, structural equation analysis, e.g., analysis using LISREL, EQS,
AMOS, etc.), and it includes a suggestion for easily executed pretests using scenario analyses.
The monograph provides accessible discussions of several overlooked but valuable statistics such
as Average Variance Extracted (AVE) and Root Mean Squared Error of Approximation
(RMSEA), and it suggests an estimator of AVE that does not rely on structural equation analysis.
It discusses matters that may be well-known to methodologists but may not be as well known to
substantive researchers, such as a discussion of error-adjusted regression, the use of single
summed indicators in structural equation analysis, and the use of a nonrecursive model to
investigate directionality or causality. This research selectively discusses recent advances in the
detection of interactions and quadratics, and provides a rationale for the more frequent inclusion
of interactions and quadratics in UV-SD model tests. It calls for additional attention to measure
consistency, and thus model-to-data fit, in structural equation analysis, and argues for a higher threshold for acceptable reliability based on average extracted variance. This research also
renews calls for caution in generalizing from a single study because of the unavoidable risks
from violations of methodological assumptions and the use of inter-subject research designs to
test intra-subject hypotheses. It also discusses the implications of reliability and facets of validity
as sampling statistics with unknown sampling distributions. It suggests techniques such as easily
executed experiments that could be used to pretest measures, and bootstrapping for reliabilities
and facets of validity. In addition, it suggests an alternative to omitting items in structural equation analysis to improve model-to-data fit, which should be especially useful for older measures established before structural equation analysis became popular.
The first step in UV-SD model testing is discussed next.
7. See Footnote 4.
STEP I IN UNOBSERVED VARIABLE-SURVEY DATA (UV-SD) MODEL TESTING-- DEFINING MODEL CONCEPTS
Models with unobserved variables with multiple item measures of these unobserved variables and survey data (UV-SD models) involve so-called latent or unobserved variables because we can observe only indirect evidence or indications of these model variables. For example, we can directly observe or measure a concept such as household income, but we can measure only indirect evidence or indications of the concept or construct "overall satisfaction." Thus, it is the practice in social science to measure several indications, or what are termed indicators, of each latent variable using multiple-item measures.
The construction of these indicators or multiple items is guided by the definition of the
construct or concept. These definitions were as a rule clearly stated in the articles I reviewed.
Because these matters have received attention previously (e.g., Bollen, 1989:180 and Churchill,
1979), later in this section I will simply summarize the two definitional requirements for the
unobserved variables typically involved in theoretical model tests: conceptual definition and
operational definition.
However, because several types of constructs are underutilized in the social sciences and they
offer opportunities to add to the richness and descriptiveness of UV-SD models, I will comment
on second-order constructs, and interactions and quadratics. To discuss second-order constructs I
begin with the notion of a first-order construct.
FIRST-ORDER CONSTRUCTS
A first-order construct has observed variables (i.e., the items in its measure) as indicators of
the construct. These constructs were ubiquitous in the articles I reviewed. The relationship
between indicators and their construct in a first-order construct typically assumes the construct
"drives" the indicators (i.e., the indicators are observable instances or manifestations of their
unobservable construct, and a diagram of the construct and its indicators would show the
construct specified or connected to the indicators with arrows from the construct to the
indicators-- a reflexive relationship, see Bagozzi, 1980b, 1984 and Figure A in Appendix A).
Less frequently, the indicators "drive" the construct (i.e., the indicators define the construct rather
than being several instances of a construct, and a diagram of the construct and its indicators
would show the indicators connected to the construct with arrows from the indicators to the
construct-- a formative relationship, see Fornell and Bookstein, 1982).
In UV-SD models estimated using regression, unidimensional items are summed to indicate their construct, while in structural equation analysis the indicators' relationship with their construct is explicitly specified using structural equation analysis software such as AMOS, EQS and LISREL (however, see Step V-- Single Indicator Structural Equation Analysis below for a summed indicator approach used in structural equation analysis).
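To make the distinction concrete, the following sketch (in Python, with arbitrary simulated item scores; the semopy package is an assumption here, standing in for LISREL, EQS or AMOS) contrasts a summed indicator with an explicitly specified first-order measurement model.

    import numpy as np
    import pandas as pd
    import semopy  # assumption: the semopy structural equation modeling package

    # Placeholder data: four items "driven" by one underlying satisfaction construct.
    rng = np.random.default_rng(0)
    n = 300
    sat = rng.normal(size=n)
    df = pd.DataFrame({f"sat{i}": 0.8 * sat + rng.normal(scale=0.6, size=n)
                       for i in (1, 2, 3, 4)})

    # Regression approach: the unidimensional items are summed to indicate the construct.
    satisfaction_score = df[["sat1", "sat2", "sat3", "sat4"]].sum(axis=1)

    # Structural equation approach: the construct-to-indicator relationships are
    # specified explicitly, so each item's measurement error is modeled.
    model = semopy.Model("SAT =~ sat1 + sat2 + sat3 + sat4")
    model.fit(df)
    print(model.inspect())  # loadings and error variances for each indicator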
SECOND-ORDER CONSTRUCTS
Second-order constructs have other unobserved constructs as their "indicators" (see Figure J
in Appendix J). These constructs were infrequently observed in the articles I reviewed.
Nevertheless, in Dwyer and Oh's (1987) study of environmental munificence and relationship quality in interfirm relationships, for example, the second-order construct relationship quality had the first-order constructs satisfaction, trust, and minimal opportunism as indicators (see Bagozzi, 1981a; Bagozzi and Heatherton, 1994; Gerbing and Anderson, 1984; Gerbing, Hamilton and Freeman, 1994; Hunter and Gerbing, 1982; Jöreskog, 1970; and Rindskopf and Rose, 1988 for accessible discussions of second-order constructs). Each first-order construct has its respective observed indicators. Specifying second-order constructs, such as relationship quality, with first-order constructs simplified the structural paths of the model, and it provided a richer description of the consequences of environmental munificence, for example. A similar approach was taken in Ping's (1997a) study of the relationship between cost-of-exit and voice in interfirm relationships. Cost-of-exit was itemized using several unobserved constructs (alternative attractiveness, relationship investment, and switching cost) which themselves had observed indicators.
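A measurement specification patterned on the relationship quality example above might look as follows. This is a sketch only, in Python, assuming the semopy package accepts a lavaan-style higher-order specification of this form; the construct labels, item names and simulated values are hypothetical.

    import numpy as np
    import pandas as pd
    import semopy  # assumption: the semopy structural equation modeling package

    # Placeholder data: a second-order construct "drives" three first-order
    # constructs, each measured by three observed items (all values arbitrary).
    rng = np.random.default_rng(1)
    n = 300
    rq = rng.normal(size=n)                    # relationship quality scores
    data = {}
    for name in ("sat", "tr", "op"):           # satisfaction, trust, minimal opportunism
        first_order = 0.8 * rq + rng.normal(scale=0.6, size=n)
        for i in (1, 2, 3):
            data[f"{name}{i}"] = 0.8 * first_order + rng.normal(scale=0.6, size=n)
    df = pd.DataFrame(data)

    # Second-order construct RQ with first-order constructs as its "indicators".
    second_order_model = """
    SAT =~ sat1 + sat2 + sat3
    TRUST =~ tr1 + tr2 + tr3
    OPP =~ op1 + op2 + op3
    RQ =~ SAT + TRUST + OPP
    """
    model = semopy.Model(second_order_model)
    model.fit(df)
    print(model.inspect())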
As these examples suggest, a second-order construct can be used to combine several related
constructs into a single higher-order construct to simplify the structural paths in a UV-SD model.
A second-order construct can also be used as an alternative to omitting items of a multidimensional measure to obtain model-to-data fit in structural equation analysis. This can be
useful with established measures, developed before the advent of structural equation analysis,
that turn out to be multi-dimensional using structural equation analysis (see Gerbing, Hamilton
and Freeman, 1994). In addition, a second-order construct can be used to account for types of
error other than measurement error in structural equation analysis (see Gerbing and Anderson,
1984).
A second-order construct can be conceptualized for regression as a set of factors in an exploratory factor analysis that are not particularly orthogonal (however, see the cautions about regression using variables measured with error in Step VI-- Violations of Assumptions). When the items in each of these factors are summed, an exploratory factor analysis of the resulting summed items produces a single factor. To use a second-order construct in regression (e.g., for exploratory purposes), the items in each first-order construct (factor) should be unidimensional. In addition, the second-order construct (factor) should be unidimensional in an exploratory factor analysis with each first-order construct's summed items as a single item per construct, and the second-order construct should be face or content valid using the first-order constructs as "items" (see Appendix J for an example).
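One rough way to carry out these checks is sketched below in Python, with placeholder item and construct names and with an eigenvalue inspection of the correlation matrix standing in for a full exploratory factor analysis: sum each first-order construct's items and verify that the sums are dominated by a single factor.

    import numpy as np
    import pandas as pd

    # Placeholder data: nine items, three per first-order construct, that share a
    # common second-order factor (all names and values are hypothetical).
    rng = np.random.default_rng(3)
    n = 300
    g = rng.normal(size=n)
    df = pd.DataFrame({f"{prefix}{i}": 0.7 * g + rng.normal(scale=0.7, size=n)
                       for prefix in ("sat", "tr", "op") for i in (1, 2, 3)})

    first_order_items = {"satisfaction": ["sat1", "sat2", "sat3"],
                         "trust": ["tr1", "tr2", "tr3"],
                         "min_opportunism": ["op1", "op2", "op3"]}

    # Sum each (unidimensional) first-order construct's items to form one "item"
    # per construct, then inspect the eigenvalues of the sums' correlation matrix.
    sums = pd.DataFrame({name: df[items].sum(axis=1)
                         for name, items in first_order_items.items()})
    eigenvalues = np.sort(np.linalg.eigvalsh(sums.corr().to_numpy()))[::-1]
    print(eigenvalues)  # one eigenvalue well above 1 suggests a single underlying factor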
INTERACTIONS AND QUADRATICS
Unlike experiments analyzed with ANOVA where interactions (e.g., XZ in
     Y = b0 + b1X + b2Z + b3XZ + b4XX + e                    (1)
       = b0 + b1X + (b2 + b3X)Z + b4XX + e                    (1a)
and quadratics (e.g., XX in Equation 1) are routinely estimated to help interpret significant main
effects (i.e., the X-Y and Z-Y associations), interactions and quadratics were rarely seen in the
articles reviewed. This may have been because authors have confused detection difficulties with
their frequency of occurrence (see Podsakoff, Todor, Grover and Huber, 1984; also see
McClelland and Judd, 1993). In addition, until recently interactions and quadratics in UV-SD
models have been difficult for researchers to specify and interpret (see Aiken and West, 1991;
Ping, 1995, 1996a).8 Further, because they are mathematical constructs or concepts rather than
mental constructs, and have indicators that are products of observed variables, interactions and
quadratics may be judged by some substantive researchers as inappropriate in UV-SD models.9
Nevertheless, authors have called for more thorough investigation of interactions and
quadratics in survey research (e.g., Aiken and West, 1991; Blalock, 1965; Cohen, 1968; Cohen
and Cohen, 1975, 1983; Darlington, 1990; Friedrich, 1982; Kenny, 1985; Howard, 1989; Jaccard,
Turrisi and Wan, 1990; Neter, Wasserman and Kutner, 1989; Pedhazur, 1982). Their argument is
the same as that used for main effects in ANOVA: failing to consider the possibility of
interactions and quadratics in the population model is likely to lead to erroneous interpretations
of the study's results.
To explain, in Equation 1 the actual coefficient of Z is given by (b2 + b3X) (see Equation 1a).
The statistical significance of this moderated coefficient of Z could be very different from the
statistical significance of the coefficient of Z in Equation 1 without the XZ variable (i.e., Y = b0'
+ b1'X + b2'Z ) (see Aiken and West, 1991). Specifically, if the interaction is significant (i.e., b3 is
significant) b2' could be nonsignificant while (b2 + b3X) is significant over part(s) of the range of
X in the survey (see Table C2 in Appendix C). Thus, failing to include an interaction when it is present in the population model would lead to a misleading interpretation of the Z-Y association. While strictly speaking a nonsignificant b2' implies the Z-Y association is disconfirmed, with a significant XZ interaction it is clearly not the case that Z is never associated with Y. The association simply depends on the level of X. This has several implications, including that b2' could be observed to be variously nonsignificant or significant across multiple studies, which may explain inconsistent findings among studies.
8. There are a variety of techniques for detecting interactions and quadratics. However, because many do not produce structural coefficients (e.g., b3 and b4 in Equation 1) that suggest the direction and strength of a significant interaction or quadratic, and they do not permit detailed interpretation such as that shown in Appendix C, the discussion will concentrate on techniques that estimate structural coefficients for interactions and quadratics.
9. This research treats interactions as mathematical constructs. Because they are mathematical, interactions and quadratics can, for example, be algebraically factored, as shown in Equation 1a. Because they are also constructs, their psychometrics (e.g., reliability and validity) are, or should be, important.
Alternatively, b2' could be significant while (b2 + b3X) is nonsignificant over part of the range of X in a study. In this event, failing to include the interaction could produce a false "confirmation" of the Z-Y association: the significant Z-Y association is actually nonsignificant over parts of the range of X in a study. This error is especially insidious in UV-SD model tests. Many of the studies I reviewed provided interventions (e.g., recommendations to practitioners) based on significant associations in the study, many of which, it seemed to me, could easily have been contingent associations.
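To illustrate the probing involved, the sketch below (Python, with arbitrary simulated values, and omitting the quadratic term of Equation 1 for brevity) estimates the interaction model by ordinary least squares on mean-centered variables, then evaluates the factored coefficient of Z, (b2 + b3X), and its standard error at several levels of X, in the spirit of the Table C2 illustration mentioned above.

    import numpy as np

    # Simulated illustration (arbitrary values): Y = b0 + b1*X + b2*Z + b3*XZ + e.
    rng = np.random.default_rng(0)
    n = 500
    x = rng.normal(size=n)                     # mean-centered X
    z = rng.normal(size=n)                     # mean-centered Z
    y = 0.2 * x + 0.1 * z + 0.4 * x * z + rng.normal(size=n)

    design = np.column_stack([np.ones(n), x, z, x * z])
    b, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ b
    sigma2 = resid @ resid / (n - design.shape[1])
    cov_b = sigma2 * np.linalg.inv(design.T @ design)

    # Probe the factored coefficient of Z, (b2 + b3*X), across the range of X
    # (Equation 1a); its variance is Var(b2) + X^2*Var(b3) + 2*X*Cov(b2, b3).
    for x0 in np.percentile(x, [10, 50, 90]):
        slope = b[2] + b[3] * x0
        se = np.sqrt(cov_b[2, 2] + x0 ** 2 * cov_b[3, 3] + 2 * x0 * cov_b[2, 3])
        print(f"at X = {x0:5.2f}: coefficient of Z = {slope:5.2f}, t = {slope / se:4.1f}")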
The algebra and implications of failing to consider the possibility of a population quadratic
are similar. Thus, care should be taken to consider interactions and quadratics in UV-SD model
testing studies. Specifically, they should, of course, be considered when theory postulates their
existence. In addition, they should be considered in post-hoc probing, as is done in experiments
analyzed with ANOVA, to aid either in interpreting and providing the implications of significant
associations, or as a possible explanation for hypothesized but nonsignificant associations (see
Appendix C for an example), or inconsistent results across studies.
There has been considerable progress in estimating interactions in survey data using
regression (e.g., Aiken and West, 1991; Denters and Puijenbrork, 1989; Feucht, 1989; Heise,
1986; Jaccard, Turrisi and Wan, 1990; Ping, 1996b; Warren, White and Fuller, 1974) and
structural equation analysis (see Bollen, 1995; Hayduk, 1987; Jaccard and Wan, 1995; Jöreskog
and Yang, 1996; Kenny and Judd, 1984; Ping, 1995, 1996a; Wong and Long, 1987) (also see
Appendix A). However, estimating interactions using structural equation analysis is more
difficult than using Ordinary Least Squares regression (Aiken and West, 1991). Nevertheless
latent variable interactions have been estimated using structural equation analysis (e.g.,
Hochwarter, Ferris and Perrewe, 2001; Lee and Bae, 1999; Lee and Ganesh, 1999; Masterson
2001; Osterhuis, 1997; Singh, 1998). In addition, interactions between first-order and second-order latent variables have been estimated with structural equation analysis (see Ping, 1999). I
will discuss the estimation of interactions and quadratics later.
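As a preview of that later discussion, the sketch below (Python, with placeholder item names and arbitrary data) shows the data preparation common to several of these approaches: the items are mean centered, summed, and multiplied to form a product indicator for the latent interaction XZ. In a full latent variable specification the loading and error variance of this product indicator are then constrained as functions of the measurement parameters of X and Z (e.g., Ping, 1995), which is taken up later in the book.

    import numpy as np
    import pandas as pd

    # Placeholder data: three items each for X and Z (names and values hypothetical).
    rng = np.random.default_rng(2)
    n = 300
    df = pd.DataFrame(rng.normal(size=(n, 6)),
                      columns=["x1", "x2", "x3", "z1", "z2", "z3"])
    x_items, z_items = ["x1", "x2", "x3"], ["z1", "z2", "z3"]

    # Mean center the items (centering reduces the correlation of X and Z with the
    # product term; see Aiken and West, 1991), then sum and multiply.
    centered = df - df.mean()
    df["x_sum"] = centered[x_items].sum(axis=1)
    df["z_sum"] = centered[z_items].sum(axis=1)
    df["xz"] = df["x_sum"] * df["z_sum"]  # single product indicator for the latent XZ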
I now return to Step I-- Defining Model Concepts.
DEFINING CONCEPTS
Concepts are defined using words, then they are measured using observed variables or items
that "tap" or are instances of these definitions. Thus, conceptual definitions, definitions of the
concepts in the model, are required to provide meaning for the label (e.g., satisfaction, solidarity,
equity, etc.) used for each concept, and meaning for concepts associated with these labels. For
example, a concept with the label "overall relationship satisfaction" could have the conceptual
definition, "the subject's overall evaluation of the costs and benefits attributed to the buyer-seller
relationship." Conceptual definitions are important for judging the adequacy of the observed
items used to measure the concept because these items should be judged to be instances or
indicators of the concept.
Operational definitions of concepts can be used in addition to conceptual definitions.
Conceptual definitions can be general, while operational definitions can allow for contextual
specificity or other contingencies in the study. For example, exiting (a label) could be
conceptualized as, or have the conceptual definition of, physically ending the relationship.
However, this concept could be operationalized or measured in many ways depending on the
study context, the population being sampled, the difficulty of tracking down subjects who have
exited, etc. Thus, the concept of exiting could be operationalized as exit-propensity, and have an
operational definition of "the intention to end the relationship" (see Ping, 1993).
An operational definition should be consistent with its conceptual definition, and items
should be "instances" of or "tap" their operational definition. A second-order construct should be
conceptually and operationally defined because the validity of a second-order construct is as important as the validity of a first-order construct. However, providing conceptual definitions for interactions or quadratics is difficult because they are mathematical concepts rather than mental constructs. Nevertheless, operational definitions should be provided (e.g., Z's moderation of the X-Y association was operationalized as XZ, the interaction between X and Z) because there are many operationalizations of an interaction (e.g., X/Z, etc., see Jaccard, Turrisi and Wan, 1990).
IDENTIFYING IMPORTANT ANTECEDENTS
Ideally defining the concepts in the model to be tested should involve all the antecedents of
each dependent or endogenous variable. However, models in the social sciences typically account
for only a portion of the variation in the dependent variables. There usually are other unknown
variables that are antecedents of each dependent variable but are not included in the model.
While accounting for all antecedents of every dependent variable is frequently impossible,
especially in early stages of theory development, not including important antecedents (i.e., ones
that are significantly related to the dependent variable, and that are also correlated with the other
independent variables included in the model) biases (i.e., inflates or deflates) observed
associations (see Duncan, 1975).
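A minimal simulation (Python, with arbitrary values) illustrates the point: when an antecedent that affects the dependent variable and is correlated with a modeled antecedent is omitted, the modeled antecedent's coefficient absorbs part of the omitted effect.

    import numpy as np

    # Placeholder simulation of omitted variable bias (all values arbitrary).
    rng = np.random.default_rng(4)
    n = 10_000
    x1 = rng.normal(size=n)                          # modeled antecedent
    x2 = 0.6 * x1 + rng.normal(scale=0.8, size=n)    # important but omitted antecedent
    y = 0.3 * x1 + 0.5 * x2 + rng.normal(size=n)     # both affect the dependent variable

    full = np.column_stack([np.ones(n), x1, x2])
    reduced = np.column_stack([np.ones(n), x1])      # x2 omitted
    b_full, *_ = np.linalg.lstsq(full, y, rcond=None)
    b_reduced, *_ = np.linalg.lstsq(reduced, y, rcond=None)
    print(round(b_full[1], 2))      # close to the true 0.3
    print(round(b_reduced[1], 2))   # inflated to roughly 0.3 + 0.5 * 0.6 = 0.6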
This places a great burden on theoretical model testing for the inclusion of all important
antecedents. Knowing which unstudied antecedents are important obviously requires
considerable researcher knowledge and time. As a result, an objective of pretesting the model,
which is discussed later, should be to address the adequacy of the antecedents (i.e., explained
variance). A model that does not explain much variance in a dependent variable is vulnerable to
subsequent research that uses models that explain more variance in a dependent variable: nonsignificant results may later turn out to be significant because important antecedents were not
included in the original model.
Finally, the requirement to adequately model the antecedents of dependent variables is not
contrary to the notion of model parsimony, because parsimony is intended to exclude variables
that are unimportant.
SUGGESTIONS FOR STEP I-- DEFINING MODEL CONCEPTS
Because of their importance in establishing the meaning of a construct and their importance
to evaluating the domain sampling adequacy of the observed items that comprise the measure of
the construct, conceptual definitions should be carefully stated for each construct. Operational
definitions should be used to allow for contextual specificity of the sample or other contingencies
in the study, and to describe second order concepts and interactions.
To reduce bias in the estimates of the associations with dependent variables in the model, and
thus the likelihood of false positive (Type I) and false negative (Type II) errors, the important
antecedents of each dependent variable should be specified in the model to be validated. For
similar reasons, interactions and quadratics should be considered, as they are in ANOVA studies, even if theory is mute on their possible existence. At a minimum they should be estimated on a
post hoc basis when an hypothesized association turns out to be nonsignificant, as a possible
explanation for this lack of significance.
Because second-order constructs can combine dimensions of a multidimensional construct
and produce a more parsimonious structural model, among other reasons, plausible second-order
constructs should also be considered.
(end of section)