
A Framework for Analyzing Levels of Analysis
Issues in Studies of E-Collaboration
—MICHAEL J. GALLIVAN AND RAQUEL BENBUNAN-FICH
Abstract—There has been a proliferation of competing explanations regarding the inconsistent results reported in the e-collaboration literature since its inception. This study advances another possible explanation by
investigating the range of multilevel issues that can be encountered in research on the use of synchronous
or asynchronous group support systems. We introduce concepts of levels of analysis from the management
literature and then examine all empirical studies of e-collaboration from seven information systems journals
for the period 1999–2003. We identified a total of 54 studies of e-collaboration in these journals, and after
excluding 18 nonconforming studies—those that were primarily conceptual, qualitative, or exploratory
only—we analyzed the levels of analysis issues in the remaining 36 empirical studies. Based on our analysis
and classification of these studies into six different clusters according to their levels of analysis, we found
that a majority of these studies contain one or more problems of levels incongruence that cast doubts on the
validity of their results. It is indeed possible that these methodological problems are in part responsible for
the inconsistent results reported in this literature, especially since researchers’ frequent decisions to analyze
data at the individual level—even when the theory was formulated at the group level and when the research
setting featured individuals working in groups—may very well have artificially inflated the authors’ chances of
finding statistically significant results. Based on our discussion of levels of analysis concepts, we hope to
provide guidance to empirical researchers who study e-collaboration.
Index Terms—E-collaboration, group support systems (GSS), levels of analysis.
Understanding and enhancing the value of information technology (IT) within organizations is, arguably, the primary research objective of the information systems (IS) literature. Over the past
30 years, beginning with early studies by Lucas,
researchers have sought to identify when and
why computer technology delivers benefits to
organizational members [1]. While this endeavor
has evolved into distinct research streams
examining the use of computer and communication
technologies by individuals, groups, organizations,
and interorganizational supply chains, the issues and
insights from each research stream have important
implications for each other. While the explanations
regarding whether and how IT creates benefits may have value across levels of analysis, it is critical that IS researchers bear levels of analysis issues in mind
as they formulate their theories, design their studies,
and analyze their data. Our objective is to evaluate
IS research conducted on electronic collaboration
(henceforth e-collaboration) to examine levels of
analysis issues, and to assess the extent to which
these concerns are appropriately addressed.
One example of the importance of consciously
reflecting upon levels of analysis issues appears in
the IT payoff literature. Although studies of IT payoff are generally conducted at the organizational level of analysis, and e-collaboration is usually studied at the group level, the problems that accompany misspecification of the appropriate level of analysis are important for all researchers to recognize and resolve [2]. In order to
illustrate our topic, we offer the following analogy to
issues that have plagued researchers in the IT payoff
literature over the past decade.
Within the organizational IT payoff literature, there
has been considerable emphasis on the so-called
productivity paradox, a phenomenon first mentioned
in the late 1980s by economist Stephen Roach [3],
[4]. Over the subsequent 15 years, many studies investigating IT payoffs at first supported Roach’s productivity paradox [5], [6], but more recent studies
have rejected it and, conversely, have demonstrated
the considerable value of IT investments to firm-level
performance [7]–[9]. Among the advances that have
led to better insights into the consequences of IT
investments have been studies that urged researchers
to more clearly specify the levels of analysis at which
their data are collected and analyzed (i.e., at the
industry, firm, or business unit level) [5]. A second set of improvements has resulted from developing
more precise construct definitions and statistical
procedures for detecting so-called payoffs from IT
investments—for example, specifying whether the
benefits appear in terms of greater productivity,
profitability, or consumer welfare [10], and whether a
time lag exists between when the funds are invested
and when payoffs appear [5].
A decade ago, Brynjolfsson underscored the
importance of levels of analysis when he noted
the difference between analyzing payoffs from IT
investments at the firm level versus the industry level
of analysis:
IT may be beneficial to individual firms, but
unproductive from the standpoint of the industry
as a whole or the economy as a whole: IT
rearranges the shares of the pie without making
it any bigger . . . [E]conomists have recognized
for some time that . . . one firm’s gain comes
entirely at the expense of others, instead of by
creating new wealth . . . IS researchers would
draw very different conclusions from studies
that examine industry-level benefits from IT
investment (which may be absent or negative) vs.
studies that examine firm-level benefits (which
may be positive). Thus, misspecification of the
appropriate level of analysis regarding where
the benefits accrue would lead to incorrect
conclusions regarding the value of IT investments.
[5, p. 75]
Moreover, other recent studies have urged researchers
to seek more complex theories to explain the
conflicting results that characterize this area of
study [11], such as process models [12], or mediating
variables that link IT effects on specific processes
to overall firm performance [13]. Based on these
advances in research methods and in the precision of
theoretical formulation, IT payoff studies in recent
years have been able to consistently identify firm-level
benefits of IT spending [7], [8], [14], thus refuting the
productivity paradox.
Without a doubt, research on IT usage at the group
level is an important research domain within the
IS literature, yet researchers often fail to notice the
parallels between group- and organizational-level
research. Chan has noted that researchers at different
levels of analysis often “talk past each other,” and she
claims that, with regard to these levels of analysis
issues, the “schisms are getting more noticeable over
time” [2, p. 241].
Most group support systems (GSS) research is
plagued with seemingly contradictory findings that
sometimes advocate for use of these technologies, and
other times report little or no benefit. If any smoking
gun exists (i.e., in terms of a study that challenges
the value of GSS technologies, as Roach did when
he first noted the productivity paradox), it is the
study by Pinsonneault, Barki, Gallupe and Hoppen
where the authors concluded that the use of GSS for
supporting group brainstorming created an “illusion
of productivity” [15, p. 110]. Aside from this particular
study, there appear to be no other studies that
challenge the value of GSS technologies or the GSS
research program as a whole. Yet, the most optimistic
conclusion one might offer regarding the past 20
years of GSS research is that the findings have been
steadfastly inconsistent. It is unclear whether the fault
lies in the use of overly deterministic epistemologies
[16], [17], one-shot research approaches that neglect
to consider group history and changes over time [16],
[18], adherence to deterministic theories that assume
an unproblematic “logic of determination” [11], failure
to ensure task-technology fit within the research
context studied [19], or other possible explanatory
factors.
Our objective in this paper is to draw attention to
another potential explanation in the literature on
group-level IT usage by arguing that ongoing neglect
of levels of analysis issues that have been discussed
in the management literature for over 20 years
may contribute to the confusion and inconsistent
results concerning GSS usage and other forms of
e-collaboration among users [20], [21]. We do so by
attempting to bridge the gap, or schism, between
researchers who focus on IT use and its impacts
at different levels of analysis. We believe that the
earlier problems and insights derived from research
on IT payoffs (conducted at the firm level) indeed
have important implications for IS researchers
who study issues related to group-level IT use and
impacts. We proceed by drawing attention to recent
contributions to the levels of analysis debate within
the management literature [22]–[24], arguing that the
insights offered there have important implications for
IS researchers studying e-collaboration in terms of
how we theorize, operationalize our constructs, collect
data, and conduct our analyses. Based on a review
of 54 studies of e-collaboration published in seven
leading IS journals during the period 1999–2003,
we find that there has been insufficient attention to
ensuring a good fit between the levels at which the
theory is formulated vs. the levels at which data are
collected and analyzed. In his recent commentary on
the IT payoff literature, Kohli noted that:
past studies that had been looking for IT payoff at
the economy or industry level should have been
examining the impacts at the firm-level . . . [Given
the prior history of] mixed or negative results . . .
the business value of IT, or IT payoff, literature
appears to face challenges to move from the
macro level to micro level. [25, p. 1467]
In an analogous fashion, we believe that inadequate
attention to levels of analysis concerns may be
responsible for the contradictory findings in studies
of GSS use and other forms of e-collaboration. This
is also consistent with Poole and Van de Ven’s advice
that
many problems and solutions apparent at one
level of organization manifest themselves in
different and contradictory ways at other levels
. . . Key dynamics can often be explained and
understood at one level of organization as a
result of processes occurring at another. [26, pp.
570–571]
Through our review of the GSS literature, we hope
to make IS researchers aware of the problems with
levels of analysis issues that have characterized
much research on GSS use and impacts, and to offer
guidelines for redressing these problems in the future.
LITERATURE REVIEW
There is increasing evidence that despite constant
effort on the part of IS researchers over the past
two decades, there has been little in the way of a
cumulative body of findings on the use of IT to support
e-collaboration. Within the past few years alone,
several studies have reviewed and even meta-analyzed
the literature on GSS use and its impacts [19],
[27]–[29]. Among the problems identified from such
analyses were conclusions from Dennis and Wixom
who noted that:
For almost 20 years, researchers have been
studying the effectiveness and efficiency
of systems that support synchronous and
asynchronous teams . . . Unfortunately, drawing
some overall conclusions from this collective
body of research about the general effect of GSS
has not been easy because GSS findings have
been relatively inconsistent . . . (Based on findings
from two meta-analyses), GSS use was found
to improve decision quality and the number of
alternatives or ideas generated across studies,
but . . . to increase the time taken to complete
the task and reduce participant satisfaction . . .
[In their] comprehensive nonstatistical analysis
of more than 230 articles . . . [Fjermestad and
Hiltz] found that in most cases, use of GSS led
to no improvements, even in the applications
believed to be most suited to GSS use ([namely]
idea generation) . . . [27, pp. 235–236]
Gopal and Prasad, who advocated alternative
epistemologies and longitudinal studies to follow
groups over time, described the problems that such
inconsistent results create for IS researchers:
There is wide acknowledgment that GDSS
research results have been either inconsistent or
nonexistent . . . Rather than narrowing down to
some “truth,” . . . there appears to be little accord
among researchers . . . The inconsistency and the
lack of significant results appear to have been
particularly disturbing to the GDSS research
community, with almost every new voice within
the literature drawing attention to them and
attempting to explain how the problem might be
solved. The solutions proposed, unfortunately,
have resulted in the proliferation of theories and
models and in the fragmentation of views within
the community. [16, p. 510]
In addition to calls for alternate research
epistemologies and methods [16], several authors
have called attention to the need for researchers
to understand whether and how group members
appropriate the various features that support
group collaboration [30], to examine the level of
task-technology fit [31] or to carefully attend to both
sets of concerns [27]. Other proposed solutions
have been to develop new constructs such as
faithfulness of appropriation [32] and consensus
on appropriation [33], or to employ theories that
recognize inconsistency and a “logic of opposition”
in their basic assumptions, rather than to expect
consistency and technological determinism [11]. In
this paper, we offer a different explanation for the
inconsistency in research findings that characterize
previous studies of e-collaboration: we propose that
researchers have neglected to attend to levels of
analysis issues, and that the trustworthiness or validity of several prior findings may be called into question. While we recognize that this claim may be controversial, our arguments are well supported by much research in the fields
of psychology and management, and these issues
are finally beginning to receive attention from IS
researchers [25], [34]. Despite these nascent attempts to raise such issues, both within traditional IS outlets and beyond the boundaries of the IS discipline, we believe that such issues warrant increased attention within the IS community. Hence,
we seek to convey our message to IS researchers who
study e-collaboration through this special issue on
“Expanding the Boundaries of E-Collaboration.”
In reviewing studies of GSS use and other forms of
e-collaboration, we examined a broader set of studies
than have been discussed in previous review papers
[19], [27]–[29]. While the set of papers we reviewed
is more circumscribed in terms of the time period
covered (compared to these prior review studies), we
sought to review not only the traditional studies of
electronic meeting systems or group decision support
systems (GDSS) (i.e., technologies that support
same time, same place meetings) but also to include
studies of distributed teams and communities using
asynchronous support tools. While the earlier review
studies found a very small proportion of studies
of asynchronous technologies among the overall
set of GSS studies (only 8% of all GSS lab studies
[28] and 16% of GSS field studies [29] focused on
asynchronous support technologies), we deliberately
searched for studies that feature distributed teams
and asynchronous support tools to complement the
traditional studies of GSS, GDSS, and electronic
meeting systems. We included studies that examined
the use of a broad range of group-related technologies,
regardless of the specific labels employed by the
authors (GSS, GDSS, electronic meeting systems,
asynchronous collaboration, distributed or virtual
teams, virtual communities, etc.).
Below, we introduce the core concepts and
terminology related to levels of analysis issues and
emphasize that these insights are valuable for all
IS researchers who study e-collaboration among
individuals and groups. Despite the fact that the
concepts that we introduce and discuss are often
labeled as multilevel theory concerns, we strongly
emphasize that awareness of these issues should not
be restricted to researchers who regard themselves
as multi-level researchers, or to those who explicitly
conduct multi-level or cross-level research, but rather should extend to all IS researchers. Throughout the paper, we seek
to draw parallels between refinements in theory and
methods that can help to improve the consistency
of e-collaboration research by drawing analogies to
advances that were previously achieved in the domain
of firm-level IT payoff studies. These advances have
encouraged researchers who study IT payoffs at the
firm or industry levels to more clearly specify the
levels at which they expect the hypothesized effects to
occur, and to follow appropriate statistical procedures
for analyzing IT payoffs [6], [13].
LEVELS OF ANALYSIS CONCEPTS AND DEFINITIONS
Over the past several years, the set of issues that fall
under the label of levels of analysis have become a
niche research area in management and psychology,
with scholars such as Katherine Klein [22]–[24], [35],
[36], Steve Kozlowski [37], [38], and Fred Dansereau
[22], [39] among the leading thinkers and writers.
While there has been increasing interest in multi-level
research and levels of analysis concerns in the
management, psychology, and educational research
literature in recent years, the underlying concepts
were first explicated over 20 years ago by Denise
Rousseau [20], [21]. In addition to such seminal
studies of key multilevel concepts [40], statistical
approaches for ensuring that data collected at the
individual level can be properly aggregated to the
group level have been in existence for over 20 years
[41], [42]. Despite the longevity of these concepts and
statistical methods concerning levels of analysis, we
believe that there are two key reasons that these
important ideas have been almost entirely neglected
in the IS literature (with one notable exception [34]).
First, there has been a proliferation of recent studies
comparing and critiquing the various statistical
approaches for examining measures of intercoder
reliability and within-group homogeneity [37],
[43]–[45]. In addition to being very complex—in terms
of the level of mathematical sophistication required
to understand them—these papers have inadvertently
created the illusion that the primary issues are
statistical ones. This is an unfortunate state of
affairs because the key issues in understanding and
correctly specifying the levels of analysis for research
are actually conceptual rather than statistical
concerns [22]. It is critical for all researchers to
clearly understand and specify the levels of analysis
at which their theories apply, even before they delve
into the details of appropriate statistical techniques to
ensure that their data collection and analytic methods
conform to the level of their theories [22], [23], [38].
A second potential reason for the general neglect of
this level of analysis literature is that many authors
consider it relevant only for researchers who explicitly
develop multi-level or cross-level theories. This second
misunderstanding is easily explained, since the first
publications to explicate the key issues involved in
levels of analysis featured titles such as “multi-level and cross-level perspectives,” which implied that this
topic was only of concern to multi-level or cross-level
theorists [21]. While both misconceptions have
contributed greatly to the lack of attention to levels of
analysis issues in organizational research, we are not
the first authors to acknowledge these contributing
problems. A decade ago, psychologist Katherine Klein
and her colleagues provided a cogent, nonstatistical
explanation of the theoretical issues involved in
specifying proper levels of analysis [22]. In numerous
publications, and assisted by various co-authors, she
has led the charge to develop a greater awareness
of levels of analysis issues through special journal
issues [24], focused monographs [23], and other
research studies [36]. In particular, with regard to the
misconception that levels of analysis issues should be
of concern only to those researchers who regard their
work as multi-level, Klein et al. acknowledge the fact
that prior guidelines on levels of analysis issues:
. . . create the inadvertent impression that
attention to levels is only a priority for scholars
who undertake mixed-level theory. But, precise
articulation of the level of one’s constructs
is an important priority for all organizational
scholars whether they propose single- or
mixed-level theories . . . [There are] profound
implications of specifying a given level or levels of
a theory. Greater appreciation and recognition
of these implications will . . . enhance the clarity,
testability, comprehensiveness, and creativity of
[all] organizational theories. [22, p. 196]
We acknowledge that the concepts we discuss below
are not novel, but have been discussed by Klein and
her colleagues [22], [36], Rousseau and her colleagues
[20], [21], [46], and James and his colleagues [41],
[42], [47] in years past. We believe, however, that while
our observations below may not be groundbreaking
to methodological experts within the IS field—those
Cortina labels as the “methods folks”—
we are the first to bring these issues to the attention
of a broader group of IS researchers, specifically those
studying the use of synchronous and asynchronous
tools for e-collaboration [48, p. 339].
Levels of analysis concerns are important for all
researchers to understand and follow. Seminal
works on levels issues [20], [22] have identified three
domains at which levels issues may be considered:
the level at which the theory is conceptualized, the
level at which data are collected, and the level at
which data are analyzed. While the level of data
collection and level of data analysis are largely
self-explanatory, the notion of the level at which
theory is conceptualized—the FOCAL UNIT—is not so
straightforward. According to Rousseau [20]:
The level to which generalizations are made is the
focal unit. In practice, the focal unit often is not
identical to either the level of measurement or the
level of [data] analysis. Researchers [may seek to]
. . . measure an individual’s sense of autonomy
and the number of formal rules and regulations
in an individual’s job and conclude that an
organization’s technology affects its structure.
[20, p. 4]
In both seminal articles, the authors explain that it
is critical that the level at which data are analyzed
must conform to the focal unit of the theory [20],
[22]. Where the focal unit is incongruent with the
level of data analysis, problems of misspecification
occur, leading to cross-level fallacies, contextual
fallacies, and aggregation biases. Contrary to what
most researchers believe, Klein et al. argue that it
is not necessary that the level of data collection
match the level of data analysis—and, in fact, they
urge researchers to collect data at multiple levels
of analysis [22]. While such advice may appear
counterintuitive, Klein et al. argue that it is the level
at which data are analyzed that must match the
focal unit (but the level at which data are collected
may be different—ideally, collecting data at a more
“micro” level than the level at which data will be
analyzed). Thus, it is possible, and even desirable,
for researchers to specify group-level hypotheses
as their focal unit, but then to collect data at the
individual level and analyze their data at the group
level. In this regard, the focal unit indeed conforms
to the level of data analysis since both are at the
group level, but the focal unit (i.e., group level) is at a
higher level of analysis than the level at which data
are collected. This is both appropriate and desirable
[22]. Researchers must, however, demonstrate that
their data meet specific criteria before analyzing
individual-level data at the group level of analysis;
otherwise problems of aggregation bias will occur [20],
leading to findings that are statistically significant,
but possibly invalid [34].
Not only is it permissible for data to be collected
one level lower than the level at which they are to
be analyzed, but Klein et al. strongly encourage
researchers to do so because such lower-level data
can be statistically analyzed to ensure that they
exhibit the necessary attributes to be aggregated to
the higher, group level [22]. In contrast, researchers
who collect their data only at the group level (e.g.,
in order to test a group-level theory) have no choice
but to simply assume that their constructs are valid
at the group level. They are unable to statistically
test whether these constructs are valid at the group
level. According to Klein et al. [22], it is always
better for researchers to test their assumption that
individual-level data can be statistically aggregated
to the group level, rather than simply assume this
to be true. Thus, for example, in a study whose
focal unit is the group level, it is desirable for the
researcher to collect data at both the individual and
the group levels, and then to statistically test whether
individuals are more similar within groups than would
occur by chance, thus demonstrating that the data
can be analyzed at the group level. There are several
statistical methods for doing so (which we briefly
review in the next section). The important conclusion
is that it is acceptable for the researcher to use
individual-level data for testing group-level theories,
but only if the data can be statistically proven to meet
the criteria for conducting analyses at the group level
(generally known as within-group homogeneity of
variance or interrater reliability). Klein et al. explain
why a demonstration of within-group homogeneity is
such a critical threshold for researchers to meet when
testing their theories at the group (or unit) level:
The very definition of such [group] constructs
asserts that unit members agree in their
perceptions of the relevant characteristics
of the unit. In the absence of substantial
within-unit agreement, the unit-level construct is
untenable, moot . . . For example, in the absence
of substantial agreement among the members of
a unit about the unit’s norms, the unit simply
has no shared norms. [22, p. 4]
Below, we distinguish between three types of
group-level constructs, which are global (measured at
the group level), shared (measured at the individual
level, but homogeneous within groups), and fuzzy or
fictional (constructs that arguably do not exist at the
group level) [43]. (There is another category we do
not discuss here which is known as the “frog-pond
effect” [22]. This concerns data which are measured
individually, but which are shown to be heterogeneous
among members of a group. Such constructs may be
useful for explaining how members’ divergence from
their group’s mean can explain other constructs. An
example is research on organizational demography,
which shows, for example, how age, gender, or racial
diversity within a management team influences
group- or corporate-level performance.)
If a construct is measured at the group level, and
there is no individual-level analog, then the measure
is a global measure. Some examples are the group’s
mission and the quality of the group’s output or
performance. Other measures that are typically
assumed to be global measures include group size
and the number of unique ideas proposed by the
group during a brainstorming task. If the construct
is actually measured at the individual level and then
aggregated up to the group level (usually through
averaging or summing the individual-level data within
each group), the construct is shared rather than
global. In order for a construct to be shared at the
group level, however, the individuals within each
group must be relatively homogeneous in terms of
their individual scores. One way to operationalize
such within-group homogeneity is by requiring that
the statistical variation of the scores within each given
group be less than the variance among the individual
scores across the various groups. A number of
complex statistical metrics have been proposed to
measure within-group homogeneity (sometimes
labeled intercoder agreement), with names such as
eta-squared [44], rwg (within-group agreement [19]),
intra-class correlation coefficient [44], [45] (there are
two versions of this, known as ICC-1 and ICC-2),
and within-and-between analysis (WABA) [49]. The
statistical methods for assessing such within-group
homogeneity are complex, and have been explicated
in more than a dozen papers in psychology and
management journals dating back some 20 years
[41], [42]. Some useful studies have recently been
published which compare and contrast the various
metrics for assessing within-group homogeneity
[35], [45], examining issues such as how group size
affects the various metrics [44]. Most of these metrics
are used to assess whether it makes sense to use
aggregated, individual-level data by examining the
data for all groups on a specific construct (these
include eta-squared, ICC-1, and ICC-2). Other metrics
are used for testing whether it is appropriate to
aggregate the data for a single group on a particular
construct (rwg ), in which case, a separate rwg score
can be derived for each group on each construct [42],
[47]. One other metric, within-and-between analysis,
examines the entire set of constructs for all groups,
indicating whether it is permissible for the overall
data to be aggregated to the group level [45], [35]. We
note that there is no metric for assessing within-group
homogeneity of a single group on multiple constructs.
Thus, it makes no sense for a researcher to state
that a single group was homogeneous across many
constructs, or to report a single metric for inter-coder
reliability for several constructs—although one recent
study did just that [82].
While it is beyond the scope of this paper to define
these measures or to explain their mathematical
formulae, two points worth bearing in mind are: (1)
much contemporary writing already exists on the
legitimacy and statistical methods for aggregating
individual-level data to the group level, and (2) if
no metrics are provided by researchers to support
their claim of within-group homogeneity, then it
makes no sense to aggregate individual data up to
the group level. The mere reporting of Cronbach’s
alpha or Cohen’s kappa coefficients is inadequate,
since these measures have no bearing on the decision
to aggregate individual-level data to the group level.
Without some statistical evidence for within-group
homogeneity, the meaning of data that has been
averaged or summed to the group level is unclear
[38]. Bliese labels such constructs as “fuzzy” [43, p.
369]. The mere fact that some aggregated measure of
individual perceptions may be statistically significant
in explaining other group-level constructs is not
sufficient proof that the construct truly exists at
the group level. Klein et al. are very resolute in
their statement that “within-group agreement is a
prerequisite for the aggregation of the individual-level
data to the group level. In these models, within-group
agreement is more than a statistical hurdle. It is an
integral element in the definition of the group level
construct” [36, p. 4]. This is a critical point. Even if
researchers wish to use aggregated individual data
to test a group-level theory, if the individual-level
scores cannot be shown to exhibit within-group
homogeneity, then the construct makes no sense at
the group level of analysis. Klein et al. assert that:
if the level of statistical analysis matches the
level of theory [e.g., the group level], yet the data
do not conform to the predicted level of theory,
a researcher may draw erroneous conclusions
from the data. The importance of conformity
of the data to theories predicting within-group
homogeneity is relatively well known and well
understood. [22, p. 199]
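To make these distinctions concrete, the following sketch illustrates how two of the all-groups metrics mentioned above, eta-squared and ICC-1, can be computed from individual-level scores on a single construct using the one-way ANOVA decomposition. The data and variable names are hypothetical, and the ICC-1 formula shown assumes equal group sizes; the sketch is an illustration, not a substitute for the formal treatments cited above.

import numpy as np

def eta_squared_and_icc1(scores_by_group):
    """Compute eta-squared and ICC(1) for one construct measured on
    individuals nested in groups, via the one-way ANOVA decomposition.

    scores_by_group: a list with one array of individual scores per group.
    Assumes equal group sizes (for the ICC(1) formula used here).
    """
    groups = [np.asarray(g, dtype=float) for g in scores_by_group]
    all_scores = np.concatenate(groups)
    grand_mean = all_scores.mean()

    k = len(groups)        # number of groups
    n = len(groups[0])     # group size (assumed equal)

    # Between-group and within-group sums of squares
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

    # Eta-squared: proportion of total variance lying between groups
    eta_sq = ss_between / (ss_between + ss_within)

    # ICC(1) from the ANOVA mean squares
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (k * (n - 1))
    icc1 = (ms_between - ms_within) / (ms_between + (n - 1) * ms_within)
    return eta_sq, icc1

# Hypothetical data: four groups of three members rating one item (1-7 scale)
example = [[6, 6, 7], [3, 2, 3], [5, 5, 4], [6, 7, 7]]
print(eta_squared_and_icc1(example))

Because both metrics pool information across all groups, they speak to whether aggregation is defensible for a construct in general, whereas a per-group index such as rwg (illustrated later) speaks to whether a particular group's responses can be treated as shared.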
With this overview of levels of analysis issues and
statistical techniques for assessing within-group
homogeneity, the following section describes our
research methods for examining recent literature on
e-collaboration to determine the extent to which these
levels issues are appropriately managed.
RESEARCH METHODS
To examine the levels of analysis issues described
above, we analyzed recently published research
on e-collaboration. We selected the three leading
IS journals in North America (Information Systems
Research, MIS Quarterly, and Journal of MIS); two
top European journals (European Journal of IS and
Information Systems Journal); and two specialized outlets,
including a technical journal (Decision Support
Systems) and an e-commerce journal (International
Journal of Electronic Commerce). We examined all
papers published in these journals from 1999–2003.
Although some recent review studies focused on just
the three North American journals (for example, [2],
[34]), we included a more comprehensive set of outlets
where studies of e-collaboration are published. Our
data collection period was one year longer than that
of Walczuch and Watson, who conducted a similar
review of whether researchers had appropriately
taken group-level factors into account in their studies
of GSS usage, based on four years of empirical studies
[34].
To identify relevant articles, we used ABI/Inform and
Business Source Premier to search the seven journals,
using terms and phrases such as GSS, electronic
meeting systems, groupware, virtual teams, and
collaborative technology. Next, both authors reviewed
each paper retrieved from the search to ensure
that it was related to the use of IT for collaboration
among individuals in groups, teams, or communities.
We used three attributes of Fjermestad and Hiltz’s
definition to identify studies of e-collaboration, to wit:
First, the study had to be published in a refereed
journal . . . Second, [those we included] . . .
were studies of groups, which we defined as
comprising at least three members . . . Third,
they used a computer-based GDSS or GCSS
[group communication support system] with at
least minimal features designed to support group
communication and decision-making processes.
[28, p. 9]
We specifically did not follow the fourth criterion
specified by Fjermestad and Hiltz—namely, that
the study had to be a controlled experiment [28].
Instead, we deliberately sought diversity in terms of
the research methods employed. In fact, similar to a
recent study that identified three primary research
methodologies that have been employed to investigate
e-collaboration [50], the studies we retrieved employed
a range of experimental, survey, and case research
methods. Despite the variety of research methods
and group technologies employed, the articles that
we retrieved excluded studies conducted at the
organizational level of analysis (e.g., firm-level case
studies of electronic markets, alliances, and supply
chain management [51]–[53] or studies that examined
the adoption of interorganizational collaborative
technologies such as EDI and B2B e-commerce
[54]–[57]).
We found a total of 54 articles that met our search
criteria. This included 37 papers from the three
leading North American Journals, seven papers
from the European journals, and ten from the
specialized publications. The total number of research
articles published in these seven outlets during
1999–2003 was 989. This figure includes only
research articles and research notes, and excludes
research commentaries, issues and opinions pieces,
editorial comments, and book reviews. A total of 388
articles were published in the leading North American
journals, from which nearly 10% (37 papers) matched
our search criteria. The distribution of these 37 articles was 57% from JMIS, 30% from MISQ, and 13% from ISR. The ratios of selected articles to total articles from these three North American general IS outlets were 4.5% from ISR, 12% from JMIS, and 11% from MISQ. These figures are similar to the
proportion of studies that Vessey et al. identified at
the group level of analysis in the same journals for an
earlier time period (which were 10.6%, 15.3%, and 8%
for these three journals, respectively) [58]. Moreover,
the larger number of studies on e-collaboration that
we retrieved from JMIS (21 studies) than from ISR (5
studies) or MISQ (11 studies) is thus consistent with
the findings of Vessey et al. [58], showing that JMIS
published more group-level studies than the other
journals they examined. It was surprising to find so
few qualifying studies in the European journals (just
7 out of 200), in the e-commerce journal (1 out of
100), and in the specialized publication (9 out of 301).
Table I presents the list of the studies that met our
search criteria, organized by publication.
TABLE I
Research articles on e-collaboration published from
1999–2003 by journal title
For each of these 54 studies we identified, we coded
the following information: the type of collaborative
technology examined; the number of groups studied;
group size; total number of individuals studied;
the duration of the study; independent, dependent
and mediating variables; and the analytic methods
employed. In terms of level of analysis issues,
we identified three domains, consistent with the
arguments summarized above [21], [22]: (1) whether
the initial theory and hypotheses were specified at
the group or individual level, (2) the level at which
data were collected, and (3) the level at which
data were analyzed. If the researchers aggregated
individual-level data to the group level, we also noted
whether they included some metrics to establish
within-group homogeneity or interrater reliability
prior to aggregation. Similarly, if the researchers
analyzed their data at the individual level, we noted
whether they controlled for the group to which each
subject was assigned, as recommended by Walczuch
and Watson, and whether the subjects actually
interacted with each other within their groups, or
whether these individuals worked alone. In this
regard, we noted some confusion in the terminology
employed by researchers. In much of the empirical
GSS research, the term group is employed to refer
to the context in which task performance occurs
(e.g., such as electronic brainstormingor other
problem solving). Some researchers, however, employ
the term group where they are simply referring to
the treatment condition to which subjects were
assigned. For example, in some studies, the treatment
conditions were access to videoconferencing versus access to simple textual data. We consider these to be
two different treatment conditions rather than two
groups per se. In some studies, there were multiple
groups within each treatment condition; in other
studies, however, there were no groups, because the
treatment was administered to individual subjects
without any group interaction or communication.
These subjects co-existed within a classroom setting
or virtual community but did not have to produce a
specific joint output (e.g., a group decision or group
report) [34].
Of the total 54 studies, there were 18 “nonconforming
studies,” to use Fjermestad and Hiltz’s terminology,
that we were unable to analyze for levels of analysis
issues [28, p. 77]. These studies met our definition
of e-collaboration, but for various reasons, we were
unable to examine the levels of analysis issues in the
same manner as with the other published studies.
This included two quantitative meta-analyses [19],
[27]; two qualitative literature reviews [28], [29]; two
conceptual papers [59], [60]; two methodological
reviews [61], [62]; a qualitative, comparative case
study analysis [63]; and six case studies that were
conducted within a single group, project team, or firm
[16], [64]–[68]. We also excluded three exploratory
studies for which no a priori theory or hypotheses
were stated [69]–[71], where only descriptive statistics
were presented, rather than the results of any
multivariate analysis or hypothesis testing. The last of these studies collected quantitative and qualitative data from several groups, with the goal of showing how positivist and interpretive analyses of the same data could reveal different insights [71]. Although these three studies did collect and analyze quantitative data from multiple groups using a GSS, they were exploratory: no theory or hypotheses were stated, and only descriptive statistics were reported, so we classified all three as “nonconforming.”
We excluded these 18 studies either because
quantitative meta-analysis studies combine data
from many prior studies or because qualitative case
studies are not susceptible to the level of analysis
concerns that we described above [19], [27]. In this
regard, we concur with Larsen, who noted that
qualitative research has many advantages “due to
its rich description, [but] research developed using
quantitative approaches offer a higher degree of
formalization in application of methods” [72, p.
170]. Finally, for the studies that examined only a
single group or team [16], [64]–[68], it is meaningless
to attempt to compute measures of within-group
homogeneity for a single group [22]. After omitting
these 18 nonconforming studies, there were a total
of 36 empirical studies of e-collaboration remaining,
which both authors read and coded [28].
RESULTS
We analyzed the sample of articles by identifying
two sets of issues regarding congruence of levels.
First, what is the level at which the theory is stated
(the focal unit, per Rousseau [20]), and does it
match the level at which the data were analyzed?
Second, is the level at which the data were analyzed
congruent with the nature of the data themselves? For
instance, a group-level theory is congruent with
group measures that are collected globally (e.g.,
group size, quality of group output, completion time,
or number of unique ideas within a brainstorming
exercise) or with individual-level data that can be
statistically shown to be shared (i.e., homogeneous)
among group members. As Klein et al. described,
it is feasible to posit hypotheses at the group level
and then aggregate individual-level data up to the
group level, but such analyses make sense only if
the researchers can first demonstrate that group
members are more similar to their peers within
the group than to members of other groups [22].
The authors must demonstrate such within-group
homogeneity separately on each measure for which
they wish to aggregate individual-level data up to
the group level. There are a host of techniques for
documenting within-group homogeneity of variance,
as described above (e.g., [35], [43], [45]), and doing
so is a necessary prerequisite to conducting the
analysis at the group level. Based on our review of
these two sets of questions, we identified six distinct
clusters of research articles. Table II summarizes the
criteria describing each cluster, as well as any levels
incongruence, and the studies corresponding to each
cluster.
Of the six clusters of articles that we identified, the
findings from three of these clusters are valid and
trustworthy (clusters 1, 2, and 5), because there is
congruence between the level of analysis at which
the theory is stated (the focal unit), the level at
which data are analyzed, and the nature of the data
themselves [20]. For the other clusters, there is
some form of incongruence—whether incongruence
between the focal unit and the level of data analysis
(cluster 4) or incongruence between the nature of the
data themselves and the methods used to analyze
them (clusters 3 and 6). Below, we describe and give
examples of each cluster, explaining how these studies
exhibit levels of analysis congruence or incongruence.
Cluster 1 consists of studies where the level of
analysis is unquestionably group level. This means
that the focal unit at which the theory and hypotheses
are stated, the nature of the data collected, and the
analysis methods are all at the group level. To belong
to this cluster, all of the data collected and analyzed
should be global constructs measured at the group
level—such as the amount of time to solve a problem
or reach a decision, the total number of unique ideas
generated during brainstorming, or the quality of
the group’s solution. The type of global performance
measure is dependent upon the nature of the task the
group is performing. For tasks with right answers,
the accuracy of the group’s decision is appropriate;
for brainstorming tasks, the total number of unique
ideas generated is a meaningful, global construct;
finally, for judgment tasks, the quality of the group’s
output may be used. Conclusions drawn from such
studies are valid because there is a strong fit between
the group focal unit and the levels at which data were
collected and analyzed. There were three studies that
conformed to cluster 1, based on our search. Adkins
et al. [73] and Benbunan-Fich et al. [74] partially
corresponded to this cluster because most of their
hypotheses (except for one in each case) were at the
group level. In addition, we also placed the study by
Pinsonneault et al. in this cluster because the theory,
the data collection, and the data analysis were all at
the group level [15]. Another study by Barkhi was
identified as also partially conforming to cluster 1
[75]. However, only two of its hypotheses were stated
at the group level of analysis, while a total of four
hypotheses were formulated at the individual level;
thus, we coded this study as belonging to cluster 6.
Cluster 2 consists of valid and trustworthy studies,
where the focal unit of the theory was at the group
level and all data were collected at the individual
level, and then appropriately aggregated up to the
group level before testing group-level hypotheses.
By “appropriately aggregated,” we mean that the
authors specifically conducted and reported their
tests of within-group homogeneity of variance to show
that subjects were indeed more similar to their peers
within a given group than to other subjects across
groups, and that such individual-level data could
be safely aggregated to the group level—usually by
averaging or summing individual survey responses.
We found just two studies belonging to cluster 2:
Piccoli and Ives [76] and Yoo and Alavi [77]. In both
studies, the researchers examined and reported
specific measures of within-group homogeneity (based
on James’s rwg, a form of inter-coder reliability) before
they averaged the individual survey responses up to
the group level [41]. Results from these studies can
be considered valid and trustworthy, at least in terms
of the levels of analysis issues discussed here.
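As an illustration of this aggregation step, the sketch below shows how a single-item rwg can be computed for each group under a uniform null distribution, with individual responses averaged up to the group level only when a conventional cutoff (often 0.70) is cleared. The data, the cutoff, and the function names are hypothetical illustrations, not the specific procedures reported in [76] or [77].

import numpy as np

def rwg_single_item(group_scores, n_options):
    """Single-item r_wg: 1 minus the ratio of the observed within-group
    variance to the variance expected under a uniform null distribution."""
    observed_var = np.var(group_scores, ddof=1)
    null_var = (n_options ** 2 - 1) / 12.0   # uniform distribution on n_options points
    return 1.0 - observed_var / null_var

def aggregate_if_shared(groups, n_options=7, cutoff=0.70):
    """Return group means only for groups whose r_wg clears the cutoff;
    leave the rest unaggregated (None)."""
    results = {}
    for name, scores in groups.items():
        scores = np.asarray(scores, dtype=float)
        rwg = rwg_single_item(scores, n_options)
        results[name] = {
            "rwg": round(rwg, 2),
            "group_mean": float(scores.mean()) if rwg >= cutoff else None,
        }
    return results

# Hypothetical survey responses (7-point scale) for one construct
teams = {"team_A": [6, 6, 7, 6], "team_B": [1, 7, 2, 6]}
print(aggregate_if_shared(teams))

The point of the sketch is simply that the homogeneity check, not the averaging itself, is what licenses analysis of the construct at the group level.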
TABLE II
Cluster classification of research articles on e-collaboration
Cluster 3 represents the first set of problematic
studies—whose results are not guaranteed to be
valid and trustworthy. These are studies where the
focal unit was the group, but where the data were
inappropriately aggregated from the individual level to
the group level before testing a group-level theory or
hypothesis. By “inappropriately aggregated,” we mean
that the authors neglected to conduct or report any
statistical tests of within-group homogeneity, which
are needed to justify aggregation of individual-level
data to the group through averages or sums of
individual-level scores. In the absence of such
evidence of within-group homogeneity, we cannot be
certain whether the group averages or sums of individual subjects’ scores used by the researchers are valid representations of the actual data. There were
eight such studies in which the researchers averaged
(or summed) the individual-level data to the group
level, without providing any justification for doing
so: Huang and Wei [78]; Huang et al. [79]; Kahai
and Cooper [80], [81]; Kayworth and Leidner [82];
Limayem and DeSanctis [83]; Tan et al. [84]; and
Townsend et al. [85]. While in many cases, the authors
found support for their theories or hypotheses, such
conclusions are questionable. In this regard, we
reiterate the observation of Klein et al. regarding the
problematic nature of any such conclusions: “if the
level of statistical analysis matches the level of theory
(i.e., both are at the group level), yet the data do not
conform to the predicted level of theory, a researcher
may draw erroneous conclusions” [22, p. 199].
Cluster 4 represents another set of problematic
studies whose results are not guaranteed to be
valid or trustworthy. These are studies in which
the focal unit was the group, but in which all data
were collected and analyzed at the individual level.
There is a mismatch between the level of the focal
unit (because the hypotheses were formulated at
the group level) and the level at which the data were
collected and analyzed (at the individual level, without
controlling for the different groups). The problem
is that the authors have anthropomorphized some
phenomenon, claiming that a given behavior occurs
at the group level while only providing evidence of a
different (but related) phenomenon at the individual
level. Given this divergence between the group-level
theory and the individual-level data and analysis,
this means that a “misspecification” or “fallacy of
the wrong level” exists [21, p. 5]. Such incongruence
may easily be remedied in one of two ways. First, the
authors may restate their theory and hypotheses to
refer to individuals or “individuals within groups,”
rather than trying to theorize about the behavior of
group entities as a whole [22, p. 198]. The second
way is to retain the theory at the group level and
to continue analyzing individual-level data, but
also include a dummy variable to represent each
group in the statistical analysis, to detect possible
differences between groups. This is the solution
advocated by Walczuch and Watson, who argued
that individual-level ANOVA analyses are improper
in evaluating GSS data, unless the researchers
recognized and controlled for the manner in which
subjects were clustered into groups [34]. We found
three studies corresponding to cluster 4 in our review:
Burke and Chidambaram [86], Grise and Gallupe [87],
and Tung and Quaddus [88]. In the first study, the
authors stated 21 group-level hypotheses, but then
collected and analyzed individual-level data, using
ANOVA, and ignoring any group-level effects—despite
the fact that subjects worked and interacted in
four-person groups [86]. Similarly, the other two
studies formulated their theories at the group level,
but they employed ANOVA analyses to test only
individual-level data, again neglecting to take into
account the fact that individuals had been assigned
to small groups and interacted within those groups
[87], [88]. Such individual-level analytic methods
(e.g., ANOVA) treat all individuals as independent, thus
ignoring the fact that they worked within different
groups—and thus, may have been subject to specific
group-level effects. A simple dummy variable added to
the ANOVA analysis (sometimes called nested-ANOVA
analysis) could easily resolve the problem. Without
somehow controlling for possible group-level effects,
the problem is that individual-level differences may be
statistically supported, but in some cases the effect would disappear if the researchers controlled for group membership. According to
Walczuch and Watson [34], the distortion that results
from ignoring group-level effects when analyzing
such individual-level data is more problematic for
larger-sized groups than for smaller groups [44].
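The following sketch illustrates this remedy with invented data: a naive individual-level ANOVA, which pools group differences into the error term, is contrasted with a nested analysis in which the treatment effect is tested against between-group variation within conditions. All data and variable names are hypothetical; none of the cluster 4 studies are reanalyzed here.

import numpy as np
from scipy import stats

# Hypothetical data: four-person groups nested within two treatment conditions
data = {
    "GSS": {"g1": [6, 5, 6, 7], "g3": [6, 6, 7, 6], "g5": [5, 6, 6, 5]},
    "FTF": {"g2": [4, 5, 4, 4], "g4": [3, 4, 4, 3], "g6": [5, 4, 4, 5]},
}

all_scores = np.array([x for cond in data.values() for g in cond.values() for x in g], dtype=float)
grand_mean = all_scores.mean()
n_conditions = len(data)
n_groups = sum(len(cond) for cond in data.values())
n_total = all_scores.size

ss_condition = 0.0        # variation between treatment conditions
ss_group_in_cond = 0.0    # variation between groups within each condition
ss_within_group = 0.0     # variation among individuals within groups
for cond in data.values():
    cond_scores = np.concatenate([np.asarray(g, dtype=float) for g in cond.values()])
    cond_mean = cond_scores.mean()
    ss_condition += cond_scores.size * (cond_mean - grand_mean) ** 2
    for g in cond.values():
        g = np.asarray(g, dtype=float)
        ss_group_in_cond += g.size * (g.mean() - cond_mean) ** 2
        ss_within_group += ((g - g.mean()) ** 2).sum()

ms_condition = ss_condition / (n_conditions - 1)
ms_group_in_cond = ss_group_in_cond / (n_groups - n_conditions)

# Naive individual-level ANOVA: treats every subject as independent and
# pools all group differences into the error term
ms_error_naive = (ss_group_in_cond + ss_within_group) / (n_total - n_conditions)
f_naive = ms_condition / ms_error_naive
p_naive = stats.f.sf(f_naive, n_conditions - 1, n_total - n_conditions)

# Nested ANOVA: the condition effect is evaluated against between-group
# variation within conditions, reflecting that groups, not individuals,
# were the units assigned to conditions
f_nested = ms_condition / ms_group_in_cond
p_nested = stats.f.sf(f_nested, n_conditions - 1, n_groups - n_conditions)

print("naive  F(%d, %d) = %.2f, p = %.3f" % (n_conditions - 1, n_total - n_conditions, f_naive, p_naive))
print("nested F(%d, %d) = %.2f, p = %.3f" % (n_conditions - 1, n_groups - n_conditions, f_nested, p_nested))

In the nested test, the degrees of freedom reflect the number of groups rather than the number of individuals, which is one reason analyses that ignore group membership tend to overstate statistical significance.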
Cluster 5 consists of studies where the level is
unquestionably individual, even though the study
examines collaborative technologies. This means
that the level at which the theory and hypothesis
are stated, the nature of the data collected, and
the analysis methods all appropriately occur at the
individual level. The results from these studies may
be considered valid and trustworthy. To belong to this
cluster, all of the data collected and analyzed must
be individual-level constructs, and the authors must
have explicitly demonstrated that individual-level
analysis is appropriate (either because the subjects
were not assigned to work together in groups—and
thus, did not interact with each other—or because
the subjects did work in teams, but the authors
tested for and showed that within-group homogeneity
was absent). Given the differences between these
two conditions, we opted to split this cluster into
two sub-clusters: cluster 5a (studies in which the
subjects were not assigned to groups, and thus had
no interaction with other members in the study), and
cluster 5b (in which subjects were assigned to work
in small groups, but where evidence of within-group
homogeneity was explicitly tested for and shown to be
absent). We found six studies conforming to cluster
5a (Garfield et al. [89], Hilmer and Dennis [90],
Khalifa and Kwok [91], Koh and Kim [92], Piccoli et al. [93], and Reinig and Shin [18]), and just one study,
by Alavi et al. [94], conforming to cluster 5b. The
study by Khalifa and Kwok consisted of two separate
empirical studies, one where individual subjects did
not interact within a group (corresponding to cluster
5a), and another where the subjects did interact
within their groups (thus, corresponding to cluster
6). Since this article comprised two empirical
studies within the same paper, we counted it as half
a paper corresponding to cluster 5a and another half
paper corresponding to cluster 6.
Regarding the studies corresponding to cluster 5a,
some novel techniques employed by researchers to
justify their individual levels of analysis are worth
noting. For example, in a recent study conducted by
Garfield et al., the subjects in the study believed that
they were collaborating with other team members in
their group; however, the presumed members were
fictitious [89]. Instead, the researchers employed a
technology called a group simulator which:
looks and acts like a groupware system, but
instead of sharing ideas among participants,
[the simulator] . . . presents participants with
comments that appear to be from other
participants, but which are, in fact, drawn from
a database of preset ideas. Simulators increase
experimental control by enabling a very specific
and precise experimental environment [89, p.
327]
Despite subjects’ beliefs that they were interacting
with other team members via a groupware technology,
these subjects were exposed to controlled, identical
feedback via the group simulator. All subjects were
truly individuals in the experiment, and thus no
group-level effects were possible. In the remaining
studies corresponding to cluster 5a, subjects were
not assigned to groups, but rather were studied
as individuals working alone, often in classroom
settings. In the single remaining study that we
classified as belonging to cluster 5b, Alavi et al.
assigned students to 7–10 member student teams
[94], yet an examination of within-group homogeneity
statistics (based on James’s rwg metric [41]) showed
that subjects within each group were no more similar
on the measured constructs than subjects were
across groups. Having shown that an individual-level analysis was appropriate, the researchers analyzed all data at the individual level (using ANOVA), an approach that conformed to their individual-level hypotheses. In this example, Alavi et al. stated one hypothesis (H3) at the individual level and stated two hypotheses (H1, H2) in such a manner that either an individual-level or a group-level analysis was possible [94].
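To illustrate this kind of check, the following minimal sketch (in Python; the function name, ratings, and scale size are hypothetical and not drawn from [94]) computes the single-item form of James, Demaree, and Wolf's rwg index, which compares the observed within-group variance to the variance expected under a uniform null distribution over the scale points. Values near 1 indicate strong within-group agreement, while values near 0 (or below) indicate its absence, the pattern that justified Alavi et al.'s individual-level analysis.

```python
import numpy as np

def rwg_single_item(ratings, n_scale_points):
    """Single-item r_wg index of within-group agreement (James et al.).
    Compares the observed within-group variance to the variance expected
    if members had responded randomly over the available scale points."""
    observed_var = np.var(ratings, ddof=1)
    null_var = (n_scale_points ** 2 - 1) / 12.0  # variance of a uniform null distribution
    return 1.0 - observed_var / null_var

# Hypothetical example: one 5-person group rating a 7-point Likert item.
group_ratings = [6, 5, 6, 7, 6]
print(round(rwg_single_item(group_ratings, n_scale_points=7), 2))
```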
The results of these studies corresponding to clusters
5a and 5b are valid, because the hypotheses, data
collection, and analysis were all at the individual
level. In summary, there were a total of 6.5 studies
corresponding to cluster 5 (including clusters 5a and 5b).
The studies comprising cluster 6 are similar to those
in cluster 5, with the exception that subjects did
communicate and interact within groups, and thus,
the researchers should have taken group-level effects
into account. Like cluster 5, the cluster 6 studies
were those in which the theory was formulated at
the individual level, and individual-level data were
collected and analyzed. The problem with the studies
corresponding to cluster 6 (which distinguishes them
from cluster 5) is that subjects were assigned to work
in groups, and thus, any individual-level analysis
ignores group-level effects (similar to the problems
noted in cluster 4). What distinguishes the studies in
cluster 6 from those in cluster 4 is that for studies
in cluster 6, the overall theory is formulated at the
individual level (e.g., “members of GSS-supported
groups will exhibit more of some attribute, compared
to members of face-to-face groups”), whereas the
hypotheses in cluster 4 studies are formulated at the
group level. In the studies corresponding to cluster
4, the group level theory or hypotheses often took
the form of statements such as “GSS groups will
exhibit more of [some outcome variable], compared to
face-to-face groups.”
Such cluster 4-type hypotheses concern the attributes or performance of the group entity as a whole, rather than of the individual members comprising the group. For the cluster 6 studies, by contrast, it is uncertain whether the individual-level data should have been analyzed at the individual or the group level, because the authors provided no evidence regarding within-group homogeneity. Without such tests, the researchers' decision to analyze data at the individual level, although consistent with the focal unit of their theories, rests on an untested assumption. We
consider these studies to be problematic because
they ignore possible group-level effects. Given the
fact that these studies do state their theory at the
individual level and conduct their analyses at the
individual level, there is (at least) partial congruence.
The problem is that, absent evidence of within-group homogeneity or heterogeneity, it is unclear whether the appropriate level at which to test the theory is the group or the individual level. By
including such evidence, as did Alavi et al. in the one
study corresponding to cluster 5b, the researchers
could have justified that their individual level of
analysis was appropriate, but they failed to do so
[94]. There were 12 studies that fully corresponded
to cluster 6—over one-third of the total empirical
studies of e-collaboration. This included studies by
Dennis et al. [95]; Dennis and Garfield [96]; Hayne et al. [97]; Hender et al. [98]; Karahanna et al. [99]; Kwok et al. [100]; Kwok, Ma, and Vogel [101]; Lou et al. [102]; Miranda and Bostrom [103]; Reinig [104]; Sia et al. [105]; and Warkentin and Beranek [106]. In
addition, we counted the paper by Khalifa and Kwok
[91] as corresponding one-half to cluster 6, because
one of its two embedded studies featured subjects
interacting within groups. Also, the study by Barkhi
that we mentioned above corresponded primarily to
cluster 6 because most of its hypotheses were stated
at the individual level, and data were analyzed at
the individual level, despite the fact that subjects
interacted within groups [75]. Like the other studies in cluster 6, the study by Barkhi [75] and one of the two embedded studies within Khalifa and Kwok's paper [91] neglected to test for, or to rule out, possible group-level effects (by examining within-group homogeneity or by adding group dummy variables to their ANOVA analyses).
Thus, there were a total of 13.5 studies corresponding
to cluster 6, 37.5% of these empirical studies.
DISCUSSION AND CONTRIBUTIONS
Due to the nature of e-collaboration as a field of inquiry, research in this area is positioned to encounter different multi-level challenges than other IS research domains. Some of the most widely used and tested theories are formulated at the group level, but empirical data are often collected at the individual level and then aggregated to the group level through simple means or sums.
Often, researchers neglect to report whether their
individual-level data are sufficiently homogeneous
within groups and are thus suitable candidates for
aggregation to the group level. We identified two
clusters of what we consider to be highly problematic
studies, which we labeled as clusters 3 and 4. The
results of the studies corresponding to these clusters
cannot be considered valid and trustworthy, but for
different reasons. The eight studies corresponding to
cluster 3 do not provide any evidence of within-group
homogeneity, and therefore the researchers’ decision
to aggregate their individual-level data to the group
level is inappropriate. To even speak of the group-level
construct (e.g., process satisfaction) may be untenable
or moot, if evidence of within-group homogeneity is
lacking [22]. The validity and trustworthiness of the three studies located in cluster 4 are also open to challenge, although here the problem is somewhat different,
namely a mismatch between the level of the theory
(the group level) and the level of the data collection
and analysis (both at the individual level). Taken
together, the studies corresponding to clusters 3 and
4 violate warnings by Rousseau and Klein et al. to
ensure that the level of theory is congruent with the
level at which data are analyzed [20], [22]. In total,
almost one-third of the studies that we found (11
out of 36) corresponded to these highly problematic
clusters.
However, judging by the number of studies classified
as belonging to cluster 6, the most widely encountered
multi-level issue occurs when researchers seek to test
theories formulated at the individual level, and when
they collect and analyze individual-level data without
examining the potential group-level effects. Although
the studies in this cluster do not suffer from the level-mismatch problems that Rousseau warned against, data collected from subjects working in groups violate a key assumption of ANOVA and regression analysis: the scores of members within the same group are likely to be correlated, so the data are not statistically independent [20]. When individuals are assigned to groups
(and thus interact or communicate within these
groups), the data are no longer independent, but
are subject to common group-level effects. When
researchers erroneously analyze such data at the
individual level, they artificially inflate the degrees
of freedom, and hence the likelihood of finding
apparent results by rejecting the null hypothesis
when true individual-level effects may be absent [34],
[44]. In such cases, researchers should account for
group-level effects in order for their results to be valid
and trustworthy. While we do not consider the 13.5
studies corresponding to cluster 6 to be problematic
in exactly the same manner as those in clusters 3 and
4 (because with cluster 6, at least the level of theory
matches the level of data analysis), nevertheless,
researchers are violating assumptions of statistical
independence in their data. To remedy this problem,
they should test to see whether group-level effects
exist before assuming that individual-level analyses
are appropriate. As we described above, there are two
ways to do this. One way is to examine within-group
homogeneity of variance and show that it is absent,
as did Alavi et al. [94]. A second way is to conduct a
nested-ANOVA analysis, adding group-level dummy
variables to control for any possible within-group
effects [34]. Over one-third (13.5 out of 36) of the
empirical studies of e-collaboration that we identified
corresponded to this moderately problematic cluster
6, in which the results should not be considered
reliable and trustworthy.
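To make the second remedy concrete, the following sketch (in Python; the data, column names, and function are hypothetical illustrations rather than the procedure used in any of the reviewed studies) contrasts a nested ANOVA, in which groups are treated as a factor nested within experimental conditions and the condition effect is tested against the group-level mean square, with the naive individual-level F test that inflates the degrees of freedom.

```python
import numpy as np
import pandas as pd

def nested_anova(df, outcome, condition, group):
    """One-way ANOVA with groups nested within conditions.
    Returns the F ratio for the condition effect computed two ways:
    (a) against the group-within-condition mean square (the nested test)
    and (b) against the individual-level error term (the naive test,
    which inflates the degrees of freedom when members' scores are
    correlated within groups)."""
    grand_mean = df[outcome].mean()
    cond_means = df.groupby(condition)[outcome].transform("mean")
    grp_means = df.groupby([condition, group])[outcome].transform("mean")

    n = len(df)
    n_cond = df[condition].nunique()
    n_grp = df.groupby([condition, group]).ngroups

    ss_cond = ((cond_means - grand_mean) ** 2).sum()
    ss_grp = ((grp_means - cond_means) ** 2).sum()    # groups within conditions
    ss_err = ((df[outcome] - grp_means) ** 2).sum()   # individuals within groups

    df_cond, df_grp, df_err = n_cond - 1, n_grp - n_cond, n - n_grp
    ms_cond, ms_grp, ms_err = ss_cond / df_cond, ss_grp / df_grp, ss_err / df_err

    return {"F_nested": ms_cond / ms_grp,   # denominator: group mean square
            "F_naive": ms_cond / ms_err}    # denominator: individual-level error term

# Hypothetical data: two conditions, 4 groups per condition, 5 members per group.
rng = np.random.default_rng(0)
rows = []
for cond in ["GSS", "face-to-face"]:
    for g in range(4):
        group_effect = rng.normal(0, 1)               # shared group-level effect
        for _ in range(5):
            rows.append({"condition": cond,
                         "group": f"{cond}-{g}",
                         "satisfaction": 4 + group_effect + rng.normal(0, 1)})
print(nested_anova(pd.DataFrame(rows), "satisfaction", "condition", "group"))
```

When members' scores are correlated within groups, the naive F ratio will typically exceed the nested one, which is precisely the inflation of apparent results described above.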
Given that about one-third of the studies corresponded
to the highly problematic clusters (clusters 3 and
4), and over one-third (37.5%) corresponded to the
moderately problematic cluster 6, this leaves just under one-third (32%) of the empirical e-collaboration studies from the seven journals that we reviewed as
exhibiting appropriate congruence between the theory
level and the level of data collection and analysis.
These relatively few studies (11.5 papers, to be exact)
were divided among several clusters, including those
in which the focal unit and data were all at the group
level (cluster 1, consisting of three studies); those
in which the focal unit was at the group level, but
where the authors collected individual-level data and
justified their aggregation to the group level (cluster
2, consisting of two studies); and studies in which
the focal unit was the individual, and the subjects
either worked independently, or in which the authors
statistically demonstrated a lack of within-group
homogeneity (cluster 5a, consisting of 5.5 studies,
and cluster 5b, consisting of 1 study, respectively). All
of these studies corresponding to clusters 1, 2, and
5 may be considered valid and trustworthy, at least
in the sense that they exhibit no levels of analysis
incongruence.
The fact that just three studies correspond to
cluster 1 is an indication that researchers are rarely
able to restrict their empirical measures of group
behavior solely to global group constructs (i.e.,
constructs where the property exists for the group
as a whole, and not for its individual members).
While it is common for GSS researchers to feature
some global constructs in their models (e.g., group
size, decision quality, or completion time), there are
very few studies that consist only of such global
constructs. It is more common for researchers to include at least some individual-level constructs (e.g., individual process satisfaction) in their studies to complement the global measures. Indeed, most of these studies employed some individual-level measures, which indicates that researchers are more inclined to collect individual-level data than to measure all group constructs globally, regardless of the level of their theory or of their data analysis. Although researchers
may posit theories and hypotheses at the group
level, they are also likely to gather individual-level
data. Of course, how the researchers choose to treat
such data is critical, whether they first justify that
the data exhibit within-group homogeneity and
thus can be appropriately aggregated to the group
level (as in cluster 2), or instead neglect to consider
levels of analysis concerns by just assuming that
the data can be aggregated to the group level (as
in cluster 3). According to our classification, only
one study totally conformed to cluster 1, while two
were found to partially conform to this cluster. This
means that few studies of e-collaboration can afford
to limit themselves to global group-level constructs,
and underscores the importance of researchers’
understanding what they can and cannot do with
individual-level data that they collect. In such cases,
the researchers may follow appropriate guidelines
from the management literature either for ensuring
that their data can be appropriately aggregated to the
group level, in order to test group-level hypotheses
(cluster 2), or alternatively, they can show that their
study scenario and data are free from group-level
effects and instead examine individual-level
hypotheses (cluster 5). Before considering what
we learned from the three problematic clusters
(clusters 3, 4, and 6), we reiterate the point that
all the empirical studies collected at least some
individual-level data; thus, it is very important for researchers to understand how to treat these data so that they are congruent with the level of data analysis and correspond to the focal unit of their theories. All researchers who collect individual-level data when studying the use of collaboration technologies by individuals, groups, and communities should therefore bear in mind the insights from our study.
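As an illustration of the cluster 2 workflow discussed above, the following sketch (in Python; the variable names, the 7-point scale, and the 0.70 cutoff, a commonly cited though debated rule of thumb for rwg, are assumptions for the example rather than the authors' prescriptions) screens each group for within-group agreement before aggregating individual responses to group means.

```python
import pandas as pd

def aggregate_if_homogeneous(df, item, group, n_scale_points, cutoff=0.70):
    """Illustrative cluster 2 style workflow: compute r_wg for each group on
    one item, then flag which groups show enough within-group agreement to
    justify aggregating individual responses to a group mean."""
    null_var = (n_scale_points ** 2 - 1) / 12.0          # uniform null variance
    per_group = df.groupby(group)[item].agg(["mean", "var", "count"])
    per_group["rwg"] = 1.0 - per_group["var"] / null_var
    per_group["aggregate"] = per_group["rwg"] >= cutoff  # True: safe to use the group mean
    return per_group

# Hypothetical data: three groups rating process satisfaction on a 7-point scale.
data = pd.DataFrame({
    "group": ["A"] * 4 + ["B"] * 4 + ["C"] * 4,
    "process_satisfaction": [6, 6, 7, 6, 2, 7, 1, 6, 5, 5, 4, 5],
})
print(aggregate_if_homogeneous(data, "process_satisfaction", "group", n_scale_points=7))
```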
Overall, over two-thirds of the studies that we analyzed
(24.5 out of 36 studies, or 68%) corresponded to one
of the three problematic clusters (3, 4, or 6) for which
the validity of results and conclusions should be
considered questionable. Within this group of studies,
however, we have distinguished between studies
that are highly problematic (30.5% corresponding
to clusters 3 and 4) and those that are moderately
problematic (37.5% corresponding to cluster 6).
Given this surprisingly large fraction of studies
whose statistical results are thus open to question,
it seems likely that the persistent confusion and inconsistent results that have characterized the
e-collaboration literature over the past two decades
are due, at least in part, to the multi-level problems
described above. By creating awareness of these
multi-level issues, the IS research community can be
prompted to pay greater attention to the important
conceptual issues involving levels of analysis [20],
[22], [24], and the available statistical techniques for
measuring within-group homogeneity [35], [41], [42],
[45], [47], some of which have been in circulation
for two decades or more. Just as the IS literature
on firm-level payoffs from IT investments began to
exhibit greater consistency once the levels of analysis
confusion described by Brynjolfsson was resolved (in
addition to other methodological advances), we believe
that IS researchers who study group-level IT use may
benefit from the insights described here [5].
The contributions of this paper are three-fold. First,
we have offered an alternative explanation for why
studies of e-collaboration present divergent and
inconsistent results. Unlike other voices in the literature that have attempted to explain how the problem of inconsistent findings might be solved, we do not propose alternative epistemologies [16], new theoretical lenses [30], [31], or new constructs [32], [33], nor do we advocate longitudinal studies of groups [16], [18].
Instead, we link the problem of inconsistent results
of e-collaboration with various forms of incongruence
between the level at which theory is stated (the focal
unit) and the levels at which data are collected and
analyzed. The second contribution of our work is to
empirically demonstrate the extent of the problem of
levels incongruence: we found that only 32% of the
empirical studies in seven leading international IS
journals were without any levels of analysis concerns
(11.5 studies out of 36), while nearly one-third of the
studies were highly problematic (11 out of 36), and
over one-third were moderately problematic (13.5 out
of 36, or 37.5%). We consider this strong evidence of
the severity of the problem of levels incongruence in
e-collaboration research, and we strongly encourage
IS researchers who study e-collaboration to read the
seminal studies that explain the conceptual issues
involving levels of analysis, particularly some classic
pieces by Rousseau and her colleagues [40], [46], [76]
and Klein and her colleagues [22]–[24]. Of course,
the insights from our work may be of value not only to researchers who empirically study e-collaboration but also to peer reviewers and journal editors who evaluate the merits of this research domain. We hope that this special issue of IEEE TRANSACTIONS ON PROFESSIONAL COMMUNICATION can serve as a useful vehicle for our third contribution: a framework for classifying studies of GSS and e-collaboration that will be of value to reviewers and journal editors, as well as to researchers.
CONCLUSION
There has been a proliferation of competing
explanations regarding the inconsistent results
reported by the e-collaboration literature since its
inception. This study advances another possible
explanation by investigating the range of multi-level
issues that can be encountered in this research. In
order to avoid problems of levels incongruence, the level of the theory (the focal unit), the level of data collection, and the level of data analysis must be consistent with each other. Our analysis of 36 studies
of e-collaboration published in several IS journals
in the last five years found that over two-thirds
of these studies contain one or more problems of
levels incongruence that cast doubts on the validity
of their findings. It is indeed possible that these
methodological problems are in part responsible
for the inconsistency of the results reported in this
literature, especially since a researcher’s decision
to analyze data at the individual level, even when
the research setting features individuals working
in groups, will artificially increase the likelihood of
finding artifactual or inaccurate results [34]. Such
an outcome may have occurred in some of the
studies that we classified as belonging to cluster
6. Alternatively, the studies that inappropriately
analyzed their data at the group level of analysis
(based on aggregated individual-level data) would
be less likely to find support for their hypotheses
than if the data had been analyzed individually
(e.g., studies corresponding to cluster 3). While
we cannot definitively state whether any specific
author’s results from the studies that we classified
into clusters 3, 4, or 6 are valid or not, the possibility
that the researchers may have reached inappropriate
conclusions due to levels of analysis incongruence
means that such findings should be considered
cautiously, and scholars should not be surprised by
the lack of consistency across studies [11], [16]. By
reflecting more consciously on these levels of analysis
issues described in this paper, we believe that IS
researchers who study IT use for e-collaboration will
be better able to build a more solid, consistent, and
trustworthy body of findings in the future.
REFERENCES
[1] H. C. Lucas, Why Information Systems Fail. New York: Columbia Univ. Press, 1974.
[2] Y. E. Chan, “IT value: The great divide between qualitative and quantitative and individual and organizational
measures,” J. Manage. Inform. Syst., vol. 16, no. 4, pp. 225–261, 2000.
[3] S. S. Roach, “America’s technology dilemma: A profile of the information economy,” Morgan Stanley Special
Economic Study, Apr. 1987.
[4] S. S. Roach, “Services under siege: The restructuring imperative,” Harvard Bus. Rev., pp. 82–92, Sept.–Oct. 1991.
[5] E. Brynjolfsson, “The productivity paradox of information technology,” Commun. ACM, vol. 36, no. 12, pp.
67–77, 1993.
[6] G. M. Loveman, “An assessment of the productivity of information technology,” in Information Technology and
the Corporation of the 1990s., T. Allen and M. S. S. Morton, Eds. Cambridge, MA: MIT Press, 1994.
[7] E. Brynjolfsson and L. M. Hitt, “Paradox lost? Firm-level evidence on the returns to information systems
spending,” Manage. Sci., vol. 42, no. 4, pp. 541–558, 1996.
[8] E. Brynjolfsson and L. M. Hitt, “Beyond computation: Information technology, organizational transformation, and business performance,” Journal of Economic Perspectives, vol. 14, no. 4, pp. 23–48, 2000.
[9] J. Dedrick, V. Gurbaxani, and K. L. Kraemer, “Information technology and economic performance: A critical
review of the empirical evidence,” ACM Computing Surveys, vol. 35, no. 1, pp. 1–28, 2003.
[10] L. M. Hitt and E. Brynjolfsson, “Productivity, business profitability, and consumer surplus: Three different
measures of information technology value,” MIS Quart., vol. 20, no. 2, pp. 121–142, 1996.
[11] D. Robey and M.-C. Boudreau, “Accounting for the contradictory organizational consequences of information
technology: Theoretical directions and methodological implications,” Inform. Syst. Res., vol. 10, no. 2, pp.
167–186, 1999.
[12] C. Soh and L. M. Markus, “How IT creates business value: A process theory synthesis,” in Proc. 16th Int. Conf.
Inform. Syst., J. I. DeGross, G. Ariav, C. Beath, R. Hoyer, and K. Kemerer, Eds., Amsterdam, The Netherlands,
Dec. 1995, pp. 29–41.
[13] A. Barua, C. H. Kriebel, and T. Mukhopadhyay, “Information technologies and business value: An analytic and
empirical investigation,” Inform. Syst. Res., vol. 6, no. 2, pp. 3–23, 1995.
[14] T. F. Bresnahan, E. Brynjolfsson, and L. M. Hitt, “Information technology, workplace organization, and the
demand for skilled labor: Firm-level evidence,” Quart. J. Econ., vol. 117, no. 1, pp. 339–370, 2002.
[15] A. Pinsonneault, H. Barki, R. B. Gallupe, and N. Hoppen, “Electronic brainstorming: The illusion of
productivity,” Inform. Syst. Res., vol. 10, no. 2, pp. 110–133, 1999.
[16] A. Gopal and P. Prasad, “Understanding GDSS in symbolic context: Shifting the focus from technology to
interaction,” MIS Quart., vol. 24, no. 3, pp. 509–546, 2000.
[17] M. L. Markus and D. Robey, “Information technology and organizational change: Causal structure in theory
and research,” Manage. Sci., vol. 34, no. 5, pp. 583–598, 1988.
[18] B. A. Reinig and B. Shin, “The dynamic effects of group support systems on group meetings,” J. Manage.
Inform. Syst., vol. 19, no. 2, pp. 303–325, 2002.
[19] A. R. Dennis, B. H. Wixom, and R. J. Vandenberg, “Understanding fit and appropriation effects in group
support systems via meta-analysis,” MIS Quart., vol. 25, no. 2, pp. 167–193, 2001.
[20] D. M. Rousseau, “Issues in level in organizational research: Multi level and cross level perspectives,” in
Research in Organizational Behavior, L. L. Cummings and B. M. Staw, Eds. Greenwich, CT: JAI Press,
1985, vol. 7, pp. 1–37.
[21] D. M. Rousseau, “Characteristics of departments, positions and individuals: Contexts for attitudes and behavior,” Admin. Sci. Quart., vol. 23, no. 4, pp. 521–540, 1978.
[22] K. J. Klein, F. Dansereau, and R. J. Hall, “Levels issues in theory development, data collection, and analysis,”
Acad. Manage. Rev., vol. 19, no. 2, pp. 195–229, 1994.
[23] K. J. Klein and S. W. J. Kozlowski, Eds., Multilevel Theory, Research, and Methods in Organizations. San
Francisco, CA: Jossey-Bass, 2000.
[24] K. J. Klein, H. Tosi, and A. A. Cannella, “Multilevel theory building: Benefits, barriers and new developments,”
Acad. Manage. Rev., vol. 24, no. 2, pp. 243–248, 1999.
[25] R. Kohli, “In search of IT business value: Do measurement levels make a difference?,” in Proc. 9th Amer.
Conf. Inform. Syst., 2003, pp. 1465–1468.
[26] M. S. Poole and A. Van de Ven, “Using paradox to build management and organization theories,” Acad. Manage.
Rev., vol. 14, no. 4, pp. 562–580, 1989.
[27] A. R. Dennis and B. R. Wixom, “Investigating the moderators of the group support systems use with
meta-analysis,” J. Manage. Inform. Syst., vol. 18, no. 3, pp. 235–258, 2001/2002.
[28] J. Fjermestad and S. R. Hiltz, “An assessment of group support systems experimental research: Methodology
and results,” J. Manage. Inform. Syst., vol. 15, no. 3, pp. 7–149, 1998/1999.
[29] J. Fjermestad and S. R. Hiltz, “Group support systems: A descriptive evaluation of case and field studies,” J. Manage. Inform. Syst., vol. 17, no. 3, pp. 115–159, 2000/2001.
[30] G. DeSanctis and M. S. Poole, “Capturing the complexity in advanced technology use: Adaptive structuration
theory,” Org. Sci., vol. 5, no. 2, pp. 121–147, 1994.
[31] I. Zigurs and B. K. Buckland, “A theory of task/technology fit and group support systems effectiveness,” MIS
Quart., vol. 22, no. 3, pp. 313–334, 1998.
[32] W. W. Chin, A. Gopal, and W. D. Salisbury, “Advancing the theory of adaptive structuration: Development of a
scale to measure faithfulness of appropriation,” Inform. Syst. Res., vol. 8, no. 4, pp. 342–367, 1997.
[33] W. D. Salisbury, W. W. Chin, A. Gopal, and P. R. Newsted, “Better theory through measurement: Developing a
scale to capture consensus on appropriation,” Inform. Syst. Res., vol. 13, no. 1, pp. 91–105, 2002.
[34] R. M. Walczuch and R. T. Watson, “Analyzing group data in MIS research: Including the effect of the group,”
Group Decision Negot., vol. 10, no. 1, pp. 83–94, 2001.
[35] K. J. Klein, P. D. Bliese, S. W. Kozlowski, F. Dansereau, M. B. Gavin, M. A. Griffin, D. A. Hofmann, L. R.
James, F. J. Yammarino, and M. C. Bligh, “Multilevel analytical techniques: Commonalities, differences, and
continuing questions,” in Multilevel Theory, Research, and Methods in Organizations, K. J. Klein and S. W.
Kozlowski, Eds. San Francisco, CA: Jossey-Bass, 2000, pp. 512–553.
[36] K. J. Klein, A. B. Conn, D. B. Smith, and J. S. Sorra, “Is everyone in agreement? An exploration of within-group
agreement in employee perceptions of the work environment,” J. Appl. Psych., vol. 86, no. 1, pp. 3–14, 2001.
[37] S. W. Kozlowski and K. Hattrup, “A disagreement about within-group agreement: Disentangling issues of
consistency versus consensus,” J. Appl. Psych., vol. 77, no. 2, pp. 161–167, 1992.
[38] S. W. Kozlowski and K. J. Klein, “A multilevel approach to theory and research in organizations,” in Multilevel
Theory, Research, and Methods in Organizations, K. J. Klein and S. W. J. Kozlowski, Eds. San Francisco, CA:
Jossey-Bass, 2000, pp. 3–90.
[39] F. Dansereau, F. J. Yammarino, and J. C. Kohles, “Multiple levels of analysis from a longitudinal perspective,”
Acad. Manage. Rev., vol. 24, no. 2, pp. 346–357, 1999.
[40] K. H. Roberts, C. L. Hulin, and D. M. Rousseau, Developing an Interdisciplinary Science of Organizations. San
Francisco, CA: Jossey Bass, 1978.
[41] L. R. James, “Aggregation bias in estimates of perceptual agreement,” J. Appl. Psych., vol. 67, no. 2, pp.
219–229, 1982.
[42] L. R. James, R. G. Demaree, and G. Wolf, “Estimating within-group interrater reliability with and without
response bias,” J. Appl. Psych., vol. 69, no. 1, pp. 85–98, 1984.
[43] P. D. Bliese, “Within-group agreement, nonindependence, and reliability: Implications for data aggregation and
analysis,” in Multilevel Theory, Research, and Methods in Organizations, K. J. Klein and S. W. J. Kozlowski,
Eds. San Francisco, CA: Jossey-Bass, 2000, pp. 349–381.
[44] P. D. Bliese and R. H. Halverson, “Group size and measures of group level properties: An examination of
eta-squared and ICC Values,” J. Manage., vol. 24, no. 2, pp. 157–172, 1998.
[45] S. L. Castro, “Data analytic methods for the analysis of multilevel questions: A comparison of intraclass
correlation coefficients, rwg, hierarchical linear modeling, WABA, and random group resampling,” Leadership
Quart., vol. 13, no. 1, pp. 69–93, 2002.
[46] R. House, D. M. Rousseau, and M. Thomas-Hunt, “The meso paradigm: A framework for integration of micro
and macro organizational behavior,” in Research in Organizational Behavior, vol. 17, L. L. Cummings and
B. Staw, Eds. Greenwich, CT, 1995, pp. 71–114.
[47] L. R. James, R. G. Demaree, and G. Wolf, “rwg: An assessment of within-group interrater agreement,” J. Appl.
Psych., vol. 78, no. 2, pp. 306–309, 1993.
[48] J. M. Cortina, “Big things have small beginnings: An assortment of ‘minor’ methodological misunderstandings,”
J. Manage., vol. 28, no. 3, pp. 339–362, 2002.
[49] D. A. Waldman and F. J. Yammarino, “CEO charismatic leadership: Levels-of-management and
levels-of-analysis effects,” Acad. Manage. Rev., vol. 24, no. 2, pp. 266–285, 1999.
[50] N. Kock, “Action research: Lessons learned from a multi-iteration study of computer-mediated communication
in groups,” IEEE Trans. Profess. Commun., vol. 46, no. 2, pp. 105–120, 2003.
[51] E. Christiaanse and N. Venkatraman, “Beyond Sabre: An empirical test of expertise exploitation in electronic
channels,” MIS Quart., vol. 26, no. 1, pp. 15–38, 2002.
[52] H. G. Lee, T. Clark, and K. Y. Tam, “Research report. Can EDI benefit adopters?,” Inform. Syst. Res., vol.
10, no. 2, pp. 186–195, 1999.
[53] G. E. Truman, “Integration in electronic exchange environments,” J. Manage. Inform. Syst., vol. 17, no. 1, pp.
209–244, 2000.
[54] P. Chwelos, I. Benbasat, and A. S. Dexter, “Research report: Empirical test of an EDI adoption model,” Inform.
Syst. Res., vol. 12, no. 3, pp. 304–321, 2001.
[55] R. J. Kauffman, J. McAndrews, and Y.-M. Wang, “Opening the ‘black box’ of network externalities in network
adoption,” Inform. Syst. Res., vol. 11, no. 1, pp. 61–94, 2000.
[56] H. H. Teo, K. K. Wei, and I. Benbasat, “Predicting intention to adopt interorganizational linkages: An
institutional perspective,” MIS Quart., vol. 27, no. 1, pp. 19–50, 2003.
[57] Y. A. Au and R. J. Kauffman, “Should we wait? Network externalities, compatibility, and electronic billing
adoption,” J. Manage. Inform. Syst., vol. 18, no. 2, pp. 47–63, 2001.
[58] I. Vessey, V. Ramesh, and R. L. Glass, “Research in information systems: An empirical study of diversity in the
discipline and its journals,” J. Manage. Inform. Syst., vol. 19, no. 2, pp. 129–174, 2002.
[59] R. B. Johnston and S. Gregor, “A theory of industry-level activity for understanding the adoption of
interorganizational systems,” Eur. J. Inform. Syst., vol. 9, no. 4, pp. 243–251, 2000.
[60] A. Morton, F. Ackerman, and V. Belton, “Technology-driven and model-driven approaches to group decision
support: Focus, research philosophy, and key concepts,” Eur. J. Inform. Syst., vol. 12, no. 2, pp. 110–126, 2003.
[61] M. Mandviwalla and S. Khan, “Collaborative object workspaces (COWS): Exploring the integration of
collaboration technology,” Decision Support Syst., vol. 27, no. 3, pp. 241–254, 1999.
[62] M. J. McQuaid, T.-H. Ong, H. Chen, and J. F. Nunamaker, “Multidimensional scaling for group memory
visualization,” Decision Support Syst., vol. 27, no. 1–2, pp. 163–176, 1999.
[63] A. R. Dennis, T. Carte, and G. G. Kelly, “Breaking the rules: Success and failure in groupware-supported
business process reengineering,” Decision Support Syst., vol. 36, no. 1, pp. 31–47, 2003.
[64] R. O. Briggs, M. Adkins, D. Mittleman, J. Kruse, S. Miller, and J. F. Nunamaker, “A technology transition model
derived from field investigation of GSS use,” J. Manage. Inform. Syst., vol. 15, no. 3, pp. 151–196, 1998/1999.
[65] R. O. Briggs, G-J. De Vreede, and J. F. Nunamaker, “Collaboration engineering with thinklets to pursue
sustained success with group support systems,” J. Manage. Inform. Syst., vol. 19, no. 4, pp. 31–64, 2003.
[66] A. Majchrzak, R. Rice, A. Malhotra, and N. King, “Technology adaptation: The case of a computer-supported
inter-organizational virtual team,” MIS Quart., vol. 24, no. 4, pp. 569–600, 2000.
[67] A. Malhotra, A. Majchrzak, R. Carman, and V. Lott, “Radical innovation without collocation: A case study at
Boeing-Rocketdyne,” MIS Quart., vol. 25, no. 2, pp. 229–249, 2001.
[68] J. Scott, “Facilitating interorganizational learning with information technology,” J. Manage. Inform. Syst., vol.
17, no. 2, pp. 81–113, 2000.
[69] G.-J. De Vreede, N. Jones, and R. J. Mgaya, “Exploring the application and acceptance of group support
systems in Africa,” J. Manage. Inform. Syst., vol. 15, no. 3, pp. 197–234, 1998/1999.
[70] A. P. Massey, M. M. Montoya-Weiss, and Y. Hung, “Because time matters: Temporal coordination in global
virtual project teams,” J. Manage. Inform. Syst., vol. 19, no. 4, pp. 129–155, 2003.
[71] E. M. Trauth and L. M. Jessup, “Understanding computer-mediated discussions: Positivist and interpretive
analyses of group support system use,” MIS Quart., vol. 24, no. 1, pp. 43–79, 2000.
[72] K. R. T. Larsen, “A taxonomy of antecedents of information systems success: Variable analysis studies,” J.
Manage. Inform. Syst., vol. 20, no. 2, pp. 160–246, 2003.
[73] M. Adkins, M. Burgoon, and J. F. Nunamaker, “Using group support systems for strategic planning with the
United States air force,” Decision Support Syst., vol. 34, no. 3, pp. 315–337, 2003.
[74] R. Benbunan-Fich, S. R. Hiltz, and M. Turoff, “A comparative content analysis of face-to-face vs. asynchronous
group decision making,” Decision Support Syst., vol. 34, no. 4, pp. 457–469, 2003.
[75] R. Barkhi, “The effects of decision guidance and problem modeling on group decision-making,” J. Manage.
Inform. Syst., vol. 18, no. 3, pp. 259–283, 2001.
[76] G. Piccoli and B. Ives, “Trust and the unintended effects of behavior control in virtual teams,” MIS Quart., vol.
27, no. 3, pp. 365–393, 2003.
[77] Y. Yoo and M. Alavi, “Media and group cohesion: Relative influences on social presence, task participation,
and group consensus,” MIS Quart., vol. 25, no. 3, pp. 371–390, 2001.
[78] W. W. Huang and K. K. Wei, “An empirical investigation of the effects of group support systems (GSS) and
task type on group interactions from an influence perspective,” J. Manage. Inform. Syst., vol. 17, no. 2, pp.
181–206, 2000.
[79] W. W. Huang, K. K. Wei, R. T. Watson, and B. Tan, “Supporting virtual team-building with a GSS: An empirical
investigation,” Decision Support Syst., vol. 34, no. 4, pp. 359–367, 2003.
[80] S. S. Kahai and R. B. Cooper, “Exploring the core concepts of media richness theory: The impact of cue
multiplicity and feedback immediacy on decision quality,” J. Manage. Inform. Syst., vol. 20, no. 1, pp. 263–300,
2003.
[81] S. S. Kahai and R. B. Cooper, “The effect of computer-mediated communication on agreement and acceptance,” J. Manage. Inform. Syst., vol. 16, no. 1, pp. 165–188, 1999.
[82] T. R. Kayworth and D. E. Leidner, “Leadership effectiveness in global virtual teams,” J. Manage. Inform. Syst.,
vol. 18, no. 3, pp. 7–40, 2001.
[83] M. Limayem and G. DeSanctis, “Providing decisional guidance for multicriteria decision making in groups,”
Inform. Syst. Res., vol. 11, no. 4, pp. 386–401, 2000.
[84] B. Tan, K. K. Wei, and J-E. Lee-Partridge, “Effects of facilitation and leadership on meeting outcomes in a group
support system environment,” Eur. J. Inform. Syst., vol. 8, no. 4, pp. 233–246, 1999.
[85] A. M. Townsend, S. M. Demarie, and A. R. Hendrickson, “Desktop video conferencing in virtual workgroups:
Anticipation, system evaluation and performance,” Inform. Syst. J., vol. 11, no. 3, pp. 213–227, 2001.
[86] K. Burke and L. Chidambaram, “How much bandwidth is enough? A longitudinal examination of media
characteristics and group outcomes,” MIS Quart., vol. 23, no. 4, pp. 557–579, 1999.
[87] M. Grise and B. Gallupe, “Information overload: Addressing the productivity paradox in face-to-face electronic
meetings,” J. Manage. Inform. Syst., vol. 16, no. 3, pp. 157–185, 1999.
[88] L. L. Tung and M. A. Quaddus, “Cultural differences explaining the differences in results in GSS: Implications
for the next decade,” Decision Support Syst., vol. 33, no. 2, pp. 177–199, 2002.
[89] M. J. Garfield, N. J. Taylor, A. R. Dennis, and J. W. Satzinger, “Modifying paradigms: Individual differences,
creativity techniques, and exposure to ideas in group idea generation,” Inform. Syst. Res., vol. 12, no. 3, pp.
322–333, 2001.
[90] K. Hilmer and A. R. Dennis, “Stimulating thinking: Cultivating better decisions with groupware through
categorization,” J. Manage. Inform. Syst., vol. 17, no. 3, pp. 93–114, 2000.
[91] M. Khalifa and R. Kwok, “Remote learning technologies: Effectiveness of hypertext and GSS,” Decision Support
Syst., vol. 26, no. 3, pp. 195–207, 1999.
[92] J. Koh and Y-G. Kim, “Sense of virtual community: A conceptual framework and empirical validation,” Int. J.
Electron. Commerce, vol. 8, no. 2, pp. 75–93, 2003/2004.
[93] G. Piccoli, R. Ahmad, and B. Ives, “Web-based virtual learning environments: A research framework and a
preliminary assessment of effectiveness in basic IT skills training,” MIS Quart., vol. 25, no. 4, pp. 401–426, 2001.
[94] M. Alavi, G. M. Marakas, and Y. Yoo, “A comparative study of distributed learning environments on learning
outcomes,” Inform. Syst. Res., vol. 13, no. 4, pp. 404–415, 2002.
[95] A. R. Dennis, J. E. Aronson, W. G. Heninger, and E. D. Walker, “Structuring time and task in electronic
brainstorming,” MIS Quart., vol. 23, no. 1, pp. 95–108, 1999.
[96] A. R. Dennis and M. J. Garfield, “The adoption and use of GSS in project teams: Toward more participative
processes and outcomes,” MIS Quart., vol. 27, no. 2, pp. 289–323, 2003.
[97] S. C. Hayne, C. E. Pollard, and R. E. Rice, “Identification of comment authorship in anonymous group support
systems,” J. Manage. Inform. Syst., vol. 20, no. 1, pp. 301–330, 2003.
[98] J. M. Hender, D. L. Dean, T. L. Rodgers, and J. F. Nunamaker, “An examination of the impact of stimuli type
and GSS structure on creativity: Brainstorming versus nonbrainstorming techniques in a GSS environment,” J.
Manage. Inform. Syst., vol. 18, no. 4, pp. 59–86, 2002.
[99] E. Karahanna, M. Ahuja, M. Srite, and J. Galvin, “Individual differences and relative advantage: The case of
GSS,” Decision Support Syst., vol. 32, no. 4, pp. 327–341, 2002.
[100] R. Kwok, J.-N. Lee, M. Huynh, and S.-M. Pi, “Role of GSS on collaborative problem-based learning: A study on
knowledge externalization,” Eur. J. Inform. Syst., vol. 11, no. 2, pp. 98–107, 2002.
[101] R. Kwok, J. Ma, and D. R. Vogel, “Effects of group support systems and content facilitation on knowledge
acquisition,” J. Manage. Inform. Syst., vol. 19, no. 3, pp. 185–229, 2002.
[102] H. Lou, W. Luo, and D. Strong, “Perceived critical mass effect on groupware acceptance,” Eur. J. Inform.
Syst., vol. 9, no. 2, pp. 91–103, 2000.
[103] S. M. Miranda and R. P. Bostrom, “Meeting facilitation: Process versus content interventions,” J. Manage.
Inform. Syst., vol. 15, no. 4, pp. 89–114, 1999.
[104] B. A. Reinig, “Toward an understanding of satisfaction with the process and outcomes of teamwork,” J.
Manage. Inform. Syst., vol. 19, no. 4, pp. 65–83, 2003.
[105] C. L. Sia, B.C.Y. Tan, and K. K. Wei, “Group polarization and computer-mediated communication: Effect of
communication cues, social presence and anonymity,” Inform. Syst. Res., vol. 13, no. 1, pp. 70–90, 2002.
[106] M. Warkentin and P. M. Beranek, “Training to improve virtual team communication,” Inform. Syst. J., vol.
9, no. 4, pp. 271–289, 1999.
Michael Gallivan is an Associate Professor in the Computer Information Systems Department at Georgia State University
in the Robinson College of Business. He conducts research on human resource practices for managing IT professionals, as
well as strategies for managing effective IT implementation, IT outsourcing, and interorganizational alliances. He received his
Ph.D. from MIT Sloan School of Management. His research has appeared in Database for Advances in IS, Information Systems
Journal, Information Technology & People, Information & Management, Information and Organization, and IEEE TRANSACTIONS ON PROFESSIONAL COMMUNICATION.
Raquel Benbunan-Fich is at the Computer Information Systems Department of the Zicklin School of Business, Baruch
College, City University of New York. Her research interests include computer-mediated communications, group collaboration
and e-commerce. She received her Ph.D. from Rutgers University. She has published articles in Communications of the ACM,
Decision Support Systems, Information & Management, International Journal of Electronic Commerce, IEEE TRANSACTIONS ON PROFESSIONAL COMMUNICATION, and other journals.