Essays on Social Networks and Information Worker Productivity
Lynn Wu
M.Eng. in Electrical Engineering and Computer Science, MIT, 2003
B.S. in Electrical Engineering and Computer Science, MIT, 2003
B.S. in Management Science, MIT, 2002
Submitted to the Sloan School of Management
in partial fulfillment of the requirements for the degree of
Doctor of Philosophy in Management
at the
Massachusetts Institute of Technology
June 2011
© 2011 Lynn Wu. All Rights Reserved.
The author hereby grants to MIT permission to reproduce and
to distribute publicly paper and electronic copies of this thesis document
in whole or in part in any medium now known or hereafter created.
Signature of the Author
MIT Sloan School of Management
May 8, 2011
Certified by
Erik Brynjolfsson
Schussel Family Professor
Thesis Supervisor
Accepted by
Roberto Fernandez
William F. Pounds Professor in Management
Chair of Ph.D. Program, MIT Sloan School of Management
Essays on Social Networks and Information Worker Productivity
Lynn Wu
Submitted to the Alfred P. Sloan School of Management in Partial Fulfillment of the
Requirements for the Degree of
Doctor of Philosophy in Management Science
In this thesis, I examine how information, information technology, and social networks affect
information worker productivity. The work is divided into three essays based on tracking
detailed communication patterns of information workers in the high-tech industry.
Essay 1: "Social Network Effects on Performance and Layoffs: Evidence from the Adoption of a
Social Networking Tool." By studying the changes in employees' networks and performance
before and after the introduction of a social networking tool, I find that a structurally diverse
network (low in cohesion and rich in structural holes) has a positive effect on work performance.
The size of the effect is smaller than traditional estimates, suggesting that omitted individual
characteristics may bias the estimated network effect. I consider two intermediate mechanisms
by which a structurally diverse network is theorized to improve work performance, information
diversity (instrumental) and social communication (expressive), and quantify their effects on
two types of work outcomes: billable revenue and layoffs. Analysis shows that the information
diversity derived from a structurally diverse network is more correlated with generating billable
revenue than is social communication. However, the opposite is true for layoffs. Friendship, as
approximated by social communication, is more correlated with reduced layoff risks than is
information diversity. Field interviews suggest that friends can serve as advocates in critical
situations, ensuring that favorable information is distributed to decision makers. This, in turn,
suggests that having a structurally diverse network can drive both work performance and job
security, but that there is a tradeoff between mobilizing friendship and gathering diverse
information.
Essay 2: "Identification of Influence: An Experimental Platform for Understanding the
Relationship between Social Networks and Performance." This study creates an experimental
platform for identifying the relationship between social networks and performance. While a
large body of literature has examined the correlations between certain network topologies and
performance, little research has shown a definitive causal linkage. I address this problem
through conducting three sets of randomized field experiments using an on-line experimental
platform at a large information technology firm. The platform enables randomly selected
employees to achieve certain network characteristics. By examining work performance before
and after the experiment, I plan to show the causal relationship between networks and
performance.
Essay 3: "Water Cooler Networks: Performance Implications of Informal Face-to-Face
Interaction Structures in Information-Intensive Work." This study examines the performance
characteristics of face-to-face interaction networks and finds that their structural properties are
important for effective knowledge transfer and productivity. We argue that network theory
should incorporate the implications of media choice, and particularly differences between face-
to-face and electronic communication, when assessing how networks affect individual
performance. We introduce a new methodology, using Sociometric badges, to record precise
data on face-to-face interaction networks for a group of workers in a large IT manufacturing
firm over a one-month period. Linking these data to detailed performance metrics, we find that
1) network cohesion is associated with higher worker productivity, in contrast to previous
findings in email data; 2) cohesion in face-to-face networks is associated with even higher
performance during complex tasks, suggesting that cohesion complements information-rich
media for transferring the complex knowledge needed to complete such tasks; 3) while
information-seeking from many colleagues creates disruptions, more interactions with a few key
strong-tie informants speeds up work. Face-to-face networks have more explanatory power than
physical-proximity networks, suggesting that information flows in actual conversations (rather
than individuals' correlated exposure to common environmental factors through physical
proximity) are driving our results. These results augment our understanding of how media
choice and network structure interact, shedding light on the organizational effects of
face-to-face interaction. The methods and techniques we introduce are replicable, creating
opportunities for new lines of research into the consequences of face-to-face interaction in
organizations.
Erik Brynjolfsson (Chair)
Director, Center for Digital Business & Schussel Family Professor
MIT Sloan School of Management
Roberto Fernandez
William F. Pounds Professor in Management & Professor of Organization Studies
MIT Sloan School of Management
Ray Reagans
Alfred P. Sloan Professor of Management
MIT Sloan School of Management
Sinan Aral
Assistant Professor of Information, Operations and Management Sciences
New York University Stern School of Business
I've been incredibly fortunate to have an amazing group of advisors, peers, and friends.
Without their support, the journey would have been much more intimidating. First of all, I
would like to thank my fiancé, Tim Kaldewey, for his love and support. Having flown endless
miles from California to Boston during my graduate study, Tim has been my biggest supporter,
cheering for me when I made a breakthrough, and motivating me when I hit a roadblock. Tim,
you are the reason I survived this journey.
To the members of my committee, Erik Brynjolfsson, Sinan Aral, Roberto Fernandez and
Ray Reagans, I am honored and proud to call you my advisors and my role models. My deepest
gratitude goes to my Chair, Erik Brynjolfsson. I am indebted to your inspirations, insights, and
guidance. You have deeply influenced the way I view the world and how I approach solving a
problem. Thank you for believing in me, investing in me, and giving me opportunities to grow
and explore. Every interaction with you has been an eye-opening experience, and the courage,
excitement, and confidence you inspired after each meeting were critical for pushing me
forward. Sinan, you gave me the courage to pursue my passion. Your guidance and inspiration
have had a profound impact on my personal growth as a researcher. You showed me how to look
at the bigger picture beyond a single research project. Most importantly, you taught me the
importance of communicating my ideas to others with clarity. Roberto, you are one of the most
effective educators I know. Thank you for always being honest and motivating me to do my best.
Your unwavering commitment to academic rigor taught me to be a meticulous researcher. You
have never failed to transmit valuable lessons and remind me what is important in my work and
in the field. Your commitment to give back to the academic community has inspired me to do
the same. Ray, your insightful comments were critical for my thesis. You have an incredible
ability to see what is still very cloudy in my mind and help me frame my research to fit in the
broader picture. Each minute spent with you has saved me hours. Thank you for believing in my
work and me.
Several other faculty members at MIT Sloan School of Management also provided
tremendous support and guidance. Professor Stuart Madnick has always been instrumental for
my growth. You have been a great mentor and a friend. Having also spent my entire education at
MIT in course 6 and course 15, like you, I feel an immediate affinity to you. You helped me
bridge to the world of management from my training in engineering. Wanda Orlikowski, George
Westerman, Andy McAfee, and Stephanie Woerner, you have been very generous with your time
and provided valuable feedback, especially during the critical stage of my job hunting. Marshall
Van Alstyne, you have always been a great mentor and gave me valuable insights about research
and where the field is headed.
I would also like to thank my peers. Without you, the journey would have been
unthinkable. Jason Abaluck, Phil Anderson, Joelle Evans, Heekyung Kim, Xitong Li, Yiftach
Nagar, Adam Saunders, Jialan Wang, Yanbo Wang, and Jie Yang are the ones who made this
experience tolerable and, at times, fun. I will always remember our laughter and tears. I
especially owe thanks to Chuck Eesley. Not only did we grow up together as researchers, we have
also grown to be great friends.
Over the past 5 years, I have been affiliated with the MIT Center for Digital Business
(CDB), which has created an exciting environment, enabling me to pursue my research interests.
Without its financial support and contacts with various industrial partners, my research
program would not have been possible. I especially thank Ching-yung Lin from IBM Research
for your indefatigable support over the past years. Your faith, guidance, and generous financial
support are greatly appreciated. I look forward to our continuing collaboration.
I would not have made it without our amazing PhD director, Sharon Cayley. Your
support, especially during dark times, has been instrumental for my survival in the PhD
program. You have always made my visits to your office feel like home.
Lastly, I would like to thank my parents, Zhen Wu and Cindy Lin, and my grandparents.
Your unconditional love and support are always a source of comfort. Thank you for always
believing in me and standing behind all my endeavors.
Table of Contents
Introduction
Essay 1: "Social Network Effects on Performance and Layoffs: Evidence from the Adoption of
a Social Networking Tool"
1. Introduction
2. Theory
3. Setting
4. Data
5. Identification
6. Empirical Methods
7. Results
8. Discussion and Conclusion
References
Essay 2: "Identification of Influence: An Experimental Platform for Understanding the
Relationship between Social Networks and Performance"
1. Introduction
2. Data and Setting
3. Research Design
4. Surveys on Expertise-Find Usage Patterns
5. Robustness Checks
6. Conclusion and Pre-Experimental Statistics
References
Essay 3: "Water Cooler Networks: Performance Implications of Informal Face-to-Face
Interaction Structures in Information-Intensive Work"
1. Introduction
2. Theory
3. Background and Data
4. Empirical Methods
5. Results
6. Discussions and Conclusion
Organizations have long recognized the importance of social capital and looked for ways
to effectively use the social networks of their employees. Recently, with the wide adoption of
social networking tools, it has become increasingly important to understand how people derive
value from their networks and if social media plays a role in helping individuals build desired
networks and subsequently, harvest value from them. The goal of this thesis is to examine how
social networks, information, and information technology (IT) affect information worker
productivity.
While the field has made some progress in understanding how certain properties of
social networks favor superior work performance, many questions remain, in particular those
concerning the causal relationship between networks and performance, and the mechanisms of
how networks impact performance. In addressing these questions, the thesis is divided into
three essays. The first essay, "Social Network Effects on Performance and Layoffs: Evidence
from the Adoption of a Social Networking Tool," assesses the effect of social networks for a
group of consultants on two outcome measures: the ability to generate billable revenue and the
risk of being laid off. Not only do I examine the performance implications of network structures,
but I explore the intermediate outcomes networks generate that ultimately affect performance.
The second essay, entitled "Water Cooler Networks: Performance Implications of Informal
Face-to-Face Interaction Structures in Information-Intensive Work," examines the
performance implications of face-to-face networks. Using Sociometric badges to record precise
face-to-face interaction data for a group of IT workers and linking these data to detailed
performance metrics, we find that, contrary to previous findings in email networks, network
cohesion (the lack of structural holes) is associated with higher worker productivity. In my third
essay, "Identification of Influence: An Experimental Platform for Understanding the
Relationship between Social Networks and Performance," I create an experimental platform to
address the causal linkage between networks and performance. Together, the three essays form
a program dedicated to understanding how social networks, information, and social media affect
information worker productivity.
These topics are of growing importance to managerial theory, practices, and policymaking as information work has become a cornerstone of production in developed economies
where access to and processing of information are the keys to driving information worker
productivity. The growth of information in the past 20 years is unprecedented and the advance
of information technologies, such as search engines and data management tools, has greatly
facilitated our ability to search for information. However, social networks remain an important,
if not the predominant, way we obtain relevant and valuable information. Interestingly, during
this time, there has been a similar uptick in the use of social networking tools, easing the sharing of
information. Together with the advance of digital communication, they provide an exciting
opportunity to explore how people use social networks to obtain information and how that
affects productivity.
My work draws on several rich fields for both theoretical foundations and
methodologies. For understanding how information workers generate value, I draw theoretical
insights from labor economics, especially the production function of labor. Examining how
social networks and social media drive productivity, I draw on theories from economic sociology
and information economics, specifically how certain attributes of social networks, such as
structural holes, confer work advantages on individuals. I also draw from the information systems
literature to form the basis of my thinking on how technology adoptions affect productivity. The
prior literature on IT and firm productivity has inspired me to explore how technology adoption
translates to gains in individual workers' performance. In addition to drawing from these three
fields for theoretical foundations, I leverage techniques from machine learning and information
retrieval to quantify various hard-to-capture properties of information derived from social
networks. Together, these fields provide ideas and tools that help me to understand the
importance of information, information technology, and social networks in information worker
productivity.
My work differs from prior literature in four important ways: 1) by addressing the causal
relationship between network structures and performance through designing and implementing
a set of randomized field experiments; 2) by analyzing the actual information content
transmitted inside the network rather than using network structures as proxies for information;
3) by examining the network effects on multiple work outcomes in a single setting to explore
how they differ in generating these outcomes; 4) by understanding how media choice affects the
use of social networks to transfer knowledge.
To accomplish these goals, I captured the information content of electronic interactions
of more than 8,000 employees in a large international firm as well as detailed face-to-face
interactions at a medium-sized firm. The data were collected using privacy-preserving
crawlers and sensors. From these data, I was able to map the electronic communication
networks of these employees to understand the performance implications of a structurally
diverse network, characterized by low cohesion and structural equivalence and by richness in
structural holes. I used both panel data and instrumental variables to eliminate factors that
might confound the estimates and found that a structurally diverse network can positively affect
performance. While these econometric techniques help in addressing issues of whether there is
a causal linkage between network structure and performance, they are not sufficient because
these techniques require the assumption of exogeneity. However, only a randomized experiment
in the field can be truly exogenous and definitively address the causal impact of networks on
performance in the real world. To address this previously intractable problem, I designed and
implemented an online experimental platform for conducting randomized field experiments at a
large technology firm. Measuring the network and performance change for both the treatment
and control groups, I can definitively answer the question of whether certain changes in the
network cause performance improvements. From these causal analyses, I will be able to suggest
building features into a social networking tool that recommend optimal connections to
maximize employee performance. From these experiments, I can also address many important
research questions. For example, I will be able to empirically show whether social networking
tools can generate long-term change in a person's social network and the time it takes for such a
change to generate value for the employees and the organization.
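The comparison of treatment and control groups before and after an intervention can be illustrated with a simple difference-in-differences calculation. The sketch below is illustrative only: the text here does not state which estimator is used, the function name `diff_in_diff` is hypothetical, and the numbers are made up rather than taken from the thesis data.

```python
from statistics import mean


def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Return the difference-in-differences estimate of a treatment effect.

    Each argument is a list of outcome values (for example, a performance
    score) for one group in one period. This is an assumed, simplified
    estimator, not the method reported in the thesis.
    """
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    # Subtracting the control-group change removes trends common to both groups.
    return treated_change - control_change


# Hypothetical performance scores (illustrative numbers only).
effect = diff_in_diff(
    treated_before=[10.0, 12.0, 11.0],
    treated_after=[14.0, 15.0, 13.0],
    control_before=[10.0, 11.0, 12.0],
    control_after=[11.0, 12.0, 13.0],
)
print(effect)
```

Because assignment to the treatment group is random, the change observed in the control group serves as an estimate of what would have happened to the treated employees without the intervention.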
I also contribute to the literature on the economics of information. By directly observing
the information content transmitted in a network, I measure the information benefits derived
from a structurally diverse network and their impacts on performance. While Burt (1992)
theorized about three types of information benefits (access, timing, and referrals), there is scant
evidence documenting them and especially comparing them in the same setting. This is
problematic, because information benefits are theorized to be the primary reason why actors
with a structurally diverse network can derive rents. Thus, it is important to verify if such
networks actually generate these benefits. To address this problem, I measure information
diversity (a combination of the access and timing of information) and friendship (a proxy for
referrals) as two types of information benefits to explain why structurally diverse networks can
generate work advantage and how they differ in affecting various work outcomes.
I find that information diversity is more strongly correlated than friendship with
objective performance measures such as billable revenue. However, friendship is more
important than information diversity with regard to reducing the risk of layoffs. While having a
large repository of information is crucial for performance in a knowledge-intensive industry,
friendship is also important due to the referral process, because friends are more likely to
advocate for an individual to avert a crisis, such as being laid off. Furthermore, I find that while
a structurally diverse network can generate both types of information benefits, there is a
fundamental tradeoff between the two. The constraint here is that a person only has limited time
and resources to devote to gathering diverse information or forming friendships. While
spending time to accumulate diverse information is helpful for generating billable revenue, it
takes away time and energy from forming friendships, which is crucial for avoiding being laid
off.
Although studying networks derived from electronic media is important, face-to-face
interactions, such as the proverbial "water cooler conversations," remain a significant part of
communication. In a coauthored paper, "Water Cooler Networks: Performance Implications of
Informal Face-to-Face Interaction Structures in Information-Intensive Work," I examine the
performance implications of face-to-face networks. We
introduce a new methodology, using Sociometric badges, to record precise data on face-to-face
interaction for a group of IT workers. Combining these data with detailed performance metrics,
we find that, contrary to previous findings in email networks, network cohesion (the lack of
structural holes) is associated with higher worker productivity. This result augments our
understanding of how media choice and network structure interact, shedding light on the
organizational implications of face-to-face interaction. The methods and techniques are
replicable, creating opportunities for new lines of research into the implications of face-to-face
interactions in organizations.
The methods and the experimental platform designed in this thesis are replicable and
portend a new frontier in understanding the mechanisms behind why information and social
networks matter for information worker productivity and whether the relationship is causal. As
estimated in 2006, the amount of digital information created, captured, and replicated is 161
billion gigabytes, about 3 million times the information in all the books ever written (Barnette
2006). The speed of information growth has only been increasing since then. During this time,
the simultaneous explosion of social media, knowledge management, and networking tools is
not a mere coincidence. These tools have made it possible to cheaply share and disseminate the
vast amount of information recently created. Thus, understanding how workers acquire
information through social networks and how that ultimately affects productivity is only going
to grow in importance.
Gantz, J. F., Chute, C., Manfrediz, A., Minton, S., Reinsel, D., Schlichting, D., and Toncheva, A.
2008. "The Diverse and Exploding Digital Universe", EMC White Paper.
Social Network Effects on Performance and Layoffs:
Evidence from the Adoption of a Social Networking Tool
Lynn Wu
By studying the changes in employees' networks and performance before and after the
introduction of a social networking tool, I find that a structurally diverse network (low in
cohesion and rich in structural holes) has a positive effect on work performance. The size of the
effect is smaller than traditional estimates, suggesting that omitted individual characteristics
may bias the estimated network effect. I consider two intermediate mechanisms by which a
structurally diverse network is theorized to improve work performance, information diversity
(instrumental) and social communication (expressive), and quantify their effects on two types of
work outcomes: billable revenue and layoffs. Analysis shows that the information diversity
derived from a structurally diverse network is more correlated with generating billable revenue
than is social communication. However, the opposite is true for layoffs. Friendship, as
approximated by social communication, is more correlated with reduced layoff risks than is
information diversity. Field interviews suggest that friends can serve as advocates in critical
situations, ensuring that favorable information is distributed to decision makers. This, in turn,
suggests that having a structurally diverse network can drive both work performance and job
security, but that there is a tradeoff between mobilizing friendship and gathering diverse
information.
Keywords: Social Network, Productivity, Layoffs, Information Diversity, and Friendship
Social network theory predicts a structurally diverse network that is low in cohesion and
spans structural holes to be associated with higher work performance. By linking unconnected
groups, the brokers, who bridge these holes, are endowed with early exposure to novel
information and can act as hubs to facilitate information flow between otherwise disconnected
groups. Studies have shown that people whose networks are rich in structural holes have a
competitive advantage over their peers. They tend to receive superior performance ratings and
higher compensation (Burt
Podolny and Baron 1997; Burt 2005; Cross and Cummings
2004; Lin 2002). For example, bankers with structurally diverse networks are more likely to be
recognized as top performers (Burt 2000). Similarly, employees in research and development
positions maintaining diverse contacts outside of the team are more productive than their peers
(Reagans and Zuckerman 2001).
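One widely used way to quantify how rich a person's network is in structural holes is Burt's network constraint, where a lower value indicates more brokerage opportunities. The sketch below is a minimal illustration under stated assumptions: the passage above does not say which measure the thesis uses, the function name `network_constraint` and the toy networks are hypothetical, and ties are treated as symmetric.

```python
def network_constraint(adj, i):
    """Burt's constraint for node i in a weighted adjacency dict.

    adj[u][v] is the tie strength from u to v (a missing entry means 0).
    Lower constraint means more structural holes around node i.
    """
    def w(u, v):
        # Symmetrized tie strength between u and v.
        return adj.get(u, {}).get(v, 0.0) + adj.get(v, {}).get(u, 0.0)

    nodes = set(adj) | {v for nbrs in adj.values() for v in nbrs}

    def p(u, v):
        # Proportion of u's total tie strength that is invested in v.
        total = sum(w(u, k) for k in nodes if k != u)
        return w(u, v) / total if total else 0.0

    contacts = [j for j in nodes if j != i and w(i, j) > 0]
    constraint = 0.0
    for j in contacts:
        # Indirect investment in j through every third party q.
        indirect = sum(p(i, q) * p(q, j) for q in nodes if q not in (i, j))
        constraint += (p(i, j) + indirect) ** 2
    return constraint


# Toy example 1: "a" bridges three contacts who do not know each other.
star = {"a": {"b": 1, "c": 1, "d": 1}}
# Toy example 2: the same four people, all connected to one another.
clique = {"a": {"b": 1, "c": 1, "d": 1}, "b": {"c": 1, "d": 1}, "c": {"d": 1}}

star_constraint = network_constraint(star, "a")
clique_constraint = network_constraint(clique, "a")
print(star_constraint, clique_constraint)
```

In the toy data, the broker in the star network is less constrained than the same person embedded in a fully connected group, which matches the intuition that bridging otherwise disconnected contacts creates structural holes.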
While previous research has provided important theoretical insights (e.g., Burt 1992;
Coleman 1988), the question of how social network positions drive productivity gains remains
open. Information benefits have been theorized to be the primary reason why a structurally
diverse network endows individuals with work advantage. Often network structures are treated
as a proxy for accessing more information and more diverse information (Burt 2008), and thus
having a structurally diverse network is assumed to give individuals information advantage.
However, information transmitted inside a network is rarely directly observed. Thus, it is
difficult to verify if a structurally diverse network actually generates information benefits that
ultimately affect performance. Burt has theorized that three forms of information benefits (access, timing, and referrals) are responsible for driving superior work performance (Burt 1992:
13-15). If so, it is important to separately quantify these benefits and their relationships to work
I examine whether structurally diverse networks can generate information benefits by
focusing on how information diversity and social communication (two types of information
benefits that emerge from structurally diverse networks) can lead to superior work
performance. I define information diversity as the heterogeneity of the information content in
individuals' electronic communications. As a measure, it combines access to and timing of
information,¹ the first two types of information benefits in Burt's framework. Earlier access to a
variety of information sources allows an individual to gather more, and more diverse,
information, which can be instrumental to performance. I also create a friendship index that
measures the frequency of social communications and informal activities in a person's electronic
communications. Because social communications can help generate friendships and friends are
more likely to serve as advocates, the friendship index can serve as a proxy for the referral
process, Burt's third type of information benefit.
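To make the idea of heterogeneity in information content concrete, the sketch below scores a set of messages by the average pairwise cosine distance between their word-count vectors. This is an assumed stand-in for illustration: the passage above does not specify the exact metric, the function names are hypothetical, and the example messages are invented.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_distance(a, b):
    """Cosine distance between two word-count vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # Treat an empty message as maximally distant.
    return 1.0 - dot / (norm_a * norm_b)


def information_diversity(messages):
    """Mean pairwise cosine distance across a person's messages.

    Values near 0 mean the messages cover similar content; values near 1
    mean they share almost no vocabulary. Assumed metric for illustration.
    """
    vectors = [Counter(m.lower().split()) for m in messages]
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0
    return sum(cosine_distance(a, b) for a, b in pairs) / len(pairs)


# Invented messages: one pair on the same topic, one pair on different topics.
similar = information_diversity(["server outage report", "server outage update"])
varied = information_diversity(["server outage report", "client pricing proposal"])
print(similar, varied)
```

A production measure would more likely rely on topic or keyword representations than on raw word counts; the point here is only that diversity can be computed directly from message content rather than inferred from network structure.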
By examining the two types of information benefits and their instrumental and
expressive nature, I attempt to bridge the literature on network structures with the literature on
tie content. I find a structurally diverse network can generate both instrumental and expressive
types of information benefits, with information diversity being instrumental and social
communication being expressive. This finding runs contrary to the belief that due to their
contrasting natures, there is a tradeoff between having both expressive and instrumental
relationships in a network (Bales and Slater 1955; Etzioni 1965: 696-97; Slater 1965).
While it is possible to have both kinds of benefits in a structurally diverse network, there is a
tradeoff between the two in the relative returns on the investment from either socializing to
form friendships or gathering diverse information. The decision to mobilize friendship or
information diversity may depend on the work outcome one hopes to achieve.
To better understand the tradeoff, I examine the impact of information diversity and
social communication on two types of performance measures (billable revenue and layoffs) for
a group of technology consultants at a large information technology firm. I choose billable
revenue as an objective measure of a worker's productivity, because it is one of the most
important performance metrics for evaluating employees in the consulting industry. Because
accessing diverse information is critical for solving difficult problems, information diversity
derived from a structurally diverse network is more likely to be beneficial in generating billable
revenue. I also explore the effect of information diversity and social communication on layoffs, a
negative and traumatic experience for most workers, and one which can negatively affect the
remaining employees through network destruction, especially when friends are laid off
(Krackhardt and Porter 1985; Shah 2000).
¹ Although this approach does not capture all communications by a person, email and instant messaging represent
a significant proportion of the overall communication. Furthermore, calendar events also capture some of the
It is possible that the mechanism for generating billable revenue may fundamentally
differ from that for determining the risk of layoff. Often, firms delegate layoff decisions to
managers. A manager's favorable opinion is likely to protect a person from being laid off.
Effective promotion from referrals gets the actor's name mentioned at the right place and the
right time, maximizing the job retention rate. I find that information diversity derived from a
structurally diverse network is more associated with improving objective performance, such as
billable revenue, than is social communication, but that social communication is more positively
correlated with job retention than is information diversity. To illustrate their possible tradeoff,
I test whether information diversity and social communication are substitutes in generating
billable revenue and avoiding layoffs.
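One conventional way to test for substitutability, offered here only as a sketch and not necessarily the specification estimated in this thesis, is to add an interaction term to the outcome regression. The symbols below are illustrative assumptions: y is the work outcome for person i in period t, ID is information diversity, and SC is social communication.

```latex
\[
y_{it} = \beta_0 + \beta_1\,\mathrm{ID}_{it} + \beta_2\,\mathrm{SC}_{it}
       + \beta_3\,\bigl(\mathrm{ID}_{it}\times\mathrm{SC}_{it}\bigr) + \varepsilon_{it}
\]
```

Under this sketch, a negative estimate of \beta_3 would suggest that the two are substitutes (the marginal return to one falls as the other rises), while a positive estimate would suggest that they are complements.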
To lend a causal interpretation to the analysis, I take advantage of a variation generated
by a technology that can change the network positions of its users over time. By examining the
change in work outcomes before and after the adoption, it is possible to determine if this
technologically induced network change can actually alter billable revenue and layoff risks.
Similarly, by examining the network change and the changes in information diversity and
social communication before and after the adoption, it is possible to determine if a
structurally diverse network can actually generate different types of information benefits
that may ultimately affect work outcomes.
Network Positions and Performance
The structural perspective of network studies (Coleman 1988; Burt 1992) often focuses
on the configuration of ties as opposed to the content of ties in the ego-network. One of the most
prominent features of social network structure that has received an enormous amount of
theoretical and empirical attention is brokerage or network diversity (e.g., Burt 1992;
Granovetter 1973), characterized by a network that is low in cohesion and structural equivalence
and rich in structural holes. Such networks are often positively correlated to various
measurements of work performance. For example, Burt (2004) shows that
structural holes can create a competitive advantage for individuals in dimensions such as wages
and promotion. He attributes the normalized performance differences to actors' ability to access
and gather information from non-redundant social groups (Burt 1992; Ancona and Caldwell;
Sparrowe et al. 2001; Reagans and Zuckerman; Cummings and Cross 2003; Zaheer
and Bell 2005). This information advantage is particularly important in knowledge-intensive
industries where the success of a project relies on identifying and assimilating existing
information in order to create new knowledge and innovation (Burt 1992).
Thus, a structurally diverse network is assumed to confer information benefits by
providing access to novel information from loosely connected network neighborhoods (Burt).
The economic value of information stems from the fact that information is distributed
unevenly in a network and thus, tapping into various information sources that are distributed
throughout the network is important for solving difficult problems and finding new
opportunities. Structurally diverse networks can provide actors with the capability to reach out
to distant information sources. A redundant network, on the other hand, tends to provide repeat
information. In such a network, no one can monopolize information long enough to derive rents
because the dense network of strong ties can quickly disseminate any information throughout
the network.
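The contrast between a structurally diverse and a redundant network can be made concrete with Burt's network constraint, a standard summary of how few structural holes surround an actor. The sketch below is illustrative only: the function name and the edge-list input format are assumptions, and it is not claimed to be the exact computation used in this study.

```python
from collections import defaultdict


def burt_constraint(edges, ego):
    """Burt's network constraint for `ego` in an undirected weighted graph.

    edges: iterable of (u, v, weight) tuples, where weight is tie strength.
    Lower values indicate more structural holes around ego.
    """
    w = defaultdict(dict)
    for u, v, weight in edges:
        w[u][v] = w[u].get(v, 0.0) + weight
        w[v][u] = w[v].get(u, 0.0) + weight

    def p(i, j):
        # Share of i's total tie strength that is invested in j.
        total = sum(w[i].values())
        return w[i].get(j, 0.0) / total if total else 0.0

    constraint = 0.0
    for j in w[ego]:
        # Indirect investment in j through each other contact q of ego.
        indirect = sum(p(ego, q) * p(q, j) for q in w[ego] if q != j)
        constraint += (p(ego, j) + indirect) ** 2
    return constraint
```

For an ego with three contacts who are not connected to one another, the constraint is 1/3; closing a triad among ego and two contacts raises it to 1.125, reflecting the redundancy of the contacts.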
In addition to information diversity, brokers are also theorized to control the flow of
information and reap rents from brokering between two disconnected parties (Burt 2004,
Obstfeld 2005). Endowed with preferential access to information, brokers are in a unique
position to identify arbitrage opportunities and reap benefits through strategically linking
disconnected actors. However, as Reagans and Zuckerman (2008) commented, there is a
fundamental tradeoff in the social-structural foundations of power and knowledge. The same
mechanism that endows brokers with power as the providers of information also reduces their
power as acquirers of information because network contacts in a non-redundant network are
also monopolists themselves when the broker tries to acquire information from them (Reagans
and Zuckerman 2008). However, regardless of the control benefits, information benefits derived
from a structurally diverse network are still greater than what is provided in a redundant
network. In this paper, I focus on the information benefits and examine how they affect
performance independent of whether individuals control the information flow to their
advantage. Thus, I hypothesize that a structurally diverse network can affect work outcomes
such as objective work performance measured using individual billable revenue as well as
subjective performance as measured by the risk of being laid off.
Hypothesis 1a: Structurally diverse networks cause an increase in billable revenue.
Hypothesis 1b: Having a structurally diverse network reduces layoff risks.
Network Diversity, Information Diversity and Social Communication
While the information benefit derived from a structurally diverse network has received
much scholarly attention, few have actually measured it; the vast majority of empirical work on
network and information is content-agnostic (Hansen 1999).
As Burt explained, network
structure is often used as a proxy for information flow because structures can be measured more
easily than the actual content of what is transmitted in the network (Burt 2008). He calls for the
next phase of network research to investigate how individuals gather information from their
network positions. While some research, especially in the connectionist perspective, has also
stressed the importance of measuring the content of the network, it often characterizes the
network as channels, pipes, or conduits, and the content as attributes of the nodes (Podolny
2001; Rodan and Galunic 2007). Under this assumption, information flow is implicitly assumed
to be proportional to the distribution of links among nodes of the network (Granovetter 1978;
Schelling 1978). However, information exchange may occur strategically; individuals do not
always share all available information (Reagans and McEvily 2003; Aral and Van Alstyne 2007).
Hence, it is critical to open the black box of networks to investigate information that is being
transferred between individuals inside a network. Instead of using characteristics of nodes as a
proxy for information content and structural topology as a proxy for information flow, the next
phase of research should examine if and how network positions generate information benefits
and whether these information benefits ultimately induce superior work performance (Burt
2008; Aral and Van Alstyne 2007).
One notable exception in advancing the information content analysis of networks is the
recent work by Aral and Van Alstyne (2007). Using encoded email content, the authors analyzed
the email traffic at an executive recruiter firm and showed that brokers are more likely to have
more heterogeneous information, which is also associated with higher work performance (Aral
and Van Alstyne 2007). While calculating information heterogeneity is a notable breakthrough,
it is also important to measure other aspects of information benefits, especially comparing them
in the same setting to show how their capabilities differ in influencing work outcomes. As Burt
(1992) theorized, there are three forms of information benefits: access, timing, and referrals.
Access refers to receiving a valuable piece of information, while timing refers to the ability to
receive a key piece of information faster than others. Referrals are a process in which personal
contacts promote the actor to others. As Burt explained, "they are motors expanding the third
category of people in your network, the players you don't know who are aware of you.... [They]
are strong personal advocates in decision-making ensure both favorable information
and response to any negative information get distributed during decisions" (Burt, 1992: 14-15).
However, measuring access, timing and referrals is extremely difficult, because it is hard
to directly observe the information content in people's interactions. I address this issue by
quantifying two types of information benefits: information diversity and friendship, through
encoded electronic communication such as email, text messages, and calendar events.
Information diversity can be viewed as a combination of information access and timing.
Specifically, I compute the diversity of information content by counting the number of distinct
topics in an actor's electronic communications. Obtaining information from diverse sources is
the key to making better decisions, solving difficult problems, and generating innovative
solutions. Thus, I hypothesize information diversity to be the primary mechanism for a
structurally diverse network to generate rents and competitive advantage.
Hypothesis 2a: Structurally diverse networks can generate information diversity.
Social communication contributes to the referral process. It measures how much of an
actor's communications is related to socializing and informal social activities. Through these
activities, network contacts get to know an individual better and are more likely to serve as
strong personal advocates, particularly in situations of crisis and uncertainty (Ibarra 1995).
Having a diverse circle of friends is more likely to help the actor, trumpeting his
accomplishments and advertising his work to a diverse group of people, including decision
makers.
Hypothesis 2b: Structurally diverse networks can generate referrals.
Content of Ties: Expressive and Instrumental Network Relationships
Information diversity and social communication can also be viewed in the framework of
expressive and instrumental network relationships where information diversity is a proxy for
instrumental relationships and social communication is a proxy for expressive relationships.
Research on expressive and instrumental networks often focuses on the content of relationships
(Borgatti and Foster 2003), as opposed to the structural properties. It argues that topological
studies of networks often neglect the resources flowing between ties and focus exclusively on the
structural perspectives (Lin 2001; Snijder 1999). This is problematic because actors are only
successful when they can mobilize resources from their network contacts (Podolny and Baron
2001). One way to classify these resources is through their instrumental or expressive nature.
Instrumental ties are often used to exchange work-related resources, and they typically involve
actions that seek information, expertise, and professional advice (Ibarra 1993; Fombrun 1982;
Lincoln and Miller 1979; Podolny and Baron 1997). Expressive ties, on the other hand, are often
affective and friendship-based and involve the exchange of alliance, trust, and social support
(Krackhardt 1995).
Although instrumental ties and expressive ties are theorized to be distinct, they can also
overlap in a dyadic relationship. Often expressive ties have instrumental values and some
instrumental ties are also affective; thus the line separating the two often blurs (Scott 1996).
Some have suggested that the two types of networks may interact positively: workers with
overlapping instrumental and expressive ties are more effective (Ibarra 1992). On the other
hand, expressive and instrumental activities often conflict because it is difficult to fill both roles
at the same time (Bales and Slater 1955; Etzioni 1965; Slater 1965). For example, having
friendship or expressive ties can make it difficult for a manager to enforce rules and sanctions on
subordinates. Thus, expressive ties dampen the effect of instrumental actions or vice versa,
leading to a tradeoff in having either one or the other (Fernandez 1991). Similarly, because
people tend to distinguish between the two roles, expressive and instrumental networks can
have a substitutive relationship (Homan 1974; Fernandez 1991).
Drawing upon the literature on expressive and instrumental networks, I show that
information benefits derived from a structurally diverse network can have both instrumental
and expressive elements. Specifically, information diversity is coupled with instrumental
actions, much as social communication is to expressive actions. Instrumental actions, which
generate task-related information and advice, increase the information diversity crucial to
higher work performance. Expressive actions such as social communication generate friendship
and active referrals that are more likely to advocate and promote an individual, helping the
person avert crises and find new opportunities. However, the literature on instrumental and
expressive networks focuses more on tie content than on the network's structure, while
the structural perspective often disregards instrumental and expressive elements in the network,
perhaps due to the difficulty of directly observing the information flow. I bridge the gap between
the structure- and tie content-centric views by showing that structurally diverse networks can
generate both instrumental and expressive properties. This is contrary to the notion that it is
difficult to build both expressive and instrumental networks because they can be at odds (Bales
and Slater 1955: 290-92; Etzioni 1965: 696-97; Slater 1965).
However, unifying social communication and information diversity in a network comes
with its own costs and tradeoffs. Because an actor's time and energy are necessarily limited,
gathering information must in effect be traded off against forming friendship through social
communication. Although they may overlap, social communication and information diversity
are still distinct. For instance, to increase information diversity, actors can form ties with
individuals whom they do not like or normally interact with for the sole purpose of gathering
information. However, one can spend the same time and energy socializing, making friends and
thus facilitating the referral process, even if this effort does not necessarily increase information
diversity. Thus, because the effort to generate information diversity and social communication
can be orthogonal, the tradeoff between the two lies in the return to these investments.
Network Effect on Billable Revenue and Layoffs
To understand how information diversity and social communication are used to achieve
different goals, I examine their effects on two types of outcome: billable revenue and layoffs.
Billable revenue is an objective measure of work performance and is one of the most salient
metrics for evaluating consultants. Information workers, such as consultants, are especially
valued for their ability to access valuable information, which can have two effects in enhancing
work performance. First, accessing information related to the task at hand directly improves the
quality of work. Second, accessing diverse information also exposes actors to new opportunities
and valuable resources (Burt 1992, 2004). Consequently, these actors would be the first to learn
of a new opportunity, placing them at the front of the queue to strategically seize it.
In the IT consulting business, accessing information expediently is the key to performance.
Since consultants' performance is largely measured by billable revenue, it is crucial both to do well in
the current project and to spend time looking for future opportunities. Knowing where to
obtain expertise through networks helps an individual solve difficult problems and produce high
quality work, enhancing his reputation and his future prospects for finding opportunities. All
else equal, a manager would prefer reputable consultants to handle important projects, because
they are more likely to satisfy customers and generate repeat business. Thus, if a structurally
diverse network is to produce informational benefits, it should have a strong effect on
information workers in knowledge-intensive settings.
Social communication may also help improve billable revenue. By socializing informally
with a diverse group of people, consultants are more likely to encounter opportunities
serendipitously. Friends can also provide important information that eventually results in
billable revenue. The operative factor in these situations, however, is information diversity that
includes information generated by work-related interactions with friends. On the other hand,
social communication, which proxies for friendship, is distinct from informational diversity in
that it captures the referral process. Through informal interactions, an actor's network contacts
are more likely to know his expertise and can serve as his advocates to others. Although having
someone to advocate for an actor is helpful, it rarely generates billable revenue directly, because
having access to useful information, opinions, and perspectives is ultimately responsible for
solving difficult cases and generating profits. Hence, I hypothesize that a performance
improvement arises from a structurally diverse network primarily by means of information
diversity but not necessarily by social communication.
Hypothesis 3a: Structurally diverse networks induce higher billable revenue primarily
through information diversity, not through social communication.
While structurally diverse networks are shown to provide information diversity that
directly improves work performance, they can also produce referrals who can enhance a
person's prestige and reputation. Functioning as means to trumpet one's accomplishments and
promote one's work, referrals ensure the actor is protected in crisis situations, such as layoffs.
Thus, the same channel through which actors derive diverse information also provides them
with a diverse network of potential referrals. By using social communication to generate
affective relationships, individuals can mobilize their network contacts to serve as their
advocates.
Thus, social communication in an employee's structurally diverse network can reduce
the risk of being laid off and increase job retention. An employee is much less likely to be laid off
if a wider range of people, including managers, has a favorable opinion of the person. Referrals
can greatly facilitate this process by functioning as means to broadcast one's achievements to
others. The advantages of the referral process also flow from the theory of recognition heuristics
(Goldstein and Gigerenzer 1999, 2002); according to this theory, people place higher values on
objects they recognize than on objects they don't, regardless of their actual values. Thus, when
key decision-makers have heard of a person, that recognition value alone may keep the person
from being laid off. In contrast, people with comparable, or even superior work evaluations, may
face higher risks of being laid off if they lacked a diverse group of referrals to ensure that the
decision-makers are aware of their contributions. In qualitative interviews, many managers who
participated in layoff decisions stressed the importance of reputation and
general awareness of a person's work through the referral process:
When we sit down at a meeting to make layoff decisions, we discuss people's
work and what we think of their work, not just billable hours. Usually, when more
than one person in the meeting is aware of the person or speaks on his behalf,
this person is much less likely to be laid off than someone nobody has heard of.
This confirms that actors with referrals in a structurally diverse network are able to effectively
advertise their work and promote themselves through referrals. Consequently, their visibility is
increased, and they are less likely to be laid off.
Information diversity can also reduce the risks of layoff through generating more billable
revenue, because firms are less likely to lay off their star performers who disproportionately
contribute to generating profits for the firm. However, social communication plays a more
important role in reducing the risk of layoff than does information diversity, possibly because
a layoff does not affect only the person laid off; it can also affect the team and other colleagues
who are connected to the person. Once the person leaves the organization, he is removed from
the social network of his contacts and network destruction from layoffs can have a drastic effect
on the remaining employees (Krackhardt and Porter 1985). Qualitative interviews show that
when a key person is removed from the organizational network, it affects other team members
(Shah 2000). One person during the interview lamented:
We were just in the process of forming a project that involves the collaboration of several
groups when layoff happened. When Bob got laid off, the project also fell apart, because
Bob was the key person connecting all of us together. Once he was gone, we were not
able to mobilize everyone to continue the effort.
Billable revenue, on the other hand, tends to affect the person himself rather than the
group, because generating more billable revenue has less effect on other group members than
layoffs. Accordingly, the mechanism for reducing the layoff risk is different from the one that
generates billable revenue. Friends are likely to protect one from being laid off, because not only
would they lose a potentially important information source, they may also experience the
negative consequences of losing a friend (Shah 2000). Informal activities and social
communication with a diverse group of people can promote friendships that, in turn, shield an
actor from being laid off. Thus, friendships can protect a person from being laid off more than
information diversity, even after controlling for billable revenue. Figure 1 below captures the
theory development and hypothesis testing.
Hypothesis 3b: Social communication is more correlated with protecting an actor from
layoffs than is information diversity.
Figure 1: Hypotheses Testing Framework (surviving labels: "vs. Expressive", "ego vs. group", "Billable Revenue")
To test these hypotheses, I have collected data at a large information technology firm.
High-tech firms have been a fertile ground for researchers to understand how network
characteristics play important roles in information-intensive work settings such as the search
and transfer problems across organizational units (Hansen 1999), research and development
productivity (Reagans and Zuckerman 2001), and mobility in the workplace (Podolny and Baron
1997). If information benefits derived from network positions matter for performance, they
should matter especially in a knowledge-intensive setting, such as in the high-tech sector.
To characterize the social network in the firm, I captured its internal electronic
communication exchanges. Previous work has validated the benefits of using electronic
communication data to understand intra-organizational networks within a firm or an institution
(Wu et al. 2004; Kossinets and Watts 2006, 2009). While using digital traces left by users can
construct a more accurate portrait of a network, more importantly it allows for the direct
observation of the information transmitted inside the network. Examining the variation of the
information content across individuals can confirm whether the information-based assumptions
about the network are valid (Burt 2008). As explained by Aral and Van Alstyne (2007),
analyzing the content of communications as well as the topological structures of networks can
open new avenues for answering questions at the heart of the sociology of information. Thus, by
examining the content, I can capture the information heterogeneity across individuals by
computing the total number of distinct topics in each person's electronic communications.
Furthermore, I extend the content analysis by constructing a friendship index that quantifies
informal and socializing activities in individuals' communication.
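The two content measures described above can be sketched as follows. This is a minimal illustration, not the system's actual implementation. It assumes that each encoded message carries a set of topic identifiers and a flag marking social or informal content; the field names `topics` and `is_social` are hypothetical, and treating the friendship index as a share of messages (rather than a raw count) is also an assumption.

```python
def information_diversity(messages):
    """Number of distinct topic ids across one person's messages."""
    topics = set()
    for msg in messages:
        topics.update(msg["topics"])
    return len(topics)


def friendship_index(messages):
    """Share of messages flagged as social or informal (0.0 if there are none)."""
    if not messages:
        return 0.0
    social = sum(1 for msg in messages if msg["is_social"])
    return social / len(messages)


# Hypothetical encoded messages for one person.
inbox = [
    {"topics": {"t1", "t2"}, "is_social": False},
    {"topics": {"t2", "t3"}, "is_social": True},
    {"topics": {"t4"}, "is_social": False},
    {"topics": set(), "is_social": True},
]
print(information_diversity(inbox), friendship_index(inbox))
```

Counting distinct topics rewards breadth rather than volume: a person who sends many messages on a single topic scores lower on diversity than one who sends a few messages on several topics.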
Without recording the content of people's communication transmitted inside an
electronic communication network, it would be difficult to measure and classify different types
of information benefits. Traditionally, content analysis is done through detailed
ethnographic studies. While these studies are useful, they are also limiting because it is difficult
to capture the communication content for a large group of people using ethnography. Similarly,
traditional social network data is generated using self-reports such as surveys and
questionnaires that require the subjects to recall their network connections. While respondents
are generally good at remembering recent and frequent interactions, they are poor at recalling
weak and distant ties (Marsden 1990; Krackhardt and Kilduff 1999). The recall bias as well as
the general inaccuracy in memory can be problematic for constructing network relations that are
socially distant, resulting in errors in measuring many network parameters (Marsden 2005;
Kumbasar, Romney and Batchelder 1994). Using the archive of electronic communications
directly can greatly alleviate this type of bias, because electronic records can precisely capture
when and what exact information content is exchanged between actors.
In particular, I focus on employees in the consulting division of this firm whose primary
function is to solve problems for clients and generate profits from billable revenue. Typically,
consultants are involved in four broad categories of projects: IT consulting, business processes,
application support, and outsourcing services. Consulting projects are in general information-intensive
and require solving difficult problems for the client. According to qualitative
interviews, consultants often spend a large amount of time assembling, analyzing, and assessing
information gathered from various sources to fully understand clients' problems and make
decisions and recommendations based on the information. To access diverse information that is
critical for decision-making, consultants often need to reach out to experts in the organization.
Having connections to the experts either directly or through colleagues is crucial for the
consultants to gather and integrate information into viable solutions. Satisfying clients is
extremely important because generating repeat business is the key for maintaining a continuing
stream of revenue and avoiding bench time.
In addition to working on the current project, consultants also have to look for future
projects. Consulting work in this firm functions like an internal labor market. To avoid bench
time, consultants constantly spend time searching for future opportunities. While an internal
placement manager is assigned to each consultant, the manager has limited capacity to help
individual consultants. Qualitative interviews indicate that a typical placement manager is
responsible for 50-100 consultants at a time, and thus, relying on the placement manager alone
is not enough to find suitable projects as needed. Accordingly, consultants need to be proactive
to find opportunities, and social contacts can play an important role in this process. Having
access to information about project opportunities from social contacts is useful, because hearing
about an opportunity early gives a consultant a timing advantage in applying for the job.
Obtaining more information about the project and the person leading the project also helps the
consultants present their skills strategically to suit the needs of the project lead. Hence, they
are more likely to be hired.
To understand how social networks affect billable revenue and the risk of layoff, I
analyze an electronic communication social network of 8,037 employees over 2 years. The data
contains email, calendars and instant messaging activities inside a global information
technology firm. To the best of my knowledge, this is the largest social network ever tapped to
study the impact of social networks on information worker productivity. The data is collected
using a privacy-preserving social network analysis system (Lin et al. 2008) that deploys social
sensors to gather, crawl, and mine various types of data sources, including the hierarchical
structure of the organization, and individual role assignments as well as the encoded content of
email and instant messages and calendars of employees who volunteered their data for the
study.
The system is deployed globally and has collected detailed electronic communication
records of 8,037 volunteers. Although the volunteers represent only about 5% of the global
population of the firm, they represent about 15% of employees in English-speaking regions and
23% of employees in consulting services, which is the primary focus of this study. To
alleviate the potential problems arising from the missing parts of the whole company's network,
I only examine the local network structure of each volunteer, because the system captures all the
direct communications that the volunteers are involved in, including communication to non-volunteers.
Furthermore, because more than 50% of the direct contacts of these volunteers (1
degree away) are also volunteers themselves, I can also determine their dyadic relationships. For
the case when the network contacts are not volunteers, it is possible to make some inference
about their network connections by examining if they co-occur frequently in the same email, IM,
or calendar event. When two people (B and C) are listed together as the correspondents of a
third (A), they (B and C) are likely to be connected to each other as well. However, it is still
possible to miss some connections among the non-volunteers, and the network structural
parameters may be biased as a result. This is a common problem for network studies in the field
that requires setting a boundary on the population studied. But the missing connections among
non-volunteers would not bias the content analysis that calculates the parameters of
information benefits using the electronic communication records of the volunteers, which are
fully captured.
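The co-occurrence inference for non-volunteer contacts can be sketched as below. This is an illustration under stated assumptions: each message is reduced to its list of participants, and the cutoff `min_cooccurrence` is a hypothetical parameter, since the text says only that the contacts must co-occur "frequently".

```python
from collections import Counter
from itertools import combinations


def infer_ties(messages, volunteers, min_cooccurrence=2):
    """Infer ties among non-volunteers who repeatedly appear in the same message.

    messages: iterable of participant lists (sender plus recipients or invitees).
    volunteers: set of people whose communications are fully observed.
    Returns a set of (person, person) pairs in sorted order.
    """
    volunteers = set(volunteers)
    counts = Counter()
    for participants in messages:
        # Only non-volunteers need inference; volunteer ties are observed directly.
        others = sorted(set(participants) - volunteers)
        for pair in combinations(others, 2):
            counts[pair] += 1
    return {pair for pair, n in counts.items() if n >= min_cooccurrence}


# A is a volunteer; B and C co-occur twice in A's messages, B and D only once.
observed = [["A", "B", "C"], ["A", "B", "C"], ["A", "B", "D"]]
print(infer_ties(observed, {"A"}))
```

A higher cutoff lowers the chance of inferring a tie from a single mass mailing, at the cost of missing weaker ties among non-volunteers.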
From these volunteers, it is also possible to derive a partial social network of everyone in
the firm. However, I constrain the analysis to focus on the sub-network for the 8,037 volunteers
whose complete electronic communication data is available. To eliminate any potential bias
from the volunteered data, I compare the job roles, demographics, the types of business
functions, and hierarchical ranks of the volunteers with the rest of the firm. I find minimal
differences between the two populations. However, the volunteers in my sample are on average
less likely to be laid off than others in the firm. Perhaps, these volunteers are more likely to be
high performers or they are more socially connected than the rest. After all, the fact that they
donated their data for this research in exchange for access to social networking tools signals
that they are more interested in social networking than others. However, with a large sample of
more than 8,000 people, there is sufficient variation to detect the local average effect from
networks in this sub-population of more socially inclined people.
To construct a precise view of the network that reflects the real communication patterns
among employees, I eliminate spam and mass email announcements. Since each electronic
message includes a timestamp, I can map a dynamic panel of social networks from January
2007 to January 2009. Each monthly network is built using a sliding window of 6 months with a
1-month step size and includes all electronic messages in the current month, plus three months
prior and two months after the current month. This construction of network panels can more
accurately reflect the network relationships than the network activities in only a single month.
Using the communication data, I construct a network panel of 17 periods for 8,037 employees,
which provides an opportunity of rare scale and scope to study how a person's social network
evolves over time.
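The sliding-window construction can be illustrated with a short sketch. The tuple format for messages, the integer month index, and the choice to skip months whose window would run past the observed data are assumptions made for this example; the text does not say how incomplete windows are handled.

```python
def window_months(current, first, last, before=3, after=2):
    """Month indices in the 6-month window around `current`
    (three months prior, the current month, two months after),
    or None when the window would run past the observed data."""
    start, end = current - before, current + after
    if start < first or end > last:
        return None
    return list(range(start, end + 1))

def build_panel(messages, first, last):
    """messages: list of (month_index, sender, recipient) tuples (hypothetical format).
    Returns {month: set of undirected edges} for each month with a full window."""
    panel = {}
    for m in range(first, last + 1):
        months = window_months(m, first, last)
        if months is None:
            continue
        wanted = set(months)
        panel[m] = {frozenset((s, r)) for t, s, r in messages
                    if t in wanted and s != r}
    return panel

messages = [(0, "a", "b"), (3, "b", "c"), (6, "a", "c")]
panel = build_panel(messages, first=0, last=8)
for month in sorted(panel):
    print(month, sorted(tuple(sorted(e)) for e in panel[month]))
```

Because the window advances one month at a time, consecutive networks share five of their six months, which smooths month-to-month noise in the network measures.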
To explore how social networks are related to work performance, I obtain detailed
financial performance records of more than 8,000 consultants. I focus on 2,038 consultants in
this sample who have volunteered their electronic communication data and the 2,592 projects
that these consultants participated in from January 2007 to January 2009. The sheer volume of
the data allows a more precise estimate of how population-level topology in a network,
information diversity, and social communication affect objective performance measures and
layoff risks. To protect the privacy of the volunteers, their identities are replaced with hash
identifiers, and the content of their messages is encoded. Tables 1 and 2 show the summary
statistics of these consultants, including their demographics and job roles as well as network
Table 1: Summary Statistics for Person-Level Networks
Variables: Direct Contacts; Network Constraint; Ties to managers; Ties to divisions. Columns include Std. Dev.
Table 2: Summary Statistics on Consultants
Variables: Job Rank. Columns include Std. Dev.
To study the network effect on the risk of being laid off, I use data during a round of
layoffs in January 2009 when approximately 8% of the work force was eliminated. The firm's
corporate policy allows for a two-month grace period during which the laid-off employees could
retain their work privileges, including access to the corporate email system, intranet, and
internal job postings. If they were able to find other positions within the firm during the grace
period, they could be internally transferred and thus remain at the firm. However, due to the
recession's severity, the firm simultaneously instituted a worldwide hiring freeze, making such
internal transfers unlikely. Although I have no roster of exactly who got laid off, I can infer one
by comparing the human resource (HR) directory shortly after the layoff announcement and
right after the actual layoff event. From the difference between the two HR databases, I can
derive who has left. It is possible that some employees may have left voluntarily, although
unlikely in light of the severe recession and the difficult labor market worldwide, especially in
North America. Several regional offices were closed and everyone in them was laid off. I exclude
them from the dataset.
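The inference of the layoff roster from two HR directory snapshots amounts to a set difference. The sketch below assumes each snapshot is a collection of employee identifiers and that each employee maps to one office; the names `infer_departures`, `office_of`, and the sample data are hypothetical.

```python
def infer_departures(directory_before, directory_after, closed_offices, office_of):
    """Employees present in the first HR snapshot but absent from the second,
    excluding anyone whose office was closed entirely."""
    departed = set(directory_before) - set(directory_after)
    return {e for e in departed if office_of.get(e) not in closed_offices}

before = {"e1", "e2", "e3", "e4"}
after = {"e1", "e3"}
office_of = {"e1": "NYC", "e2": "NYC", "e3": "LON", "e4": "SFO"}

laid_off = infer_departures(before, after, closed_offices={"SFO"}, office_of=office_of)
print(laid_off)
```

As noted in the text, this difference cannot separate layoffs from voluntary departures, so it is an upper bound on the true layoff roster.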
Dependent Variables
The dependent variables are two types of work performance outcomes. First, I measure
the objective work performance using the monthly billable revenue generated by each
consultant in a two-year period from January 2007 to January 2009. Because billable revenue
is the benchmark for gauging productivity in the consulting industry, it is a clear and objective
performance measure widely adopted for evaluating the performance of information workers
such as consultants, lawyers, and accountants. The second dependent variable is whether a
consultant was laid off in January 2009. Measured using job retention, the variable is binary,
equaling 0 if a person is laid off and 1 if a person is retained. I explore how network positions
and the information benefits derived from these positions can increase the rate of job retention.
Explanatory Variables
I use Burt's measure of network constraint (Burt 1992) to measure network diversity, or
brokerage positions.
$$\text{Network\_Diversity}_i = 1 - C_i, \qquad C_i = \sum_{j \neq i} \Big( P_{ij} + \sum_{q \neq i,j} P_{iq} P_{qj} \Big)^2$$
Network constraint Ci measures the degree to which an individual's contacts are
connected to each other as well as their connections to the individual. Pij is the proportion of
actor i's network time and energy invested in communicating with actor j. Network constraint is
a local property that measures the cohesiveness of a person's network (Burt 1992), and
network diversity is the opposite of network constraint and is computed as 1-C. Since
relationships may erode over time, I use a 6-month sliding window of electronic communication
to gauge the network relationships in the current month.
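To make the constraint measure concrete, the following sketch computes Burt's (1992) constraint and the resulting diversity score from a table of tie strengths. The nested-dictionary input format and the function names are assumptions for illustration; in a graph library the same quantity is usually available as a built-in.

```python
def network_constraint(strength, i):
    """Burt's (1992) constraint for actor i.

    strength: {actor: {contact: tie_strength}} with symmetric entries
              (hypothetical input format).
    """
    def p(a, b):
        # Proportion of a's total tie strength invested in b.
        total = sum(strength[a].values())
        return strength[a].get(b, 0.0) / total if total else 0.0

    contacts = [j for j in strength[i] if j != i]
    c = 0.0
    for j in contacts:
        # Indirect investment in j through each mutual contact q.
        indirect = sum(p(i, q) * p(q, j) for q in contacts if q != j)
        c += (p(i, j) + indirect) ** 2
    return c

def network_diversity(strength, i):
    return 1.0 - network_constraint(strength, i)

# Ego bridges three contacts who do not know each other.
star = {"ego": {"a": 1, "b": 1, "c": 1},
        "a": {"ego": 1}, "b": {"ego": 1}, "c": {"ego": 1}}
# Ego sits in a closed triangle.
clique = {"ego": {"a": 1, "b": 1},
          "a": {"ego": 1, "b": 1}, "b": {"ego": 1, "a": 1}}

print(network_diversity(star, "ego"), network_diversity(clique, "ego"))
```

The brokerage position (star) yields lower constraint, and therefore higher diversity, than the cohesive position (clique).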
Pij is calculated from the tie strength, which is measured using the frequency of one's
electronic communications. Granovetter (1982) described four identifying properties for the
strength of ties: time, emotional intensity, intimacy, and reciprocity. In practice, tie strength has
been measured in many ways. Some use reciprocation to represent strong ties and a lack of
reciprocation to represent weak ties (Friedkin 1980). Others have included the recency of
contact or the frequency of interactions as a surrogate for tie strength (Granovetter 1973). To
measure the tie strength in electronic communications, I primarily use the frequency, but with
some modifications.
Because a single electronic message does not constitute an actual tie, especially when it is
sent to a large number of people, counting any message exchange between actors as a dyadic tie
would overestimate the number of ties and the overall tie strength. Thus, I eliminated all
messages that have more than 15 recipients (Lin et al. 2008). In addition, to accurately reflect
the tie strength between two actors, I normalized the measure to an interval between 0 and 1,
with 0 indicating no tie between the two actors and 1 indicating the maximal tie strength (Lin et
al. 2008). The detailed calculation is described below.
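$$\text{TieStrength}_{ij} = \frac{\log(X'_{ij})}{\max_{k} \log(X'_{ik})}, \qquad X'_{ij} = \begin{cases} 0 & \text{if } X_{ij} \le 3 + \log(X_i) \\ X_{ij} & \text{otherwise} \end{cases}$$
where Xij is the total number of electronic messages between actors i and j, Xi is actor i's overall message volume, and the tie strength is 0 whenever X'ij = 0. Basically, the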
formula indicates that a tie exists only when the number of electronic messages between two
actors reaches a certain threshold. This threshold is personalized; for active users of electronic
media, the threshold to register a tie is higher than for those who seldom use electronic media.
This measure of tie strength has been extensively tested and shown to accurately reflect the tie
strength between actors (Lin et al. 2008).
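A personalized threshold with log normalization might be sketched as below. The exact functional form is an assumption: the sketch takes the threshold to be 3 plus the log of the actor's overall message volume and normalizes by the log of the actor's strongest tie, which is one reading of the description and not a confirmed specification of Lin et al. (2008). The function name and sample counts are hypothetical.

```python
import math

def tie_strengths(counts, total_volume):
    """Tie strengths in [0, 1] for one actor.

    counts       : {contact: messages exchanged with the actor}
    total_volume : the actor's overall message volume (assumed to set
                   the personalized threshold)
    """
    threshold = 3 + math.log(total_volume)
    # Contacts at or below the threshold do not register as ties.
    kept = {j: x for j, x in counts.items() if x > threshold}
    if not kept:
        return {j: 0.0 for j in counts}
    max_log = max(math.log(x) for x in kept.values())
    return {j: (math.log(counts[j]) / max_log if j in kept else 0.0)
            for j in counts}

counts = {"b": 100, "c": 10, "d": 4}
strengths = tie_strengths(counts, total_volume=sum(counts.values()))
print(strengths)
```

Under this form, a heavier user of electronic media faces a higher threshold, so an occasional exchange registers as a tie only for people who send few messages overall.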
Content Analysis
To measure information diversity and social communication, I use the content of
electronic communications, after ensuring privacy is preserved. Individuals are hashed with
unique classifiers, so it is impossible to determine their identities. To preserve the privacy of
each message, the original textual content is also not recorded. Instead, I create a set of
tokenized one-gram and two-gram keywords after eliminating stop-words and stemming. Stop
words are common words such as articles ("a," "an," "the") and prepositions (e.g., "from," "of,"
"to"). Stemming involves stripping each word to its root. For example, the word "running" will
be recorded as its root, "run." With these precautions, it is virtually impossible to reconstruct the
original message from these tokenized keywords, which are further anonymized with hash
identifiers to preserve privacy.
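The privacy-preserving tokenization can be illustrated with a small sketch. The stop-word list is abbreviated, the `stem` function is a toy suffix stripper standing in for a real stemmer, and the salted SHA-256 hashing is an assumed anonymization scheme; none of these are taken from the actual pipeline.

```python
import hashlib

STOP_WORDS = {"a", "an", "the", "from", "of", "to"}  # abbreviated illustrative list

def stem(word):
    """Toy suffix stripper standing in for a real stemmer."""
    if word.endswith("ning") and len(word) > 5:
        return word[:-4]                      # "running" -> "run"
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def keywords(text):
    """One-gram and two-gram keywords after stop-word removal and stemming."""
    words = [w.strip('.,;:!?"').lower() for w in text.split()]
    stems = [stem(w) for w in words if w and w not in STOP_WORDS]
    return stems + [" ".join(pair) for pair in zip(stems, stems[1:])]

def anonymize(grams, salt="demo-salt"):
    """Replace each keyword with a salted hash (assumed scheme)."""
    return [hashlib.sha256((salt + g).encode()).hexdigest()[:12] for g in grams]

grams = keywords("Running to the meeting.")
hashed = anonymize(grams)
print(grams)
print(hashed)
```

Only the hashed identifiers would be stored, so equal keywords still match across messages while the original wording is not retained.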
I model the diversity for information content using Latent Dirichlet Allocation (LDA) to
classify the content into distinct topics. LDA is an advanced statistical technique that is widely
used in information retrieval and machine learning. It is a generative probabilistic model that
extracts topics from a corpus of documents2 (Wen and Lin 2010). Each topic is a vector of words
that are statistically related to each other. For example, in Figure 2, LDA classifies the sample
text into four specific topics. The topic "Children" has words including "women," "child," "care,"
and "parents." Similarly, the topic "Budget" includes words such as "tax," "federal," "state," and
The William Randolph Hearst Foundation will give $1.25 million to Lincoln Center,
Metropolitan Opera Co., New York Philharmonic and Juilliard School. "Our board
felt that we had a real opportunity to make a mark on the future of the performing
arts with these grants an act every bit as important as our traditional areas of
support in health, medical research, education and the social services," Hearst
Foundation President Randolph A. Hearst said Monday in announcing the grants.
Lincoln Center's share will be $200,000 for its new building, which will house
young artists and provide new public facilities. The Metropolitan Opera Co. and
New York Philharmonic will receive $400,000 each. The Juilliard School, where
music and the performing arts are taught, will get $250,000. The Hearst
Foundation, a leading supporter of the Lincoln Center Consolidated Corporate
Fund, will make its usual annual $100,000 donation, too.
Figure 2: An Example of LDA Classification3
2 Given a document corpus, LDA models each document d as a finite mixture over an underlying set of topics,
where each topic t is characterized as a distribution over words. A posterior Dirichlet parameter g(d; t) can be
associated with the document d and the topic t to indicate the strength of t in d. For details of the algorithm,
please refer to D. Blei, A. Ng, and M. Jordan, Latent Dirichlet allocation, Journal of Machine Learning Research
3:993-1022, 2003.
3 Adapted from Blei, Ng, and Jordan, 2003, Latent Dirichlet allocation, Journal of Machine Learning Research 3:993-1022.
LDA classifies topics in two specific steps. The first step is a discovery phase that
searches the entire topic space using every document in a corpus. Once words are classified into
topics, LDA finds the topic space in each individual document. A document, in this setting, is an
aggregate of all the electronic messages in a person's communication. LDA is used to classify
100 topics using the entire corpus of electronic communications of 8,037 volunteers from
January 2007 to January 2009. Information diversity is then calculated for each person in every
month as the total number of topics in the person's electronic communications during that
month.
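Counting the topics present in a person-month document requires a rule for when a topic counts as present. The sketch below assumes the topic weights for a document are already available from a fitted LDA model and treats a topic as present when its weight exceeds a small cutoff; the cutoff `min_weight` and the function name are assumptions, since the text does not state the rule used.

```python
def information_diversity(topic_weights, min_weight=0.01):
    """Number of topics whose weight in a person-month document
    exceeds min_weight (the cutoff is an assumed parameter)."""
    return sum(1 for w in topic_weights if w > min_weight)

# Hypothetical topic weights for two person-month documents.
focused = [0.90, 0.095, 0.005, 0.0, 0.0]
broad = [0.30, 0.25, 0.20, 0.15, 0.10]

print(information_diversity(focused), information_diversity(broad))
```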
To measure referrals, I create a friendship index. First, I obtain a dictionary of every
word ever used in the corpus of electronic messages. Each word is ranked by its TF-IDF4 weight,
which measures how important a word is to a document. I then used this list to create a sub-list
of words that are related to social communications and social activities with friends but also
have relatively high TF-IDF values. Examples of such keywords include "lunch,"
"coffee," "football," and "baseball." Two firm employees also verified that the words on the list
are often used for social and informal activities. An employee then calculated the frequency of
this set of words in each person's monthly communications. I created a friendship index as the
ratio of words relating to social activities to the total number of words.
$$\text{Friendship Index} = \frac{\text{Words related to social activities}}{\text{Total number of words}}$$
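As a minimal illustration of the ratio, the sketch below computes the index for one person-month from a list of tokens. The four social keywords are the examples named in the text; the full keyword list, the function name, and the sample tokens are hypothetical.

```python
SOCIAL_WORDS = {"lunch", "coffee", "football", "baseball"}  # examples from the text

def friendship_index(tokens, social_words=SOCIAL_WORDS):
    """Share of a person's monthly tokens that relate to social activities."""
    if not tokens:
        return 0.0
    social = sum(1 for t in tokens if t in social_words)
    return social / len(tokens)

tokens = ["lunch", "project", "budget", "coffee"]
index = friendship_index(tokens)
print(index)
```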
4 TF-IDF stands for "term frequency-inverse document frequency." It is often used in information retrieval and text
mining. The weight is a statistical measure used to evaluate how important a word is to a document. The
importance increases if the word is rarely used. Frequently occurring words such as "have" will have relatively low
TF-IDF weights, whereas a relatively exotic word such as "haematoma" has a high weight.
Control Variables
I include controls for individuals' demographics such as gender, managerial roles, and
job ranks. A managerial role is a dummy variable indicating whether the person is a manager.
Job ranks have an ordinal value ranging from 6 to 12, where level 6 is a junior consultant while
level 12 is an executive vice president. A dummy variable is also created for each job rank but
results do not fundamentally change between using a set of dummy variables and the ordinal job
rank. To control for the differences across various divisions and geographical locations, I include
dummies for the four business divisions as well as a dummy for each geographical location. To
control for the current workload, I include the average monthly revenue billed in the past six
months. Lastly, to control for individual preferences to use electronic media, I include a person's
total number of electronic messages (email, calendar, and instant messaging) in a month.
Despite the overwhelming evidence for strong correlations between network positions
and work performance, the causal mechanism underlying the association is underexplored
(Reagans and McEvily 2003). A plausible explanation is that people actively seek high
performers for advice and collaboration opportunities, and hence high performers tend to
display a structurally diverse network. Similarly, certain individual characteristics may manifest
in their social networks. For example, a popular person tends to have a more diverse network,
which may also enable the person to be an effective employee in an organization. In essence,
individual traits are the missing variables that mediate both network positions and performance,
so their observed relationship may be spurious. That there are positive correlations between
certain individual characteristics and network positions suggests that individual heterogeneity
may moderate the relationship between network positions and performance (Burt 2004, 2007;
Hargadon and Sutton 1997). For example, Burt and Ronchi (2007) suggest that high-status
individuals such as executives are more likely to occupy brokering positions in the firm because
their roles as executives require them to reach out to a diverse range of people. Similarly, Burt
(2007) suggests that inherent abilities, such as possessing performance-enhancing cognitive
skills, are ultimately responsible for improving work performance. In short, in this view,
network positions are a function of human capital.
To detect a causal relationship between network positions and performance, an
exogenous source of variation is needed (Munshi 2003). I exploit the adoption of a social
networking tool that can exogenously change a person's network position over time. The
primary function of this technology (Expertise-Find) is to allow users to search for experts using
keywords. Because people resort to technologies only when they cannot find relevant experts in
their immediate network neighborhood (Borgatti and Cross 2003), the experts in a user's search
are often outside of the person's existing social circle. If users decide to reach out to these
experts, the network diversity of the users is likely to increase after using this tool. Accordingly,
the adoption of Expertise-Find could exogenously change a person's network position. If
observable improvements in work performance are detected after the technology adoption, it is
likely that the increase in network diversity induces the performance improvement, suggesting a
causal relationship between network positions and performance.
Expertise-Find is similar to popular search engines on the Web, such as Google, with the
only difference being that instead of URLs, it returns a list of people whose expertise is relevant
to the search query. This tool aggregates as much information as it can about the employees
inside the firm using the intranet. For example, the tool can crawl and mine public information
on the intranet about employees using their online profiles, resumes and online forums as well
as private communication exchanges if they decide to volunteer their data. In aggregate, these
data serve as the basis to infer individual expertise at the firm. For example, when searching for
the phrase "Social Networks," Expertise-Find would return a list of people ranked by the relevance of
their expertise to social networks (Figure 3). Each search result lists the name of the
expert, a picture (if available in the public HR directory), the job role, and the division the expert
belongs to. If one clicks on the person, the system shows more details, such as the physical work
location and contact information. In order to understand how employees use the search tool and
how often they actually contact the experts from the search, I conducted an extensive survey
about the general usage and search behaviors. The consistent pattern from the survey reveals
that the vast majority of people use Expertise-Find when they have already exhausted their
existing local networks, conforming to earlier studies (Borgatti and Cross 2003). By contacting
experts suggested by the tool, users are more likely to find the information they need either
directly, or through further recommendations from the expert. Evidence also suggests that the
relationship formed between the expert and the searcher can become more permanent with
repeated interactions. One person interviewed commented that she made a friend after
contacting an expert through the tool. One of the experts who had helped her earlier was
transferred to the same work location as she was, and she offered to help the expert through the
transition and they became friends afterward. Some experts also mentioned that they received
thank-you gifts from the searchers they helped, and this helped to enhance their relationship.
The firm has a program that sponsors such gifts so that individuals can use them to thank
people in the organization who are helpful to their work.
Figure 3: Snapshot of Expertise-Find. Search result from searching for the phrase "Social Networks."
Overall, by contacting people from the search result, users are more likely to reach out to
a distant group of people, increasing their network diversity. Because I have the historical
electronic communication data of the volunteers, it is possible to measure the network change
for the same person before and after the adoption. If there is a change in the network position
after the adoption, it is plausible to attribute the change to using the search tool. If we
simultaneously observe a performance change, it is likely that the performance gain is due to the
change in network positions. However, there may be self-selection factors that could induce
both a network change and the adoption of the search tool, and it is important to address them.
Selection Effect
An important concern is that there is a selection bias in choosing when and why to sign
up for Expertise-Find. The bias can simultaneously drive the adoption of the tool as well as any
change in network positions. However, three factors help alleviate the bias. First, I examine the
change in a person's network position before and after the adoption. If there are any unobserved
individual characteristics, such as the propensity to use new technologies, that can drive both
the adoption and the network change, I can eliminate this type of bias through a fixed-effect
specification. Second, people adopted this tool at different times throughout the study, allowing
me to control for any temporal shocks that can affect the adoption choice. For example, if people
are more likely to sign up for the tool after their annual performance review in February,
controlling for the February-effect can eliminate this bias. It is also plausible that people would
choose to use Expertise-Find when they already have many consulting projects. Consequently, it
may seem that a network change is affecting the change in billable revenue, but it is actually a
reverse causality in which having a heavy workload induces people to use the technology and
change their network positions as a result. In order to eliminate this bias, I use the average
monthly billable revenue in the past 6 months to control for the existing workload. It is also
possible that a person chooses to adopt the tool when a project requires different knowledge
from what they had before. Hence, the person uses the tool to access information. I argue that
the adoption of Expertise-Find can be particularly helpful because it provides a means for the
person to reach experts in distant pockets of the organization. Thus, the tool can reduce the
search costs of finding information and help the person complete projects and satisfy clients.
One could also argue that it is not the network, but the ability to locate information
quickly, that is ultimately responsible for inducing the performance change. Because Expertise-Find can effectively locate the source of information, it reduces the search cost of information
that is ultimately affecting performance. However, as I argued earlier, a structurally diverse
network is the reason for reducing the search cost because such networks can generate information
benefits that expose people to more information, and more unique information, than their
peers. Hence, using Expertise-Find as an instrument for network diversity, I can directly
observe if network diversity produces information benefits in the forms of information diversity
and social communication.
After controlling for factors that may drive the adoption choice, it is plausible that the
adoption is exogenous for changing the network position. Although I am aware that there could
still be other unobserved heterogeneities that violate this assumption, interviews and surveys on
user behaviors do not show any other consistent pattern that could drive both the technology
adoption and the change in network positions.
Empirical Methods
I estimate the relationship between network diversity and work outcomes using the
adoption of Expertise-Find as an instrument for network diversity. To understand how a
structurally diverse network induces superior work outcomes, I examine if network diversity
actually generates information diversity and social communication, the two types of information
benefits theorized to arise from a structurally diverse network. Using the adoption of Expertise-Find, I hope to find evidence of causal relationships between network diversity and information
diversity, between network diversity and social communication, and between network diversity
and work outcomes.
Furthermore, I am interested in how information benefits (information diversity and
social communication) ultimately affect different types of work outcomes: billable revenue and
layoffs. However, the instrumental variable approach is not sufficient to identify the
relationship because I have two potentially endogenous variables but only one instrument.
Hence, in order to control for the differences in individuals' characteristics, I incorporate
attributes such as gender, demographics, and job roles that may affect both information benefits
and work outcomes. If the unobserved heterogeneity in individuals' characteristics is correlated
with the error terms in the model, estimates using pooled OLS will be biased. To address this
issue, I examine the variation within and across individuals over time using both fixed and
random effect models to control for the bias. However, this technique can only be applied when
studying the impact of network positions on billable revenue, because I have a longitudinal
panel of both. But layoffs are cross-sectional because being laid off is a one-time event. Thus, I
can only control for observable individual characteristics, instead of exploiting the variation
within individuals as I could with the analysis on billable revenue. To alleviate the endogeneity
concern in analyzing layoffs, I employ the lagged measurements of network characteristics at
time t-1 to predict layoffs at time t. Specifically, I use the electronic interactions six months prior
to the layoff events to calculate network variables. If networks are to have an effect on layoffs,
the network of communications prior to the layoff event should have an important impact.
Furthermore, I also included individuals' objective performance, billable revenue, to predict
layoffs, because employees with superior performance should have lower probability of being
laid off. To mitigate the estimation problem arising from the endogenous relationship between
network characteristics and billable revenue, I use the billable revenue generated 6 months prior
(at time t-2) to when network characteristics are calculated (at time t-1).
Network Change from the Technology Adoption
First, I examine if the adoption of Expertise-Find can actually induce a change in
network positions. It is possible that the adoption is not a random event. However, because I
examine the network change for the same person over time, fixed-effect models can eliminate
many individual heterogeneities, such as human capital, that might bias the estimates. I also
control for temporal shocks to mitigate some biases from time-varying characteristics. By
including a dummy for each calendar month, I eliminate the seasonal effects that can drive the
adoption choice. However, there might still be time- and individual-varying biases. For instance,
it is possible that people are more likely to adopt this technology when facing high workloads.
Thus, I use the average billable revenue in the past six months to control for the general
workload at the time of the adoption.
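The way a fixed-effect model removes time-invariant individual heterogeneity can be shown with the within transformation on a toy panel. The data below are invented for illustration: two people with different fixed "ability" levels and the same true slope of 2. Calendar-month dummies and the workload control are omitted to keep the sketch short.

```python
from collections import defaultdict

def demean_by_person(rows, key):
    """Subtract each person's own mean of rows[key] (within transformation)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for r in rows:
        sums[r["person"]] += r[key]
        counts[r["person"]] += 1
    return [r[key] - sums[r["person"]] / counts[r["person"]] for r in rows]

def within_slope(rows, x="x", y="y"):
    """Slope from regressing person-demeaned y on person-demeaned x."""
    xd, yd = demean_by_person(rows, x), demean_by_person(rows, y)
    return sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

# y = ability + 2 * x; the high-ability person also has higher x.
rows = [
    {"person": "p1", "x": 4, "y": 18}, {"person": "p1", "x": 5, "y": 20},
    {"person": "p1", "x": 6, "y": 22},
    {"person": "p2", "x": 1, "y": 2}, {"person": "p2", "x": 2, "y": 4},
    {"person": "p2", "x": 3, "y": 6},
]

fe_slope = within_slope(rows)
# Pooled estimate: treat all rows as one group, so ability is not removed.
pooled_slope = within_slope([dict(r, person="all") for r in rows])
print(fe_slope, pooled_slope)
```

The pooled slope is inflated because ability is correlated with x, while the within estimate recovers the slope used to generate the data. Time-varying confounders are not removed by this transformation, which is why the workload control is still needed.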
Figure 4: Event study for when people adopted Expertise-Find (plot title: "Event Study: Structural Diversity and Technology Adoption"; X-axis: months since signup). Each point on the graph is the
coefficient estimate calculated from regressing network diversity on each month since adoption
after controlling for calendar-month dummies, past billable revenue, and individual fixed effects.
A value of zero on the X-axis indicates that Expertise-Find has just been adopted. Negative values on
the X-axis indicate the number of months before the adoption, and positive
values indicate the number of months that have passed since the adoption. The graph
shows that structural diversity gradually increases after the adoption event at X=0.
To construct the technology adoption variable, I use a binary variable that equals 1 for
every month after the person has adopted Expertise-Find and zero before the adoption has
happened. Overall, there is a positive and significant correlation between the instrument and the
endogenous variable in the first-stage regression. Using the fixed-effect model, I find that the
correlation between network diversity and the adoption of Expertise-Find is .114 (t = 17.86) after
controlling for seasonality and past performance. To estimate the validity of the instrument, I
calculate the concentration parameter, which is 86.7, indicating that the adoption of Expertise-Find is not a weak instrument (Hansen, Hausman, and Newey 2004).5 Figure 4 shows the
relationship between network diversity and the timing of the adoption in an event study. Each
data point on the graph shows the coefficient estimates calculated from regressing the network
diversity on each month before and after the adoption event in the 2-year period in my sample.
After factoring out seasonality, individual fixed-effects, and past performance, the coefficient
estimate for months after the adoption event (X>0) is increasing over time, indicating that
Expertise-Find can induce a change in network diversity (Figure 4).
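The adoption dummy and the event-time variable used in the event study can be constructed per person-month as sketched below. The integer month index, the `None` encoding for people who never adopt, and the choice to code the adoption month itself as 0 are assumptions for this example.

```python
def adoption_variables(month, signup_month):
    """Return (post_adoption, months_since_signup) for one person-month.

    signup_month is None for people who never adopted (assumed encoding).
    The adoption month itself is coded 0, an assumption based on
    "every month after the person has adopted".
    """
    if signup_month is None:
        return 0, None
    months_since = month - signup_month
    post = 1 if months_since > 0 else 0
    return post, months_since

# One person who signs up in month 5, observed in months 3 through 7.
panel = [adoption_variables(m, 5) for m in range(3, 8)]
print(panel)
```

The first element is the binary instrument used in the panel regressions; the second is the X-axis of the event study and, at the layoff date, the months-since-signup instrument used in the cross-sectional analysis.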
The reduced-form regressions in Table 3 show that the adoption is positively associated
with generating billable revenue as well as reducing the risk of layoff (or increasing job
retention). After controlling for temporal shocks, individual fixed-effects, and a person's past
performance, Model 1 of Table 3 shows that the adoption of Expertise-Find is positively
associated with generating more billable revenue. As with layoffs, the adoption is also positively
correlated with job retention. However, because the layoff event has only one observation for
each person, it is impossible to use the earlier instrumental variable approach that relies on the
network change before and after the adoption. Instead, I use the number of months since a
person has signed up for Expertise-Find to instrument for network diversity. As shown in Figure
4, network diversity gradually increases after a person has started to use the tool. Thus, other
things being equal, early adopters should have more structurally diverse networks than late
adopters. To test the validity of this instrument, I calculate the concentration parameter in the
first-stage regression and the value is 11, slightly above the cut-off of the weak
instrument test. The reduced-form regression (Model 2) shows that the number of months
passed after the adoption is positively correlated with job retention.
5 The test for weak instrumental variables requires the concentration parameter to be greater than 10
(Hansen, Hausman, and Newey 2004). Any value less than 10 indicates the presence of a weak instrument.
Table 3: Reduced-form Regressions: Adoption on Billable Revenue and
Rows: Dep var; Individual Fixed Effect; Month Dummies; Control variables
Control variables (both models): Communication Volume; Past Billable Revenue; Work divisions; geographical locations
Clustered standard errors; *** p<0.01, ** p<0.05, * p<0.1
Network Effect on Billable Revenue
Next, I examine whether a technology-induced change in network positions can induce a change in
performance over time. Model 1 of Table 4 shows the OLS estimate on the correlation between
network diversity and billable revenue, after controlling for demographics, the work division as
well as the managerial and technical level for the person. This is what has been traditionally
estimated in understanding the relationship between network diversity and performance in
previous work. As shown in Model 1, the coefficient estimate for network diversity is positive and
the effect is relatively large. A 1% increase in network diversity is correlated with billing an additional 886 US
dollars in a month. However, when a fixed-effect specification is used (Model 2), the size of the
coefficient is reduced by 17% (β_network diversity = 733.0, p < .01). This shows that unobserved time-invariant
individual characteristics could drive changes in both network diversity and work
performance. In Model 3, I estimated the effect of network diversity on billable revenue using
the adoption of Expertise-Find as an instrumental variable (IV) for network diversity. The
coefficient from this IV regression is reduced dramatically by 82% (β_network diversity = 126.5, p < .1),
demonstrating that time-varying individual heterogeneity can still bias the estimate upward.
However, network diversity in the IV regression continues to be positive and statistically
significant, demonstrating that it can induce a positive change in performance.
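The logic behind the shrinking coefficient can be illustrated with a small simulation. Everything below is invented for illustration and is not the study's data: an unobserved trait (`ability`) raises both network diversity and revenue, the binary adoption indicator is assumed to be independent of that trait, and the true effect of diversity on revenue is set to 2. With a single binary instrument and no other covariates, the IV estimate reduces to a ratio of covariances.

```python
import random

def cov(a, b):
    """Population covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

random.seed(0)
n, true_effect = 20000, 2.0
z, x, y = [], [], []
for _ in range(n):
    ability = random.gauss(0, 1)                     # unobserved trait
    adopted = 1.0 if random.random() < 0.5 else 0.0  # instrument (assumed exogenous)
    diversity = 0.5 * adopted + ability + random.gauss(0, 1)
    revenue = true_effect * diversity + 3.0 * ability + random.gauss(0, 1)
    z.append(adopted)
    x.append(diversity)
    y.append(revenue)

ols = cov(x, y) / cov(x, x)   # biased upward by the omitted trait
iv = cov(z, y) / cov(z, x)    # uses only adoption-induced variation
print(round(ols, 2), round(iv, 2))
```

In this simulated setting the OLS slope overstates the effect while the IV estimate lands near the value used to generate the data, mirroring the direction of the change from Model 1 to Model 3.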
Table 4: Network Diversity and Performance
Dependent variable: Monthly billable revenue
Regressors listed: Average billable revenue in the past 6 months; Gender (0 = male); Manager (dummy); Job rank (6-12); Division (dummies); Sales Division (dummy); Headquarter (dummy)
Controls: monthly dummy for each of the 24 months
In Models 4 and 5 of Table 4, I incorporated the total number of electronic messages
exchanged over a month as a control for individual differences in online media use. It is possible
that tech-savvy individuals are more likely to adopt a new technology and simultaneously be
high performers. After controlling for usage of electronic media, the results largely mirror earlier
results. The parameter estimate of network diversity in the IV model is significantly less than the
estimates in the fixed-effect model, but the coefficient is still positive and statistically significant.
It is also possible that existing workload may drive the adoption of Expertise-Find as people
seek to use this tool to help with their high workload. To address this potential bias, I controlled
for the average monthly revenue in the past 6 months. As shown in Model 6 and 7, while past
performance is strongly correlated with the current billable revenue, the IV estimate for network
diversity continues to be positive and significant. However, the size of the effect estimated in the
IV regression is much smaller than estimates using the fixed-effect model and the OLS model.
Overall, these results largely support Hypothesis la.
Network Effect on Layoffs
While I show evidence of a causal relationship between network diversity and billable
revenue, I also examine whether network positions can affect a person's risk of being laid off. If
network positions are to have an impact on work outcomes, they should have an even more
pronounced impact on layoffs, because unlike promotions and performance evaluations, layoffs
are a more traumatic experience for most people and network contacts should play an important
role in keeping a person from being laid off. Table 5 shows the cross-sectional analysis of the
network effect on layoffs using the network characteristics calculated from six months of
electronic communications prior to the layoff event.
The first model of Table 5 shows the effect of network diversity on job retention (1 − layoffs). Gender and job roles do not show any statistically significant effect on job retention, but
geographical location appears to have an effect. Compared to workers in the European Union, workers in
the US are more likely to be laid off. This difference is probably due to stronger labor laws in
Europe, which make it harder for the firm to downsize. I also control for the usage of digital
media and show that network diversity is positively correlated with job retention, but it is just
short of being statistically significant (Model 1).
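Table 5: Network Diversity and Layoff/Retention
[Table body not recoverable from the source; only labels and two cell values survive.]
Rows: ln(network diversity), i.e. ln(1 − constraint); ln(Billable revenue); Volume of email/IM/calendar events; Gender (0-male); Job Role (level 6-12); US (dummy); Division
Estimator named in surviving column headers: IV Probit
Surviving cell values: -.362** (following the US dummy label); .133 (Division)
Standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1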
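Table 5 labels network diversity as one minus constraint. As a rough illustration of what that quantity captures, the sketch below computes Burt's (1992) constraint for a node in a small undirected, unweighted graph and reports diversity as one minus that value. The function names and the toy graphs are assumptions made for illustration; the measure used in this study is computed from weighted electronic communication data and may differ in detail.

```python
def constraint(adj, i):
    """Burt's constraint for node i.

    adj maps each node to the set of its neighbours (undirected,
    unweighted, no self-loops). With equal tie weights, the share of
    a's network time invested in b is 1 / degree(a) for each neighbour.
    """
    def p(a, b):
        return 1.0 / len(adj[a]) if b in adj[a] else 0.0

    total = 0.0
    for j in adj[i]:
        # Indirect investment in j through mutual contacts q.
        indirect = sum(p(i, q) * p(q, j) for q in adj[i] if q != j)
        total += (p(i, j) + indirect) ** 2
    return total

def network_diversity(adj, i):
    """Diversity as one minus constraint (assumed definition)."""
    return 1.0 - constraint(adj, i)

# Toy example 1: node 0 bridges three contacts who do not know each other.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}

# Toy example 2: node 0 has the same three contacts, but they all know each other.
clique = {n: {m for m in range(4) if m != n} for n in range(4)}

print(network_diversity(star, 0), network_diversity(clique, 0))
```

With the same number of contacts, the bridging position in the first graph yields a much higher diversity score than the fully connected position in the second, which is the sense in which a network "rich in structural holes" is structurally diverse.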
In Model 2, I examine if the relationship between network diversity and the risk of being
laid off could be causal using instrumental variables. I use the number of months that have
passed since a person has adopted Expertise-Find as an instrument for network diversity.
However, the instrument can be problematic because there might be individual characteristics
that drive both the network change and the likelihood to adopt early. This is more problematic
in a cross-sectional analysis, because it is impossible to exploit the fixed-effect model to
eliminate any time-invariant individual characteristics, such as the propensity to be early or late
adopters. To address this issue, I control for demographics, gender, job role, and rank and other
observable individual traits. However, I am aware that there might still be unobserved factors
that may drive both changes in network positions and the risk of layoffs.
The instrumental variable approach shows that the coefficient on network diversity is
positive. Specifically, a one-percent increase in network diversity is correlated with an
increase of 15.7 percent in job retention, providing evidence that peripheral actors are more
likely to be laid off than those who occupy more central positions in the network. However, it is
possible that those with a structurally diverse network may just perform better and for that
reason are less likely to be laid off. To address this issue, I control for objective work
performance using billable revenue. However, billable revenue is an endogenous variable
because network positions can simultaneously affect billable revenue and the risk of being laid
off. Hence, including contemporaneous billable revenue as an independent variable is problematic (Angrist and
Pischke 2009). To address this problem, I use billable revenue lagged 6 months before
the network characteristics are measured. The timing difference implies that billable revenue is
predetermined before network positions are calculated. Thus, it is less likely to be an
outcome in the causal nexus (Angrist and Pischke 2009). Using billable revenue from an earlier
period is also beneficial because it controls for the possibility that people who are finishing their
current projects may face an increased risk of layoff when they have not lined up any future projects. As
expected, objective performance, measured by billable revenue, is a strong predictor of job
retention (or reduced risk of layoff). Similarly, network diversity continues to be positively
correlated with job retention (Model 3, Table 5). If the main advantage of having a structurally
diverse network is access to relevant information and expertise, the billable revenue
generated should capture the performance impact from network diversity. But Model 3 shows
that a structurally diverse network provides an additional shield against layoff, even after
controlling for objective performance, and interestingly, the effect from network diversity is
actually greater than that from billable revenue
(β_billable revenue = .090, β_network diversity = .150).
The F-test is significant at the p < .001 level, demonstrating that in addition to
information diversity, a structurally diverse network can protect a worker from being laid off,
beyond generating more billable revenue for the person.
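To make the comparison of the two coefficients concrete, the following sketch evaluates a probit retention model at a single hypothetical worker. The two slope values are the Model 3 coefficients quoted above; treating them as raw probit index coefficients is an assumption, and the intercept (`B0`) and the worker's covariate values are invented for illustration. The printed numbers are therefore not results from this study.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: probit link

B_DIV = 0.150   # coefficient on ln(network diversity), quoted in the text
B_REV = 0.090   # coefficient on ln(billable revenue), quoted in the text
B0 = 0.5        # hypothetical intercept (not reported in the text)

def index(ln_diversity, ln_revenue):
    """Linear probit index for one worker."""
    return B0 + B_DIV * ln_diversity + B_REV * ln_revenue

def retention_probability(ln_diversity, ln_revenue):
    """P(retained) = Phi(index)."""
    return STD_NORMAL.cdf(index(ln_diversity, ln_revenue))

def marginal_effect(ln_diversity, ln_revenue, coef):
    """Derivative of P(retained) with respect to one regressor: phi(index) * coef."""
    return STD_NORMAL.pdf(index(ln_diversity, ln_revenue)) * coef

# Hypothetical worker: diversity 0.6, lagged monthly revenue of 10,000.
ln_div, ln_rev = math.log(0.6), math.log(10_000)

p0 = retention_probability(ln_div, ln_rev)
p1 = retention_probability(ln_div + math.log(1.01), ln_rev)  # 1% more diversity

me_div = marginal_effect(ln_div, ln_rev, B_DIV)
me_rev = marginal_effect(ln_div, ln_rev, B_REV)

print(f"baseline P(retained) = {p0:.3f}, change for +1% diversity = {p1 - p0:.5f}")
print(f"marginal effect ratio (diversity / revenue) = {me_div / me_rev:.2f}")
```

Because both marginal effects share the same density term, their ratio equals the ratio of the coefficients wherever the model is evaluated, which is why the relative size of the two coefficients can be compared directly.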
A possible explanation for why network diversity can reduce the risk of layoff even after
controlling for billable revenue is that actors with a structurally diverse network can be
instrumental for helping others to generate revenue for the firm. By providing key information
and expertise to their network contacts, the actors can indirectly contribute to the profitability of
the firm and, accordingly, they are less likely to be laid off. If this is the case, we would expect
that the billable revenue generated from network contacts can reduce a person's risk of being
laid off. However, the average billable revenue generated from a person's network contacts (1
degree away), is not statistically significantly correlated with retention (Model 4). This result
provides some evidence that helping others does not reduce a person's risk of being laid off.
Lastly, I examine if the results could be causal using the number of months that have
passed since a person has adopted Expertise-Find as an instrument for network diversity. Model
4 shows that a 1% increase in network diversity is associated with an increase of 1n.8 percent
in job retention, demonstrating that network diversity has a significant impact on layoffs.
However, the sizes of the effects from network diversity and billable revenue are comparable.
Interestingly, the average billable revenue of network contacts increases the risk of layoffs. This
is possibly because when others perform well, the relative performance of
the person decreases, which increases that person's risk of layoff. Taking these results together, a
structurally diverse network is positively associated with job retention, supporting Hypothesis
1b. To understand exactly how a structurally diverse network can increase the rate of retention
as well as generate more billable revenue, I examine the effect of information benefits derived
from a structurally diverse network. In particular, I focus on information diversity and social
communications and their impacts on work outcomes.
Information Diversity and Social Communication as a Function of
Network Diversity
I explore if a structurally diverse network, as measured by network diversity, actually
generates information benefits, specifically in the forms of information diversity and social
communication. Information diversity is calculated as the number of topics in a person's
electronic communications. Social communication is calculated using a friendship index that
measures the frequency of words in the messages that are related to socializing and informal
activities. Friendship index is a proxy for the referral process, because friends are more likely to
advocate for the person, trumpeting his or her accomplishments at key junctions such as during
Table 6 shows the relationship between information diversity and network diversity, and
between the friendship index and network diversity. I find strong evidence that network
diversity generates information diversity. Using a fixed-effect model, a one-standard-deviation
increase in network diversity is associated with finding an additional 1.5 topics in one's
electronic communication (Model 1). The effect continues to be positive (β_network diversity = 6.11, p <
.05) when an instrumental variable is used for network diversity. This is similar to the findings
of Aral and Van Alstyne (2007), who also find a structurally diverse network to be positively
correlated with access to diverse information. Next, I examine whether a structurally diverse network
can also facilitate the referral process as approximated by social communication. As with
information diversity, the fixed-effect model shows that a one-standard-deviation increase in
network diversity is correlated with gaining .01 points in the friendship index, which is about a
one percent increase (Model 3). Network diversity continues to be positively associated with
the friendship index (β_network diversity = .216, p < .05), using the adoption of Expertise-Find as an
instrumental variable. Overall, the fixed-effect and the IV regressions show a causal relationship
between network diversity and social communication. Having friends in a diverse network can
facilitate the referral process where friends can serve as advocates for the person. These results
support Hypotheses 2a and 2b.
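The two content measures described in this section can be sketched in a few lines. The version below is only a simplified stand-in: the word list `SOCIAL_WORDS` is a hypothetical lexicon, the tokenizer is a plain regular expression, and topic labels are assumed to have been assigned to messages already by some upstream topic model. The actual measures in this study rely on machine-learning techniques that are not reproduced here.

```python
import re

# Hypothetical lexicon of words related to socializing and informal activities.
SOCIAL_WORDS = {"lunch", "coffee", "weekend", "party", "dinner", "birthday"}

def friendship_index(messages):
    """Share of all words in a person's messages that are in the social lexicon."""
    tokens = [w for m in messages for w in re.findall(r"[a-z']+", m.lower())]
    if not tokens:
        return 0.0
    return sum(w in SOCIAL_WORDS for w in tokens) / len(tokens)

def information_diversity(topic_labels):
    """Number of distinct topics appearing in a person's messages."""
    return len(set(topic_labels))

# Toy example for one person in one month.
messages = ["Lunch on Friday?", "Please review the budget"]
topics = ["pricing", "staffing", "pricing"]

print(friendship_index(messages), information_diversity(topics))
```

In this toy example one of seven words is social, and the three labeled messages cover two distinct topics.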
Table 6: Relationships among Network Diversity, Information Diversity and Social Communication
[Table body not recoverable from the source; only labels survive.]
Labels: Network diversity; Friendship Index; Volume of email
Controls: monthly dummies for each of the [...] dummies and individual fixed effect
Standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1
In Table 7, I explore whether information diversity and social communication are
complements or substitutes by examining their correlations. After controlling for temporal
shocks, individual fixed effects, and past performance, the correlation between information
diversity and the friendship index is negative, suggesting a potential substitutive relationship
between the two. Though they can overlap, gathering information and socializing are two
distinct activities. This shows that a structurally diverse network can generate both instrumental
and expressive elements, as shown by information diversity and social communication,
respectively. However, there might be a tradeoff between the two in generating the desired work
outcome, as I explore in the next section.
Information Diversity, Social Communication and Their Relations to
Billable Revenue and Layoffs
To examine how a structurally diverse network improves work performance, I explore
how information diversity and social communication differ in generating billable revenue and
reducing the risk of being laid off. Table 8 shows the effect of these factors in generating billable
revenue. Model 1 shows that, after controlling for the volume of communication, a one-standard-deviation increase in information diversity is correlated with generating an additional
$187.50 of billable revenue, while the friendship index is not statistically significantly correlated
with billable revenue. When both information diversity and the friendship index are treated as
independent variables in the same model (Model 3), I find that information diversity, but not
the friendship index, is positively correlated with generating billable revenue. In Models 4-6, I
control for past billable revenue, because it could be serially correlated with current
billable revenue. Results in these models largely mirror the earlier results in Models 1-3: only
the coefficient on information diversity is statistically significant; the coefficient on the
friendship index is not. Overall, these results support Hypothesis
4a. In Model 7, I explore whether information diversity and the friendship index serve as
complements or substitutes. The interaction effect
(β_information diversity × friendship index = -219.18, p < .05)
is negative. Together with the earlier negative correlation between information diversity
and the friendship index (Table 7), these results suggest that the two serve as substitutes for
generating billable revenue (Athey and Stern, 1998; Brynjolfsson and Milgrom, 2010).
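The logic of the interaction test can be sketched on simulated data: if the cross-partial effect of two inputs on the outcome is negative, the return to one input falls as the other rises, which is the sense in which the two are substitutes. The code below is a minimal sketch under assumed values. The simulated coefficients, the independence of the two inputs, and the small `fit_ols` helper are all illustrative, and the fixed effects and controls used in the actual models are omitted.

```python
import random

def fit_ols(rows, y):
    """Least squares via the normal equations (X'X) b = X'y, solved by Gauss-Jordan."""
    k = len(rows[0])
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))]
           for i in range(k)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        scale = aug[c][c]
        aug[c] = [v / scale for v in aug[c]]
        for r in range(k):
            if r != c:
                factor = aug[r][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [aug[i][k] for i in range(k)]

rng = random.Random(1)
rows, revenue = [], []
for _ in range(2000):
    info_div = rng.gauss(0, 1)    # standardized information diversity (simulated)
    friend = rng.gauss(0, 1)      # standardized friendship index (simulated)
    # Assumed data-generating process with a negative interaction of -1.5.
    outcome = 1.0 + 2.0 * info_div + 0.5 * friend - 1.5 * info_div * friend + rng.gauss(0, 1)
    rows.append([1.0, info_div, friend, info_div * friend])
    revenue.append(outcome)

intercept, b_info, b_friend, b_interaction = fit_ols(rows, revenue)
print(f"interaction coefficient: {b_interaction:.2f}")
```

A negative estimate of `b_interaction` means that the marginal return to information diversity, `b_info + b_interaction * friend`, shrinks as the friendship index grows, and vice versa.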
[Table 8; caption truncated in the source: "... and Social ..."; table body not recoverable.]
Rows: Information Diversity; Friendship Index; Information Diversity × Friendship Index; volume of communication; past billable revenue
Controls: monthly dummy for each of the 24 months, individual-level fixed-effect
Standard errors in parentheses. *** p<0.01, ** p<0.05
Next, in Table 9, I explore the effect of information diversity and the friendship index on
job retention (reducing risk of layoff). The first model shows the correlation between
information diversity and job retention after controlling for demographics, gender, job ranks,
and dummies for regions and business divisions. Contrary to the result in the performance
analysis, information diversity is not correlated with retention (Model 1). However, a one-standard-deviation increase in the friendship index is associated with an 11 percent increase
in job retention (Model 2). When both information diversity and the friendship index are jointly
used in the model, the friendship index is still positively associated with retention but the
coefficient on information diversity is not significant. The F-test shows that the effect of the friendship
index is greater than that of information diversity at the p = .01 level. This set of results suggests
that social communication, which approximates the referral process, is more important for
avoiding layoffs than is information diversity. This is the exact opposite of the performance
analysis, where information diversity is more correlated with generating billable revenue than is
social communication.
Because work performance could also have a significant impact on layoffs, I control for
the past billable revenue, using data 6 months prior to the layoff event (Models 4-6 of Table 9).
All else being equal, high performers are more likely to be retained than low performers. It is
also possible that a person contributes indirectly to firm profits by helping his colleagues. Thus,
I control for the average billable revenue of the network contacts in Models 4-6. As in Models 1-3,
I find that compared to information diversity, social communication as measured by friendship
index is more correlated with job retention (Model 4). The F-test shows that the coefficient of
the friendship index is greater than that of information diversity in maximizing job retention (p
< .001). Together these results demonstrate that social communication, which approximates the
referral process, is the primary channel through which a structurally diverse network drives job
retention, lending support to Hypothesis 4b. From qualitative interviews, managers state that a
person is less likely to be laid off when others have heard about his or her work either directly or
indirectly. Because friends are more likely to advocate for friends, having a diverse group of
friends is helpful in averting crises, such as layoffs.
Next, I explore whether information diversity and social communication are substitutes
in reducing the risk of getting laid off. Model 7 shows the interaction effect of information
diversity and the friendship index to be negative and statistically significant. Together with their
negative correlation (Table 7), these results demonstrate that information diversity and social
communication are substitutes (Athey and Stern, 1998; Brynjolfsson and Milgrom, 2010).
Overall, these results show that a structurally diverse network can have both
instrumental (information diversity) and expressive (social communication) elements, contrary
to the notion that it is rare to have both because one may dampen the effect of the other.
However, the tradeoff re-emerges in the ability of a person to mobilize either information
diversity or referrals to achieve a desired work outcome. The substitutive relationship between
the two shows the limitation in mobilizing them together. The return from investing in social
communication will dampen the return to investment in information diversity. An individual
could develop both kinds of social capital but could not benefit from them both equally.
Table 9: Networks and Layoff Risks: Information Diversity vs. Social Communication
[Table body not recoverable from the source; only labels survive.]
Dependent variable (all surviving columns): retention
Rows: Friendship Index; Log(Friends' Billable [Revenue])
Controls: Volume of email/IM/calendar events, gender, job rank, regional dummies, business division
Standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Discussion and Conclusion
In this study, I examine the impact of social networks on billable revenue and layoffs.
Using the adoption of a social networking tool that could change a person's network position
over time, I show evidence of a causal relationship between network diversity and billable
revenue and between network diversity and layoffs. However, the size of the effect is much
smaller than the traditional OLS and fixed-effect estimates. Because this tool can improve a
person's network position primarily through information-seeking activities, the improvement in
work performance is likely to come from the information benefits derived from having a
structurally diverse network. However, it is worth noting that there are different types of
information benefits. I show two types of information benefits, information diversity and
referrals, and they could have different effects in generating billable revenue and avoiding
layoffs. Using the adoption of Expertise-Find as an instrument for network diversity, I show
that a structurally diverse network can generate both referrals, as approximated by social
communication, and information diversity.
To examine how the effect of information diversity differs from the effect of referrals in
generating superior work outcomes, I take advantage of information technology that captures
the digital traces from people's daily communications. I use advanced machine-learning
techniques to assess the content of people's electronic communications. To measure the
diversity or the novelty of information content, I calculate the number of distinct topics in a
person's communications. To measure referrals, I calculate a friendship index that captures the
frequency of words in the electronic communications that are related to informal and social
activities. Comparing the measurement of information diversity with the friendship index, I
show that the former is positively correlated with generating billable revenue, whereas the latter
is not. However, I find that in the case of layoffs, the friendship index is positively associated
with retention, while information diversity is not. Interviews with managers who participated in
the layoff decisions suggest that the referral is more important for job retention because layoffs
can have a dramatic effect on the remaining colleagues. Thus, these colleagues are likely to serve as
advocates in critical situations such as impending layoffs, promoting one's work and
accomplishments to others. This can reduce the probability of being laid off.
The two types of information benefits can also be classified as instrumental and
expressive, with information diversity being the instrumental element and social
communication being the expressive element. Contrary to the notion that it is difficult to have
both in a network because instrumental ties may dampen the effect of expressive ties
(Fernandez 1991), I show it is possible to have both in a structurally diverse network. However,
information diversity and the friendship index are shown to be substitutes for generating
billable revenue and reducing the risks of getting laid off, suggesting a tradeoff in the returns
from investing in instrumental (information diversity) or expressive elements (social
communication) in a structurally diverse network. While it is possible to have both types of
information benefits from a structurally diverse network, one cannot benefit from both equally.
Thus, looking at their effects on different work outcomes is important. Information
diversity primarily drives billable revenue, which is an objective and contractible performance
metric. On the other hand, social communication is intangible and un-contractible. For
example, those with more affective relationships could be great team players, facilitating
collaborations and distributing their resources to other team members when needed. Because
their services are instrumental to the success of the team, they are less likely to be laid off
despite having lower objective performance evaluations. However, social communication can
also be viewed negatively. For example, if multiple factions and cliques have formed as the result
of politics, members of the same faction are more likely to protect their own members even if
their objective performance evaluations are inferior. Thus, the amount of time and energy
devoted to generate information diversity and social communication may depend on the reward
structure and the firm culture. If the reward structure is more aligned with generating more
billable revenue, employees would spend more time and energy gathering diverse information
which is shown to be the key to generating profits for the firm. However, when the culture of the
firm is more group-focused or when the work outcome of individuals, such as layoffs, may also
depend on the team, employees are more likely to spend time socializing, forming friendships,
and lobbying supporters. In the case of layoffs where the decision is not purely based on
observable performance metrics such as billable revenue, having supporters to advocate on
one's behalf can significantly reduce the layoff risk. From the firm's perspective, delegating the
layoff decisions to managers would be optimal if social communication can improve
effectiveness of collaborations among team members and contribute to the profitability of the
firm, because managers have private information about the employees, including friendships,
that the firm cannot observe. However, it is also possible for managers to have a different
objective function from that of the firm, and a manager might choose to lay off a person to
maximize his or her own power inside the organization, even at the expense of the firm. My data
analysis suggests that social communications do not contribute to one's own billable hours or to
the billable hours of one's contacts. This suggests that the impact of social communication on
layoffs is evidence that delegating layoff decisions to managers has important costs. Future work
could attempt to examine more fully the costs and benefits of such delegation in order to
improve our understanding of the optimal allocation of decision rights within firms.
References
Athey, S., and S. Stern (1998) "An Empirical Framework for Testing Theories about
Complementarities in Organizational Design," NBER Working Paper No. 6600.
Ancona, D.G. and Caldwell, D.F. 1992. "Demography and Design: Predictors of new Product
Team Performance." Organization Science, 3(3): 321-341.
Angrist J, Pischke J. Mostly Harmless Econometrics, Princeton University Press, 2009
Aral S, Van Alstyne M. 2007. Networks, Information & Social Capital. International Conference
on Network Science 2007
Bales, R. F. and Slater P. 1955. "Role Differentiation in Small Decision Making Groups." Chapter
V in Family, Socialization and Interaction Process, edited by Talcott Parsons and Robert F.
Bales. Glencoe, IL: Free Press.
Berger, J and Milkman, K.L. 2010, Social Transmission, Emotion, and the Virality of Online
Borgatti S., and Cross R. 2003, "A Relational View of Information Seeking and Learning in
Social Networks," Management Science 49(4): 432-445.
Borgatti, S. and Foster, P. 2003. "The network paradigm in organizational research: A review
and typology." Journal of Management. 29(6): 991-1013
Brynjolfsson, E., Milgrom, R. (2008) "Complementarities in Organizations", NBER Workshop
on Economics of Organization
Burt R. 1992. Structural Holes: The Social Structure of Competition. Harvard University Press,
Cambridge, MA.
Burt, R. 2000. "The network structure of social capital" In B. Staw, and Sutton, R. (Ed.),
Research in organizational behavior (Vol. 22). New York, NY, JAI Press.
Burt R. 2004. Structural Holes & Good Ideas. American Journal of Sociology, 110: 349-99.
Burt R. 2005. Brokerage and Closure: An Introduction to Social Capital, Oxford University
Press, New York, NY
Burt R. 2007. Secondhand Brokerage: Evidence on the Importance of Local Structure for
Managers, Bankers, and Analysts, Academy of Management Journal, 2007, 50(1), pp. 119-
Burt R., Ronchi, D. 2007 Teaching Executives to See Social Capital: Results from a Field
Experiment, Social Science Research, 2007
Burt, R. 2008. "Information and structural holes: comment on Reagans and Zuckerman."
Industrial and Corporate Change, 17(5): 953-969.
Coleman, J.S. 1988. Social Capital in the Creation of Human Capital. American Journal of
Sociology, (94): S95-S120.
Cross, R. and Cummings, J. 2004. "Tie and Network Correlates of Performance in Knowledge
Intensive Work." Academy of Management Journal. 47(6): pp. 928-937.
Cummings J, Cross R. 2003. Structural properties of work groups and their consequences for
performance. Social Networks, 25(3):
Etzioni, A. 1965. "Dual Leadership in Complex Organizations." American Sociological Review
30: 688-98.
Fernandez, R.B. 1991. "Structural Bases of Leadership in Intraorganizational Networks." Social
Psychology Quarterly 54:36-53
Fombrun, C.J. 1982. Strategies for network research in organizations. Academy of Management
Review, 7: 280-291.
Friedkin, N. 1982. Information Flow Through Strong and Weak Ties in Intraorganizational
Social Networks. Social Networks, 3 (1982) 273-285
Goldstein, D. G., Gigerenzer, G. (2002). Models of ecological rationality: The recognition
heuristic. Psychological Review, 109, 75-90.
Goldstein, D. G., Gigerenzer, G. 1999. The recognition heuristic: How ignorance makes us smart.
In G. Gigerenzer, & P. M. Todd, (Eds.). Simple heuristics that make us smart. Oxford:
Oxford University Press.
Granovetter, M. 1973. The strength of weak ties. American Journal of Sociology, 6: 1360-1380.
Granovetter, M. 1982. The strength of weak ties: A network theory revisited. In P. V. Marsden
and N. Lin (eds.), Social Structure and Network Analysis: 105-130.
Hargadon, A. and R. Sutton. 1997. "Technology brokering and innovation in a product
development firm." Administrative Science Quarterly, (42): 716-49.
Hansen, M. 1999. The search-transfer problem: The role of weak ties in sharing knowledge
across organization subunits. Administrative Science Quarterly (44:1): 82-111.
Hansen, C., Hausman, J., and Newey, W., Many Weak Instruments and Microeconometric
Practice, mimeo July 2005.
Homans, G.C. 1974. The Human Group, Social Behavior: Its Elementary Forms, Harcourt,
Brace & World
Ibarra, H. 1992. "Homophily and Differential Returns: Sex Differences in Network Structure and
Access in an Advertising Firm," Administrative Science Quarterly 37(3): 422-447.
Ibarra, H. 1993. "Personal Networks of Women and Minorities in Management: A Conceptual
Framework," The Academy of Management Review 18(1): 56-87.
Ibarra, H. 1995. Race, opportunity, and diversity of social circles in managerial networks.
Academy of Management Journal, 38: 673-703.
Kossinets, G. and D. Watts. 2006. "Empirical Analysis of an Evolving Social Network." Science
(311:5757): 88-90.
Kossinets, G. and D. Watts. 2009. "Origins of Homophily in an Evolving Social Network."
American Journal of Sociology, 115(2): 405-50.
Krackhardt, D. 1992. The strength of strong ties: The importance of philos in organizations. In
N. Nohria & R. G. Eccles (Eds.), Networks and organizations: Structure, form and action.
Cambridge, MA: Harvard Business School Press.
Krackhardt, D., 1995. "Entrepreneurial Opportunities in an Entrepreneurial Firm: A Structural
Approach," in Entrepreneurship Theory and Practice, Spring, pp. 53-69.
Krackhardt, D. and Kilduff, M. 1999. "Whether close or far: Social distance effects on perceived
balance in friendship networks." Journal of Personality and Social Psychology, (76) 770-82.
Krackhardt, D. and Porter, L. 1985. "When Friends Leave: A Structural analysis of the
Relationship between Turnover and Stayer's Attitudes." Administrative Science Quarterly.
30: 242-261.
Kumbasar, E., Romney, A.K., and Batchelder, W.H. 1994. "Systematic biases in social
perception." American Journal of Sociology, (100): 477-505.
Lin, C.Y., Ehrlich, K., Griffiths-Fisher, V. and Desforges, C. "SmallBlue: People Mining for
Expertise Search", IEEE MultiMedia at Work 15(1)
Lin, N. 2001. Social capital: A theory of social structure and action. Cambridge: Cambridge
University Press.
Lincoln, J. R., Miller, J. 1979. Work and friendship ties in organizations: A comparative analysis
of relational networks. Administrative Science Quarterly, 24: 181-199.
Marsden, P. 1990. "Network Data and Measurement." Annual Review of Sociology (16): 435-463.
McGinn, K.L. and Milkman, K.L. 2010, Shall I stay or shall I go? Cooperative and competitive
effects of workgroup sex and race composition on turnover
Munshi, K., 2003 "Networks in the Modern Economy: Mexican Migrants in the U. S. Labor
Market," The Quarterly Journal of Economics 118, no. 2: 549-599.
Obstfeld, D. 2005. "Social networks, the tertius iungens orientation, and involvement in
innovation." Administrative Science Quarterly, 50, 100-130.
Podolny, J. 2001. "Networks as the Pipes and Prisms of the Market." American Journal of
Sociology, (107:1): 33-60.
Podolny, J. and Baron, J. 1997. Resources and relationships: Social networks and mobility in the
workplace. American Sociological Review (62:5): 673-693.
Reagans, R. and McEvily, B., 2003. Network Structure & Knowledge Transfer: The Effects of
Cohesion & Range. Administrative Science Quarterly, (48): 240-67.
Reagans, R. and Zuckerman, E., 2001. Networks, diversity, and productivity: The social capital
of corporate R&D teams. Organization Science (12:4): 502-517
Reagans, R. and Zuckerman, E., 2008, "Why knowledge does not equal power: the network
redundancy trade-off," Industrial and Corporate Change 17, no. 5 (October 1, 2008): 903-944.
Rodan, S. and Galunic, D. 2004. "More Than Network Structure: How Knowledge Heterogeneity
Influences Managerial Performance and Innovativeness." Strategic Management Journal
(25): 541-562.
Scott, B.D. 1996. "Shattering the Instrumental-Expressive Myth: The Power of Women's
Networks in Corporate-Government Affairs," Gender and Society 10(3): 232-247.
Shah, P.P. 2000. "Network Destruction: The Structural Implications of Downsizing," The
Academy of Management Journal (43:1): 101-112.
Slater, P.E. 1965. "Role Differentiation in Small Groups." American Sociological Review 20: 300-10.
Snijders, T. 1999. Prologue to the measurement of social capital. The Tocqueville Review, 10(1):
Sparrowe, R., Liden, R., Wayne, S., Kraimer, M. 2001. Social networks and the performance of
individuals and groups. Academy of Management Journal, 44(2): 316-325.
Wen, Z. and Lin, C.Y., 2010. "Towards Finding Valuable Topics", SIAM International Conference
on Data Mining 2010: 720-731
Wu, F., Huberman, B., Adamic, L., and J. Tyler. 2004. "Information Flow in Social Groups."
Physica A, 337: 327-335.
Wu, L., Lin, C., Aral, S., Brynjolfsson, E. 2009. "Network Structure and Information Worker
Productivity: New Evidence from the Global Consulting Services Industry." Winter
Conference on Business Intelligence, University of Utah, Salt Lake City, UT.
Zaheer, A. and Bell, G.G. 2005. "Benefiting from network position: firm capabilities, structural
holes, and performance," Strategic Management Journal 26(9): 809-825.
Identification of Influence:
An Experimental Platform for Understanding the Relationship between
Social Networks and Performance
Lynn Wu
This study creates an experimental platform for identifying the relationship between social
networks and performance. While a large body of literature has examined the correlations
between certain network topologies and performance, little research has shown a definitive causal
link between social networks and productivity. I address this problem by conducting three
sets of randomized field experiments using an on-line experimental platform at a large
information technology firm. The platform enables randomly selected employees to achieve
certain network characteristics. By examining work performance before and after the
experiment, I hope to tease out the causal linkage between networks and productivity.
Furthermore, I plan to distinguish the type of employees (e.g. peripheral actors) that could
benefit the most from a change in network structure.
This study focuses on the identification of influence in social networks. A large body of
literature on social networks and organizations describes the benefit of social networks on work
performance in various settings. However, little research leverages the ample data that is
created by people's interactions, such as e-mail, call logs, text messaging, document repositories,
wikis, and so on. This gap is problematic, because much of the literature on organizational
networks suffers from the same deficits as the social network literature: both tend to focus
on small, static networks. As a result, these studies generally show a correlation between
work performance and a person's position in the network. While these correlation studies are
important, they do not demonstrate a causal relationship. Without a collection of detailed,
large-scale longitudinal data and the capability to introduce an exogenous source of variation, it
would be hard to answer important questions like these: "Does the influence of social networks
persist over time?" "How do social networks help people achieve better work performance?" "Is
there a causal relationship between network position and productivity?"
The direction of causality is one of the most important questions hampering the
progress of network research. While ample evidence has shown strong correlations between
network characteristics and productivity, the causal direction is unclear. If, for example, we find
that the most productive employees in the firm are those whose networks are full of structural
holes, we might conclude that these workers' positions in the social network afford them timely
access to information and consequently help them to be more productive or to outperform their
peers. An equally likely explanation is a reverse causality story, in which workers are more likely
to seek advice from a few high-performing individuals and as a result, high performers tend to
have networks with structural holes. Essentially, they are the stars who perform well because of
their personal attributes, and find themselves in certain network positions as a result of their
I present two ways to address the direction of causality between network characteristics
and work performance. First, I use the instrumental variable approach in Chapter 1 of this
thesis, finding an exogenous source of variation in a person's social network. However, any
inferences from using instrumental variables require an explicit assumption of exogeneity.
Unless the instruments were truly manipulated using randomization, it can be hard, at times, to
justify their validity. Thus, to truly introduce a source of exogenous variation, I create a platform
to conduct randomized experiments that can answer questions such as whether network
characteristics have a causal effect on performance.
The platform, using an online social
networking tool, allows manipulations to potentially alter a person's network position or their
social behaviors. For example, if structural holes are theorized to improve work performance, an
experiment can be designed to expose randomly selected individuals in the treatment group to
people whose network is more central than those in the control group. Examining the work
performance before and after the interventions, we can explore if a change of network position
can in fact alter work performance. If an improvement in work productivity is detected, it is
reasonable to infer a causal relationship, in which occupying a desirable position in a network
causes individuals to be more productive above and beyond their inherent abilities.
Understanding the direction of causality between network properties and productivity would be
a breakthrough in the field of social networks, information productivity, and organizations.
If the direction of causality is established, it is important to explore the type of workers
who benefit the most from occupying a central position in the network. It is possible that the
peripheral actors are more likely to benefit from connecting to a person who is occupying the
center of the network than people who are already at the center of the network. Similarly, it is
possible that a junior employee may benefit more from becoming more central in the network
than someone who is senior.
To precisely capture the social networks of individuals over time, I take advantage of
recent advances in information technology to collect real-time email and instant messaging (IM)
communication data in a large corporation. Since email and IM archives record detailed
communication logs, such as who has talked to whom, the exact time of the interaction, and the
content of the exchange, constructing social networks using email and IM archives allows
researchers to eliminate errors and bias that are often introduced in self-reports. With access to
detailed records of electronic data for more than 10,000 people in a large organization over
three years, I am able to map social network graphs for each person over time.
In advancing our understanding of how information workers generate value, I focus on a
class of information workers that have rarely been examined in the past: consultants that
represent a large population of information workers who generate revenue by logging "billable
hours." To explore the relationship between social network positions and productivity of these
consultants, I collect detailed and objective performance measures of more than 1,000
consultants, including the numbers of billable hours and participated projects, and the revenue
generated. To understand how consultants generate economic value, I also conducted extensive
interviews with 15 consultants at various stages of their careers. Through these interviews, I find
that efficient access to useful information is crucial, as timely and valuable information can
facilitate fast and high-quality decision-making and is thus critical to satisfying customers. This
is particularly important for the consulting services, as generating repeat business from existing
clients is a key performance indicator and the cornerstone of a consulting business. If
expeditious access to information improves productivity, understanding the mechanisms of how
information workers access the information through both social and technological means is
important. Continuing with the micro-analysis research on worker productivity pioneered by
Ichniowski, Shaw, and Prennushi (1997), I plan to study a single industry in depth and examine
how information workers obtain information through various communication channels and
social networks. With the cooperation of the company and employees, I have been monitoring
email and instant messaging usage to analyze the flow of information and its relationship to
network structure and work performance.
Data and Setting
To understand the micro mechanisms of how social networks can help work
performance, I analyze an electronic communication social network of more than 9000
employees over 3 years. The data contains email and instant messaging activities inside a global
information technology firm with more than 30 product divisions. To my knowledge, this is the
largest social network ever utilized to study the impact of social networks on information worker
productivity. The data is collected using a privacy-preserving social network analysis system
(Lin et al., 2008) that uses social sensors to gather, crawl, and mine various types of data
sources, including calendars and the content of individual email and instant message
communications, and that takes into account the hierarchical structure of the organization as
well as individual role assignments. The system is deployed in more than 70 countries and has
collected detailed electronic communication records of 9035 volunteer employees. In this study,
I constrain the analysis to the sub-network of the 9035 volunteers for whom we have complete
electronic communication data. To assess any potential self-selection bias from using
volunteered data, I compare the network characteristics and job roles of the volunteers with the
rest of the firm. I find minimal differences between the two populations, alleviating
concerns about using only the sub-population of volunteers.
Table 1: Summary Statistics for Person-Level Networks
Variables: Betweenness centrality; Network constraint; Gender (male = 1). Reported statistics include Std. Dev.
To construct a view of the network that reflects the real communications between
employees, I eliminated spam and mass email announcements. Using the timestamp associated
with each electronic communication exchange, I mapped a dynamic panel of social networks
from 2006 to 2009. Each network is built using only the communications occurring in a sliding
window of 6 months. I calculated some standard measures of networks such as centrality,
network size, and network constraints. The summary statistics for the network are shown in
Table 1. This set of data provides a rare opportunity to study how a person's social network
evolves over time.
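As an illustration only, the construction of a dynamic panel of networks from timestamped messages might be sketched as follows. The monthly step between windows, the tuple layout of a message, and all function names are assumptions made for this sketch; they are not details of the system used in this study.

```python
from collections import defaultdict
from datetime import date


def add_months(day, months):
    """Return the first day of the month that lies `months` after day's month."""
    index = day.year * 12 + (day.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def window_networks(messages, start, end, window_months=6):
    """Yield (window_start, adjacency) for each window, stepping one month at a time.

    messages: iterable of (sender, recipient, date) tuples (hypothetical layout).
    adjacency: dict mapping each person to the set of people they exchanged messages with.
    """
    window_start = date(start.year, start.month, 1)
    while add_months(window_start, window_months) <= end:
        window_end = add_months(window_start, window_months)
        adjacency = defaultdict(set)
        for sender, recipient, sent_on in messages:
            # Keep only exchanges that fall inside the current window.
            if window_start <= sent_on < window_end and sender != recipient:
                adjacency[sender].add(recipient)
                adjacency[recipient].add(sender)
        yield window_start, dict(adjacency)
        window_start = add_months(window_start, 1)


def network_size(adjacency, person):
    """Number of distinct contacts of `person` in one window."""
    return len(adjacency.get(person, ()))


# Hypothetical messages between four hashed identifiers.
messages = [
    ("a", "b", date(2006, 1, 15)),
    ("a", "c", date(2006, 5, 2)),
    ("a", "d", date(2006, 8, 20)),
]
for window_start, adjacency in window_networks(messages, date(2006, 1, 1), date(2007, 1, 1)):
    print(window_start, network_size(adjacency, "a"))
```

Each window yields one cross-section of the panel, so a measure such as network size can be tracked for the same person from window to window.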
To explore how social networks are related to work performance, I obtained detailed
financial performance records of more than 10,000 consultants. I focus on 2,500 consultants in
this sample who have volunteered their electronic communication data, and collected detailed
records of 2,592 projects these consultants participated in from June 2007 to July 2008. The
sheer volume of the data permitted a more precise estimation of how population-level topology
in a network contributed to information worker productivity, after controlling for human
capital, work characteristics, and demographics. To protect the privacy of the participants, their
identities are replaced with hash identifiers. Table 2 in the result section shows the baseline
correlation results between various network measures and performance. They largely conform
to theoretical predictions and are consistent with previous correlation studies, lending
confidence to our data collection.
To understand how information is transmitted between consultants, we record the actual
content of all their electronic exchanges over a three-year period from 2006 to 2009. To protect
their privacy, all context and grammatical structure of the content have been stripped and we retain
only a list of word frequencies. Using this set of words alone, it is virtually impossible to
reconstruct the original messages. From the aggregated words of information exchange, we
classify the content into 100 topics of expertise. Monitoring how a person's topics change over
time and tracking how these topics flow in and out of one's network can provide insights on how
employees obtain information through their social contacts. Using this dataset, we can evaluate
both the quality and quantity of information that each person acquires through the network.
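To make the idea of retaining only word frequencies concrete, a minimal sketch is given below. The tokenization rule and the function names are assumptions for illustration; the sketch does not reproduce the actual data collection system or the classification into 100 topics.

```python
import re
from collections import Counter


def to_word_frequencies(message):
    """Reduce a message to unordered word counts, discarding word order and grammar."""
    words = re.findall(r"[a-z']+", message.lower())
    return Counter(words)


def aggregate(messages):
    """Sum the word counts of several messages into one frequency table."""
    total = Counter()
    for message in messages:
        total.update(to_word_frequencies(message))
    return total


# Hypothetical message text.
print(to_word_frequencies("The network, the NETWORK!"))
```

Because only the counts survive, two messages with the same words in a different order become indistinguishable, which is what makes the original text hard to reconstruct.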
Person-level Email Networks
Dependent variable: Personal Monthly Revenue. Specifications: Random Effect; Fixed Effect.
Regressors: Number of strong links to managers; Is Manager; Reach in 3 steps. Surviving cell fragments: 513-35 **; - .777.
Controls: Project Complexity, Line of Business, Months, Regions, job level month, job role
*p<.1, **p<.05, ***p<.001. Huber-White robust standard errors are shown in parentheses
Research Design
Below, I describe two sets of experiments that are used to tease out the causal relationships
between network structure and performance. Both experiments manipulate the social structure
of the treatment group and compare it with the change in the control group. The first set of
experiments makes the treatment group more likely to be connected to central actors in the
network than the control group. Both experiments leverage a social networking search tool
called Expertise-Find, which searches for experts in an organization based on keywords.
Expertise-Find, an Experimental Platform
Expertise-Find is similar to popular search engines on the Web, such as Google, with the
only difference being that instead of URLs, it returns a list of people whose expertise is relevant
to the search query. This tool aggregates as much information as it can about the employees
inside the firm using the firm's Intranet. For example, the tool can crawl and mine information
about employees using their online profiles, resumes, and online forums as well as
communication data (if they decide to volunteer their electronic communication data). These
data serve as the basis to infer individual expertise at the firm. For example, when searching for
the phrase "Social Networks," Expertise-Find would return a list of people ranked by whether
their expertise is relevant to social networks (Figure 1). Each search result lists the name of the
expert, a picture (if available in the public HR directory), the job role, and the division the expert
belongs to. If one clicks on the person, the system shows more details, such as the physical work
location and contact information. By contacting experts suggested by the tool, users may be able
to find the information they need either directly, or through further recommendations from the
expert. Evidence also suggests that the relationship formed between the expert and the searcher
can become more permanent with repeated interactions. One person interviewed commented
that a friendship was formed after contacting an expert through the tool. An expert who had
helped her earlier was transferred to the same work location as she was; she offered to help
the expert through the transition, and they have been friends since. Some experts also mentioned
that they received thank-you gifts from the searchers they helped, further enhancing their
relationship. The firm has a program that sponsors this type of gift so that individuals can use
it to thank people in the organization who are helpful to their work.
Figure 3: Current search result using the people search tool
The experimental platform primarily relies on changing the search algorithms of
Expertise-Find. Currently, the search algorithm aggregates all the information it can find about
a person in the intranet and creates an expertise index based on the content. It does not yet
leverage any social network parameters into the search. Thus, the current search algorithm
ranks individuals highly when their expertise best matches the searched keywords.
The main manipulation for the randomized experiments is to change the search
algorithms for a group of randomly selected individuals. These algorithms are designed such
that people with certain types of network characteristics are more likely to show up at the top of
the search results. For example, a manipulation can be designed to help a person become more
structurally diverse. By listing in the top search results individuals who are more likely to
increase the searcher's network diversity, the manipulation makes searchers more likely to
connect to these individuals, and thus to increase their own structural diversity, than those who
use the default algorithm, which does not rank search results by structural parameters. To verify
that people are on average more likely to view the top search results and actually contact the
experts they find, a user survey on search behaviors was conducted in January 2010. Although the
assumption that top search results are more likely to be viewed is common and has been verified
in Human Computer Interaction research, it is nonetheless important to confirm that it holds in
this setting, as it is a crucial assumption underlying the experiments.
Table 3. Surveys on User Behaviors for Expertise-Find
How often do you use Expertise-Find?
  o Bi-monthly
How often do you look at the first 5 search results?
  o 90% of the time or more
  o 60% of the time or more
  o 30% of the time or more
  o 10% of the time or more
  o less than 5% of the time
How often do you contact people in the first 5 search results?
  o 90% of the time or more
  o 60% of the time or more
  o 30% of the time or more
  o 10% of the time or more
  o less than 5% of the time
How often do you look at the first 10 search results?
  o 90% of the time or more
  o 60% of the time or more
  o 30% of the time or more
  o 10% of the time or more
  o less than 5% of the time
How often do you contact people in the first 10 search results?
  o 90% of the time or more
  o 60% of the time or more
  o 30% of the time or more
  o 10% of the time or more
  o less than 5% of the time
How often do you look at results beyond the first page?
  o 60% of the time or more
  o 30% of the time or more
  o 10% of the time or more
  o less than 5% of the time
On average, how many pages of search results do you go through for a typical search?
Using this platform for conducting randomized experiments, we first explore if changing
the search algorithms of Expertise-Find and manipulating the interface to Expertise-Find could
alter people's network positions. Second, if we find that the experiment could alter a person's
network position, we examine the likelihood of a changed network position affecting
productivity. Lastly, we explore the conditions under which a person's network position has the
most significant change, and how much the change is associated with productivity.
Surveys on Expertise-Find Usage Patterns
For the manipulation to have an effect on a person's social network structure, it is
important to verify that 1) users are more likely to view the top search results, and 2) users are
more likely to contact the top search results from Expertise-Find. We verify this by sending a
survey to the existing Expertise-Find users asking questions about their search behaviors. The
detailed survey questions are found in Table 3. Specifically, the questions explore: 1) How often
do people use Expertise-Find? 2) When they use Expertise-Find, do they focus primarily on the
top search results? 3) How often do they actually contact the people suggested by Expertise-Find?
Table 4: Summary Statistics for Survey Results
Variables: Looking at top 5 search results; Contacting top 5 search results; Looking at top 10 search results; Contacting top 10 search results; Looking beyond first page; Avg pages viewed. Reported statistics include Std Dev.
The survey was sent out to a randomly selected group of 80 users of Expertise-Find; about 25
people filled out the survey, which is a 31% response rate. The summary statistics are in Table 4.
We find that people use Expertise-Find between once a month and once every two months, suggesting
that the experimental period should be at least 6 months or possibly longer for the manipulation to
take effect. Table 4 also shows that most individuals tend to view the top 5-10 results. More than
half the surveyed individuals browse only the first 4 pages. These results largely conform to
existing Human Computer Interaction research. Overall, results at the top of the page tend to be
viewed more often, and searchers are also more likely to contact these individuals.
Table 5: Minimum Detectable Size Calculations (by sample size)
We also followed up by telephoning ten individuals to investigate how they use the tool
and under what conditions they find the tool useful. From this anecdotal evidence, we find that
people are more likely to use the tool when they are starting a new project. They often like to
search for others who are working in similar areas because they are likely to be future resources
or potential competitors. Many times, they would seek these individuals when they encounter
difficult problems. We find that people resort to Expertise-Find to look for potential resources
when they face a new problem in a project and no one in their immediate network can help.
Typically, we find that it is not unusual to contact multiple individuals before finding a
satisfactory answer. However, it does not mean earlier contacts were unsuccessful searches. In
fact, they are often valuable for referring the questioner to other individuals who are more likely
to know the answer, especially once they have taken the time to understand in detail what the
questioner is asking for.
Network Structure and Performance
Three sets of experiments are implemented on the Expertise-Find experimental
platform. They are used to infer the following: 1) the direction of causality between network
positions and performance; 2) the optimal ways to change a person's network over time; 3) the
long-term implications of a temporary shock to a person's network position.
Experiment #1: Passive Network Change through Expertise-Find
This experiment leverages an existing intra-organizational search technology, Expertise-Find, that is designed to find people in the firm whose expertise is related to the searched
keywords. Currently, each search result shows a photo of the expert with a caption describing
the job role, and the division the person belongs to as well as a link to the contact information
such as email address and work phone number. In order to generate an exogenous change to a
person's social network, we create a manipulation to change the algorithm that ranks the search
result. Specifically, the new algorithm will combine both an expertise relevance score and a
network-based score to create a new score for each search result, with the results being reordered by the new score. Currently, the weight on the search relevance score and weight on the
network-based score are the same. The network-based score can also incorporate various other
network parameters, such as network size, length of longest ties, and structural equivalence,
depending on the desired treatment on the network. For this experiment, the network-based
score uses the structural holes calculated from a person's local network. However, it is not the
structural holes of the person in each search result, but the potential structural holes that the
searcher would have generated when a link is created between the searcher and the person in
the search result. Thus, the network-based rank is ordered by the size of the change in structural
holes when adding a candidate to the searcher's immediate network. The final score is the sum
of the rank on the expertise-relevance score and the rank on the network-based score.
NewScore = Rank(search relevance score) + Rank(network-based score)
For example, if a person is ranked first in the search relevance score but ranked fifth in the
potential change in the searcher's structural holes, the score for the person is one plus five, for a
total score of six. The scores are then sorted in increasing order, so individuals with the lowest
scores are displayed as the top search results.
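The re-ranking rule above can be sketched in a few lines. The candidate names and scores below are hypothetical, and the tie-breaking rules are assumptions, since the rule as stated does not say how equal scores are ordered.

```python
def ranks(scores):
    """Map each candidate to a 1-based rank, where rank 1 is the highest score.

    Ties are broken alphabetically by candidate name (an assumption).
    """
    ordered = sorted(scores, key=lambda name: (-scores[name], name))
    return {name: position + 1 for position, name in enumerate(ordered)}


def reorder(relevance, network_score):
    """Return (display order, NewScore per candidate).

    NewScore = Rank(search relevance score) + Rank(network-based score).
    """
    relevance_rank = ranks(relevance)
    network_rank = ranks(network_score)
    new_score = {name: relevance_rank[name] + network_rank[name] for name in relevance}
    # Lowest NewScore is shown first; equal NewScores fall back to relevance rank (an assumption).
    order = sorted(new_score, key=lambda name: (new_score[name], relevance_rank[name]))
    return order, new_score


# Hypothetical scores for five candidates returned by one query.
relevance = {"ana": 0.9, "bo": 0.8, "cy": 0.7, "di": 0.6, "ed": 0.5}
network_score = {"ana": 0.1, "bo": 0.5, "cy": 0.4, "di": 0.3, "ed": 0.2}
order, new_score = reorder(relevance, network_score)
print(new_score["ana"])  # 6: first on relevance, fifth on the network-based score
print(order)             # ['bo', 'cy', 'ana', 'di', 'ed']
```

Summing ranks rather than raw scores gives the two components equal weight regardless of their scales, which matches the equal weighting described above.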
If we assume that people are more likely to click on the top search results, they are also
more likely to be exposed to individuals whose connection to the searcher can generate the
largest improvement in the searcher's network-based score (structural holes, in this case).
Research in Human Computer Interaction has in general validated this assumption in various
settings (Borgatti & Cross 2003). We also verified this assumption through a user survey
designed to provide understanding of people's search behaviors. Figure 2 shows the search
results that the treatment group would see, and Figure 1 shows the default search order that the
control group would see when the same keywords are used in Expertise-Find.
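The network-based score used in this experiment, the change in the searcher's structural holes from adding a candidate, could be computed along the following lines. This sketch assumes that structural holes are measured by Burt's network constraint on an unweighted, undirected network (lower constraint means more structural holes); the adjacency layout and function names are hypothetical and do not describe the deployed system.

```python
def proportion(adj, i, j):
    """Share of i's ties invested in j (unweighted: 1 / degree of i, or 0 if not tied)."""
    return 1.0 / len(adj[i]) if j in adj.get(i, ()) else 0.0


def constraint(adj, ego):
    """Burt's network constraint of `ego`; an ego with no contacts gets 0.0 by convention here."""
    total = 0.0
    for j in adj.get(ego, ()):
        # Indirect investment in j through ego's other contacts q.
        indirect = sum(
            proportion(adj, ego, q) * proportion(adj, q, j)
            for q in adj[ego] if q != j
        )
        total += (proportion(adj, ego, j) + indirect) ** 2
    return total


def with_tie(adj, a, b):
    """Return a copy of the network with an added tie between a and b."""
    new = {person: set(contacts) for person, contacts in adj.items()}
    new.setdefault(a, set()).add(b)
    new.setdefault(b, set()).add(a)
    return new


def structural_hole_gain(adj, searcher, candidate):
    """Drop in the searcher's constraint if a tie to `candidate` were created."""
    return constraint(adj, searcher) - constraint(with_tie(adj, searcher, candidate), searcher)


# Hypothetical network: searcher "s" knows "a" and "b", who know each other.
# Candidate "c" already knows "a"; candidate "d" knows none of the searcher's contacts.
adj = {
    "s": {"a", "b"}, "a": {"s", "b", "c"}, "b": {"s", "a"},
    "c": {"a"}, "d": {"e"}, "e": {"d"},
}
for candidate in ("c", "d"):
    print(candidate, round(structural_hole_gain(adj, "s", candidate), 4))
```

In this toy network the candidate who is disconnected from the searcher's existing contacts yields the larger gain, so that candidate would receive the better network-based rank.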
Figure 4: Search results reordered by social connectedness
One potential concern is that by altering the ranking algorithm to include a network-based score, people in the treatment group would not see results that are best matched for the
expertise they sought, and this may negatively affect their search experience.
However, the list of search results is the same for both the control and the treatment group;
only the order shown on the screen differs. Therefore, people in the control and
treatment groups see the same list of potential candidates. This may still be a problem if the
list of search results is long and users rarely go through more than a few pages of the search
results, thus potentially depriving the treatment group of the best-matched results. In this
case, any performance gain derived from the manipulation would be actually understated
because the gains were made despite not being able to see the search results best matched to the
Another key for the manipulation to work is to ensure that the search order will actually
change for a significant proportion of queries using the new ranking scheme. Ideally, after the
re-ranking, it is unlikely that the top result from the original search algorithm is also the same
person who has the highest network-based score. To ensure that the network-based ranks are
not collinear with the expertise-relevance ranks, we randomly sampled 100 queries and
calculated the ranks using the modified and the default algorithm. The correlation of the two
types of ranking scheme is 0.36, suggesting that the modified ranking order provides significant
deviation from the expertise-based ranking.
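A check of this kind could be sketched as follows. The text does not state which correlation coefficient was used or how the sampled queries were pooled; the sketch assumes a Spearman-style correlation (Pearson correlation computed on rank positions) averaged across queries, and the candidate names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(var_x * var_y)


def rank_correlation(default_rank, modified_rank):
    """Correlation between two rankings of the same candidates for one query."""
    candidates = sorted(default_rank)
    return pearson(
        [default_rank[c] for c in candidates],
        [modified_rank[c] for c in candidates],
    )


def mean_rank_correlation(queries):
    """Average the per-query correlation over (default_rank, modified_rank) pairs."""
    values = [rank_correlation(default, modified) for default, modified in queries]
    return sum(values) / len(values)


# Hypothetical rankings of five candidates for one query.
default_rank = {"ana": 1, "bo": 2, "cy": 3, "di": 4, "ed": 5}
modified_rank = {"ana": 3, "bo": 1, "cy": 2, "di": 4, "ed": 5}
print(round(rank_correlation(default_rank, modified_rank), 2))  # 0.7
```

A value near 1 would mean the modified algorithm barely changes what searchers see, so a low value is what the manipulation needs.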
Control Group
The control group continues to see results from the default search algorithm, which does
not incorporate any of the network-based score to order the search results. However, we realize
that the treatment group could potentially contaminate the results from the control group.
Because both the control and the treatment group belong to the same firm, it is impossible to
create two independent worlds that prevent one group from communicating with the other. As a
consequence, any change in social structures in the treatment group can affect the control
group, but the effect is second-order. Detailed treatment for network contamination is explained
in a later section. To mitigate some of the contamination effect, we have captured the electronic
communication exchange for both the treatment and control group for two years before the
experiment starts. This will allow us to estimate the change in network structures for the control
group before and after the experiment. If the change is minimal, then we can safely use the
control group data during the experiment. Otherwise, we use a difference-in-difference
specification where we calculate the change in network structure for both the control and
treatment groups before and after the experiment. We expect the changes in network structure
to be greater in the treatment group since connection to the top search results for them should
generate a more structurally diverse network.
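In its simplest form, the difference-in-difference comparison reduces to the calculation sketched below. The sketch assumes group means of a single network measure (here network constraint, where lower values indicate more structural holes) and uses made-up numbers; a full specification would add controls and standard errors.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def diff_in_diff(treat_before, treat_after, control_before, control_after):
    """Change in the treatment group minus change in the control group."""
    treatment_change = mean(treat_after) - mean(treat_before)
    control_change = mean(control_after) - mean(control_before)
    return treatment_change - control_change


# Hypothetical network constraint values for two people per group.
estimate = diff_in_diff(
    treat_before=[0.5, 0.6], treat_after=[0.3, 0.4],
    control_before=[0.5, 0.6], control_after=[0.45, 0.55],
)
print(round(estimate, 2))  # -0.15: constraint fell more in the treatment group
```

Subtracting the control group's change removes shifts that affect both groups, including any second-order contamination that moves the control group in the same direction.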
Related Experiments
While the first experiment uses structural holes as a proxy for social connectedness to
infer their effect on performance, other network measures could also be used. Using different
measurements for social network positions is useful because comparing their productivity
impacts allows us to formally evaluate their effects on work performance. The social network
literature has long extolled the value of certain network positions. However, there is no
systematic way of effectively evaluating their causal impacts. For example, while structural holes
could be beneficial for work performance, they might not be, and even if they are, cross-sectional studies do not permit evaluation of the marginal cost of attaining more structural
holes. Designing a platform that can test the performance implications of each network
parameter would be extremely valuable for understanding why and how networks matter for
performance, as well as the costs and benefits of attaining certain network positions. Below is a
list of network measures that can be tested using the Expertise-Find experimental platform.
1. Recommending high-status individuals such as project managers, directors, and
partners. The ranking system places high-status individuals at the top of the search
results.
2. Recommending long-range ties: high path length. The ranking system places
individuals at the top of the search results when they have a longer path length to the
searcher.
3. Homophilic recommendations. The ranking system places individuals at the top of
the search results when they have similar traits to the searcher, such as the same
gender, demographics, and job roles.
4. Heterophilic recommendations. The ranking system places individuals at the top of
the search results when they are different from the searcher, such as different
genders, demographics, and job roles.
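Two of the alternative network-based scores in the list above can be sketched as follows: path length to the searcher for long-range ties, and trait similarity for homophilic or heterophilic recommendations. The trait names, the similarity definition, and the function names are assumptions made for illustration only.

```python
from collections import deque


def path_length(adj, source, target):
    """Shortest path length between two people via breadth-first search; None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, distance = queue.popleft()
        for neighbor in adj.get(node, ()):
            if neighbor == target:
                return distance + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, distance + 1))
    return None


def trait_similarity(traits_a, traits_b):
    """Share of traits recorded for both people on which they agree."""
    shared = traits_a.keys() & traits_b.keys()
    if not shared:
        return 0.0
    return sum(traits_a[key] == traits_b[key] for key in shared) / len(shared)


# Hypothetical chain network a - b - c - d and hypothetical traits.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
searcher = {"gender": "f", "job_role": "consultant"}
candidate = {"gender": "f", "job_role": "manager"}
print(path_length(adj, "a", "d"))             # 3
print(trait_similarity(searcher, candidate))  # 0.5
```

A homophilic ranking would favor candidates with high similarity, a heterophilic ranking those with low similarity, and a long-range ranking those with a large path length; each could replace the network-based score in the combined rank.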
This research design explores whether there is a causal relationship between social
network positions and work performance. By modifying the search algorithm of Expertise-Find, we create a mechanism to induce a change in a person's social network positions in an
organization over time. When a change in network positions is detected for people in the
treatment group, it is possible there is also an associated change in their performance. If the
associated changes are both positive, it would indicate a causal relationship between social
network positions and performance. However, it is likely that performance returns from a
change in networks, if there is any at all, will not be immediately obvious, because it may take a
significant amount of time for an investment in social networks to generate a return in
Therefore, I expect a lag between a change in network positions and any
associated performance change. Similarly, it is an interesting empirical question to examine how
long it takes for an investment in social networks to translate to higher productivity.
While detecting a change in billable revenue with a change in network positions can help
establish a causal relationship between social networks and performance, it is equally
important to explore the mechanism behind the relationship. In Chapter 1 of this thesis, I
discussed the informational advantage provided by structural holes. To detect if more structural
holes actually generate more information, we can measure the diversity of information before
and after the experiment to see if a change in structural holes also generates more information,
and if so, what type of information. One advantage to measuring information is that while
performance improvement, as measured by billable revenue, may be hard to detect in the short run, information advantage as measured using email communications may be more easily detected.
Even if we detect a change in productivity from a change in network positions, it is
important to explore if such change is only temporary. By monitoring people's network positions
and performance over time, it is possible to trace the trajectory of how a temporary shock to
network positions can affect long-term work performance. It is plausible that modifying the
search algorithm can boost a person's work performance in the short run, but it is uncertain if
this type of manipulation can have a detectable longer-term effect on productivity. It would be
interesting to observe the performance differences between the treatment and the control group
after the rank algorithm is restored to the original algorithm where only expertise-relevance is
used to rank results. It is possible that we would observe that performance returns to the pre-experimental values, but it is also plausible that individuals in the treatment group would
experience a long-term advantage over those in the control group if the performance differences
continue to persist. This could have tremendous value for the firm, especially if a short-term and
a simple shock to the network position can induce a long-term improvement in performance.
A similar question can also be asked about the network structure. After we return to the
original ranking algorithm, it would be interesting to examine if social network structures in the
treatment group could also return to the pre-experimental structure. Without constant
reinforcement of recommending individuals who can generate significant change to a person's
network position, I suspect the network position of the person may return to its original state.
This is because a contact made through Expertise-Find may often involve an arm's-length
transaction. Typically, the contact is made when the searcher needs help with a difficult
problem. Once the problem is resolved, the two people may cease to have future interactions.
However, during interviews, we have heard anecdotal evidence that contacts through Expertise-Find can also develop deeper working relationships with the searcher. Thus, it would be
interesting to observe if there is network decay after the experiment and, if so, the rate of the
decay.
Lastly, measuring how fast an intervention alters network positions can be used to gauge the
cost of attaining certain network positions, such as having more structural holes or having more
high-status network contacts. Combining the potential gains in performance with the costs of
generating a change in network positions allows us to evaluate the net benefits of attaining
certain network positions. Most research in the past only gauged the positive correlations
between certain network positions and performance without explicitly addressing the cost
associated with arriving at the network position. While the literature has shown a positive
correlation between structural holes and work performance, it may still be unfavorable for a
person to strive for structural holes when the costs are high. For example, consider the following
hypothetical scenarios. In the first one, a 1% increase in structural holes takes 3 months to attain
and is associated with a $328 increase in monthly revenue. Compare this with a second
hypothetical scenario where a 1% increase in the number of strong ties to managers would take
at least 12 months to attain but is associated with generating an $800 increase in monthly
revenue. Without considering the costs, we may conclude that having strong ties to managers is
more beneficial, but after taking costs into account, it may be worthwhile for an employee to
pursue structural holes, especially when the employee is seeking a faster return on performance.
Gauging the costs of generating a certain type of network is important because it is an
underexplored issue in network research. Without a cost analysis, the performance
improvement from a network change alone is by itself insufficient for deriving the net benefits
generated from social network positions. The summary of questions addressed by Experiment
#1 is in Table 7.
Table 7: Research questions addressed by Experiment #1
Can a simple manipulation such as an online experiment change a person's network
position? If so, how long does it take to change a person's network position?
If a network change is generated, how long does it last, and what is the rate of decay, if any?
Can a network change provide access to more information?
Can a network change improve work performance as measured by billable revenue?
What is the cost-benefit analysis for changing network positions? Which network position
generates the largest reward?
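The two hypothetical scenarios discussed before Table 7 (a $328 monthly gain attained after 3 months versus an $800 monthly gain attained after 12 months) can be compared over different time horizons. The sketch below is only an illustration under simplifying assumptions that are not part of the experimental design: no gain accrues while the network position is being built, the full monthly gain accrues afterwards, and there is no discounting or decay. The function names are hypothetical.

```python
def cumulative_gain(monthly_gain, months_to_attain, horizon_months):
    """Extra revenue accumulated by the end of the horizon.

    Assumption: nothing is earned while the network position is being
    built, and the full monthly gain is earned in every month afterwards.
    """
    productive_months = max(0, horizon_months - months_to_attain)
    return monthly_gain * productive_months


def break_even_month(gain_a, lag_a, gain_b, lag_b):
    """Horizon T at which both strategies have accumulated the same gain.

    Solves gain_a * (T - lag_a) = gain_b * (T - lag_b) for T; the result
    is only meaningful when T is larger than both lags.
    """
    return (gain_b * lag_b - gain_a * lag_a) / (gain_b - gain_a)


# Figures taken from the two hypothetical scenarios in the text.
STRUCTURAL_HOLES = {"monthly_gain": 328, "months_to_attain": 3}
MANAGER_TIES = {"monthly_gain": 800, "months_to_attain": 12}

for horizon in (6, 12, 24):
    holes = cumulative_gain(horizon_months=horizon, **STRUCTURAL_HOLES)
    ties = cumulative_gain(horizon_months=horizon, **MANAGER_TIES)
    print(f"{horizon:>2} months: structural holes ${holes:,}, manager ties ${ties:,}")

crossover = break_even_month(328, 3, 800, 12)
print(f"Manager ties catch up after about {crossover:.1f} months")
```

Under these assumptions, structural holes yield the larger cumulative gain for any horizon shorter than the crossover point, which is the sense in which a faster return can outweigh a larger monthly effect.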
Network Contamination
Ideally, the treatment and the control group are two independent worlds with no
interactions between them, ensuring a precise estimate of the performance effect from a
network change. While separating individuals into independent worlds is easily achievable in a
laboratory setting, it is nearly impossible in the field, especially for an experiment with a
relatively long span of time. The contamination is further exacerbated in a single organizational
setting, because a large majority of individuals are often connected through a single cluster,
creating a challenge for dividing them into relatively independent groups. However, we still
believe in the benefits of running randomized experiments in the field to gauge the causal effect
of networks on performance in a real business setting. The results may have immediate
managerial implications that might not have been evident in a lab experiment. While it is
difficult to eliminate contamination entirely, we try to minimize it in two ways. First, we avoid
using global network measures where an addition or a deletion of an edge in a person's ego
network could generate a ripple effect to the rest of the network. For example, increasing one
person's betweenness centrality can indirectly change the betweenness centrality of others in the
network, because betweenness centrality is a global network measure that requires calculating
all pathways between any two nodes in the network; any perturbation in the network can change
the betweenness centrality measures for the rest. Thus, using betweenness centrality can be
problematic, since a network change for individuals in the treatment group could change the
network positions of individuals in the control group, confounding the estimation of the
treatment effect. To minimize this type of contamination, only local network measures should
be used. For example, structural holes are calculated based on ties that are one degree and two
degrees apart from the ego. Similarly, adding connections such as long-range ties, homophilic ties,
and heterophilic ties should not affect the rest of the nodes in the network, and thus they do not
suffer from network contamination.
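The contrast between global and local measures can be illustrated on a toy network. The sketch below is not the measure used in the experiment: it uses a hypothetical seven-person chain, computes betweenness centrality with Brandes' algorithm, and uses Burt's effective size (number of contacts minus the average number of ties among them) as a simple stand-in for a local structural-holes measure. Adding a single tie changes the betweenness of people who are not part of the new tie, while effective size changes only for the two people who formed it.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm, undirected)."""
    score = {v: 0.0 for v in adj}
    for source in adj:
        order, preds = [], {v: [] for v in adj}
        paths = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:  # breadth-first search that counts shortest paths
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        dep = {v: 0.0 for v in adj}
        for w in reversed(order):  # accumulate dependencies back toward source
            for v in preds[w]:
                dep[v] += paths[v] / paths[w] * (1 + dep[w])
            if w != source:
                score[w] += dep[w]
    return {v: s / 2 for v, s in score.items()}  # each pair was counted twice


def effective_size(adj, ego):
    """Burt's effective size for unweighted ties: n - 2t/n."""
    alters = adj[ego]
    if not alters:
        return 0.0
    ties = sum(1 for a in alters for b in adj[a] if b in alters) / 2
    return len(alters) - 2 * ties / len(alters)


def changed(before, after):
    """Nodes whose score differs between the two networks."""
    return sorted(v for v in before if abs(before[v] - after[v]) > 1e-9)


def add_edge(adj, u, v):
    """Return a copy of the network with the tie u-v added."""
    new = {node: set(nbrs) for node, nbrs in adj.items()}
    new[u].add(v)
    new[v].add(u)
    return new


# Hypothetical chain 0-1-2-3-4-5-6; the "treatment" adds the tie 0-3.
chain = {i: {j for j in (i - 1, i + 1) if 0 <= j <= 6} for i in range(7)}
treated = add_edge(chain, 0, 3)

before_bc, after_bc = betweenness(chain), betweenness(treated)
changed_global = changed(before_bc, after_bc)
changed_local = changed({v: effective_size(chain, v) for v in chain},
                        {v: effective_size(treated, v) for v in treated})
print("betweenness changed for nodes:", changed_global)
print("effective size changed for nodes:", changed_local)
```

In this toy case nodes 1 and 2 see their betweenness change even though they formed no new tie, which is the kind of spillover into a control group that the local measures are meant to avoid.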
Even when using local measures that are not prone to contamination across groups, it is still
important to mimic the two independent worlds as much as we can by finding the maximum
number of participants (N) for which pairs of nodes remain far enough apart in the network.
Thus, on average, adding an edge in a person's local network in the treatment
group would not affect the local network of individuals in the control group. Through
simulations, we pick N to be 4,470 individuals, which represents a good trade-off between
finding a large enough sample size and keeping the selected nodes far apart from each other. The
simulation algorithm is in Table 6. The intuition behind the algorithm is to count, within a set of
nodes, the number of pairs that are within a distance of 2 or less from each other. The nodes are
selected at random, so every node is equally likely to be included in the set.
When the set is small, the chance that two nodes are close together is relatively low,
but as the set gets larger, the chance of having multiple pairs of nodes that are close together
increases. For each size of the set, i, ranging from 1 to 9,035 (the total number of nodes), we
randomly choose i nodes and calculate the number of pairs whose distance is less than 2 steps.
We repeat this process 100 times for each i, so that we can find the average number of pairs that
have a distance of less than 2 steps. We then plot this number as a function of i. As shown in
Figure 3, the graph is nonlinear and we pick the size of the set to be 4,470, at the point where the
inflection becomes relatively large. When the network is dense, i tends to be much smaller but
because the network generated from the electronic communications of 9,035 employees is
relatively sparse, 4,470 nodes is still feasible to minimize contamination and at the same time
preserve a large enough sample size.
Table 6: Algorithm to determine the basic sample size
for (n = 1 to N-1)
    for (m = 1 to M)
        for all pairs <n_i, n_j> in n
            e1 = find direct edges (or edge) between n_i and n_j
            e2 = find path less than 2 between n_i and n_j
    store <n, <E>/M>
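The procedure in Table 6 can be sketched in runnable form as follows. This is an illustration rather than the code used for the thesis: the communication network is replaced by a small hypothetical random graph, a pair counts as "close" when its nodes are at distance 2 or less, and all function names and parameter values (300 nodes, 450 edges, 100 trials) are assumptions chosen to keep the example fast.

```python
import random


def within_two_steps(adj, node):
    """All nodes reachable from `node` in one or two steps."""
    near = set(adj[node])
    for neighbor in adj[node]:
        near |= adj[neighbor]
    near.discard(node)
    return near


def count_close_pairs(adj, nodes):
    """Number of unordered pairs in `nodes` that are at distance 2 or less."""
    chosen = set(nodes)
    hits = sum(len(within_two_steps(adj, v) & chosen) for v in chosen)
    return hits // 2  # every close pair was seen from both ends


def average_close_pairs(adj, size, trials, rng):
    """Average close-pair count over `trials` random subsets of `size` nodes."""
    population = sorted(adj)
    total = sum(count_close_pairs(adj, rng.sample(population, size))
                for _ in range(trials))
    return total / trials


def random_sparse_graph(n_nodes, n_edges, rng):
    """Hypothetical stand-in for the observed communication network."""
    adj = {v: set() for v in range(n_nodes)}
    while n_edges > 0:
        u, v = rng.sample(range(n_nodes), 2)
        if v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            n_edges -= 1
    return adj


rng = random.Random(0)
graph = random_sparse_graph(300, 450, rng)
for size in range(50, 301, 50):
    avg = average_close_pairs(graph, size, trials=100, rng=rng)
    print(f"subset size {size:>3}: {avg:8.1f} close pairs on average")
```

Plotting the average against the subset size gives a curve of the kind described for Figure 3, from which a sample size can be read off at the point where the number of close pairs starts to grow quickly.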
To minimize future errors, we choose to gradually increase the number of subjects in the
treatment group, initially starting with a relatively small number of people, and then increasing
gradually to the full size. Starting the experiment with a smaller group can help by detecting
errors earlier in the process and thus making it possible to eliminate these errors before the full
experiments are deployed (Kohavi et al. 2009). Slowly ramping up the experiments also allows
us to monitor the level of network contamination and adjust the eventual sample size if
necessary.
[Figure 3 plot: "Edge Node Simulation"; x-axis: Number of Randomly Selected Nodes]
Figure 3: Simulation for choosing sample size to minimize contaminations
Sample Size
The minimum requirement for any experiment to succeed is that the manipulation must
alter subjects' behaviors as intended. To assess whether network positions, such as structural
holes, can causally affect work performance, it is important to ensure that the manipulation can
actually alter the network positions for individuals in the treatment group. It is necessary that
the sample size be large enough to detect any change generated in a statistically significant way.
Using a power of 80%, the minimum sample size needed is 3,770 people in order to ensure that
a 5% change in structural holes can be detected. Since we have more than 9,000 individuals in
the sample, this is more than enough to satisfy the requirement. After taking network
contamination into account, 3,770 people is still well below the sample size of 4,470, the
maximum network size that minimizes network contamination. Overall, these results give
confidence that it is possible to detect a network change from the manipulation. We intend to
monitor the changes throughout the experiment to trace how long it takes to reach certain
measurements of structural holes.
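The logic behind such a sample size requirement can be sketched with the standard normal-approximation formula for comparing the means of two equally sized groups, n per group = 2 (z_{1-alpha/2} + z_{power})^2 sd^2 / effect^2. The sketch below does not reproduce the figures reported here, since the variance of the outcome and the exact test are not stated; the two-sided 5% significance level and the example standard deviation are assumptions made only for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(effect, sd, power=0.80, alpha=0.05):
    """Subjects needed per group to detect a mean difference of `effect`.

    Normal approximation for a two-sided, two-sample comparison of means
    with equal group sizes and a common standard deviation `sd`.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / effect) ** 2)


def detectable_effect(n, sd, power=0.80, alpha=0.05):
    """Smallest mean difference detectable with `n` subjects per group."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * sd * sqrt(2 / n)


# Hypothetical inputs: the outcome is scaled so that its mean is 1.0, so a
# 5% change is an effect of 0.05; the standard deviation of 0.5 is assumed.
print("per-group n for a 5% change:", n_per_group(effect=0.05, sd=0.5))
print("detectable change with 1,000 per group:",
      round(detectable_effect(n=1000, sd=0.5), 4))
```

Because the required sample grows with the square of sd / effect, halving the effect to be detected roughly quadruples the sample, and a noisier outcome needs a larger sample to detect the same percentage change.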
To link a network change to an increase in performance, we also need to detect a
significant change in billable revenue, the primary performance measure. Using a power of
80%, we need a sample of at least 3,467 consultants in order to detect a 5% change in billable
revenue. However, we only have 2,881 consultants in the sample, so it would be difficult to
detect the same level of change as with structural holes. Table 5 provides the sample size needed
under various power and sensitivity values. Using all the 2,881 consultants allows us to detect a
6.5% change in monthly billable revenue. Using half of them can detect a 7.5% change in billable
revenue, which is a substantial change for a person's performance. Thus, we expect that
detecting a statistically significant change from billable revenue takes much longer than
detecting a corresponding change in network positions. Monitoring the change in billable
revenue over time is important, since it may take multiple months to detect a statistical change
in billable revenue. However, capturing the time it takes to detect the change is an interesting
question by itself. To ensure we capture some intermediate outcomes, we also plan to generate
alternative outcome measures, such as the number of new projects a person started or the
instances of repeat business. Lastly, a survey is planned to gauge the level of job satisfaction, the
likelihood of finding new project opportunities, and the speed with which answers to difficult
problems can be found. These intermediate outcomes are pertinent, as they may eventually lead
to higher billable revenue.
Experiment #2: Passive Network Change through Feedback
The second set of experiments is designed to induce people to alter their own network
characteristics by providing them with information about their social networks. Instead of
attempting to maximize a person's network position by actively manipulating the search
algorithm that recommends potential connections, this intervention provides feedback about
the individual that may influence the person to change their social networks themselves. Figure
5 provides an example of how the experimental platform is used to provide people with
information about their networks. When users in the treatment group log onto Expertise-Find, a
message about a certain property of their networks shows up on the page. In Figure 5, we show a
message informing the person how many people are directly connected to her. We then provide
the summary statistics of employees at the firm who are similar to the user in rank, job role,
and demographics. For example, the message in Figure 5 has summary statistics for the average
of all Level 7 employees in North America who are in management consulting. In addition to
network size, we can also provide other types of network information, such as 2-step network
reach (Figure 7), centralities, or structural holes. However, it may be difficult for average users
to understand the concepts of betweenness centrality and structural holes. Network size and
range, on the other hand, are easier concepts for an average user to grasp. The goal of the
experiment is only to prompt the users to think about their network when compared with the
networks of others, but not necessarily to provide any specific strategy or recommendation (as
in Experiment #1).
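How such a feedback message could be assembled is sketched below. This is a hypothetical illustration, not the Expertise-Find implementation: the record fields, the function names, and the choice to include the user in her own peer group are all assumptions, and the wording simply mirrors the example message for network size.

```python
from statistics import mean


def peer_average(employees, user, keys=("band", "region", "role")):
    """Mean network size of employees who match the user on all `keys`.

    Assumption: the user is counted as a member of her own peer group.
    """
    sizes = [e["network_size"] for e in employees
             if all(e[k] == user[k] for k in keys)]
    return mean(sizes)


def feedback_message(employees, user):
    """Compose the network-size message shown to a treatment-group user."""
    average = round(peer_average(employees, user))
    return (f"You have {user['network_size']} people in your network. "
            f"Band {user['band']} {user['role']}s in {user['region']} "
            f"have on average {average} people in their networks.")


# Hypothetical employee records.
staff = [
    {"band": 7, "region": "North America", "role": "consultant", "network_size": 21},
    {"band": 7, "region": "North America", "role": "consultant", "network_size": 46},
    {"band": 7, "region": "North America", "role": "consultant", "network_size": 71},
    {"band": 6, "region": "North America", "role": "consultant", "network_size": 100},
]
message = feedback_message(staff, staff[0])
print(message)
```

Swapping `mean` for `max` in `peer_average` would produce the variant that reports the maximum number of connections in the peer group rather than the average.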
Observing how different types of individuals respond to the message can have important
implications for the policies to provide feedback. For example, we may see a pattern of reversion
to the mean, in which individuals whose network size is above the average may choose to do
nothing, while individuals whose network size is below the mean may choose to actively pursue
new network connections. Similarly, if we provide information about the maximum number of
connections to individuals, it may prompt people, especially the most competitive individuals, to
strive for more connections.
[Screenshot: Expertise-Find (SmallBlue) homepage with a keyword search bar. The message shown to the treatment group reads:]
You have 21 people in your network. Band 7 consultants in North America have on average 46 people in
their networks.
Please invite your colleagues to join SmallBlue. The more people who join, the better SmallBlue will be.
Figure 5: Homepage when logged in for the treatment group. A message is displayed to inform
the user about their network range.
Control Group
The control group uses the original Expertise-Find interface with no information about
the user's network statistics. However, in order to avoid having a significantly different
homepage between the treatment and the comparison groups, we also provide a message for the
control group, but it is neutral and does not contain any network information. For example, in
Figure 6, two neutral messages are shown to the control group. The first simply shows the
number of individuals who are using the social networking tool and the second is an advertising
message that encourages people to participate in using Expertise-Find. These informational
messages are not designed to influence an individual to change their networks.
[Screenshot: Expertise-Find (SmallBlue) homepage with a keyword search bar. The messages shown to the control group read:]
Did you know 10,232 people have joined the SmallBlue community?
Please invite your colleagues to join SmallBlue. The more people who join, the better SmallBlue will be.
Figure 6: The default home page for the control group. The message does not include any
network parameters.
[Screenshot: Expertise-Find (SmallBlue) homepage with a keyword search bar. The message shown to the treatment group reads:]
You can reach 126 individuals from your network connections. Band 7 consultants in North America can
reach 300 individuals from their network connections.
Please invite your colleagues to join SmallBlue. The more people who join, the better SmallBlue will be.
Figure 7: The homepage for the treatment group, displaying the person's network reach.
Experiment #2 prompts the users to actively change their networks by providing
information about their networks and how they compare to people similar to themselves. This
contrasts with the strategy in Experiment #1 that is designed to passively change a person's
network through system recommendations. In the passive experiment, users are not aware that
their search algorithm is modified to simultaneously maximize a network parameter such as
structural holes. Thus, when searchers in the treatment group make these connections, they are
not necessarily aware that these connections are enhancing their network positions. Because
their intention in making these connections is no different from that of the control group, any
change in the network position can be attributed to the system's manipulation, not users' intent.
Experiment #2, on the other hand, tries to induce users to change their social networks of their
own volition. The goal is to make the users more aware of their own social networks. Providing
feedback on their network positions and how they compare to people who are in similar job
roles, ranks, and demographics could potentially induce users to change their networking
behaviors. Literature has shown that providing feedback can be effective for changing behaviors
(Becherer, Morgan & Richard 1982). For example, providing timely feedback about their work
performance can help the employees to increase their organizational commitment and job
satisfaction (Becherer, Morgan & Richard 1982). Similarly, by providing feedback about their
social network statistics, we hope to influence individuals to alter their behaviors and change
their network positions. Both the feedback and how it is presented can have an impact on how
people change their behaviors. For example, providing only the mean may prompt individuals
who are below average to actively pursue network connections. Providing the maximum, on the
other hand, may prompt more competitive individuals to change.
Measuring network characteristics before and after the intervention, we can gauge
whether there is a network change, and if so, how large the network change is when individuals
are provided with information about their social networks. We can compare the results with those
of Experiment #1 to evaluate their relative effectiveness in generating a network change.
However, we should also take the costs of the change into account. It is possible that the extra
costs of actively seeking beneficial connections are higher than the costs of making connections
that are passively suggested by an online social network system. In Experiment #1, specific
recommendations are made after each search and individuals simply need to follow up with the
recommendations. The costs to pursue these recommendations should not be any higher than
the recommendations that are generated by the default algorithm. After all, the searchers are
actively pursuing information from their colleagues; they would pursue the recommendations
regardless of whether the recommended person also happens to maximize their structural holes. On the other
hand, in Experiment #2, individuals need to actively seek beneficial connections themselves
before attempting to make them. Therefore, they incur the extra cost of figuring out with whom
they can connect first. Thus, I suspect it may cost more effort and time for a person to actively
seek social connections than to passively connect to individuals through a system
recommendation.
However, it is likely that the connections an individual actively seeks are fundamentally
different from the connections suggested by Expertise-Find. For example, subjects may make a
connection through Expertise-Find because they need an answer to their current problem at
work, and the relationship may be short-lived. On the other hand, connections made after
receiving network feedback may be more strategic and have greater potential value. Thus, it is
an empirical question to understand the quality and the type of connections made from the two
sets of experiments. Perhaps, if there are such quality differences, the performance impact from
network change by providing feedback about an employee's network is also different from
manipulating Expertise-Find's rankings. While passive manipulations may induce many short-term connections for a specific goal (and the ties may dissolve shortly after), actively reaching
for network connections may have long-term performance and network implications. Table 8
summarizes the research questions addressed in Experiment #2.
Table 8: Research questions addressed by Experiment #2
Can providing feedback change a person's network positions? If so, how big is the change
and how big is the performance change resulting from the change in networks?
Compare the network change generated by a passive manipulation (Experiment #1) vs. a
change generated by an active manipulation (Experiment #2)
Compare the performance change between Experiment #1 and Experiment #2
Compare the information advantage generated from Experiment #1 and Experiment #2
Network Contamination and Sample Size Calculations
In this experiment, we use all 9,035 employees in our sample to study how providing
feedback could drive network change. To avoid network contamination stemming from using
global network measures, only local network measures are used to gauge a network change.
Since this intervention does not recommend any specific individuals to maximize a person's
network measure as it did in Experiment #1, we are interested to see not only if there is a change
in network positions but also the type of network change that feedback can provide. For
example, a new connection could be a random person in the company, or it could be a manager,
or someone from a different group, or a person who is central in the organization in both formal
hierarchy as well as informal social networks. To gauge the new types of ties formed by an
individual, we use the network characteristics of people connected to the individual prior to the
intervention in order to avoid potential network change derived from the experiment.
We also need to know the minimum sample size required to detect changes in different
types of network ties. With 80% power and 9,035 employees in the sample, we can easily detect
a statistically significant change. Table 5 lists the sample size required for testing various
network outcomes. For example, to detect a change caused by adding one extra manager in the
network only requires a sample of 161 people. To identify a change in performance, we can
detect a 3% change in monthly revenue with 9,035 individuals.
Experiment #3: Active vs. Passive Tie Formation
This experiment explores if there is a complex relationship between ties formed through
actively seeking social connections and ties formed passively through changing the
underlying search algorithm for Expertise-Find. It is possible that ties formed passively are
fundamentally different from ones formed actively. Thus, the two mechanisms may be
complementary for forming new ties. However, it is also possible that they are substitutes if they
generate the same underlying network structure. Thus it is an empirical question to examine if
the two act as complements or substitutes for forming social connections and if there are any
performance implications.
Robustness Checks
Although randomized experiments can help us tease out the causal relationship between
network structures and work performance, there might be other intervening factors that will
modulate the relationship. For example, when opportunities are scarce and few projects are
available, social networks may have a stronger effect on work performance, as people tend to
activate their social networks to look for new opportunities. Similarly, when opportunities are
abundant and workloads are high, people may be less likely to leverage their ties to seek more
projects. The adversity or prosperity of the economic environment may condition the overall
effect from social networks. In interpreting the results obtained, it is important to take into
account the economic environment while the experiment is being conducted. To address these
issues, I plan to conduct several interviews after the experiment to understand how workers
leverage social networks during unusual economic conditions.
However, if the experiment occurred during special circumstances such as in a recession,
it is important to understand how people enact their social network when facing distressing
circumstances. Recently, a stream of work explores how people activate their social ties in dealing
with complex problems within a difficult environment (Srivastava, 2o1; Smith, Menon &
Thompson). With real-time and moment-to-moment interactions as captured by people's digital
trace, we may be able to understand the intermediate mechanism of how people enact their
social networks and if the enacted social networks have different properties than the people's
latent social networks.
There are concerns that even within the same firm, different cultures may induce people
to use their social networks differently. Several studies have shown that the role of brokerage
may vary significantly based on the culture and seniority in the firm (Bian, 1997; Burt, 2007).
The beauty of using randomized assignment is that it eliminates these types of bias. However, to
make our estimate more precise, we choose to study consultants in the United States only and
we plan to stratify based on covariates and randomize the treatment based on a group of
Conclusion and Pre-Experimental Statistics
Currently, I have analyzed the correlations between various network measures and work
performance. Specifically, I have uncovered three preliminary results. First, the structural
diversity of social networks is positively correlated with performance, corroborating previous
work. Second, network size was positively correlated with higher productivity. However, when
we separated network size into in-degree and out-degree, we found that while in-degree is
positively correlated with higher work performance, out-degree is not correlated with
performance in the project network - that is, where each node was a project, not a person.
Third, for both the employee and the project network, knowing powerful individuals such as
executives is positively associated with work performance. However, having many managers on
a project is negatively correlated with project revenues. To understand the detailed mechanisms
and causal relationships of these results, I plan to conduct randomized experiments to test the
causality of these correlations. Establishing the causal relationship between social networks and
work performance would be an important contribution to the literature of social networks and
information worker productivity.
I plan to explore the mechanism of how social networks enable timely access to
information through mining the actual content of people's electronic communications. Through
this exercise, I hope to understand how people acquire knowledge in work settings and under
what conditions experts actually improve work performance.
Aral, S., Brynjolfsson, E., & Van Alstyne, M. 2006. "Information, Technology and Information Worker
Productivity: Task Level Evidence." Proceedings of the 27th Annual International Conference on
Information Systems, Milwaukee, Wisconsin.
Aral, S., & Van Alstyne, M. 2007. "Network Structure & Information Advantage" International Conference
on Network Science 2007
Becherer, R. C., Morgan, F., and Richard, L.M. 1982. "The Job Characteristics of Industrial Salespersons:
Relationships to Motivation and Satisfaction," Journal of Marketing, 46, 125-135.
Burt, R. 1987. "Social Contagion & Innovation: Cohesion versus Structural Equivalence." American
Journal of Sociology, 92: 1287-1335.
Burt, R. 1992. "Structural Holes: The Social Structure of Competition." Harvard University Press,
Cambridge, MA.
Burt, R. 1997. "The Contingent Value of Social Capital", Administrative Science Quarterly, Vol. 42. No. 2
Burt, R. 2004. "Structural Holes & Good Ideas" American Journal of Sociology, (110): 349-99.
Lin, C., Ehrlich, K., Griffiths-Fisher, V., and Desforges, C., SmallBlue: People Mining for Expertise
Search, IEEE Multimedia Magazine, Jan.-Mar. 2008.
Coleman, J.S. 1988. "Social Capital in the Creation of Human Capital" American Journal of Sociology,
(94): S95-S120.
Freeman, L. 1979. Centrality in social networks: Conceptual clarification. Social Networks 1(3)
Garguilo, M., and A. Rus 2002 "Access and mobilization: Social capital and top management response to
market shocks." Working paper, INSEAD.
Granovetter, M., 1973. "The strength of weak ties." American Journal of Sociology, 6: 1360-1380.
Granovetter, M., 1982. "The strength of weak ties: A network theory revisited." In P. V. Marsden and N.
Lin (eds.), Social Structure and Network Analysis 105: 1-30.
Granovetter, M. 1985. "Economic Action & Social Structure: The Problem of Embeddedness." American
Journal of Sociology (91):1420-1443.
Granovetter, M. 1992. "Problems of Explanation in Economic Sociology." In N. Nohria & R.G. Eccles
(eds.), Networks & Organizations: 25-56. Harvard Business School Press, Boston.
Hansen, M. 1999. "The search-transfer problem: The role of weak ties in sharing knowledge across
organization subunits." Administrative Science Quarterly (44:1): 82-111.
Hansen, M. 2002. "Knowledge networks: Explaining effective knowledge sharing in multiunit
companies." Organization Science (13:3): 232-248.
Pentland, A. 2006. "Automatic mapping and modeling of human networks" Physica A: Statistical
Mechanics and its Applications.
Podolny, J., and Baron, J. 1997. "Resources and relationships: Social networks and mobility in the workplace." American Sociological Review (62:5): 673-693.
Polanyi, M. 1966. "The Tacit Dimension." New York: Anchor Doubleday Books.
Reagans, R. and McEvily, B. 2003. "Network Structure & Knowledge Transfer: The Effects of Cohesion &
Range." Administrative Science Quarterly, (48): 240-67.
Reagans, R. and Zuckerman, E. 2001. "Networks, diversity, and productivity: The social capital of
corporate R&D teams." Organization Science (12:4): 502-517
Smith, E., Menon, T., & Thompson, L., Status Differences in the Cognitive Activation of Social Networks
(September 22, 2010). Organization Science, Forthcoming.
Waber, B.N., Olguin Olguin, D., Kim, T., Mohan, A., Ara, K., and Pentland, A. 2007. "Organizational
Engineering using Sociometric Badges" International Conference on Network Science, New York, NY.
Wu, Waber, Aral, Brynjolfsson & Pentland. "Mining Face-to-Face Interaction Networks Using
Sociometric Badges: Predicting Productivity in an IT Configuration Task", International Conference
on Information Systems, Paris, France, December 14-17, 2008.
Uzzi, B. 1996. "The Sources and Consequences of Embeddedness for the Economic Performance of
Organizations: The Network Effect." American Sociological Review, (61):674-98.
Uzzi, B. 1997. "Social Structure and Competition in Interfirm Networks: The Paradox of Embeddedness."
Administrative Science Quarterly, 42: 35-67.
Water Cooler Networks:
Performance Implications of Informal Face-to-Face Interaction Structures
in Information-Intensive Work
Lynn Wu
MIT Sloan School of Management
Benjamin N. Waber
MIT Media Laboratory
Sinan Aral
NYU Stern School of Business & MIT Sloan School of Management
Erik Brynjolfsson
MIT Sloan School of Management
Alex (Sandy) Pentland
MIT Media Laboratory
This study examines the performance characteristics of face-to-face interaction networks and
finds that their structural properties are important for effective knowledge transfer and
productivity. We argue that network theory should incorporate the implications of media choice,
and particularly differences between face-to-face and electronic communication, when assessing
how networks affect individual performance. We introduce a new methodology, using
Sociometric badges, to record precise data on face-to-face interaction networks for a group of
workers in a large IT manufacturing firm over a one-month period. Linking these data to
detailed performance metrics, we find that 1) network cohesion is associated with higher worker
productivity, in contrast to previous findings in email data; 2) cohesion in face-to-face networks
is associated with even higher performance during complex tasks, suggesting that cohesion
complements information-rich media for transferring the complex knowledge needed to
complete such tasks; 3) while information-seeking from many colleagues creates disruptions,
more interactions with a few key strong-tie informants speed up work. Face-to-face networks
have more explanatory power than physical-proximity networks, suggesting that information
flows in actual conversations (rather than individuals' correlated exposure to common
environmental factors through physical proximity) are driving our results. These results
augment our understanding of how media choice and network structure interact, shedding light
on the organizational effects of face-to-face interaction. The methods and techniques we
introduce are replicable, creating opportunities for new lines of research into the consequences
of face-to-face interaction in organizations.
Keywords: Social Networks, Face-to-Face Communication, Information Worker Productivity,
Sociometric Badge.
Social networks are theorized to affect work performance due to their central role in the
informal structure of organizations (Sparrowe et al., 2001). Numerous studies have shown that
social networks can affect organizational power, innovation, creativity, and individual and team
performance (Sparrowe et al., 2001; Cummings and Cross, 2003; Krackhardt, 1990). However,
while most studies elicit generalized social network ties or specific types of relationships (such
as friendship or advice seeking relationships) through the use of survey instruments or
electronically recorded communication logs such as email or telephone records, almost no
research examines the performance effects of face-to-face conversational networks using
behavioral data.
Face-to-face conversations, both formal and informal, remain a significant part of
organizational communication which to date has been understudied. Valuable information
passes through verbal communication at the proverbial water cooler and face-to-face
communication can, through non-verbal cues, transmit important calibrations of norms and
culture and provide a medium for the informal development of trust and affect among
organizational members (Csikszentmihalyi, 1996). The lack of studies on face-to-face networks
represents an important gap in social networks research and our understanding of the informal
structures that facilitate work in modern organizations.
Recent advances in electronic communication and ubiquitous email use give researchers
the opportunity to solicit networked interactions through moment-to-moment email
communication data and thus to eliminate some of the known biases of survey instruments
(Quintane and Kleinbaum, 2008). As email communication logs record who has emailed whom,
the exact time of the interaction, and the content of the exchange, email archives allow
researchers to study the mechanisms through which electronic communications impact
organizational structure and work performance (Aral et al., 2006, 2007). Attention has also
turned to other forms of remote collaboration, with researchers examining how groups
coordinate in virtual environments (Huang et al., 2009). However, while email has certainly
become an important communication tool over the last fifteen years, face-to-face conversations
remain a critical and in many cases predominant mode of communication (Chidambaram and
Jones, 1993). What is known about electronic communication networks does not necessarily
inform us about the implications of face-to-face communication networks. Face-to-face
conversations are likely to deliver fundamentally different types of information and advice than
what is transferred via email or over the telephone. Consequently, network properties associated
with improved work performance in face-to-face networks may differ dramatically from
networks actualized through other modes of communication. If we utilize our face-to-face
networks differently than we do our electronic networks - if we transfer different types of
information in different ways across these different media - then the ubiquity of electronic
communication data could lead researchers to invalid generalizations about organizational
networks if electronic communication is accepted as a broad proxy for the generic social
structures instantiated in organizational communication.
Unfortunately, until now, recording precise and reliable data on face-to-face interaction
has been difficult, as it requires moment-to-moment records of individuals' conversations over
time and because there is no trace, electronic or otherwise, of most face-to-face interactions.
While self-reports may be good instruments for recording perceived social ties, they do not
typically capture accurate data about actual face-to-face conversations (Marsden 2005).
Recording precise moment-to-moment interactions between individuals is essential to
understanding how workers seek and access information informally to solve complex problems
at work.
To fill this gap, we employ a new data collection method that utilizes Sociometric badges
developed at the MIT Media Laboratory to record continuous face-to-face interactions among
employees at a commercial IT hardware facility over time.
Recording actual face-to-face
interactions, we eliminate some of the known biases in self-report studies (Marsden, 2005).
Thus, we are able to introduce observable characteristics of face-to-face communication into
social network analysis to understand how such communication enables or constrains
information transfer and work performance. Combining data on face-to-face interactions with
project and accounting data on the relative performance of the same set of workers, we evaluate
which face-to-face network structures best predict higher performance and whether these
structures differ from those found to predict productivity in the context of electronic
communication networks such as email and telephone communication.
Our analyses uncover three key results. First, we find that in face-to-face networks,
cohesion and strong ties are positively correlated with higher worker productivity. This contrasts
with what has been found in email communication data (Aral et al., 2008; Wu et al., 2009), where
diverse networks with weak ties and structural holes are correlated with higher performance.
These results imply that the mode of communication is essential to understanding the value of
communication in network structures. That cohesive networks are valuable in thick
synchronous information-rich communication channels that provide non-verbal cues (e.g. face-to-face),
while structurally diverse networks are valuable in codified asynchronous
communication channels (e.g. email), suggests that different network structures complement
different communication channels, providing evidence of the need to incorporate theories of
communication media choice into social network theory. Second, network cohesion is associated
with even higher performance when workers are executing complex tasks, suggesting the need
for tight clustered networks to transfer the complex information and knowledge needed to
complete complex work. Third, while information seeking from many colleagues creates
disruptions, more interactions with a few key strong-tie informants speed up work, implying
that larger networks are costly to maintain, while high bandwidth networks of fewer contacts
provide access to relevant information most efficiently. These results are much stronger in
conversation networks than in physical proximity networks, indicating that information flows in
actual conversations (rather than individuals' correlated exposure to common environmental
factors through physical proximity) are driving our results.
Although we cannot firmly identify the direction of causality in our results, panel data
estimates eliminate bias from any unobserved time-invariant factors that may confound the
results. Furthermore, on-site visits and interviews support our conclusions - employees
corroborated their use of face-to-face conversations to communicate complex and embedded
knowledge. These results demonstrate the importance of face-to-face social networks in
predicting worker productivity even as technology-mediated communications become
ubiquitous. Differences in the types of network structures associated with performance in face-to-face
networks compared to electronic communication networks suggest a need for social
network theory to incorporate media choice as a significant driver of network outcomes. Such
evidence is important for managers who face increasingly global and geographically dispersed
work environments, as electronic communication networks alone may not be enough to transfer
the complex tacit knowledge needed for the successful execution of complex tasks.
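To illustrate why panel estimates remove time-invariant confounds, the following minimal sketch compares a pooled slope with a within-worker (fixed-effects) slope. The numbers, the variable names, and the data-generating rule (performance equals 2 times cohesion plus a fixed worker-specific term) are invented for illustration; they are not drawn from the study's data or from its actual specification.

```python
def ols_slope(xs, ys):
    """Slope from a one-regressor least-squares fit of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def demean(values):
    """Subtract the series mean (the within transformation for one worker)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Hypothetical panel: worker -> (cohesion by period, performance by period).
# Performance was generated as 2 * cohesion + a fixed worker trait
# (10 for worker A, 0 for worker B) that the analyst does not observe.
panel = {
    "A": ([1.0, 2.0, 3.0], [12.0, 14.0, 16.0]),
    "B": ([4.0, 5.0, 6.0], [8.0, 10.0, 12.0]),
}

# Pooled estimate: ignores which worker each observation belongs to.
pooled_x = [x for xs, _ in panel.values() for x in xs]
pooled_y = [y for _, ys in panel.values() for y in ys]
pooled_slope = ols_slope(pooled_x, pooled_y)

# Within estimate: demean each worker's series first, which removes
# any worker-specific constant before the slope is estimated.
within_x = [d for xs, _ in panel.values() for d in demean(xs)]
within_y = [d for _, ys in panel.values() for d in demean(ys)]
within_slope = ols_slope(within_x, within_y)
```

In this toy panel the pooled slope is negative because the unobserved worker trait is correlated with cohesion, while the within-worker slope recovers the coefficient of 2 used to generate the data. Time-varying confounds and reverse causality are not addressed by this transformation.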
In this study, we link social network theory (e.g. Granovetter, 1973, Burt, 1992) and
characteristics of face-to-face communication (e.g. Daft and Lengel, 1986, Chidambaram and
Jones, 1993) to understand what types of network structure are most conducive to transferring
knowledge and improving work performance in face-to-face work environments. Specifically, we
contrast new evidence on face-to-face networks with prior results on email networks. Using
electronic communication data, prior work has found that structurally diverse networks with
many weak ties and structural holes are associated with improved individual work performance.
We contend that the opposite should be true in face-to-face networks. Electronic networks such
as email facilitate information sharing that is constrained by the medium to be codifiable and
simple. In contrast, information exchanged in face-to-face networks is more likely to be tacit and
more complex. We hypothesize that such information is transferred more effectively in cohesive
networks of strong ties. By elevating face-to-face network data collection to comparable
standards of accuracy and precision found in electronic communication data, we open new
avenues for understanding how media choice interacts with social network structure to facilitate
knowledge transfer and performance. Although many different types of network data, from call
logs to email to surveys, are used to examine network structure and knowledge transfer in
organizations, most modern network theory remains agnostic about the communication
channels through which information will flow, leaving open the possibility that different
network structures complement different communication media to facilitate knowledge transfer.
By hypothesizing and testing a set of predictions in face-to-face networks that contrast with those
that have been shown to hold in electronic networks, we substantiate a call to bring media choice
back into considerations of the relationship between network structure and work performance.
The Effect of Network Cohesion in Face-to-Face Networks
As face-to-face communication offers the richest medium for sharing complex
knowledge and cohesive networks provide the most effective structure for transferring such
knowledge, we hypothesize that cohesive networks are likely to complement face-to-face
communication in effectively locating and transferring complex tacit knowledge.
Face-to-Face Communication and Knowledge Exchange
A long line of research in organizational communication theory, particularly information
richness theory, posits that face-to-face conversation is the richest mode of communication,
providing multiple social cues through both natural language and body language, greatly
reducing equivocality (Daft and Lengel, 1986, Chidambaram and Jones, 1993). Face-to-face
communication has two important properties that help facilitate information transfers: the
ability to transmit complex and tacit information and the ability to foster trust between actors.
Face-to-face communication is thought to have the greatest capacity to transfer complex
knowledge (Roberts, 2000). Complex knowledge is typically defined in terms of its codifiability
and interdependence. When knowledge is codified, it has a stable meaning and can be expressed
in writing since symbols representing the knowledge (e.g. mathematical formulas, acronyms,
etc.) are already widely understood throughout an organization (Brynjolfsson, 1994; Hansen,
1999; Reagans and McEvily, 2003). On the other hand, tacit knowledge that is difficult to codify
cannot be precisely expressed using existing symbolic representations. Transferring tacit
knowledge can be difficult, because it is hard to articulate. Furthermore, even when knowledge
can be codified, it can still be difficult to transfer if its component concepts are complex or
highly interdependent. This interdependency is characterized by the degree to which knowledge
is part of a larger system of interrelated concepts (Teece, 1986; Winter, 1987). Transferring such
knowledge can be particularly challenging, as it requires the transmission of knowledge related
to the larger conceptual system in addition to the specific knowledge itself. Furthermore,
understanding tacit knowledge may be much more important than understanding codified
knowledge in information-intensive settings. Tacit knowledge is often heavily embedded in a
system and is thus more valuable for solving local or context-dependent problems. Similarly,
because codified knowledge can be easily learned, competitive advantage is more likely to arise
from understanding and using tacit knowledge.
Face-to-face communication provides an efficient channel through which to transfer
tacit and interdependent knowledge, as it facilitates interaction across multiple levels of
communication-verbal, physical, and contextual. In a face-to-face conversation, it is natural to
interrupt, learn and give feedback as two people interact, increasing the information processing
power of the exchange (Nohria and Eccles, 1992). During a face-to-face conversation, one party
can see how the other party is responding and can strategically alter the presentation to facilitate
communication, especially if the concept is complex (Goffman, 1982). When people meet face-to-face,
they are more likely to devote more energy to the other party, as face-to-face
conversations often require full cognitive attention as opposed to other forms of communication
that do not require an immediate response. For example, while email can deliver the same
verbal content as a conversation, informal social mechanisms that ensure recipients devote time
and energy to absorbing the content of the conversation are missing. Thus, face-to-face
communication can be more effective at transferring tacit and interdependent information than
other communication media such as email or the telephone. Furthermore, simultaneously
processing information in multiple ways is critical to creativity and problem solving (Bateson
1973; Csikszentmihalyi 1996). Physical, verbal and contextual cues enabled by face-to-face
conversations can be complementary and mutually enriching, leading to discoveries that would
not have been possible using more asynchronous communication channels.
Face-to-face interactions also foster trust and provide actors the motivation they need to
devote time and effort to transferring complex or tacit information. In a face-to-face
conversation, people can read others' intentions, because humans are quite effective at sensing
and processing non-verbal messages, particularly about emotion and trustworthiness (Putnam,
Similarly, face-to-face interactions can accelerate the bonding process and foster
informal friendship networks (Storper and Venables, 2003). Water-cooler conversations during
breaks allow workers to develop a sense of members' expertise, competence, character and
personalities outside the immediate task environment. These informal conversations create a
basis to develop trust. When people communicate frequently, especially informally, they tend to
create stronger bonds upon which trust can be built (Cheepen, 1988). Once trust is developed, a
source is more willing to initiate knowledge transfers and to work to ensure that recipients
understand the information even if the knowledge is complex and difficult to share (Reagans
and McEvily, 2003).
Network Cohesion and Knowledge Exchange
Network cohesion also supports transfers of tacit complex knowledge. To find a solution
to an ambiguous and complex problem, an information seeker must be able to articulate her
request clearly to others. Absorptive capacity can facilitate the expression of ideas in ways that
others can understand (Cohen and Levinthal, 1990). A cohesive network can effectively increase
absorptive capacity as repeated communication from multiple perspectives allows a group of
actors to develop group-specific communication heuristics that ease the expression of complex
and ambiguous problems. Contacts in a cohesive network can then be more effective in
identifying relevant recommendations and transferring necessary information especially if it is
tacit and context-dependent. Through frequent communication, actors are less inhibited from
asking for clarification and accordingly, they are more likely to assimilate information.
Network diversity and structural holes, the lack of connection between one's contacts,
have been shown to provide access to novel information (Burt 2004) and improve performance
(Burt 2000; Sparrowe et al., 2001; Reagans and Zuckerman, 2001; Cummings and Cross, 2003).
In particular, in analysis of email communication networks, message content and employee
performance, Aral & Van Alstyne (2009) demonstrate that networks with structural holes
deliver diverse and novel information and that access to novel information explains a significant
portion of the variance in productivity - more so for instance than traditional human capital.
Yet, while structurally diverse networks with an abundance of weak ties are beneficial for
exposing actors to novel information, they are less effective at transferring complex knowledge
(Hansen, 1999; 2002). While people with a structurally diverse social network can efficiently
locate information, whether they can successfully assimilate the information depends largely on
the effectiveness of the knowledge transfer. When information is simple, explicit or declarative,
a structurally diverse network with many weak ties is sufficient, as the information can be easily
transmitted between actors. However, a network with many structural holes may not be as
effective as a cohesive network for transferring complex or tacit information for three main
reasons.
First, cohesive networks facilitate knowledge transfer because actors are more likely to
trust each other. Trust has been shown to reduce the perceived cost of information sharing as
well as increase spontaneous information sharing (Kramer, 1999). Without trust, the source
may simply refuse to pass on the information to the recipient. Consequently, it is important to
convince the source that the transfer would not negatively affect them or be too costly. A
cohesive network with strong ties and a dense web of third-party ties can help convince the
source to initiate the transfer (Reagans & McEvily 2003). By creating cooperative motivation
and removing competitive impediments to information transfer, cohesion can increase trust
between parties (Granovetter, 1992; Reagans and McEvily, 2003), which is especially important
for transferring complex knowledge.
Second, greater absorptive capacity in a cohesive network facilitates effective knowledge
transfer (Hansen, 1999). Absorptive capacity is important for recognizing the value of new
information and for enabling the assimilation and application of the information (Cohen and
Levinthal, 1990). A cohesive network can increase absorptive capacity as repeated
communication allows actors to develop relationship-specific communication heuristics that
ease knowledge transfer (Hansen, 1999). With more frequent communication, actors are less
inhibited from seeking information and asking for clarification in a cohesive network, and
accordingly, they are more likely to understand how to correctly use the information more
Third, the redundancy inherent in cohesive networks allows actors to receive
information through multiple perspectives from multiple people, easing knowledge transfer.
Although cohesive networks have been criticized for supplying redundant information,
redundancy can also be a powerful instrument for effectively transferring tacit knowledge.
Redundancy does not simply duplicate existing knowledge, but also often creates an intellectual
common ground that can help individuals sense what others are struggling to articulate
(Nonaka, 1990; 1994; Grant, 1996).
Consequently, cohesive networks can facilitate tacit
information transfers by allowing the same information to be repeated multiple times from
different perspectives.
Complementarity between Network Cohesion and Face-to-Face Interaction
Transfers of tacit complex knowledge require both frequent, embedded interaction and
high fidelity conversation. While cohesive networks increase absorptive capacity by providing
the infrastructure for frequent interactions, face-to-face conversations increase absorptive
capacity by enhancing the fidelity of each conversation. Frequent interactions, although helpful,
cannot replace nonverbal cues and feedback available in a face-to-face conversation. Face-to-face
communication offers the maximal information transfer in each exchange. Through verbal,
physical and contextual cues, face-to-face conversations can greatly improve understanding of
complex concepts. Although email may textually deliver the same verbal content of a
conversation, it lacks rich social cues which recipients process simultaneously and multimodally to absorb the content of the conversation. When an idea is particularly ambiguous and
complex, structurally diverse networks are likely to be insufficient because infrequent
communications in diverse networks inhibit the development of group norms and group-specific
communication heuristics. Without repeated interactions, enabled by a cohesive network,
people may experience difficulty expressing their ideas and are less likely to receive relevant
responses. Thus, face-to-face communication and cohesive networks should complement each
other to provide both frequent and high fidelity communication to improve the ability of a
person to express complex ideas and disambiguate misconceptions.
Since cohesive networks can foster similar norms, enforce sanctions when someone
misbehaves and cultivate reputations among the group members, cohesive networks can also
remove motivational impediments to information sharing. These properties allow members of a
cohesive social network to trust each other. The dense web of third party ties in cohesive
networks also creates protection and cooperative motivation for the source to share information.
When a source refuses to help, her reputation may suffer as her uncooperative behavior can
quickly pass through a cohesive network and others may immediately issue sanctions against
her. Similarly, it also offers protections to the source, if the recipient of the information
inappropriately uses the information. News of the untrustworthy behavior would quickly spread
in a cohesive network and the person would no longer receive cooperation from others.
While face-to-face communications can develop trust and cooperative behaviors in
general, their effectiveness in structurally diverse social networks is likely to be limited. Without
frequent communication needed to establish trust, face-to-face communication alone is not
enough to motivate cooperative behaviors in a structurally diverse network where
communication may be relatively rare. Furthermore, without a dense web of third-party ties to
vouch for the information seeker and protect the source, the source is less likely to be willing to
initiate the transfer, especially if the information is complex or sensitive. There is also little
penalty for refusing requests for information or for the seekers to misuse information since they
are likely to belong to different communities than the source.
Lastly, both face-to-face
communication and cohesive networks can promote
serendipitous encounters where people can receive valuable information without necessarily
seeking it. Proverbial water cooler conversations are often random encounters. With multiple
levels of communication - verbal, physical and contextual - face-to-face conversations during
these random encounters allow actors to simultaneously process information multi-modally,
which is critical for creativity and problem solving (Bateson, 1973). Frequent communication
enabled by cohesive networks improves the probability of having serendipitous meetings. Thus,
face-to-face conversations and cohesive networks are likely to be complementary and mutually
enriching, leading to discoveries that would not have been possible using more asynchronous
communication channels or less cohesive networks.
In summary, we expect that face-to-face networks require network cohesion to transfer
the more complex, embedded knowledge that they are typically relied upon to transfer, and that
face-to-face communication complements network cohesion in facilitating complex knowledge
transfers. We therefore hypothesize that network cohesion is positively associated with work
performance in face-to-face networks.
Hypothesis 1a: Cohesion in face-to-face networks, measured by network constraint, is
correlated with stronger information worker performance.
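Network constraint is commonly computed following Burt (1992): for each contact j, the share of ego's tie strength invested in j directly, plus the share invested indirectly through mutual contacts q, is squared, and these terms are summed over contacts. The sketch below assumes this standard definition and an undirected, weighted network. The function names and the two toy networks are hypothetical, and the exact operationalization used in the study may differ.

```python
def build_adjacency(edges):
    """Build a symmetric adjacency map from (a, b, weight) triples."""
    adj = {}
    for a, b, weight in edges:
        adj.setdefault(a, {})[b] = weight
        adj.setdefault(b, {})[a] = weight
    return adj


def proportion(adj, i, j):
    """Share of i's total tie strength that is invested in contact j."""
    total = sum(adj[i].values())
    return adj[i].get(j, 0.0) / total if total else 0.0


def constraint(adj, ego):
    """Burt's constraint: sum over contacts j of (p_ej + sum_q p_eq * p_qj)^2."""
    total = 0.0
    for j in adj[ego]:
        # Indirect investment in j through every other contact q of ego.
        indirect = sum(
            proportion(adj, ego, q) * proportion(adj, q, j)
            for q in adj[ego]
            if q != j
        )
        total += (proportion(adj, ego, j) + indirect) ** 2
    return total


# Toy example 1: ego's three contacts do not talk to each other (structural holes).
star = build_adjacency([("ego", "a", 1), ("ego", "b", 1), ("ego", "c", 1)])

# Toy example 2: the same three contacts all talk to each other (cohesive).
clique = build_adjacency([
    ("ego", "a", 1), ("ego", "b", 1), ("ego", "c", 1),
    ("a", "b", 1), ("a", "c", 1), ("b", "c", 1),
])

star_constraint = constraint(star, "ego")
clique_constraint = constraint(clique, "ego")
```

With equal tie weights, the ego in the cohesive toy network has a higher constraint score than the ego whose contacts are disconnected, which is why constraint can serve as a measure of cohesion.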
Furthermore, we expect the effect to be more pronounced when workers are engaged in
complex tasks that require more tacit and interdependent information. Simple tasks that require
relatively codified and context independent information can often be solved without the
necessity of using a cohesive network or face-to-face communication. However, when workers
face complex tasks that presumably require access to tacit and embedded information, manuals
prove to be less useful and workers must turn to face-to-face conversations with colleagues in a
cohesive network to access the desired information. As face-to-face communication is the richest
medium that can most effectively transfer complex knowledge between actors, and because
cohesive networks are most effective for transferring complex knowledge, a cohesive face-to-face
network may be especially helpful in transferring tacit or embedded knowledge used for the
execution of complex tasks. Thus, we hypothesize:
Hypothesis 1b: Cohesion in face-to-face networks, measured by network constraint, is
more helpful for completing complex tasks than simple tasks.
The Effect of Tie Strength in Face-to-Face Networks
Knowledge sharing is more common among strong ties (e.g. Henderson and Cockburn
1994, Eisenhardt and Tabrizi, 1995, Hansen 1999) because individuals linked by strong ties have
greater motivation to be of assistance and to make themselves available to one another
(Granovetter 1982). Strong ties also facilitate more frequent two-way interactions that help
assimilate information, allowing recipients to get immediate feedback from strong tie contacts
while information is being transferred (Polanyi 1966, Barton and Sinha, 1993). High frequency
interactions with strong ties allow the source and recipient to develop relationship-specific
heuristics that make it easier for the recipient to understand and use the information provided
(Mergel, Lazer, and Binz-Scharf, 2008). Strong ties also tend to minimize conflict between
individuals, making information sharing more likely (Hansen 1999).
In his study of networks amongst business units in a national electronic and computer
firm, Hansen (1999) finds that the effectiveness of weak and strong ties in transferring
information is contingent on the complexity of the information being transferred. Weak ties
have the strongest positive effect on information sharing when knowledge is codified and
context independent, while strong ties provide the strongest effect when the knowledge being
transferred is tacit and interdependent. Centola and Macy (2007) also document the contingent
benefit of weak ties. While they are efficient for propagating simple contagions, weak long
ranging ties are not necessarily sufficient for propagating complex contagions, in which an actor
is infected only if more than one neighbor is also infected. However, very few researchers have
examined the degree to which the effect of strong or weak ties on knowledge transfer is
moderated by the communication channels through which information is transmitted. We
therefore recast arguments about cohesion and tie strength in the context of different channels
of communication,
contrasting thick rich face-to-face interactions
with asynchronous
communications such as email in which non-verbal cues and simultaneous interaction are
absent. We argue these differences should change predictions about which types of network
structure are likely to support information exchange and thus performance.
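The distinction between simple and complex contagions described by Centola and Macy (2007) can be illustrated with a small threshold model. The sketch below is a simplified illustration, not their simulation: it assumes two four-person cliques joined by a single weak bridging tie, and the node names and the `spread` helper are hypothetical.

```python
def clique(nodes):
    """Fully connect a group of nodes; returns node -> set of neighbors."""
    return {n: set(nodes) - {n} for n in nodes}


def spread(adj, seeds, threshold):
    """Infect a node once at least `threshold` of its neighbors are infected."""
    infected = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, neighbors in adj.items():
            if node not in infected and len(neighbors & infected) >= threshold:
                infected.add(node)
                changed = True
    return infected


# Two cohesive groups connected by one weak, long-range tie (a4 - b1).
left = ["a1", "a2", "a3", "a4"]
right = ["b1", "b2", "b3", "b4"]
adj = {**clique(left), **clique(right)}
adj["a4"].add("b1")
adj["b1"].add("a4")

# Simple contagion: one infected neighbor is enough.
simple = spread(adj, seeds={"a1", "a2"}, threshold=1)

# Complex contagion: more than one infected neighbor is required.
complex_ = spread(adj, seeds={"a1", "a2"}, threshold=2)
```

Under these assumptions the simple contagion crosses the bridge and reaches all eight nodes, while the complex contagion saturates the first clique and stops, because the bridging node in the second clique never has two infected neighbors.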
Strong ties facilitate complex information sharing primarily because they enable
frequent interactions, which allow actors to establish effective communication mechanisms,
foster trust and minimize conflicts, all of which are essential to sharing information (Hansen,
1999). In the same way that face-to-face communication should complement cohesive networks
in transferring information, it should complement strong ties by improving the quality of each
communication exchange. Together, tie strength and network cohesion provide both the quality
and the frequency of communication that are necessary to build trust, minimize information
distortion and improve information transfer. Other text-based communications, such as email,
do not have the capacity to transmit the rich social cues that are necessary to transfer tacit and
embedded information, and accordingly, cannot complement strong ties to transfer complex
information. However, maintaining dyadic ties is costly in face-to-face networks, which require
time-consuming conversations and co-presence (Burt, 1992; Uzzi, 1997; Hansen, 1999). The
cost may not justify the benefit if these ties are used to transfer simple information, which does
not necessarily require a rich communication channel. On the other hand, selectively cultivating
a few strong ties in face-to-face settings may be extremely beneficial if they are used to transfer
complex and sensitive information.
Thus, the tradeoff between using strong or weak ties in a face-to-face conversational
network depends largely on the nature of the information being transferred. If the task requires
complex information, strong ties can give extra power in transmitting knowledge. However, if
the information is simple and can be articulated in writing, maintaining a large number of close
contacts may be too expensive to justify the maintenance cost. Maintaining nonessential face-to-face
contacts may take time away from task completion activities and thus hurt productivity.
Thus, we hypothesize that network size (maintaining many face-to-face contacts) has a negative
average effect on work performance, as transferring simple knowledge through direct contacts is
expensive. However, relying on strong ties to access information in face-to-face networks may
be beneficial when workers are executing complex tasks that require more complex and tacit
Hypothesis 2a: On average, network size has a negative effect on work performance.
Hypothesis 2b: Strong ties have a positive effect on work performance when solving
complex tasks.
The Effect of Network Reach in Face-to-Face Networks
We have argued that face-to-face communication can complement a cohesive network
and strong ties to transfer complex information. However, before the transfer can occur, it is
important to find the source of the information first. The ability to access multiple parts of the
informal organizational network helps a worker find required information as well as to seek
advice and support in solving problems and meeting client requirements (Hansen 1999).
Network reach, the number of individuals one can reach within two steps in the network, has
been theorized to provide access to a broader range of information and solutions to solve
problems encountered during task execution (Reagans and McEvily, 2003). Broad network
reach can facilitate search because it enables actors to access people in various parts of the
network who could provide a given piece of knowledge either directly or through a friend.
Consequently, reach should be positively correlated with performance controlling for network
size and network cohesion.
While broad network reach is desirable for the information search process, its effect is
likely to be more pronounced in a face-to-face social network due to information distortion. For
textual exchanges, distortion is less problematic because the original text can be passed
electronically without alteration. When information passes through a long path in a face-to-face
network, it is more likely to be distorted, as people tend to misunderstand or misinterpret
information through each exchange (Collins and Guetzkow, 1964; Huber and Daft, 1987;
Gilovich, 1991; Hansen, 2002). Imprecise or inaccurate information can have a negative
performance impact on recipients. Acting on vague information obtained indirectly, the
recipient may need to use her ties to connect to the original information source, only to find it
was not what she sought. Eliminating misleading information is costly, as verifying each
incorrect lead wastes valuable time and effort. When an actor with high network reach can easily
access other experts directly, not only is she exposed to less information distortion, she can also
access knowledge more quickly. We therefore hypothesize that:
Hypothesis 3a: Greater reach in face-to-face networks is correlated with stronger
Transfers of complex knowledge may experience even greater distortion than transfers of
simple knowledge, as complex knowledge is inherently more difficult to understand and is more
likely to be misinterpreted. Broad network reach can reduce information distortion and promote
knowledge transfer by influencing an actor's ability to effectively access complex ideas in the
network. Workers with broad network reach are exposed to more views and perspectives,
allowing them to understand information from different angles and to frame information in
ways others can understand. Thus, we hypothesize that broad network reach is particularly
beneficial for complex tasks that require both more information and information that is
inherently more complex and therefore more difficult to transfer and absorb.
Hypothesis 3b: Broad network reach in face-to-face networks is more strongly
associated with improved performance when completing complex tasks.
Background and Data
We studied a data server configuration facility with 56 employees, 36 of whom
participated in this study. The organizational chart of the division can be seen in Figure 1. While
the job description of employees at the facility is heterogeneous, we focus on a set of employees
whose primary role is to guide, solicit and capture clients' IT configuration requirements, and to
produce IT products according to those specifications. The group consists of 28 people, 23 of
whom participated in the study. Interviews indicate that the data configuration process is
information-intensive, requiring employees to quickly analyze the feasibility of specifications
and build the system. Our onsite interviews with both managers and configuration specialists
indicate that talking to other employees is particularly helpful for understanding how the overall
system works, how requirements fit together and how interoperability constrains the set of
viable specifications, as there are no existing manuals to explain all the intricacies of the system.
Employees therefore engage in face-to-face communication to transfer tacit and embedded
knowledge. Each configuration task is executed by a single individual and is randomly assigned
given a workload constraint, much like a series of queued tasks. In this setting, everyone in the
configuration division is placed in a large room with four rows of cubicles. Each row has 4 pairs
of cubicles with each pair facing each other. Since everyone is collocated in the same room, there
are ample opportunities to meet face-to-face.
Top level: Department Manager (1 person)
Middle level: Configuration Manager, Pricing Manager, Business Coordinator
Teams: Configuration Strategists (28 people), Pricing Strategists (10 people), Business Coordinators (14 people)
Figure 8: Organizational Chart
To measure worker performance, we collected data on 911 configuration tasks during the
experimental period of 25 working days (more than one month's activities at the facility). For
each task, we gathered data on the task duration, difficulty level, the number of follow-ups the
employee conducted with the client, and information about the employee who performed the
task. Although some of the tasks took less than a day to finish, tasks that took more than one day
deserve special consideration as we cannot assume the worker is working on the task 24 hours a
day. To better approximate the completion time of tasks that span multiple days, we assumed an
8-hour workday. Our interviews with staff indicate that employees typically follow this work
schedule and rarely stay late or work on weekends to catch up. Although task completion time is
only one dimension of work performance, it is an important outcome in the computing industry
(Eisenhardt & Tabrizi 1995), and in this organization employees are formally evaluated on this
The Sociometric Badge
To capture face-to-face interactions, employees participating in the study were
instructed to wear a Sociometric badge every day from the moment they arrived at work until
they left their office. In total, we collected
hours of data. All of these employees were male
and since this was a recently formed department, none had been employed for more than one
year. We recorded every face-to-face conversation between workers using the Sociometric badge
and continuously logged physical proximity, as well as many other behavioral features such as
animation, tonal variation and the sequence and timing of interactions. The content of the
conversations was not recorded.
Capabilities of Wearable Sociometric Badge
Recognizing common daily human activities (such as sitting, standing,
walking, and running) in real time using a 3-axis accelerometer (Olguin
Olguin & Pentland, 2006).
Extracting speech features in real time to capture nonlinguistic social signals
such as interest and excitement, the amount of influence each person has on
another in a social interaction, and unconscious back-and-forth interjections,
while ignoring the words themselves in order to assuage privacy concerns
(Pentland, 2005).
Performing indoor user localization by measuring received signal strength and
using triangulation algorithms that can achieve position estimation errors as
low as 1.5 meters, which also allows for detection of people in close physical
proximity (Sugano, Kawazoe, Ohta, & Murata, 2006; Gwon, Jain, &
Kawahara, 2004).
Communicating with Bluetooth enabled cell phones, PDAs, and other devices
to study user behavior and detect people in close proximity (Eagle & Pentland,
Capturing face-to-face interaction time using an IR sensor that can detect
when two people wearing badges are facing each other within a 30°-cone and
one meter distance. Choudhury (2004) showed that it was
possible to detect face-to-face conversations of more than one minute using
an earlier version of the Sociometric badge with 87% accuracy.
The 'wearable badge' form factor is particularly useful for collecting data on face-to-face
interactions in organizational contexts. First, most organizations already require individuals to
wear identification badges with embedded radio frequency identification (RFID). It is not hard
to extend the sensing functionality of these badges further with accelerometers, infrared (IR)
transceivers, and microphones. Second, wearable badges are less obtrusive than sensors that
require a long setup period to function. The success of IT products that employ this form factor
for wearable sensors, such as the nTag and Vocera systems, suggests that this technology is
acceptable to users in a wide variety
of contexts. The capabilities of the wearable Sociometric badge are described and a picture of the
badge is shown in Figure
Below, we describe in detail how we use the Sociometric badge to
detect face-to-face interactions and physical proximity.
Infrared (IR) can be used to detect face-to-face interactions between people. In order for
one badge to be detected by another through IR, two Sociometric badges must have a direct line
of sight to each other. Every time a badge detects an IR signal we say that face-to-face
interaction may occur. We define the total amount of face-to-face interaction time per person as
the total number of consecutive IR detections per contact multiplied by the interval between IR
transmissions, which in our experiments was two seconds. We find that this gives a good balance
between detection accuracy and power expenditure. We can detect whether a person is speaking
through the tonal variations captured through the badge. With both voice and IR detection, we
can predict whether two people are having conversations reasonably well. More details about
the Sociometric badges can be found in the research note accompanying this article (Waber et
al. 2010).
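The interaction-time definition above can be sketched in a few lines. This is only an illustration under stated assumptions: the input format (a list of wearer/contact pairs, one per IR detection) and the function names are hypothetical, and every logged detection is assumed to already belong to a consecutive run, so no filtering of isolated detections is attempted.

```python
from collections import defaultdict

# Interval between IR transmissions, two seconds as stated in the text.
IR_INTERVAL_SECONDS = 2


def pair_interaction_seconds(detections, interval=IR_INTERVAL_SECONDS):
    """Return face-to-face seconds for each (wearer, contact) pair.

    `detections` is a hypothetical log: one (wearer, contact) tuple per
    IR detection. Each detection is credited with one transmission interval.
    """
    totals = defaultdict(int)
    for wearer, contact in detections:
        totals[(wearer, contact)] += interval
    return dict(totals)


def person_interaction_seconds(pair_totals):
    """Aggregate pair-level seconds into a total per badge wearer."""
    per_person = defaultdict(int)
    for (wearer, _contact), seconds in pair_totals.items():
        per_person[wearer] += seconds
    return dict(per_person)


log = [("a", "b"), ("a", "b"), ("a", "c"), ("b", "a")]
pairs = pair_interaction_seconds(log)
people = person_interaction_seconds(pairs)
```

In practice the voice signal described above would also be needed to decide whether a detected encounter was a conversation; that step is left out of this sketch.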
Measuring Physical Proximity and Location
Sociometric badges can detect other badges in close proximity (within 10 meters) in an
omni-directional fashion using the badge's radio. To record the location of an individual, base
stations with overlapping ranges are used to triangulate the wearable badge's position down to
the sub-meter level. The detailed technical description for the location sensing capabilities can
be found at
The Advantage of Recording Precise Face-to-Face Social Networks
Using Sociometric badges, we can accurately detect face-to-face interactions and
construct real face-to-face social networks for a group of people over time. Traditionally,
self-reports such as surveys and questionnaires are used as the primary tool for generating social
network data. Even then, to make survey results more accurate and reliable, subjects are often
surveyed repeatedly and questions can be difficult to answer, as they require subjects to recall
specific events in the past (Marsden, 1990). All this can entail a considerable burden to the
subject, leading to low participation and high drop off rates. Sociometric badges can on the
other hand record accurate social network data while minimizing the cost and burden incurred
by participants (participants simply wear a badge that is slightly larger than a typical
identification card). Furthermore, the badge does not record actual conversations and data is
de-identified prior to analysis, preserving the privacy of employees. Precise measurements of
networks may not be as critical for studies of social influence, attitude and opinions as are
cognitive networks obtained through self-report. However, obtaining accurate knowledge of
precise, time-stamped interaction data is critical for studies of contagion (Centola, 2007) and
information transfer and diffusion (Marsden, 1990), which is the focus of this study. As valuable
information can pass through any type of relationship, it is crucial to record all instances of
information exchange instead of just those recalled in self-reports.
Respondents are generally good at remembering recent, frequent, and intense
interactions but poor at recalling their interactions with weak and distant ties. Thus, survey
methods tend to reveal strong, close relations as opposed to interactions with distant or weak ties.
For example, Brewer (2000) reported an appreciable level of omission of weak ties in a
dormitory friendship study. Brewer (2000) recommends a few steps to reduce the level of
forgetting, including asking subjects to choose their friends and relations from a list rather than
asking them to recall contacts directly from memory. Using multiple name generators can also
be helpful as friends forgotten in one generator can be named in response to other generators.
While these methods can alleviate some of the bias introduced by self-reports, data collection
may still be problematic since certain network measures can change dramatically with even a
few missing ties (Borgatti, Carley and Krackhardt, 2006). Those missing ties could pass
important information and failure to capture them can be problematic. Recent developments in
data collection methodologies have advanced the accuracy of network measurements to include
weak and distant ties as well as strong ties. These methods include position generators (Lin, Fu
and Hsung, 2001) and resource generators (Van der Gaag and Snijders, 2005) that simply ask
the subject to recall the number of people she knows in a particular context instead of naming
specific individuals. However, without knowing the specific alters, it is hard to construct a
network beyond one degree of separation. Without a complete network, it is difficult to calculate
structural properties such as structural holes and centralities, which are essential for the study
of information transfer and diffusion (Marsden, 1999).
Furthermore, Marsden (1990, 2005) finds that while respondents are capable of
reporting on their local networks in general terms, they are typically unable to give useful data
on the exact timing of interactions. Participants tend to respond to surveys in ways that make
them look as good as possible and consequently they tend to under-report behaviors deemed
inappropriate and to over-report behaviors viewed as appropriate (Donaldson and Grant-Vallone,
2002). Sociometric badges can objectively measure the actual occurrence of face-to-face
interactions and record precisely when, with whom and for how long two people spoke,
greatly minimizing reporting biases (Olguin Olguin et al., 2009).
Network Variable Construction
We measure Network Size as the number of direct contacts one has. In face-to-face
networks, a direct link between two actors exists when they engage in at least one conversation
during the experimental period. Physical proximity networks, on the other hand, are a broader
measure of direct links where network size counts an interaction between actors when they
either engaged in a conversation or were physically within ten meters of each other. The Volume
of Interactions measures the total number of interactions an actor has with anyone in the
network. This differs from network size as it counts all communication incidents regardless of
with whom the actor has interacted. For example, an actor who communicates 100 times to a
single person in the network would have the same volume of interactions as someone who
communicates with 100 different people once. The network size in the former case is one, but in
the latter case is 100. While both variables measure the number of direct interactions between
actors, network size may have a stronger effect than the volume of interactions in enabling
workers to access and transfer complex information. Since a high volume of interaction may also
only involve a small group of actors, frequent interaction with the same person may be
redundant and may not add value for knowledge transfer. The definitions of all the network
variables are shown in Table 1.
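The distinction between network size and volume of interactions can be illustrated with a short sketch. The input format (one pair of actors per conversation) and the function name are assumptions made for illustration; each conversation is counted for both participants.

```python
from collections import defaultdict


def size_and_volume(conversations):
    """Return {actor: (network_size, volume)} from a list of conversations.

    `conversations` holds one (actor, partner) tuple per conversation.
    Network size counts distinct partners; volume counts all conversations.
    """
    contacts = defaultdict(set)
    volume = defaultdict(int)
    for a, b in conversations:
        contacts[a].add(b)
        contacts[b].add(a)
        volume[a] += 1
        volume[b] += 1
    return {actor: (len(contacts[actor]), volume[actor]) for actor in contacts}


# The example from the text: 100 conversations with one person versus
# one conversation with each of 100 different people.
repeated = size_and_volume([("x", "p")] * 100)
spread = size_and_volume([("y", f"p{i}") for i in range(100)])
```

Both actors end up with the same volume of interactions (100), while their network sizes differ (1 for `x`, 100 for `y`).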
Table 1: Network Characteristics and Description
Variable                  Description
Network Size              The total number of contacts with whom an actor exchanges at least one message
Volume of Interactions    The total number of face-to-face interactions an actor experiences
Betweenness Centrality    The probability that an actor falls on the shortest path between any two other actors
Cohesion (Constraint)     Degree to which an actor's contacts are connected to each other
Network Reach             The number of other people an actor can reach in two links or fewer
Tie Strength is measured using the frequency of one's face-to-face communication with
other actors. Granovetter (1982) described four identifying properties of the strength of ties as
time, emotional intensity, intimacy and reciprocity. In practice, tie strength has been measured
in many ways. Some use reciprocation to represent strong ties and a lack of reciprocation as
evidence of weak ties (Friedkin, 1980). Others have included the recency of contact (Lin et al.,
1978) or the frequency of interactions as a surrogate for tie strength (Granovetter, 1973), and we
adopt a similar measure here. We define actor A to have a strong tie to actor B when the
frequency of communication between A and B exceeds A's average communication frequency
with everyone else in the network by more than one standard deviation:
$$\mathrm{StrongTie}(A,B) = 1 \quad \text{if} \quad \mathrm{freq}(A,B) > \mathrm{average}\big(\mathrm{freq}(A,\cdot)\big) + \mathrm{std}\big(\mathrm{freq}(A,\cdot)\big)$$
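A minimal sketch of the strong-tie rule follows. It rests on assumptions the text does not settle: the standard deviation is taken as the population standard deviation, the average is taken over A's observed contacts only, and the function name and input format are hypothetical.

```python
from statistics import mean, pstdev


def strong_ties(freq_from_a):
    """Return the contacts of A that qualify as strong ties.

    `freq_from_a` maps each contact of A to the communication frequency
    between A and that contact. A tie is strong when its frequency exceeds
    the mean of A's frequencies plus one (population) standard deviation.
    """
    values = list(freq_from_a.values())
    if not values:
        return set()
    threshold = mean(values) + pstdev(values)
    return {contact for contact, f in freq_from_a.items() if f > threshold}


example = strong_ties({"B": 20, "C": 2, "D": 3, "E": 3})
```

With these illustrative frequencies the mean is 7 and the threshold is roughly 14.5, so only contact B is classified as a strong tie.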
Table 2: Summary Statistics F2F Network Variables
Columns: Obs., Mean, Std. Dev.
Latent Network Variables: Network size, 2-step reach
3-day Network Variables: Network size, 2-step reach
We use Network constraint, $C_i$, to measure the cohesion of the network in which an
actor is embedded. Constraint measures the degree to which an individual's contacts are
connected to each other. $p_{ij}$ is the proportion of i's network time and energy invested in
communicating with j. Network constraint can be used as a proxy for measuring network cohesion
(Burt, 1992). In the hypothetical network in Table 2, $C_{12}$ is much higher than $C_7$, because friends
of actor 12 are more likely to be friends with each other than friends of actor 7. We construct
network characteristics for both face-to-face and physical proximity. The network topologies are
shown in Figures 3 and 4, and summary statistics are provided in Tables 3.
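The constraint formula is not reproduced in this section, so the sketch below assumes Burt's (1992) standard definition, in which the constraint on actor i sums, over each contact j, the squared total of the direct investment p_ij and the indirect investment through mutual contacts q (p_iq times p_qj). The symmetric weight dictionary and the function name are illustrative assumptions.

```python
def constraint(weights, i):
    """Burt-style network constraint for actor `i`.

    `weights` maps each actor to a dict {neighbor: tie weight}, with ties
    stored symmetrically. p(a, b) is the share of a's total tie weight
    that is invested in b.
    """
    def p(a, b):
        ties = weights.get(a, {})
        total = sum(ties.values())
        return ties.get(b, 0) / total if total else 0.0

    result = 0.0
    for j in weights.get(i, {}):
        # Indirect investment in j through i's other contacts q.
        indirect = sum(p(i, q) * p(q, j) for q in weights[i] if q != j)
        result += (p(i, j) + indirect) ** 2
    return result


# Ego "e" with two contacts who do not know each other (structural hole).
open_net = {"e": {"a": 1, "b": 1}, "a": {"e": 1}, "b": {"e": 1}}
# Same ego, but the two contacts are also tied to each other (cohesive).
closed_net = {"e": {"a": 1, "b": 1}, "a": {"e": 1, "b": 1}, "b": {"e": 1, "a": 1}}
```

Under this definition the ego in the closed triad is more constrained than the ego spanning a structural hole, which matches the direction of the comparison between actors 12 and 7 in the text.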
Network Reach measures the degree to which any member of a network can reach
everyone else in the network. We measure 2-step reach, which calculates the number of actors
that an individual can reach in the network in 2 steps, because our network is small enough that
all actors are able to connect to everyone else in the network in three steps or less. Actor 7,
located in the center of the network in Table 2, can reach eight other employees in two steps and
therefore has a higher network reach than actor 12 who can only reach five others.
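Two-step reach can be computed directly from an adjacency list. The sketch below is illustrative only: the adjacency format and function name are assumptions, and reach is returned as a count of other actors rather than as a share of the network.

```python
def two_step_reach(adjacency, actor):
    """Count the other actors reachable from `actor` in at most two steps.

    `adjacency` maps each actor to an iterable of direct contacts.
    """
    direct = set(adjacency.get(actor, ()))
    reached = set(direct)
    for contact in direct:
        # Contacts of contacts are reachable in two steps.
        reached.update(adjacency.get(contact, ()))
    reached.discard(actor)  # an actor does not count toward its own reach
    return len(reached)


# A simple chain a - b - c - d.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
reach_a = two_step_reach(chain, "a")
reach_b = two_step_reach(chain, "b")
```

In the chain, `a` reaches `b` and `c`, while the more central `b` reaches all three other actors.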
Table 3: Network Measures for a Hypothetical Network
Measure                   Actor 7    Actor 12
Size (direct contacts)    4          3
Betweenness (Btw)         33         6
Reach                     67%        41%
Constraint (Constr)       0.47       0.84
Control Variable Construction
Other network characteristics may influence information sharing and work performance.
Betweenness centrality $B(n_i)$ has been widely used to measure how fast a person can obtain
information in a network. It measures the probability that an individual i will fall on the shortest
path between any two other individuals in a network (Freeman, 1979), where $g_{jk}(n_i)$ is the
number of shortest geodesic paths from j to k that pass through node $n_i$, while $g_{jk}$ is the
total number of shortest geodesic paths from j to k:
$$B(n_i) = \sum_{j<k} \frac{g_{jk}(n_i)}{g_{jk}}$$
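The betweenness definition above can be sketched for a small unweighted, undirected network. The adjacency format and function names are assumptions for illustration, and the result is the unnormalized sum over pairs of other actors. The count of shortest paths through a node uses the fact that a shortest path from j to k passes through n exactly when the distances j to n and n to k add up to the distance j to k.

```python
from collections import deque
from itertools import combinations


def bfs_counts(adj, source):
    """Return (distance, number of shortest paths) from `source` to each node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(adj, n):
    """Sum over pairs (j, k) of the share of shortest j-k paths through n."""
    info = {s: bfs_counts(adj, s) for s in adj}
    dist_n, sigma_n = info[n]
    total = 0.0
    for j, k in combinations([v for v in adj if v != n], 2):
        dist_j, sigma_j = info[j]
        if k not in dist_j or n not in dist_j:
            continue  # j cannot reach k or n
        if dist_j[n] + dist_n[k] == dist_j[k]:
            total += sigma_j[n] * sigma_n[k] / sigma_j[k]
    return total


star = {"c": ["x", "y", "z"], "x": ["c"], "y": ["c"], "z": ["c"]}
square = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["a", "c"]}
```

In the star, the center lies on every shortest path between the three leaves; in the square, each node carries half of the two shortest paths between its two neighbors.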
As shown in the hypothetical network in Table 2, actor 7 is located in a relatively more central
position than actor 12. As actor 7 is closer to three different groups of actors, her betweenness
centrality is much higher than actor 12. In addition to network structure, we posit two broad
categories of factors that may influence the task completion rate: characteristics of tasks and of
individual workers.
Characteristics of Tasks
As harder tasks take longer to finish, task difficulty is strongly correlated with time to
completion. We include two controls for task complexity: task difficulty and the number of
follow-ups. Managers determine the task difficulty based on the initial request and parameters
of the job and assign one of three difficulty levels to each task: basic, complex, or advanced.
These difficulty ratings can be revised during task execution, although most task complexity
scores are never modified. Instead, another metric, the number of follow-ups, is used to
approximate the complexity level of the task during execution. When tasks are particularly
complex, the number of follow-ups between the IT worker and the sales team increases. We
therefore include controls for both the assigned (and revised) task complexity and the number of
follow-ups that occurred during the project.
Although managers assign one of three complexity levels to all tasks: Basic (Low
Complexity), Complex (Medium Complexity) and Advanced (High Complexity), the majority of
tasks performed in our sample are basic tasks. Basic tasks can be completed quickly, usually in
one day, since these jobs are generally straightforward and routine tasks that do not require any
advanced technical skills. Often these basic configurations are components of a larger system
and workers only need local knowledge about the component, rather than of the entire system,
to successfully configure the product. To complete these tasks, workers can use simple
off-the-shelf configurations or follow detailed instructions already created by the sales team or the
client. Even if they encounter technical difficulties during the task, IT workers can find most of
the solutions in existing manuals or knowledge databases. Rarely do they need to consult others
to solve these problems. Thus, completing simple tasks usually only requires codified and
context-independent information. The difficulty of a complex task lies in between advanced and
basic tasks. In our sample, fewer than 10% of the tasks are labeled as complex tasks, while the
rest are labeled as either advanced or basic.
Advanced tasks are the most difficult tasks assigned by the manager. They are often
novel and technologically complex configurations that require more advanced system
knowledge. Special and customized orders to build an entire hardware system are typical
examples of these tasks. To design these configurations, the IT worker needs to understand the
entire system, especially the compatibility of various components within the system design.
Solutions in such configuration tasks are usually new innovations that cannot be found in
existing manuals or databases. These tasks typically take longer to complete than simple routine
tasks and the consultant often must confer with other team members to create a viable solution.
Some tasks are classified as advanced not necessarily because of their technical difficulty
but because the task description is vague. In addition, customers may impose a budget constraint.
Putting together a system with a set of functionalities under the budget can be
challenging and sometimes infeasible. When it is impossible to meet the customer demand, the
configuration specialist often contacts the sales representative to clarify which of the customer's
requirements is absolutely necessary. The sales team would then work with the customer to
revise the requirements or the budget before the IT worker can complete the order. Sometimes,
the customer's specifications may have errors and the IT worker would also need to contact sales
to verify and modify the existing plan. To keep track of these exchanges between sales and the IT
worker, we measured the number of follow-ups in each configuration task, which provides another
proxy for task complexity as measured during task execution. Thus, completing advanced tasks
and tasks that require frequent follow-ups would take longer than simple tasks, as advanced
tasks require the transfer of tacit and embedded knowledge.
The correlation between task difficulty assigned by the manager and the number of
follow-ups is 0.7 and their Cronbach's alpha is 0.75. We therefore aggregate task difficulty level
and the number of follow-ups into a single construct to measure task complexity. We create the
task complexity variable by first demeaning each variable and then dividing each variable by its
standard deviation (Norm). We then normalize the sum of these variables to construct the
overall task complexity.
Complexity = Norm(Norm(TaskDifficulty) + Norm(NumberOfFollowUps))
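The composite can be sketched as follows. The numeric coding of difficulty levels (1 for basic, 2 for complex, 3 for advanced), the use of the population standard deviation, and the function names are assumptions made for illustration; the sample values are invented.

```python
from statistics import mean, pstdev


def norm(values):
    """Demean a variable and divide by its (population) standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def task_complexity(difficulty, follow_ups):
    """Norm(Norm(TaskDifficulty) + Norm(NumberOfFollowUps)), task by task."""
    summed = [d + f for d, f in zip(norm(difficulty), norm(follow_ups))]
    return norm(summed)


# Four hypothetical tasks: difficulty codes and follow-up counts.
difficulty = [1, 1, 3, 2]
follow_ups = [0, 1, 4, 2]
complexity = task_complexity(difficulty, follow_ups)
```

Standardizing each component before summing keeps the follow-up count, which has a wider numeric range, from dominating the three-level difficulty rating.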
Characteristics of Individuals
We included controls for human capital using functional titles that classify employees
into 3 categories: manager, pricing strategist and configuration specialist. While managers may
be knowledgeable about the entire system, they are less likely to be intimately familiar with
day-to-day configuration routines. The primary role of a pricing strategist is to determine if the
pricing is feasible and correct based on the requirement, but they perform some configuration
tasks as well. Among the three types of workers in our sample, the configuration specialist is
most prepared to execute the configuration and we expect them to complete tasks more quickly
and accurately. Our interviews indicate that almost all workers have at least a Master's degree in
a relevant field and all had joined this particular division less than a year before the start of the
study. Thus, there is little variation in education level, experience or tenure within the group.
Table 4: Summary Statistics for Worker and Task Characteristics
Columns: Std. Dev.
Variables: Functional Title, Task Complexity, Number of Follow-ups, Voice Animation

Table 5: Pair Wise Correlations Between Independent Variables for the F2F
Columns: Function, Complexity, Follow up, Animation, Volume, Links, Btw, Constraint, Reach
Follow up    -.14    -.17    -.44    -.38
To mitigate the lack of complete demographic data on workers, we infer some worker
characteristics from the badge data. By measuring the tonal variance of workers, we can infer
how animated a person is (Pentland, 2006). The animation of a worker's voice may give us
indications about his general enthusiasm or motivation (Basu, 2002; Pentland, 2006). We also
employ fixed effects specifications (described below) to control for time invariant characteristics
of individuals. Summary statistics and correlations of all variables are listed in Tables 4 and 5.
Empirical Methods
Combining task performance and network data, we empirically test whether face-to-face
and proximity networks are correlated with productivity and performance. Time to task
completion measures how fast a person can finish a given task and based on our interviews,
speed is a good measure of work performance in this setting. The accuracy or quality of
configurations is also an important measure but only 20 of the 1217 tasks in our sample contain
detectable errors and 90% of those errors were due to server configuration issues that are largely
outside the control of individual workers. Since the majority of the tasks are completed
correctly, completion time is a good metric for work performance. Although multitasking can
increase total task throughput and could confound the use of duration as the only performance
measure (Aral et al., 2007), in this setting multitasking is not possible since tasks are assigned to
workers one at a time. Consequently, task duration provides a good overall measure of work
performance that is also relied on by the firm to evaluate employees.
Since our dependent variable is the number of minutes it takes to complete a task, we
specify a duration model. We use a hazard rate model of the likelihood of a project completing at
time t, conditional on it not having been completed earlier. The Cox proportional hazards model
is used to examine the effect of network characteristics on project completion rate as follows:
HazardRate(R) = f(size, betweenness, cohesion, reach, strong ties, complexity, job title)
$$R(t) = r(t)^{b} e^{\beta X}$$
where R(t) represents the project completion rate, t is project time in the risk set, and $r(t)^{b}$ is the
baseline completion rate when all the independent variables are set to zero. In this model, the
effects of independent variables are specified in the exponential power, where $\beta$ is a vector of
estimated coefficients on a vector of independent variables X. $\beta$ has a straightforward
interpretation, where $|\beta - 1|$ represents the percentage increase (or decrease) in project
completion rate associated with a one unit increase in the independent variable depending on
whether $\beta - 1$ is positive (or negative). We tested this specification using both face-to-face and
proximity-based network characteristics (Figures 3 and 4). The thickness of the lines in the
graphs indicates the number of interactions between workers. As shown in the figures, there are
more interactions between workers in the physical proximity-based network than in the
face-to-face network because when people are engaged in conversation they are by definition close to
each other, whereas two people who are not talking could still be in close proximity. Therefore,
the face-to-face network is a subset of the proximity network.
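The interpretation of the hazard model estimates can be illustrated with a small calculation. The sketch assumes that the reported estimate is a hazard ratio (the exponentiated coefficient of a Cox model), so that a ratio above one speeds completion and a ratio below one slows it. The function names and the two example ratios are hypothetical and are not estimates taken from the models.

```python
import math


def hazard_ratio(log_coefficient):
    """Convert a raw Cox coefficient into a hazard ratio."""
    return math.exp(log_coefficient)


def percent_change_in_completion_rate(ratio):
    """Percentage change in the completion rate for a one-unit increase.

    Positive values mean faster completion; negative values mean slower.
    """
    return (ratio - 1.0) * 100.0


# Hypothetical hazard ratios chosen only to illustrate the arithmetic.
faster = percent_change_in_completion_rate(1.225)
slower = percent_change_in_completion_rate(0.91)
```

A ratio of 1.225 corresponds to a 22.5% higher completion rate, and a ratio of 0.91 to a 9% lower one.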
Figure 3: Face-to-Face Conversation
Figure 4: Weighted F2F and Physical Proximity Network
We test the effects of four face-to-face network attributes on the speed of task
completion: size, volume, cohesion and reach, controlling for betweenness centrality and
various task and worker characteristics. First, we use a single pooled cross-sectional network
over the entire experimental period to compute network variables. Constructing a network over
the entire period allows us to assess the overall social network that a worker can potentially
leverage when completing a task ("Cross-Sectional Network"). In addition, we constructed a set
of longitudinal social networks in three-day panels of interaction ("3-Day Networks") that
measure interactions over pooled three-day periods and arrange data as a dynamic panel over
three-day cross sections, allowing us to estimate fixed effects models that eliminate omitted
heterogeneity from time-invariant individual characteristics that may bias results in
cross-sectional data. We estimate two sets of models: hazard rate models at the project level, with
the duration of each project as the dependent variable and a project/worker as the observation; and
fixed and random effects models on panel data, with average project duration over three-day spells
as the dependent variable and independent variables measured per individual over three-day cross
sections of interaction. We use the following linear model to estimate the effect of network
structure on work performance in our longitudinal data.
$$\mathrm{AvgDuration}_{it} = \alpha + \beta_1 \mathrm{AvgComplexity}_{it} + \beta_2 \mathrm{Network}_{it} + \beta_3 \mathrm{IndividualCharacteristics}_{i} + \varepsilon_{it}$$
When a coefficient is positive in this model, it means the variable whose parameter is being
estimated is associated with more time to complete projects. As a robustness check, we also
estimated panel data specifications using spells of different durations (1, 3 and 5 day panels) and
the length of the period does not affect our results. As employees work on more than one task
during the observation period, standard errors are clustered around individuals.
The Effect of Network Structure on Performance
Model 1 in Table 6 shows the effect of the latent cross-sectional network on worker
productivity. Unsurprisingly, complex tasks take longer to complete on average. Tonal variation,
a proxy for employees' level of enthusiasm and motivation, has no effect on work performance.
However, it is possible that tonal variation will have an effect if we can disaggregate the data to
measure tones in conversations during a given task, although we are currently unable to achieve
this level of granularity. As predicted, network cohesion is positively correlated with work
performance. Instead of reducing speed and productivity, as shown in email networks (Aral and
Van Alstyne, 2009), a one-standard-deviation increase in network constraint in face-to-face
networks is associated with a doubling of the speed of task completion, demonstrating that cohesive
ties in face-to-face networks are more highly correlated with productivity than networks with
structural holes. We suspect that the information transmitted in face-to-face networks is
inherently different from that which is transferred in other media. It appears that the
advantages of using face-to-face communication to transmit complex knowledge are enhanced
in cohesive networks, supporting Hypothesis 1a.
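Because cohesion is operationalized here through network constraint, a small worked example may help fix intuition. The sketch below implements the standard Burt (1992) constraint formula for an undirected weighted graph. The adjacency format and the two toy ego networks are assumptions for illustration, and the study's own computation may differ in details such as how ties are weighted.

```python
def constraint(adj, i):
    """Burt's network constraint for node i in an undirected weighted graph.

    adj: {node: {neighbor: weight}} with symmetric weights and no self-loops.
    Sums, over each direct contact j, the squared share of i's ties invested
    in j directly plus indirectly through i's other contacts q.
    """
    def p(a, b):
        total = sum(adj[a].values())
        return adj[a].get(b, 0.0) / total if total else 0.0

    score = 0.0
    for j in adj[i]:
        indirect = sum(p(i, q) * p(q, j) for q in adj[i] if q != j)
        score += (p(i, j) + indirect) ** 2
    return score


# Ego with three contacts who do not talk to each other (rich in structural holes).
star = {
    "ego": {"a": 1, "b": 1, "c": 1},
    "a": {"ego": 1}, "b": {"ego": 1}, "c": {"ego": 1},
}
# Ego with three contacts who all talk to each other (cohesive).
clique = {
    "ego": {"a": 1, "b": 1, "c": 1},
    "a": {"ego": 1, "b": 1, "c": 1},
    "b": {"ego": 1, "a": 1, "c": 1},
    "c": {"ego": 1, "a": 1, "b": 1},
}
star_constraint = constraint(star, "ego")      # 3 * (1/3)^2 = 1/3
clique_constraint = constraint(clique, "ego")  # 3 * (5/9)^2 = 75/81
```

The cohesive ego network scores higher on constraint than the sparse one, which is the direction of variation the text associates with faster task completion in face-to-face networks.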
Similarly, strong ties are also associated with improved work performance. As shown in
Model 1, one additional strong tie is associated with a 22.5% increase in productivity, controlling
for network size, demonstrating that strong ties may be beneficial in transferring information in
face-to-face social networks. Interestingly, the total number of face-to-face interactions has a
minimal impact on the time to finish a task. However, an additional network contact is
associated with a 9% decrease in the average speed of task completion, demonstrating the
potential cost in time, effort, and energy of engaging in many face-to-face conversations. These
results imply that while having more direct contacts may be harmful for performance, as face-to-face contacts are expensive to maintain, selectively cultivating strong ties may be more
beneficial in a face-to-face setting, especially when one needs more tacit and embedded
information. These results support Hypothesis 2a.
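Percentage effects such as these are commonly read off a hazard rate model by exponentiating the coefficient. The sketch below assumes Model 1 is a proportional hazards specification whose coefficients are log hazard ratios. The function name is hypothetical, and the coefficients are back-calculated from the percentages quoted in the text for illustration rather than taken from Table 6.

```python
import math


def hazard_pct_change(beta, delta=1.0):
    """Percent change in the completion hazard for a `delta`-unit change in a covariate.

    Assumes a proportional hazards model in which `beta` is a log hazard ratio.
    """
    return 100.0 * (math.exp(beta * delta) - 1.0)


# Back-calculated illustrative coefficients (not values reported in the thesis).
beta_strong_tie = math.log(1.225)    # would correspond to a 22.5% faster completion rate
beta_network_size = math.log(0.91)   # would correspond to a 9% slower completion rate

strong_tie_effect = hazard_pct_change(beta_strong_tie)
network_size_effect = hazard_pct_change(beta_network_size)
```

Under this reading, a positive coefficient means a higher completion rate (faster work), which is the opposite sign convention from the duration models, where a positive coefficient means more time.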
Network reach is also positively correlated with the task completion rate. Specifically, a
one-percent increase in network reach is associated with a 4% increase in the speed of task
completion, confirming Hypothesis 3a. Having broad reach is particularly helpful for locating
the source of knowledge and thus can increase the rate of task completion. Although results
using the latent cross-sectional network show promising evidence supporting our hypotheses,
the results may be driven by unobserved variation in individual characteristics, such as
employees' inherent ability or ambition. Without comprehensive demographic data, it may be
premature to attribute these performance differences to social networks alone.
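As a rough illustration of the reach measure, the sketch below counts the distinct colleagues a worker can get to in at most two steps. It assumes reach is an unnormalized count over an unweighted, undirected network. The measure used in the study may be normalized or weighted differently, and the toy network is invented.

```python
def two_step_reach(adj, person):
    """Count distinct others reachable from `person` in at most two steps.

    adj: {node: set_of_neighbors} for an undirected, unweighted network.
    """
    direct = set(adj.get(person, ()))
    reached = set(direct)
    for contact in direct:
        reached.update(adj.get(contact, ()))
    reached.discard(person)  # a worker does not count toward their own reach
    return len(reached)


# Toy network, invented for illustration.
toy_network = {
    "a": {"b"},
    "b": {"a", "c", "d"},
    "c": {"b", "e"},
    "d": {"b"},
    "e": {"c"},
}
reach_a = two_step_reach(toy_network, "a")
reach_c = two_step_reach(toy_network, "c")
reach_e = two_step_reach(toy_network, "e")
```

Worker "a" has a single direct contact but still reaches three colleagues through that contact, which is the sense in which broad reach can shorten the search for a knowledge source.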
We therefore used the panel data fixed effects and random effects specifications that
eliminate variance explained by any time-invariant characteristics, such as individual aptitude
that could affect performance. The results are shown in Table 6, Models 2 and 3 respectively.
The coefficients from the random effects model are roughly the same as with the cross-sectional
network. Although the coefficients in the fixed effects model diminish in magnitude for some
network metrics, the signs and statistical significance of those coefficients again corroborate the
cross-sectional results, providing further evidence supporting our hypotheses.
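To see why the fixed effects specification removes time-invariant worker characteristics, consider the within transformation with a single regressor. The sketch below is a simplified illustration under stated assumptions: one network variable, a balanced toy panel with invented numbers, and no clustered standard errors, which the actual models include.

```python
from collections import defaultdict


def within_estimate(rows):
    """Fixed-effects (within) slope for one regressor.

    rows: iterable of (worker_id, x, y) observations, one per worker-period.
    Demeans x and y within each worker, then runs OLS on the demeaned data,
    so any worker-specific constant drops out.
    """
    by_worker = defaultdict(list)
    for worker, x, y in rows:
        by_worker[worker].append((x, y))

    sxy = sxx = 0.0
    for obs in by_worker.values():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    if sxx == 0:
        raise ValueError("regressor has no within-worker variation")
    return sxy / sxx


# Invented toy panel: y = worker intercept - 2 * x, with intercepts 100 (A) and 200 (B).
toy = [
    ("A", 1, 98), ("A", 2, 96), ("A", 3, 94),
    ("B", 4, 192), ("B", 5, 190), ("B", 6, 188),
]
fe_slope = within_estimate(toy)
# Treating everyone as one worker reduces the estimator to pooled OLS.
pooled_slope = within_estimate([("all", x, y) for _, x, y in toy])
```

In this toy panel the pooled slope is positive because worker B has both a higher intercept and higher values of x, while the within estimator recovers the slope of -2 that holds for each worker. This is the kind of omitted-heterogeneity bias the fixed effects models are meant to guard against.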
Network cohesion, as measured by network constraint, continues to have the strongest
effect on worker performance even after eliminating time-invariant factors such as individuals'
inherent ability. As shown in Model 3 of Table 6, the fixed effects model demonstrates that a
one-standard-deviation increase in network constraint is associated with a decrease of 338
minutes in task completion time. The evidence from both longitudinal and cross-sectional
networks demonstrates that cohesion in face-to-face networks is correlated with higher
productivity, supporting Hypothesis 1a.
The number of direct contacts has either a limited or a negative effect on the task
completion rate in the longitudinal models. Although the total number of face-to-face
interactions continues to have minimal impact on the rate of task completion, an additional
network contact in the fixed effects model is associated with an increase of 140 minutes in
average task completion time. This negative impact of network size on work performance
suggests a potential cost in time, effort, and energy to maintain face-to-face relationships.
Disruptions during task execution can be especially distracting, as the cognitive cost of task
switching can impede the rate of task completion (Aral, et al., 2006). A large number of
during a task are likely to be unproductive and interruption-driven
communication from other colleagues. Since interruptions are costly to workers, we may also
expect the total number of conversations, as approximated by network size, to negatively affect
work performance. The coefficients on the number of strong ties are not significantly different from zero.
Thus we do not see strong evidence that having strong ties is correlated with faster task
completion, as we did in the cross-sectional network.
Network reach continues to be positively correlated with task completion. A
one percent increase in network reach is associated with a 1.96-minute decrease in task
completion time. Broader network reach in potential contacts is particularly effective in
searching for information and reducing information distortion because an employee can choose
the shortest path to task-relevant experts among those potential contacts while performing a
particular task.
Lastly, when we compare face-to-face networks with physical proximity networks, we see
(in Model 4) that most of the coefficients in the physical proximity network are insignificant,
demonstrating that face-to-face conversations matter more than physical proximity alone.
Although workers are more likely to turn to a person for information when they are physically
proximate (Allen, 1977), we show that being physically close alone is not enough. As information
transfer is far more likely to happen during a face-to-face conversation than during times of
mere collocation, face-to-face interaction networks are more likely to have an impact on work
performance than physical proximity networks. Conley and Udry (2005) find similar results
when studying the effect of social networks on the use of fertilizer in Ghanaian pineapple farms.
These results also rule out confounding effects of workers being simultaneously exposed to
positive environmental factors, like being close to the sales team or sitting in the vicinity of a
Table 6: The Effect of F2F Networks on Work Performance
Column labels: Panel: 3-Day Networks; Model 1, Model 2, Model 3, Model 4.
Header rows: Network Type, Dependent Variable.
Variables: Task Complexity, Configuration Specialist, Tonal Variation, Interactions Volume, Network Size, Strong Ties, Network Cohesion, Betweenness Centrality, 2 Step Reach.
Standard errors in parentheses. ** p<0.05, * p<0.1
The Effect of Network Structure on Completing Complex Tasks
As cohesive networks enable more effective transfers of complex knowledge (Reagans
and McEvily, 2003; Hansen, 1999), we expect cohesion to be more effective when employees are
engaged in complex tasks. Given the cost of face-to-face interactions in time, effort, energy and
interruption, we also expect additional face-to-face interactions, especially those with strong ties,
to increase the speed of project completion for complex tasks that require more information,
advice and tacit guidance from colleagues. For complex tasks, we expect the benefits of face-to-face conversations to outweigh the costs, whereas for simple tasks we expect there to be less
benefit to interaction, while still creating costs. However, we do not expect network size to be
particularly helpful in completing complex tasks when the strength of ties is low. An additional
weakly linked tie is costly to maintain and its ability to transfer complex information may still be
limited. A pair joined by a weak tie may not have communicated frequently enough to develop
trust and relationship-specific heuristics and thus continue to have difficulty in articulating
complex concepts to each other. Accordingly, we expect that while strong ties may be beneficial
to complete complex tasks, network size may have a negative association with the rate at which
workers complete these tasks. To test these expectations, we add interaction terms between task
complexity levels and various network measures. The results in Table 7 lend broad support to
our hypotheses.
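One way to write the interaction specification described above is shown here. The notation is an assumption made for illustration, extending the linear duration model with a single product term. The exact set of controls and interaction terms in Table 7 is not reproduced.

```latex
\begin{align*}
\mathit{AvgDuration}_{it} &= \alpha + \beta_1\,\mathit{Complexity}_{it} + \beta_2\,\mathit{Network}_{it}
  + \beta_3\,(\mathit{Complexity}_{it} \times \mathit{Network}_{it}) + \varepsilon_{it} \\
\frac{\partial\,\mathit{AvgDuration}_{it}}{\partial\,\mathit{Network}_{it}} &= \beta_2 + \beta_3\,\mathit{Complexity}_{it}
\end{align*}
```

Under this form, the association between a network measure and duration depends on task complexity through the interaction coefficient. In the duration models a negative interaction coefficient indicates faster completion on complex tasks, whereas in the completion-rate (hazard) models a positive interaction coefficient carries the same interpretation.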
In the cross-sectional network, we find that network cohesion continues to have a
positive correlation with the rate at which tasks are completed, and is especially beneficial for
completing complex tasks. The interaction term between task complexity and network cohesion
is positive (β=1.796, p<0.1), demonstrating that cohesion in a face-to-face network may facilitate
transferring and understanding complex information. We find similar results using the
longitudinal network data (Model 3). Using fixed effects specifications, we find that while
network cohesion alone is no longer correlated with task completion, the interaction term
between cohesion and complexity remains correlated with faster completion time (β=-643,
p<0.05, Model 5). Both the cross-sectional and longitudinal results lend support to Hypothesis
1b that network cohesion may be even more effective in completing a complex task that requires
tacit or context dependent information.
Table 7: The Effect of F2F Networks on Work Performance in Complex Tasks
Column labels: Model 1, Model 2, Model 3, Model 4.
Dependent variable (in column order): Completion Rate, Average Duration, Average Duration, Completion Rate.
Specification labels: Panel: 3-Day Networks, Random Effects, Fixed Effects, Physical Proximity.
Row labels: Network Type, Dep. Var., Interact. Vol., Network Size, Strong Ties.
Standard errors in parentheses. *** p<0.00, ** p<0.05, * p<0.1
Strong ties, on the other hand, show mixed results. While we expect more strong ties to
be positively correlated with work performance, we expect the effect to be more pronounced
when doing complex tasks. For complex tasks, the benefit of a strong tie should outweigh the
costs of maintaining strong ties. However, in both cross-sectional and longitudinal networks,
strong ties are positively correlated with task completion rate but the interaction between task
complexity and strong ties is not statistically significant.
We expect network size to have a negative association with work performance, especially
for completing complex tasks, as the cost of maintaining face-to-face interactions is high but
the benefits are low if these contacts are not strongly tied to the actor. In the cross-sectional
network, we find that additional network contacts are costly for task execution, with an additional
interaction correlated with reducing the project completion rate for complex tasks by about 14%
(Model 1, Table 7). While the coefficient on the longitudinal
network variable is positive, it falls short of statistical significance. However, we find that the
interaction of network size and task complexity is correlated with delaying project completion in
the longitudinal network (Model 2 and Model 3), supporting Hypothesis 2a. Having more
network contacts can actually hurt if none of them have a strong connection to the actor. Strong
network contacts can cause unwanted interruptions but may be strong enough to transfer
complex information needed for completing difficult tasks. Overall, combining the results on
strong ties and network size demonstrates that more interactions with fewer strongly tied
connections are the most beneficial for increasing the speed of work. We suspect that employees
who seek information from a greater number of colleagues not only experience a cost to those
interactions, but are also not finding the information they are looking for and thus are seeking
advice from additional colleagues. Our interviews corroborate this interpretation as employees
report having to contact more people when they cannot find the guidance they are seeking. More
interactions with the right colleagues are helpful on complex tasks, but seeking advice from
many colleagues is not only costly but also signals an inability to find the information necessary
to complete the task quickly. It could also be that more interactions with fewer strongly tied
colleagues generate a higher degree of mutual understanding and conversational rapport that
facilitates more efficient transfers of complex knowledge. Lastly, the physical proximity network
displays no significant effect on task completion, suggesting that face-to-face conversations are
more important than physical proximity when completing complex tasks.
Discussions and Conclusion
By studying face-to-face interaction networks, we can better understand how the
relevant theories interact. Until now, social network theories (e.g. Granovetter, 1973; Burt, 1992)
and theories of communication media choice, such as information richness theory (Daft and
Lengel, 1987), have been used independently to understand knowledge transfer in information-intensive work. Social network theories explain how network structures co-vary with the
diffusion and distribution of information, but largely ignore characteristics of communication
channels. Media-choice theories focus explicitly on communication channel requirements for
different types of knowledge transfer but ignore the population level topology through which
information is transferred in a network. We bridge these two sets of theory to understand what
types of social structures are most conducive to transferring knowledge and improving
performance in face-to-face communication networks, which have rarely been studied in the past.
As valuable information passes through verbal exchange, studying face-to-face interaction
networks is particularly important for understanding how informal social structures can
facilitate work in modern organizations.
To capture face-to-face interactions precisely, we used new tools and methodologies to
collect real-time data on face-to-face interactions in an IT configuration facility. By
matching data obtained through the use of wearable Sociometric badges with detailed
performance data from the firm's accounting records, we are able to test the effects of real-time
face-to-face interaction networks on individual information worker performance. Although
detailed data on electronic interactions (e.g. email, phone logs, instant messaging) has become
readily available in recent years, our ability to record network data for face-to-face interactions
has lagged behind. The tools and methods presented in this paper give researchers important
new opportunities for collecting fine grained data about the flow of information and knowledge
through informal channels such as face-to-face interaction in real organizations, opening new
avenues for research into social networks, knowledge management and IT use in organizations
and elevating data collection on face-to-face networks to the standards of accuracy and precision
displayed in electronic communication data.
Our research uncovers three main results. First, in face-to-face networks, network
cohesion, rather than structural holes, is associated with higher productivity. We suspect that
information transmitted in face-to-face networks is more tacit, complex and embedded than
information transferred through electronic channels, and that the advantages of using face-to-face communication to transmit complex knowledge are enhanced by cohesion, which increases
norms of trust, effective communication heuristics and absorptive capacity through the
provision of multiple perspectives on a problem. Second, we find that cohesion in face-to-face
networks is more strongly correlated with performance when the participants are solving
complex problems. This suggests that cohesion complements information-rich communication
media for the effective transmission of complex tacit knowledge when conducting complex
tasks. Third, we find that network size has a negative relationship with task completion,
implying that the cognitive cost of interruptions is high during task execution. On the other
hand, more interactions with fewer people, as measured by strong ties, speed up project
completion for complex tasks, which require more complex information and guidance from
colleagues. The explanatory power of cohesion is stronger in face-to-face networks than in
physical proximity networks, demonstrating that information flows in actual conversations
(rather than mere physical proximity) are driving our results.
There are two main limitations of our work. First, we do not know the content of the
conversations so we can only theorize about the types of information transmitted in face-to-face
conversation. However, our on-site interviews indicate that the information transferred in face-to-face conversations in this firm may be fundamentally different from that which is transferred in
electronic media. Workers tend to seek tacit and context dependent information when they ask
for help in face-to-face conversations. Second, although our longitudinal models allow us to
control for variance explained by any time-invariant characteristics of employees, our results
may still be biased by unobserved and time-varying characteristics such as media choice at
different points during a task or by endogeneity. Although we do not make causal
interpretations of our parameter estimates, we submit that such interpretations are plausible
when viewed in light of evidence from interviews and fixed effects analyses, which control for
omitted variables that could influence our results.
Caveats aside, our results represent some of the first evidence measuring the effects of
face-to-face communication networks on information worker performance. Using innovative
technology to record face-to-face interactions, we show cohesive networks in a rich
communication medium such as face-to-face interaction are associated with higher employee
performance. The unique characteristics of face-to-face networks highlight the need to
distinguish them from other types of communication networks, particularly when analyzing
their effects on productivity and performance.
References
Ancona, D.G., and Caldwell, D.F. 1992. "Demography & Design: Predictors of New Product
Team Performance." Organization Science 3(3): 321-341.
Apte, U., and Nath, H. 2004. "Size, structure and growth of the U.S. economy." Center for
Management in the Information Economy, Business and Information Technologies Project
(BIT) Working Paper.
Aral, S., Brynjolfsson, E., and Van Alstyne, M. 2006. "Information, Technology and Information
Worker Productivity: Task Level Evidence." Proceedings of the 27th Annual International
Conference on Information Systems, Milwaukee, Wisconsin.
Aral, S., Brynjolfsson, E., and Van Alstyne, M. 2007 "Productivity Effects of Information
Diffusion in Networks." Proceedings of the 28th Annual International Conference on
Information Systems, Montreal, CA.
Aral, S., and Van Alstyne, M. W. 2009. "Networks, Information & Brokerage: The Diversity-Bandwidth Tradeoff." Available at SSRN:
Argote, L. 1999. "Organizational Learning: Creating, Retaining and Transferring Knowledge."
Kluwer Academic Publishers, Boston, MA.
Basu, S. 2002. "Conversational Scene Analysis". PhD thesis, Massachusetts Institute of
Technology, Media Laboratory.
Bateson, G. 1973. Steps to an Ecology of Mind. London: Paladin Press.
Boorman, S. A. 1975 "A combinatorial optimization model for transmission of job information
through contact networks." Bell Journal of Economics 6(1): 216-249.
Brass, D. J., and M. E. Burkhardt. 1992. Centrality and power in organizations. N. Nohria and
R. G. Eccles, eds. Networks and Organizations: Structure, Form and Action. HBS Press,
Boston, MA, 191-215.
Burt, R. 1987. "Social Contagion & Innovation: Cohesion versus Structural Equivalence."
American Journal of Sociology (92): 1287-1335.
Burt, R. 1992. "Structural Holes: The Social Structure of Competition." Harvard University
Press, Cambridge, MA.
Burt, R. 1997. "The Contingent Value of Social Capital", Administrative Science Quarterly 42(2)
Burt, R. 2004. "Structural Holes & Good Ideas" American Journal of Sociology (110): 349-99.
Cheepen, C. "The predictability of informal conversation", London : Pinter, 1988.
Coleman, J.S. 1988. "Social Capital in the Creation of Human Capital." American Journal of
Sociology (94): S95-S120.
Cohen, W., and Levinthal, D. 1990, "Absorptive Capacity: A New Perspective on Learning and
Innovation", Administrative Science Quarterly 35(1):128-152.
Conley, T.G., and Udry, C.R. 2005. "Learning about a New Technology: Pineapple in Ghana,"
Yale University Working Paper.
Contu, A., and Willmott, H., 2003. "Re-embedding situatedness: The importance of power
relations in learning theory", Organization Science 14(3): 283-296.
Csikszentmihalyi, M. 1996. "Creativity: Flow and the Psychology of Discovery and Invention". New
York: Harper Collins.
Cummings, J., and Cross, R. 2003. "Structural properties of work groups and their
consequences for performance." Social Networks 25(3): 197-210.
Donaldson, S., and Grant-Vallone, E. 2002 "Understanding Self-Report Bias in Organizational
Behavior Research," Journal of Business and Psychology 17(2):245-260.
Daft, R. L., and Lengel, R.H. 1986. "Organizational Information Requirements, Media Richness
and Structural Design," Management Science 32(5): 554-571.
Eisenhardt, K., and Tabrizi, B., 1995. "Accelerating adaptive processes: Product Innovation In
the global computer industry." Administrative Science Quarterly (40): 84-110.
Freeman, L. 1979. Centrality in social networks: Conceptual clarification. Social Networks 1(3):
Gargiulo, M., and Rus, A. 2002. "Access and mobilization: Social capital and top management
response to market shocks." Working paper, INSEAD.
Gilovich, T. 1991. How We Know What Isn't So: The Fallibility of Human Reason in Everyday
Life. Free Press, New York.
Goffman, E. 1959. The Presentation of Self in Everyday Life, New York: Doubleday.
Goffman, E. 1982. Interaction Rituals: Essays on Face-to-Face Behavior, New York: Pantheon
Granovetter, M. 1973. "The strength of weak ties." American Journal of Sociology (6): 1360-1380.
Granovetter, M. 1982. "The strength of weak ties: A network theory revisited." In P. V. Marsden
and N. Lin (eds.), Social Structure and Network Analysis: 105-130.
Granovetter, M. 1985. "Economic Action & Social Structure: The Problem of Embeddedness."
American Journal of Sociology (91): 1420-1443.
Granovetter, M. 1992. "Problems of Explanation in Economic Sociology." In N. Nohria & R.G.
Eccles (eds.), Networks & Organizations: 25-56. Harvard Business School Press, Boston.
Grant, R. 1996. Prospering in dynamically-competitive environments: Organizational capability
as knowledge integration. Organization Science 7(4) 375-387.
Hinds, P.J. and Mortensen, M. 2005. "Understanding Conflict in Geographically Distributed
Teams: The Moderating Effects of Shared Identity, Shared Context, and Spontaneous
Communication," Organization Science 16(3): 290-307.
Hansen, M. 1999. "The search-transfer problem: The role of weak ties in sharing knowledge
across organization subunits." Administrative Science Quarterly 44(1): 82-111.
Holtgraves, T. 2002. "Language as Social Action: Social Psychology and Language Use",
Taylor & Francis.
Huang, Y., Zhu, M., Wang, J., Pathak, N., Shen, C., Keegan, B., Williams, D., and Contractor, N.,
2009. "The Formation of Task-oriented Groups: Exploring Combat Activities in Online
Games," International Conference on Computational Science and Engineering, pp. 122-127.
Kramer, R.M., 1999. "Trust and Distrust in Organizations: Emerging Perspectives, Enduring
Questions," Annual Review of Psychology (50): 569-598.
Markus, M.L., 1994. "Electronic Mail as the Medium of Managerial Choice," Organization
Science 5(4): 502-527.
Marsden, P. 1990. "Network Data & Measurement." Annual Review of Sociology (16): 435-463.
McCain, B., O'Reilly, C., and Pfeffer, J. 1983. "The effects of departmental demography on
turnover: The case of a university," Academy of Management Journal (26): 626-641.
Mergel, I., Lazer, D., and Binz-Scharf, M.C., 2008. "Lending a helping hand: voluntary
engagement in knowledge sharing," International Journal of Learning and Change 3(1): 5-22.
Nelson, R., and Winter, G., 1982 "An Evolutionary Theory of Economic Change." Cambridge,
MA: Belknap Press.
Nohria, N., and Eccles, R., 1992, "Networks and Organizations: Structure, Form and Action",
Boston: Harvard Business School Press.
Nonaka, I. 1990. "Redundant, overlapping organization: A Japanese approach to managing the
innovation process", California Management Review, Spring, pp. 27-38.
Nonaka, I. 1994 "A Dynamic Theory of Organizational Knowledge Creation," Organization
Science 5(1):14-37.
O'Reilly, C. 1980 "Individuals and Information Overload in Organizations: Is More Necessarily
Better?" The Academy of Management Journal 23(4): 684-696.
O'Reilly, C. Caldwell, D., and Barnett, W. 1989. Work group demography, social integration, and
turnover. Administrative Science Quarterly (34): 21-37.
Olguin-Olguin, D., Waber, B. N., Kim, T. J., Mohan, A., Ara, K., and Pentland, A. 2009, "Sensible
Organizations: Technology and Methodology for Automatically Measuring Organizational
Behavior", IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics 39(1):
Olguin-Olguin, D., and Pentland, A. 2006, "Human Activity Recognition: Accuracy across
Common Locations for Wearable Sensors". IEEE 10th International Symposium on
Wearable Computing (Student Colloquium Proceedings). Montreux, Switzerland.
Pentland, A. 2006. "Automatic mapping and modeling of human networks" Physica A:
Statistical Mechanics and its Applications.
Podolny, J., and Baron, J. 1997. "Resources and relationships: Social networks and mobility in
the work-place." American Sociological Review 62(5): 673-693.
Polanyi, M. 1966. "The Tacit Dimension." New York: Anchor Day Books.
Quintane, E., and Kleinbaum, A. 2008. "Mind Over Matter? E-mail and Survey as
Representations of Observed and Perceived Networks", International Social Network
Conference. St. Pete Beach, Florida, USA.
Reagans, R., and McEvily, B. 2003. "Network Structure & Knowledge Transfer: The Effects of
Cohesion & Range." Administrative Science Quarterly (48): 240-67.
Reagans, R., and Zuckerman, E. 2001. "Networks, diversity, and productivity: The social capital
of corporate R&D teams." Organization Science 12(4): 502-517
Slaughter, S., and Kirsch, L. 2006. "The Effectiveness of Knowledge Transfer Portfolios in
Software Process Improvement: A Field Study." Information Systems Research 17(3): 301-320.
Sparrowe, R., Liden, R., Wayne, S., and Kraimer, M., 2001. "Social networks and the
performance of individuals and groups." Academy of Management Journal 44(2): 316-325.
Sproull, L., and Kiesler, S. 1986, "Reducing Social Context Cues: Electronic Mail in
Organizational Communication," Management Science 32(11): 1492-1512.
Trevino, L.K., Daft, R.L., and Lengel, R.H. 1991. "Understanding Managers' Media Choices: A
Symbolic Interactionist Perspective," in J. Fulk and C. Steinfield (Eds.), Organizations and
Communication Technology, Newbury Park, CA: Sage, 71-94.
Teece, D. J. 1986. "Profiting from technological innovation: Implications for integration,
collaboration, licensing and public policy." Research Policy (15): 285-305.
Waber, B.N., Olguin Olguin, D., Kim, T., Mohan, A., Ara, K., and Pentland, A. 2007.
"Organizational Engineering using Sociometric Badges" International Conference on
Network Science, New York, NY.
Walther, J. 1995. "Relational Aspects of Computer-Mediated Communication: Experimental
observations over time", Organization Science 6(2)
Winter, S. 1987. "Knowledge and competence as strategic assets." In David J. Teece (ed.), The
Competitive Challenge: 159-184. Cambridge, MA: Ballinger.
Uzzi, B. 1996. "The Sources and Consequences of Embeddedness for the Economic Performance
of Organizations: The Network Effect." American Sociological Review (61): 674-98.
Uzzi, B. 1997. "Social Structure and Competition in Interfirm Networks: The Paradox of
Embeddedness." Administrative Science Quarterly (42): 35-67.
Zenger, T., Lawrence, B. 1989. "Organizational demography: The differential effects of age and
tenure distributions on technical communication." Academy of Management Journal (32):