Chapter 4: Methods for Analyzing Networks

This chapter discusses various methods for analyzing social networks, giving
equal attention to traditionally important topics and newly emergent methods. In
particular, we discuss centrality, cohesiveness measures, structural
equivalence, clustering, multidimensional scaling, blockmodels, logit p*, affiliation
networks, and the analysis of lattices. Because representing network data always precedes
analyzing it, this chapter starts with a description of two
methods commonly used to represent social networks: graphs and matrices.
Graphs and matrices
Graphs and matrices are two distinct methods for representing social network data.
Graphs present a social network in visual form, whereas matrices are
algebraic representations of network relations. Although social network scholars may
freely choose either graphs or matrices to present their data, each method has its
advantages and disadvantages. Graphs are straightforward visual
illustrations of network structures, but they do not support mathematical manipulation. In
contrast, matrices are less user-friendly, but they facilitate mathematical and computer
analyses of social network data.
Often a matrix is a square array of elements arranged in rows and columns. For
example, the notation A (N, N) denotes a social network matrix A with N by
N social actors. The row and column headings are arranged in the same sequence to
indicate the social actors in the network. Values in the matrix are measures of the
relationship between a pair of actors. Normally, the actors in the rows are
the senders of a specific relation, whereas the actors in the columns are the receivers of the
relation. Thus, in a binary network, $X_{ijk} = 1$ indicates that actor i sends relation k to
actor j, whereas $X_{ijk} = 0$ indicates the absence of relation k from
actor i to actor j. Note that in a non-directed graph, $X_{ijk} = X_{jik}$; that is, the value of
relation k between sender i and receiver j always equals the value of the relation
between sender j and receiver i. We call such matrices symmetric. Empirical
examples of symmetric matrices are marriage networks and communication channels. In
contrast, many social networks, such as reporting or friendship networks, are asymmetric:
$X_{ijk}$ and $X_{jik}$ often diverge, suggesting that actors i and j disagree in their
assessment of the relation under scrutiny.
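The sender-by-receiver convention and the symmetry condition can be sketched in a few lines
of Python. The three-actor matrices below are hypothetical illustrations, not data from the
studies cited in this chapter, and `is_symmetric` is a helper name introduced here.

```python
def is_symmetric(x):
    """Return True if x[i][j] == x[j][i] for every pair of actors."""
    n = len(x)
    return all(x[i][j] == x[j][i] for i in range(n) for j in range(i + 1, n))

# Rows are senders, columns are receivers; the diagonal is ignored.
# Hypothetical non-directed relation (e.g. marriage): every tie is mutual.
marriage = [
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]

# Hypothetical directed relation (e.g. reporting): actor 0 reports to actor 1,
# but actor 1 does not report to actor 0.
reporting = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]

print(is_symmetric(marriage))   # True
print(is_symmetric(reporting))  # False
```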
Many social networks contain integer values that reflect the intensity of the
relationship, such as the frequency of contacts or the strength and magnitude of associations.
Such networks are called valued graphs, in which the value of $X_{ijk}$ ranges from 0 to
the maximum value observed in the network, as opposed to its restricted range of
either 0 or 1 in binary graphs. Like binary graphs, valued graphs are either
non-directed and symmetric, in which $X_{ijk} = X_{jik}$, or directed and asymmetric, in
which $X_{ijk}$ may or may not equal $X_{jik}$.
Sometimes, network researchers use non-square matrices to indicate actors’
attributes or participation in certain events. The notation A (N, M) is
commonly used to denote a non-square matrix A, in which N is the number of actors and
M is the number of attributes, events, or locations under investigation. Freeman and Webster
(1994), for example, observed 43 regular beachgoers and recorded 353 events over 31 days in
which interaction among the 43 beachgoers took place. They created a 43 by 353
matrix, in which an entry of 1 in row i and column j indicates that person i was
involved in interaction event j.
To illustrate how a matrix represents a social network structure, we
discuss network research by Feldman-Savelsberg et al. (2005) on Cameroonian women’s
hometown associations. To analyze how collective memory affects women’s discussion
of reproduction, Feldman-Savelsberg et al. (2005) interviewed 156 women belonging
to 6 women’s associations in Yaounde, Cameroon. Their in-depth interviews contained
questions about the women’s social networks, such as “please rank the strength of ties
between you and other women in the same association according to the following
schema: 1) confidant, 2) friends, 3) acquaintances, and 4) complete strangers.” As each
woman was asked only to rank her ties with other women in the same association, the
network data contain 6 network structures for the 6 associations. For a concise
illustration, we use only the network structure of women’s association 6, which has 6 women.
Table 4_1 shows the matrix representation of the women’s social network
structure in hometown association 6 in Yaounde, Cameroon. Because each of the 6
women was asked to rank her relation with the other 5 women, the matrix representing
the network structure is valued and asymmetric. Women in the rows are the “senders” or
“evaluators” of their relation with other women, whereas women in the columns are the
“receivers” or “evaluatees” of those evaluations. For example,
woman 1 ranks her tie with woman 3 at strength level 2 (friend), while woman 3
ranks her tie with woman 1 at strength level 3 (acquaintance): the two women disagree in
their assessment of their relationship. Note that the diagonal
values in the matrix are null: we do not consider a woman’s assessment of her
relation with herself to be valid.
Figure 4_1 shows the graph representation of the social network among the 6
women in Association 6. Of the 15 pairs (30 directed ties) among the 6 women, only 4
pairs agree, with both women giving the same rank to their relation. Women 1 and 2
mutually rank each other as confidants, whereas women 5 and 6 mutually rank each other
as complete strangers. In addition, women 2 and 5 rank each other as acquaintances, and
women 3 and 5 rank each other as friends. The other pairs rank their
relations differently. For example, woman 6 ranks woman 2 as a confidant, whereas woman 2
ranks woman 6 as a mere acquaintance. Although graph representations present an
intuitive and straightforward picture of the network structure, they can be poor visual
illustrations of large networks with tens or even hundreds of actors. Even with
only 6 actors and 30 directed relations, the graph appears overwhelmingly
entangled. In contrast, a matrix can easily display network data with tens of actors.
Centrality, Prestige, and Power Measures for Ego-centric and Complete Networks
One of the most important indicators in social network data analysis is centrality,
measured at both the individual and the group level (Wasserman and Faust 1994: 169-219).
Centrality measures at the individual level indicate the extent to which an actor’s position
approximates the central position of the network. Centrality may therefore suggest
prestige and power, in the sense that central actors commonly receive the most “choices”
from other actors (prestige) and, owing to their central positions, receive and control a great
amount of the information or commodities flowing through the network (power). However, Knoke
and Burt (1983) cautioned that centrality and prestige are not
interchangeable concepts and may reflect disparate processes. Centrality, in measuring
the relative position of a network actor, is largely indifferent to the direction of relations,
whereas prestige, in measuring an actor’s influence, is highly sensitive to the direction of
relations. Therefore, while centrality is an essential network indicator for both non-directed
and directed graphs, prestige is mostly relevant to directed graphs. Group-level
centrality is normally called centralization. We defer the detailed discussion of group
centralization to the next section.
Centrality and prestige measures are also different between ego-centric networks
and complete networks. Below, we first discuss centrality and prestige in complete
networks, after which we discuss those measures in ego-centric networks.
Centrality and prestige in complete networks
Actor centrality and group centralization include several different measures, such
as the degree, closeness, betweenness, and information measures.
The computation and implementation of those measures vary, depending on the type of
social network data. Starting with the simplest case, we first discuss the measures of
centrality and centralization in undirected graphs and then move on to directed
graphs. Because most developments in centrality and centralization measures assume
binary graphs, we restrict our discussion to binary data and encourage network
scholars to pay more attention to centrality and centralization in valued graphs.
Measures of Actor Centrality and Group Centralization
Actor degree centrality measures the extent to which an individual actor
is connected to other actors in a social network. In a social network with g actors,
degree centrality for actor i sums i’s connections to the other g-1
actors:
$$C_D(N_i) = \sum_{j=1}^{g} X_{ij} \quad (i \neq j) \tag{4.1}$$
$C_D(N_i)$ denotes the degree centrality of node i, and $\sum_{j=1}^{g} X_{ij}$ counts
the ties from node i to the other nodes j (j runs from 1 to g,
excluding i). However, degree centrality so measured reflects not only a node’s
connectivity with other nodes but also the size of the network: the larger the network,
the higher the maximum possible degree centrality. A given actor degree centrality may
therefore mean either that the actor is well connected in a small network or that the actor
is connected to only a few other nodes in a large network. To eliminate the effect of network
size on the degree centrality measure, Wasserman and Faust (1994: 179) recommended a
normalized degree centrality:
$$C'_D(N_i) = \frac{\sum_{j=1}^{g} X_{ij}}{g-1} \quad (i \neq j) \tag{4.2}$$
The normalized degree centrality divides the degree centrality by the maximum
number of possible connections with g actors (g –1). Controlling for the network size,
normalized degree centrality reflects only the connection of a given node in a social
network, with a range from 0 to 1, indicating from no connections with other nodes to
connections with all other nodes respectively.
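Equations 4.1 and 4.2 translate directly into row sums of the adjacency matrix. The sketch
below is a minimal illustration on a hypothetical 4-actor star network; the function names
and the data are assumptions made here for the example.

```python
def degree_centrality(x, i):
    """Equation 4.1: number of ties from actor i to the other actors."""
    return sum(x[i][j] for j in range(len(x)) if j != i)

def normalized_degree_centrality(x, i):
    """Equation 4.2: degree centrality divided by its maximum, g - 1."""
    return degree_centrality(x, i) / (len(x) - 1)

# Hypothetical 4-actor star: actor 0 is tied to everyone, the others only to actor 0.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]

print(degree_centrality(star, 0))             # 3
print(normalized_degree_centrality(star, 0))  # 1.0
print(normalized_degree_centrality(star, 1))  # 0.3333333333333333
```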
Actor degree centrality measures the extent to which actors are involved in
relationships. Actors with high normalized degree centrality are the most visible actors
in the network. In particular, the closer the normalized degree centrality is to 1, the
greater the actor’s involvement in the network’s relationships. Researchers can readily
apply this concept in measuring access, control, and brokerage in information networks,
in which sheer involvement in the relationship is more important than the source and
object of the relation (Knoke and Burt 1983: 195-222).
Unlike actor degree centrality, Group Degree Centralization measures the
extent to which actors in a social network differ from each other in their degree
centralities. Group degree centralization closely resembles a measure of dispersion in
statistics, indicating the variability or spread of individual actor degree centralities in a
network. Freeman (1979) proposed a generic mathematical formula for such a group
index of centralization:
$$C_A = \frac{\sum_{i=1}^{g} [C_A(n^*) - C_A(n_i)]}{\max \sum_{i=1}^{g} [C_A(n^*) - C_A(n_i)]} \tag{4.3}$$
$C_A(n^*)$ denotes the largest actor centrality observed in a network, whereas
$C_A(n_i)$ denotes the centrality of actor i. Thus, the numerator in
the equation sums the differences in centrality between the node
with the largest centrality and every other node. The denominator is the theoretical
maximum possible sum of differences in actor centralities in a network.
Based on this generic approach to computing a group-level index of centralization,
Wasserman and Faust (1994: 180) proposed a method for computing group degree
centralization.
$$C_D = \frac{\sum_{i=1}^{g} [C_D(n^*) - C_D(n_i)]}{(g-1)(g-2)} \tag{4.4}$$
The numerator measures the sum of the differences in degree centrality between
the node with the highest degree centrality and the other nodes. The denominator measures
the maximum possible sum of differences between the node with the highest centrality and
the other nodes. In a network of g nodes, this maximum occurs when the most central node
has degree centrality g – 1 and every other node has degree centrality 1 (the other nodes
must have degree centrality 1 rather than 0, because each of them is tied to the most
central node). Therefore, the difference between the most central node and any other node
is g – 1 – 1 = g – 2. This difference occurs g – 1 times, once for each of the other
nodes. Thus, the maximum possible sum of degree centrality
differences between the most central node and the other nodes is (g – 1)(g – 2).
This group index of degree centralization ranges from 0 to 1. When degree
centrality is perfectly evenly distributed, so that every node has the same degree
centrality, $\sum_{i=1}^{g} [C_D(n^*) - C_D(n_i)]$ is 0, and group-level degree
centralization is therefore 0. At the other extreme, degree centrality is completely
unevenly distributed: one node has the highest centrality, g – 1, and all other nodes have
degree centrality 1. The numerator $\sum_{i=1}^{g} [C_D(n^*) - C_D(n_i)]$ then equals the
denominator (g – 1)(g – 2), and group degree centralization equals 1. Therefore, the closer
the group index of degree centralization is to 1, the more uneven or hierarchical the degree
centralities of the nodes in a social network.
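The two extremes of equation 4.4 can be checked numerically. The sketch below assumes a
binary symmetric matrix with at least three actors; the star and ring networks are
hypothetical examples chosen to hit the bounds of 1 and 0, and the function names are
introduced here for illustration.

```python
def degrees(x):
    """Degree centrality (equation 4.1) of every actor in a binary symmetric matrix."""
    g = len(x)
    return [sum(x[i][j] for j in range(g) if j != i) for i in range(g)]

def group_degree_centralization(x):
    """Equation 4.4: summed gaps to the largest degree, divided by (g - 1)(g - 2)."""
    g = len(x)
    d = degrees(x)
    largest = max(d)
    return sum(largest - di for di in d) / ((g - 1) * (g - 2))

# Hypothetical star: actor 0 has degree 3, the other three actors have degree 1.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]

# Hypothetical ring: every actor has degree 2.
ring = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]

print(group_degree_centralization(star))  # 1.0
print(group_degree_centralization(ring))  # 0.0
```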
Actor Closeness Centrality was developed to reflect how close each node is to the
other nodes in a social network (Sabidussi 1966). The actor closeness centrality index is a
function of the actor’s geodesic distances to all other nodes in the network, where the
geodesic distance between two nodes is defined as the length of the shortest path between
them. Based on Sabidussi’s (1966) suggestion, actor closeness centrality is computed with the
following formula.
$$C_C(n_i) = \left[ \sum_{j=1}^{g} d(n_i, n_j) \right]^{-1} \quad (i \neq j) \tag{4.5}$$
Actor closeness centrality for actor i is the inverse of the sum of the
geodesic distances between actor i and the other actors in the network. Therefore, actor
closeness centrality can never be 0, and it is undefined when the sum of distances cannot
be computed. In empirical social network analysis, this restriction requires that every node
in a network have at least one connection to other nodes. A completely isolated node with no
connection to other nodes does not have a valid closeness centrality measure. At the other
extreme, actor closeness centrality can be 1, in a network of two actors connected with
each other.
A low value of actor closeness centrality, and thus a high sum of the
distances between a given node and the other nodes in a network, results either because the
node is located in a relatively large network or because the node is located in a small
network but is relatively distant from the other nodes. To control for the size of the
network, and thus permit meaningful comparison of closeness centrality between nodes from
different networks, Wasserman and Faust (1994: 186) recommended the following normalized
closeness centrality:
$$C'_C(n_i) = \frac{g-1}{\sum_{j=1}^{g} d(n_i, n_j)} \quad (i \neq j) \tag{4.6}$$
To compare and contrast closeness centrality and normalized closeness
centrality, we produce two network illustrations. Figure 4_2 shows a 3-node network
structure in which actor A is directly connected to B, with a geodesic distance of 1, and
indirectly connected to C, with a geodesic distance of 2. Thus, the closeness centrality of A
in Figure 4_2 is 1/(1 + 2) = 1/3. Figure 4_3 shows a 4-node structure with direct connections
between the nodes. Actor A is directly connected to actors B, C, and D, each at a geodesic
distance of 1. Therefore, actor A has a closeness centrality of 1/(1 + 1 + 1) = 1/3. Even
though actor A in Figure 4_3 is better connected than actor A in Figure 4_2, their closeness
indices are the same, simply because the network depicted in Figure 4_3 has more nodes than
that in Figure 4_2. In contrast, normalized closeness centrality distinguishes the two actors
by taking network size into account. The normalized closeness centrality of actor A
in Figure 4_3 is 3/3 = 1, whereas that of actor A in Figure 4_2 is 2/3. Therefore, the closer
an actor’s normalized closeness centrality is to 1, the better the actor is connected to other
nodes, in the sense that the actor can reach other nodes via the shortest geodesic distances.
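The comparison between Figures 4_2 and 4_3 can be reproduced with a breadth-first search
for geodesic distances. This is a minimal sketch: the adjacency lists are an assumed reading
of the two figures (a chain A-B-C, and A tied directly to B, C, and D), and the function
names are introduced here for illustration. Exact fractions are used so the results match
the values quoted in the text.

```python
from collections import deque
from fractions import Fraction

def geodesic_distances(adj, source):
    """Breadth-first search: shortest path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness(adj, i):
    """Equation 4.5: inverse of the summed geodesic distances from i."""
    dist = geodesic_distances(adj, i)
    if len(dist) < len(adj):
        raise ValueError("closeness is undefined when some node is unreachable")
    return Fraction(1, sum(dist.values()))

def normalized_closeness(adj, i):
    """Equation 4.6: (g - 1) times the closeness centrality."""
    return (len(adj) - 1) * closeness(adj, i)

# Assumed reading of Figure 4_2: a chain A - B - C.
figure_4_2 = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
# Assumed reading of Figure 4_3: A tied directly to B, C, and D.
figure_4_3 = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

print(closeness(figure_4_2, "A"), normalized_closeness(figure_4_2, "A"))  # 1/3 2/3
print(closeness(figure_4_3, "A"), normalized_closeness(figure_4_3, "A"))  # 1/3 1
```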
Similar to group degree centralization, Group Closeness Centralization is a
dispersion measure, indicating the hierarchy of actors’ closeness centralities in a given
network. In particular, group closeness centralization measures the extent to which
actors in a given network differ from each other in their closeness centralities. According
to Freeman (1979), group closeness centralization is computed with the following
formula.
$$C_C = \frac{\sum_{i=1}^{g} [C'_C(n^*) - C'_C(n_i)]}{[(g-2)(g-1)]/(2g-3)} \tag{4.7}$$
Group closeness centralization reaches 1 when the network has a
completely uneven distribution of actors’ closeness centralities, in which one actor has the
highest closeness centrality and all others have the lowest closeness centralities. In
contrast, group closeness centralization equals 0 when the network has a completely
even distribution of actors’ closeness centralities, in which every actor has the same
closeness centrality.
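As a check on the denominator of equation 4.7, consider a star network (an illustrative
case assumed here, not one of the chapter's figures) in which one central actor is tied to
each of the other g – 1 actors and those actors have no other ties. Each peripheral actor is
at distance 1 from the center and distance 2 from the other g – 2 peripheral actors.

```latex
\begin{align*}
C'_C(n^*) &= \frac{g-1}{g-1} = 1 \\
C'_C(n_i) &= \frac{g-1}{1 + 2(g-2)} = \frac{g-1}{2g-3}
  \quad \text{for each of the } g-1 \text{ peripheral actors} \\
\sum_{i=1}^{g} [C'_C(n^*) - C'_C(n_i)] &= (g-1)\left(1 - \frac{g-1}{2g-3}\right)
  = \frac{(g-1)(g-2)}{2g-3}
\end{align*}
```

The sum equals the denominator of equation 4.7, so the star network has a group closeness
centralization of 1.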
Actor Betweenness Centrality measures the extent to which an actor lies on the
geodesic paths between other pairs of actors in the network. Actor betweenness centrality is
an important measure of control over the flow of information or resources between other
actors in the network. If actor j has to go through actor i to reach actor k, then actor i has
responsibility for, or control over, the content and timing of messages passed between
actors j and k. The more often actor i lies on the geodesic paths between pairs of other
actors, the more control actor i has over the information or resource flows in the network.
To quantify actor i’s betweenness centrality, Freeman (1977) proposed the
following procedure. Let $g_{jk}$ be the number of geodesic paths between
actors j and k, and let $g_{jk}(n_i)$ be the number of geodesic paths between actors j and k
that contain actor i. Dividing $g_{jk}(n_i)$ by $g_{jk}$ measures the degree to which actor i
sits on the geodesic paths connecting j and k. Summing $g_{jk}(n_i) / g_{jk}$ over all pairs
reflects the extent to which actor i sits on the geodesic paths between all pairs of the
remaining nodes in a network. The following formula reflects this logic.
$$C_B(n_i) = \sum_{j<k} \frac{g_{jk}(n_i)}{g_{jk}} \quad (i \neq j, \; i \neq k) \tag{4.8}$$
This index is 0 when node i falls on no geodesic path between any pair
of the remaining g – 1 nodes. It reaches its maximum value of (g – 1)(g – 2)/2 when
node i falls on the geodesic path between every pair of the remaining g – 1 nodes, assuming
that each pair has only one geodesic path. For the remaining g – 1 nodes excluding node i, the
total number of geodesic paths between all pairs (assuming that each pair has only one
geodesic path) is
$$C_{g-1}^{2} = \frac{(g-1)!}{2!\,(g-1-2)!} = \frac{(g-1)!}{2!\,(g-3)!} = \frac{(g-1)(g-2)}{2}.$$
We must add to this
body of knowledge on actor betweenness centrality (Freeman 1977; Wasserman and
Faust 1994: 190) that when pairs among the g – 1 nodes have more than one
geodesic path, the theoretical maximum possible value of node i’s betweenness
centrality will be larger than $(g-1)(g-2)/2$, depending on how many geodesic paths
are present between each of those pairs.
Wasserman and Faust (1994:110) recommended that actor betweenness
centrality $C_B(n_i)$ be divided by its theoretical maximum value of $(g-1)(g-2)/2$
(assuming each pair has only one geodesic path) to produce the standardized actor
betweenness centrality.
$$C'_B(n_i) = \frac{2\,C_B(n_i)}{(g-1)(g-2)} \tag{4.9}$$
The standardized betweenness centrality is 0 when the original
betweenness centrality is 0, and it reaches 1 when the actor falls on the geodesic paths
between all pairs of the remaining g – 1 nodes. Therefore, the closer the standardized actor
betweenness centrality is to 1, the larger the share of geodesic paths between
pairs of the remaining nodes on which actor i falls.
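Equations 4.8 and 4.9 can be sketched by counting geodesics with breadth-first search.
This is a plain illustration for small undirected networks, not an efficient algorithm; the
star and square networks are hypothetical examples and the function names are introduced
here. Node i lies on a geodesic between j and k exactly when d(j, i) + d(i, k) = d(j, k), in
which case the number of such geodesics is the product of the geodesic counts on each leg.

```python
from collections import deque
from fractions import Fraction
from itertools import combinations

def bfs_counts(adj, source):
    """Return (dist, paths): geodesic distance and number of geodesics from source."""
    dist = {source: 0}
    paths = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                paths[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                paths[v] += paths[u]
    return dist, paths

def betweenness(adj, i):
    """Equation 4.8: sum over pairs j < k of g_jk(n_i) / g_jk (undirected network)."""
    dist_i, paths_i = bfs_counts(adj, i)
    total = Fraction(0)
    for j, k in combinations([n for n in adj if n != i], 2):
        dist_j, paths_j = bfs_counts(adj, j)
        reachable = k in dist_j and i in dist_j and k in dist_i
        if reachable and dist_j[i] + dist_i[k] == dist_j[k]:
            total += Fraction(paths_j[i] * paths_i[k], paths_j[k])
    return total

def standardized_betweenness(adj, i):
    """Equation 4.9: betweenness divided by (g - 1)(g - 2) / 2."""
    g = len(adj)
    return 2 * betweenness(adj, i) / ((g - 1) * (g - 2))

# Hypothetical star: A sits between every pair of the other actors.
star = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
# Hypothetical square A-B-C-D-A: B and D are joined by two geodesics, one through A.
square = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "A"]}

print(betweenness(star, "A"), standardized_betweenness(star, "A"))      # 3 1
print(betweenness(square, "A"), standardized_betweenness(square, "A"))  # 1/2 1/6
```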
Much like group-level degree and closeness centralization, Group Level
Betweenness Centralization measures the extent to which actors in a network differ in
their individual-level betweenness centralities. Following Freeman’s (1979) generic
method, Wasserman and Faust proposed the following equation for the group
betweenness centralization index.
$$C_B = \frac{2 \sum_{i=1}^{g} [C_B(n^*) - C_B(n_i)]}{(g-1)^2 (g-2)} \tag{4.10}$$
The numerator measures the sum of the differences between the actor with the highest
betweenness centrality and the other actors with lower betweenness centralities. The
denominator indicates the theoretical maximum possible sum of those differences across all
nodes in a network. Note that individual betweenness centrality reaches its theoretical
maximum value at $(g-1)(g-2)/2$. At the group level, a difference of this size can
occur at most g – 1 times, when one dominant node serves as the intermediary on
all the geodesic paths between the other nodes. Thus, the
theoretical maximum for a network with g actors is
$(g-1)^2(g-2)/2$. Again, we stress that this computation assumes that each dyadic pair has
only one geodesic path between them. If multiple geodesic paths are present between any
pair of actors, the individual maximum possible betweenness centrality will be larger
than $(g-1)(g-2)/2$, which produces a corresponding change in the theoretical maximum
possible value in the group betweenness centralization.
Group betweenness centralization reaches 1 when there is one and only one
dominant actor in the network that sits on the geodesic paths between all pairs of the
remaining actors. The difference between the dominant node and each of the remaining nodes is
$(g-1)(g-2)/2$, and this difference occurs g – 1 times in a network with g nodes. Thus,
the numerator reaches its theoretical maximum value and equals the denominator,
producing a result of 1. In contrast, in a completely “egalitarian” network in which every
node has the same betweenness centrality, the numerator is 0, and the group-level
centralization is therefore 0. Thus, the closer the betweenness centralization is to 1, the more
unequal the betweenness centralities of the actors in the network.
Measures of Prestige in Directed Graphs
On many occasions, social interactions involve directions that specify the “senders”
and “receivers” of the relations in the network. Social networks that record the directions
of the relations between the nodes are called directed graphs. In directed graphs, mere
participation or involvement in certain relations is less important than the role of “being
receiver” or “being sender” of the relation. For example, in the reporting network of a
workplace, lower-echelon employees routinely report their work activities and
contributions to their managerial supervisors, whereas higher-level employees rarely report
their work activities to their subordinates. In a friendship network, an actor who
enthusiastically nominates many other actors as his best friends may not receive a “best
friend” nomination from those actors.
Here, we define prestige as an indicator of the extent to which a social actor in a
network “receives” or “serves as an object” of the relations in the network. The
distinction between the “senders” or “sources” and the “receivers” or “objects” of
relations is emphasized because it reflects inequalities in control over
resources, as well as the authority and deference produced by those inequalities (Knoke and
Burt 1983: 199). By definition, actor prestige can be measured by simply tallying the number
of times an actor receives a nomination for a certain relation in a given network. Wasserman
and Faust (1994: 202) proposed that such a measure be called actor degree prestige,
calculated with the following formula.
$$P_D(N_i) = \sum_{j=1}^{g} X_{ji} \quad (j \neq i) \tag{4.11}$$
This measure counts the number of times actor i is nominated by other
actors in a network with g nodes. Its maximum value is g-1 and its minimum value is 0,
indicating respectively that actor i is nominated by all g-1 other actors or by none of
them. A standardized actor degree prestige controls for the size of the
network, making it possible to compare actors’ degree prestige across different networks.
$$P'_D(N_i) = \frac{\sum_{j=1}^{g} X_{ji}}{g-1} \quad (j \neq i) \tag{4.12}$$
The standardized actor degree prestige achieves 1 when all the other actors
nominate actor i for a specific relation, and it is 0 when none of other actors nominates
actor i. Thus, the closer the actor i’s degree prestige is to 1, the greater its prestige in the
network. One can easily conjure up a friendship network in which a high degree prestige
for actor i vividly illustrates that many other actors in the network nominate actor i as
their best friends.
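Where degree centrality sums a row of the matrix, degree prestige (equations 4.11 and
4.12) sums a column. The sketch below uses a hypothetical set of “best friend” nominations;
the data and function names are assumptions made for illustration. Actor 3 nominates two
others but is nominated by nobody, so sending ties does not earn prestige.

```python
def degree_prestige(x, i):
    """Equation 4.11: number of actors j that send a tie to actor i (column sum)."""
    return sum(x[j][i] for j in range(len(x)) if j != i)

def standardized_degree_prestige(x, i):
    """Equation 4.12: degree prestige divided by its maximum, g - 1."""
    return degree_prestige(x, i) / (len(x) - 1)

# Hypothetical "best friend" nominations: rows nominate, columns are nominated.
friendship = [
    [0, 1, 0, 0],  # actor 0 nominates actor 1
    [0, 0, 0, 0],  # actor 1 nominates nobody
    [0, 1, 0, 0],  # actor 2 nominates actor 1
    [1, 1, 0, 0],  # actor 3 nominates actors 0 and 1
]

print(degree_prestige(friendship, 1))               # 3
print(standardized_degree_prestige(friendship, 1))  # 1.0
print(standardized_degree_prestige(friendship, 3))  # 0.0
```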
We propose that actor degree prestige can be aggregated to produce an index of
group-level degree prestige. We argued above that an individual actor’s degree prestige
reaches its maximum value of g-1 when all other nodes nominate this actor. Group-level
degree prestige can reach a maximum of g(g – 1), when every node in the network
nominates all other nodes and is nominated by all other nodes for a specific relation. A
summation of the number of nominations actually received by each node in a network
can serve as the actual measure of reciprocity of the relations in the network. We assert that
group-level degree prestige can be computed as the actual reciprocity divided by the
maximum group-level degree prestige, as shown in the following equation.
$$P_D(g) = \frac{\sum_{i=1}^{g} \sum_{j=1}^{g} X_{ji}}{g(g-1)} \quad (j \neq i) \tag{4.13}$$
The equation suggests that a group-level index of degree prestige measures the extent
to which either a given relation is reciprocated or actors are connected in a network. In a
fully connected and completely reciprocated network, the index reaches 1, indicating
that everybody nominates all other actors and is nominated by all other actors for a
specific relation. The index reaches 0 when every node is completely isolated from
the other actors, neither nominating anybody nor being nominated by anyone else in the
network. However, we must caution readers that group degree prestige computed with
this equation does not distinguish connection from reciprocity between the nodes in a
network. A high group degree prestige may therefore result either because the nodes
are well connected or because a restricted set of nodes is highly reciprocated in a relation.
Thus, we call for more refined studies of the index of group degree prestige that can
distinguish connection and reciprocity as two separate sources.
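The caution about equation 4.13 can be made concrete with two hypothetical 4-actor
networks that contain the same number of ties. In one, every tie is reciprocated; in the
other, none is. The function name and both matrices are assumptions made for this sketch.
The index is identical for the two networks.

```python
def group_degree_prestige(x):
    """Equation 4.13: received nominations summed over all actors, divided by g(g - 1)."""
    g = len(x)
    received = sum(x[j][i] for i in range(g) for j in range(g) if j != i)
    return received / (g * (g - 1))

# Hypothetical network A: two mutual pairs (0 <-> 1 and 2 <-> 3), 4 ties in total.
mutual_pairs = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]

# Hypothetical network B: a one-way cycle 0 -> 1 -> 2 -> 3 -> 0, also 4 ties,
# none of them reciprocated.
one_way_cycle = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]

print(group_degree_prestige(mutual_pairs))   # 0.3333333333333333
print(group_degree_prestige(one_way_cycle))  # 0.3333333333333333
```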
Overall, measures of centrality, centralization, and prestige are
largely based on the assumption of binary graphs. Studies of the corresponding measures
on valued graphs are scarce, with the exception of Freeman et al.’s (1991) discussion of
betweenness centrality in valued graphs. Clearly, more studies are needed to advance our
analytical techniques for centrality, centralization, and prestige in valued graphs. In
addition, analyses of centrality and prestige were restricted to complete
network data until very recently, when Marsden’s (2002) landmark work extended
centrality measures from complete network data to egocentric network data. We devote
the following section to measuring centrality in egocentric data.
Centrality in Egocentric Network
Focusing only on binary symmetric data on a single relation, Marsden (2002)
discussed centrality measures in egocentric network data. Because egocentric research
designs ask survey respondents (egos) to nominate alters with whom they have certain
relations, each ego i has a data matrix $A_i$ of
size $N_i \times N_i$, which includes the ego and all of its alters. Because, by definition,
ego i has direct ties with all of its alters, $X_{ij} = 1$ ($1 \le j \le N_i$, $i \neq j$) in
the $A_i$ matrix.
Actor degree centrality in a complete network measures the extent to which an
actor is connected directly with other actors in the network (see equation 4.1). In an
egocentric network, the ego is connected directly with all of its alters. Thus, ego i’s degree
centrality is the maximum possible value of actor degree centrality: g – 1 in a network with g
actors. The standardized degree centrality of ego i is always 1, because $(g-1)/(g-1) = 1$.
In complete networks, actor closeness centrality is the inverse of the sum
of the geodesic distances between actor i and the other actors in the network (see equation
4.5). The normalized closeness centrality controls for the size of the network (see
equation 4.6). By definition, ego i in its egocentric network is connected directly
with all of its alters. Thus, its closeness centrality and normalized closeness centrality are
$1/(g-1)$ and $(g-1)/(g-1) = 1$, respectively.
In a complete network, betweenness centrality measures the extent to which a given
node i sits on the geodesic paths between all pairs of the other nodes in the network
(see equation 4.8). Because, by definition, ego node i has a direct connection with all of its
alters in its own egocentric network, ego node i serves as an intermediary for every
pair of alters, unless there is a direct link between the two alters. Marsden
(2002:410) asserted that betweenness centrality for node i in its egocentric data differs
from the measure in complete network data. On the one hand, the betweenness centrality
of node i can be biased downward in its egocentric data. The egocentric betweenness
measure does not reflect ego node i’s intermediary location between a pair of nodes
connected via a geodesic path of length 3 or more, which is counted in the complete
network betweenness measure. On the other hand, egocentric betweenness centrality can
exaggerate ego node i’s betweenness centrality if two alters are connected via both i and
another node outside the egocentric network. In this case, the intermediary node outside the
egocentric network is discarded in the egocentric betweenness measure but counted in
the complete betweenness centrality measure.
To illustrate the above discussion, Figure 4_4 shows the egocentric
network of ego node i, who nominated A, B, C, and D as its alters on a specific relation.
Note that node M is connected to two of i’s alters, A and D, though it does not belong to
ego i’s egocentric network. The total number of pairs among i’s alters is 6 (AB, AC,
AD, BC, BD, and CD), each of which has one unique geodesic path. Because B and C are directly
connected and the geodesic paths connecting all the other pairs of alters include ego i, ego i’s
betweenness centrality is 5/6. However, the presence of node M distinguishes
egocentric betweenness centrality from complete betweenness centrality for ego node i. On
the one hand, the egocentric betweenness measure for i is biased downward because it omits
the intermediary location of i on the geodesic paths between M and B and between M and
C. On the other hand, the egocentric betweenness for i is also biased upward because it
overlooks the geodesic path between A and D that goes through M, which competes
with the AD geodesic path that goes through i.
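The egocentric count for Figure 4_4 can be reproduced with a short sketch. It assumes the
ties described above (i tied to A, B, C, and D, plus a B-C tie) and relies on the fact that,
inside an egocentric network, any two alters that are not directly tied are exactly two
steps apart, so each two-step path between them is a geodesic. The function name
`ego_betweenness` is introduced here for illustration.

```python
from fractions import Fraction
from itertools import combinations

def ego_betweenness(adj, ego):
    """Sum, over non-adjacent alter pairs, of the share of two-step paths through ego."""
    alters = adj[ego]
    total = Fraction(0)
    for j, k in combinations(sorted(alters), 2):
        if k in adj[j]:
            continue  # directly tied alters have a geodesic of length 1 that bypasses ego
        # Intermediaries inside the egocentric network: ego plus any shared alter.
        shared = [m for m in adj[j] if m in adj[k] and (m == ego or m in alters)]
        total += Fraction(1, len(shared))
    return total

# Ties assumed from the description of Figure 4_4; node M is outside the
# egocentric network and is therefore left out.
figure_4_4 = {
    "i": {"A", "B", "C", "D"},
    "A": {"i"},
    "B": {"i", "C"},
    "C": {"i", "B"},
    "D": {"i"},
}

raw = ego_betweenness(figure_4_4, "i")
pairs = len(list(combinations(figure_4_4["i"], 2)))
print(raw, raw / pairs)  # 5 5/6
```

Dividing the raw count by the number of alter pairs gives the 5/6 reported in the text.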
Despite these divergences between the egocentric and complete network
betweenness measures, Marsden (2002) demonstrated empirically, by analyzing 17 network data
sets, that the two measures correspond closely with each other. Therefore, egocentric
betweenness centrality is a reliable substitute for actor betweenness centrality in a complete
network when complete network data are difficult to come by. Marsden (2002) asserted
that differences in data collection between egocentric and complete networks may
account for the divergences in the betweenness measures between the two kinds of network
data, pinpointing an important topic that deserves further systematic study.
Cliques, Cohesion, and Connections
To the extent that cohesiveness among a subset of members can be measured by
strong, direct, intense, and positive ties between them, cliques are an effective network
indicator of such cohesiveness. The notions of cohesive subgroups and cliques are widely
used in the social sciences to indicate frequent and intense interactions among members.
These concepts help researchers understand how
cliques benefit their members by providing advice and instrumental support (Dunbar
1995), and how the extensive use of cliques restricts the range of one’s network contacts (Blau,
Ruan and Aldelt 1991).
Often the concepts of social group, subgroup, and clique are used
interchangeably without a rigorous definition of each (Borgatti et al. 1990). Based on a
vast literature on subgroups in social network studies, Wasserman and Faust (1994:251)
stated four general properties that characterize cohesive subgroups: the
mutuality of ties, the reachability of subgroup members, the frequency of ties among
members, and the relative frequency of ties among subgroup members compared with
non-members. This summary of the underlying characteristics of cohesive subgroups lays
the foundation for an operational definition of the clique in measuring those subgroups.
Cliques
Wasserman and Faust (1994: 254) proposed to define a clique as a maximal
complete subgraph of three or more nodes, all of which are directly connected with each
other, with no other node directly connected with all of the nodes in the
subgraph. Thus, three conditions must hold simultaneously for a subgraph to qualify
as a clique: 1) it has at least three nodes, 2) all the nodes in the subgraph are directly
connected with each other, and 3) no node outside the subgraph is directly
connected with all nodes in the subgraph.
To illustrate the concept of a clique, Figure 4_5 describes a
binary and symmetric network structure among 6 nodes. Two cliques are formed,
BCDE and ABE, because the connections among their members meet the three
requirements for a clique: 1) there are at least three nodes; 2) all of the nodes are directly
connected with each other; and 3) no other node in the graph is directly connected with
all the nodes in the clique. Note that although node F is connected directly with A and C,
nodes A and C are not connected directly, so A, F, and C do not form a clique and F belongs
to neither of the two cliques. Note
also that the third requirement disqualifies EBC, EBD, BCD, and CDE from being
cliques because there is always an outside node that is directly connected with all nodes
in the group. For example, EBC is not a clique because node D is connected directly
with E, B, and C.
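The three requirements can be applied mechanically. The sketch below is a minimal Python illustration using the Bron–Kerbosch procedure for enumerating maximal complete subgraphs; this is our choice of technique, not one prescribed by the sources cited above. The edge list is our reading of Figure 4_5 as described in the text (an assumption), and `maximal_cliques` is a hypothetical helper name.

```python
# Assumed edge list for Figure 4_5 (reconstructed from the text).
EDGES = ["AB", "AE", "AF", "BC", "BD", "BE", "CD", "CE", "CF", "DE"]

ADJ = {}
for a, b in EDGES:
    ADJ.setdefault(a, set()).add(b)
    ADJ.setdefault(b, set()).add(a)


def maximal_cliques(adj, min_size=3):
    """Enumerate maximal complete subgraphs with at least min_size nodes."""
    found = []

    def expand(current, candidates, excluded):
        # current: nodes already in the clique
        # candidates: nodes adjacent to every node in current
        # excluded: adjacent nodes already explored (guards maximality)
        if not candidates and not excluded:
            if len(current) >= min_size:      # requirement 1
                found.append(sorted(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return sorted(found)


CLIQUES = maximal_cliques(ADJ)
print(CLIQUES)  # [['A', 'B', 'E'], ['B', 'C', 'D', 'E']]
```

Requirement 2 (completeness) is enforced by intersecting the candidate set with each new member's neighbors, and requirement 3 (maximality) by reporting a set only when no candidate or excluded node remains.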
As a representation of cohesive subgroups, the clique imposes very strict conditions, which
has earned it a reputation for “being stingy” (Alba 1973). Because of the high threshold for
clique formation, empirical researchers rarely detect cliques in their actual datasets
(Wasserman and Faust 1994:256). Part of the reason for the lack of cliques in actual network
data lies in the design of network data collection. For example, the fixed list approach
restricts respondents’ nominations on a specific relation to a certain maximum number.
Thus, the size of a clique is capped by the limit that researchers impose in advance.
The clique also dichotomizes actors into members of a cohesive subgroup and non-members,
thus overlooking gradations between the more central
and the more peripheral actors. Once a clique draws a boundary, no further distinction is
made among clique members or among non-members. In reality, such a
simplified dichotomy often is not very informative, because much more important
distinctions occur among insiders and among outsiders. One of the determinants of
the number of cliques is the size of the network. Small networks hardly yield any cliques,
whereas large datasets often contain numerous cliques, many of which overlap
with each other. This leads researchers to focus more on multiple, overlapping cliques
than on a single stand-alone clique. Freeman (1992) developed lattices to describe
overlapping cliques.
Although the rigid definition of the clique limits how informative it can be,
the concept has been a catalyst for many new measures of cohesive subgroups that relax
some of its stringent requirements. Two general approaches have emerged as
alternatives to cliques in measuring subgroup cohesiveness. One approach relaxes the
requirements based on nodal degree, the number of lines incident on a node
(Seidman 1983; Doreian and Woodard 1994). We discussed this approach using the k-core
concept in chapter 3. Essentially, a subset of nodes is a k-core if every
node in it has ties with at least k other nodes in the subset. By changing the value of k,
a researcher can set more or less restrictive criteria for bounding a network. The other
approach modifies the clique based on nodal connectivity, which we discuss in the
sections below (Alba 1973;
Mokken 1979).
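As a brief illustration of the degree-based approach, the following Python sketch finds a k-core by repeatedly pruning nodes with fewer than k remaining ties. It is a minimal sketch under stated assumptions: the edge list is our reconstruction of Figure 4_5 from the text, and `k_core` is a hypothetical helper name rather than a function from any particular package.

```python
# Assumed edge list for Figure 4_5 (reconstructed from the text).
EDGES = ["AB", "AE", "AF", "BC", "BD", "BE", "CD", "CE", "CF", "DE"]

ADJ = {}
for a, b in EDGES:
    ADJ.setdefault(a, set()).add(b)
    ADJ.setdefault(b, set()).add(a)


def k_core(adj, k):
    """Return the nodes that keep at least k ties after repeated pruning."""
    core = {node: set(neighbors) for node, neighbors in adj.items()}
    changed = True
    while changed:
        changed = False
        for node in list(core):
            if len(core[node]) < k:
                # Drop the node and erase it from its neighbors' tie lists.
                for neighbor in core.pop(node):
                    core[neighbor].discard(node)
                changed = True
    return set(core)


for k in (2, 3, 4):
    print(k, sorted(k_core(ADJ, k)))
```

With these assumed ties, all six nodes form the 2-core, only B, C, D, and E survive as the 3-core, and no 4-core exists, which shows how raising k tightens the boundary of the subgroup.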
n-clique
The n-clique relaxes the rigid requirement that the geodesic distance between any two nodes
in a clique be 1, that is, that all pairs in a clique be connected with
each other directly. In n-cliques, the maximum geodesic distance between pairs becomes a
variable, n. By varying the value of n, network researchers can differentiate subgroups
with greater cohesiveness (lower n values) from those with lower cohesiveness (higher n values).
For example, a 2-clique method identifies a clique if it has more than two nodes, the
geodesic distances between all pairs of its nodes are two or less, and no other node in the
network is connected to all the clique nodes at a geodesic distance of two or less. Note
that in the previous illustration in Figure 4_5, node F is not a member of either of the two
cliques. When the geodesic distance is relaxed from 1 to 2, the 2-clique includes F
because F is connected with all the other nodes at geodesic distances of either 1 or
2. In general, the n-clique concept specifies that the maximum geodesic distance between all
pairs of nodes in the clique cannot exceed n. Therefore, the higher the n value, the more
inclusive the clique and the less cohesive its members. The original
strict definition of the clique is a special case of the n-clique in which n equals 1.
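The n-clique conditions can be checked directly from geodesic distances. The Python sketch below is illustrative only: the edge list is our reconstruction of Figure 4_5 from the text (an assumption), `is_n_clique` is a hypothetical helper, and distances are measured in the whole graph, so a geodesic may pass through nodes outside the candidate set.

```python
from collections import deque
from itertools import combinations

# Assumed edge list for Figure 4_5 (reconstructed from the text).
EDGES = ["AB", "AE", "AF", "BC", "BD", "BE", "CD", "CE", "CF", "DE"]

ADJ = {}
for a, b in EDGES:
    ADJ.setdefault(a, set()).add(b)
    ADJ.setdefault(b, set()).add(a)


def distances_from(adj, source):
    """Geodesic distances from source to every reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def is_n_clique(adj, members, n, min_size=3):
    """True if members form a maximal set with all pairwise distances <= n."""
    members = set(members)
    dist = {u: distances_from(adj, u) for u in adj}

    def close(u, v):
        return dist[u].get(v, float("inf")) <= n

    if len(members) < min_size:
        return False
    if not all(close(u, v) for u, v in combinations(members, 2)):
        return False
    # Maximality: no outside node may be within distance n of every member.
    outsiders = set(adj) - members
    return not any(all(close(o, m) for m in members) for o in outsiders)


print(is_n_clique(ADJ, "BCDE", 1))    # the strict clique, n = 1
print(is_n_clique(ADJ, "ABCDEF", 2))  # all six nodes as one 2-clique
```

Setting n to 1 recovers the strict definition, while n = 2 admits F, matching the discussion above.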
Cliques in Directed Graphs
Directed graphs distinguish the senders of a relation from its receivers.
The emphasis in directed graphs is on the direction of a specific relation, rather than the mere
presence of a tie between two actors. For example, in a friendship network, a receiver of
many “best friend” nominations is quite distinct from a sender of many such
nominations who receives few nominations itself. Because cliques measure the degree of
cohesiveness among members of a subgraph, cliques in directed graphs take
the reciprocity of ties into account. Extending the previous definition of a clique in
non-directed graphs, we define a clique in a directed graph as a subgraph of three or more
nodes in which all nodes are connected directly with each other and all ties between the
nodes are reciprocated.
To illustrate clique detection in directed graphs, Figure 4_6 shows a network
configuration with directed ties among six nodes. Three cliques are formed: ABE,
EBD, and BCD. Note that the relation from E to C is not reciprocated, which prevents
EBCD from forming a clique. The lack of a direct, mutual tie between A and C also
prevents AFC from forming a clique.
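One simple way to operationalize this definition is to keep only the reciprocated ties and then search for maximal complete subgraphs among them. The Python sketch below does so by brute force, which is adequate for six nodes. The arc list is an assumption: it is our reconstruction of Figure 4_6 from the text, with every tie mutual except the arc from E to C, and the function names are hypothetical.

```python
from itertools import combinations

# Assumed arcs for Figure 4_6: mutual ties plus one unreciprocated arc E -> C.
MUTUAL = ["AB", "AE", "BE", "BD", "DE", "BC", "CD", "AF", "CF"]
ARCS = [(a, b) for a, b in MUTUAL] + [(b, a) for a, b in MUTUAL] + [("E", "C")]


def mutual_graph(arcs):
    """Undirected graph that keeps a tie only when it is reciprocated."""
    arc_set = set(arcs)
    adj = {}
    for a, b in arc_set:
        adj.setdefault(a, set())
        adj.setdefault(b, set())
        if (b, a) in arc_set:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def directed_cliques(arcs, min_size=3):
    """Maximal sets of min_size or more nodes joined pairwise by mutual ties."""
    adj = mutual_graph(arcs)
    nodes = sorted(adj)
    complete = [
        set(group)
        for size in range(min_size, len(nodes) + 1)
        for group in combinations(nodes, size)
        if all(v in adj[u] for u, v in combinations(group, 2))
    ]
    # Keep a complete set only if no larger complete set contains it.
    return sorted(sorted(c) for c in complete
                  if not any(c < other for other in complete))


DIRECTED_CLIQUES = directed_cliques(ARCS)
print(DIRECTED_CLIQUES)  # [['A', 'B', 'E'], ['B', 'C', 'D'], ['B', 'D', 'E']]
```

Because the E to C arc has no return arc, the pair is dropped from the mutual graph, and the four-node set EBCD fails the completeness check.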
Modeled on the n-clique method in non-directed graphs, researchers (Peay
1980; Wasserman and Faust 1994: 275) proposed that the rigid requirement for a clique
be relaxed by varying the geodesic distance between mutually connected nodes in a
directed graph. The method is very similar to the one for non-directed graphs,
except for the special handling of the directions of ties. For example, two nodes can be
n-connected in four distinct ways: weakly n-connected, unilaterally
n-connected, strongly n-connected, and recursively n-connected (Peay 1980). In the
strongest form of n-connection, recursive n-connectedness, the path from i to j uses the
same nodes and connections as the path from j to i in reverse order, and the path
length is n or less. Each of the four forms can be used to define an n-clique in directed
graphs, producing four types of n-clique subgroups. In Figure 4_6, all the nodes belong to
the same recursive 2-clique because all pairs in the graph are mutually reachable
by paths of length 2 or less, and the paths connecting all pairs are reversible.
Structural Equivalence
Social scientists often are interested not only in the cohesiveness of network
actors but also in the positional equivalence between two actors, in the sense that they
both connect with the same set of actors. Structurally equivalent actors are in a competitive
relation more than a cohesive relation. For example, vendors of a common good
that connect with the same set of retailers are structurally equivalent and face stiff
competition from each other. Structurally equivalent actors are completely
substitutable for each other. That is, if one node withdraws from a network, its
structurally equivalent node can easily replace the departing node while maintaining
the original network configuration. In the vendor example, should one of the vendors
leave the business, the other can quickly fill in for the departed vendor to maintain the
original flow of merchandise. Such substitutability often produces fierce competition. Thus, network
scholars using structural equivalence to partition actors mostly are interested in
competitive relations rather than cohesive ties (Burt 1992).
Much like the definition of the clique in identifying cohesive relations, the formal
mathematical definition of structural equivalence is very strict. Two nodes are structurally
equivalent if they have ties to and from the same set of other actors on a specific relation.
In particular, actors i and j are structurally equivalent if, for all other actors
k = 1, 2, …, g (k ≠ i, j), i has a tie to k if and only if j has a tie to k, and i has a tie
from k if and only if j has a tie from k
(Wasserman and Faust 1994:356). If there are multiple relations, this condition must
hold for every relation for two nodes to be structurally equivalent. Note that the presence
or absence of a tie between the two nodes themselves is not a factor in determining whether
they are structurally equivalent. Rather, what determines whether two nodes are structurally
equivalent is their connections with the other nodes in the network.
Note that the preceding mathematical definition of structural equivalence assumes
directed binary graphs. In non-directed binary graphs, there is no distinction between
senders and receivers of relations. Thus, extending the definition of structural
equivalence in directed graphs, actors i and j are structurally equivalent in non-directed
graphs if, for all other actors k = 1, 2, …, g (k ≠ i, j), i has a tie with k if and only if j
has a tie with k. In addition, the definition of structural equivalence needs to take into
account the values of the relations in valued graphs, where ranking scales, rather than
binary values, measure the ties between nodes. Strictly speaking, in valued graphs, two
nodes are structurally equivalent if they have ties with identical values to and from
identical other nodes (Wasserman and Faust 1994:359). If the relations are non-directed
in valued graphs, two nodes are structurally equivalent if they have ties with identical
values with identical other nodes.
The mathematical definition of structural equivalence is too rigid to be practical.
Empirical network data rarely contain pairs that are structurally equivalent according to
such a stringent definition. Rather, many pairs of nodes are approximately
structurally equivalent, in the sense that their connections with other nodes
overlap but are not identical (Wasserman and Faust 1994:366). To reflect such degrees of
approximation to structural equivalence between two actors, measures of structural
equivalence are based on the distance between two nodes in their respective
connections with other nodes, rather than on whether or not the two nodes are strictly
structurally equivalent. The closer this distance is to 0, the more structurally equivalent
the two actors.
Measurement of Structural Equivalence
Measurement of structural equivalence between two actors is based on their
similarities in their patterns of relations with other network actors. Two actors are
structurally equivalent if they share common connections to and from the same set of
other actors in the network. Assuming a binary directed graph, to the extent two actors
are structurally equivalent, they should have identical entries in their corresponding rows
and columns of the matrix. Operationalizing this structural characteristic, Burt (1978)
proposed that Euclidean distance between two actors be used to measure the structural
equivalence between them.
$$d_{ij} = \sqrt{\sum_{k=1}^{g} \left[ (x_{ik} - x_{jk})^2 + (x_{ki} - x_{kj})^2 \right]} \qquad (i \neq j \neq k) \tag{4.14}$$
In the equation, $d_{ij}$ is the Euclidean distance between actors i and j, and $x_{ik}$ is the
entry value for the tie from actor i to actor k in the matrix, which equals either 1 or 0 in a
binary matrix. Because $d_{ij}$ is the square root of a sum of squared terms, $d_{ij} \geq 0$.
If two actors are perfectly structurally equivalent, $d_{ij} = 0$. The larger $d_{ij}$ is, the less
structurally equivalent actors i and j are.
To illustrate how to compute the structural equivalence between two actors, we
produce Figure 4_7 and Table 4_2, which depict the same five-node network structure:
Figure 4_7 is the graph representation, whereas Table 4_2 is the matrix
representation. Figure 4_7 shows that actors 1 and 2 are structurally equivalent, as they
both connect to actors 3 and 4. In contrast, actors 4 and 5 are not structurally equivalent
because, although they both connect to actor 3, actor 4 receives connections from actors 1
and 2, whereas actor 5 does not. The following shows how to apply the above equation to
compute the Euclidean distance between actors 1 and 2.
$$d_{12} = \sqrt{\left[ (x_{13} - x_{23})^2 + (x_{31} - x_{32})^2 \right] + \left[ (x_{14} - x_{24})^2 + (x_{41} - x_{42})^2 \right] + \left[ (x_{15} - x_{25})^2 + (x_{51} - x_{52})^2 \right]} \tag{4.15}$$
Applying the entry values of $x_{13}, x_{23}, x_{31}, x_{32}, x_{14}, x_{24}, x_{41}, x_{42}, x_{15}, x_{25}, x_{51}, x_{52}$,
which are shown in Table 4_2, $d_{12}$ equals 0, indicating that actors 1 and 2 are structurally
equivalent. We leave it to readers as an exercise to compute the Euclidean distance
between actors 4 and 5, which should be $\sqrt{2}$.
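The exercise can be verified with a few lines of Python. This is a minimal sketch of the directed Euclidean distance defined in equation 4.14. The matrix below is an assumption: it is our reconstruction of Table 4_2 from the ties described in the text (1→3, 1→4, 2→3, 2→4, 4→3, 5→3), and the function name is hypothetical.

```python
import math

# Assumed sociomatrix for Figure 4_7 / Table 4_2 (rows send, columns receive).
X = [
    [0, 0, 1, 1, 0],  # actor 1 sends to actors 3 and 4
    [0, 0, 1, 1, 0],  # actor 2 sends to actors 3 and 4
    [0, 0, 0, 0, 0],  # actor 3 sends no ties
    [0, 0, 1, 0, 0],  # actor 4 sends to actor 3
    [0, 0, 1, 0, 0],  # actor 5 sends to actor 3
]


def euclidean_distance(x, i, j):
    """Directed structural-equivalence distance between actors i and j.

    Actors are numbered from 1, as in the text. Rows and columns are
    compared over all third parties k (k != i and k != j).
    """
    i, j = i - 1, j - 1
    total = 0
    for k in range(len(x)):
        if k in (i, j):
            continue
        total += (x[i][k] - x[j][k]) ** 2   # ties sent to k
        total += (x[k][i] - x[k][j]) ** 2   # ties received from k
    return math.sqrt(total)


print(euclidean_distance(X, 1, 2))  # 0.0
print(euclidean_distance(X, 4, 5))  # square root of 2
```

Actors 4 and 5 differ only in the ties they receive from actors 1 and 2, which contributes two squared differences of 1 each.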
Computation of structural equivalence in binary non-directed graphs is simpler
than in directed graphs because there is no distinction between senders and receivers
of the relations. The equation for structural equivalence in binary non-directed graphs is
$$d_{ij} = \sqrt{\sum_{k=1}^{g} (x_{ik} - x_{jk})^2} \qquad (i \neq j \neq k) \tag{4.16}$$
When multiple relations are present in the network, the computation of structural
equivalence between two actors should take into account their connections with other
actors on all the relations. With multiple relations, two actors are structurally
equivalent if and only if they are structurally equivalent on every one of the
relations. The equation reflecting this logic is
$$d_{ij} = \sqrt{\sum_{r=1}^{R} \sum_{k=1}^{g} \left[ (x_{ikr} - x_{jkr})^2 + (x_{kir} - x_{kjr})^2 \right]} \qquad (i \neq j \neq k) \tag{4.17}$$
Another important measure of structural equivalence between two actors in a
network is Pearson’s correlation coefficient, which is used in the CONCOR algorithm
(to be discussed in detail in the Blockmodels section). The correlation coefficient ($r_{ij}$)
between two actors i and j is
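$$r_{ij} = \frac{\sum_{k=1}^{g} (X_{ik} - \bar{R}_i)(X_{jk} - \bar{R}_j) + \sum_{k=1}^{g} (X_{ki} - \bar{C}_i)(X_{kj} - \bar{C}_j)}
{\left[ \sum_{k=1}^{g} (X_{ik} - \bar{R}_i)^2 + \sum_{k=1}^{g} (X_{ki} - \bar{C}_i)^2 \right]^{1/2}
 \left[ \sum_{k=1}^{g} (X_{jk} - \bar{R}_j)^2 + \sum_{k=1}^{g} (X_{kj} - \bar{C}_j)^2 \right]^{1/2}}
\qquad (i \neq j \neq k) \tag{4.18}$$
where
$$\bar{R}_i = \frac{1}{g} \sum_{k=1}^{g} X_{ik} \quad \text{and} \quad \bar{C}_i = \frac{1}{g} \sum_{k=1}^{g} X_{ki} \qquad (i \neq k)$$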
$\bar{R}_i$ and $\bar{C}_i$ in the equation are the mean entry values for row i
and column i, respectively. If the two actors i and j are structurally equivalent, the
correlation between their respective rows and columns in the matrix will be 1. According
to the formula, one can compute that $\bar{R}_1 = \bar{R}_2 = 2/5$ and $\bar{C}_1 = \bar{C}_2 = 0$, and that $r_{12}$ equals 1
in Figure 4_7.
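These values can be reproduced with a direct translation of equation 4.18 into Python. The sketch rests on two assumptions that we flag explicitly: the matrix is our reconstruction of Table 4_2 from the ties described in the text, and the row and column means divide by g while skipping the diagonal cell, which is how we read the definitions of the two means. The function names are hypothetical.

```python
import math

# Assumed sociomatrix for Figure 4_7 / Table 4_2 (rows send, columns receive).
X = [
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]


def row_mean(x, a):
    """Mean of row a: off-diagonal sum divided by g."""
    return sum(x[a][k] for k in range(len(x)) if k != a) / len(x)


def col_mean(x, a):
    """Mean of column a: off-diagonal sum divided by g."""
    return sum(x[k][a] for k in range(len(x)) if k != a) / len(x)


def correlation(x, i, j):
    """Row-and-column correlation between actors i and j (numbered from 1).

    Raises ZeroDivisionError if either actor shows no variation at all.
    """
    i, j = i - 1, j - 1
    ri, rj = row_mean(x, i), row_mean(x, j)
    ci, cj = col_mean(x, i), col_mean(x, j)
    num = ss_i = ss_j = 0.0
    for k in range(len(x)):
        if k in (i, j):
            continue
        num += (x[i][k] - ri) * (x[j][k] - rj) + (x[k][i] - ci) * (x[k][j] - cj)
        ss_i += (x[i][k] - ri) ** 2 + (x[k][i] - ci) ** 2
        ss_j += (x[j][k] - rj) ** 2 + (x[k][j] - cj) ** 2
    return num / math.sqrt(ss_i * ss_j)


print(row_mean(X, 0), col_mean(X, 0))  # 0.4 and 0.0 for actor 1
print(correlation(X, 1, 2))
```

Because actors 1 and 2 have identical rows and identical (empty) columns, the numerator equals each sum of squares and the coefficient is 1.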
The computation of correlation coefficients for pairs of nodes in a symmetric
network is simpler than in an asymmetric network, because there is no distinction
between $X_{ik}$ and $X_{ki}$, or between $\bar{R}_i$ and $\bar{C}_i$. The formula for computing correlation
coefficients in a symmetric network is as follows:
$$r_{ij} = \frac{\sum_{k=1}^{g} (X_{ik} - \bar{R}_i)(X_{jk} - \bar{R}_j)}
{\left[ \sum_{k=1}^{g} (X_{ik} - \bar{R}_i)^2 \sum_{k=1}^{g} (X_{jk} - \bar{R}_j)^2 \right]^{1/2}}
\qquad (i \neq j \neq k) \tag{4.19}$$
Using this formula, we developed a Java program to compute the correlation
coefficients between pairs of nodes in the symmetric/undirected social network shown in
Figure 4_5. Table 4_3 displays the results: the coefficient between B and E is 1,
whereas it is -1 between C and F.
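For readers who want to check these coefficients, the following is a minimal Python sketch of equation 4.19. It is not the program mentioned above; it is an independent illustration under two assumptions: the edge list is our reconstruction of Figure 4_5 from the text, and the row mean divides the off-diagonal row sum by g.

```python
import math

NODES = "ABCDEF"
# Assumed edge list for Figure 4_5 (reconstructed from the text).
EDGES = ["AB", "AE", "AF", "BC", "BD", "BE", "CD", "CE", "CF", "DE"]

INDEX = {name: pos for pos, name in enumerate(NODES)}
X = [[0] * len(NODES) for _ in NODES]
for a, b in EDGES:
    X[INDEX[a]][INDEX[b]] = X[INDEX[b]][INDEX[a]] = 1


def row_mean(x, i):
    """Off-diagonal row sum divided by g."""
    return sum(x[i][k] for k in range(len(x)) if k != i) / len(x)


def correlation(x, i, j):
    """Symmetric-network correlation over third parties k (k != i, j)."""
    mean_i, mean_j = row_mean(x, i), row_mean(x, j)
    num = ss_i = ss_j = 0.0
    for k in range(len(x)):
        if k in (i, j):
            continue
        dev_i, dev_j = x[i][k] - mean_i, x[j][k] - mean_j
        num += dev_i * dev_j
        ss_i += dev_i ** 2
        ss_j += dev_j ** 2
    return num / math.sqrt(ss_i * ss_j)


def r(a, b):
    """Correlation between two nodes given by their letters."""
    return correlation(X, INDEX[a], INDEX[b])


for pair in ("BE", "CF", "FE", "FB"):
    print(pair, round(r(*pair), 3))
```

B and E are tied to the same third parties (A, C, and D), which yields a coefficient of 1, whereas C and F have exactly opposite ties to A, B, D, and E, which yields -1.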
Visual Displays, Clustering, Multidimensional Scaling
Images of networks have been used commonly in social network studies to develop
structural insights and to communicate those insights to others (Freeman 2005). Social
network analysis has undergone three distinct phases in the development of visual
displays of network structures (Freeman 2000). The first stage ranged from the 1930s
to the 1950s, when network researchers relied on hand drawings to depict similarities and
differences in the positions occupied by actors (Moreno 1953). The second stage began
in the 1970s, which witnessed the arrival of automatic graphing, various software packages,
and mainframe computers. In this stage, network scholars increasingly used standardized
computation and graphing procedures. In the latest stage, the advent of high-speed
networks, browsers, the World Wide Web, and widespread PCs has opened a whole new array
of opportunities for visual displays of network data.
Clustering
For the most part, visual displays for exploring social network data seek to
uncover cohesive subgroups through partitioning methods. One of those partitioning
methods is hierarchical agglomerative clustering, which groups network actors into
subsets so that actors in the same subset are relatively similar to each other. Hierarchical
agglomerative clustering normally processes $N \times N$ matrices, in which N denotes the
number of actors in the network. Although hierarchical agglomerative clustering works
on both binary and valued graphs, here we focus on binary graphs for simplicity.
Two measures, the correlation coefficient and Euclidean distance, are widely used
to measure the similarity between a pair of actors in a social network. An empirical
analysis using both measures suggests that they produce very similar results, although
the correlation coefficient is easier to interpret than Euclidean distance
(Aldenderfer and Blashfield 1984: 24-28). Note that the computation of Euclidean
distance or correlation coefficients depends on whether the matrix is directed/asymmetric
or undirected/symmetric.
Once the computation of the similarity measure is complete, hierarchical
agglomerative clustering can proceed to partition actors using a threshold value, α,
which serves as the ceiling value for pairs of actors in the subsets being partitioned
($d_{ij} \leq \alpha$). Thus, actors within a subset are more structurally equivalent, or are
more strongly correlated, than are actors across different subsets. The partitioning process
continues with successively less restrictive α (higher α) until every actor belongs to one
big group. Note that although hierarchical agglomerative clustering produces
non-overlapping clusters, those clusters are nested, in that each cluster can be subsumed as a
member of a larger and more inclusive cluster at a less restrictive threshold. Often, a tree
diagram called a dendrogram is used to depict visually the sequence of cluster mergers.
Hierarchical agglomerative clustering can apply one of three criteria for merging
clusters: single linkage, complete linkage, and average linkage. At a given α level, the
single linkage criterion includes an actor in a cluster if the actor has a correlation
coefficient larger than α with at least one of the actors in the existing cluster. In contrast,
the complete linkage criterion follows the opposite logic:
any candidate for inclusion in an existing cluster must have a correlation
coefficient larger than α with all the actors in the cluster. The third criterion, the average
linkage criterion, was developed as an antidote to the extremes of the single linkage and
complete linkage criteria (Aldenderfer and Blashfield 1984: 36-44). It requires that a
candidate actor being included in an existing cluster have an average correlation
coefficient with all actors in the cluster that is larger than α. Empirical analyses of the
three criteria report that each has its advantages and disadvantages, and
researchers planning to use any of these methods should be aware of those
issues (Aldenderfer and Blashfield 1984: 53-62).
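The three criteria differ only in how they summarize a candidate's correlations with the current members of a cluster. The Python sketch below makes that explicit. The correlations for F with B and E (0.316 each) are taken from this chapter's discussion of Table 4_3; the candidate "Z" and its values are purely hypothetical, added so that the three rules give different answers. The function names are also hypothetical.

```python
# Pairwise correlations keyed by unordered pairs of actors.
SIM = {
    frozenset("BE"): 1.0,
    frozenset("FB"): 0.316,   # reported for Figure 4_5
    frozenset("FE"): 0.316,   # reported for Figure 4_5
    frozenset("ZB"): 0.9,     # hypothetical value, for illustration only
    frozenset("ZE"): 0.1,     # hypothetical value, for illustration only
}


def linkage_similarity(candidate, cluster, sim, rule="average"):
    """Summarize the candidate's correlations with all cluster members."""
    values = [sim[frozenset((candidate, member))] for member in cluster]
    if rule == "single":      # best link: one strong tie is enough
        return max(values)
    if rule == "complete":    # worst link: every tie must be strong
        return min(values)
    if rule == "average":     # mean of all links
        return sum(values) / len(values)
    raise ValueError(f"unknown linkage rule: {rule}")


def joins(candidate, cluster, sim, alpha, rule="average"):
    """True if the candidate clears the threshold alpha under the rule."""
    return linkage_similarity(candidate, cluster, sim, rule) > alpha


for rule in ("single", "complete", "average"):
    print(rule, linkage_similarity("Z", "BE", SIM, rule))
```

For the hypothetical candidate Z and a threshold of 0.5, single linkage would admit Z to the BE cluster, whereas complete and average linkage would not, which illustrates why the choice of criterion matters.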
Multidimensional Scaling
Multidimensional scaling (MDS) is another method for visually illustrating
hidden underlying structures in data. MDS has been highly instrumental in
facilitating research in various disciplines such as psychology, sociology, economics, and
education. In social network analysis, the primary goal of MDS is to detect
meaningful underlying dimensions that reflect similarities or dissimilarities (distances)
between the network actors.
Commonly the input data to MDS is an $N \times N$ symmetric ($X_{ij} = X_{ji}$) matrix, in
which N denotes the number of entities of any type, such as persons, communities,
organizations, or countries. Depending on what the entries represent, the matrix is a
similarity matrix if high numbers indicate great similarity between two actors, or a
dissimilarity matrix if high numbers indicate low similarity between two actors.
The output of MDS is a visual map of actors’ locations in which actors with greater
proximity are closer in space than actors with less proximity. Although MDS
output can be represented in any number of dimensions, results are mostly presented as
two-dimensional maps.
An MDS map does not correspond directly to the entry values of the
original matrix. Rather, it reflects the computed Euclidean distances between pairs of
actors in the network. Thus, an indicator called stress reflects the level of discrepancy
between the original matrix and the new matrix consisting of the Euclidean distances between
all pairs (Kruskal and Wish 1978: 23-30).
$$\text{Stress} = \sqrt{\frac{\sum_{i} \sum_{j} \left( f(x_{ij}) - d_{ij} \right)^2}{\text{Scale}}} \tag{4.20}$$
Here $f(x_{ij})$ is a non-metric, monotonic function of the original entry values
(Kruskal and Wish 1978: 29), whereas $d_{ij}$ refers to the Euclidean distance between
actors i and j on the map. Scale is a scaling factor that constrains the stress indicator to lie
between 0 and 1. When the MDS map perfectly reproduces the input data, $f(x_{ij}) = d_{ij}$ for
all i and j, and stress is zero. Thus, the smaller the stress, the better the representation.
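A small numerical sketch may help fix the idea of stress. The Python code below makes simplifying assumptions that are ours, not the chapter's: f is taken to be the identity function (the input values are used as they are), Scale is taken to be the sum of squared map distances, as in Kruskal's commonly used stress formula, and the square root is applied to the ratio. The coordinates and input matrices are hypothetical.

```python
import math
from itertools import combinations


def stress(dissimilarities, coordinates):
    """Stress of a map, assuming f is the identity and Scale = sum of d^2."""
    numerator = scale = 0.0
    for i, j in combinations(range(len(coordinates)), 2):
        d_ij = math.dist(coordinates[i], coordinates[j])  # map distance
        numerator += (dissimilarities[i][j] - d_ij) ** 2
        scale += d_ij ** 2
    return math.sqrt(numerator / scale)


# Hypothetical two-dimensional map of three actors (a 3-4-5 triangle).
MAP = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]

# Input dissimilarities that the map reproduces exactly.
EXACT = [[0, 3, 4],
         [3, 0, 5],
         [4, 5, 0]]

# Input in which one pair is one unit farther apart than on the map.
DISTORTED = [[0, 3, 4],
             [3, 0, 6],
             [4, 6, 0]]

print(stress(EXACT, MAP))      # 0.0, a perfect representation
print(stress(DISTORTED, MAP))  # about 0.141
```

In practice, non-metric MDS estimates f by monotone regression rather than fixing it as the identity, so this sketch illustrates only the final comparison between transformed inputs and map distances.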
To illustrate the application of the two visualization techniques, we produce a
hierarchical dendrogram (shown in Figure 4_8) and a multidimensional scaling map
(shown in Figure 4_9) using UCINET 5.0 with the network data from Figure 4_5. The input
matrix for the dendrogram is the correlation matrix shown in Table 4_3, whereas
the input matrix for MDS is the original matrix corresponding to Figure 4_5. Figure 4_8
shows the dendrogram using the “average” option: a candidate actor being
included in an existing cluster must have an average correlation coefficient with all
actors in the cluster that is larger than α, which is shown on the upper horizontal axis. For
example, the average of the correlations between F and E (0.316) and between F and B
(0.316) is 0.316. Thus, F is joined to the BE cluster at the level of 0.316. The dendrogram
shows that at the coarsest level, the six nodes are divided into 2 clusters: ACD
and BEF. Further splits occur within each cluster as the threshold value α increases:
within ACD, CD forms a cluster as opposed to A, and within BEF, BE forms a cluster as
opposed to F. The MDS mapping of the 6 actors yields a slightly different configuration.
The dimension specified by the X-axis clusters CBE together as the central set of actors,
whereas the dimension specified by the Y-axis groups CBF as central actors. Note that the
stress indicator of this MDS is 0, suggesting a perfect representation of the original data.
However, when MDS is applied to large datasets, the stress indicator
tends to increase, reflecting some discrepancy between the input data and the
MDS mapping.
Blockmodels
Blockmodeling is an important method for partitioning network actors, which was
developed initially by White and his associates (White, Boorman, and Breiger 1976;
Boorman and White 1976; Schwartz 1977). Since the groundbreaking work of White et al.
(1976) on blockmodeling, researchers have fruitfully used the method to study various
topics such as interorganizational networks (Knoke and Rogers 1979), the diffusion of a new
technology (Anderson and Jay 1985), and the positions and roles of cities belonging to a
world city system (Alderson and Beckfield 2004). Studies have also extended blockmodeling in
its searching and partitioning methods (Winship and Mandel 1983; Wu 1983; Nowicki
and Snijders 2001). Space limitations prohibit an extensive discussion of the
literature on blockmodeling. Rather, this section focuses on core issues: what the
blockmodeling method is, how to implement it with a suitable algorithm,
and how to interpret the outputs of blockmodeling partitioning.
The Blockmodeling Method
Blockmodeling is a partitioning technique that divides a population into sets of
structurally equivalent actors, called blocks. In blockmodeling, a search process iteratively
partitions a population, permuting rows and columns, so that members of a block are
grouped together. The term block refers to a rectangular submatrix consisting of
structurally equivalent actors that have great density between themselves.
Blockmodeling essentially is a data reduction technique for descriptive purposes,
searching for patterns in network data by regrouping cases and presenting condensed
aggregate-level information. Close and similar cases are grouped together to form a
homogeneous block, which is distinguished from other blocks. By creating several blocks
that are characterized by within-block homogeneity and across-block heterogeneity,
blockmodeling reveals the regularities in the patterns of relations among actors that
undergird social structure.
Blockmodeling can process matrices representing social networks with single or
multiple relations, directed or undirected ties, and binary or valued graphs. Here
we focus on binary matrices and refer readers to a more elaborate discussion of
blockmodeling of both binary and valued graphs by Doreian, Batagelj, and Ferligoj
(2005: 347-360). When the input matrices reflect multiple relations, those matrices are
stacked to produce a $KN \times N$ matrix, where K denotes the number of relations
and N represents the number of cases in each matrix. Blockmodeling is
implemented mainly through the CONCOR (CONvergence of iterated
CORrelations) algorithm, developed by one of White’s students, Schwartz (Schwartz 1977).
The sections below discuss the CONCOR algorithm.
The CONCOR Algorithm
The CONCOR algorithm operates on rows, columns, or both rows and columns
simultaneously. For simplicity in our illustration, we assume that the algorithm correlates
columns. The first step of the algorithm calculates the Pearson correlation
coefficients between all pairs of columns. Two cases with exactly the same
pattern of connections with other network actors have a correlation coefficient of
1, whereas two cases with exactly opposite patterns of connections have a coefficient of
-1. The result of this first step is a symmetric $N \times N$ matrix, in which N
denotes the number of cases and the entries are the correlation coefficients between all
pairs of cases. The second step uses the similarity measures from the first step to group
cases into structurally equivalent sets, the blocks, so that cases with great similarity
belong to the same block, whereas cases with great dissimilarity are separated into
distinct blocks.
If the first step processed a matrix in which columns were either perfectly positively
correlated (1) or perfectly negatively correlated (-1), the second step of clustering would
be easy. All values in the result from the first step would be either 1 or -1, permitting a
clear-cut division into two groups, such that pairs within a group correlate at 1 and pairs
across the groups correlate at -1. However, empirical social network data rarely fit such a
profile, necessitating iterative processing of the result. When the first step does not produce a
clearly divisible matrix, the CONCOR algorithm recalculates the correlation coefficients
using the result of the previous round as the input matrix. This process is repeated for each
successive matrix, correlating the coefficients of the coefficients, and so on. Such
repeated computation of the correlation coefficients eventually produces a matrix
containing only 1s and -1s, allowing a clear grouping of the cases into two
blocks. Each of the two blocks can be further divided using the same procedure:
repeatedly computing the correlation coefficients until a clearly divisible matrix emerges.
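The iteration described above can be sketched in a few lines of Python. This is a simplified illustration, not the CONCOR implementation found in any particular package: it correlates columns only, treats every cell (including the diagonal) as an ordinary entry, performs a single two-way split, and will raise an error if a column has no variation. The input matrix is hypothetical, and the function names are our own.

```python
import math


def column_correlations(matrix):
    """Pearson correlations between all pairs of columns."""
    columns = list(zip(*matrix))

    def pearson(a, b):
        mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
        num = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
        ss_a = sum((p - mean_a) ** 2 for p in a)
        ss_b = sum((q - mean_b) ** 2 for q in b)
        return num / math.sqrt(ss_a * ss_b)

    return [[pearson(a, b) for b in columns] for a in columns]


def concor_split(matrix, tolerance=1e-6, max_rounds=100):
    """Correlate repeatedly until entries are near +1/-1, then split in two."""
    corr = column_correlations(matrix)
    for _ in range(max_rounds):
        if all(abs(abs(v) - 1.0) < tolerance for row in corr for v in row):
            break
        corr = column_correlations(corr)  # correlations of correlations
    # Cases correlating positively with the first case form one block.
    first = [j for j, v in enumerate(corr[0]) if v > 0]
    second = [j for j, v in enumerate(corr[0]) if v <= 0]
    return first, second


# Hypothetical binary data matrix (not taken from the chapter's figures).
DATA = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

BLOCKS = concor_split(DATA)
print(BLOCKS)  # ([0, 1], [2, 3])
```

In this example the first round of correlations is not yet clear-cut, because the fourth column correlates at roughly ±0.58 with the others; a few further rounds push every entry toward 1 or -1. Applying `concor_split` again to the submatrix of each block would continue the division.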
Researchers can decide when to stop the successive divisions, thus determining
the number of blocks. The CONCOR algorithm uses 1/-1 as the threshold
for division during each round of iteration, which represents the strongest criterion for an
unambiguous partition into structurally equivalent sets, the blocks. Those blocks, which
contain structurally equivalent actors, constitute a square image matrix that contains
binary values in its entries. The criteria for assigning 0 and 1 to the entries vary, as
discussed in the following section.
The Output and the Interpretations
The output of blockmodeling, implemented through the CONCOR algorithm,
is a square binary image matrix, which replaces the original submatrices with blocks.
The entries of the image matrix are either 0 or 1, depending on the density of relations
within each block. Two criteria have emerged for determining the binary value of each
entry: (1) blocks with no ties among their actors are coded as 0 (zero-blocks) and blocks
with one or more ties are coded as 1 (one-blocks), or (2) researchers choose a
density cutoff point, an alpha value (α); blocks with density below the
cutoff are coded as 0, and those with density at or above α are coded as 1. The
first criterion is the most restrictive because blocks entirely without ties rarely
occur in real data. The second criterion, in which researchers use the alpha value
to dichotomize the entry values, is the more common
practice. Often researchers choose the average density of the entire matrix as the cutoff
point. However, because the choice of the alpha value inevitably involves researchers’
discretionary judgment, researchers adopting the alpha value criterion are vulnerable to
the criticism of being arbitrary. In response, researchers should always justify their choice
on theoretical and empirical grounds, rather than on purely mathematical principles
(Scott: 136-142).
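The two coding criteria can be compared side by side in a short Python sketch. Everything here is illustrative: the function names, the six-actor network, and the two-block partition are assumptions, and each block is assumed to contain at least two actors so that within-block densities are defined.

```python
def block_density(matrix, rows, cols):
    """Share of possible ties present from `rows` to `cols`, ignoring self-ties."""
    cells = [(i, j) for i in rows for j in cols if i != j]
    return sum(matrix[i][j] for i, j in cells) / len(cells)

def image_by_alpha(matrix, blocks, alpha=None):
    """Criterion 2: code a block 1 when its density is at or above alpha."""
    if alpha is None:  # default to the average density of the entire matrix
        everyone = range(len(matrix))
        alpha = block_density(matrix, everyone, everyone)
    return [[1 if block_density(matrix, r, c) >= alpha else 0 for c in blocks]
            for r in blocks]

def image_by_zero_blocks(matrix, blocks):
    """Criterion 1: code a block 0 only when it contains no ties at all."""
    return [[1 if block_density(matrix, r, c) > 0 else 0 for c in blocks]
            for r in blocks]

# Hypothetical network; the tie 0 -> 3 is the only tie between the two blocks.
ties = [
    [0, 1, 1, 1, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
blocks = [[0, 1, 2], [3, 4, 5]]
alpha_image = image_by_alpha(ties, blocks)
zero_block_image = image_by_zero_blocks(ties, blocks)
print(alpha_image, zero_block_image)
```

In this sketch a single stray tie is enough to turn a block into a one-block under the first criterion, whereas the average-density cutoff still codes that block as 0.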
Network Position Measures: Automorphic, Isomorphic, and
Regular Equivalences
Roles and positions are central concepts in social network analysis. Structural
equivalence is one of those methods to identify roles and positions for individuals in a
social network. However, its definition – two actors have to be connected with the same
set of other actors to be structurally equivalent – is too stringent to be practical.
Researchers have developed many alternative and more abstract measures to identify
roles and positions (Everett 1985; Everett, Boyd, and Borgatti 1990; Borgatti and Everett
1992; Faust 1988; Pattison 1988). The following sections discuss those new methods,
including automorphic/isomorphic equivalence and regular equivalence. Note that with
regard to the level of abstraction in describing properties of the relations, structural
equivalence is the least abstract, regular equivalence is the most abstract, and
automorphic/isomorphic equivalence lies in the middle. Therefore, structural equivalence
guarantees automorphic/isomorphic equivalence, which in turn guarantees regular
equivalence, whereas the reverse is not necessarily true. For simplicity in our illustration,
the following description presumes a binary, undirected graph defined by a single
relation, although with some modifications, automorphic equivalence and isomorphic
structure can be used to partition directed and valued graphs too (Wasserman and Faust
1994: 461-502).
Automorphic Equivalence and Isomorphic Structure
Automorphic equivalence and isomorphic structure are closely related concepts.
Researchers often used them interchangeably (Borgatti and Everett 1992). However, it
should be noted that isomorphic structure is used to characterize two graphs, whereas
automorphic equivalence describes relational properties between social actors in one
graph. Two graphs are structurally isomorphic if there is a one-to-one mapping of one set
of nodes to another such that the relations among the original nodes are also preserved. In
other words, a graph isomorphism is a mapping of the nodes in one graph to
corresponding nodes in another graph such that if two nodes are connected in one graph,
then their correspondences in the second graph must also be connected (Borgatti and
Everett 1992: 11). All graphs are isomorphic with themselves; an isomorphism of a
graph with itself is called an automorphism. Two actors are automorphically equivalent if
some automorphism of the graph maps one onto the other, that is, if they are connected to
corresponding other positions. Automorphically equivalent nodes have
identical graph-theoretic properties, such as centrality, ego-density, and clique size
(Borgatti and Everett 1992).
Automorphic equivalence relaxes the rigid requirement of structural equivalence
in defining roles and positions in social networks. Structural equivalence defines
positions by locating groups of similar individuals based on the extent to which they
share identical ties with identical others. In contrast, automorphic equivalence identifies
positions by grouping similar individuals based on the extent to which they share
identical ties with counterparts who play the same roles. For example, two professors
must have the same relations with the same set of students to be structurally equivalent,
whereas automorphic equivalence requires only that the two professors have the same
relations with their own students. Therefore, structurally equivalent actors are also
automorphically equivalent, whereas the reverse is not necessarily true.
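The professor example can be checked by brute force on a small graph. The sketch below is only an illustration under assumptions: the seven-node tree (a department head 0, professors 1 and 2, students 3 to 6) is hypothetical, and enumerating every permutation is feasible only for very small graphs.

```python
from itertools import permutations

# Hypothetical undirected graph: head 0, professors 1-2, students 3-6.
NODES = [0, 1, 2, 3, 4, 5, 6]
EDGES = [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5), (2, 6)]

def neighbors(node):
    """Set of nodes adjacent to `node`."""
    return {b if a == node else a for a, b in EDGES if node in (a, b)}

def structurally_equivalent(u, v):
    """Same ties to the same others (a tie between u and v is ignored)."""
    return neighbors(u) - {v} == neighbors(v) - {u}

def automorphic_classes():
    """Orbits of the nodes under every edge-preserving relabeling."""
    edge_set = {frozenset(e) for e in EDGES}
    orbit = {v: {v} for v in NODES}
    for perm in permutations(NODES):
        relabel = dict(zip(NODES, perm))
        image = {frozenset((relabel[a], relabel[b])) for a, b in EDGES}
        if image == edge_set:  # this relabeling is an automorphism
            for v in NODES:
                orbit[v].add(relabel[v])
    return {frozenset(members) for members in orbit.values()}

classes = automorphic_classes()
print(sorted(sorted(c) for c in classes))
print(structurally_equivalent(1, 2), structurally_equivalent(3, 4))
```

Under these assumptions the two professors fall in one automorphic class even though they teach different students, while only students of the same professor are structurally equivalent.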
By relaxing structural equivalence, automorphic equivalence proves to be very
useful in facilitating empirical research corresponding to various theories. Borgatti and
Everett (1992) summarize and clarify several studies that use structural equivalence to
operationalize theories which, in fact, should be operationalized via
automorphic equivalence. For example, addressing Burt’s (1979) proposal to define the
industries or sectors in the economy as firms producing similar types of goods and
occupying a single position in an interorganizational network, Borgatti and Everett
(1992: 21) argue that structurally equivalent firms, which buy from the same providers or
sell to the same clients, hardly constitute sectors, whereas automorphically equivalent
firms, which buy from similar vendors and sell to similar clients, might.
Regular Equivalences
Regular equivalence is the least restrictive of the three most commonly used
definitions of equivalence: structural equivalence, automorphic equivalence, and regular
equivalence. It is nonetheless the most important measure for sociologists in capturing
social roles and positions. Two persons are regularly equivalent if, whenever one has a
relation with a person in a second position, the other has an identical relation with a
counterpart in that position (White and Reitz 1983: 214). Mothers, each tied to her own
children, are regularly equivalent to one another, as are doctors, each tied to their own
nurses. The following paragraphs review studies on the definitions of
equivalence (Borgatti and Everett 1992; Borgatti and Everett 1993; Everett 1985;
Borgatti and Everett 1989; Doreian 1987; Everett, Boyd, and Borgatti 1990; Faust 1988),
attempting to clarify the differences between the three types of equivalence.
As the strictest form of equivalence, structural equivalence requires that a pair of
actors connect with the same set of other actors on the same type of relation. In contrast,
automorphic equivalence and regular equivalence require only that a pair of actors connect
with other actors who are themselves equivalent to each other on the same
relation. However, the distinction between automorphic equivalence and regular
equivalence is not always clear. Here, we state that automorphic equivalence requires that
the subgraphs surrounding the two actors can be substituted for one another, whereas regular
equivalence does not require a complete substitution between the two subgraphs.
To illustrate the differences, we provide an artificial organizational hierarchy.
Figure 4_10 depicts an organizational hierarchical network divided into four
levels and linked by supervisory relations. The CEO supervises three executive-level
managers (A, B, and C), who supervise four middle managers (D, E, F, and G), who in turn
supervise the rank-and-file employees (H, I, J, K, L, M, and N). Actors B and C are structurally
equivalent because they have identical ties (supervisory relations) with identical others (F
and G). However, the other two pairs (A and B, A and C) are not structurally equivalent
but instead regularly equivalent because they are connected not with identical others but
with role-similar others. Note that these two pairs are not automorphically equivalent either
because the subgraph headed by A is not substitutable for the subgraphs headed by B
and C. No structurally equivalent pairs are present at the middle-manager level among D, E,
F, and G. Instead, several automorphically equivalent pairs surface, including DF, DG, and
FG. The subgraphs headed by D (DHI), F (FKL), and G (GMN) are completely
substitutable for each other. In addition, several regularly equivalent pairs emerge,
including ED, EF, and EG. Although the subgraph headed by E is not substitutable for
those headed by D, F, and G, actor E resembles D, F, and G in that
they all are middle managers of rank-and-file employees in the organizational
hierarchy. Figure 4_10 illustrates that among the three equivalences,
the strictest is structural equivalence, the least strict is regular equivalence, and
automorphic equivalence lies in the middle. In addition, in reflecting social roles and
positions, defined as an aggregate class or category of individuals who share similarities
in their relations with other categories of the rest of the social system (Faust 1988: 315),
regular equivalence is a better indicator than is structural or automorphic equivalence.
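The claims about structural and regular equivalence in the hierarchy can be checked with a short Python sketch. Figure 4_10 is not reproduced here, so the tie list is a partial reconstruction: the ties A to D, A to E, and E to J are assumptions (the text names only B's and C's subordinates and the subgraphs DHI, FKL, and GMN), and the function names are hypothetical.

```python
# Supervisory ties; entries marked "assumed" are not stated in the text.
SUPERVISES = {
    "CEO": ["A", "B", "C"],
    "A": ["D", "E"],            # assumed
    "B": ["F", "G"],
    "C": ["F", "G"],
    "D": ["H", "I"],
    "E": ["J"],                 # assumed
    "F": ["K", "L"],
    "G": ["M", "N"],
}
TIES = {(boss, sub) for boss, subs in SUPERVISES.items() for sub in subs}

def out_set(actor):
    """Actors that `actor` supervises."""
    return {j for i, j in TIES if i == actor}

def in_set(actor):
    """Actors who supervise `actor`."""
    return {i for i, j in TIES if j == actor}

def structurally_equivalent(a, b):
    """Identical ties to and from identical others."""
    return out_set(a) == out_set(b) and in_set(a) == in_set(b)

def is_regular(partition):
    """True if actors in each block relate to the same set of blocks."""
    role = {actor: idx for idx, block in enumerate(partition) for actor in block}
    for block in partition:
        profiles = {
            (frozenset(role[j] for j in out_set(a)),
             frozenset(role[i] for i in in_set(a)))
            for a in block
        }
        if len(profiles) > 1:
            return False
    return True

LEVELS = [{"CEO"}, {"A", "B", "C"}, {"D", "E", "F", "G"}, set("HIJKLMN")]
print(structurally_equivalent("B", "C"), structurally_equivalent("A", "B"))
print(is_regular(LEVELS))
```

Under these assumptions, partitioning the actors by hierarchical level is a regular equivalence even though only B and C are structurally equivalent among the executives.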
Logit models (p*)
Most social network methods are descriptive, attempting to represent some
underlying social structures through data reduction techniques or to characterize network
properties through algebraic computations. A recent wave of groundbreaking work
moves beyond the descriptive analysis of social networks, providing an important
statistical model, logit p*, to explain the presence of dyadic ties with a set of
individual-level and graph-level explanatory factors. Wasserman and Pattison (1996) first
proposed the logit p* model and logistic regression for social networks. Their work,
however, was developed from several earlier treatises on Markov random graphs (Frank
and Strauss 1986), the log-linear modeling of directed graphs (p1) (Holland and
Leinhardt 1981), and the algorithmic implementation of pseudolikelihood estimation (Strauss
and Ikeda 1990). In recent developments, logit p* and logistic regression
were extended to analyze multivariate relations (Pattison and Wasserman 1999) and
valued relations (Robins, Pattison, and Wasserman 1999). Focusing on the application of
logit p*, a couple of recent studies describe, in detail, the data structure and
interpretation of the results (Crouch and Wasserman 1998; Anderson, Wasserman, and
Crouch 1999).
This section briefly discusses the mathematical basics of the logit p* model, while
emphasizing the application of the method to an artificial network. Although the
method can be used to analyze multivariate relations and valued graphs, for simplicity,
this section presumes a dichotomous directed graph with a single relation. Interested
readers may consult the above citations for more advanced topics.
Logistic Regression
The logit p* is closely related to the logistic model. Thus we start with a brief
introduction to logistic regression and refer readers to Pampel (2000) for a
more detailed treatment.
The logistic regression model is used to explain a dichotomous dependent variable
coded as a binary variable (Y* = 1 or 0), which is often presumed to have a binomial
distribution. Applying OLS regression to a binary dependent variable, which
models the probabilities as a linear combination of a vector of explanatory
variables, produces two major problems: 1) the predicted response value can be larger
than 1 or lower than 0; and 2) the model induces heteroscedasticity, because the variance of
the error term varies with the predicted value of the dependent variable. The logistic
regression model corrects these problems by transforming the probabilities into logits. In
particular,
$$\operatorname{Logit}(Y^*) = \log\left(\frac{\Pr(Y^* = 1)}{\Pr(Y^* = 0)}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k \tag{4.21}$$
One can interpret the parameters using logits, odds, or probabilities. Parameter
interpretation using odds is more common than the other two methods, possibly because
it is more straightforward than the logit and less mathematical than the probability method
(Pampel 2000). To obtain the effect of an independent variable in terms of odds, one needs
to exponentiate both sides of the linear equation,
$$\frac{\Pr(Y^* = 1)}{\Pr(Y^* = 0)} = \exp(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k) = e^{\beta_0} e^{\beta_1 X_1} \cdots e^{\beta_k X_k} \tag{4.22}$$
The equation shows that independent variables exert a multiplicative impact on
the odds of the response variable. Interpreting the impact of a given
independent variable ($X_k$) involves taking the exponential of its
parameter, $e^{\beta_k}$.
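The multiplicative reading of equation 4.22 can be illustrated with a small Python sketch. The intercept and coefficients below are hypothetical numbers chosen for illustration, not estimates from any dataset in this chapter, and the function names are assumptions.

```python
from math import exp

def odds(intercept, coefs, xs):
    """Odds Pr(Y*=1)/Pr(Y*=0) implied by equation 4.22."""
    logit = intercept + sum(b * x for b, x in zip(coefs, xs))
    return exp(logit)

def probability(intercept, coefs, xs):
    """Convert the odds back into Pr(Y*=1)."""
    o = odds(intercept, coefs, xs)
    return o / (1 + o)

# Hypothetical model with two predictors.
b0, betas = -1.0, [0.5, 2.0]
base = odds(b0, betas, [1, 0])
raised = odds(b0, betas, [1, 1])   # second predictor raised by one unit
ratio = raised / base              # equals exp(2.0), the multiplier for X2
print(round(ratio, 3), round(probability(b0, betas, [1, 1]), 3))
```

A one-unit increase in the second predictor multiplies the odds by exp(2.0) regardless of the value of the first predictor, which is what makes the odds interpretation convenient.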
Logit p*
Logit p* is the application of logistic regression to the analysis of social network data.
In a social network dataset with a single dichotomous, directed relation among g
actors, the entry (i, j) in the g × g matrix X is a binary value:
$$X_{ij} = \begin{cases} 1 & \text{if } i \to j \\ 0 & \text{otherwise} \end{cases}$$
From X, Wasserman and Pattison (1996) define three additional
matrices: 1) $X_{ij}^{+}$, the matrix in which the tie from i to j is forced to be present; 2)
$X_{ij}^{-}$, the matrix in which the tie from i to j is forced to be absent; and 3) $X_{ij}^{C}$, the
complement relation for the tie from i to j. With these three additional matrices, one can
model the probability that the tie from i to j is present as follows:
$$\Pr(X_{ij} = 1 \mid X_{ij}^{C}) = \frac{\Pr(X = X_{ij}^{+})}{\Pr(X = X_{ij}^{+}) + \Pr(X = X_{ij}^{-})} = \frac{\exp\{\theta' z(x_{ij}^{+})\}}{\exp\{\theta' z(x_{ij}^{+})\} + \exp\{\theta' z(x_{ij}^{-})\}} \tag{4.23}$$
In the formula, $\theta$ is the vector of parameters to be estimated, whereas $z(x_{ij}^{+})$
and $z(x_{ij}^{-})$ are the vectors of network statistics when $X_{ij} = 1$ and $X_{ij} = 0$,
respectively.
The odds of the presence of a tie from i to j relative to its absence are
$$\frac{\Pr(X_{ij} = 1 \mid X_{ij}^{C})}{\Pr(X_{ij} = 0 \mid X_{ij}^{C})} = \frac{\exp\{\theta' z(x_{ij}^{+})\}}{\exp\{\theta' z(x_{ij}^{-})\}} \tag{4.24}$$
Taking the natural log of both sides and simplifying transforms the
above equation into the following:
$$\log\left(\frac{\Pr(X_{ij} = 1 \mid X_{ij}^{C})}{\Pr(X_{ij} = 0 \mid X_{ij}^{C})}\right) = \theta' \left[ z(x_{ij}^{+}) - z(x_{ij}^{-}) \right] \tag{4.25}$$
This equation is called logit p*. It contains a vector of parameters $\theta$ to be
estimated and a vector of changes in network statistics, $z(x_{ij}^{+}) - z(x_{ij}^{-})$, that arises
when the variable $X_{ij}$ changes from 1 to 0.
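The step from equation 4.23 to equation 4.25 can be checked numerically: the conditional probability of a tie depends only on the change in the network statistics. The sketch below uses hypothetical parameter values and statistics, and the function names are assumptions.

```python
from math import exp

def logit_pstar(theta, change):
    """Equation 4.25: theta' [z(x+) - z(x-)]."""
    return sum(t * d for t, d in zip(theta, change))

def tie_probability(theta, z_plus, z_minus):
    """Equation 4.23 written out directly."""
    num = exp(sum(t * z for t, z in zip(theta, z_plus)))
    den = num + exp(sum(t * z for t, z in zip(theta, z_minus)))
    return num / den

# Hypothetical parameters and network statistics for one ordered pair.
theta = [-1.5, 0.8, 1.2]
z_plus = [10, 3, 4]    # statistics with the tie forced present
z_minus = [9, 2, 2]    # statistics with the tie forced absent
change = [p - m for p, m in zip(z_plus, z_minus)]

p_direct = tie_probability(theta, z_plus, z_minus)
p_logistic = 1 / (1 + exp(-logit_pstar(theta, change)))
print(round(p_direct, 4), round(p_logistic, 4))
```

Both routes give the same probability, which is why the model can be estimated with ordinary logistic regression on the change statistics.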
An Artificial Network Dataset
To illustrate the application of logit p*, we use a small artificial network
dataset. Figure 4_8 shows the binary directed graph with 6 actors: 3 boys in squares and 3
girls in circles. Assuming the directed lines in the graph represent “nomination of best
friend,” actors 1 and 5 name each other as best friends, whereas actor 2 names actor 1
as her best friend, but not vice versa.
Scrutinizing the graph, one can easily detect that nominating a best friend is
gender-specific: best friend nominations occur more frequently within the same sex (boy-boy or
girl-girl) than across sexes (a boy and a girl). To model this “same-sex trend” and
other network characteristics on the presence of ties, we chose five model parameters: 1)
overall degree of choice (θ), 2) differential choice within sex (θw), 3) mutuality (ρ), 4)
differential mutuality within sex (ρw), and 5) transitivity (τT). The vector of model parameters
to be estimated is
θ = {θ, θw, ρ, ρw, τT}.
Computing the vector of explanatory variables for all pairs in the graph involves
calculating the changes in the vector of network statistics z(x) when the tie from i
to j changes from 1 to 0. In particular,
$Z_{\theta} = \sum_{i,j} X_{ij}$ is the choice explanatory variable
$Z_{\theta_w} = \sum_{i,j} X_{ij}\,\delta_{ij}$ is the choice within sex explanatory variable
$Z_{\rho} = \sum_{i<j} X_{ij} X_{ji}$ is the mutuality variable
$Z_{\rho_w} = \sum_{i<j} X_{ij} X_{ji}\,\delta_{ij}$ is the mutuality within sex explanatory variable
$Z_{\tau_T} = \sum_{i,j,k} X_{ij} X_{jk} X_{ik}$ is the transitivity explanatory variable
The indicator variable $\delta_{ij}$ is binary: it equals 1 if both i and j are
in the same sex group and 0 otherwise. Note that for a directed network dataset with g
actors, the total number of cases for the logit p* is g(g - 1), derived from
$P_g^2 = \frac{g!}{(g-2)!} = g(g-1)$. Thus our dataset with 6 actors has 30 cases of
directed pairs in the logit p* model. Also note that Markov graph theory encompasses
more variables than those in our model, such as individual expansiveness ($X_{i+}$) and
popularity ($X_{+j}$), and graph-level cyclicity. Space limitations prohibit presentation of
the full-blown logit p* with all explanatory variables. Interested readers can consult
Anderson et al. (1999) and Wasserman and Pattison (1996).
Table 4_3 shows the input dataset for logistic regression of the presence/absence
of directed friendship ties between ordered pairs. Below we illustrate the computation of
the five explanatory variables with the ordered pair from actor 5 to 6.
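Change in choice = $\sum_{i,j} X_{ij}^{+} - \sum_{i,j} X_{ij}^{-}$ = 10 - 9 = 1
Change in choice within the same sex = $\sum_{i,j} X_{ij}^{+}\delta_{ij} - \sum_{i,j} X_{ij}^{-}\delta_{ij}$ = 9 - 8 = 1
Change in mutuality = $\sum_{i<j} X_{ij}^{+} X_{ji}^{+} - \sum_{i<j} X_{ij}^{-} X_{ji}^{-}$ = 3 - 2 = 1
Change in mutuality within the same sex = $\sum_{i<j} X_{ij}^{+} X_{ji}^{+}\delta_{ij} - \sum_{i<j} X_{ij}^{-} X_{ji}^{-}\delta_{ij}$ = 3 - 2 = 1
Change in the transitivity = $\sum_{i,j,k} X_{ij}^{+} X_{jk}^{+} X_{ik}^{+} - \sum_{i,j,k} X_{ij}^{-} X_{jk}^{-} X_{ik}^{-}$ = 4 - 2 = 2
Here the superscripts + and - denote the graph with the tie from actor 5 to actor 6 forced
to be present and forced to be absent, respectively.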
Note that in computing the transitivity, the original graph with the tie
from actor 5 to actor 6 present has 4 transitive triples: (1 → 5, 5 → 6, 1 → 6),
(1 → 6, 6 → 5, 1 → 5), (4 → 2, 2 → 3, 4 → 3), and (5 → 1, 1 → 6, 5 → 6). With the
tie from actor 5 to actor 6 forced to be absent, the graph contains 2 transitive triples,
(1 → 6, 6 → 5, 1 → 5) and (4 → 2, 2 → 3, 4 → 3). Thus, the change in the transitivity
is 4 - 2 = 2 when the tie from actor 5 to 6 changes from being present to being absent.
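The five change statistics for the ordered pair (5, 6) can be reproduced with a short Python sketch. Figure 4_8 is not reproduced here, so the tie list is a hypothetical reconstruction: nine ties are taken from the friendships and transitive triples named in the text, while the tie 3 → 4 and the assignment of actors to sexes are assumptions chosen only so that the totals agree with the reported counts (10 ties, 9 within sex, 3 mutual dyads, 4 transitive triples). The function names are likewise assumptions.

```python
# Hypothetical reconstruction of the directed friendship graph.
SEX = {1: "boy", 5: "boy", 6: "boy", 2: "girl", 3: "girl", 4: "girl"}  # assumed
TIES = {
    (1, 5), (5, 1), (5, 6), (6, 5), (1, 6),   # named in the text
    (4, 2), (2, 3), (4, 3), (2, 1),           # named in the text
    (3, 4),                                   # assumed tenth tie
}

def statistics(ties, sex):
    """Choice, choice within sex, mutuality, mutuality within sex, transitivity."""
    choice = len(ties)
    choice_within = sum(1 for i, j in ties if sex[i] == sex[j])
    mutual = sum(1 for i, j in ties if i < j and (j, i) in ties)
    mutual_within = sum(1 for i, j in ties
                        if i < j and (j, i) in ties and sex[i] == sex[j])
    transitive = sum(1 for i, j in ties for k in sex
                     if k not in (i, j) and (j, k) in ties and (i, k) in ties)
    return (choice, choice_within, mutual, mutual_within, transitive)

def change_statistics(ties, sex, i, j):
    """Statistics with the tie i -> j forced present minus forced absent."""
    present = statistics(ties | {(i, j)}, sex)
    absent = statistics(ties - {(i, j)}, sex)
    return tuple(p - a for p, a in zip(present, absent))

print(statistics(TIES, SEX))
print(change_statistics(TIES, SEX, 5, 6))
```

Looping `change_statistics` over all 30 ordered pairs would produce rows analogous to those of Table 4_3 under these assumptions. The change in choice is 1 for every pair by construction, because forcing a tie present always adds exactly one tie.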
Loading the dataset in Table 4_3 into commercial statistical software such
as SPSS, SAS, or STATA, one can conduct a logistic regression with “tie” as the
dependent variable and five independent variables: choice, choice-within, mutuality,
mutuality-within, and transitivity. Regressing “tie” on the five independent variables, we
found some anomalies in our results. First, the variable “choice” turns out to be constant
because every ordered pair has a value of 1. Thus the model discards the “choice”
variable. Second, the remaining variables have unusually large parameter estimates and
standard errors, indicating a potential problem of multicollinearity: high correlation
between independent variables significantly distorts the estimated standard
errors. To circumvent this problem, we run four separate models, regressing the “tie”
dependent variable on the four remaining independent variables one at a time. Third, even with
separate models, we found that the standard error for transitivity is as large as
9262 for unknown reasons. We thus drop the transitivity model and call for more
investigation of this anomaly.
Interpretation of the logit p* results is similar to interpreting other standard
outputs from logistic regression. Model 1 in Table 4_4 shows that each unit increase in
the choice between i and j, provided that i and j are of the same sex, multiplies the odds that
i actually sends a tie to j by 51 (exp(3.932) = 51). Model 2 shows that the tendency
for actors i and j to have a mutual tie multiplies the odds that actor i actually sends a tie to
actor j by 6 (exp(1.792) = 6). Model 3 shows that the tendency for actors i and j
to have a mutual tie multiplies the odds that actor i actually sends a tie to actor j by 8.5
(exp(2.140) = 8.5), provided that both actors are in the same sex group. Note that
the fit of a logistic regression model is indicated by twice the negative of the log likelihood
(-2 log likelihood). If the model were to fit perfectly, this -2 log likelihood measure
would equal 0. Thus, a large value of the -2 log likelihood suggests poor fit (Knoke,
Bohrnstedt and Mee 2002: 287-314). Comparing the -2 log likelihoods of two nested
equations, one less restrictive and one more restrictive, one can determine
whether the additional predictors in the less restrictive equation significantly improve the
model fit. Researchers have used this technique to search for the most parsimonious
model (Anderson et al. 1999).
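The odds multipliers quoted above, and the comparison of nested models, can be sketched in Python. The three coefficients are those reported in the text; the two -2 log likelihood values are hypothetical, and the test shown covers only the case of one additional predictor (one degree of freedom), where the chi-square tail probability can be written with the complementary error function.

```python
from math import erfc, exp, sqrt

# Coefficients reported in the text for Models 1 to 3.
REPORTED = {
    "choice within sex": 3.932,
    "mutuality": 1.792,
    "mutuality within sex": 2.140,
}
odds_multipliers = {name: exp(b) for name, b in REPORTED.items()}

def likelihood_ratio_test_1df(neg2ll_restricted, neg2ll_full):
    """Difference in -2 log likelihood and its chi-square p-value (1 df).

    Assumes the fuller model fits at least as well as the restricted one.
    """
    g2 = neg2ll_restricted - neg2ll_full
    p_value = erfc(sqrt(g2 / 2.0))  # upper tail of chi-square with 1 df
    return g2, p_value

# Hypothetical -2 log likelihoods for two nested models.
g2, p_value = likelihood_ratio_test_1df(40.0, 34.0)
print({k: round(v, 2) for k, v in odds_multipliers.items()})
print(g2, round(p_value, 4))
```

With more than one added predictor the same difference would be compared against a chi-square distribution with correspondingly more degrees of freedom, which this sketch does not cover.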
The logit p* model marks an important step advancing social network studies. It
moves beyond representation and description, focusing on explanation of relational ties
between actors. It models explicitly the impacts on relational ties from graph level
characteristics and individual idiosyncrasies. Immense opportunities lie ahead for
network researchers to use logit p* to analyze substantive social issues. We suggest that
more practical guides showing the applications of logit p* are much needed to propagate
this new technique among network researchers.
Affiliation Networks
Affiliation networks are used to represent the affiliation of a set of actors with a
set of social events (Wasserman and Faust 1994: 291-343). Social actors are linked
through their joint participation in social events or membership in collectivities. Social
events are linked to each other through the multiple memberships of actors. Affiliation
networks illustrate those connections between actors and events. An affiliation
network is also called a membership network (Breiger 1990) or a hypernetwork (McPherson
1982).
An affiliation network consists of two elements: a set of actors and a set of events,
which makes it a two-mode network. Substantive studies using affiliation networks are
numerous (Wasserman and Faust 1994: 196). In chapter 3 we discuss the Freeman et al.
(1987) study focusing on a group of university faculty and students who attend a series of
nine colloquiums. The researchers use an affiliation network to represent their network data,
in which faculty and students are social actors and colloquiums are social events. This
chapter uses artificial network data consisting of five social actors and three social
events to discuss how to represent affiliation networks, how to analyze
them, and what properties they have.
The Affiliation Network Matrix and Bipartite Graph
An affiliation network can be represented with a matrix that records the affiliation of
each actor with each event. Assuming that the affiliation network has g actors and h
events, the matrix that represents such affiliation has g rows and h columns,
whereby rows indicate actors and columns indicate events. If actor i attends event j, the (i, j)
entry in the matrix is 1; otherwise the entry is 0. Denoting such an affiliation
matrix as A and the values in the matrix as $X_{ij}$, the condition is
$$X_{ij} = \begin{cases} 1 & \text{if actor } i \text{ is affiliated with event } j \\ 0 & \text{otherwise} \end{cases}$$
Note that the row margins ($\sum_{j=1}^{h} X_{ij}$) of the matrix A give the number of social events
an actor is affiliated with, whereas the column margins ($\sum_{i=1}^{g} X_{ij}$) indicate the number of
actors in a social event. Thus the values of the row margins range from 0 to h, indicating
that the number of social events an actor attends can be anywhere from no event to all the
events. Similarly, the values of the column margins range from 0 to g, suggesting that an
event may attract anywhere from no actors to all actors in the network.
In addition to the affiliation matrix A, the bipartite graph is another method to represent
an affiliation network. A bipartite graph has two presentation forms: a visual
bipartite graph and a bipartite matrix. A visual bipartite graph includes actors, events, and
lines connecting the actors to the events with which they are affiliated. However, a
bipartite graph does not permit lines connecting actors with each other or lines
connecting events with each other. A bipartite matrix contains both the actors and the events
in its rows and in its columns. Assuming an affiliation network with g actors and h
events, the bipartite matrix is a “g + h” by “g + h” square matrix.
We use an artificial network consisting of five actors and three events to illustrate
how to represent an affiliation network using bipartite visual graph and bipartite matrix.
Figure 4_12 shows the affiliation network, in which the five actors are represented in
circles and the three events in squares. Lines connecting actors with events indicate that
the actors attend the events, or the events draw the actors. From the actor’s perspective,
actor 1 attends both events 1 and 2, actor 2 attends only event 1, actor 3 attends only
event 2, actor 4 attends event 3, and actor 5 attends both event 2 and event 3. From the
event’s perspective, event 1 draws actors 1 and 2, event 2 draws actors 1, 3, and 5, and
event 3 attracts actors 4 and 5. While the graph contains no line between actors or
between events, it shows how actors can be connected through their common affiliation
with certain events. For example, actor 1 is connected with actor 2 through their common
affiliation with event 1. Actor 1 is also connected with actors 3 and 5 through common
affiliation with event 2.
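The attendance pattern just described can be turned into the affiliation matrix A with a few lines of Python. The variable names are hypothetical; the memberships are those listed in the text.

```python
# Events attended by each actor, as described in the text.
ATTENDS = {1: [1, 2], 2: [1], 3: [2], 4: [3], 5: [2, 3]}
N_EVENTS = 3

# g x h affiliation matrix: rows are actors, columns are events.
A = [[1 if event in ATTENDS[actor] else 0 for event in range(1, N_EVENTS + 1)]
     for actor in sorted(ATTENDS)]

row_margins = [sum(row) for row in A]        # events per actor
col_margins = [sum(col) for col in zip(*A)]  # actors per event

for row in A:
    print(row)
print(row_margins, col_margins)
```

The row margins give the number of events each actor attends and the column margins the number of actors each event draws, matching the two perspectives in the text.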
Table 4_5 shows the matrix representation of the bipartite graph. It is a square
matrix with both rows and columns representing the actors and the events. Cell entries connecting
actors with actors, or connecting events with events, are all 0. The cell entries connecting
the five actors in the rows and the three events in the columns are either 1 or 0, depending
on whether the actor is affiliated with the event. For example, actor 1 has 1s under both
events 1 and 2, indicating that it attends both events. The five actors in the rows and the
three events in the columns constitute a sub-matrix A, denoted with bold fonts and
borders in the upper-right portion of the table. In contrast, the lower-left portion of the table
contains another sub-matrix with events as rows and actors as columns. It is the
transpose of A, denoted as A’ ($X'_{ij} = X_{ji}$). Note that each row margin equals its
corresponding column margin. Row margins (or column margins) of actors give the
number of events the actors attend, whereas the row margins (or column margins) of
events indicate the number of actors who attend the events.
Succinctly, such a bipartite matrix can be represented in the following form:
$$X^{A,E} = \begin{bmatrix} 0 & A \\ A' & 0 \end{bmatrix} \tag{4.26}$$
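The block form in equation 4.26 can be assembled directly from A. This is a small sketch with hypothetical variable names, assuming the actors are listed first and the events second in both rows and columns.

```python
# Affiliation matrix A for the five actors and three events in the example.
A = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 1],
]
g, h = len(A), len(A[0])
A_t = [list(col) for col in zip(*A)]  # transpose A'

# Upper block rows: [0 | A]; lower block rows: [A' | 0].
bipartite = [[0] * g + row for row in A] + [row + [0] * h for row in A_t]

for row in bipartite:
    print(row)
```

The result is an 8 by 8 symmetric matrix whose row margins repeat the actor margins followed by the event margins.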
Multiplying the two sub-matrices (A and A’) yields information about the
co-membership relations between actors and the co-actorship relations between events. Let us
define XA as a symmetric, valued matrix describing the co-membership between
actors.
XA = AA’
(4.27)
Assuming an affiliation network has g actors and h events, A is a g × h matrix,
whereas A’ is an h × g matrix. Thus, XA is always a g × g matrix, in which the entry
value of Xi, j indicates the number of events with which actors i and j are both affiliated.
Similarly, if we define XE as a symmetric, valued matrix describing the number of
actors that pairs of events have in common, then
XE = A’A
(4.28)
One can easily see that XE is an h × h matrix, in which the entry value of Xi, j
indicates the number of members shared by the two events i and j.
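Using our example to illustrate:
$$X^{A} = AA' = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 1 & 1 & 0 & 1 \\ 1 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 1 & 2 \end{bmatrix} \tag{4.29}$$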
Because we have 5 actors and 3 events, XA is a 5 by 5 symmetric
matrix, in which Xi, j denotes the number of events both actors i and j attend. For
example, actor 1 has 1 co-membership with each of actors 2, 3, and 5 through their common
affiliations but no co-membership with actor 4. The diagonal values Xi, i in XA have
substantive meaning, as they indicate the number of events an actor is affiliated with. For
example, actor 5 is affiliated with 2 events.
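Using our example to compute XE, one can easily obtain the number of actors
each event attracts and the number of actors each pair of events shares.
$$X^{E} = A'A = \begin{bmatrix} 1 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 2 \end{bmatrix} \tag{4.30}$$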
Similar to XA, XE has several interesting properties. First, it is a 3 by 3
symmetric matrix with rows and columns indicating the three events. Second, the
diagonal values (Xi,i) indicate the number of actors who are affiliated with event i. For
example, event 2 has three actors (actors 1, 3, and 5). Third, the off-diagonal entries
indicate the number of actors two events have in common. For example, X1,2
= 1 indicates that events 1 and 2 share one actor (actor 1 attends both events),
whereas X1,3 = 0 denotes that events 1 and 3 share no common actors.
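The two products can be verified with plain Python lists. The helper names `transpose` and `matmul` are hypothetical; the matrix A is the example affiliation matrix from the text.

```python
def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def matmul(p, q):
    """Ordinary matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*q)]
            for row in p]

A = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 1],
]
XA = matmul(A, transpose(A))   # actor-by-actor co-membership
XE = matmul(transpose(A), A)   # event-by-event shared actors

for row in XA:
    print(row)
for row in XE:
    print(row)
```

The diagonals of the two products reproduce the row and column margins of A, as the text notes.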
As the diagonal values in XA give the number of events an actor is affiliated
with, one can easily compute the average rate of affiliation at the network level by
dividing the total number of events with which all the actors are affiliated by the total
number of actors ($\sum_{i=1}^{g} X^{A}_{ii} / g$). In our example, the average rate of affiliation
is 1.4 (7/5), suggesting that, on average, each actor is affiliated with 1.4 events in the
graph. Likewise, we can compute the average number of actors per event by dividing the
diagonal total in XE by the total number of events ($\sum_{i=1}^{h} X^{E}_{ii} / h$). In our example,
the average number of actors per event is 2.33 (7/3), indicating that, on average, each
event attracts 2.33 actors.
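Both averages can be computed from A alone, because for a binary matrix the diagonal of AA’ equals the row margins of A and the diagonal of A’A equals its column margins. The short sketch below uses hypothetical variable names.

```python
A = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 1],
]
g, h = len(A), len(A[0])

# Diagonals of AA' and A'A, computed without forming the full products.
actor_diagonal = [sum(x * x for x in row) for row in A]
event_diagonal = [sum(x * x for x in col) for col in zip(*A)]

mean_events_per_actor = sum(actor_diagonal) / g
mean_actors_per_event = sum(event_diagonal) / h
print(mean_events_per_actor, round(mean_actors_per_event, 2))
```

Both diagonal totals equal 7, the total number of actor-event affiliations, so the two averages differ only in their denominators.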
Density and Centrality in Affiliation Network
Density and centrality are important measures of network properties. In particular,
a density measure suggests the proportion of ties that are present out of the maximum
possible ties in a binary graph or the average value attached to the lines in a valued graph.
Similarly, the interpretation of density measure in an affiliation network depends on
whether the affiliation network is a binary or valued graph (Wasserman and Faust 1994:
316).
Assuming that we have an actor-by-actor matrix XA that describes the co-membership
relation, the density measure is computed as follows (note that our formula differs slightly
from that of Wasserman and Faust 1994: 316 because we assume a symmetric matrix in
which relations are undirected, whereas they assume asymmetric directed graphs).
$$D^{A} = \frac{2 \sum_{i=1}^{g} \sum_{j=1}^{g} X^{A}_{ij}}{g(g-1)}, \quad i < j \tag{4.31}$$
DA denotes the density measure for the actor matrix in an affiliation network. The
numerator sums the entry values over all distinct pairs of actors in the matrix,
excluding the diagonal cells, and doubles the total. The diagonal values in the actor matrix
give the number of events an actor is affiliated with; we do not consider a pair involving an
actor and itself a valid unit. Dividing the doubled sum by g(g - 1) is equivalent to dividing
the sum by the total number of pairs in a network with g actors, which equals g(g - 1)/2
in a symmetric matrix.
The formula for the density of the event-by-event matrix XE is as
follows:
$$D^{E} = \frac{2 \sum_{i=1}^{h} \sum_{j=1}^{h} X^{E}_{ij}}{h(h-1)}, \quad i < j \tag{4.32}$$
This formula divides the sum of the entry values over all distinct
pairs of events in the matrix (excluding the diagonal cells) by the total number of pairs
of different events. It indicates the proportion of event pairs that share one or more
members in a binary graph, or the average number of actors who belong to each
pair of events in a valued graph.
In our example, both the actor matrix XA and the event matrix XE are binary
once the diagonal values are excluded. Thus the interpretation of the density of both graphs
assumes a binary network. In particular, DA = 5/10 = 50 percent, suggesting that 50 percent
of actor pairs share co-membership by attending at least one common event. From the event
network perspective, DE = 2/3 = 66.67 percent, indicating that 66.67 percent of event pairs
share at least one common actor. Looking at Figure 4_12, you may find that the actor
pairs that share at least one event are a1a2, a1a3, a1a5, a3a5, and a4a5, whereas the event
pairs that share at least one actor are e1e2, which share a1, and e2e3, which share a5. In
contrast, e1 and e3 share no common actor.
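The two density values can be reproduced from the co-membership matrices. The sketch below sums each distinct pair once (i < j), doubles the total, and divides by n(n - 1), following equations 4.31 and 4.32; the function name is hypothetical.

```python
def density(x):
    """Density of a symmetric matrix, counting each distinct pair once."""
    n = len(x)
    pair_sum = sum(x[i][j] for i in range(n) for j in range(i + 1, n))
    return 2 * pair_sum / (n * (n - 1))

XA = [
    [2, 1, 1, 0, 1],
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 1],
    [0, 0, 0, 1, 1],
    [1, 0, 1, 1, 2],
]
XE = [
    [2, 1, 0],
    [1, 3, 1],
    [0, 1, 2],
]
DA = density(XA)
DE = density(XE)
print(DA, round(DE, 4))
```

Because the off-diagonal entries here are all 0 or 1, the results can be read as proportions of connected pairs; with larger off-diagonal values the same function would return an average tie value instead.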
Social network analysts have been studying centrality at actor level and
centralization at graph level for decades (Freeman 1979; Wasserman and Faust 1994:
chpt. 5). In previous section, we noted that actor centrality measures the importance or
visibility of actors within a network. Depending on how the concepts of importance or
visibility are interpreted, researchers described four major types of centrality: degree
centrality, closeness centrality, betweenness centrality, and eigenvector centrality
(Wasserman and Faust 1994: chpt. 5). To review briefly, degree centrality reflects the
extent to which an actor is active in a network, closeness centrality measures the extent to
which an actor is connected with other actors in a network via shortest paths,
betweenness centrality captures the extent to which an actor mediates flows of
information or resources between other actors in a network, and eigenvector centrality
reflects the extent to which an actor is connected to other central actors in a network. A
recent work discusses the application of all four centrality measures to affiliation networks
(Faust 1997). Here, we focus on the computation and interpretation of degree
centrality in affiliation networks.
A distinctive feature of affiliation networks is that they relate not only actors to
events, but also actors to actors and events to events. Thus, degree centrality for affiliation
networks can reflect both actors’ and events’ activity. Drawing upon the notion that
degree centrality measures the number of contacts an actor has, one can measure an actor’s
degree centrality in affiliation networks by counting the contacts the actor has
through co-membership in events (Faust 1997). The actor matrix
XA that was derived from the bipartite graph describes the co-membership between a
given actor and the other actors in a network. Taking advantage of this property of the actor
matrix XA, we can obtain actor degree centrality by computing each actor’s row margin in XA:
$$C_D^{A}(n_i) = \sum_{j=1}^{g} X^{A}_{i,j}, \quad i \neq j \qquad (4.33)$$
In our example, actor 1 shares co-membership with actors 2, 3, and 5. Its degree
centrality of 3 reflects its connectivity. With only one co-membership with actor 1, actor
2 has lower degree centrality (1) than actor 1, suggesting its lower number of contacts
with other actors compared with actor 1.
Likewise, one can obtain the event’s degree centrality in affiliation networks by
looking at row margins of the event matrix XE, which reflect the number of other events
with which a given event is connected by sharing common actors.
$$C_D^{E}(e_i) = \sum_{j=1}^{h} X^{E}_{i,j}, \quad i \neq j \qquad (4.34)$$
The degree centralities for the three events in our example are 1, 2, and 1
respectively. Event 2 has higher degree centrality than events 1 and 3 because it is
connected with both events 1 and 3 by sharing at least one common actor, whereas events
1 and 3 are connected with only event 2.
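The row-margin computation for actors and events can be sketched in Python as follows. The affiliation matrix is assumed from the memberships described in the text, and the function name is hypothetical.

```python
import numpy as np

# Affiliation matrix assumed from the chapter's example
# (rows: actors a1..a5, columns: events e1..e3).
A = np.array([
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 1],
])

def degree_centrality(m):
    """Row margins of the binarized co-membership matrix, excluding the diagonal."""
    x = (m @ m.T > 0).astype(int)
    np.fill_diagonal(x, 0)
    return x.sum(axis=1)

actor_degree = degree_centrality(A)    # one value per actor
event_degree = degree_centrality(A.T)  # one value per event
print(actor_degree.tolist())
print(event_degree.tolist())
```

Passing the transposed matrix reuses the same function for events, which mirrors the symmetry between the actor matrix XA and the event matrix XE.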
Analysis of Lattices
An affiliation network is essentially a set of two-mode network data. One mode is a set
N of actors ($a_i, a_j, \ldots, a_n$), and the other mode is a set M of events ($e_i, e_j, \ldots, e_m$). The
two sets are linked by affiliations. When an actor $a_i$ participates in an event $e_j$, the binary
entry $X_{i,j} = 1$ in the matrix P ($P = N \times M$), in which the actors define the rows and
events define the columns.
One of the clear advantages of using the affiliation network to represent network
structure is that the affiliation network can illustrate three types of patterning: (1) the
actor-event structure, in which actors are linked to events through their participations, (2)
the actor-actor structure, in which actors are connected with each other through their
common affiliations with certain events, and (3) the event-event structure, in which
events are related to each other through their sharing of a common set of actors.
However, bipartite graphs, which as described in a previous section permit links only
between actors and events, fall short of visualizing the other two types of structures:
the actor-actor structure and the event-event structure. Lattices were developed to represent
all three types of structures clearly in a single visual model (Freeman and White 1993;
Wasserman and Faust 1994: 326-342).
Assume that we have a finite nonempty set X (x, y, z, …) and a binary relation $\le$
on X that is reflexive ($x \le x$), antisymmetric (if $x \le y$ and $y \le x$, then $x = y$), and
transitive (if $x \le y$ and $y \le z$, then $x \le z$). For a pair of
elements x, y in X, a lower bound is an element m such that $m \le x$ and $m \le y$. A lower
bound m is the greatest lower bound, or meet, when every other
lower bound b of x and y satisfies $b \le m$. An upper bound j is an element such that
$x \le j$ and $y \le j$. An upper bound j is the least upper bound, or join, when every other
upper bound b of x and y satisfies $j \le b$. A lattice is formed when a partial order
on a finite set X gives every pair of elements in X both a meet and a join
(Freeman and White 1993: 131).
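The definitions of meet, join, and lattice can be made concrete with a small Python sketch. The example order (divisibility among the divisors of 12) is a hypothetical illustration that does not come from the chapter, and the function names are assumptions.

```python
from itertools import combinations

def meet(x, y, elements, leq):
    """Greatest lower bound of x and y, or None if it does not exist."""
    lower = [m for m in elements if leq(m, x) and leq(m, y)]
    for m in lower:
        if all(leq(b, m) for b in lower):
            return m
    return None

def join(x, y, elements, leq):
    """Least upper bound of x and y, or None if it does not exist."""
    upper = [j for j in elements if leq(x, j) and leq(y, j)]
    for j in upper:
        if all(leq(j, b) for b in upper):
            return j
    return None

def is_lattice(elements, leq):
    """True when every pair of distinct elements has both a meet and a join."""
    return all(
        meet(x, y, elements, leq) is not None
        and join(x, y, elements, leq) is not None
        for x, y in combinations(elements, 2)
    )

def divides(a, b):
    """Partial order used for illustration: a <= b when a divides b."""
    return b % a == 0

divisors_of_12 = [1, 2, 3, 4, 6, 12]
print(meet(4, 6, divisors_of_12, divides))  # greatest common divisor
print(join(4, 6, divisors_of_12, divides))  # least common multiple
print(is_lattice(divisors_of_12, divides))
print(is_lattice([1, 2, 3], divides))       # 2 and 3 have no upper bound here
```

The last call shows a partially ordered set that fails to be a lattice: within {1, 2, 3} the elements 2 and 3 have a meet (1) but no join.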
Galois lattice
A Galois lattice encompasses a dual ordering. It involves two nonempty sets: an actor set A
and an event set E. The two sets are linked by the affiliation patterns that assign actors to
events. Therefore, a Galois lattice is defined with a triple (A, E, I), in which I is the binary
relation in the matrix $A \times E$. The sub-matrix A, which is contained in the bipartite matrix
in Table 4_3, displays such an actor-by-event matrix. Now consider P(A) = {A1, A2, …},
a collection of subsets of A, and P(E) = {E1, E2, …}, a collection of subsets of E. The
relation I defines a mapping $B \to B^{\uparrow}$ from P(A) to P(E):
$$B^{\uparrow} = \{e \in E \mid (a, e) \in I \text{ for all } a \in B\} \qquad (4.35)$$
The above expression indicates that the mapping identifies all the
events with which a given actor or set of actors is affiliated. For example, the sub-matrix A in Table
4_5 suggests that actor 1 is affiliated with events 1 and 2, whereas actors 1 and 2 are
jointly affiliated only with event 1.
Conversely, a mapping $F \to F^{\downarrow}$ takes place from P(E) to P(A):
$$F^{\downarrow} = \{a \in A \mid (a, e) \in I \text{ for all } e \in F\} \qquad (4.36)$$
The above expression means that the mapping identifies all the actors that a given
event or set of events attracts. The sub-matrix A in Table 4_5 shows that event 2 attracts actors
1, 3, and 5, whereas events 2 and 3 jointly attract only actor 5.
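The two mappings can be sketched directly from their set definitions. In the Python fragment below, the relation I is assumed from the memberships quoted in the text, the event labels "e1" to "e3" are illustrative, and the function names `up` and `down` are hypothetical.

```python
# Affiliation relation I as a set of (actor, event) pairs,
# assumed from the chapter's five-actor, three-event example.
I = {(1, "e1"), (1, "e2"), (2, "e1"), (3, "e2"),
     (4, "e3"), (5, "e2"), (5, "e3")}
ACTORS = {1, 2, 3, 4, 5}
EVENTS = {"e1", "e2", "e3"}

def up(actor_subset):
    """Events with which every actor in the subset is affiliated."""
    return {e for e in EVENTS if all((a, e) in I for a in actor_subset)}

def down(event_subset):
    """Actors who are affiliated with every event in the subset."""
    return {a for a in ACTORS if all((a, e) in I for e in event_subset)}

print(sorted(up({1})))            # events of actor 1
print(sorted(up({1, 2})))         # events shared by actors 1 and 2
print(sorted(down({"e2"})))       # actors attracted by event 2
print(sorted(down({"e2", "e3"}))) # actors attracted by both events 2 and 3
```

Note that an empty subset maps to the full opposite set, because the "for all" condition is then satisfied vacuously.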
Combining both mappings, the Galois lattice shows how subsets of actors are
affiliated with subsets of events. As a convention, the universal upper bound of the lattice
contains all the elements in A, and its universal lower bound contains all the elements in
E. Figure 4_13 displays the Galois lattice pictorially, using the affiliation matrix in Table
4_5 as an example. The graph labels the three events A, B, and C and the five
actors 1, 2, 3, 4, and 5. Each point in the graph is labeled with both the actors and the
events that define it. At the bottom, the lattice contains the largest collection of
events. Moving up, it contains larger collections of actors and smaller collections of
events. The starting point at the bottom has all events (ABC) but an empty set of actors
($\emptyset$) because no actor attends all three events. Moving up, the lattice graph shows that
events A and B share actor 1, and events B and C share actor 5. Moving further up,
the subsets of events get smaller but the subsets of actors become larger. The graph shows that
event A attracts actors 1 and 2, event B attracts actors 1, 3, and 5, and event C
attracts actors 4 and 5. The top of the lattice lists all actors but no event,
indicating that no event attracts all actors in the network. In general, actors that are
incident to a line descending from events are affiliated with those events. In Figure 4_13,
actor 1 is incident on lines from events A and B, indicating that actor 1 is
affiliated with both events. Conversely, events that are incident on the lines ascending
from actors attract those actors. For example, Figure 4_13 shows that event C is incident
on the line ascending from actor 5, suggesting that event C attracts actor 5. Event B
receives lines from both actors 1 and 5, indicating that both actors attend event B.
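The points of the lattice described above can be enumerated by brute force. The following Python sketch applies the two mappings to every subset of events and keeps the distinct (actor set, event set) pairs; the relation I is assumed from the text, and this exhaustive approach is only one simple way to list the points of a small lattice, not necessarily the procedure used to draw Figure 4_13.

```python
from itertools import combinations

# Affiliation pairs assumed from the chapter's example (events labeled A, B, C).
I = {(1, "A"), (1, "B"), (2, "A"), (3, "B"), (4, "C"), (5, "B"), (5, "C")}
ACTORS = frozenset({1, 2, 3, 4, 5})
EVENTS = frozenset({"A", "B", "C"})

def up(actor_subset):
    """Events shared by every actor in the subset."""
    return frozenset(e for e in EVENTS if all((a, e) in I for a in actor_subset))

def down(event_subset):
    """Actors who attend every event in the subset."""
    return frozenset(a for a in ACTORS if all((a, e) in I for e in event_subset))

def lattice_points():
    """Distinct (actors, events) pairs obtained from all subsets of events."""
    points = set()
    for k in range(len(EVENTS) + 1):
        for subset in combinations(sorted(EVENTS), k):
            actors = down(subset)
            points.add((actors, up(actors)))
    return points

points = lattice_points()
# List points from the bottom (fewest actors) to the top (most actors).
for actors, events in sorted(points, key=lambda p: (len(p[0]), sorted(p[0]))):
    print(sorted(actors), sorted(events))
```

For this small example the enumeration yields seven points, from the bottom point (no actors, events ABC) to the top point (all five actors, no events).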
The Galois lattice also shows some affiliation patterns between events and
between actors. For example, Figure 4_13 illustrates that actor 2 does not participate in
any event without actor 1 (actor 2 participates only in event A, with which
actor 1 is also affiliated). Likewise, actor 4 participates only in event C, which also draws
actor 5. In contrast, neither actor 1 nor actor 5 restricts itself to the events that draw actor
2 or actor 4, respectively. In other words, the participation of actors 1 and 5 in an event is
not contingent upon the participation of other actors. From the events’ perspective, Figure
4_13 also shows that the three events contain distinctive sets of actors, in the sense that
the three sets of actors overlap but are not identical. A more elaborate network
with more actors and events may reveal containment structures, in which the actors
who are present in one event are always present in another event (Freeman and White
1993: 135). Thus, we can observe all three types of relations from the Galois lattice: (1)
the actor-event relation, (2) the actor-actor relation, and (3) the event-event relation.
Despite the clear advantage of using a Galois lattice to examine simultaneously the
structural features of all three types of relations, its application is limited in representing
large datasets. In this vein, the Galois lattice is similar to graphs, whose principal use is to
represent, not to reduce, data. Large datasets commonly embrace highly complex
structures that overwhelm the Galois lattice representation. Even with the reduced symbols in
Galois lattices (suggested by Freeman and White 1993), observers may encounter great
difficulty in disentangling the complex images generated by lattices representing large
datasets.
Galois Lattice of Network Cliques
Researchers suggest that some statistical or algebraic data reduction techniques
can be used to simplify the visual representation of Galois lattices (White 1996; Duquenne
1996). Freeman (1996) cogently recommends that Galois lattices be used to facilitate the
representation of cliques among social actors. The classic Luce-Perry definition (Luce
and Perry 1949) of the clique presumes a binary symmetric square matrix ($A \times A$) on a
social relation R. A clique C is a maximal subset containing three or more actors among
whom all pairs are linked by R. The term “maximal” means that no clique can be
contained in a larger clique. However, in practice, cliques often are too small, too numerous,
or too overlapping to reflect intuitive social group structures (Freeman 1996: 174). The
application of Galois lattices to cliques involves replacing the collection of events with a
collection of cliques. In a previous section we stated that a Galois lattice is defined with a
triple (A, E, I), in which A is a set of human actors, E is a set of events, and I is a
binary relation in $A \times E$. Applying the Galois lattice to cliques, the lattice is
redefined with another triple (A, C, I), in which A retains its original meaning,
C indicates a set of cliques, and I is now the binary relation in $A \times C$.
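The Luce-Perry definition can be illustrated with a brute-force Python sketch that lists every maximal subset of three or more actors in which all pairs are linked. The function name and both edge lists are assumptions made for illustration: the first uses the co-membership ties from the chapter's example as the relation R, and the second is a hypothetical graph with two overlapping cliques. Exhaustive enumeration is practical only for very small networks.

```python
from itertools import combinations

def luce_perry_cliques(nodes, edges, min_size=3):
    """Maximal subsets of at least min_size nodes in which all pairs are linked."""
    linked = {frozenset(e) for e in edges}

    def complete(subset):
        return all(frozenset(pair) in linked for pair in combinations(subset, 2))

    complete_sets = [
        set(s)
        for k in range(min_size, len(nodes) + 1)
        for s in combinations(nodes, k)
        if complete(s)
    ]
    # Keep only sets that are not properly contained in a larger complete set.
    return [s for s in complete_sets if not any(s < t for t in complete_sets)]

actors = [1, 2, 3, 4, 5]

# Relation R assumed to be the actor co-membership ties of the example.
comembership = [(1, 2), (1, 3), (1, 5), (3, 5), (4, 5)]
print(luce_perry_cliques(actors, comembership))

# Hypothetical relation with two cliques that overlap at actor 3.
overlapping = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5)]
print(luce_perry_cliques(actors, overlapping))
```

Once the cliques are found, an actor-by-clique membership matrix can take the place of the actor-by-event matrix when building the lattice.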
Following a layout similar to that of the Galois lattice of events, the Galois lattice of
cliques places the larger collections of cliques toward the bottom and the larger collections
of actors toward the top. Assuming that Figure 4_13 shows a Galois lattice of cliques
instead of a lattice of events, at the bottom lie the three cliques ABC with the null set,
indicating that no actor belongs to all three cliques. Moving up, the graph shows
fewer cliques and larger sets of actors. Actor 1 belongs to both cliques A and B, whereas
actor 5 belongs to both cliques B and C. Moving further up, clique A contains actors 1
and 2, clique B includes actors 1, 3, and 5, and clique C has actors 4 and 5. The
entire set of actors lies at the top of the graph, with the null set indicating that no clique
encompasses all five actors.
Freeman (1996) discussed several important structural properties of Galois lattices
of network cliques. First, two overlapping cliques will be linked by descending lines that
converge at some labeled point lower in the lattice, whereas two non-overlapping cliques
will be linked only at the unlabeled universal lower bound with the null set ($\emptyset$). Assuming
that Figure 4_13 describes a Galois lattice of network cliques, cliques A and B converge
at a lower point labeled with actor 1, indicating that the two cliques overlap by
sharing actor 1. In contrast, cliques A and C do not overlap, as indicated by
their converging point at the unlabeled universal lower bound.
Second, Freeman (1996) characterized the position of individual actors in the
clique lattice with several key dimensions such as chain, length, height, and depth. A
chain is a sequence made up entirely of ascending lines or entirely of descending lines
leading from one element to another. The length of a chain is the number of lines it
contains. The height of an actor is the length of the chain ascending from the universal
lower bound to that actor. In contrast, the depth of an actor is the length of the chain
descending from the universal upper bound to that actor. Therefore, actors who appear near the bottom of
the lattice have great depth but low height, whereas actors near the top of the
lattice have great height but low depth. Those with great depth but low height are
deeply embedded in the network. They are involved in several cliques, but their affiliations
with those cliques do not depend on others’ affiliations. In this sense, they are the core
members of the cliques. In contrast, those with great height but low depth are involved with
the network superficially. Their affiliations with certain cliques depend on others’
affiliations. In this sense, they are the peripheral members of the cliques. Examining
Figure 4_13 with those insights, we can ascertain that actors 1 and 5 are core members of
cliques AB and BC respectively, whereas actors 2, 3, and 4 are peripheral actors in those
cliques.
Correspondence Analysis
While lattices represent an algebraic approach to displaying affiliation networks,
correspondence analysis uses a scaling technique to achieve the “joint display” of actors
and events in an affiliation network (Wasserman and Faust 1994: 291-343; Faust 2005).
Correspondence analysis is accomplished mainly through a mathematical technique
called singular value decomposition (SVD). Here we provide a brief
description of SVD, focusing on the issues that are directly relevant to the affiliation matrix
(for details on SVD, see Strang 1996).
SVD is a decomposition of a matrix A of size $g \times h$:
$$A = X \Lambda Y^{T} \qquad (4.37)$$
The equation contains $\Lambda$, which is a diagonal matrix of singular values {λk}; X,
the matrix of left singular vectors of size $g \times h$; and Y, the matrix of right singular
vectors of size $h \times h$. The number of nonzero singular values is the rank of A, denoted
commonly as W. The SVD uses the W dimensions to scale the actors and events in a graphic
display that approximates their entries in A.
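The decomposition can be tried out with NumPy's `numpy.linalg.svd`. The sketch below is a minimal illustration that assumes the five-by-three affiliation matrix of the running example; the variable names follow the notation of the equation above and are otherwise arbitrary.

```python
import numpy as np

# Affiliation matrix assumed from the chapter's example (g = 5 actors, h = 3 events).
A = np.array([
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 1],
], dtype=float)

# Reduced SVD: X is g x h, Yt is the h x h transpose of Y,
# and singular_values holds the diagonal of Lambda in descending order.
X, singular_values, Yt = np.linalg.svd(A, full_matrices=False)

# Multiplying the three factors back together recovers A (up to rounding).
reconstructed = X @ np.diag(singular_values) @ Yt

# The rank W is the number of nonzero singular values.
W = int(np.sum(singular_values > 1e-10))

print(X.shape, Yt.shape, W)
print(np.allclose(reconstructed, A))
```

With `full_matrices=False`, NumPy returns the left singular vectors in the $g \times h$ shape used in the text rather than as a full $g \times g$ matrix.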
The SVD in correspondence analysis involves decomposition of the normalized
version of A. There are two methods to produce the entry values of the normalized A.
One is to divide the entries in the original matrix A by the square root of the product of the
row and column marginal totals. The other method involves computing the product of
three matrices: two diagonal matrices, $R^{-1/2}$ and $C^{-1/2}$, and the matrix A (Faust 2005).
In particular,
$$R^{-1/2} = \operatorname{diag}\left(\frac{1}{\sqrt{a_{i+}}}\right) \qquad (4.38)$$
$$C^{-1/2} = \operatorname{diag}\left(\frac{1}{\sqrt{a_{+j}}}\right) \qquad (4.39)$$
where $a_{i+}$ and $a_{+j}$ are the row and column marginal totals of A.
Multiplying the three matrices, $R^{-1/2} A C^{-1/2}$, produces the normalized version of A,
which can be obtained also by dividing the entries in the original matrix A by the square root
of the product of the row and column marginal totals. Correspondence analysis involves
singular value decomposition of the resulting matrix $R^{-1/2} A C^{-1/2}$:
$$R^{-1/2} A C^{-1/2} = X \Lambda Y^{T} \qquad (4.40)$$
Correspondence analysis produces three sets of information: a set of g scores for
rows of the matrix, U = {uik}, for i = 1,2,…g and k = 1,2,…w; a set of h scores for
columns of the matrix, V = {vjk}, for j = 1,2,…h and k = 1,2,…w; and the singular values
$\Lambda$ = {λk}, for k = 1,2,…w, which indicate the importance of each dimension. To
achieve a joint display of row and column entries, correspondence analysis computes the
score for an actor as a function of the weighted average of the scores for the events with
which it is affiliated, and the score for an event as a function of the weighted average of
the scores of its constituent actors. The formulas are as follows:
$$\lambda_k u_{ik} = \sum_{j=1}^{h} \frac{a_{ij}}{a_{i+}} v_{jk} \qquad (4.41)$$
and
$$\lambda_k v_{jk} = \sum_{i=1}^{g} \frac{a_{ij}}{a_{+j}} u_{ik} \qquad (4.42)$$
In both equations, the aij is the entry value of the ith row and the jth column in the
original matrix A.
To illustrate, we use our previous example of the affiliation network data
consisting of five actors and three events. Table 4_6 shows the matrix A in both its
original and normalized versions. Table 4_7 shows the UCINET solution to the singular
value decomposition of the normalized A, which produces three sets of scores uik, vjk,
and λk. Note that the actors’ scores are a function of the scores of the events with
which the actors are affiliated. Likewise, the events’ scores are a function of the scores of
the actors they attract. In particular, actor 1’s score (-0.661) in the first dimension is
the weighted average of the scores of the two events with which it is affiliated, divided by the
singular value (λ1):
$$\frac{\frac{1}{2}(-1.146) + \frac{1}{2}(0)}{0.866} = -0.661$$
Likewise, the score for event 1 in dimension 1 (-1.146) is the weighted average of the
scores of the actors it attracts (actors 1 and 2), divided by the singular value:
$$\frac{\frac{1}{2}(-0.661) + \frac{1}{2}(-1.323)}{0.866} = -1.146$$
Interested readers can verify that the scores of the other actors and events match the scores
computed from the scores of the events with which the actor is affiliated, or from the scores
of the actors whom the event attracts.
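The computations in this example can be retraced with a short NumPy sketch. It assumes the affiliation matrix of the running example, drops the trivial first dimension of the normalized matrix (whose singular value is 1), and rescales the singular vectors by the square root of the grand total over the marginal totals and by the singular value. That scaling is an assumption chosen because it matches the magnitudes quoted above (0.661, 1.323, 1.146); it is not a statement of UCINET's internal procedure, and the signs of the scores on a dimension are arbitrary in any SVD.

```python
import numpy as np

# Affiliation matrix assumed from the chapter's example.
A = np.array([
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 1],
], dtype=float)

row_tot = A.sum(axis=1)  # a_{i+}
col_tot = A.sum(axis=0)  # a_{+j}
total = A.sum()

# Normalized A: each entry divided by the square root of the product
# of its row and column marginal totals.
S = A / np.sqrt(np.outer(row_tot, col_tot))

X, sv, Yt = np.linalg.svd(S, full_matrices=False)

# Drop the trivial first dimension (singular value 1).
lam = sv[1:]
X_k = X[:, 1:]
Y_k = Yt.T[:, 1:]

# Actor scores u and event scores v (assumed scaling, see the lead-in).
u = np.sqrt(total / row_tot)[:, None] * X_k * lam
v = np.sqrt(total / col_tot)[:, None] * Y_k * lam

print(np.round(lam, 3))
print(np.round(np.abs(u[:, 0]), 3))  # magnitudes of actor scores, dimension 1
print(np.round(np.abs(v[:, 0]), 3))  # magnitudes of event scores, dimension 1

# Check the two weighted-average relations between u, v, and lambda.
print(np.allclose(lam * u, (A / row_tot[:, None]) @ v))
print(np.allclose(lam * v, (A / col_tot).T @ u))
```

The last two lines confirm numerically that each actor's score times the singular value equals the weighted average of its events' scores, and vice versa.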
Figure 4_14 shows the visual graph of the correspondence analysis of the
affiliation network. Numbers in parentheses are the scores in both dimensions, which
should be the same as their corresponding numbers in Table 4_7. The X-axis and Y-axis
denote the first dimension and the second dimension respectively (λ). From the graphic
display, we determine that actor 1, who is affiliated with events 1 and 2, and actor 5, who
is affiliated with events 2 and 3, are central in both dimensions and are thus located in the center
of the graph. Event 2 appears at the center of the first dimension, possibly
because it attracts the most actors (actors 1, 3, and 5), but it is slightly further away
from the center of the second dimension than events 1 and 3, each of which attracts only two
actors. Actor 3 is located at the center of the first dimension because of its affiliation
with the central event (event 2), but compared with all other actors, actor 3 is also located at
the most peripheral position on the second dimension, possibly because it attends only one
event. Actors 2 and 4 are in peripheral locations in both dimensions, reflecting their
respective affiliations with only one event. Likewise, events 1 and 3 are also peripheral in
both dimensions, suggesting that they have attracted fewer party-goers than event 2.
References
Alba, Richard D. 1973 “A Graph-Theoretic Definition of a Sociometric Clique” Journal
of Mathematical Sociology 3:113-126
Aldenderfer, Mark S. and Roger K. Blashfield. 1984. Cluster Analysis. Beverly Hills.:
Sage Publications
Alderson, Arthur and Jason Beckfield. 2004. “Power and Position in the World City
System” American Journal of Sociology 109/4: 811-851
Anderson, Carolyn, Stanley Wasserman and Bradley Crouch. 1999. “A p* Primer: Logit
Models for Social Networks” Social Networks 21:37-66
Anderson, James G and Stephen Jay. 1985. “Computers and Clinical Judgement: The
Role of Physician Networks” Social Science and Medicine 20/10: 969-979
Blau, Peter M; Ruan, Danching; Ardelt, Monika. 1991. “Interpersonal Choice and
Networks in China” Social Forces 69/4: 1037-1062
Boorman, Scott and Harrison White. 1976. “Social Structure from Multiple Networks. II.
Role Structures” American Journal of Sociology 81/6: 1384-1446
Borgatti, Stephen and Martin Everett. 1992. “Notions of Position in Social
Network Analysis” Sociological Methodology 22:1-35
Borgatti, Stephen and Martin Everett. 1993. “Two Algorithms for Computing Regular
Equivalence” Social Networks 15/4: 361-376
Borgatti, Stephen, Martin Everett and Paul Shirey. 1990. “LS Sets, Lambda Sets and Other
Cohesive Subsets” Social Networks 12: 337-357
Borgatti, Stephen and Martin Everett. 1989. “The Class of All Regular Equivalences:
Algebraic Structure and Computation” Social Networks 11/1: 65-88
Breiger, Ronald L. 1990. “Social Control and Social Networks: A Model from
Georg Simmel,” Pp. 453-476 in Craig Calhoun, Marshall W. Meyer, and
W. Richard Scott (eds.), Structures of Power and Constraint: Papers in Honor of
Peter M. Blau. New York: Cambridge University Press
Burt, Ronald S. 1992. Structural Holes: the Social Structure of Competition. Cambridge,
Mass.: Harvard University Press
Burt, Ronald S. 1979. “Disaggregating the Effect on Profits in Manufacturing Industries
of Having Imperfectly Competitive Consumers and Suppliers” Social Science
Research 8/2: 120-143
Burt, Ronald S. 1978. “Cohesion versus Structural Equivalence as a Basis for Network
Subgroups” Sociological Methods and Research 7/2: 189-212
Crouch, Bradley and Stanley Wasserman 1998 “A Practical Guide to Fitting Social
Network Models via Logistic Regression” Connections 21:87-101
Doreian, Patrick, Vladimir Batagelj, Anuska Ferligoj. 2005. Generalized Blockmodeling
Cambridge, U.K. Cambridge University Press
Doreian, Patrick and Katherine Woodard 1994 “Defining and Locating Cores and
Boundaries of Social Networks” Social Networks, 16/4:267-293
Dunbar, RIM and M Spoor 1995 “Social Networks, Support Cliques and Kinship”
Human Nature 6: 273-290
Duquenne, Vincent. 1996. “On Lattice Approximations: Syntactic Aspects” Social
Networks 18/3:189-199
Everett, Martin. 1985. “Role Similarity and Complexity in Social Networks” Social
Networks 7/4:353-359
Everett, Martin, John Boyd, and Stephen Borgatti. 1990. “Ego-Centered and Local Roles:
A Graph Theoretic Approach” The Journal of Mathematical Sociology 15: 163-172
Faust, Katherine. 2005. “Using Correspondence Analysis for Joint Displays of Affiliation
Networks.” in Models and Methods in Social Network Analysis, edited by
Carrington, Peter J., John Scott, and Stanley Wasserman. New York: Cambridge
University Press.
Faust, Katherine. 1997. “Centrality in Affiliation Networks” Social Networks 19/2: 157-191
Faust, Katherine. 1988. “Comparison of Methods for Positional Analysis: Structural and
General Equivalences” Social Networks 10/4: 313-341
Feldman-Savelsberg, Pamela, Flavien Ndonko and Song Yang. 2005. “Remembering ‘the
troubles:’ Reproductive Insecurity and the Management of Memory in
Cameroon” Africa 75/1: 10-29
Frank, Ove and David Strauss. 1986. “Markov Graphs” Journal of the American
Statistical Association, 81:832—842
Freeman, Linton 2005 “Graphic Techniques for Exploring Social Network Data” Pp 248-270
in Models and Methods in Social Network Analysis, edited by Peter J.
Carrington, John Scott and Stanley Wasserman Cambridge MA: Cambridge
University Press
Freeman, Linton 2000 “Visualizing social networks” Journal of Social Structure 1:1-15
Freeman, Linton C. 1992. “The Resurrection of Cliques: Application of Galois Lattices”
BMS, Bulletin de Methodologie Sociologique 37: 3-24
Freeman, Linton. 1979. “Centrality in Social Networks: I. Conceptual Clarification”
Social Networks 1: 215-239
Freeman, Linton. 1977. “A Set of Measures of Centrality Based Upon Betweenness”
Sociometry 40:35-41
Freeman, Linton and Cynthia Webster. 1994. “Interpersonal Proximity in Social
and Cognitive Space” Social Cognition, 12/3: 223-247
Freeman, Linton C and Douglas White. 1993. “Using Galois Lattices to Represent
Network Data” Sociological Methodology 23: 127-146
Freeman, Linton, Stephen Borgatti and Douglas White. 1991. “Centrality in Valued
Graphs: A Measure of Betweenness Based on Network Flow” Social Networks 13:
141-154
Freeman, Linton C, Kimball Romney, and Sue Freeman. 1987. “Cognitive Structure and
Informant Accuracy” American Anthropologist, 89/2: 310-325
Holland, Paul and Samuel Leinhardt. 1981 “An Exponential Family of Probability
Distributions for Directed Graphs.” Journal of the American Statistical
Association. 76:33-65
Knoke, David, George W. Bohrnstedt, Alisa Potter Mee. 2002. Statistics for Social Data
Analysis 4th Edition, Wadsworth Publishing
Knoke, David and Ronald Burt. 1983. Prominence. Pp 195-222 In Applied Network
Analysis: A Methodological Introduction, edited by Burt, Ronald and Michael J.
Miner, Beverly Hills CA: Sage
Knoke, David and David Rogers. 1979. “A Blockmodel Analysis of Interorganizational
Networks” Sociology and Social Research 64/1: 28-52
Kruskal, Joseph B. and Myron Wish. 1978. Multidimensional Scaling. Beverly Hills, Calif.:
Sage Publications
Luce, Duncan and Albert D. Perry. 1949. “A Method of Matrix Analysis of Group
Structure.” Psychometrika 14: 95—116
Marsden, Peter 2002 “Egocentric and Sociocentric Measures of Network Centrality”
Social Networks 24: 407-422.
McPherson, Miller. 1982. “Hypernetwork Sampling: Duality and Differentiation among
Voluntary Organizations” Social Networks 3/9:225-249
Mokken, Robert J. 1979. “Cliques, Clubs and Clans” Quality and Quantity 13:161-173
Moreno, Jacob L 1953 (Revised Edition) Who Shall Survive? Foundations of
Sociometry, Group Psychotherapy, and Sociodrama Beacon, NY: Beacon House.
Nowicki, Krzysztof and Tom Snijders. 2001. “Estimation and Prediction for Stochastic
Blockstructures” Journal of the American Statistical Association 96/455: 1077-1088
Pampel , Fred C. 2000. Logistic Regression: a Primer Thousand Oaks, Calif.: Sage
Publications
Pattison, Philippa and Stanley Wasserman. 1999. “Logit Models and Logistic Regressions
for Social Networks: II. Multivariate Relations” British Journal of Mathematical
and Statistical Psychology 52/2: 169-193
Peay, Edmund R. 1980. “Connectedness in a General Model for Valued Networks”
Social Networks 2: 385-410
Robins, Garry, Philippa Pattison and Stanley Wasserman. 1999. “Logit Models
and Logistic Regressions for Social Networks: III. Valued Relations”
Psychometrika 64/3: 371-394
Sabidussi, Gert. 1966 “The Centrality Index of a Graph” Psychometrika 31:581-603
Seidman, Stephen 1983 “Network Structure and Minimum Degree” Social Networks 5:
269-284
Strang, Gilbert. 1988 Linear Algebra and its Applications. 3rd ed. San Diego : Harcourt,
Brace, Jovanovich
Strauss, David and Michael Ikeda 1990 “Pseudolikelihood Estimation for Social
Networks” Journal of the American Statistical Association 85/409: 204-212
Wasserman, Stanley and Philippa Pattison. 1996. “Logit Models and Logistic
Regressions for Social Networks: I. An Introduction to Markov Graphs and p*”
Psychometrika 61/3: 401-425
Wasserman, Stanley and Katherine Faust 1994 Social Network Analysis: Methods and
Applications Cambridge; New York: Cambridge University Press
White, Douglas and Karl Reitz. 1983. “Graph and Semigroup Homomorphisms on
Networks of Relations” Social Networks 5/2:193-234
White, Douglas. 1996. “Statistical Entailments and the Galois Lattice” Social Networks
18/3: 201-215
White, Harrison, Scott Boorman and Ronald Breiger. 1976. “Social Structure from
Multiple Networks. I. Blockmodels of Roles and Positions” American Journal of
Sociology 81/4: 730-780
Winship, Christopher and Michael Mandel. 1983. “Roles and Positions: A Critique and
Extension of the Blockmodeling Approach” Sociological Methodology 14: 314-344
Wu, Lawrence L. 1983. “Local Blockmodel Algebras for Analyzing Social Networks”
Sociological Methodology 14:272-313