An Introduction to Social Network Analysis

advertisement
Social Network Analysis
American Sociological Association
San Francisco, August 2004
James Moody
Introduction
We live in a connected world:
“To speak of social life is to speak of the association between people –
their associating in work and in play, in love and in war, to trade or to
worship, to help or to hinder. It is in the social relations men establish that
their interests find expression and their desires become realized.”
Peter M. Blau
Exchange and Power in Social Life, 1964
"If we ever get to the point of charting a whole city or a whole nation, we
would have … a picture of a vast solar system of intangible structures,
powerfully influencing conduct, as gravitation does in space. Such an
invisible structure underlies society and has its influence in determining the
conduct of society as a whole."
J.L. Moreno, New York Times, April 13, 1933
These patterns of connection form a social space, that can be seen in multiple
contexts:
Introduction
Source: Linton Freeman “See you in the funny pages” Connections, 23, 2000, 32-42.
Introduction
High Schools as Networks
Introduction
And yet, standard social science analysis methods do not take this space
into account.
“For the last thirty years, empirical social research has been
dominated by the sample survey. But as usually practiced, …, the
survey is a sociological meat grinder, tearing the individual from his
social context and guaranteeing that nobody in the study interacts
with anyone else in it.”
Allen Barton, 1968 (Quoted in Freeman 2004)
Moreover, the complexity of the relational world makes it impossible to
identify social connectivity using only our intuitive understanding.
Social Network Analysis (SNA) provides a set of tools to empirically
extend our theoretical intuition of the patterns that construct social
structure.
Introduction
Why do Networks Matter?
Local vision
Introduction
Why do Networks Matter?
Local vision
Introduction
Why networks matter:
• Intuitive: “goods” travel through contacts between actors,
which can reflect a power distribution or influence attitudes
and behaviors. Our understanding of social life improves if
we account for this social space.
• Less intuitive: patterns of inter-actor contact can have effects
on the spread of “goods” or power dynamics that could not be
seen focusing only on individual behavior.
Introduction
Social network analysis is:
•a set of relational methods for systematically understanding
and identifying connections among actors. SNA
•is motivated by a structural intuition based on ties linking
social actors
•is grounded in systematic empirical data
•draws heavily on graphic imagery
•relies on the use of mathematical and/or computational
models.
•Social Network Analysis embodies a range of theories
relating types of observable social spaces and their relation to
individual and group behavior.
1.
2.
3.
4.
5.
Introduction
Social Network Data
a. Basic data Elements
b. Collecting network data
c. Basic data structures
Measuring Networks
a. Flows within of goods in networks
1) Topology
2) Time
b. Structure of Social Space
1) Small Worlds, Scale-Free, Triads
2) Cohesive Groups
3) Role Positions
Modeling with Networks
a. Modeling Behaviors with Networks
1) Peer attribute models
2) Network Autocorrelation Models
3) Dyad / QAP Models
b. Modeling Network Network Structure
1) QAP for network structure
2) Exponential Random Graph Models
SNA Computer Programs
Social Network Data
The unit of interest in a network are the combined sets of
actors and their relations.
We represent actors with points and relations with lines.
Actors are referred to variously as:
Nodes, vertices or points
Relations are referred to variously as:
Edges, Arcs, Lines, Ties
Example:
b
a
d
c
e
Social Network Data
In general, a relation can be:
Binary or Valued
Directed or Undirected
b
b
d
a
c
a
e
c
1
a
b
d
1
3
c
Undirected, Valued
e
Directed, binary
Undirected, binary
b
d
d
2
4
e
a
c
Directed, Valued
e
Social Network Data
Social network data are substantively divided by the number of
modes in the data.
1-mode data represents edges based on direct contact between
actors in the network. All the nodes are of the same type (people,
organization, ideas, etc). Examples:
Communication, friendship, giving orders, sending email.
1-mode data are usually singly reported (each person reports on
their friends), but you can use multiple-informant data, which is
more common in child development research (Cairns and
Cairns).
Social Network Data
Social network data are substantively divided by the number of
modes in the data.
2-mode data represents nodes from two separate classes, where
all ties are across classes. Examples:
People as members of groups
People as authors on papers
Words used often by people
Events in the life history of people
The two modes of the data represent a duality: you can project
the data as people connected to people through joint membership
in a group, or groups to each other through common membership
There may be multiple relations of multiple types connecting
your nodes.
Social Network Data
We can examine networks across multiple levels:
1) Ego-network
- Have data on a respondent (ego) and the people they are connected to
(alters). Example: 1985 GSS module
- May include estimates of connections among alters
2) Partial network
- Ego networks plus some amount of tracing to reach contacts of
contacts
- Something less than full account of connections among all pairs of
actors in the relevant population
- Example: CDC Contact tracing data for STDs
Social Network Data
We can examine networks across multiple levels:
3) Complete or “Global” data
- Data on all actors within a particular (relevant) boundary
- Never exactly complete (due to missing data), but boundaries are set
-Example: Coauthorship data among all writers in the social
sciences, friendships among all students in a classroom
For the most part, I will be discussing techniques surrounding global
networks today, though I will briefly mention some standard uses of
ego-network data.
Social Network Data
Collecting Network Data
Data capture any connection between the nodes. Sources include
surveys, published accounts, special informants, etc.
In general, you can only make conclusions about relations among the
set of nodes you have collected, so it is important to observe as
much of the network as possible.
See W&F, chap 2 on different types of data collection
Social Network Data
Collecting Network Data
If you use surveys to collect data, some general rules of thumb:
a)
Network data collection can be time consuming. It is better (I think) to
have breadth over depth. Having detailed information on <50% of the
sample will make it very difficult to draw conclusions about the general
network structure.
b) Question format:
• If you ask people to recall names (an open list format), fatigue will
result in under-reporting
• If you ask people to check off names from a full list, you can often get
over-reporting
c) It is common to limit people to ~5 nominations. This will bias network stats
for stars, but is sometimes the best choice to avoid fatigue.
d) Concrete relational indicators are best (who did you talk to?) over attitudes
that are harder to define (who do you like?)
Social Network Data
Collecting Network Data
Existing Sources of Social Network Data
1) Check INSNA: The International Network of Social Network Analysis
2) Many secondary sources (particularly for 2-mode data)
3) National Longitudinal Survey of Adolescent Health (Add Health)
Social Network Data
Basic Data Structures
Working with pictures.
No standard way to draw a sociogram: each of these are equal:
Social Network Data
Basic Data Structures
In general, graphs are cumbersome to work with analytically, though there is a
great deal of good work to be done on using visualization to build network
intuition.
I recommend using layouts that optimize on the feature you are most interested
in, and find that either a hierarchical layout or a force-directed layout are best.
Social Network Data
Basic Data Structures
From pictures to matrices
b
b
d
a
c
e
Undirected, binary
a
b
1
a
b 1
c
1
d
e
c
d
1
1
c
e
a
1
a
b 1
c
1
d
e
1
1
a
e
Directed, binary
1
1
d
b
1
c
1
d
e
1
1
1
Social Network Data
Basic Data Structures
From matrices to lists
a
a
b 1
c
d
e
b
1
c
d
e
1
1
1
1
1
1
1
1
Adjacency List
ab
bac
cbde
dce
ecd
Arc List
ab
ba
bc
cb
cd
ce
dc
de
ec
ed
Measuring Networks: Flow
“Goods” flow through networks:
Measuring Networks: Flow
In addition to the simple probability that one actor passes information on
to another (pij), two factors affect flow through a network:
Topology
-the shape, or form, of the network
- Example: one actor cannot pass information to another unless they
are either directly or indirectly connected
Time
- the timing of contact matters
- Example: an actor cannot pass information he has not receive yet
Measuring Networks: Flow
Two features of the network’s topology are known to be important: connectivity
and centrality
Connectivity refers to how actors in one part of the network are connected to
actors in another part of the network.
• Reachability: Is it possible for actor i to reach actor j? This can only be
true if there is a chain of contact from one actor to another.
• Distance: Given they can be reached, how many steps are they from
each other?
• Number of paths: How many different paths connect each pair?
Measuring Networks: Flow
Without full network data, you can’t distinguish actors with limited
information potential from those more deeply embedded in a setting.
c
b
a
Measuring Networks: Flow
Reachability
Indirect connections are what make networks systems. One actor can
reach another if there is a path in the graph connecting them.
b
a
a
d
c
b
e
f
c
f
d
e
Paths can be directed, leading to a distinction between “strong” and “weak”
components
Measuring Networks: Flow
Reachability
Reachability
If you can trace a sequence of relations from one actor to another,
then the two are reachable. If there is at least one path connecting
every pair of actors in the graph, the graph is connected and is called
a component.
Intuitively, a component is the set of people who are all connected by
a chain of relations.
Measuring Networks: Flow
Reachability
This example
contains many
components.
Measuring Networks: Flow
Distance & number of paths
Distance is measured by the (weighted) number of relations separating a pair:
Actor “a” is:
1 step from 4
2 steps from 5
3 steps from 4
4 steps from 3
5 steps from 1
a
Measuring Networks: Flow
Distance & number of paths
Paths are the different routes one can take. Node-independent paths are
particularly important.
b
There are 2 independent
paths connecting a and
b.
There are many nonindependent paths
a
Measuring Networks: Flow
Distance & number of paths
Probability of transfer
by distance and number of paths, assume a constant pij of 0.6
1.2
1
probability
10 paths
0.8
5 paths
0.6
2 paths
0.4
1 path
0.2
0
2
3
4
Path distance
5
6
Reachability in Colorado Springs
(Sexual contact only)
•High-risk actors over 4 years
•695 people represented
•Longest path is 17 steps
•Average distance is about 5 steps
•Average person is within 3 steps
of 75 other people
•137 people connected through 2
independent paths, core of 30
people connected through 4
independent paths
(Node size = log of degree)
Measuring Networks: Flow
Centrality
Centrality refers to (one dimension of) location, identifying where an actor
resides in a network.
• For example, we can compare actors at the edge of the network to actors
at the center.
• In general, this is a way to formalize intuitive notions about the
distinction between insiders and outsiders.
Measuring Networks: Flow
Centrality
At the individual level, one dimension of position in the network can be
captured through centrality.
Conceptually, centrality is fairly straight forward: we want to identify
which nodes are in the ‘center’ of the network. In practice, identifying
exactly what we mean by ‘center’ is somewhat complicated, but
substantively we often have reason to believe that people at the center
are very important.
Three standard centrality measures capture a wide range of
“importance” in a network:
•Degree
•Closeness
•Betweenness
Measuring Networks: Flow
Centrality
The most intuitive notion of centrality focuses on degree. Degree is
the number of ties, and the actor with the most ties is the most
important:
C D  d (ni )  X i    X ij
j
Measuring Networks: Flow
Centrality
If we want to measure the degree to which the graph as a whole is centralized,
we look at the dispersion of centrality:
Simple: variance of the individual centrality scores.
g

2
2
S D   (CD (ni )  Cd )  / g
 i 1

Or, using Freeman’s general formula for centralization (which ranges from 0 to 1):

C


g
CD
i 1
(n )  CD (ni )
*
D
[( g  1)( g  2)]

Measuring Networks: Flow
Centrality
Freeman: 1.0
Variance: 3.9
Degree Centralization Scores
Freeman: .02
Variance: .17
Freeman: .07
Variance: .20
Freeman: 0.0
Variance: 0.0
Measuring Networks: Flow
Centrality
A second measure of centrality is closeness centrality. An actor is considered
important if he/she is relatively close to all other actors.
Closeness is based on the inverse of the distance of each actor to every other actor
in the network.
Closeness Centrality:


Cc (ni )   d (ni , n j )
 j 1

g
1
Normalized Closeness Centrality
CC' (ni )  (CC (ni ))( g  1)
Measuring Networks: Flow
Centrality
Closeness Centrality in the examples
C=0.0
C=1.0
C=0.36
C=0.28
Measuring Networks: Flow
Centrality
Betweenness Centrality:
Model based on communication flow: A person who lies on
communication paths can control communication flow, and is thus important.
Betweenness centrality counts the number of shortest paths between i and k
that actor j resides on.
b
a
C d e f g h
Measuring Networks: Flow
Centrality
Betweenness Centrality:
C B (ni )   g jk (ni ) / g jk
j k
Where gjk = the number of geodesics connecting jk, and
gjk(ni) = the number that actor i is on.
Usually normalized by:
C (ni )  C B (ni ) /[( g  1)( g  2) / 2]
'
B
Measuring Networks: Flow
Centrality
Betweenness Centrality:
Centralization: 1.0
Centralization: .59
Centralization: .31
Centralization: 0
Measuring Networks: Flow
Centrality
Actors that appear very
different when seen
individually, are
comparable in the global
network.
(Node size proportional to betweenness centrality )
Measuring Networks: Flow
Time
Two factors that affect network flows:
Topology
- the shape, or form, of the network
- simple example: one actor cannot pass information to
another unless they are either directly or indirectly
connected
Time
- the timing of contacts matters
- simple example: an actor cannot pass information he has
not yet received.
Measuring Networks: Flow
Time
Timing in networks
A focus on contact structure has often slighted the importance of network
dynamics,though a number of recent pieces are addressing this.
Time affects networks in two important ways:
1) The structure itself evolves, in ways that will affect the topology an
thus flow.
2) The timing of contact constrains information flow
Measuring Networks: Flow
Time
Drug Relations, Colorado Springs, Year 1
Data on drug users in
Colorado Springs, over
5 years
Measuring Networks: Flow
Time
Drug Relations, Colorado Springs, Year 2
Current year in red, past relations in gray
Measuring Networks: Flow
Time
Drug Relations, Colorado Springs, Year 3
Current year in red, past relations in gray
Measuring Networks: Flow
Time
Drug Relations, Colorado Springs, Year 4
Current year in red, past relations in gray
Measuring Networks: Flow
Time
Drug Relations, Colorado Springs, Year 5
Current year in red, past relations in gray
Measuring Networks: Flow
Time
What impact does timing have on flow through the network?
C
A
2-5
8-9
E
B
D
Numbers above lines indicate contact periods
3-5
F
Measuring Networks: Flow
Time
The path graph for the hypothetical contact network
A
C
E
D
F
B
While clearly important, this is not often handled well by current software.
Measuring Networks: Structure & Social Space
The second broad division for measuring networks steps back to
generalized features of the global network.
These factors almost always are of interest because of what they imply
about how goods move through the network, but have resulted in a distinct
line of methods and substantive research.
We focus on 3 such factors today:
1) Basic structure of large-scale networks
2) Cohesive Peer Groups
3) Identifying Role positions (blockmodels)
Measuring Networks: Large-Scale Models
Small World Networks
Based on Milgram’s (1967) famous work,
the substantive point is that networks
are structured such that even when
most of our connections are local,
any pair of people can be connected
by a fairly small number of relational
steps.
Works on 2 parameters:
1) The Clustering Coefficient (c) =
average proportion of closed
triangles
2) The average distance (L)
separating nodes in the network
Measuring Networks: Large-Scale Models
Small World Networks
C=Large, L is Small =
SW Graphs
•High probability that a node’s contacts are connected to each other.
•Small average distance between nodes
Measuring Networks: Large-Scale Models
Small World Networks
In a highly clustered, ordered
network, a single random
connection will create a shortcut
that lowers L dramatically
Watts demonstrates that small
world properties can occur in
graphs with a surprisingly small
number of shortcuts
Diffusion / flow implications are
unclear, but seem similar to a
random graphs where local
clusters are reduced to a single
point.
Measuring Networks: Large-Scale Models
Scale-Free Networks
Across a large number of substantive
settings, Barabási points out that the
distribution of network involvement
(degree) is highly and characteristically
skewed.
Measuring Networks: Large-Scale Models
Scale Free Networks
Many large networks are characterized by a highly skewed distribution of the
number of partners (degree)
Measuring Networks: Large-Scale Models
Scale Free Networks
Many large networks are characterized by a highly skewed distribution of the
number of partners (degree)
p(k ) ~ k

Measuring Networks: Large-Scale Models
Scale Free Networks
The scale-free model focuses on the distance-reducing
capacity of high-degree nodes:
Measuring Networks: Large-Scale Models
Scale Free Networks
The scale-free model focuses on the distance-reducing capacity of highdegree nodes, as ‘hubs’ create shortcuts that carry network flow.
Measuring Networks: Large-Scale Models
Scale Free Networks
Colorado Springs High-Risk
(Sexual contact only)
•Network is approximately
scale-free, with  = -1.3
•But connectivity does not
depend on the hubs.
Measuring Networks: Large-Scale Models
Social Cohesion
White, D. R. and F. Harary. 2001. "The Cohesiveness of Blocks
in Social Networks: Node Connectivity and Conditional
Density." Sociological Methodology 31:305-59.
Moody, James and Douglas R. White. 2003. “Structural
Cohesion and Embeddedness: A hierarchical Conception of
Social Groups” American Sociological Review 68:103-127
White, Douglas R., Jason Owen-Smith, James Moody, &
Walter W. Powell (2004) "Networks, Fields, and
Organizations: Scale, Topology and Cohesive
Embeddings." Computational and Mathematical
Organization Theory. 10:95-117
Moody, James "The Structure of a Social Science
Collaboration Network: Disciplinary Cohesion from
1963 to 1999" American Sociological Review. 69:213238
Measuring Networks: Large-Scale Models
Social Cohesion
Formal definition of Structural Cohesion:
(a) A group’s structural cohesion is equal to the minimum number of actors who,
if removed from the group, would disconnect the group.
Equivalently (by Menger’s Theorem):
(b) A group’s structural cohesion is equal to the minimum number of independent
paths linking each pair of actors in the group.
Measuring Networks: Large-Scale Models
Social Cohesion
•Networks are structurally cohesive if they remain connected even when
nodes are removed
0
2
1
Node Connectivity
3
Measuring Networks: Large-Scale Models
Social Cohesion
Structural cohesion gives rise automatically to a clear notion of
embeddedness, since cohesive sets nest inside of each other.
2
3
1
9
10
8
4
5
11
7
12
13
6
14
15
17
16
18
19
20
2
22
23
Measuring Networks: Large-Scale Models
Social Cohesion
Project 90, Sex-only network (n=695)
3-Component (n=58)
Measuring Networks: Large-Scale Models
Social Cohesion
IV Drug Sharing
Largest BC: 247
k > 4: 318
Max k: 12
Structural Cohesion
simultaneously gives
us a positional and
subgroup analysis.
Connected
Bicomponents
Measuring Networks:
Cohesive Sub Groups
A primary interest in Social Network Analysis is the identification of
“significant social subgroups” – some smaller collection of nodes in
the graph that can be considered, at least in some senses, as a “unit”
based on the pattern, strength, or frequency of ties.
There are many ways to identify groups. They all insist on a group
being in a connected component, but other than that the variation is
wide.
Measuring Networks:
Cohesive Sub Groups
Graph Theoretical Models.
Start with a clique. A clique is defined as a maximal subgraph in which every
member of the graph is connected to every other member of the graph.
Cliques are collections of nodes where density = 1.0.
Properties of cliques:
• Density: 1.0
• Everyone connected to n-1 alters
• Distance between every pair is 1
• Ratio of within group ties to between
group ties is infinite
• All triads are transitive
Measuring Networks:
Cohesive Sub Groups
Graph Theoretical Models.
In practice, complete cliques are not very useful. They tend to overlap
heavily and are limited in their size.
Graph theorists have thus
relaxed the complete
connectivity requirement
(with varying degrees of
success). See the Moody
& White (2003) for a
discussion of these
attempts.
Measuring Networks:
Cohesive Sub Groups
Identifying Primary groups:
1) Measures of fit
To identify a primary group, we need some measure of how clustered
the network is. Usually, this is a function of the number of ties that
fall within group to the number of ties that fall between group.
2) Algorithmic approaches to maximizing (1)
Once we have such an index, we need a method for searching through
the network to maximize the fit.
3) Generalized cluster analysis
In addition to maximizing a group function such as (1) we can use the
relational distance directly, and look for clusters in the data. We next
go over two different styles of cluster analysis
Measuring Networks:
Cohesive Sub Groups
Segregation Index
(Freeman, L. C. 1972. "Segregation in Social Networks." Sociological Methods and
Research 6411-30.)
Freeman asked how we could identify segregation in a social network.
Theoretically, he argues, if a given attribute (group label) does not matter for
social relations, then relations should be distributed randomly with respect to the
attribute. Thus, the difference between the number of cross-group ties expected
by chance and the number observed measures segregation.
E( X )  X
Seg 
E( X )
Measuring Networks:
Cohesive Sub Groups
Consider the (hypothetical) network below. There are two
attributes in this network: people with Blue eyes and Brown eyes
and people who are square or not (they must be hip).
Measuring Networks:
Cohesive Sub Groups
Segregation Index
Mixing Matrix:
Blue
Blue
Brown
6
Brown 17
17
16
Seg = -0.25
Hip
Square
Hip
20
3
Square
3
30
Seg = 0.78
Measuring Networks:
Cohesive Sub Groups
The segregation index is one metric used to identify groups. Others include:
a) The ratio of in-group to out-group ties (Negopy, UCINET Factions)
b) Maximizing the probability of in-group contact (CliqueFinder)
c) The Segregation Matrix Index (SMI)
d) The dyadic factor loadings for overlapping groups (akin to a latent
class model)
e) Minimize the within-group distance
Once a metric has been chosen, some algorithm is needed to search through
the graph to identify clusters. These algorithms range from very sophisticated
“graph-intelligent” algorithms, such as NEGOPY, to simple cluster analysis
of distance matrices.
In most cases, you have to pre-set the number of groups to use (the exceptions
are NEGOPY and CliqueFinder. Moody’s CROWDS algorithm also has
automatic stopping criteria, but you have to give it starting values.
Measuring Networks:
Cohesive Sub Groups
In practice, the different
algorithms will give
different results.
Here, I compare the
NEGOPY results to the
RNM results. NEGOPY
returned one large group,
RNM found many smaller,
denser groups.
It’s usually a good idea to
explore multiple solutions
and algorithms.
Measuring Networks:
Cohesive Sub Groups
Gangon Prison Network
In practice, the different
algorithms will give
different results.
Here, I compare
NEGOPY, FACTIONS
and RNM. Groups A and
B are identical, C is close.
F, E and D differ.
It’s usually a good idea to
explore multiple solutions
and algorithms.
(all solutions constrained to 6 groups)
Measuring Networks:
Role Positions
Overview
•Social life can be described (at least in part) through social roles.
•To the extent that roles can be characterized by regular interaction
patterns, we can summarize roles through common relational patterns.
•Identifying these sets is the goal of block-model analyses.
Nadel: The Coherence of Role Systems
•Background ideas for White, Boorman and Brieger. Social life as
interconnected system of roles
•Important feature: thinking of roles as connected in a role system =
social structure
White, Harrison C.; Boorman, Scott A., and Breiger, Ronald L. Social
Structure from Multiple Networks I. American Journal of Sociology.
1976; 81730-780.
•The key article describing the theoretical and technical elements of
block-modeling
Measuring Networks:
Role Positions
Elements of a Role:
•Rights and obligations with respect to other people or classes of
people
•Roles require a ‘role compliment’ another person who the roleoccupant acts with respect to
Examples:
Parent - child, Teacher - student, Lover - lover, Friend - Friend,
Husband - Wife, etc.
Nadel (Following functional anthropologists and sociologists) defines
‘logical’ types of roles, and then examines how they can be linked together.
Measuring Networks:
Role Positions
White et al: From logical role systems to empirical social structures
Start with some basic ideas of what a role is: An exchange of something (support,
ideas, commands, etc) between actors. Thus, we might represent a family as:
H
W
C
C
C
Romantic Love
Provides food for
Bickers with
(and there are, of course, many other relations inside a family!)
Measuring Networks:
Role Positions
The key idea, is that we can express a role through a relation (or set of relations)
and thus a social system by the inventory of roles. If roles equate to positions in
an exchange system, then we need only identify particular aspects of a position.
But what aspect?
Structural Equivalence
Two actors are structurally equivalent if they have the same
types of ties to the same people.
Measuring Networks:
Role Positions
Structural Equivalence
A single relation
Measuring Networks:
Role Positions
Structural Equivalence
Graph reduced to positions
Measuring Networks:
Role Positions
Blockmodeling: basic steps
In any positional analysis, there are 4 basic steps:
1) Identify a definition of equivalence
2) Measure the degree to which pairs of actors are equivalent
3) Develop a representation of the equivalencies
4) Assess the adequacy of the representation
Measuring Networks:
Role Positions
1) Identify a definition of equivalence
Structural Equivalence:
Two actors are equivalent if they have the same type of ties to the same people.
Measuring Networks:
Role Positions
Automorphic Equivalence:
Actors occupy indistinguishable structural locations in the network. That is,
that they are in isomorphic positions in the network.
In general, automorphically equivalent nodes are equivalent with respect to
all graph theoretic properties (I.e. degree, number of people reachable,
centrality, etc.)
Measuring Networks:
Role Positions
Automorphic Equivalence:
Measuring Networks:
Role Positions
Regular Equivalence:
Regular equivalence does not require actors to have identical
ties to identical actors or to be structurally indistinguishable.
Actors who are regularly equivalent have identical ties to and
from equivalent actors.
If actors i and j are regularly equivalent, then for all relations
and for all actors, if i
k, then there exists some actor l such
that j l and k is regularly equivalent to l.
Measuring Networks:
Role Positions
Regular Equivalence:
There may be multiple regular equivalence partitions in a network, and thus we tend
to want to find the maximal regular equivalence position, the one with the fewest
positions.
Measuring Networks:
Role Positions
Role or Local Equivalence:
While most equivalence measures focus on position within the full network, some
measures focus only on the patters within the local tie neighborhood. These have
been called ‘local role’ equivalence.
Note that:
Structurally equivalent actors are automorphically equivalent,
Automorphically equivalent actors are regularly equivalent.
Structurally equivalent and automorphically equivalent actors are role equivalent
In practice, we tend to ignore some of these distinctions, as they get blurred quickly
once we have to operationalize them in real-world graphs. It turns out that few
people are ever exactly equivalent, and thus we approximate the links between the
types.
In all cases, the procedure can work over multiple relations simultaneously.
The process of identifying positions is called blockmodeling, and requires identifying
a measure of similarity among nodes.
Measuring Networks:
Role Positions
Once you identify equivalent actors, block them in the matrix and reduce it, based on the number of ties
in the cell of interest. The key values are a zero block (no ties) and a one-block (all ties present):
1 2
1 . 1
2 1 .
1 0
3
1 0
0 1
4
0 1
0 0
5 0 0
0 0
0 0
0 0
6 0 0
0 0
0 0
3
1
0
.
1
0
0
1
1
1
1
0
0
0
0
4
1
0
1
.
0
0
1
1
1
1
0
0
0
0
0
1
0
0
.
1
0
0
0
0
1
1
1
1
5
0
1
0
0
1
.
0
0
0
0
1
1
1
1
0
0
1
1
0
0
.
0
0
0
0
0
0
0
0
0
1
1
0
0
0
.
0
0
0
0
0
0
6
0
0
1
1
0
0
0
0
.
0
0
0
0
0
0
0
1
1
0
0
0
0
0
.
0
0
0
0
0
0
0
0
1
1
0
0
0
0
.
0
0
0
0
0
0
0
1
1
0
0
0
0
0
.
0
0
0
0
0
0
1
1
0
0
0
0
0
0
.
0
0
0
0
0
1
1
0
0
0
0
0
0
0
.
1
2
3
4
5
6
1
0
1
1
0
0
0
2
1
0
0
1
0
0
3
1
0
1
0
1
0
4
0
1
0
1
0
1
5
0
0
1
0
0
0
6
0
0
0
1
0
0
Structural equivalence thus generates 6 positions in the network
Measuring Networks:
Role Positions
Once you partition the matrix, reduce it:
.
1
1
1
0
0
0
0
0
0
0
0
0
0
1
.
0
0
1
1
0
0
0
0
0
0
0
0
1
0
.
1
0
0
1
1
1
1
0
0
0
0
1
0
1
.
0
0
1
1
1
1
0
0
0
0
0
1
0
0
.
1
0
0
0
0
1
1
1
1
0
1
0
0
1
.
0
0
0
0
1
1
1
1
0
0
1
1
0
0
.
0
0
0
0
0
0
0
0
0
1
1
0
0
0
.
0
0
0
0
0
0
0
0
1
1
0
0
0
0
.
0
0
0
0
0
0
0
1
1
0
0
0
0
0
.
0
0
0
0
0
0
0
0
1
1
0
0
0
0
.
0
0
0
0
0
0
0
1
1
0
0
0
0
0
.
0
0
0
0
0
0
1
1
0
0
0
0
0
0
.
0
0
0
0
0
1
1
0
0
0
0
0
0
0
.
1
1 1
2 1
3 0
1
2
1
1
1
2
3
Regular equivalence
(here I placed a one in the image matrix if there were any ties in the ij block)
3
0
1
0
Measuring Networks:
Role Positions
Operationally, you have to measure the similarity between actors. If two actors
are structurally equivalent, then they will have identical ties to other people.
Consider the example again:
1 2
1 . 1
2 1 .
1 0
3
1 0
0 1
4
0 1
0 0
5 0 0
0 0
0 0
0 0
6 0 0
0 0
0 0
3
1
0
.
1
0
0
1
1
1
1
0
0
0
0
4
1
0
1
.
0
0
1
1
1
1
0
0
0
0
0
1
0
0
.
1
0
0
0
0
1
1
1
1
5
0
1
0
0
1
.
0
0
0
0
1
1
1
1
0
0
1
1
0
0
.
0
0
0
0
0
0
0
0
0
1
1
0
0
0
.
0
0
0
0
0
0
6
0
0
1
1
0
0
0
0
.
0
0
0
0
0
0
0
1
1
0
0
0
0
0
.
0
0
0
0
0
0
0
0
1
1
0
0
0
0
.
0
0
0
0
0
0
0
1
1
0
0
0
0
0
.
0
0
0
0
0
0
1
1
0
0
0
0
0
0
.
0
0
0
0
0
1
1
0
0
0
0
0
0
0
.
C D Match
1 1
1
0 0
1
. 1
.
1 .
.
0 0
1
0 0
1
1 1
1
1 1
1
1 1
1
1 1
1
0 0
1
0 0
1
0 0
1
0 0
1
Sum: 12
C and D match on all
12 other people, and
are thus structurally
equivalent.
Measuring Networks:
Role Positions
If the model is going to be based on asymmetric or multiple relations, you simply stack the
various relations, usually including both “directions” of asymmetric relations:
H
Romance
0 1 0 0 0
1 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
W
C
C
C
Romantic Love
Provides food for
Bickers with
0
0
0
0
0
Feeds
0 1 1
0 1 1
0 0 0
0 0 0
0 0 0
Bicker
0 0 0 0
0 0 0 0
0 0 0 1
0 0 1 0
0 0 1 1
1
1
0
0
0
0
0
1
0
0
Stacked
0
1
0
0
0
0
0
0
0
0
0
0
1
1
1
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
1
1
1
0
0
0
0
0
0
0
0
0
0
1
1
0
0
0
0
0
0
0
0
0
0
0
1
1
0
0
0
0
0
1
1
0
0
0
0
0
0
0
0
0
0
1
0
1
0
0
0
0
0
1
1
0
0
0
0
0
0
0
0
0
0
1
1
0
Measuring Networks:
Role Positions
The metric used to measure structural equivalence by White, Boorman and Brieger is
the correlation between each node’s set of ties. For the example, this would be:
1.00
-0.20
0.08
0.08
-0.19
-0.19
0.77
0.77
0.77
0.77
-0.26
-0.26
-0.26
-0.26
-0.20
1.00
-0.19
-0.19
0.08
0.08
-0.26
-0.26
-0.26
-0.26
0.77
0.77
0.77
0.77
0.08
-0.19
1.00
1.00
-1.00
-1.00
0.36
0.36
0.36
0.36
-0.45
-0.45
-0.45
-0.45
0.08
-0.19
1.00
1.00
-1.00
-1.00
0.36
0.36
0.36
0.36
-0.45
-0.45
-0.45
-0.45
-0.19
0.08
-1.00
-1.00
1.00
1.00
-0.45
-0.45
-0.45
-0.45
0.36
0.36
0.36
0.36
-0.19
0.08
-1.00
-1.00
1.00
1.00
-0.45
-0.45
-0.45
-0.45
0.36
0.36
0.36
0.36
0.77
-0.26
0.36
0.36
-0.45
-0.45
1.00
1.00
1.00
1.00
-0.20
-0.20
-0.20
-0.20
0.77
-0.26
0.36
0.36
-0.45
-0.45
1.00
1.00
1.00
1.00
-0.20
-0.20
-0.20
-0.20
0.77
-0.26
0.36
0.36
-0.45
-0.45
1.00
1.00
1.00
1.00
-0.20
-0.20
-0.20
-0.20
0.77
-0.26
0.36
0.36
-0.45
-0.45
1.00
1.00
1.00
1.00
-0.20
-0.20
-0.20
-0.20
-0.26
0.77
-0.45
-0.45
0.36
0.36
-0.20
-0.20
-0.20
-0.20
1.00
1.00
1.00
1.00
-0.26
0.77
-0.45
-0.45
0.36
0.36
-0.20
-0.20
-0.20
-0.20
1.00
1.00
1.00
1.00
-0.26
0.77
-0.45
-0.45
0.36
0.36
-0.20
-0.20
-0.20
-0.20
1.00
1.00
1.00
1.00
-0.26
0.77
-0.45
-0.45
0.36
0.36
-0.20
-0.20
-0.20
-0.20
1.00
1.00
1.00
1.00
Another common metric is the Euclidean distance between pairs of actors, which you
then use in a standard cluster analysis.
Measuring Networks:
Role Positions
Automorphic and Regular equivalence are more difficult to find, and require
iteratively searching over possible class assignments for sets that have the same
graph theoretic patterns. Usually start with a set of nodes defined as similar on
a number of network measures, then look within these classes for automorphic
equivalence classes.
A theoretically appealing method for finding structures that are very similar to
regular equivalence, role equivalence, uses the triad census. Each node is
involved in (n-1)(n-2)/2 triads, and occupies a particular position in each of
these triads.
Measuring Networks:
Role Positions
Moving from a similarity/distance matrix to a blockmodel:
number of groups and determining blocks:
“An important decision in an analysis using CONCOR is how fine the
partition should be; in other words, when should one stop splitting
positions? Theory and the interpretability of the solution are the
primary consideration in deciding how many positions to produce.”
(W&F, p.378)
“In defining positions of actors, the ‘trick’ is to choose the point along
the series that gives a useful and interpretable partition of the actors
into equivalence classes.” (W&F p.383)
Measuring Networks:
Role Positions
An example:
Padgett, J. F. and Ansell, C. K.
Robust action and the rise of
the Medici, 1400-1434.
American Journal of
Sociology. 1993; 9812591319.
“Political Groups” in the attribute
sense do not seem to exist, so
P&A turn to the pattern of
network relations among
families.
This is the block reduction of the
full 92 family network.
Modeling with Networks: Behaviors
There are two general approaches to modeling behaviors with network data:
1) Using network measures as variables to predict individual outcomes
2) Network autocorrelation / peer influence models
3) Dyad / QAP models of the similarity of actors and their joint network
position
Modeling with Networks: Behaviors
The simplest way to use network data in research is to include the network
measure as a covariate in a standard model:
Y = a0 + b(netvars) + b(other vars) + e
“netvars” most commonly include:
•Functions of each person’s direct contacts attributes
•Such as: mean income of friends, proportion of friends who are
employed, racial heterogeneity of the friends,etc.
•Structural indicators:
•Such as: Centrality, dummies for group / role membership, etc.
These models are the only option for ego-network data,where information on
network alters is collected from a single respondent’s (ego’s) report.
They can be used from extractions of partial or complete data, but the error term
is – by definition – autocorrelated. Cases are not independent, but connected
through the social relations
Modeling with Networks: Behaviors
Network Autocorrelation models (aka Peer Influence models):
Friedkin, N. E. 1984. "Structural Cohesion and Equivalence
Explanations of Social Homogeneity." Sociological Methods and
Research 12:235-61.
———. 1998. A Structural Theory of Social Influence. Cambridge:
Cambridge.
Friedkin, N. E. and E. C. Johnsen. 1990. "Social Influence and
Opinions." Journal of Mathematical Sociology 15(193-205).
———. 1997. "Social Positions in Influence Networks." Social
Networks 19:209-22.
Y
()
 αWY
()
~
 Xb  e
Where W is a direct function of the adjacency matrix, and a is the estimated
value of peer influence.
Modeling with Networks: Behaviors
There are two general ways to test for peer influence in an observed network.
The first estimates the parameters (a and b) of the peer influence model directly,
the second transforms the network into a dyadic model, predicting similarity
among actors.
Peer influence model:
See Doreian, Patrick. “Maximum likelihood methods for linear models Spatial
Effects and Spatial Disturbances Terms.” Sociological Methods and Research.
1982; 10243-269.
Gould, Roger V. Multiple Networks and mobilization in the Paris Commune,
1871. American Sociological Review. 1991; 56716-729. (applied example)
Y
()
 αWY
()
~
 Xb  e
Modeling with Networks: Behaviors
The basic model says that people’s opinions are a function of the opinions of
others and their characteristics.
Y
()
 αWY
()
~
 Xb  e
WY = A simple vector which can be added to your model. That is, multiply
Y by a W matrix, and run the regression with WY as a new variable, and the
regression coefficient is an estimate of a.
This is what Doriean calls the QAD (“Quick and Dirty” estimate of peer
influence, and is equivalent (under certain assumptions) to adding the mean of
ego’s friends to the model.
Modeling with Networks: Behaviors
The problem with the above regression is that cases are, by definition, not
independent. In fact, WY is also known as the ‘network autocorrelation’
coefficient, since a ‘peer influence’ effect is an autocorrelation effect -- your value
is a function of the people you are connected to. In general, OLS is not the best
way to estimate this equation. That is, QAD = Quick and Dirty, and your results
will not be exact.
In practice, the QAD approach (perhaps combined with a GLS estimator) results in
empirical estimates that are “virtually indistinguishable” from MLE (Doreian et al,
1984)
The proper way to estimate the peer equation is to use maximum likelihood
estimates, and Doreian gives the formulas for this in his paper.
The other way is to use non-parametric approaches, such as the Quadratic
Assignment Procedure, to estimate the effects.
Modeling with Networks: Behaviors
An empirical Example: Peer influence in the OSU Graduate Student Network.
Each person was asked to rank their satisfaction with the program, which is the dependent variable
in this analysis.
I constructed two W matrices, one from HELP the other from Best Friend. I treat relations as
symmetric and valued, such that:
 1 if Aijt  1 or A jit  1 


Wijt  2 if Aijt  1 and A jit  1


0
otherwise


 Wij  1
j
Wii  0
I also include Race (white/Non-white, Gender and Cohort Year as exogenous variables in the model.
Modeling with Networks: Behaviors
An empirical Example: Peer influence in the OSU Graduate Student Network.
Distribution of Satisfaction with the department.
Modeling with Networks: Behaviors
Parameter Estimates
Variable
Parameter
Estimate
Standardized
Pr > |t| Estimate
Intercept
FEMALE
NONWHITE
y00
y99
y98
y97
PEER_BF
PEER_H
2.60252
-1.07540
-0.22087
0.93176
-0.19375
-0.45912
0.60670
0.23936
0.50668
0.0931
0.0142
0.5975
0.0798
0.7052
0.4637
0.3060
0.0002
0.0277
0
-0.25455
-0.05491
0.21627
-0.04586
-0.08289
0.11919
0.42084
0.23321
Model R2 = .41, compared to .15 without the peer effects
Modeling with Networks: Behaviors
Dyad QAP models
Another way to get at peer influence is not through the level of Y, but through the
extent to which actors are similar with respect to Y.
The model is now expressed at the dyad level as:
Yij  b0  b1 Aij   bk X k  eij
k
Where Y is a matrix of similarities, A is an adjacency matrix, and Xk is a
matrix of similarities on attributes
Modeling with Networks: Behaviors
Dyad QAP models
NODE
1
2
3
4
5
6
7
8
9
0
1
1
1
0
0
0
0
0
1
0
1
0
0
0
1
0
0
ADJMAT
1 1 0 0
1 0 0 0
0 0 1 0
0 0 1 0
1 1 0 1
0 0 1 0
1 0 0 0
0 0 1 1
0 0 0 1
0
1
1
0
0
0
0
0
0
0
0
0
0
1
1
0
0
1
0
0
0
0
0
1
0
1
0
0
1
0
0
1
0
0
0
1
1
0
0
0
1
0
0
0
1
SAMERCE
0 0 1 0
0 0 1 0
0 1 0 1
1 0 0 1
0 0 0 0
1 1 0 0
1 1 0 1
1 1 0 1
0 0 1 0
0
0
1
1
0
1
0
1
0
0
0
1
1
0
1
1
0
0
1
1
0
0
1
0
0
0
0
0
0
1
1
0
0
1
1
0
0
0
0
0
1
1
0
0
1
SAMESEX
1 1 0 0 1
0 0 1 1 0
0 1 0 0 1
1 0 0 0 1
0 0 0 1 0
0 0 1 0 0
1 1 0 0 0
1 1 0 0 1
0 0 1 1 0
1
0
1
1
0
0
1
0
0
0
1
0
0
1
1
0
0
0
Modeling with Networks: Behaviors
Dyad QAP models
Y
0.32
0.59
0.54
0.50
0.04
0.02
0.41
0.01
-0.17
Distance (Dij=abs(Yi-Yj)
.000 .277 .228 .181 .278
.277 .000 .049 .096 .555
.228 .049 .000 .047 .506
.181 .096 .047 .000 .459
.278 .555 .506 .459 .000
.298 .575 .526 .479 .020
.095 .182 .134 .087 .372
.307 .584 .535 .488 .029
.481 .758 .710 .663 .204
.298
.575
.526
.479
.020
.000
.392
.009
.184
.095
.182
.134
.087
.372
.392
.000
.401
.576
.307
.584
.535
.488
.029
.009
.401
.000
.175
.481
.758
.710
.663
.204
.184
.576
.175
.000
Modeling with Networks: Behaviors
Dyad QAP models
The REG Procedure
Model: MODEL1
Dependent Variable: SIM
Analysis of Variance
Source
DF
Sum of
Squares
Model
Error
Corrected Total
4
31
35
0.90657
0.75591
1.66248
Root MSE
Dependent Mean
Coeff Var
0.15615
0.33161
47.08929
Mean
Square
0.22664
0.02438
R-Square
Adj R-Sq
F Value
Pr > F
9.29
<.0001
0.5453
0.4866
Parameter Estimates
Variable
Intercept
NOM
SAMERCE
SAMESEX
NCOMFND
DF
Parameter
Estimate
Standard
Error
t Value
Pr > |t|
1
1
1
1
1
0.51931
-0.17054
0.05387
-0.06535
-0.16134
0.05116
0.05963
0.05916
0.05365
0.03862
10.15
-2.86
0.91
-1.22
-4.18
<.0001
0.0075
0.3696
0.2324
0.0002
Modeling with Networks: Behaviors
Dyad QAP models
Like the basic Peer influence model, cases in dyad models are not
independent. However, the non-independence now comes from two sources:
(1) the fact that the same person is represented in (n-1) dyads and (2) that i and
j are linked through relations.
One of the best solutions to this problem is QAP: Quadratic Assignment
Procedure. A non-parametric procedure for significance testing.
QAP runs the model of interest on the real data, then randomly permutes the
rows/cols of the data matrix and estimates the model again. In so doing, it
generates an empirical distribution of the coefficients, generating n levels of
the coefficients at ‘chance’ levels, which you then compare to the observed
data. This is implemented in UCINET for regression, and in DAMN for
logistic regression (J.L. Martin).
Modeling with Networks: Behaviors
Dyad QAP models
Procedure:
1. Calculate the observed association / model
2. for K iterations do:
a) randomly sort one of the matrices
b) recalculate the association / model
c) store the outcome
3. compare the observed outcome to the distribution of
outcomes created by the random permutations.
Modeling with Networks: Behaviors
Dyad QAP models
Comparing multiple networks: QAP
Modeling with Networks: Behaviors
Dyad QAP models
Modeling with Networks: Behaviors
Dyad QAP models
MULTIPLE REGRESSION QAP W/ MISSING VALUES
--------------------------------------------------------------------------------
# of permutations:
Diagonal valid?
Random seed:
Dependent variable:
Expected values:
Independent variables:
2000
NO
533
EX_SIM
c:\moody\Classes\soc884\examples\UCINET\mrqap-predicted
EX_NCOM
EX_ADJ
EX_SRCE
EX_SSEX
Number of valid observations among the X variables = 72
N = 72
Number of permutations performed: 1999
MODEL FIT
R-square Adj R-Sqr Probability
# of Obs
-------- --------- ----------- ----------0.545
0.525
0.029
72
REGRESSION COEFFICIENTS
Un-stdized
Stdized
Proportion Proportion
Independent Coefficient Coefficient Significance
As Large
As Small
----------- ----------- ----------- ------------ ----------- ----------Intercept
0.519314
0.000000
0.012
0.012
0.988
EX_NCOM
-0.161337
-0.541828
0.011
0.989
0.011
EX_ADJ
-0.170539
-0.381186
0.020
0.980
0.020
EX_SRCE
0.053864
0.124551
0.236
0.236
0.764
EX_SSEX
-0.065364
-0.151144
0.180
0.820
0.180
Note that the coefficient values will be identical, but the p values differ
Modeling with Networks: Behaviors
Dyad QAP models
A substantive question raised with any kind of network autocorrelation
model is whether observed associations between network structure and
behaviors is due to selection or influence.
Theory is your best friend here, as there is no fool proof method to
distinguish the two.
However, recent work has made great progress using individual-level
fixed effect models (sometimes random effects models), where the
network features vary over time. This removes any stable characteristic
that might account for selection into a particular group.
Modeling with Networks: Structure
Dyad QAP models
While the most common way to use QAP models is to predict the
similarity on some substantive variable, one can just as easily predict the
presence/absence of a relation given attribute similarity.
This makes it possible to model the network itself, and ask questions about
how particular structures form.
Modeling with Networks: Structure
Exponential Random Graph Models (p*)
A long research tradition in statistics and random graph theory has lead to
parametric models of networks.
These are models of the entire graph, though as we will see they often work on
the dyads in the graph to be estimated.
Substantively, the approach is to ask whether the graph in question is an element
of the class of all random graphs with the given known elements. For example,
all graphs with 5 nodes and 3 edges, or, put probabilistically, the probability of
observing the current graph given the conditions.
Modeling with Networks: Structure
Exponential Random Graph Models (p*)
The earliest approaches are based on simple random graph theory, but there’s
been a flurry of activity in the last 10 years or so.
Key references:
- Holland and Leinhardt (1981) JASA
- Frank and Strauss (1986) JASA
- Wasserman and Faust (1994) – Chap 15 & 16
- Wasserman and Pattison (1996)
Thanks to Mark Handcock for sharing some figures/slides about these models.
Modeling with Networks: Structure
Exponential Random Graph Models (p*)
exp{ z ( x)}
p ( X  x) 
 ( )
Where:
 is a vector of parameters (like regression coefficients)
z is a vector of network statistics, conditioning the graph
 is a normalizing constant, to ensure the probabilities sum to 1.
Modeling with Networks: Structure
Exponential Random Graph Models (p*)
The simplest graph is a Bernoulli random graph,where each Xij is
independent:
p( X  x) 
exp{ ij xij }
i, j
 ( )
Where:
ij = logit[P(Xij = 1)]
() =P[1 + exp(ij )]
Note this is one of the few cases where () can be written.
Modeling with Networks: Structure
Exponential Random Graph Models (p*)
Typically, we add a homogeneity condition, so that all isomorphic
graphs are equally likely. The homogeneous bernulli graph model:
p( X  x) 
exp  { xij }
Where:
() =[1 + exp()]g
i, j
 ( )
Modeling with Networks: Structure
Exponential Random Graph Models (p*)
If we want to condition on anything much more complicated than density, the
normalizing constant ends up being a problem. We need a way to express the
probability of the graph that doesn’t depend on that constant. It turns out we
can do this by conditioning on a ‘complement’ graph.
First some terms:
X i, j  Sociomatri x with ij element forced to 1
X i, j  Sociomatri x with ij element forced to 0
X ic, j  Sociomatri x with no tie between i and j
Modeling with Networks: Structure
Exponential Random Graph Models (p*)
After some algebra:
 p( X ij  1 | X ijc ) 



 ij  log 
  [ z ( xij )  z ( xij )]
c 
 p( X ij  0 | X ij ) 
Note that we can now model the conditional probability of the graph,
as a function of a set of difference statistics, without reference to the
normalizing constant.
The model, then, simply reduces to a logit model on the dyads.
Modeling with Networks: Structure
Exponential Random Graph Models (p*)
Fitting p* models
I highly recommend working through the p* primer examples, which can be
found at:
http://kentucky.psych.uiuc.edu/pstar/index.html
Including:
A Practical Guide To Fitting p* Social Network Models
Via Logistic Regression
The site includes the PREPSTAR program for creating the difference variables
of interest.
Modeling with Networks: Structure
Exponential Random Graph Models (p*)
1 2
3 |4 5 6
1 1 1


2 1
1


3 1
1
1
x 

4
1 1

5
1


6
1 1

1
2
3
6
4
5
We can model this network based on parameters for overall degree of Choice
(), Differential Choice Within Positions (W), Mutuality(), Differential
Mutuality Within Positions (W), and Transitivity (T).
The vector of model parameters to be estimated is:  = {  W  W T }.
Modeling with Networks: Structure
Exponential Random Graph Models (p*)
proc logistic descending ;
tie = l lw m mw tt / noint;
run;
1
2
3
6
4
5
L = Choice
LW = Within Group
M = Mutuality
MW = Mutual within Group
TT = Transitivity
Substantively, this graph is likely from the random class of graphs with similar mutuality and size
Modeling with Networks: Structure
Exponential Random Graph Models (p*)
One practical problem is that the resulting values are often quite correlated,
making estimation difficult. This is particularly difficult with “star”
parameters.
lw
m
mw
tt
lw
1.00000
0.58333
0.0007
0.80178
<.0001
0.15830
0.4034
m
0.58333
0.0007
1.00000
0.80178
<.0001
-0.02435
0.8984
mw
0.80178
<.0001
0.80178
<.0001
1.00000
-0.11716
0.5375
tt
0.15830
0.4034
-0.02435
0.8984
-0.11716
0.5375
1.00000
Modeling with Networks: Structure
Exponential Random Graph Models (p*)
Parameters that are often fit include:
1) Expansiveness and attractiveness parameters. = dummies for
each sender/receiver in the network
2) Degree distribution
3) Mutuality
4) Group membership (and all other parameters by group)
5) Transitivity / Intransitivity
6) K-in-stars, k-out-stars
7) Cyclicity
Modeling with Networks: Structure
Comparing to Random Graphs
A conceptual merge between random graph models and QAP models is to identify a
sample of graphs from the universe you are trying to model. So, instead of
estimating:
exp{ z ( x)}
p ( X  x) 
 ( )
generate X empirically, then compare z(x) to see how likely a measure on x would
be given X. The difficulty, however, is generating X.
Modeling with Networks: Structure
Comparing to Random Graphs
The first option would be to generate all isomorphic graphs within a given
constraint.
This is possible for small graphs, but the number gets large fast. For a
network with 3 nodes, there are 16 possible directed graphs. For a
network with 4 nodes, there are 218, for 5 nodes 9608, for 6
nodes1,540,944, and so on…
So, the best approach is to sample from the universe, but, of course, if you
had the universe you wouldn’t need to sample from it. How do you
sample from a population you haven’t observed?
Use a construction algorithm that generates a random graph with known
constraints.
Modeling with Networks: Structure
Comparing to Random Graphs
Example: Bearman, Peter S., James Moody and Katherine Stovel (2004) “Chains of Affection:
The Structure of Adolescent Romantic and Sexual Networks” American Journal of Sociology
110:44:92
Romantic Relations in Jefferson High
Modeling with Networks: Structure
Comparing to Random Graphs
Simulate random networks with similar degree distribution:
Modeling with Networks: Structure
Comparing to Random Graphs
Simulated networks preserve observed degree, isolated dyad
distribution, and four-cycle constraint
Modeling with Networks: Structure
Comparing to Random Graphs
Simulated networks preserve observed degree, isolated dyad
distribution, and four-cycle constraint: 4 examples from the
simulated set
Social Network Software
UCINET
•The Standard network analysis program, runs in Windows
•Good for computing measures of network topography for single nets
•Input-Output of data is a special 2-file format, but is now able to read
PAJEK files directly.
•Not optimal for large networks
•Available from:
Analytic Technologies
Social Network Software
PAJEK
•Program for analyzing and plotting very large networks
•Intuitive windows interface
•Used for most of the real data plots in this presentation
•Started mainly a graphics program, but has expanded to a wide range of
analytic capabilities
•Can link to the R statistical package
•Free
•Available from:
Social Network Software
Cyram Netminer for Windows
•Newest Product, not yet widely used
•Price range depends on application
•Limited to smaller networks O(100)
http://www.netminer.com/NetMiner/home_01.jsp
Social Network Software
NetDraw
•Also very new, but by one of the best known names
in network analysis software.
•Free
•Limited to smaller networks O(100)
Social Network Software
NEGOPY
•Program designed to identify cohesive sub-groups in a network,
based on the relative density of ties.
•DOS based program, need to have data in arc-list format
•Moving the results back into an analysis program is difficult.
•Available from:
William D. Richards
http://www.sfu.ca/~richards/Pages/negopy.htm
SPAN - Sas Programs for Analyzing Networks (Moody, ongoing)
•is a collection of IML and Macro programs that allow one to:
a) create network data structures from nomination data
b) import/export data to/from the other network programs
c) calculate measures of network pattern and composition
d) analyze network models
•Allows one to work with multiple, large networks
•Easy to move from creating measures to analyzing data
•Available by sending an email to:
Moody.77@sociology.osu.edu
Download