ERGM models

Where we are
• Node level metrics
– Degree centrality
– Betweenness centrality
• Group level metrics
– Degree centralization
– Betweenness centralization
– Components
– Subgroups
• Visualization
• None of these address the probability that a dyad or triad exists
• They are broad summaries of structure
Mathematical versus Statistical Models
• Statistical models can
tell you if the
relationship observed
between variables is
due to chance
• Mathematical models
describe the
relationship between
variables and suggest
what we should
observe
$E = mc^2$
• This formula predicts:
– Nuclear fission
– Photoelectric cells
– Black holes
• The statistical analog would be
to observe the characteristics
of, say, a black hole and
conclude they exist from those
observations
Thinking about models
• Models let us try to test why a structure exists rather
than just describing it
• QAP allows us to test whether a structure is explained
by another structure or by an attribute or set of
attributes
• Equivalence begins to let us see how nodes have roles
in network structure
– Structural
– Regular
– Equivalence in Ucinet (Profile and CATREGE)
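The QAP idea above can be sketched in a few lines of Python. This is a minimal illustration, not UCINET's implementation: the function names, the two small matrices, and the one-sided p-value are assumptions made for the example. The key step is that rows and columns of one matrix are relabeled together, which keeps its structure intact while breaking any alignment with the second matrix.

```python
import random
from statistics import mean

def offdiag(m):
    """Flatten a square matrix, skipping the diagonal (self-ties)."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(n) if i != j]

def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def qap_correlation(a, b, n_perm=1000, seed=0):
    """Return (observed correlation, one-sided permutation p-value)."""
    rng = random.Random(seed)
    n = len(a)
    observed = pearson(offdiag(a), offdiag(b))
    hits = 0
    for _ in range(n_perm):
        p = list(range(n))
        rng.shuffle(p)
        # Relabel nodes: the same permutation is applied to rows and columns.
        permuted = [[a[p[i]][p[j]] for j in range(n)] for i in range(n)]
        if pearson(offdiag(permuted), offdiag(b)) >= observed:
            hits += 1
    # Count the observed arrangement as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical 5-node friendship and advice networks.
friend = [[0, 1, 1, 0, 0], [1, 0, 1, 0, 0], [1, 1, 0, 1, 0],
          [0, 0, 1, 0, 1], [0, 0, 0, 1, 0]]
advice = [[0, 1, 0, 0, 0], [1, 0, 1, 0, 0], [1, 1, 0, 0, 0],
          [0, 0, 1, 0, 1], [0, 0, 0, 1, 0]]
print(qap_correlation(friend, advice, n_perm=500))
```

A small p-value suggests the two structures are more alike than a random relabeling of nodes would produce.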
Network Models
• Network models make it possible to test whether a dyad or triad
occurs more often than chance alone would predict
• Dyads and triads are considered local structures
• Network modeling is based on the concept that patterns of
local structures may aggregate to a global structure
• Ultimately, the global structure that is observed may in part
emerge from local structures, from attributes or a
combination of both
Five reasons to construct a network model
(Robins, G., Pattison, P., Kalish, Y., and Lusher, D. (2007). An introduction
to exponential random graph (p*) models for social networks. Social Networks
29: 173–191.)
1. Models capture regularities in the processes that give
rise to ties while accounting for the uncertainty
associated with observed data
2. Can determine if substructures are expected by
chance
3. Can distinguish between structural effects versus
node attribute effects
4. Simple measures (e.g. density, centrality) may not
capture processes in complex networks
5. Can traverse the micro-macro gap – Does the
distribution of local structures explain macro
structures?
Local structures -- Dyads
• Dyad – Two nodes
• There are two types of dyads in an undirected graph:
– Mutual
– Null
• There are three types of dyads in a directed graph:
– Mutual
– Asymmetric
– Null
• P1 models (Holland and Leinhardt, 1981) are based on probabilities
of dyadic relations
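A dyad census for a directed graph can be sketched as follows. The function name and the four-node example are hypothetical; the code simply classifies every unordered pair of nodes as mutual, asymmetric, or null.

```python
from itertools import combinations

def dyad_census(nodes, arcs):
    """Count mutual, asymmetric, and null dyads in a directed graph."""
    arcs = set(arcs)
    census = {"mutual": 0, "asymmetric": 0, "null": 0}
    for i, j in combinations(nodes, 2):
        forward, backward = (i, j) in arcs, (j, i) in arcs
        if forward and backward:
            census["mutual"] += 1
        elif forward or backward:
            census["asymmetric"] += 1
        else:
            census["null"] += 1
    return census

# Hypothetical example: A and B choose each other, A chooses C, D chooses C.
nodes = ["A", "B", "C", "D"]
arcs = [("A", "B"), ("B", "A"), ("A", "C"), ("D", "C")]
print(dyad_census(nodes, arcs))
# {'mutual': 1, 'asymmetric': 2, 'null': 3}
```

The three counts always sum to n(n-1)/2, the number of node pairs.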
P1 in UCINET
• Network->P1
• Three equations:
– Probability of a reciprocated or asymmetric tie based
on outdegree (expansiveness)
– Probability of a reciprocated or asymmetric tie based
on indegree (attractiveness)
– Probability of a null tie (the residual of these two)
• P1 on Class data
– Analysis of residuals
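For reference, the p1 model is commonly written in the form below. The symbols are the usual textbook notation and are an assumption here; UCINET's output may label or parameterize the terms differently.

```latex
\[
P(Y = y) \;=\; \frac{1}{\kappa}\,
\exp\Big( \theta\, y_{++} \;+\; \rho\, m(y)
      \;+\; \sum_i \alpha_i\, y_{i+}
      \;+\; \sum_j \beta_j\, y_{+j} \Big),
\qquad
m(y) = \sum_{i<j} y_{ij}\, y_{ji}
\]
```

Here $y_{++}$ is the total number of ties, $m(y)$ the number of mutual dyads, $y_{i+}$ the outdegree of node $i$ and $y_{+j}$ the indegree of node $j$. The parameter $\theta$ governs overall density, $\rho$ reciprocity, $\alpha_i$ expansiveness, $\beta_j$ attractiveness, and $\kappa$ is a normalizing constant that depends on the parameters.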
Local structures -- Triads
• Triads are sets of three nodes
• Transitivity refers to the notion that if A knows B
and B knows C then A should know C
• This is not always the case
• Some triads are transitive and some are
intransitive
Transitivity and network models
• If you take all possible sub-graphs of triads there is some distribution of
transitive and intransitive triads
• Holland, P.W., and Leinhardt, S. 1975. “Local structure in social networks."
In D. Heise (ed.), Sociological Methodology. San Francisco: Jossey-Bass.
• For undirected graphs there are four types
– Empty
– One edge
– Two path
– Triangle
• For directed graphs there are 16 types
– Snijders Transitivity slides 14-15
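The four undirected triad types can be counted by checking how many of the three possible edges are present in each set of three nodes. The sketch below uses a hypothetical function name and example graph; it is not the algorithm used by UCINET or Pajek.

```python
from itertools import combinations

def undirected_triad_census(nodes, edges):
    """Classify every node triple by its number of edges (0 to 3)."""
    edge_set = {frozenset(e) for e in edges}
    labels = ["empty", "one edge", "two path", "triangle"]
    census = dict.fromkeys(labels, 0)
    for triple in combinations(nodes, 3):
        k = sum(frozenset(pair) in edge_set
                for pair in combinations(triple, 2))
        census[labels[k]] += 1
    return census

# Hypothetical example: a triangle A-B-C with a pendant node D attached to C.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
print(undirected_triad_census(nodes, edges))
# {'empty': 0, 'one edge': 1, 'two path': 2, 'triangle': 1}
```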
Triads in UCINET
• Transitivity Index
– Transitive ties/Potentially Transitive Ties
– For random graphs the expected value is close to
the density of the graph
– For actual networks values between .3 and .6 are
typical (from Tom Snijders)
• Do Cohesion->Transitivity on class data
• Do Triad Census on class data
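The transitivity index described above (transitive ties over potentially transitive ties) can be sketched for a directed graph as follows. This assumes one common definition: an ordered triple i→j→k is potentially transitive, and it is transitive when i→k is also present. UCINET offers several variants, so treat this as an illustration rather than a reproduction of its output.

```python
from itertools import permutations

def transitivity_index(nodes, arcs):
    """Share of two-paths i->j->k that are closed by an arc i->k."""
    arcs = set(arcs)
    potential = transitive = 0
    for i, j, k in permutations(nodes, 3):
        if (i, j) in arcs and (j, k) in arcs:
            potential += 1
            if (i, k) in arcs:
                transitive += 1
    # Undefined when the graph has no two-paths at all.
    return transitive / potential if potential else float("nan")

# Hypothetical example: A->B->C is closed by A->C; the paths through C->D are not.
nodes = ["A", "B", "C", "D"]
arcs = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
print(transitivity_index(nodes, arcs))  # one of three two-paths is closed
```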
Triads in Pajek
• Info->Network->Triadic Census
• Compare to UCINET Triad Census
ERGM (p*) models
(Exponential Random Graph Models)
• When observing a network there is the notion that the structure
could have been different
• The idea of modeling is to propose a process by which the observed
data ended up as they did
• For example, does the network demonstrate more reciprocity than
you would expect due to chance – reciprocity can be a model
parameter
• Recall the triad census and the distribution of the different types
• You can think of models as trying to explain that distribution, and in
particular determining if the distribution is essentially random
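The reciprocity question above can be illustrated with a simple simulation. The sketch below compares the observed number of mutual dyads with random directed graphs that have the same nodes and the same number of arcs. This is a simplified chance baseline chosen for the example, not a fitted ERGM, and the function names and data are hypothetical.

```python
import random
from itertools import combinations, permutations

def mutual_dyads(nodes, arcs):
    """Number of node pairs with arcs in both directions."""
    return sum((i, j) in arcs and (j, i) in arcs
               for i, j in combinations(nodes, 2))

def reciprocity_vs_chance(nodes, arcs, n_sim=2000, seed=1):
    """Return (observed mutual dyads, share of random graphs with >= that many)."""
    rng = random.Random(seed)
    arcs = set(arcs)
    possible = list(permutations(nodes, 2))  # every ordered pair of distinct nodes
    observed = mutual_dyads(nodes, arcs)
    at_least = 0
    for _ in range(n_sim):
        # Random digraph with the same number of arcs as the observed one.
        sim = set(rng.sample(possible, len(arcs)))
        if mutual_dyads(nodes, sim) >= observed:
            at_least += 1
    return observed, at_least / n_sim

# Hypothetical network: three reciprocated pairs plus one one-way tie.
nodes = list(range(6))
arcs = [(0, 1), (1, 0), (2, 3), (3, 2), (4, 5), (5, 4), (0, 2)]
print(reciprocity_vs_chance(nodes, arcs))
```

If only a small share of the random graphs reach the observed count, the network shows more reciprocity than this chance baseline would suggest.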
p* models (cont.)
• Networks are graphs of nodes and edges
• The nodes are fixed – Meaning they are not a parameter to consider
• With models you create a probability distribution of the possible graphs with the fixed nodes
• The observed graph is located somewhere in this distribution
• If the observed graph has many reciprocated ties, then a model that is a good fit will also have many reciprocated ties
• Once you have a distribution of graphs it can be used to compare sampled graphs (from the distribution) to the observed one on other characteristics
• The idea is to use the model to understand the processes underlying the observed structure
• You can test whether node attributes (e.g. homophily) or local processes (e.g. transitivity) explain the global structure
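The probability distribution over graphs described above is usually written in the general exponential form below. The notation is an assumption, since sources differ in their symbols: $s_k(y)$ are network statistics such as the number of edges, mutual dyads, or triangles, and $\theta_k$ are the corresponding parameters.

```latex
\[
P_\theta(Y = y) \;=\; \frac{\exp\big( \sum_k \theta_k\, s_k(y) \big)}{\kappa(\theta)},
\qquad
\kappa(\theta) \;=\; \sum_{y'} \exp\Big( \sum_k \theta_k\, s_k(y') \Big)
\]
```

The sum in $\kappa(\theta)$ runs over all possible graphs on the fixed node set. A positive $\theta_k$ means graphs with more of configuration $k$ are more probable, holding the other statistics constant.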
Dependence assumptions
• The possible set of configurations of the set of nodes is constrained
by (dependent on) the statistics of the observed network
• This limits the possibilities
• Graphs in the distribution are a consequence of potentially overlapping
configurations
• The evolution of ties is not random; it is in some way dependent on
the environment around it
• In considering a parameter like reciprocity, it could be further
subdivided into other parameters that use node attributes, like
girl-girl reciprocity or girl-boy reciprocity
Different Models
• Bernoulli graph – Assumes edges are
independent
• Dyadic model – Assumes dyads are
independent
• Markov random graphs – Assumes tie
between two nodes is contingent on their ties
to other nodes (conditional dependence)
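For the simplest case, the Bernoulli graph, the model has a single edge parameter, and its maximum likelihood estimate is the log-odds of the observed density. The sketch below assumes an undirected graph without self-ties; the function name is hypothetical.

```python
import math

def bernoulli_edge_parameter(n_nodes, n_edges):
    """Log-odds of a tie under a Bernoulli (independent edges) model.

    Assumes an undirected graph with 0 < density < 1.
    """
    possible = n_nodes * (n_nodes - 1) // 2  # number of node pairs
    density = n_edges / possible
    return math.log(density / (1 - density))

# Hypothetical example: 10 nodes and 9 edges give a density of 0.2.
theta = bernoulli_edge_parameter(10, 9)
print(round(theta, 3))  # negative because fewer than half of the possible ties exist
```

Dyadic and Markov models add further terms (for example reciprocity, stars, and triangles) to this edge term, which is why they cannot be estimated from density alone.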
ERGM on Class Data in R