Centrality

1. Bavelas-Leavitt experiments
   a. effects of communication structure on performance
   b. one factor: distance from the integrator
2. Intuitive notions of centrality and centralization
   a. Freeman's review & distillation
      i. the center of a star is the ultimate in centrality
3. Centralization
   a. extent to which the graph revolves around a single individual
   b. sum of differences between the centrality of the most central node and the
      centrality of all others, divided by the maximum possible sum
4. Degree
   a. definition:
      i. number of nodes adjacent to a given node
      ii. number of edges incident upon a given node
      iii. ci = sumj(aij)
   b. normalization
   c. probability of transmission
   d. indegree and outdegree
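Degree centrality and Freeman-style degree centralization can be sketched in a few lines of pure Python. This is an illustrative toy, not the notes' software; the function names and the 5-node star graph are my own:

```python
# Sketch (illustrative, not from the notes' software): degree centrality
# and Freeman degree centralization on a toy undirected graph.

def degree_centrality(adj):
    """Row sums of a 0/1 adjacency matrix: c_i = sum_j a_ij."""
    return [sum(row) for row in adj]

def degree_centralization(adj):
    """Sum of (c_max - c_i) over all nodes, divided by the maximum
    possible sum, which for degree in an n-node graph is (n-1)(n-2)."""
    c = degree_centrality(adj)
    n = len(c)
    c_max = max(c)
    return sum(c_max - ci for ci in c) / ((n - 1) * (n - 2))

# Star graph on 5 nodes: node 0 is the center.
star = [[0, 1, 1, 1, 1],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0]]

print(degree_centrality(star))      # [4, 1, 1, 1, 1]
print(degree_centralization(star))  # 1.0 -- the star is maximally centralized
```

The star scoring exactly 1.0 matches the point above that the center of a star is the ultimate in centrality.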
5. Bonacich eigenvector centrality
   a. trouble with degree: two people could have the same degree, but one person
      knows only people who know hardly anyone, while the other knows people who
      know lots of people
   b. solution is iterated degree: a person's centrality is equal to the sum of
      the degrees of the people they are connected to:
      i. ci = sumj(xij*dj)
      ii. iterative version: let ci(0) = 1; ci(1) = sumj(xij*cj(0));
          ci(2) = sumj(xij*cj(1))
   c. but why stop with two iterations?
      i. ci(k) = sumj(xij*cj(k-1))
      ii. ci(k) increases as k increases, but the ratios of the values tend to
          stabilize, so that ci(100) = λ*ci(99)
      iii. so at some point λci = sumj(xij*cj), or in matrix form λc = Xc
      iv. c is an eigenvector of X and λ is an eigenvalue
   d. λcc' ≈ X, or xij ≈ λ*ci*cj: the principal eigenvector of a matrix is the
      vector which, when multiplied by itself (scaled by λ), gives the best
      possible estimate of the matrix values: it's a model of the matrix
   e. valued data
   f. directed data
      i. right eigenvector: λc = Xc
      ii. left eigenvector: cλ = cX (prestige)
      iii. in a symmetric matrix, the left eigenvector is the transpose of the
           right eigenvector
   g. to sum up, degree is how many people you know; eigenvector is how many
      people you know who are in the know
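The "iterated degree" idea above is exactly power iteration: repeatedly multiply by the adjacency matrix and rescale, and the scale factor settles at λ. A minimal pure-Python sketch (the graph and all names are illustrative; the graph is deliberately non-bipartite so the iteration converges):

```python
# Sketch: eigenvector centrality as iterated degree (power iteration).
# Assumes an undirected, non-bipartite 0/1 adjacency matrix.

def eigenvector_centrality(adj, iters=200):
    """Repeatedly set c_i <- sum_j x_ij * c_j and rescale; the scale
    factor converges to the dominant eigenvalue lambda."""
    n = len(adj)
    c = [1.0] * n                          # c(0) = 1 for everyone
    lam = 1.0
    for _ in range(iters):
        new = [sum(adj[i][j] * c[j] for j in range(n)) for i in range(n)]
        lam = max(new)                     # ratios stabilize: new ~ lambda * c
        c = [v / lam for v in new]         # rescale so values don't blow up
    return c, lam

# Nodes 0, 1 and 3 all have degree 2, but 0 and 1 sit in a triangle with
# well-connected node 2, while node 3's other contact is a pendant.
g = [[0, 1, 1, 0, 0],
     [1, 0, 1, 0, 0],
     [1, 1, 0, 1, 0],
     [0, 0, 1, 0, 1],
     [0, 0, 0, 1, 0]]
c, lam = eigenvector_centrality(g)
print(c[0] > c[3])  # True: same degree, but better-connected neighbors
```

This makes the motivation in (a) concrete: equal degree, unequal eigenvector scores.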
6. Closeness
   a. definition:
      i. total distance of a given node from all others (equivalent to the
         average distance, up to a constant factor)
      ii. sumj(dij)
   b. inverse measure
   c. needs a connected graph
   d. speed of transmission
   e. in-closeness and out-closeness
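Closeness is easy to sketch with breadth-first search on an unweighted, connected graph. An illustrative pure-Python version (names are mine, not from any package), using the common inverse normalization (n−1)/sumj(dij) so that bigger means more central:

```python
# Sketch: closeness centrality via BFS on an unweighted, connected graph.
# Raw closeness is the total geodesic distance to all others (lower = more
# central); the inverse, normalized version is (n-1) / sum_j d_ij.

from collections import deque

def distances_from(adj, s):
    """BFS geodesic distances from node s; adj is a dict of neighbor lists."""
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def closeness(adj):
    n = len(adj)
    out = {}
    for s in adj:
        total = sum(distances_from(adj, s).values())
        out[s] = (n - 1) / total      # inverse measure: bigger = closer
    return out

# Star on 5 nodes: the center reaches everyone in one step.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
print(closeness(star))  # center gets 1.0, each leaf gets 4/7
```

Note the connected-graph requirement from (c): with an unreachable node the distance sum is undefined.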
7. Betweenness
   a. definition:
      i. loosely: # of times someone needs a given node to get to someone else;
         measured as the number of paths that pass through a given point
      ii. more accurately: # of shortest paths, with an adjustment for the
          existence of alternative shortest paths between pairs of points: if
          there are two shortest paths, we count each as 1/2
      iii. gij = # of geodesics from i to j; gikj = # of geodesics from i to j
           via k; ck = sumi sumj (gikj/gij)
      iv. can interpret the inner term as the probability that a signal passes
          through k on the way from i to j
   b. control of transmission: broker, gatekeeper, liaison
   c. Pitts article
   d. based on geodesics only
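The formula ck = sumi sumj (gikj/gij) can be computed directly by counting geodesics with BFS, since gikj = gik * gkj whenever d(i,k) + d(k,j) = d(i,j). A small illustrative sketch (pure Python; graphs and names are mine):

```python
# Sketch: geodesic betweenness by counting shortest paths with BFS.
# sigma[s][t] = # of geodesics from s to t; a geodesic from i to j passes
# through k iff dist(i,k) + dist(k,j) = dist(i,j), and then there are
# sigma[i][k] * sigma[k][j] such geodesics.

from collections import deque

def bfs_counts(adj, s):
    dist, sigma = {s: 0}, {s: 1}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                q.append(v)
            if dist[v] == dist[u] + 1:   # u lies on a geodesic to v
                sigma[v] += sigma[u]
    return dist, sigma

def betweenness(adj):
    dist, sigma = {}, {}
    for s in adj:
        dist[s], sigma[s] = bfs_counts(adj, s)
    bet = {k: 0.0 for k in adj}
    for i in adj:
        for j in adj:
            if i == j:
                continue
            for k in adj:
                if k in (i, j):
                    continue
                if dist[i][k] + dist[k][j] == dist[i][j]:
                    bet[k] += sigma[i][k] * sigma[k][j] / sigma[i][j]
    return bet

# Path 0-1-2: node 1 is on the only geodesic between 0 and 2, and is
# counted for both ordered pairs (0,2) and (2,0).
path = {0: [1], 1: [0, 2], 2: [1]}
print(betweenness(path))   # {0: 0.0, 1: 2.0, 2: 0.0}

# 4-cycle: every pair at distance 2 has two rival geodesics, 1/2 each.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(betweenness(cycle))  # each node gets 1.0
```

The cycle example shows the 1/2 adjustment for alternative geodesics in action.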
8. Flow betweenness
   a. proportional reduction in maximum flow if a given node is removed from the
      network
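The idea behind the measure can be illustrated for a single pair of nodes: compute the maximum flow with and without a given node. A sketch using unit edge capacities and BFS augmenting paths (Edmonds-Karp); the graph and names are illustrative, and real flow betweenness sums this over all pairs:

```python
# Sketch: max flow with unit capacities (Edmonds-Karp), used to show the
# proportional reduction in flow when a node is removed.

from collections import deque

def max_flow(edges, s, t, removed=None):
    """Max flow from s to t over undirected unit-capacity edges,
    optionally ignoring one removed node."""
    cap = {}
    for u, v in edges:
        if removed in (u, v):
            continue
        cap[(u, v)] = cap.get((u, v), 0) + 1
        cap[(v, u)] = cap.get((v, u), 0) + 1
    flow = 0
    while True:
        parent = {s: None}                 # BFS for an augmenting path
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for (a, b), c in cap.items():
                if a == u and c > 0 and b not in parent:
                    parent[b] = u
                    q.append(b)
        if t not in parent:
            return flow
        v = t
        while parent[v] is not None:       # push one unit along the path
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

# Two edge-disjoint paths from 0 to 2: via 1 and via 3.
edges = [(0, 1), (1, 2), (0, 3), (3, 2)]
full = max_flow(edges, 0, 2)                  # 2
without_1 = max_flow(edges, 0, 2, removed=1)  # 1
print((full - without_1) / full)              # 0.5
```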
9. Influence
   a. Katz
      i. a person influences those they are connected to. These in turn
         influence those whom they are connected to, including the first person,
         who re-influences those he is connected to: a positive feedback loop
      ii. chains of influence, long or short, are just walks
      iii. total influence on others is the total number of walks of any length
           from a given node to all others
      iv. assume we can measure i's influence on j, maybe on a 1/0 scale: does
          or doesn't. One way to do it is to ask "who knows what he's talking
          about in this organization?" and then take the transpose of that,
          because if i mentions j as being knowledgeable, then j has influence
          over i. So R = A'
      v. the # of walks of length k from each person to each other is given by
         R^k, so the total influence matrix is W = R + R^2 + R^3 + ... + R^k +
         ..., and the total influence of i on all others is the row sums of W
         (C = W1)
      vi. two problems:
         (1) weights long walks as much as short walks (in fact more, since
             there are more long walks than short ones)
         (2) the series doesn't converge: the numbers go to infinity
      vii. solution:
         (1) introduce an attenuation factor α: assume that influence declines
             with distance, i.e. weight the walks inversely by their length:
             W = αR + α^2 R^2 + α^3 R^3 + ... + α^k R^k + ...
         (2) if α < 1 then as k approaches infinity, α^k approaches 0
         (3) conveniently, it happens that (I - αR)^-1 = α^0 R^0 + α^1 R^1 +
             α^2 R^2 + α^3 R^3 + ... + α^k R^k + ... as long as abs(1/α) > λ
             (the dominant eigenvalue of R), i.e. α < 1/λ
         (4) so W = (I - αR)^-1 - I, and C is again the row sums of W
            (a) [(I - αR)^-1 - I]e = (I - αR)^-1 e - Ie = (I - αR)^-1 e - e =
                row sums of (I - αR)^-1 minus 1
         (5) think of α as the probability that i persuades j
      viii. in the example, simple degree gives (40, 20, 20, 60, 20, 80) but
            Katz gives (47, 4, 4, 41, 22, 45). Node 1 has low degree but is
            chosen by the two nodes that everyone else chooses
         (1) node 1 is the expert who is not visible to the masses but is known
             by the cognoscenti
         (2) if we choose α = 1/λ, then we have 1/1.6837716, which is 0.5939048.
             The result is {61, 0, 0, 50, 29, 54}, which is the column
             eigenvector of the matrix
      ix. hard to pick α without empirical evidence, and if we choose
          arbitrarily we can get numbers that don't work: the matrix can't be
          inverted
         (1) helpful fact: λ <= the greatest row or column sum, and is only
             equal to that bound when all rows (or columns) have equal sums
         (2) so we can always pick α equal to 1/greatest_row_sum
         (3) incidentally, if we were to make the rows or columns stochastic
             (sum to 1), the eigenvalue would be 1, and the series would
             converge if α is just less than 1
         (4) status versus influence: row sums or column sums?
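The series definition of W can be checked directly against the display output at the end of these notes. A pure-Python sketch (not the notes' software) that sums the attenuated walk-count series for the 6-node example dataset with α = 0.5; truncating at 100 terms is ample here, since αλ ≈ 0.84:

```python
# Sketch: Katz influence W = aR + a^2 R^2 + ... by truncating the series,
# which converges because a = 0.5 < 1/lambda for this matrix.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def katz_influence(R, a, terms=100):
    n = len(R)
    W = [[0.0] * n for _ in range(n)]
    P = [[float(x) for x in row] for row in R]   # P holds R^k
    ak = a                                       # ak holds a^k
    for _ in range(terms):
        for i in range(n):
            for j in range(n):
                W[i][j] += ak * P[i][j]
        P = matmul(P, R)
        ak *= a
    return W

# The 6-node example matrix from the display output below.
R = [[0, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 0],
     [0, 1, 0, 0, 0, 0],
     [0, 0, 1, 0, 1, 1],
     [0, 0, 0, 1, 0, 0],
     [1, 1, 1, 0, 1, 0]]

W = katz_influence(R, 0.5)
rowsums = [round(sum(row), 2) for row in W]
print(rowsums)  # [13.0, 1.0, 1.0, 11.4, 6.2, 12.6]
```

The row sums match the Sum line of the descriptive statistics in the INFLUENCE output (attenuation factor 0.5000000) reproduced later in these notes.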
   b. Hubbell
      i. Hubbell wanted a measure in which the status of an actor is a function
         of the status of the people it is connected to, plus a factor due just
         to the person herself: ci = sumj(xij*cj) + ei, or in matrix terms,
         c = Xc + e
      ii. we can solve for c as follows: c = Xc + e; c - Xc = e; (I - X)c = e;
          c = (I - X)^-1 e
      iii. suppose e is a column vector of 1s. Then Hubbell's c is just the row
           marginals of (I - X)^-1. Let X = αR. Then c = row marginals of
           (I - αR)^-1, almost like Katz. In fact Hubbell = Katz + 1
      iv. but if e is not a vector of 1s, then Katz and Hubbell can be different
         (1) if e is the degree vector of X (e = X1), then Katz = Hubbell:
         (2) h = Xh + X1; h - Xh = X1; (I - X)h = X1; h = (I - X)^-1 X1
             = [X^0 + X^1 + ...]X1 = X^0 X1 + X^1 X1 + X^2 X1 + ...
             = X1 + X^2 1 + X^3 1 + ... = [X + X^2 + X^3 + ...]1 = KATZ
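The fixed-point form c = Xc + e suggests an equally direct computation: just iterate it until it converges (which it does when the dominant eigenvalue of X is below 1). A sketch (not the notes' software) using X = αR for the same example matrix and e a vector of ones, confirming Hubbell = Katz + 1:

```python
# Sketch: Hubbell status by fixed-point iteration of c = Xc + e.
# Converges because the dominant eigenvalue of X = 0.5*R is below 1.

def hubbell(X, e, iters=200):
    c = e[:]
    for _ in range(iters):
        c = [sum(X[i][j] * c[j] for j in range(len(X))) + e[i]
             for i in range(len(X))]
    return c

# The 6-node example matrix from the display output below.
R = [[0, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 0],
     [0, 1, 0, 0, 0, 0],
     [0, 0, 1, 0, 1, 1],
     [0, 0, 0, 1, 0, 0],
     [1, 1, 1, 0, 1, 0]]
a = 0.5
X = [[a * x for x in row] for row in R]

c = hubbell(X, [1.0] * 6)
print([round(v, 2) for v in c])  # [14.0, 2.0, 2.0, 12.4, 7.2, 13.6]
```

These are exactly the Katz row sums (13.0, 1.0, 1.0, 11.4, 6.2, 12.6) plus 1, as claimed in iii.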
   c. Bonacich revisited
      i. suppose e is a vector of zeros. Then ci = sumj(xij*cj), or c = Xc. When
         can this occur? When λ = 1, or when X = αR with α = 1/λ: then
         c = (1/λ)Rc, or λc = Rc. So in this case Hubbell = Bonacich
10. Power
   a. in general, intuitively, we expect centrality to be related to power, and
      there is lots of empirical evidence, but the relationship clearly differs
      by context
   b. Cook et al
      i. created a series of lab experiments in which pairs of actors must
         strike a deal with each other. They have to divide up 24 points, and
         can only make deals with people they are connected to, which is
         controlled by the experimenter
      ii. objective is to reveal the structural determinants of power in
          exchange networks. Had two basic theoretical ideas going in: Emerson
          dependency theory and network centrality
         (1) Emerson: A has power over B if B depends on A more than A depends
             on B
         (2) B depends on A to the extent that (1) there are few alternatives to
             A available to B, and (2) B really needs whatever A has got
            (a) in a star graph, the center has many options, the periphery has
                few
            (b) in a 5-point line, the center three have two options, but only B
                and D are connected to those with no options, so C is weak
         (3) in exchange experiments, (2) is a constant and can be disregarded
         (4) centrality: degree relates to dependency; closeness is hard to
             relate to power in these experiments; betweenness works to the
             extent that we consider only paths of length two
      iii. in the star graph, everything predicts the same, and agrees with the
           experimental results
      iv. in the 5-point line, dependency predicts the middle is weak, but
          centrality predicts the middle is strong. The empirical result is that
          the middle is weak
      v. in order to save the graph-theoretic orientation, they try to develop a
         new centrality measure called vulnerability = reduction in maximum flow
         (1) contains glimmerings of the notion of excludability, developed by
             Markovsky
         (2) is based on a specific process (power in exchange networks) rather
             than general principles
   c. Markovsky et al
      i. gpi = # of vertex-disjoint paths of odd length minus # of even-length
         paths
      ii. they add some additional assumptions: i seeks exchange with j only if
          p(i) > p(j) or p(i) - p(j) > p(i) - p(k) for all k
      iii. excludability
   d. Bonacich power
11. Purely graph-theoretic measures versus theoretical measures tied to a
    specific substantive arena
DISPLAY
─────────────────────────────────────────────────────────────────
Input dataset: C:\DATA\KATZ

        1  2  3  4  5  6
   1    0  0  0  1  0  1
   2    0  0  1  0  0  0
   3    0  1  0  0  0  0
   4    0  0  1  0  1  1
   5    0  0  0  1  0  0
   6    1  1  1  0  1  0

Indegree:
   2  1  1  3  1  4
INFLUENCE
─────────────────────────────────────────────────────────────────
Attenuation factor: 0.5000000
Method:             KATZ
Input dataset:      C:\DATA\KATZ

          1     2     3     4     5     6
        ----- ----- ----- ----- ----- -----
   1     1.00  2.67  3.33  2.00  2.00  2.00
   2     0.00  0.33  0.67  0.00  0.00  0.00
   3     0.00  0.67  0.33  0.00  0.00  0.00
   4     0.80  2.40  3.20  1.40  2.00  1.60
   5     0.40  1.20  1.60  1.20  1.00  0.80
   6     1.20  2.93  3.47  1.60  2.00  1.40

Descriptive Statistics (of the rows)

                 1     2     3     4     5     6
               ----- ----- ----- ----- ----- -----
  1 Mean        2.17  0.17  0.17  1.90  1.03  2.10
  2 Std Dev     0.71  0.25  0.25  0.76  0.37  0.83
  3 Sum        13.00  1.00  1.00 11.40  6.20 12.60
  4 Variance    0.51  0.06  0.06  0.58  0.14  0.69
  5 Euc Norm    5.59  0.75  0.75  5.02  2.69  5.53
  6 Minimum     1.00  0.00  0.00  0.80  0.40  1.20
  7 Maximum     3.33  0.67  0.67  3.20  1.60  3.47
  8 N of Obs    6.00  6.00  6.00  6.00  6.00  6.00

Descriptive Statistics (of the columns)

        Mean  Std Dev   Sum   Variance  Euc Norm  Minimum  Maximum  N of Obs
       ------ ------- ------- -------- --------- -------- -------- ---------
   1    0.57    0.47    3.40    0.22      1.80      0.00     1.20     6.00
   2    1.70    1.01   10.20    1.02      4.84      0.33     2.93     6.00
   3    2.10    1.29   12.60    1.67      6.04      0.33     3.47     6.00
   4    1.03    0.77    6.20    0.59      3.16      0.00     2.00     6.00
   5    1.17    0.90    7.00    0.81      3.61      0.00     2.00     6.00
   6    0.97    0.77    5.80    0.59      3.03      0.00     2.00     6.00
EIGENVECTOR ANALYSIS

Eigenvalue = 1.6837716

Column Eigenvector:
        1          2          3          4          5          6
   0.61439642 0.00000000 0.00000000 0.49500917 0.29398832 0.53949405

Row Eigenvector:
        1        2        3        4        5        6
   0.30276  1.04830  1.25532  0.55559  0.63273  0.50978
INFLUENCE
─────────────────────────────────────────────────────────────────
Attenuation factor: 0.5939048
Method:             KATZ
Input dataset:      C:\DATA\KATZ

             1         2     3        4            5            6
   1  112660288.00   0.00  0.00  90768560.00  53907884.00  98925640.00
   2  390083616.00   0.54  0.92 314284000.00 186654784.00 342527712.00
   3  467117504.00   0.92  0.54 376348960.00 223515456.00 410170240.00
   4  206741392.00   0.00  0.00 166568176.00  98925632.00 181537120.00
   5  235444992.00   0.00  0.00 189694192.00 112660288.00 206741392.00
   6  189694192.00   0.00  0.00 152833520.00  90768560.00 166568176.00

Descriptive Statistics COLUMN MEANS ONLY
   266956992  0.24  0.24  215082896  127738768  234411712

Descriptive Statistics ROW MEANS ONLY
        1            2            3            4            5            6
   59377064.00 205591680.00 246192032.00 108962056.00 124090144.00  99977408.00