lect09.pptx

Overview
Social Goal. Explain why information and disease
spread so quickly in social networks.
Mathematical Approach. Model social networks as
random graphs and argue that they are likely to
have low diameter.
The Science of Networks
6.1
Definition: The clustering coefficient of a node v
is the fraction of pairs of v’s friends that are
connected to each other by edges.
Example (figure): clustering coefficient = 1/2
The higher the clustering coefficient of a node, the
more strongly triadic closure is acting on it
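The definition above can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the dictionary-of-sets graph representation, the function name `clustering_coefficient`, and the toy network are all assumptions made for the example.

```python
from itertools import combinations

def clustering_coefficient(graph, v):
    """Fraction of pairs of v's friends that are themselves connected.

    `graph` maps each node to the set of its neighbors (undirected).
    Returns 0.0 when v has fewer than two friends; that is a convention
    chosen here, since the fraction is undefined in that case.
    """
    friends = graph[v]
    if len(friends) < 2:
        return 0.0
    pairs = list(combinations(friends, 2))
    linked = sum(1 for a, b in pairs if b in graph[a])
    return linked / len(pairs)

# Hypothetical toy network: v has four friends; a, b, c all know each other.
toy = {
    "v": {"a", "b", "c", "d"},
    "a": {"v", "b", "c"},
    "b": {"v", "a", "c"},
    "c": {"v", "a", "b"},
    "d": {"v"},
}

print(clustering_coefficient(toy, "v"))
```

In the toy network, 3 of the 6 pairs of v's friends are connected, so v has the same coefficient as the slide's example.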
Erdős–Rényi graph models


Randomly choose
How well does the E-R model characterize real networks?
• high school friendship networks
• road networks
• peer-to-peer filesharing networks
• product co-purchase networks (people who bought X also bought Y)
• other?
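For reference when comparing against real networks, here is a minimal sketch of sampling an E-R graph. It assumes the G(n, p) variant, in which every possible edge is included independently with probability p; the slide does not say which variant is meant. The function name `gnp_random_graph` and the parameter values are illustrative.

```python
import random

def gnp_random_graph(n, p, seed=None):
    """Sample G(n, p): each of the n*(n-1)/2 possible edges appears with probability p."""
    rng = random.Random(seed)
    graph = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                graph[u].add(v)
                graph[v].add(u)
    return graph

# Illustrative parameters, not taken from the lecture.
g = gnp_random_graph(200, 0.1, seed=1)
mean_degree = sum(len(nbrs) for nbrs in g.values()) / len(g)
print(f"mean degree {mean_degree:.1f}, expected about {0.1 * 199:.1f}")
```

Every node in this model has the same expected degree, p(n - 1), which is one property to check against the networks listed above.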
Random graph diameter
Technique. Grow trees to bound path lengths.
1. Show that when trees are small enough (< √n), the # of leaves doubles.
2. Grow small trees (< √n) around a pair of nodes. Use the birthday paradox to argue the trees probably intersect.
3. Conclude the diameter is 2 log √n = log n.
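The tree-growing idea can be observed empirically with a breadth-first search. The sketch below is an illustration under assumptions, not the lecture's argument: it uses a G(n, p) graph with average degree 12 (so layers grow roughly twelvefold rather than doubling), and the helper names `random_graph` and `bfs_layer_sizes` are made up for the example.

```python
import math
import random
from collections import deque

def random_graph(n, avg_degree, seed=None):
    """G(n, p) with p = avg_degree / n, stored as adjacency sets (assumed model)."""
    rng = random.Random(seed)
    p = avg_degree / n
    graph = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                graph[u].add(v)
                graph[v].add(u)
    return graph

def bfs_layer_sizes(graph, source):
    """Number of nodes first reached at each distance from `source`."""
    dist = {source: 0}
    sizes = [1]
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                if dist[w] == len(sizes):
                    sizes.append(0)  # first node seen at a new depth
                sizes[dist[w]] += 1
                queue.append(w)
    return sizes

n = 1000
layers = bfs_layer_sizes(random_graph(n, avg_degree=12, seed=0), 0)
print("nodes first reached at each distance:", layers)
print("tree depth:", len(layers) - 1, "versus log2(n) =", round(math.log2(n), 1))
```

The early layers grow geometrically while the tree is small, and growth stalls only once the tree covers a large share of the graph, which is why the depth stays logarithmic in n.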
Small world phenomenon
Milgram’s experiment (1960s).
Ask someone to pass a letter to another person via
friends knowing only the name, address, and
occupation of the target.
Small world phenomenon
Bob, a farmer in Nebraska
David, mayor of Bob’s town
Bernard, David’s cousin, who went to college with
Maya, who grew up in Boston with
Lashawn
Last time: short paths exist.
(argument: flood the network)
This time: and people can find short paths!
(without flooding the network)
The Milgram experiment
Experiment:
We choose a random “source” and “target”.
Your goal: pass ball from “source” to “target” by
throwing it to people you know on a first-name
basis.
How did you find short paths?
What information did you use to choose the next hop?
Social network models
Random rolodex model
Random wiring model
Random wiring model
People have a predictable structure of
local links (e.g., neighbors, colleagues)
And a few random long-range links
(e.g., someone you meet on a trip)
Random wiring model
Local links have grid-like structure (you know the
person on your left/right and front/back)
Figure: n × n grid.
Random wiring model
Local links represent homophily, the idea that we
know people similar to us
Random wiring model
Long-range links are random – each node chooses
one long-range link
Random wiring model
Long-range links represent weak ties, the links to
acquaintances that would otherwise be far away
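A minimal sketch of the random wiring model, to make the construction concrete. Several choices here are assumptions rather than part of the slides: long-range links are drawn uniformly at random, they are treated as undirected, and the function names and the grid size n = 30 are illustrative.

```python
import random
from collections import deque

def grid_with_random_links(n, long_range=True, seed=None):
    """n x n grid of local links, plus one uniformly random long-range link per node."""
    rng = random.Random(seed)
    nodes = [(x, y) for x in range(n) for y in range(n)]
    graph = {v: set() for v in nodes}
    for (x, y) in nodes:
        for dx, dy in ((1, 0), (0, 1)):  # right and front neighbors
            w = (x + dx, y + dy)
            if w in graph:
                graph[(x, y)].add(w)
                graph[w].add((x, y))
    if long_range:
        for v in nodes:
            w = rng.choice(nodes)
            if w != v:  # a node that draws itself simply gets no long-range link
                graph[v].add(w)
                graph[w].add(v)
    return graph

def eccentricity(graph, source):
    """Largest shortest-path distance from `source`, found by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())

n = 30
grid_only = eccentricity(grid_with_random_links(n, long_range=False), (0, 0))
wired = eccentricity(grid_with_random_links(n, seed=0), (0, 0))
print("farthest node from a corner, grid only:", grid_only)
print("farthest node from a corner, with long-range links:", wired)
```

Without long-range links the farthest node from a corner is 2(n - 1) steps away; comparing the two printed values shows how much the weak ties shorten the longest path.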
Random wiring model
What do you expect the diameter to be?
Do you expect people to find short paths?
If so, how short?
Short paths?
What if long-range links are uniformly random?
Problem: We’ll end up repeatedly overshooting the target.
Navigational clues are lost in long-range links.
Short paths?
What if long-range links are very localized?
Problem: We’ll end up taking forever to get anywhere.
Increases path length.
Decentralized search
Idea: Suppose long-range links are just slightly
more likely to connect to nearby nodes.
Result: Then decentralized search finds short paths.
Tradeoff
Figure: path length as the long-range link distribution varies from uniform to highly local, with one curve for the actual path length and one for the discovered path length.
Optimal tradeoff
Suppose the probability of a long-range link is proportional to (1/distance)^2,
i.e., inverse square.
Inverse square?? WTF?
Inverse square intuition
“Scales of resolution” in an address:
LSRC D235 (building, room number)
450 Research Dr. (street)
Durham, NC (city, state)
USA (country)
Scales of resolution
Street: location within 2 miles
City: location within 4 miles
County: location within 8 miles
Each new scale doubles distance from the center.
Long-range links are equally likely to connect to each
different scale of resolution!
(This allows people to make progress towards their
destination no matter how far away they are.)
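A short numeric check of why inverse-square weights spread long-range links evenly over the scales. The sketch assumes Manhattan distance on an unbounded two-dimensional grid, where exactly 4d nodes lie at distance d from any center; the function name `scale_weights` is made up for the example.

```python
import math

def scale_weights(num_scales):
    """Total inverse-square weight of the grid nodes in each distance band.

    Band j holds the nodes at Manhattan distance d with 2**j <= d < 2**(j + 1)
    from a fixed center; there are 4 * d such nodes at each distance d >= 1.
    """
    weights = []
    for j in range(num_scales):
        total = sum(4 * d * (1.0 / d**2) for d in range(2**j, 2**(j + 1)))
        weights.append(total)
    return weights

weights = scale_weights(10)
for j, w in enumerate(weights):
    print(f"scale {j}: distances {2**j}..{2**(j + 1) - 1}, weight {w:.3f}")
print(f"limit 4 ln 2 = {4 * math.log(2):.3f}")
```

Each band holds about four times as many nodes as the previous one, but each node is about four times less likely to be chosen, so the total weight per band settles near the constant 4 ln 2. With uniform links the weight per band would instead grow fourfold at every scale.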
How many people do you know?
How well did this work?
For me:
Neighborhood (Trinity Heights)
Region (RTP)
State (NC)
Southeast
Eastern US
United States
World
Next topic
decentralized search
How to route
(Sub)problem. How can I get this message
from me to the far-away target?
Solution. Pass message to a closer friend.
Scales of resolution
Each new scale
doubles distance
from the center.
Long-range links
Suppose each person has a long-range friend in each
scale of resolution.
How to route
Algorithm. Pass the message to your
farthest friend that is to the left of the
target.
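The routing rule can be sketched on a one-dimensional line of people. The simplifications are assumptions of this example, not of the lecture: the person at position x has a friend at exactly x + 2^j for every scale j, and the source sits to the left of the target, so "to the left of the target" means a hop that does not overshoot it. The function name `route` is illustrative.

```python
def route(source, target):
    """Greedy routing on a line where position x knows x + 2**j for every j >= 0.

    Assumes source <= target. At each step the message goes to the farthest
    friend that does not pass the target.
    """
    path = [source]
    current = source
    while current < target:
        gap = target - current
        hop = 1 << (gap.bit_length() - 1)  # largest power of two <= gap
        current += hop
        path.append(current)
    return path

path = route(0, 1000)
print("route:", path)
print("hops:", len(path) - 1)
```

Because the chosen hop covers at least half of the remaining gap, the distance to the target at least halves with every step.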
Trace of route
Analysis
Figure: distance scale marked 1, 2, 4, …, 2^j, 2^(j+1), with the old distance and the new distance labeled.
Distance is cut in half every step!
Analysis
1. Original distance is at most n.
2. Distance is cut in half every step (at least).
3. Number of steps is at most log n.
And in real life …
Finding the Short Paths

• Milgram’s experiment, Columbia Small Worlds, E-R, α-model…
  – all emphasize existence of short paths between pairs
• How do individuals find short paths?
  – in an incremental, next-step fashion
  – using purely local information about the network and location of target
  – note: shortest path might require taking steps “away” from the target!
• This is not (only) a structural question, but an algorithmic one
  – statics vs. dynamics
• Navigability may impose additional restrictions on formation model!
• Briefly investigate two alternatives:
  – a local/long-distance mixture model [Kleinberg]
  – a “social identity” model [Watts, Dodds, Newman]
Kleinberg’s Model

• Start with an n by n grid of vertices (so N = n^2)
  – add some long-distance connections to each vertex:
    • k additional connections
    • probability of connection to grid distance d: ~ (1/d)^r
      – c.f. dollar bill migration paper
  – so full model given by choice of k and r
  – large r: heavy bias towards “more local” long-distance connections
  – small r: approach uniformly random
• Kleinberg’s question:
  – what value of r permits effective navigation?
  – # hops << N, e.g. log(N)
• Assume parties know only:
  – grid address of target
  – addresses of their own immediate neighbors
• Algorithm: pass message to nbr closest to target in grid
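The model and the forwarding rule can be simulated directly. This is a rough sketch under stated assumptions, not Kleinberg's construction in full: it uses k = 1, Manhattan grid distance, and draws each node's long-range contact lazily when the message arrives there (safe here because greedy routing never revisits a node). The names `manhattan`, `long_range_contact`, and `greedy_route` are illustrative.

```python
import random

def manhattan(a, b):
    """Grid distance between two (x, y) addresses."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def long_range_contact(v, n, r, rng):
    """Pick one long-range contact for v with probability proportional to (1/d)^r."""
    others = [(x, y) for x in range(n) for y in range(n) if (x, y) != v]
    weights = [manhattan(v, w) ** -r for w in others]
    return rng.choices(others, weights=weights, k=1)[0]

def greedy_route(source, target, n, r, rng):
    """Hops taken when each node forwards to its neighbor closest to the target."""
    current, hops = source, 0
    while current != target:
        x, y = current
        neighbors = [(x + dx, y + dy)
                     for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                     if 0 <= x + dx < n and 0 <= y + dy < n]
        neighbors.append(long_range_contact(current, n, r, rng))  # k = 1
        current = min(neighbors, key=lambda w: manhattan(w, target))
        hops += 1
    return hops

rng = random.Random(0)
n = 30
for r in (0, 2, 4):
    trials = [greedy_route((0, 0), (n - 1, n - 1), n, r, rng) for _ in range(20)]
    print(f"r = {r}: mean hops {sum(trials) / len(trials):.1f}")
```

Some grid neighbor is always one step closer to the target, so the route never takes more hops than the grid distance. On a grid this small the differences between values of r are modest; the separation in Kleinberg's analysis is asymptotic in N.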
Kleinberg’s Result

• Intuition:
  – if r is too large (strong local bias), then “long-distance” connections never help much; short paths may not even exist (remember, grid has large diameter, ~ sqrt(N))
  – if r is too small (no local bias), we may quickly get close to the target; but then we’ll have to use grid links to finish
    • think of a transport system with only long-haul jets or donkey carts
  – effective search requires a delicate mixture of link distances
• The result (informally):
  – r = 2 is the only value that permits rapid navigation (polylogarithmic: ~log(N)^2 steps)
  – any other value of r will result in time ~ N^c for 0 < c <= 1
    • N^c >> log(N)^2 for large N
  – a critical value phenomenon or “knife’s edge”; very sensitive
• Note: locality of information crucial to this argument
  – centralized, “birds-eye” algorithm can still compute short paths at r < 2!
  – can recognize when “backwards” steps are beneficial
Navigation via Identity

• Watts et al.:
  – we don’t navigate social networks by purely “geographic” information
  – we don’t use any single criterion; recall Dodds et al. on Columbia SW
  – different criteria used at different points in the chain
• Represent individuals by a vector of attributes
  – profession, religion, hobbies, education, background, etc…
  – attribute values have distances between them (tree-structured)
  – distance between individuals: minimum distance in any attribute
  – only need one thing in common to be close!
• Algorithm:
  – given attribute vector of target
  – forward message to neighbor closest to target
• Permits fast navigation under broad conditions
  – not as sensitive as Kleinberg’s model
Figure: attribute tree for profession: all jobs, split into scientists (CS, chemistry) and athletes (track, soccer).
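The identity-based distance and forwarding rule can be sketched as follows. This is an illustration under assumptions, not the model from the Watts et al. paper in full: each attribute value is written as its path from the root of that attribute's tree, all paths within one attribute have equal length, and tree distance is the number of levels up to the lowest common ancestor. The "home" attribute and the three people are hypothetical.

```python
def tree_distance(path_a, path_b):
    """Levels to climb from a leaf to the lowest common ancestor of two values.

    Each value is its path from the root of the attribute's tree, for example
    ("all jobs", "scientists", "CS"). Assumes both paths have equal length.
    """
    common = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        common += 1
    return len(path_a) - common

def identity_distance(person_a, person_b):
    """Minimum tree distance over the attributes both people have."""
    return min(tree_distance(person_a[k], person_b[k])
               for k in person_a if k in person_b)

def next_hop(neighbors, target):
    """Name of the neighbor closest to the target in identity distance."""
    return min(neighbors, key=lambda name: identity_distance(neighbors[name], target))

# Hypothetical people; the job tree follows the slide's figure.
target = {"job": ("all jobs", "scientists", "CS"), "home": ("USA", "NC", "Durham")}
neighbors = {
    "ann": {"job": ("all jobs", "scientists", "chemistry"), "home": ("USA", "NE", "Omaha")},
    "bob": {"job": ("all jobs", "athletes", "soccer"), "home": ("USA", "NC", "Durham")},
    "cat": {"job": ("all jobs", "athletes", "track"), "home": ("USA", "MA", "Boston")},
}

for name, person in neighbors.items():
    print(name, identity_distance(person, target))
print("forward to:", next_hop(neighbors, target))
```

Taking the minimum over attributes is what makes one shared trait enough: bob is far from the target by profession but is chosen because he matches on location.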