The Language of Networks: A Painless Introduction to Graph Theory

Agenda: Thursday, Feb 3
• Midterm date: Thursday, March 3
• New readings in Watts
• Our navigation experiment: some analysis
• Brief introduction to graph theory
News and Notes: Tuesday Feb 8
• From the Field: NY Times article 2/8 on hate groups on Orkut
• Duncan Watts talk Friday Feb 11 at noon!
• No MK office hours tomorrow
• Return of NW Construction, Task 1:
– first of all, staple your own work
– grading:
• 2/2: proceed as described
• 1/2: some problems, usually of specificity
• 0/2: fundamental flaw or lack of clarity
– if you received 2/2: leave your assignment here
– if you received 1/2: leave your assignment here, or revise and return on Thursday
– if you received 0/2: revise and return on Thursday
• Next Tuesday’s class:
– MK out of town, but mandatory class experiment
– once again, print and bring your Lifester neighbor profiles
• Today’s agenda:
– further analysis of Lifester NW navigation experiment
– quick review and completion of Intro to Graph Theory
– start on Social Network Theory
Description of the Experiment
• Participation is mandatory and for credit
• If you don’t have your Lifester neighbor profiles, you cannot participate
– unless you have memorized your neighbor info
• We will play two rounds
• In each round, each of you will be the source of a navigation chain
• You will be given a destination user to route a form to
• Give the form to one of your Lifester neighbors who you think is “closer” to the target
• Write your Lifester UserID on forms you receive, and continue to forward them towards their destinations
• Points will be deducted for violations of the neighborhood structure
• In one round, you will be given the Lifester profile of the destination
• In the other round, you will not be given the destination profile
• Then we’ll do some brief analysis with more detail to follow
Diameter: worst-case 5; average 2.86
With destination profile: optimal mean = 3.67; class mean = 5.18; delta = 1.51; 2 cycles
Without destination profile: optimal mean = 3.6; class mean = 5.48; delta = 1.86; 4 cycles
Comparison to Random Walks
[Figure: degree vs. betweenness, class chains. Axes: degree of user, number of chains.]
[Figure: degree vs. betweenness, optimal chains. Axes: degree of user, number of chains.]
A Brief Introduction to Graph Theory
Networked Life
CSE 112
Spring 2005
Prof. Michael Kearns
Undirected Graphs
• Recall our basic definitions:
– set of vertices denoted 1,…N; size of graph is N
– edge is an (unordered) pair (i,j)
• (i,j) is the same as (j,i)
• indicates that i and j are directly connected
– a graph G consists of the vertices and edges
– maximum number of edges: N(N-1)/2 (order N^2)
– i and j connected if there is a path of edges between them
– all-pairs shortest paths: efficient computation via Dijkstra's algorithm
• Subgraph of G:
– restrict attention to certain vertices and edges between them
• Connected components of G:
– subgraphs determined by mutual connectivity
– connected graph: only one connected component
– complete graph: edge between all pairs of vertices
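The definitions above can be made concrete with a short sketch. It assumes an adjacency-set representation (a dict mapping each vertex to the set of its neighbors) and uses breadth-first search, which is enough for unweighted graphs, in place of Dijkstra's algorithm. The function names and the example graph are illustrative and not part of the course material.

```python
from collections import deque

def make_graph(n, edges):
    """Build an undirected graph on vertices 1..n as {vertex: set(neighbors)}."""
    adj = {v: set() for v in range(1, n + 1)}
    for i, j in edges:
        adj[i].add(j)  # (i, j) is the same edge as (j, i),
        adj[j].add(i)  # so it is recorded in both directions
    return adj

def bfs_distances(adj, source):
    """Shortest path lengths (in edges) from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def connected_components(adj):
    """Group vertices by mutual connectivity."""
    seen, components = set(), []
    for v in adj:
        if v not in seen:
            component = set(bfs_distances(adj, v))
            seen |= component
            components.append(component)
    return components

def diameter(adj):
    """Worst-case shortest-path distance; assumes the graph is connected."""
    return max(max(bfs_distances(adj, v).values()) for v in adj)

# Hypothetical example: a path 1-2-3-4 plus a separate edge 5-6.
g = make_graph(6, [(1, 2), (2, 3), (3, 4), (5, 6)])
print(connected_components(g))
print(bfs_distances(g, 1))
```

Running `bfs_distances` from every vertex gives all-pairs shortest paths in time polynomial in N, which is what makes quantities such as the diameter cheap to compute.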
Complexity Theory in One Slide
[Figure: three plots of computation time vs. size of graph]
• linear functions: tractable
• polynomials (N^2, N^3): tractable
• exponential (2^N): intractable
• 1000^2 = 1 million
• 2^1000: not that many atoms!
• most known problems:
• either low-degree polynomial…
• … or exponential
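The gap between polynomial and exponential growth can be checked directly with Python's arbitrary-precision integers. The atom count below is an assumption: a rough, commonly quoted order of magnitude for the observable universe, used only for comparison.

```python
N = 1000

polynomial = N ** 2    # quadratic growth at N = 1000
exponential = 2 ** N   # exponential growth at N = 1000

# Assumption: ~10^80 atoms is a rough, commonly quoted estimate.
ATOMS_ESTIMATE = 10 ** 80

print(polynomial)                    # 1000000
print(len(str(exponential)))         # 302 decimal digits
print(exponential > ATOMS_ESTIMATE)  # True
```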
Properties and Measures of Graphs
Cliques and Independent Sets
• A clique in a graph G is a set of vertices:
– informal: that are all directly connected to each other
– formal: whose induced subgraph is complete
– all vertices in direct communication, exchange, competition, etc.
– the tightest possible “social structure”
– an edge is a clique of just 2 vertices
– generally interested in large cliques
• Independent set:
– set of vertices whose induced subgraph is empty (no edges)
– vertices entirely isolated from each other without help of others
• Maximum clique or independent set: largest in the graph
• Maximal clique or independent set: can’t grow any larger
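The difference between maximal and maximum is easy to see on a small example. The sketch below is illustrative only: the greedy routine, the brute-force search, and the four-vertex graph are assumptions made for the demonstration, not the course's own code.

```python
from itertools import combinations

def is_clique(adj, vertices):
    """True if every pair of the given vertices is joined by an edge."""
    return all(v in adj[u] for u, v in combinations(vertices, 2))

def greedy_maximal_clique(adj, order):
    """Scan vertices in the given order, keeping each one that is
    adjacent to everything kept so far. The result cannot be grown."""
    clique = []
    for v in order:
        if all(v in adj[u] for u in clique):
            clique.append(v)
    return clique

def maximum_clique(adj):
    """Brute force over subsets, largest first; exponential time."""
    vertices = list(adj)
    for size in range(len(vertices), 0, -1):
        for subset in combinations(vertices, size):
            if is_clique(adj, subset):
                return list(subset)
    return []

# Hypothetical graph: triangle 1-2-3 plus the extra edge 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}

print(greedy_maximal_clique(adj, [4, 3, 2, 1]))  # [4, 3]: maximal, not maximum
print(maximum_clique(adj))                       # [1, 2, 3]
```

The same two routines find independent sets when run on the complement graph, in which two vertices are adjacent exactly when they are not adjacent in the original.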
Some Interesting Properties
• Computation of cliques and independent sets:
– maximal: easy, can just be greedy
– maximum: difficult; believed to be intractable (NP-hard)
• best known algorithms take time exponential in graph size
– however, approximations are possible
• Social design and Ramsey theory:
– suppose large cliques or independent sets are viewed as “bad”
– e.g. in trade:
• large clique: too much collusion possible
• large independent set: impoverished subpopulation
– would be natural to want to find networks with neither
– Ramsey theory: may not be possible!
– Any graph with N vertices will have either a clique or an independent set of size at least (1/2) log2(N), i.e. of order log(N)
– A nontrivial “accounting identity”; more later
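The smallest interesting instance of the Ramsey statement can be verified by exhaustive search: every graph on 6 vertices contains a clique or an independent set of size 3, while 5 vertices are not enough. The sketch below enumerates all graphs on n labeled vertices; the function names are illustrative, and the approach only works for very small n.

```python
from itertools import combinations

def has_uniform_triple(n, edge_set):
    """True if some 3 vertices are all mutually adjacent (a clique)
    or all mutually non-adjacent (an independent set)."""
    for triple in combinations(range(n), 3):
        present = sum(pair in edge_set for pair in combinations(triple, 2))
        if present == 3 or present == 0:
            return True
    return False

def every_graph_has_uniform_triple(n):
    """Check all 2^(n(n-1)/2) graphs on n labeled vertices."""
    all_pairs = list(combinations(range(n), 2))
    for mask in range(2 ** len(all_pairs)):
        # Bit k of mask decides whether the k-th pair is an edge.
        edge_set = {pair for k, pair in enumerate(all_pairs) if (mask >> k) & 1}
        if not has_uniform_triple(n, edge_set):
            return False
    return True

print(every_graph_has_uniform_triple(5))  # False: the 5-cycle has neither
print(every_graph_has_uniform_triple(6))  # True
```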
Graph Colorings
• A coloring of an undirected graph is:
– an assignment of a color (label) to each vertex
– such that no two vertices connected by an edge have the same color
– chromatic number of graph G: fewest colors needed
• Example application:
– classes and exam slots
– chromatic number determines length of exam period
• Here’s a coloring demo
• Computation of chromatic numbers is hard
– (poor) approximations are possible
• Interesting fact: the four-color theorem for planar graphs
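The exam-slot application can be sketched with a simple greedy coloring. This is a heuristic: it always produces a valid coloring, but the number of colors it uses is only an upper bound on the chromatic number. The class names and conflicts are hypothetical.

```python
def greedy_coloring(adj):
    """Color vertices in order of decreasing degree, giving each the
    smallest color (0, 1, 2, ...) not used by an already-colored neighbor."""
    color = {}
    for v in sorted(adj, key=lambda u: -len(adj[u])):
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

# Hypothetical conflicts: an edge means the two classes share a student,
# so their exams cannot be in the same slot.
conflicts = {
    "Math": {"Physics", "CS"},
    "Physics": {"Math", "CS"},
    "CS": {"Math", "Physics", "Art"},
    "Art": {"CS"},
}

slots = greedy_coloring(conflicts)
print(slots)
print(len(set(slots.values())))  # number of exam slots used
```

Here Math, Physics, and CS form a triangle, so three slots are necessary, and the greedy result happens to be optimal for this example.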
Matchings in Graphs
• A matching of an undirected graph is:
– a subset of the edges
– such that no vertex is “touched” more than once
– perfect matching: every vertex touched exactly once
– perfect matchings may not always exist (e.g. N odd)
– maximum matching: largest number of edges
• Can be found efficiently; here is a perfect matching demo
• Example applications:
– pairing of compatible partners
• perfect matching: nobody “left out”
– jobs and qualified workers
• perfect matching: full employment, and all jobs filled
– clients and servers
• perfect matching: all clients served, and no server idle
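For the jobs-and-workers example, a maximum matching can be found with the classic augmenting-path method for bipartite graphs. The sketch below is one simple way to do it, not necessarily the method behind the demo; the worker and job names are hypothetical.

```python
def maximum_bipartite_matching(qualified):
    """qualified maps each worker to the list of jobs they can do.
    Returns a dict {job: worker} describing a maximum matching."""
    job_to_worker = {}

    def try_assign(worker, visited):
        for job in qualified[worker]:
            if job in visited:
                continue
            visited.add(job)
            # Take the job if it is free, or if its current holder
            # can be moved to some other job (an augmenting path).
            if job not in job_to_worker or try_assign(job_to_worker[job], visited):
                job_to_worker[job] = worker
                return True
        return False

    for worker in qualified:
        try_assign(worker, set())
    return job_to_worker

# Hypothetical qualifications.
qualified = {
    "ann": ["cook", "drive"],
    "bob": ["cook"],
    "cat": ["drive", "type"],
}

matching = maximum_bipartite_matching(qualified)
print(matching)
```

In this example ann initially takes "cook", then is moved to "drive" so that bob, who can only cook, is not left out; the result is a perfect matching.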
Cuts in Graphs
• A cut of a (connected) undirected graph is:
– a subset of the edges (edge cut) or vertices (vertex cut)
– such that the removal of this set would disconnect the graph
– min/maximum cut: smallest/largest (minimal) number
– minimum cuts can be computed efficiently; maximum cut is believed to be intractable (NP-hard)
• Often related to robustness of the network
– small cuts ~ vulnerability
– edge cut: failure of links
– vertex cut: failure of “individuals”
– random versus maliciously chosen failures (terrorism)
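The link between small cuts and vulnerability can be illustrated by looking for cuts of size one: single links or single individuals whose failure disconnects the network. The sketch below does this by brute force (remove each candidate, then test connectivity); it is meant for small examples, and the network shown is hypothetical.

```python
def is_connected(vertices, edges):
    """Depth-first search connectivity test on an undirected graph."""
    vertices = list(vertices)
    if not vertices:
        return True
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {vertices[0]}, [vertices[0]]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(vertices)

def single_edge_cuts(vertices, edges):
    """Edges whose removal alone disconnects the graph."""
    return [e for e in edges
            if not is_connected(vertices, [f for f in edges if f != e])]

def single_vertex_cuts(vertices, edges):
    """Vertices whose removal alone disconnects the rest of the graph."""
    cuts = []
    for v in vertices:
        rest = [u for u in vertices if u != v]
        kept = [(a, b) for a, b in edges if v not in (a, b)]
        if not is_connected(rest, kept):
            cuts.append(v)
    return cuts

# Hypothetical network: two triangles joined by the single link (3, 4).
vertices = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)]

print(single_edge_cuts(vertices, edges))    # [(3, 4)]
print(single_vertex_cuts(vertices, edges))  # [3, 4]
```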
Spanning Trees
• A spanning tree of a (connected) undirected graph is:
– a subgraph G’ of the original graph G
– such that G’ is connected but has no cycles (a tree)
– minimum spanning tree: smallest total edge weight (every spanning tree has exactly N-1 edges)
– computation: can be done efficiently
• Minimal subgraphs needed for complete communication
• Different spanning trees provide different solutions
• Applications:
– minimizing wire usage in circuit design
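For the wire-minimization application, one standard efficient method is Kruskal's algorithm: consider edges from cheapest to most expensive and keep each edge that does not close a cycle. The sketch below uses a small union-find structure; the terminals and wire lengths are hypothetical.

```python
def minimum_spanning_tree(vertices, weighted_edges):
    """Kruskal's algorithm. weighted_edges is a list of (weight, u, v).
    Returns the list of chosen edges; assumes the graph is connected."""
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent pointers to the representative of v's component,
        # shortening the path along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for weight, u, v in sorted(weighted_edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:          # edge joins two components: no cycle
            parent[root_u] = root_v
            tree.append((weight, u, v))
    return tree

# Hypothetical circuit: terminals a-d with candidate wire lengths.
terminals = ["a", "b", "c", "d"]
wires = [(4, "a", "b"), (1, "b", "c"), (3, "a", "c"), (2, "c", "d"), (5, "b", "d")]

tree = minimum_spanning_tree(terminals, wires)
print(tree)                       # [(1, 'b', 'c'), (2, 'c', 'd'), (3, 'a', 'c')]
print(sum(w for w, _, _ in tree)) # 6
```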
Summary of Graph Properties
object | type of object | interpretation | computational status
shortest paths and diameter | simple paths of vertices and edges | “distances” between vertices | can compute efficiently
cliques and independent sets | sets of vertices | fully connected or disconnected groups | max clique and max IS intractable
colorings | assignment of color to every vertex | edges are constraints, colors are resources | chromatic number intractable
matchings | set of edges | pairing of compatibles | max matching efficiently computable
cuts | set of edges or vertices | measure of robustness | min cut efficiently computable; max cut believed intractable
spanning trees | subgraph with no cycles | connected substructure | min spanning tree efficiently computable
Special Types of Graphs
Directed Graphs
• Graphs in which the edges have a direction
• Edge (u,v) means u → v; may also have (v,u)
• Common for capturing asymmetric relations
• Common examples:
– the web
– reporting/subordinate relationships
• corporate org charts
• code block diagrams
– causality diagrams
Weighted Graphs
• Each edge/vertex annotated by a weight or capacity
• Directed or undirected
• Used to model
– cost of transmission, latency
– capacity of link
– hubs and authorities; Google's PageRank algorithm
• Common problem: network flow, efficiently solvable
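As an illustration of an efficiently solvable flow problem, the sketch below computes a maximum flow in a directed, capacitated graph by repeatedly finding a shortest augmenting path with breadth-first search (the Edmonds-Karp method). The choice of algorithm, the function name, and the small network are assumptions made for the example.

```python
from collections import defaultdict, deque

def max_flow(capacity, source, sink):
    """capacity maps a directed edge (u, v) to its capacity.
    Returns the value of a maximum flow from source to sink."""
    residual = defaultdict(int)
    neighbors = defaultdict(set)
    for (u, v), c in capacity.items():
        residual[(u, v)] += c
        neighbors[u].add(v)
        neighbors[v].add(u)  # reverse direction lets later paths undo flow

    total = 0
    while True:
        # Breadth-first search for a path with spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in neighbors[u]:
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path left

        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]

        # Push that much flow along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            v = u
        total += bottleneck

# Hypothetical link capacities.
capacity = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1,
            ("a", "t"): 2, ("b", "t"): 3}
print(max_flow(capacity, "s", "t"))  # 5
```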
Planar Graphs
• Graphs which can be drawn in the plane with no edges crossing (except at vertices)
• Of interest for
– maps of the physical world
– circuit/VLSI design
– data visualization
• Graphs of higher genus
• Planarity testing efficiently solvable
Bipartite Graphs
• Vertices divided into two sets
• Edges only between the two sets
• Example: affiliation networks
– vertices are individuals and organizations
– edge if an individual belongs to an organization
• Men and women, servers and clients, jobs and workers
• Some problems easier to compute on bipartite graphs
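Whether a graph is bipartite can itself be checked efficiently by trying to split the vertices into two sides with breadth-first search. The sketch below is illustrative; the affiliation network of individuals and organizations is hypothetical.

```python
from collections import deque

def two_sides(adj):
    """Return (left, right) vertex sets if the graph is bipartite, else None."""
    side = {}
    for start in adj:
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in side:
                    side[v] = 1 - side[u]  # neighbors go on the opposite side
                    queue.append(v)
                elif side[v] == side[u]:
                    return None            # an edge inside one side
    left = {v for v, s in side.items() if s == 0}
    right = {v for v, s in side.items() if s == 1}
    return left, right

# Hypothetical affiliation network: edges join individuals to organizations.
affiliations = {
    "alice": {"chess club", "choir"},
    "bob": {"choir"},
    "chess club": {"alice"},
    "choir": {"alice", "bob"},
}
triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}

print(two_sides(affiliations))
print(two_sides(triangle))  # None: a triangle is not bipartite
```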
We’ll make use of these graph types… but will
generally be looking at classes of graphs generated
according to a probability distribution, rather than
obeying some fixed set of deterministic properties.