Agenda: Thursday, Feb 3
• Midterm date: Thursday, March 3
• New readings in Watts
• Our navigation experiment: some analysis
• Brief introduction to graph theory

News and Notes: Tuesday Feb 8
• From the Field: NY Times article 2/8 on hate groups on Orkut
• Duncan Watts talk Friday Feb 11 at noon!
• No MK office hours tomorrow
• Return of NW Construction, Task 1:
  – first of all, staple your own work
  – grading:
    • 2/2: proceed as described
    • 1/2: some problems, usually of specificity
    • 0/2: fundamental flaw or lack of clarity
  – if you received 2/2: leave your assignment here
  – if you received 1/2: leave your assignment here, or revise and return on Thursday
  – if you received 0/2: revise and return on Thursday
• Next Tuesday’s class:
  – MK out of town, but mandatory class experiment
  – once again, print and bring your Lifester neighbor profiles
• Today’s agenda:
  – further analysis of Lifester NW navigation experiment
  – quick review and completion of Intro to Graph Theory
  – start on Social Network Theory

Description of the Experiment
• Participation is mandatory and for credit
• If you don’t have your Lifester neighbor profiles, you cannot participate
  – unless you have memorized your neighbor info
• We will play two rounds
• In each round, each of you will be the source of a navigation chain
• You will be given a destination user to route a form to
• Give the form to one of your Lifester neighbors who you think is “closer” to the target
• Write your Lifester UserID on forms you receive, and continue to forward them towards their destinations
• Points will be deducted for violations of the neighborhood structure
• In one round, you will be given the Lifester profile of the destination
• In the other round, you will not be given the destination profile
• Then we’ll do some brief analysis, with more detail to follow

diameter: worst-case 5, average 2.86
With destination profile: optimal mean = 3.67, class mean = 5.18, delta = 1.51; 2 cycles
Without destination profile: optimal mean = 3.6, class mean = 5.48, delta = 1.86; 4 cycles

Comparison to Random Walks
[Figure: degree vs. betweenness, class chains; axes: degree of user, number of chains]
[Figure: degree vs. betweenness, optimal chains; axes: degree of user, number of chains]

A Brief Introduction to Graph Theory
Networked Life
CSE 112
Spring 2005
Prof. Michael Kearns

Undirected Graphs
• Recall our basic definitions:
  – set of vertices denoted 1,…,N; size of graph is N
  – edge is an (unordered) pair (i,j)
    • (i,j) is the same as (j,i)
    • indicates that i and j are directly connected
  – a graph G consists of the vertices and edges
  – maximum number of edges: N(N-1)/2 (order N^2)
  – i and j connected if there is a path of edges between them
  – all-pairs shortest paths: efficient computation, e.g. by running Dijkstra's algorithm from each vertex
• Subgraph of G:
  – restrict attention to certain vertices and edges between them
• Connected components of G:
  – subgraphs determined by mutual connectivity
  – connected graph: only one connected component
  – complete graph: edge between all pairs of vertices

Complexity Theory in One Slide
[Figure: three plots of computation time vs. size of graph: linear functions (tractable); polynomials N^2 and N^3 (tractable); exponential 2^N (intractable)]
• 1000^2 = 1 million
• 2^1000: there are not that many atoms in the universe!
• most known problems:
  • either low-degree polynomial…
  • … or exponential

Properties and Measures of Graphs

Cliques and Independent Sets
• A clique in a graph G is a set of vertices:
  – informal: that are all directly connected to each other
  – formal: whose induced subgraph is complete
  – all vertices in direct communication, exchange, competition, etc.
  – the tightest possible “social structure”
  – an edge is a clique of just 2 vertices
  – generally interested in large cliques
• Independent set:
  – set of vertices whose induced subgraph is empty (no edges)
  – vertices entirely isolated from each other without help of others
• Maximum clique or independent set: largest in the graph
• Maximal clique or independent set: cannot be enlarged by adding another vertex

Some Interesting Properties
• Computation of cliques and independent sets:
  – maximal: easy, can just be greedy
  – maximum: difficult, believed to be intractable (NP-hard)
    • computation time of known algorithms scales exponentially with graph size
  – however, approximations are possible
• Social design and Ramsey theory:
  – suppose large cliques or independent sets are viewed as “bad”
  – e.g. in trade:
    • large clique: too much collusion possible
    • large independent set: impoverished subpopulation
  – would be natural to want to find networks with neither
  – Ramsey theory: may not be possible!
  – Any graph with N vertices will have either a clique or an independent set of size at least (1/2) log_2(N)
  – A nontrivial “accounting identity”; more later

Graph Colorings
• A coloring of an undirected graph is:
  – an assignment of a color (label) to each vertex
  – such that no pair connected by an edge has the same color
  – chromatic number of graph G: fewest colors needed
• Example application:
  – classes and exam slots
  – chromatic number determines length of exam period
• Here’s a coloring demo
• Computation of chromatic numbers is hard
  – (poor) approximations are possible
• Interesting fact: the four-color theorem for planar graphs (every planar graph can be colored with at most four colors)

Matchings in Graphs
• A matching of an undirected graph is:
  – a subset of the edges
  – such that no vertex is “touched” more than once
  – perfect matching: every vertex touched exactly once
  – perfect matchings may not always exist (e.g. N odd)
  – maximum matching: largest number of edges
• Can be found efficiently; here is a perfect matching demo
• Example applications:
  – pairing of compatible partners
    • perfect matching: nobody “left out”
  – jobs and qualified workers
    • perfect matching: full employment, and all jobs filled
  – clients and servers
    • perfect matching: all clients served, and no server idle

Cuts in Graphs
• A cut of a (connected) undirected graph is:
  – a subset of the edges (edge cut) or vertices (vertex cut)
  – such that the removal of this set would disconnect the graph
  – minimum/maximum cut: the smallest/largest such (minimal) set
  – computation: minimum cut can be found efficiently; maximum cut, in its standard form, is NP-hard
• Often related to robustness of the network
  – small cuts ~ vulnerability
  – edge cut: failure of links
  – vertex cut: failure of “individuals”
  – random versus maliciously chosen failures (terrorism)

Spanning Trees
• A spanning tree of a (connected) undirected graph is:
  – a subgraph G’ of the original graph G containing all vertices of G
  – such that G’ is connected but has no cycles (a tree)
  – minimum spanning tree: smallest total edge weight (every spanning tree has exactly N-1 edges)
  – computation: can be done efficiently
• Minimal subgraphs needed for complete communication
• Different spanning trees provide different solutions
• Applications:
  – minimizing wire usage in circuit design

Summary of Graph Properties
object | type of object | interpretation | computational status
shortest paths and diameter | simple paths of vertices and edges | “distances” between vertices | can compute efficiently
cliques and independent sets | sets of vertices | fully connected or disconnected groups | max clique and max IS intractable
colorings | assignment of color to every vertex | edges are constraints, colors are resources | chromatic number intractable
matchings | set of edges | pairing of compatibles | max matching efficiently computable
cuts | set of edges or vertices | measure of robustness | min cut efficiently computable; max cut NP-hard in general
spanning trees | subgraph with no cycles | connected substructure | min spanning tree efficiently computable

Special Types of Graphs
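
Before the special graph types, several entries in the summary above can be made concrete with a short sketch. It is illustrative only: the adjacency-dictionary representation, the function names, and the 5-vertex example graph are assumptions, not material from the lecture. It computes hop distances and the diameter by breadth-first search (which suffices for unweighted graphs), grows a maximal (not necessarily maximum) clique greedily, and extracts one spanning tree.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable vertex (unweighted BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter(adj):
    """Worst-case shortest-path length; assumes the graph is connected."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)


def greedy_maximal_clique(adj):
    """Add each vertex that is adjacent to everything chosen so far.

    The result is maximal (cannot be extended) but may not be maximum.
    """
    clique = []
    for v in sorted(adj):
        if all(v in adj[u] for u in clique):
            clique.append(v)
    return clique


def bfs_spanning_tree(adj, root):
    """Edges of one spanning tree of a connected graph, found by BFS."""
    seen = {root}
    tree = []
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u]):
            if v not in seen:
                seen.add(v)
                tree.append((u, v))
                queue.append(v)
    return tree


# Hypothetical example: a triangle 1-2-3 with a tail 3-4-5.
EXAMPLE = {
    1: {2, 3},
    2: {1, 3},
    3: {1, 2, 4},
    4: {3, 5},
    5: {4},
}

if __name__ == "__main__":
    print("diameter:", diameter(EXAMPLE))
    print("maximal clique:", greedy_maximal_clique(EXAMPLE))
    print("spanning tree edges:", bfs_spanning_tree(EXAMPLE, 1))
```

Any spanning tree returned this way has N-1 edges, which is why "minimum" only becomes meaningful once edges carry weights.
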

Directed Graphs
• Graphs in which the edges have a direction
• Edge (u,v) means u → v; may also have (v,u)
• Common for capturing asymmetric relations
• Common examples:
  – the web
  – reporting/subordinate relationships
    • corporate org charts
    • code block diagrams
  – causality diagrams

Weighted Graphs
• Each edge/vertex annotated by a weight or capacity
• Directed or undirected
• Used to model
  – cost of transmission, latency
  – capacity of link
  – vertex importance scores (hubs and authorities; Google's PageRank algorithm)
• Common problem: network flow, efficiently solvable

Planar Graphs
• Graphs which can be drawn in the plane with no edges crossing (except at vertices)
• Of interest for
  – maps of the physical world
  – circuit/VLSI design
  – data visualization
• Graphs of higher genus
• Planarity testing efficiently solvable

Bipartite Graphs
• Vertices divided into two sets
• Edges only between the two sets
• Example: affiliation networks
  – vertices are individuals and organizations
  – edge if an individual belongs to an organization
• Men and women, servers and clients, jobs and workers
• Some problems easier to compute on bipartite graphs

We’ll make use of these graph types… but will generally be looking at classes of graphs generated according to a probability distribution, rather than obeying some fixed set of deterministic properties.
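
The bipartite definition above (two vertex sets, edges only between them) can be checked mechanically. A graph is bipartite exactly when its vertices can be 2-colored so that every edge joins different colors. The sketch below is illustrative only: the helper names and the small affiliation network are hypothetical, not taken from the course.

```python
from collections import deque


def build_undirected(edges):
    """Adjacency dictionary for an undirected graph given as (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def two_color(adj):
    """Return {vertex: 0 or 1} if the graph is bipartite, otherwise None.

    BFS assigns each newly reached vertex the opposite side of its parent;
    an edge whose endpoints land on the same side proves non-bipartiteness.
    """
    side = {}
    for start in adj:  # loop handles disconnected graphs
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in side:
                    side[v] = 1 - side[u]
                    queue.append(v)
                elif side[v] == side[u]:
                    return None
    return side


# Hypothetical affiliation network: individuals on one side,
# organizations on the other.
AFFILIATION = build_undirected([
    ("alice", "chess club"),
    ("alice", "choir"),
    ("bob", "choir"),
    ("carol", "chess club"),
])

# A triangle is the smallest graph that is not bipartite.
TRIANGLE = build_undirected([(1, 2), (2, 3), (1, 3)])

if __name__ == "__main__":
    print(two_color(AFFILIATION))
    print(two_color(TRIANGLE))
```

In the affiliation example the two sides recovered by the coloring are the individuals and the organizations, with no need to label them in advance.
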