Sociology and CS
Philip Chan

How closely are people connected?
  Are people closely connected, not closely connected, isolated into groups, …

Degree of Separation
  The number of connections needed to reach another person

Milgram’s Experiment
  Stanley Milgram, psychologist
  Experiment in the late 1960s
  Chain letter to gather data
    Stockbroker in Boston
    160 people in Omaha, Nebraska
    Each was given a packet
    Add your name and forward it to another person who might be closer to the stockbroker
  Partial “social network”

Small World
  Six degrees of separation
    Everyone is connected to everyone else through a few people, about 6 on average
    Obama might be 6 connections away from you
  “Small world” phenomenon

Bacon Number
  Number of connections needed to reach actor Kevin Bacon
  http://oracleofbacon.org/
  Is a connection in this network different from one in Milgram’s experiment?

Problem Formulation
  Given (input)
    People
    Connections/links/friendships
  Find (output)
    The average number of connections between two people
  Simplification
    We don’t care how long, how strong, … the friendships are

Problem Formulation
  Formulate it as a graph problem (abstraction)
  Given (input)
    People -> vertices
    Connections -> edges
  Find (output)
    The average number of connections between two people -> average shortest path length

Algorithm
  Shortest path
    Dijkstra’s algorithm
    Limitations?
    This could be overkill. Why?
  Unweighted edges
    Each edge has the same weight of 1
  Simpler algorithm? Any ideas?
Breadth-First Search
  Start from the origin
  Explore people one link away (dist = 1)
  Explore people two links away (dist = 2)
  …
  Until the destination is reached

Algorithm
  Breadth-first search (BFS)
    Single origin, all destinations
  Repeat BFS for each origin
    shortestDistance(x, y) = shortestDistance(y, x)
    Each subsequent BFS can stop earlier

BFS Algorithm: A Closer Look
  Add origin to the open list
  While the open list is not empty
    1. Remove the first person from the open list
    2. Mark the person as visited
    3. Add non-visited neighbors of the person to the open list

BFS Algorithm: A Closer Look (with distances)
  Add (origin, 0) to the open list
  While the open list is not empty
    1. Remove the first item (person, dist) from the open list
    2. Mark person as visited
    3. For each non-visited neighbor of person
         Add (neighbor, dist + 1) to the open list

Implementation: Data Structures
  What do we need to keep track of?
    Whether we have visited a person before
    An ordered list of people to be visited

Implementation
  Boolean visited[person]
  Open list of people to be visited
    Queue: first in, first out (FIFO)

Queue Operations
  Enqueue
    Add an item at the end
  Dequeue
    Remove an item from the front

Queue Implementation 1
  Consider using an array
  How to implement the two operations?
    Enqueue
      Update tail, add item at tail
    Dequeue
      Remove the first item
      Shift the rest of the items up, update tail

Additional Operations
  What do we need to check before Enqueue?
    Is the queue full?
  What do we need to check before Dequeue?
    Is the queue empty?
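The BFS steps from the "closer look" slides can be sketched in Python as follows. This is a minimal illustration, not the course's reference implementation: the function names `bfs_distances` and `average_separation`, the dictionary-based graph, and the five-person network are all assumptions made for the example. It uses the standard library's `collections.deque` as the open list, and it marks a person as visited when they are added to the open list (rather than when they are removed) so that nobody is queued twice.

```python
from collections import deque


def bfs_distances(graph, origin):
    """Return {person: number of links from origin} for everyone reachable."""
    dist = {origin: 0}            # doubles as the "visited" record
    open_list = deque([origin])   # FIFO queue of people still to explore
    while open_list:
        person = open_list.popleft()           # dequeue from the front
        for neighbor in graph[person]:
            if neighbor not in dist:           # not visited yet
                dist[neighbor] = dist[person] + 1
                open_list.append(neighbor)     # enqueue at the end
    return dist


def average_separation(graph):
    """Average shortest distance over all connected pairs of distinct people."""
    total = 0
    pairs = 0
    for origin in graph:                       # repeat BFS for each origin
        for person, d in bfs_distances(graph, origin).items():
            if person != origin:
                total += d
                pairs += 1
    # Every pair is counted in both directions, which leaves the ratio unchanged.
    return total / pairs if pairs else 0.0


# Hypothetical friendship network (names invented for illustration).
friends = {
    "Ann": ["Bob", "Cat"],
    "Bob": ["Ann", "Dan"],
    "Cat": ["Ann", "Dan"],
    "Dan": ["Bob", "Cat", "Eve"],
    "Eve": ["Dan"],
}

print(bfs_distances(friends, "Ann"))
print(average_separation(friends))
```

For simplicity this sketch runs a full BFS from every origin; it does not exploit the symmetry shortestDistance(x, y) = shortestDistance(y, x) to stop later searches early.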
Queue Implementation 1 (shifting items)
  Is the queue empty?
    When tail is -1
  Is the queue full?
    When tail is length - 1

Number of Assignments (speed of algorithm)
  N = number of items in the queue
  Count the number of assignments involving items in the queue

Number of Assignments (Implementation 1)
  Enqueue
    1 (assign new item at tail)
  Dequeue
    1 (assign first item to somewhere else)
    N - 1 shifts (assignments)
    Total: N

Shifting Items in Dequeue
  Quite a bit of work if we have many items in the queue
  Can we avoid shifting? If so, how? Ideas?

Queue Implementation 2
  Consider using an array
  How to implement the two operations?
    Head and tail
    Enqueue
      Update tail, add item at tail
    Dequeue
      Get item from head
      Update head

Queue Implementation 2 (head and tail)
  Is the queue empty?
    When head is larger than tail
  Is the queue full?
    When tail is length - 1

Number of Assignments (Implementation 2)
  Enqueue
    1 (assign new item at tail)
  Dequeue
    1 (assign item at head to somewhere else)

What if tail reaches the end of the array
  But there is room at the front of the array?
  Any ideas to reuse the space at the front?

Queue Implementation 3
  Circular queue
    Wrap around
  If we use head and tail (Implementation 2)
    What needs to be changed?
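As a reference point for that question, Implementation 2 (head and tail, no shifting) might look like the sketch below. The class name `HeadTailQueue`, its method names, and the choice of exceptions are assumptions made for this example; the slides only specify the index rules (empty when head is larger than tail, full when tail is length - 1).

```python
class HeadTailQueue:
    """Fixed-length array queue with head and tail indices (no shifting)."""

    def __init__(self, length):
        self.items = [None] * length
        self.head = 0     # index of the next item to dequeue
        self.tail = -1    # index of the most recently enqueued item

    def is_empty(self):
        return self.head > self.tail

    def is_full(self):
        return self.tail == len(self.items) - 1

    def enqueue(self, item):
        if self.is_full():
            raise OverflowError("queue is full")
        self.tail += 1                  # update tail
        self.items[self.tail] = item    # one assignment

    def dequeue(self):
        if self.is_empty():
            raise IndexError("queue is empty")
        item = self.items[self.head]    # one assignment, no shifting
        self.head += 1                  # update head
        return item


q = HeadTailQueue(3)
for name in ["Ann", "Bob", "Cat"]:
    q.enqueue(name)
first = q.dequeue()   # slot 0 is now free ...
print(first, q.is_full())   # ... but the queue still reports full
```

The last line illustrates the limitation raised on the slide: after a dequeue there is room at the front of the array, yet tail is still at length - 1, so no further item can be added.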
Circular Queue Operations
  Length = length of array
  Enqueue
    If not full
      If tail < length - 1
        Increment tail
      Else
        tail = 0
      Add item
  Dequeue
    Similarly for head

Queue Implementation 3 (circular queue)
  Is the queue empty?
    When head is one position past tail (with wrap-around adjustment)
  Is the queue full?
    When head is one position past tail (with wrap-around adjustment)
  Same!! How can we distinguish them?
    Keep track of size instead

Keeping Track of Queue Size
  Enqueue: increment size
  Dequeue: decrement size

Summary
  Understand how people are connected
    Degrees of separation: shortest distance between two people
    Average degree of separation (community)
      Sum of shortest distances of all pairs / number of pairs
  Algorithm for shortest distance (distance 1 on all edges)
    Breadth-first search (BFS)
      Different origins
  BFS
    Open list: people to be explored
      Queue
        Circular queue (Implementation 3)
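The circular queue (Implementation 3) with an explicit size counter can be sketched as below. The class name `CircularQueue`, the method names, and the exceptions raised are assumptions for illustration only; the wrap-around rule and the size bookkeeping follow the slides.

```python
class CircularQueue:
    """Fixed-length circular queue that tracks its size to tell full from empty."""

    def __init__(self, length):
        self.items = [None] * length
        self.head = 0     # index of the next item to dequeue
        self.tail = -1    # index of the most recently enqueued item
        self.size = 0     # number of items currently stored

    def is_empty(self):
        return self.size == 0

    def is_full(self):
        return self.size == len(self.items)

    def enqueue(self, item):
        if self.is_full():
            raise OverflowError("queue is full")
        if self.tail < len(self.items) - 1:
            self.tail += 1
        else:
            self.tail = 0               # wrap around to the front
        self.items[self.tail] = item
        self.size += 1

    def dequeue(self):
        if self.is_empty():
            raise IndexError("queue is empty")
        item = self.items[self.head]
        if self.head < len(self.items) - 1:
            self.head += 1
        else:
            self.head = 0               # wrap around to the front
        self.size -= 1
        return item


q = CircularQueue(3)
for name in ["A", "B", "C"]:
    q.enqueue(name)
first = q.dequeue()     # frees slot 0
q.enqueue("D")          # tail wraps around and reuses slot 0
rest = [q.dequeue() for _ in range(3)]
print(first, rest)
```

Because `size` is tracked separately, the empty and full cases no longer need to be told apart from the head and tail positions alone.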