lect07.ppt

advertisement
Today’s topics
Networks
 Definitions
 Modeling
 Analysis
 Slides from Michael Kearns - Univ. of Pennsylvania
 Slides from Patrick Reynolds – Duke CS 2007
Reading
Kearns, Michael. "Economics, Computer Science, and Policy."
Issues in Science and Technology, Winter 2005.
CompSci 001
11.1
Emerging science of networks




Examining apparent similarities between many human and
technological systems & organizations
 Importance of network effects in such systems
How things are connected matters greatly
 Structure, asymmetry and heterogeneity
Details of interaction matter greatly
 The metaphor of viral spread
 Dynamics of economic and strategic interaction
 Qualitative and quantitative; can be very subtle
A revolution of
 measurement
 theory
 breadth of vision
CompSci 001
(M. Kearns)
11.2
Graphs: Structures and Algorithms

How do packets of bits/information get routed on the internet
 Message divided into packets on client (your) machine
 Packets sent out using routing tables toward destination
• Packets may take different routes to destination
• What happens if packets lost or arrive out-of-order?

Routing tables store local information, not global (why?)

What about The Oracle of Bacon, Erdos Numbers, and Word
Ladders?
 All can be modeled using graphs
 What kind of connectivity does each concept model?

Graphs are everywhere in the world of algorithms (world?)
CompSci 001
11.3
Vocabulary

Graphs are collections of vertices
and edges (vertex also called
node)
 Edge connects two vertices
• Direction can be important,
directed edge, directed graph
• Edge may have associated
weight/cost

A vertex sequence v0, v1, …, vn-1 is
a path where vk and vk+1 are
connected by an edge.
 If some vertex is repeated, the
path is a cycle
 A graph is connected if there is
a path between any pair of
vertices
CompSci 001
78
NYC
Phil
268
204
190
Wash DC
LGA
$412
Boston
394
$441
$186
LAX
$1701
DCA
$186
ORD
11.4
Network/Graph questions/algorithms

What vertices are reachable from a given vertex?
 Two standard traversals: depth-first, breadth-first
 Find connected components, groups of connected vertices

Shortest path between any two vertices (weighted graphs?)!

Longest path in a graph
 No known efficient algorithm
 Longest shortest path: Diameter of graph
Visit all vertices without repeating? Visit all edges?
 With minimal cost? Hard!
What are the properties of the network?
 Structural: Is it connected?
 Statistical: What is the average number of neighbors?


CompSci 001
11.5
Six Degrees of Bacon

Background
 Stanley Milgram’s Six Degrees of Separation?
 Craig Fass, Mike Ginelli, and Brian Turtle invented it
as a drinking game at Albright College
 Brett Tjaden, Glenn Wasson, Patrick Reynolds have run t
online website from UVa and beyond
 Instance of Small-World phenomenon

http://oracleofbacon.org handles 2 kinds of requests
1. Find the links from Actor A to Actor B.
2. How good a center is a given actor?
 How does it answer these requests?
CompSci 001
11.6
How does the Oracle work?


Not using Oracle™
Queries require traversal of the graph
BN = 1
Sean Penn
BN = 0
Kevin Bacon
Mystic River
Tom Hanks
Apollo 13
Footloose
CompSci 001
Tim Robbins
Bill Paxton
Sarah Jessica Parker
John Lithgow
11.7
How does the Oracle Work?


BN = Bacon Number
Queries require traversal of the graph
BN = 2
Woody Allen
BN = 1
Sean Penn
Sweet and Lowdown
Judge Reinhold
Fast Times at Ridgemont High
Miranda Otto
War of the Worlds
Mystic River
BN = 0
Tim Robbins
The Shawshank Redemption
Morgan Freeman
Cast Away
Helen Hunt
Tom Hanks
Kevin Bacon
Apollo 13
Bill Paxton
Footloose
Forrest Gump
Sarah Jessica Parker
Sally Field
Tombstone
John Lithgow
A Simple Plan
CompSci 001
Val Kilmer
Billy Bob Thornton
11.8
How does the Oracle work?


How do we choose which movie or actor to explore next?
Queries require traversal of the graph
BN = 2
Woody Allen
BN = 1
Sean Penn
Sweet and Lowdown
Judge Reinhold
Fast Times at Ridgemont High
Miranda Otto
War of the Worlds
Mystic River
BN = 0
Tim Robbins
The Shawshank Redemption
Morgan Freeman
Cast Away
Helen Hunt
Tom Hanks
Kevin Bacon
Apollo 13
Bill Paxton
Footloose
Forrest Gump
Sarah Jessica Parker
Sally Field
Tombstone
John Lithgow
A Simple Plan
CompSci 001
Val Kilmer
Billy Bob Thornton
11.9
Center of the Hollywood Universe?



1,018,678 people can be connected to Bacon
Is he the center of the Hollywood Universe?
 Who is?
 Who are other good centers?
 What makes them good centers?
Centrality
 Closeness: the inverse average distance of a node to all
other nodes
• Geodesic: shortest path between two vertices
• Closeness centrality: number of other vertices divided by the
sum of all distances between the vertex and all others.


Degree: the degree of a node
Betweenness: a measure of how much a vertex is between
other nodes
CompSci 001
11.10
Oracle of Bacon



Name someone who is 4 degrees or more away from Kevin
Bacon
1
4
2
5
3
6
What characteristics makes someone farther away?
What makes someone a good center? Is Kevin Bacon a good
center?
CompSci 001
11.11
Business & Economic Networks




Example: eBay bidding
 vertices: eBay users
 links: represent bidder-seller or buyer-seller
 fraud detection: bidding rings
Example: corporate boards
 vertices: corporations
 links: between companies that share a board member
Example: corporate partnerships
 vertices: corporations
 links: represent formal joint ventures
Example: goods exchange networks
 vertices: buyers and sellers of commodities
 links: represent “permissible” transactions
CompSci 001
(M. Kearns)
11.12
Enron
CompSci 001
11.13
Physical Networks



Example: the Internet
 vertices: Internet routers
 links: physical connections
 vertices: Autonomous Systems (e.g. ISPs)
 links: represent peering agreements
 latter example is both physical and business network
Compare to more traditional data networks
Example: the U.S. power grid
 vertices: control stations on the power grid
 links: high-voltage transmission lines
 August 2003 blackout: classic example of interdependence
CompSci 001
(M. Kearns)
11.14
US Power Grid
CompSci 001
11.15
Content Networks


Example: Document similarity
 Vertices: documents on web
 Edges: Weights defined by similarity
 See TouchGraph GoogleBrowser
Conceptual network: thesaurus
 Vertices: words
 Edges: synonym relationships
CompSci 001
11.16
Social networks

Example: Acquaintanceship networks
 vertices: people in the world
 links: have met in person and know last names
 hard to measure

Example: scientific collaboration
 vertices: math and computer science researchers
 links: between coauthors on a published paper
 Erdos numbers : distance to Paul Erdos
 Erdos was definitely a hub or connector; had 507 coauthors
How do we navigate in such networks?

CompSci 001
11.17
Acquaintanceship & more
CompSci 001
11.18
Network Models (Barabasi)

Differences between Internet, Kazaa, Chord
 Building, modeling, predicting

Static networks, Dynamic networks
 Modeling and simulation

Random and Scale-free
 Implications?

Structure and Evolution
 Modeling via Touchgraph
CompSci 001
11.19
What’s a web-based social network?

Accessible over the web via a browser

Users explicitly state relationships
 Not mined or inferred

Relationships visible and browsable by others
 Reasons?

Support for users to make connections
 Simple HTML pages don’t suffice

Why are they so darn popular? What’s Web 2.0?
CompSci 001
11.20
Types of networks


Pick a class of network:
Give a real-world example of such a network:
 What are the vertices (nodes)?

What are the edges (links)?

How is the network formed? Is it decentralized or
centralized? Is the communication or interaction local or
global?

What is the network's topology? For example, is it
connected? What is its size? What is the degree
distribution?
CompSci 001
11.21
Graph properties

Max Degree?

Center?
CompSci 001
11.22
Download