
CIS 6930.5: Federated Distributed Systems

Adriana Iamnitchi (Anda) anda@cse.usf.edu

Contact Info

Email: anda@cse.usf.edu

Office: ENB 334

Office hours: Wednesdays, 10:45 – 1:00 and by appointment

Course page: http://www.csee.usf.edu/~anda/CIS6930.5

CIS6930.5: Federated Distributed Systems (Fall 2005)


Examples of Distributed Systems

– ATT web

– Gnutella network

– A sensor network

– The Internet

Definition (a version)

A distributed system is a collection of autonomous, programmable, failure-prone entities that are able to communicate through a communication medium that is unreliable.

– Entity=a process on a device (PC, PDA, mote)

– Communication Medium=Wired or wireless network

“Federated” – spanning multiple institutional or network (DNS) domains


Outline

Case study: SETI@home, Napster, Gnutella

Administrivia


SETI@home Operations

[Figure: SETI@home operations diagram. Components shown: data recorder, DLT tapes, tape archive, delete, tape backup, splitters, WU storage, data server, screensavers, result queue, acct. queue, garbage collector, redundancy checking, RFI elimination, repeat detection, science DB, user DB, master DB, CGI program, web page generator, web site.]


Master-worker architecture

How does it work?

SETI@home

Fixed-rate data processing task

Low bandwidth/computation ratio

Independent parallelism

Error tolerance
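A minimal sketch of the master-worker idea, assuming that error tolerance is obtained by sending each independent work unit to several workers and keeping the majority answer. The names `analyze`, `faulty_analyze`, and `run_master`, and the choice of 3 replicas, are hypothetical and for illustration only; they are not taken from the SETI@home code.

```python
import random
from collections import Counter


def analyze(work_unit: int) -> int:
    """Hypothetical stand-in for the analysis a worker runs on one work unit."""
    return work_unit * work_unit


def faulty_analyze(work_unit: int) -> int:
    """A malfunctioning (or cheating) worker that always returns a wrong result."""
    return -1


def run_master(work_units, workers, replicas=3):
    """Send each work unit to `replicas` distinct workers; keep the majority result."""
    accepted = {}
    for wu in work_units:
        chosen = random.sample(workers, replicas)   # work units are independent
        votes = Counter(worker(wu) for worker in chosen)
        result, count = votes.most_common(1)[0]
        if count > replicas // 2:                   # strict majority agrees
            accepted[wu] = result
    return accepted


# Four correct workers and one faulty one: any 3 chosen contain at least 2 correct.
workers = [analyze] * 4 + [faulty_analyze]
results = run_master(range(5), workers)
```

Because the work units do not depend on each other, the master needs no coordination between workers; it only compares the returned results.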


History and Statistics

Conceived 1995, launched April 1999

“SETI@home is a scientific experiment that uses Internet-connected computers in the Search for Extraterrestrial Intelligence (SETI). You can participate by running a free program that downloads and analyzes radio telescope data.”

No ET signals yet, but other results

Statistics (as of Wed Feb 23 07:04:51):

Statistic | Total | Last 24 Hours

Users | 5,361,313 | 4,391

Results received | 1,779 million | 5 million

Total CPU time | 2.2 million years | 3610.717 years

Average CPU time/work unit | 10 hr 58 min 14.0 sec | 6 hr 19 min 30.1 sec
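As a rough consistency check on the statistics above, the average CPU time per work unit should be close to total CPU time divided by results received. The sketch below assumes a year of 365.25 days and uses the rounded figures from the slide, so it only reproduces the reported averages approximately (about 10.8 hours overall and about 6.3 hours for the last 24 hours). The helper name `avg_hours_per_result` is made up for this example.

```python
HOURS_PER_YEAR = 365.25 * 24  # assumption: average year length


def avg_hours_per_result(cpu_years: float, results: float) -> float:
    """Average CPU hours spent per returned result."""
    return cpu_years * HOURS_PER_YEAR / results


# Rounded inputs copied from the statistics on the slide.
total_avg = avg_hours_per_result(2.2e6, 1_779e6)
daily_avg = avg_hours_per_result(3610.717, 5e6)

print(f"overall: {total_avg:.1f} h, last 24 hours: {daily_avg:.1f} h")
```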


Public-resource computing

Utilizes idle computing cycles over Internet

Other systems:

– Original: GIMPS, distributed.net

– Commercial: United Devices, Entropia, Porivo, Popular Power

– Academic, open-source

> Cosm, folding@home


None of these has the popularity of SETI@home!

ET

How to get and retain users (from David Anderson, the leader of the SETI@home project)

– Graphics are important (but monitors do burn in)

– Teams: users recruit other users

– Keep users informed

Science news

System management news

Periodic project emails

Reward users:

– PDF certificates

– Milestone pages and emails

– Leader boards (overall, country, …)


Millions and millions of computers!

(Problems)

Server scalability

Dealing with excess CPU time

Cheating

Bad behavior:

– Team recruitment by spam

– Sale of accounts on eBay

Malfunctions

Network bandwidth costs money


SETI@home: Summary

Master-worker design

– Centralized solution

> Master=central point of control

> Single point of failure

> Performance bottleneck

Incentives for participation

– Sometimes mean incentives for cheating

Massive (“embarrassing”) parallelism

Low bandwidth/computation ratio

Users do donate real resources: $1.5M / year consumed power

More information: http://setiathome.ssl.berkeley.edu


Outline

Case study: SETI@home, Napster, Gnutella

Administrivia


The File Location Problem

(Napster and Gnutella)

Where is file A?


Napster: How It Works

• Client-server: Use central server to locate files

• Download files directly from peers


Napster: Steps

1. File list is uploaded: users send their file lists to napster.com.

2. User requests search at server: the request goes to napster.com and the results come back to the user.

3. User pings hosts that apparently have data. Looks for best transfer rate.

4. User retrieves file directly from the chosen host.
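The four Napster steps can be sketched as follows. This is an illustration under stated assumptions, not the Napster protocol: `CentralIndex`, `pick_best_peer`, the peer names, the transfer rates, and the in-memory `peer_files` store are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CentralIndex:
    """Hypothetical stand-in for the central server: file name -> peers holding it."""
    files: dict = field(default_factory=dict)

    def upload_file_list(self, peer: str, names: list) -> None:
        """Step 1: a peer uploads the list of files it shares."""
        for name in names:
            self.files.setdefault(name, set()).add(peer)

    def search(self, name: str) -> set:
        """Step 2: the server answers a search with the peers that list the file."""
        return set(self.files.get(name, set()))


def pick_best_peer(candidates, measured_rate):
    """Step 3: after pinging the candidates, keep the one with the best rate."""
    return max(candidates, key=lambda peer: measured_rate[peer])


# Made-up peers, files, and rates (kbps) for the example.
peer_files = {
    "alice": {"song.mp3": b"alice-copy", "talk.mp3": b"talk"},
    "bob": {"song.mp3": b"bob-copy"},
}
index = CentralIndex()
for peer, shared in peer_files.items():
    index.upload_file_list(peer, list(shared))

hits = index.search("song.mp3")
best = pick_best_peer(hits, {"alice": 56, "bob": 1500})
data = peer_files[best]["song.mp3"]   # Step 4: download directly from the peer
```

Only the lookup goes through the central index; the file transfer itself happens between peers.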


Napster: History

Program for sharing files over the Internet

History:

– 5/99: Shawn Fanning (freshman, Northeastern U.) founds Napster online music service

– 12/99: first lawsuit

– 3/00: 25% of UWisc traffic is Napster

– 2000: est. 60M users

– 2/01: US Circuit Court of Appeals: Napster knew users were violating copyright laws

– 7/01: # simultaneous online users: Napster 160K, Gnutella 40K, Morpheus 300K


Napster: Summary

Centralized server:

– Client-server architecture

– Single logical point of failure

– Potential for congestion (bottleneck)

– Napster “in control” (freedom is an illusion)

No security:

– Passwords in plain text

– No authentication

– No anonymity


Outline

Public-resource computing

– Case study: SETI@home

Peer-to-peer systems

– Case study 1: Napster

– Case study 2: Gnutella

Discuss:

– Characteristics

– Impact

– Architecture

– Killer application


Gnutella: Search for Files with No Central Server

Ideas?

Where is file A?


Gnutella: Search

Flooding:

– Query: “Where is file A?” is forwarded from peer to peer

– Reply: “I have file A.” comes back from each peer that holds the file
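A minimal simulation of search by flooding, assuming a hop limit (TTL) on each query and replies that travel back along the reverse of the query path. The function `flood_query`, the five-peer overlay, and the TTL value are made up for illustration and do not reproduce the Gnutella wire protocol.

```python
from collections import deque


def flood_query(neighbors, shared, origin, wanted, ttl=4):
    """Flood a query from `origin`; return {holder: reply path back to origin}."""
    came_from = {origin: None}          # who first forwarded the query to each peer
    frontier = deque([(origin, ttl)])
    replies = {}
    while frontier:
        node, hops_left = frontier.popleft()
        if node != origin and wanted in shared.get(node, ()):
            # "I have file A." The reply retraces the query path in reverse.
            path, cur = [], node
            while cur is not None:
                path.append(cur)
                cur = came_from[cur]
            replies[node] = path
        if hops_left == 0:
            continue                    # TTL exhausted: do not forward further
        for nxt in neighbors[node]:
            if nxt not in came_from:    # peers ignore a query they already saw
                came_from[nxt] = node
                frontier.append((nxt, hops_left - 1))
    return replies


# Hypothetical overlay: A asks for fileA; D and E hold it.
neighbors = {
    "A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
    "D": ["B", "C", "E"], "E": ["D"],
}
shared = {"D": {"fileA"}, "E": {"fileA"}}
replies = flood_query(neighbors, shared, "A", "fileA", ttl=2)
```

With `ttl=2` only D is within reach of A; raising the TTL to 3 also reaches E, at the cost of more messages.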


Gnutella: History and Statistics

Gnutella history:

– 3/14/00: release by AOL, almost immediately withdrawn

– too late: 1,859,340 users on Gnutella on August 25, 2am

– many iterations to fix poor initial design

High impact:

– Versions implemented

– Different designs

– Lots of research papers/ideas

Network | Users

eDonkey2K | 4,123,688

FastTrack | 2,521,887

Gnutella | 1,516,762

Overnet | 1,146,880

DirectConnect | 294,255

MP2P | 251,137

(www.slyck.com, 06/24/’05)

What would you ask about Gnutella?


Gnutella: Heterogeneity

All Peers Equal? (1)

1.5Mbps DSL

1.5Mbps DSL

1.5Mbps DSL

56kbps Modem

1.5Mbps DSL

56kbps Modem

10Mbps LAN

56kbps Modem


Gnutella: Free Riding

All Peers Equal? (2)

More than 25% of Gnutella clients share no files; 75% share 100 files or fewer

Conclusion: Gnutella has a high percentage of free riders

If only a few individuals contribute to the public good, these few peers effectively act as centralized servers.


Adar and Huberman (Aug ’00)


Flooding in Gnutella: Loop Prevention

Seen request already
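One way to picture loop prevention is that each peer remembers the identifiers of the requests it has already handled and silently drops repeats. The sketch below assumes a unique ID per request (here a random `uuid` value) and a TTL; the `Peer` class and its fields are hypothetical names for this example.

```python
import uuid


class Peer:
    """Hypothetical peer that forwards a request at most once."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []
        self.seen = set()     # IDs of requests already handled
        self.dropped = 0      # duplicate copies discarded

    def receive(self, msg_id, ttl, sender=None):
        if msg_id in self.seen:
            self.dropped += 1             # "Seen request already": stop here
            return
        self.seen.add(msg_id)
        if ttl == 0:
            return
        for neighbor in self.neighbors:
            if neighbor is not sender:    # never echo back to the sender
                neighbor.receive(msg_id, ttl - 1, sender=self)


# A triangle of peers contains a cycle, so copies of the request meet again.
a, b, c = Peer("a"), Peer("b"), Peer("c")
a.neighbors = [b, c]
b.neighbors = [a, c]
c.neighbors = [a, b]
a.receive(uuid.uuid4().hex, ttl=5)
```

Without the `seen` check, the request would keep circulating around the triangle until its TTL ran out; with it, each peer processes the request once and the extra copies are counted in `dropped`.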


Gnutella Topology Mismatch


Gnutella Summary

Search by flooding

Self-configuring

Phenomena:

– Not all peers equal

– Free riding

Problems:

– Topology mismatch

– Duplicates due to flooding

Good source for technical info/open questions:

– http://www.limewire.com/index.jsp/tech_papers


Problems in Distributed Systems

Communication

– Routing [IP,BGP]

– Multicast [IP multicast, SRM, RMTP]

Post and retrieve [Usenet]

Search [Gnutella, Kazaa, etc., Google]

Storage [Databases]

Coordination [SETI@Home]


Challenges

Failures

Scale

Asynchrony

Security

Deployment

Adoption

Challenges (2)

Learn from usage

– Example 1: The Internet

– Example 2: Napster

Conflicting requirements:

– Light but adaptable?

– Light but data-consistent? (think transactions)

– … (other examples?)

… (other examples?)


Course Organization/Syllabus/etc.


Administrivia: Grading

Reviewing: 30%

Discussion leading: 15%

Project: 55%

– Aim high!

– Have fun!


Administrivia:

Paper Reviewing (1)

Goals:

– Think of what you read

– Get used to writing paper reviews

Reviews due by midnight before class

Follow the form when relevant.

State the main contribution of the paper

Critique the main contribution.

– Rate the significance of the paper on a scale of 5 (breakthrough), 4 (significant contribution), 3 (modest contribution), 2 (incremental contribution), 1 (no contribution or negative contribution). Explain your rating in a sentence or two.


Administrivia:

Paper Reviewing (2)

Rate how convincing the methodology is.

Do the claims and conclusions follow from the experiments?

Are the assumptions realistic?

Are the experiments well designed?

Are there different experiments that would be more convincing?

Are there other alternatives the authors should have considered?

(And, of course, is the paper free of methodological errors?)


Administrivia:

Paper Reviewing (3)

What is the most important limitation of the approach?

What are the three strongest and/or most interesting ideas in the paper?

What are the three most striking weaknesses in the paper?

Name three questions that you would like to ask the authors.

Detail an interesting extension to the work not mentioned in the future work section.

Optional comments on the paper that you’d like to see discussed in class.


Paper Reviewing (final)

Be professional in your writing

Have an eye on the writing style:

– Clarity

– Beware of traps: learn to use them in writing and detect them in reading

– Detect (and stay away from) trivial claims.

E.g., 1st sentence in the Introduction:

“The tremendous/unprecedented/phenomenal growth/scale/ubiquity of the Internet…”


Administrivia:

Discussion leading

Come prepared!

– Prepare discussion outline

– Prepare questions:

> “What if”s

> Unclear things

> …

– Similar ideas in different contexts

– Initiate short brainstorming sessions

Leaders do NOT need to submit paper reviews

Main goals:

– Keep discussion flowing

– Keep discussion relevant

– Engage everybody (I’ll have an eye on this, too)


Administrivia:

Projects

Combine with your research if relevant to the class

Get approval from all instructors if you overlap final projects:

– Don’t sell the same piece of work twice

– You can get more than twice as many results with less than twice as much work

Aim high!

– Put in one extra month and get a publication out of it

– It is doable

Try ideas that you postponed out of fear: it’s just a class, not your PhD.


Administrivia:

Project deadlines (tentative)

Sept. 15: 1-page project proposal

Oct. 11: 3-page literature survey

– Know relevant work in your problem area

– If implementation project, list tools, similar projects

Nov. 11: 5-page Midterm project due

– Have a clear image of what’s possible/doable

– Report preliminary results

Last class(es): In-class project presentation

– Demo, if appropriate

Dec. 16:

– 10-page write-up


Next Class (Wed, August 31)

Read the 4 chapters from the Grid book

Send brief summaries (lists of ideas/problems discussed, etc)

– Do not follow the reviewing form

– Be brief and efficient!

– Be BRIEF and EFFICIENT!

In-class discussion + some project ideas

Need discussion leader to team up with me for the class next week:

– The structure of networks (pick 2):

1. Small-world file sharing communities, Iamnitchi, Ripeanu, Foster. Infocom 2004.

2. On Power-Law Relationships of the Internet Topology, Faloutsos, Faloutsos, and Faloutsos. SIGCOMM 1999.

3. Mapping the Gnutella network, M. Ripeanu et al. IEEE Internet Computing Journal 2002.


Questions?
