Chapter 1 Introduction DISTRIBUTED SYSTEMS (dDist) Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Plan • Definition • Characteristics of distributed systems • Main types of distributed systems Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Definition of a Distributed System (1) • A distributed system is: – A collection of independent computers that appears to its users as a single coherent system. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 An Example • Electronic Healthcare Record – Medication, planning, integration – Critical system – A large, distributed system • System architecture – – – – – Clients Servers Legacy systems Peripherals … A distributed system is a collection of independent computers that appears to its users as a single coherent system. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Definition of a Distributed System (2) A distributed system is a collection of independent computers that appears to its users as a single coherent system. • Other examples of distributed systems? Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Definition of a Distributed System (3) Figure 1-1. A distributed system organized as middleware. The middleware layer extends over multiple machines, and offers each application the same interface. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Definition of a Distributed System (4) • Leslie Lamport – “A distributed system is a system where I can’t get my work done because a computer has failed that I’ve never even heard of.” • Why are such failures central and specific to distributed systems? • Can we not just build better systems? Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Motivation and Goals • Making resources available – Systems should allow for efficient and controlled access to and sharing of distributed/remote resources • Hiding distribution – Systems should look as if only a single computer system is running • Openness – Systems should offer services according to standard rules that describe the syntax and semantics of those services • Scalability – Systems should scale with respect to size, geography, and administrative boundaries Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Transparency in a Distributed System Figure 1-2. Different forms of transparency in a distributed system (ISO, 1995). Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Scalability? Figure 1-3. Examples of scalability limitations. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Scalability Problems • Using decentralized algorithms: – No machine has complete information about the system state. – Machines make decisions based only on local information. – Failure of one machine does not ruin the algorithm. – There is no implicit assumption that a global clock exists. • Why the last requirement? Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Scaling Techniques (1) • Synchronous communication – Ttotal > T1 + T2 • Asynchronous communication – Possible that Ttotal <= T1 + T2 Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Scaling Techniques (2) Figure 1-4. The difference between letting (a) a server or (b) a client check forms as they are being filled. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Scaling Techniques (3) Figure 1-5. An example of dividing the DNS name space into zones. • Can distribution and replication give problems? Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Pitfalls when Developing Distributed Systems • Transparency cannot be a reality • False assumptions made by first time developer: – The network is reliable. – The network is secure. – The network is homogeneous. – The topology does not change. – Latency is zero. – Bandwidth is infinite. – Transport cost is zero. – There is one administrator. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Main Distributed System Types • Distributed computing systems – Focus on computation • Distributed information systems – Focus on interoperability • Distributed pervasive systems – Focus on mobile, embedded, communicating systems Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Cluster Computing Systems Figure 1-6. An example of a cluster computing system. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 NFIT Spam Cluster • University of Aarhus cluster – 2 * 2 redundant front-ends (64-bit Opteron servers) at two sites – 2 redundant back-ends at each site (64-bit Opteron servers) – It is possible to add new back-ends • Use – Spam filtering for postAU, NF, SAMF, HUM, ADM, HIH – 25k-40k users – Now around 500k spams a day Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Grid Computing Systems • Figure 1-7. A layered architecture for grid computing systems. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 NorduGrid • Grid research and development collaboration – Denmark, Finland, Sweden, Norway – Supporting researchers – Originally in particular for Large Hadron Collider experiments • Advanced Resource Connector (ARC) middleware Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Transaction Processing Systems • Figure 1-8. Example primitives for transactions. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Example BEGIN_TRANSACTION salary1 = doctor1.getSalary() doctor1.setSalary(salary1 + bonus) salary2 = doctor2.getSalary() doctor2.setSalary(salary2 - bonus) END_TRANSACTION Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Characteristic properties of transactions • Atomic – To the outside world, the transaction happens indivisibly. • Consistent – The transaction does not violate system invariants. • Isolated – Concurrent transactions do not interfere with each other. • Durable – Once a transaction commits, the changes are permanent. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Isolation/Serializability • T1 and T2 executed concurrently/interleaved, effects will be as – T1 executed, then T2 executed, or – T2 executed, then T1 executed • Simple, but slow approach: – Run transactions serially, i.e., no interleaving... • The point here is that database server may optimize based on interleaved scheduling of transcation steps! Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Lost Update • T1 • T2 – salary1 = doctor1.getSalary() – salary1 = doctor1.getSalary() – doctor1.setSalary(salary1 + bonus) – doctor1.setSalary(salary1 + bonus) – salary2 = doctor2.getSalary() – doctor2.setSalary(salary2 - bonus) – salary3 = doctor3.getSalary() – doctor3.setSalary(salary3 - bonus) Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 A Serializable Transaction Interleaving • T1 – salary1 = doctor1.getSalary() – doctor1.setSalary(salary1 + bonus) – salary2 = doctor2.getSalary() – doctor2.setSalary(salary2 - bonus) • T2 – salary1 = doctor1.getSalary() – doctor1.setSalary(salary1 + bonus) – salary3 = doctor3.getSalary() – doctor3.setSalary(salary3 - bonus) Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Inconsistent Retrievals • T1 • T2 – ward1.discharge(patient) – sum += ward1.getNoPatients() – sum += ward2.getNoPatients() – ward2.admit(patient) Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Transaction Processing Systems (3) • Figure 1-9. A nested transaction. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Transaction Processing Systems (4) • Figure 1-10. The role of a TP monitor in distributed systems. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Enterprise Application Integration • Figure 1-11. Middleware as a communication facilitator in enterprise application integration. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Distributed Pervasive Systems ! ! ! Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Distributed Pervasive Systems • Requirements for pervasive systems – Embrace contextual changes. – Encourage ad hoc composition. – Recognize sharing as the default. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Electronic Health Care Systems • Figure 1-12. Monitoring a person in a pervasive electronic health care system, using – (a) a local hub or – (b) a continuous wireless connection. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Network Centric Warfare Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Sensor Networks (1) • Questions concerning sensor networks: – How do we (dynamically) set up an efficient tree in a sensor network? – How does aggregation of results take place? Can it be controlled? – What happens when network links fail? Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Sensor Networks (2) • Figure 1-13. Organizing a sensor network database, while storing and processing data (a) only at the operator’s site or … Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Sensor Networks (3) • Figure 1-13. Organizing a sensor network database, while storing and processing data … or (b) only at the sensors. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 Summary A distributed system is a collection of independent computers that appears to its users as a single coherent system. • Goals and problems include – – – – Access to resources Transparency Openness Scalability • Types of systems – Distributed Computing Systems – Distributed Information Systems – Distributed Pervasive Systems Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5