Distributed Systems Principles and Paradigms Reza Rafeh Arak University 1 INTRODUCTION 1.1 DEFINITION AND CHARACTERISTICS OF A DISTRIBUTED SYSTEM 1.2 TYPES OF DISTRIBUTED SYSTEMS 2 ARCHITECTURES 2.1 ARCHITECTURAL STYLES 2.2 SYSTEM ARCHITECTURES 3 PROCESSES , THREADS, MIGRATION 4 COMMUNICATION 4.1 FUNDAMENTALS 4.2 Types of Communication 4.2.1 REMOTE PROCEDURE CALL 4.2.2 MESSAGE-ORIENTED COMMUNICATION 4.2.3 STREAM-ORIENTED COMMUNICATION 4.2.4 MULTICAST COMMUNICATION 5 NAMING 5.1 NAMES, IDENTIFIERS, AND ADDRESSES 5.2 Structured Naming 5.3 Attribute-Based Naming 6 Synchronization 6.1 Clock Synchronization 6.2 Logical Clocks 6.3 Mutual Exclusion 6.4 Global Positioning of Nodes 6.5 Election Algorithms 7 Consistency and Replication 7.1 INTRODUCTION 7.2 DATA-CENTRIC CONSISTENCY MODELS 7.3 CLIENT-CENTRIC CONSISTENCY MODELS 7.4 REPLICA MANAGEMENT 7.5 CONSISTENCY PROTOCOLS 8 Fault Tolerance 8.1 Introduction 8.2 Process Resilience 8.3 Reliable Client-Server Communication 8.4 Reliable Group Communication 8.5 Disturbed Commit 8.6 Recovery 9 Security 9.1 Introduction 9.2 Secure Channels 9.3 Access Control 9.4 Security Management 10 Disturbed Object-Based Systems 10.1 Architecture 10.2 Processes 10.3 Communication 10.4 Naming 10.5 Synchronization 10.6 Consistency and Replication 10.7 Fault Tolerance 10.8 Security 11 Disturbed File Systems 11.1 Architecture 11.2 Processes 11.3 Communication 11.4 Naming 11.5 Synchronization 11.6 Consistency and Replication 11.7 Fault Tolerance 11.8 Security 12 Disturbed Web-Based Systems 12.1 Architecture 12.2 Processes 12.3 Communication 12.4 Naming 12.5 Synchronization 12.6 Consistency and Replication 12.7 Fault Tolerance 12.8 Security 13 Disturbed Coordination-Based Systems 13.1 Architecture 13.2 Processes 13.3 Communication 13.4 Naming 13.5 Synchronization 13.6 Consistency and Replication 13.7 Fault Tolerance 13.8 Security Definition of a Distributed System (1) A distributed system is: A collection of independent computers that appears to its users as a single coherent system. Definition of a Distributed System (2) A distributed system organized as middleware. The middleware layer extends over multiple machines, and offers each application the same interface. Motivation and Goals • Making resources available – Systems should allow for efficient and controlled access to and sharing of distributed resources • Hiding distribution – Systems should look as if only a single computer system is running • Openness – Systems should offer services according to standard rules that describe the syntax and semantics of those Services • Scalability – Systems should scale with respect to size, geography, and administrative boundaries Features Transparency Degree of Transparency • • A trade-off between a high degree of transparency and the performance of a system Full distribution transparency is simply impossible Openness • A system • Offer services according to standard rules • Such rules are formalized in protocols • Interoperability • Different manufacturers can co-exist and work together • Portability • System A can be executed, without modification, on a different distributed system B Scalability: Size scalable: easily add more users & recourses to the system Geographically scalable: users & resources may lie far apart Administratively scalable: it can still be easy to manage even if it spans independent administrative organizations Scalability Scalability Characteristics of decentralized algorithms: • No machine has complete information about the system state. • Machines make decisions based only on local information. • Failure of one machine does not ruin the algorithm. • There is no implicit assumption that a global clock exists. Scaling Techniques • Three techniques for scaling • Hiding communication latencies • Distribution • Replication Scaling Techniques (1) 1.4 The difference between letting: a) a server or b) a client check forms as they are being filled Scaling Techniques (2) 1.5 An example of dividing the DNS name space into zones. Pitfalls when Developing Distributed Systems • • • • • • • • The network is reliable. The network is secure. The network is homogeneous. The topology does not change. Latency is zero. Bandwidth is infinite. Transport cost is zero. There is one administrator. Types of Distributed Systems: 1- Distributed Computing Systems : Focus on computation Cluster Computing Systems Grid Computing Systems 2- Distributed Information Systems:Focus on interoperability Transaction Processing Systems Enterprise Application Integration (Exchange info via RPC or RMI) 3- Distributed Pervasive Systems (usually small, batterypowered systems, Mobile & wireless): Focus on mobile, embedded, communicating Home Systems (e.g. Smart phones, PDAs) Electronic Health care systems (Heart monitors, BAN: Body Area Networks) Sensor Networks (distributed Databases connected wirelessly) Cluster Computing Systems - Hooking up a collection of simple computers via high-speed networks to build a supercomputing environment - Mostly homogenous - Example: server clusters at Banks Grid Computing Systems - have a high degree of heterogeneity -Users and recourses from different organizations are brought together to allow collaboration (i.e. a V.O. = Virtual Organization) - Members belonging to the same V.O. have access rights to a common set of recourses (e.g. Police, FBI, and some local agencies may form a computing grid) Transaction Processing Systems (1) BEGIN_TRANSACTION salary1 = doctor1.getSalary() doctor1.setSalary(salary1 + bonus) salary2 = doctor2.getSalary() doctor2.setSalary(salary2 - bonus) END_TRANSACTION Transaction Processing Systems (2) Characteristic: • Atomic: The transaction happens indivisibly. • Consistent: The transaction does not violate system invariants. • Isolated: Concurrent transactions do not interfere with each other. • Durable: Once a transaction commits, the changes are permanent. Transaction Processing Systems (3) Transaction Processing Systems (4) Enterprise Application Integration -RPC (Remote Procedure Call) -RMI (Remote Method Invocation) Distributed Pervasive Systems • • Embrace contextual changes (a phone now is a web access device. A device must continuously be aware of the fact that its environment may change) • Encourage ad hoc composition • (used differently by different users, e.g. PDA) • Recognize sharing as the default • (easily read, store, manage, and share info) Electronic Health Care Systems (1) • • Where and how should monitored data be stored? How can we prevent loss of crucial data? • What infrastructure is needed to generate and propagate alerts? • How can physicians provide online feedback? • How can extreme robustness of the monitoring system be realized? • What are the security issues and how can the proper policies be enforced? Electronic Health Care Systems (2) Monitoring a person in a pervasive electronic health care system, using (a) a local hub or - (b) a continuous wireless connection. Sensor Networks (1) • How do we (dynamically) set up an efficient tree in a sensor network? • How does aggregation of results take place? Can it be controlled? • What happens when network links fail? Sensor Networks (2) Organizing a sensor network database, while storing and processing data (a) only at the operator’s site or … Sensor Networks (3) Organizing a sensor network database, while storing and processing data … or (b) only at the sensors.