Massively Distributed Database Systems
Spring 2015
Ki-Joune Li
http://isel.cs.pusan.ac.kr/~lik
Pusan National University

An assignment
- Choose an electronic (or electric) device
- Define additional functions with
  . data storage
  . communication (infrastructure or ad-hoc)
  . some computation
  . and others
  Do not worry about how difficult it would be to implement.
- Example

Massively Distributed Databases vs. Distributed Databases

This lecture includes
• Distributed Systems and Database Systems
• Overview
• Synchronization
• Consistency and Replication
• P2P
• Ad-Hoc Network and MANET
• Data on Air
• Cloud
• RESTful
• Ubiquitous Computing and IoT

Definition of a Distributed System
• Distributed system:
  1) A collection of (scalability)
  2) independent computers that (heterogeneity)
  3) appears to its users as a single coherent system (transparency)
• Distributed System versus Parallel System
  • Separate Operating Systems vs. Single Operating System
  • Message Passing vs. Shared Memory

Distributed Systems and Distributed Database Systems - Overview

Why Distributed System?
• Performance
• Incremental Growth (Scalability)
  • 1 single mainframe of price W
  • N small machines of price W/N
• Fault Tolerance
  • 1 single mainframe: critical weak point
  • Failure of a machine: replacement by other machines
• Geographical Distribution and Availability
  • Flexible configuration
    • e.g. 1 disk server, 3 computing servers, 1 graphic server, etc.
  • Geographical availability

Distributed System - Scalability and Heterogeneity
1.1 A distributed system organized as middleware.
Heterogeneity and Scalability

Distributed System - Transparency
Different forms of transparency in a distributed system.
Transparency | Description
-------------|------------
Access       | Hide differences in data representation and how a resource is accessed
Location     | Hide where a resource is located
Migration    | Hide that a resource may move to another location
Relocation   | Hide that a resource may be moved to another location while in use
Replication  | Hide that a resource is replicated
Concurrency  | Hide that a resource may be shared by several competitive users
Failure      | Hide the failure and recovery of a resource
Persistence  | Hide whether a (software) resource is in memory or on disk

Distributed System: Heterogeneity
Application Program or Client
- Driver for A -> Server A
- Driver for B -> Server B
- Driver for C -> Server C
The client has to be provided with a different driver for each server.

Distributed System: Heterogeneity and Object-Oriented Approach
Application Program or Client -> Predefined interface -> Server A, Server B, Server C
• Wrapping with predefined interface
• Encapsulation: Object-Oriented Approach

Software Concepts
System     | Description | Main Goal
-----------|-------------|----------
DOS        | Tightly-coupled operating system for multiprocessors and homogeneous multicomputers | Hide and manage hardware resources
NOS        | Loosely-coupled operating system for heterogeneous multicomputers (LAN and WAN) | Offer local services to remote clients
Middleware | Additional layer on top of NOS implementing general-purpose services | Provide distribution transparency
• An overview of
  • DOS (Distributed Operating Systems)
  • NOS (Network Operating Systems)
  • Middleware

Network Operating System
• General structure of a network operating system. (1-19)

Network Operating System
• Two clients and a server in a network operating system. (1-20)

Middleware
• General structure of a distributed system as middleware. (1-22)

Middleware and Openness
• In an open middleware-based distributed system, the protocols used by each middleware layer should be the same, as well as the interfaces they offer to applications. (1.23)
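
The wrapping idea from the "Heterogeneity and Object-Oriented Approach" slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names (ResourceServer, ServerA, ServerB, the wrappers) and the methods get_record, lookup, and fetch are hypothetical and do not come from the lecture. The point is that the client is written once against the predefined interface instead of needing one driver per server.

```python
from abc import ABC, abstractmethod


class ResourceServer(ABC):
    """Predefined interface seen by the client (hypothetical name)."""

    @abstractmethod
    def fetch(self, key: str) -> str:
        """Return the value stored under key."""


class ServerA:
    """Stand-in for a server with its own native call style (assumed)."""

    def get_record(self, key: str) -> str:
        return f"A:{key}"


class ServerB:
    """Stand-in for a second server with a different native call (assumed)."""

    def lookup(self, table: str, key: str) -> str:
        return f"B:{table}/{key}"


class ServerAWrapper(ResourceServer):
    """Wraps ServerA behind the predefined interface."""

    def __init__(self, server: ServerA) -> None:
        self._server = server

    def fetch(self, key: str) -> str:
        return self._server.get_record(key)


class ServerBWrapper(ResourceServer):
    """Wraps ServerB; the table name is a detail hidden from the client."""

    def __init__(self, server: ServerB, table: str = "default") -> None:
        self._server = server
        self._table = table

    def fetch(self, key: str) -> str:
        return self._server.lookup(self._table, key)


def client_read(servers: list[ResourceServer], key: str) -> list[str]:
    """Client code: uses only fetch(), never a server-specific driver."""
    return [server.fetch(key) for server in servers]
```

Adding a Server C in this sketch would mean writing one more wrapper; client_read would stay unchanged, which is the encapsulation benefit the slide points to.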
Multitiered Architectures: Example

Alternatives for Multitiered Architectures (1-29)

Multicomputer Operating Systems
• General structure of a multicomputer operating system (1.14)

Distributed Shared Memory Systems
a) Pages of address space distributed among four machines
b) Situation after CPU 1 references page 10
c) Situation if page 10 is read only and replication is used

Comparison between Systems
A comparison between multiprocessor operating systems, multicomputer operating systems, network operating systems, and middleware-based distributed systems.

Item                    | Distributed OS (Multiproc.) | Distributed OS (Multicomp.) | Network OS | Middleware-based OS
------------------------|-----------------------------|-----------------------------|------------|--------------------
Degree of transparency  | Very High                   | High                        | Low        | High
Same OS on all nodes    | Yes                         | Yes                         | No         | No
Number of copies of OS  | 1                           | N                           | N          | N
Basis for communication | Shared memory               | Messages                    | Files      | Model specific
Resource management     | Global, central             | Global, distributed         | Per node   | Per node
Scalability             | No                          | Moderately                  | Yes        | Varies
Openness                | Closed                      | Closed                      | Open       | Open
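
The three situations on the Distributed Shared Memory Systems slide (initial placement, a reference that moves a page, and replication of a read-only page) can be mimicked with a small Python model. It is a rough sketch under assumptions: the class DistributedSharedMemory, its reference method, and the page placement used in the usage note are made up for illustration and are not taken from the lecture figure. The model only tracks which machines hold a copy of each page; it does not model memory contents or invalidation.

```python
class DistributedSharedMemory:
    """Toy model: tracks which machines hold a copy of each page."""

    def __init__(self, placement, read_only=()):
        # placement maps machine id -> set of page numbers it holds initially.
        self.holders = {}
        for machine, pages in placement.items():
            for page in pages:
                self.holders.setdefault(page, set()).add(machine)
        # Pages in read_only may be copied instead of moved.
        self.read_only = set(read_only)

    def reference(self, machine, page):
        """Simulate a CPU touching a page; report what had to happen."""
        holders = self.holders[page]
        if machine in holders:
            return "hit"            # page is already local
        if page in self.read_only:
            holders.add(machine)    # case c): keep the old copy, add a new one
            return "replicated"
        self.holders[page] = {machine}  # case b): page moves to the requester
        return "migrated"
```

For example, with an arbitrary placement such as {1: {0, 2, 5, 9}, 2: {1, 3, 6, 8, 10}, 3: {4, 7, 11}, 4: {12, 13, 14, 15}}, calling reference(1, 10) moves page 10 from machine 2 to machine 1. If the model is instead built with read_only={10}, the same call leaves a copy on both machines.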