Distributed Systems
Session 2: Distributed Software Engineering

Christos Kloukinas
Dept. of Computing
City University London

Software Engineering: the study of techniques used to produce high-quality software

© City University London, Dept. of Computing    Distributed Systems / 2 - 1

Outline
0 Last Session Summary + additional material
1 Motivation
2 The CORBA Object Model
3 The OMG Interface Definition Language (IDL)
4 Other Approaches
5 Summary

Summary & Key Points of Lecture 1
What is a Distributed System?
Adoption of DS is driven by Non-Functional Requirements
Distribution needs to be transparent to users and application designers
Transparency has several dimensions
Transparency dimensions depend on each other

Definition
A distributed system consists of a collection of autonomous computers, connected through a network and distributed operating system software, which enables computers to coordinate their activities and to share the resources of the system, so that users perceive the system as a single, integrated computing facility.
Certain common characteristics can be used to assess distributed systems: Resource Sharing, Openness, Concurrency, Scalability, Fault Tolerance, and Transparency.

Distributed System Types (Enslow 1978)
[Figure: classification of distributed systems along three axes, with "Fully Distributed" as the extreme point. Labels recoverable from the figure:
Control: master-slave; autonomous transaction based; autonomous fully cooperative.
Data and directory: fully replicated; not fully replicated master directory; local data, local directory.
Processors: homogeneous special purpose; homogeneous general purpose; heterogeneous special purpose; heterogeneous general purpose.]
Dimensions Of Transparency
Scalability Transparency
Performance Transparency
Failure Transparency
Migration Transparency
Replication Transparency
Concurrency Transparency
Access Transparency
Location Transparency

1.3 Model of a Distributed System
[Figure: each host (Host 1 .. Host n) stacks Component 1 .. Component n on Middleware, on a Network Operating System, on Hardware; the hosts are connected by a Network.]

Middleware Examples
Transaction-oriented
» IBM CICS
» BEA Tuxedo
» IBM Encina
» MS Transaction Server
Message-oriented
» MS Message Queue
» NCR TopEnd
» IBM MQSeries
» Sun Tooltalk
» Sun JavaSpaces
Procedural
» Sun ONC
» Linux RPCs
» OSF DCE
Object-oriented
» OMG CORBA
» Sun Java/RMI
» Microsoft COM
» Sun Enterprise Java Beans

0.3 Client-Server Computing (O’Leary 2000)
A client is defined as a requester of services
A server is defined as a provider of services
A single machine can be both a client and a server depending on the software configuration

0.4 Client-Server (O’Leary 2000)
Processing can be improved because client and server share processing loads
» Client/server computing assumes that the client has computing power that is not being used
» The fundamental idea is to break an application apart into components that can run on different platforms
Thin vs. Fat Clients:
» with a thin client, most of the functionality resides on the server;
» with a fat client, most of the functionality resides on the client.

0.5 Two tier architectures
The user system interface is usually located in the user's desktop environment.
Database management services are usually located on a server, a more powerful machine that services many clients.
Processing management is split between the user system interface environment and the database management server environment.
The database management server provides stored procedures and triggers.
Good for a LAN with a work group of fewer than 100 users

0.6 Three-Tiered Architecture
Three Tiered Architecture is an information model with distinct pieces (client, applications services and data sources) that can be distributed across a network.
Client Tier: The user component displays information, processes, graphics, communications, keyboard input and local applications.
Applications Service Tier: A set of sharable multitasking components that interact with clients and the data tier. It provides the controlled view of the underlying data sources.
Data Source Tier: One or more sources of data such as mainframes, servers, databases, data warehouses, legacy applications etc.

0.7 Examples of three tier architectures
Three tier architecture with transaction processing monitor technology
Three tier with message server
Three tier with an application server
Three tier with an ORB architecture (e.g., CORBA)
Distributed/collaborative enterprise architecture.

0.8 Three Tier Client/Server Object Style
[Figure: Tier 1 holds View Objects; Tier 2 holds Server Objects (Business Objects); Tier 3 holds Legacy Applications (DBMS, Lotus Notes, TP Monitors). The tiers are connected through ORBs using CORBA IIOP.]

1 CORBA – Motivation & Overview
Distributed Systems consist of multiple components.
Components are heterogeneous.
Components still have to be interoperable.
There has to be a common model for components, which expresses
» component states,
» component services, and
» interaction of components with other components.

1.1 Example 1: Java Object Model & Java Language
Object
» runtime entity, an instance of a class
Interface
» declares a set of methods for a Java object without implementation
Method Invocation
» primitive types are passed by value
» object references are passed by value

1.2 Example 2: Distributed Object Model (Wollrath et al.)
Remote object
» an object whose methods can be accessed from another address space
Remote interface
» an interface that declares the methods of a remote object
» its methods declare "throws RemoteException" to deal with the different failure model
RMI
» non-remote objects are passed by value
» remote objects are passed by remote reference

1.3 CORBA Object Model & OMG IDL
The model describes components, states, interactions and other concepts
OMG/IDL is a language for expressing all concepts of the CORBA object model.
» Separates interface from implementation
» Enables interoperability and transparency
» IDL compiles into client stubs and server skeletons
» Stubs and skeletons serve as proxies for clients and servers, respectively

1.4 CORBA
[Figure: clients written in C, C++, Java, Ada, Cobol or Smalltalk call an IDL-generated Client Stub; CORBA object implementations in the same languages sit behind an IDL-generated Server Skeleton. A Request travels from stub to skeleton through the Object Request Broker, which also gives access to the CORBA Services.]
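The proxy role of stubs and skeletons can be illustrated without any ORB. The following is a minimal Java sketch under stated assumptions: PriceService, the "operation:argument" string format and the direct method call that stands in for the network are all hypothetical, and real stubs and skeletons are generated by an IDL compiler rather than written by hand.

```java
import java.util.Map;

// Hypothetical service interface; in CORBA it would be generated from IDL.
interface PriceService {
    double price(String symbol);
}

// Server side: the object implementation that does the real work.
class PriceServiceImpl implements PriceService {
    private final Map<String, Double> prices = Map.of("ACME", 12.5, "INIT", 3.0);

    @Override
    public double price(String symbol) {
        Double value = prices.get(symbol);
        if (value == null) {
            throw new IllegalArgumentException("unknown symbol: " + symbol);
        }
        return value;
    }
}

// Skeleton: server-side proxy that unpacks a request and calls the implementation.
class PriceServiceSkeleton {
    private final PriceService target;

    PriceServiceSkeleton(PriceService target) {
        this.target = target;
    }

    // Request format (an assumption of this sketch): "<operation>:<argument>".
    String dispatch(String request) {
        String[] parts = request.split(":", 2);
        if (parts.length == 2 && parts[0].equals("price")) {
            return Double.toString(target.price(parts[1]));
        }
        throw new IllegalArgumentException("unknown request: " + request);
    }
}

// Stub: client-side proxy with the same interface as the remote object.
class PriceServiceStub implements PriceService {
    // Direct reference standing in for the ORB and the network.
    private final PriceServiceSkeleton broker;

    PriceServiceStub(PriceServiceSkeleton broker) {
        this.broker = broker;
    }

    @Override
    public double price(String symbol) {
        String reply = broker.dispatch("price:" + symbol); // pack the call and send it
        return Double.parseDouble(reply);                  // unpack the result
    }
}

class StubDemo {
    public static void main(String[] args) {
        PriceService service =
            new PriceServiceStub(new PriceServiceSkeleton(new PriceServiceImpl()));
        System.out.println(service.price("ACME"));
    }
}
```

The client only ever sees the PriceService interface, so swapping the stub for a local implementation would not change client code; this is the access transparency that the stub is meant to provide.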
1.5 Example 1: StockQuoter IDL Interface
    module Quoter {
      // Stock quoter server: an interface to query the prices of stock.
      exception Invalid_Stock_Symbol {};
      interface Stock;   // forward declaration
      interface Stock_Factory {
        Stock get_stock (in string stock_symbol) raises (Invalid_Stock_Symbol);
      };
      interface Stock {
        readonly attribute string symbol;     // Get the stock symbol.
        readonly attribute string full_name;  // Get the name.
        double price ();                      // Get the price.
      };
    };

2.3 Example 2: ATM Controller

Teller Controller IDL Definition
    interface ATM;
    interface TellerCtrl {
      typedef sequence<ATM> ATMList;
      exception InvalidPIN {};
      exception NotEnoughMoneyInAccount {...};
      readonly attribute ATMList ATMs;
      readonly attribute BankList banks;
      void accept_request(in Requester req, in short amount)
        raises(InvalidPIN, NotEnoughMoneyInAccount);
    };

2 The CORBA Object Model
Components → objects.
Component state → object attributes.
Usable component services → object operations.
Component interactions → operation execution requests.
Component service failures → exceptions.

3 The OMG Interface Definition Language
OMG/IDL is a language for expressing all concepts of the CORBA object model.
IDL is a 'contractual' language that lets you specify a component's (object's) boundaries and its interfaces with potential clients
CORBA IDL is language neutral and totally declarative, i.e., it does not define implementation details
It provides operating system and programming language independent interfaces to all services and objects that reside on the CORBA bus.
Different programming language bindings are available. (We’ll work with Java)
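To make the Java binding concrete, here is a hand-written approximation of how the Quoter interfaces might look to a Java programmer. It is a sketch, not the output of an IDL compiler: generated code also contains CORBA base types, helper and holder classes, and a Quoter package, all omitted here. InMemoryStockFactory, its register method and the sample data are hypothetical and exist only to exercise the interfaces.

```java
import java.util.HashMap;
import java.util.Map;

// exception Invalid_Stock_Symbol {};  approximated as a checked Java exception.
class Invalid_Stock_Symbol extends Exception {
    Invalid_Stock_Symbol(String symbol) {
        super("invalid stock symbol: " + symbol);
    }
}

// interface Stock: each readonly attribute becomes one accessor method.
interface Stock {
    String symbol();
    String full_name();
    double price();
}

// interface Stock_Factory: the IDL "raises" clause becomes a Java "throws" clause.
interface Stock_Factory {
    Stock get_stock(String stock_symbol) throws Invalid_Stock_Symbol;
}

// Hypothetical in-memory implementation, used only to exercise the interfaces.
class InMemoryStockFactory implements Stock_Factory {
    private final Map<String, Stock> stocks = new HashMap<>();

    void register(String symbol, String fullName, double price) {
        stocks.put(symbol, new Stock() {
            @Override public String symbol() { return symbol; }
            @Override public String full_name() { return fullName; }
            @Override public double price() { return price; }
        });
    }

    @Override
    public Stock get_stock(String stock_symbol) throws Invalid_Stock_Symbol {
        Stock stock = stocks.get(stock_symbol);
        if (stock == null) {
            throw new Invalid_Stock_Symbol(stock_symbol);
        }
        return stock;
    }
}

class QuoterDemo {
    public static void main(String[] args) {
        InMemoryStockFactory factory = new InMemoryStockFactory();
        factory.register("ACME", "Acme Corporation", 12.5);
        try {
            Stock stock = factory.get_stock("ACME");
            System.out.println(stock.full_name() + ": " + stock.price());
            factory.get_stock("NOPE"); // raises Invalid_Stock_Symbol
        } catch (Invalid_Stock_Symbol e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The IDL identifiers are kept unchanged (get_stock, full_name) so that each Java member can be matched against the IDL declaration it stands for.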
2.1 Types of Distributed Objects
Attributes, operations and exceptions are properties defined in object types.
An object type captures the properties shared by similar objects; only the objects' identities and the values of their attributes differ.
Objects may export these properties to other objects.
Objects are instances of types.
Object types are specified through interfaces that determine the operations that clients can request, that is, they define a contract that binds the interaction between client and server objects.

3.1 Types
A type is one of the following:
Atomic types (void, boolean, short, long, float, char, string),
Object types (interface),
Constructed types:
» Records (struct),
» Variants (union), and
» Lists (sequence), or
Named types, i.e. aliases (typedef).

3.1 Types (Examples)
    struct Requester {
      long PIN;
      string AccountNo;
      string Bank;
    };
    typedef sequence<ATM> ATMList;

2.2 Attributes
Attributes have a (unique) name and a type
The type can be an object type or a non-object type (e.g., Boolean values, characters or numbers).
Attributes are readable by other components
Attributes may or may not be modifiable by other components (readonly).
Attributes correspond to one or two operations (get/set).
Attributes are declared within an interface.
An attribute name must be unique within its interface.

2.2 Attributes (Examples)
    readonly attribute ATMList ATMs;
    readonly attribute BankList banks;
    readonly attribute string symbol;
    readonly attribute string full_name;
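The remark that an attribute corresponds to one or two operations can be sketched in Java as follows. This is only an illustration: the modifiable attribute note is hypothetical (every attribute in the examples above is readonly), and the class names are invented for the sketch.

```java
// Java view of two attributes:
//   readonly attribute string symbol;  -> one operation (get)
//   attribute string note;             -> two operations (get and set); hypothetical attribute
interface AnnotatedStock {
    String symbol();          // get
    String note();            // get
    void note(String value);  // set
}

class AnnotatedStockImpl implements AnnotatedStock {
    private final String symbol; // no set operation exists, so clients cannot change it
    private String note = "";

    AnnotatedStockImpl(String symbol) {
        this.symbol = symbol;
    }

    @Override public String symbol() { return symbol; }
    @Override public String note() { return note; }
    @Override public void note(String value) { this.note = value; }
}

class AttributeDemo {
    public static void main(String[] args) {
        AnnotatedStock stock = new AnnotatedStockImpl("ACME");
        stock.note("watch list");
        System.out.println(stock.symbol() + " / " + stock.note());
    }
}
```

Because a readonly attribute has no set operation in the interface at all, the restriction is enforced by the type system of the client language rather than by a runtime check.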
2.3 Operations
Operations modify the state of an object or just compute functions
Used for service requests
Operations have a signature that consists of
» a name,
» a list of in, out, or inout parameters,
» a return value type (result) or void if none, and
» a list of exceptions that the operation can raise.

2.3 Operations (Examples)
    void accept_request(in Requester req, in short amount)
      raises(InvalidPIN, NotEnoughMoneyInAccount);
    short money_in_dispenser(in ATM dispenser)
      raises(InvalidATM);

2.4 Operation Execution Requests
A client object can request an operation execution from a server object.
An operation request is expressed by sending a message (operation name) to the server object.
Conceptually, an object request is a triple consisting of an object reference, the name of an operation and a list of actual parameters.
Parameters are marshalled (packaged and transmitted, e.g., by serialisation)
Clients have to react to exceptions that the operation may raise.

2.5 Exceptions
Service requests may not be executed properly.
Exceptions have a unique name.
Exceptions may declare additional data structures.
Exceptions are used to explain (and locate) the reason for a failure to the requester of the operation
Operation execution failures may be
» generic (system), raised by the middleware, e.g., an unreachable server object; or
» specific, raised by the server object, when the execution of a request would violate the object’s integrity, e.g., not enough money in a bank account.
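The request triple and the two kinds of failure can be sketched together in Java. This is a toy model under stated assumptions: ToyBroker, ObjectUnreachable, Account and the string object references are hypothetical stand-ins for the middleware, and marshalling is skipped because everything runs in one address space. NotEnoughMoneyInAccount mirrors the exception of the session's ATM example, including its available field.

```java
import java.util.List;
import java.util.Map;

// Specific failure, mirroring: exception NotEnoughMoneyInAccount { short available; };
class NotEnoughMoneyInAccount extends Exception {
    final short available;

    NotEnoughMoneyInAccount(short available) {
        super("only " + available + " available");
        this.available = available;
    }
}

// Generic failure raised by the middleware stand-in, not by the server object.
class ObjectUnreachable extends RuntimeException {
    ObjectUnreachable(String objectRef) {
        super("no object behind reference: " + objectRef);
    }
}

// An object request as a triple: reference, operation name, actual parameters.
final class Request {
    final String objectRef;
    final String operation;
    final List<Object> args;

    Request(String objectRef, String operation, List<Object> args) {
        this.objectRef = objectRef;
        this.operation = operation;
        this.args = args;
    }
}

// Server object that protects its own integrity.
class Account {
    private short balance;

    Account(short balance) {
        this.balance = balance;
    }

    void withdraw(short amount) throws NotEnoughMoneyInAccount {
        if (amount > balance) {
            throw new NotEnoughMoneyInAccount(balance);
        }
        balance -= amount;
    }
}

// Hypothetical broker: resolves the reference and dispatches by operation name.
class ToyBroker {
    private final Map<String, Account> objects;

    ToyBroker(Map<String, Account> objects) {
        this.objects = objects;
    }

    void invoke(Request request) throws NotEnoughMoneyInAccount {
        Account target = objects.get(request.objectRef);
        if (target == null) {
            throw new ObjectUnreachable(request.objectRef); // generic failure
        }
        if (!request.operation.equals("withdraw")) {
            throw new IllegalArgumentException("unknown operation: " + request.operation);
        }
        target.withdraw((Short) request.args.get(0)); // may raise the specific failure
    }
}

class RequestDemo {
    public static void main(String[] args) {
        ToyBroker broker = new ToyBroker(Map.of("acct-1", new Account((short) 100)));
        try {
            broker.invoke(new Request("acct-1", "withdraw", List.<Object>of((short) 70)));
            broker.invoke(new Request("acct-1", "withdraw", List.<Object>of((short) 70)));
        } catch (NotEnoughMoneyInAccount e) {
            System.out.println("specific failure, available = " + e.available);
        }
        try {
            broker.invoke(new Request("acct-9", "withdraw", List.<Object>of((short) 1)));
        } catch (ObjectUnreachable | NotEnoughMoneyInAccount e) {
            System.out.println("generic failure: " + e.getMessage());
        }
    }
}
```

The specific exception is checked, so the compiler forces the client to react to it, whereas the generic failure is unchecked because it can occur on any request regardless of the operation's signature.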
2.5 Exceptions cont…
Specific failures may be explained by specific exceptions
Example
    exception InvalidPIN {};
    exception InvalidATM {};
    exception NotEnoughMoneyInAccount {
      short available;
    };

3.5 Interfaces
In distributed systems, services are syntactically specified through interfaces that capture the names of the functions available together with types of the parameters, return values, possible exceptions, etc.
There is no legal way a process can access or manipulate the state of an object other than invoking methods made available to it through an object’s interface.

3.5 Interfaces
Attributes, exceptions and operations are defined in interfaces.
Interfaces have an identifier, which denotes the object type associated with the interface.
Interfaces must be declared before they can be used.
Interfaces can be declared in a forward manner

3.5 Interfaces (Example)
    interface ATM; /* forward declaration! */
    interface TellerCtrl {
      typedef sequence<ATM> ATMList;
      exception InvalidATM {};
      exception InvalidPIN {};
      exception NotEnoughMoneyInAccount {short available;};
      readonly attribute ATMList ATMs;
      readonly attribute BankList banks;
      void accept_request(in Requester req, in short amount)
        raises(InvalidPIN, NotEnoughMoneyInAccount);
    };

3.6 Modules
A single global name space for all identifiers is unreasonable.
IDL includes modules to restrict the visibility of identifiers.
Identifiers from other modules are accessed by qualifying them with the module identifier: moduleName::identifierName
3.6 Modules (Example)
    module Bank {
      interface AccountDB {};
    };
    module ATMNetwork {
      typedef sequence<Bank::AccountDB> BankList;
      exception InvalidPIN {};
      interface ATM;
      interface TellerCtrl {...};
    };

2.6 Sub-typing/Inheritance
Object types are organised in a type hierarchy.
Subtypes inherit attributes, exceptions and operations from their supertypes.
Subtypes can add more specific properties.
Subtypes can redefine inherited properties.
Advantages:
» Reuse
» Changes are easier to manage
» Abstraction makes designing DS elegant and easier to understand
» Enables polymorphism (an attribute or parameter can refer to instances of different types).

3.7 Inheritance
Notation to define the object type hierarchy.
The type hierarchy has to form an acyclic graph.
The type hierarchy graph has one root, called Object.
Subtypes inherit the attributes, exceptions and operations of all their super-types.

3.7 Inheritance (Examples)
    interface Controllee;
    interface Ctrl {
      typedef sequence<Controllee> CtrleeList;
      readonly attribute CtrleeList controls;
      void add(in Controllee new_controllee);
      void discard(in Controllee old_controllee);
    };
    interface ATM : Controllee {...};
    interface TellerCtrl : Ctrl {...};

3.7 Multiple Inheritance
An object type can inherit from more than one super-type.
This may cause name clashes if different super-types export the same identifier.
Example:
    interface Set {
      void add(in Element new_elem);
    };
    interface TellerCtrl : Set, Ctrl { ... };
Name clashes are not allowed! (Here both Set and Ctrl export add.)

3.8 Redefinition
Behaviour of an operation as defined in a super-type may not be appropriate for a subtype.
The operation can be redefined in the subtype.
Binding messages to operations is dynamic.
The operation signature must not be changed.
Operations in (abstract) super-types are not implemented.

3.8 Redefinition (Example)
    interface Ctrl {
      void add(in Controllee new_controllee);
    };
    interface TellerCtrl : Ctrl {
      void add(in ATM new_controllee);   // not allowed
    };
TellerCtrl cannot redefine add’s interface, only its behaviour! It cannot overload it either!

3.9 Polymorphism
Objects can be assigned to an attribute or passed as a parameter, even though they are instances of subtypes of the attribute’s/parameter’s respective type.
Attributes, parameters and operations are polymorphic.
Example: using polymorphism, instances of type ATM can be inserted into the attribute controls that TellerCtrl has inherited from Ctrl.

2.7 Problems of the Model
Interactions between components are not defined in the model.
No concept for abstract or deferred types.
The model does not include primitives for the behavioural specification of operations.
The semantics of the model is only defined informally.

4 Other Approaches: (D)COM
(D)COM is Microsoft’s Distributed Component Object Model (http://microsoft.com/com/).
Evolved from OLE/COM.
Weaker than the CORBA object model since it
» does not support inheritance,
» does not have a strong type system and
» does not support exceptions.

4 Other Approaches: Darwin
Experimental language developed at Imperial College
http://www-dse.doc.ic.ac.uk/Research/Darwin
Supports dynamic configuration of distributed components.
Graphical and textual notation.
Components provide and require services.
Primitive for binding a service requester to a service provider.
Formal semantics based on Milner’s π-calculus.

5 Summary
Client-Server vs n-Tier Architecture
Why do we need a component model?
What are the primitives of the CORBA object model?
What is OMG/IDL?
What are the strengths and weaknesses of the CORBA approach?

EXTRA MATERIAL (Not to be examined)

0.2 Further Examples: Computational Grids
Inspired by the electrical power grid’s pervasiveness, reliability and ease of use, computer scientists in the mid-90s began exploring the design and development of an analogous infrastructure called the computational power Grid

Vision
To build an environment that enables
» sharing,
» selection,
» aggregation
of a wide variety of geographically distributed resources including
» supercomputers,
» storage systems, data sources, and
» specialised devices
owned by different organisations for solving large-scale resource intensive problems in science, engineering, and commerce (Buyya, 2002).

0.2 Computational Grid
Motivation:
Small computing resources such as PCs have the potential to provide vast computing power when connected. And yet…
Many of these resources lie idle most of the time.
Millions of online PCs are only involved in tasks like word processing or browsing the Internet.
The computing resources of many organisations are often severely under-utilised, especially outside of peak business hours.
At the same time, there are many individuals and organisations that have intensive computations to perform but only limited access to resources on which to execute them.
Possible exploitation (Source: IBM)
Analyze the value of an investment portfolio in minutes rather than hours?
Unite research teams with others around the world to take advantage of the most up-to-date knowledge?
Significantly accelerate the drug discovery process?
Scale your business to meet cyclical demand?
Cut the design time of your products in half while reducing the instances of defects?
Source: http://www-1.ibm.com/grid/about_grid/index.shtml

Online Access to Scientific Instruments
[Figure: Advanced Photon Source; real-time collection; tomographic reconstruction; archival storage; wide-area dissemination; desktop & VR clients with shared controls.]
DOE X-ray grand challenge: ANL, USC/ISI, NIST, U.Chicago

Data Grids for High Energy Physics
[Figure: tiered data grid. Labels recoverable from the figure:
Online System: ~PBytes/sec in, ~100 MBytes/sec out
Offline Processor Farm: ~20 TIPS
Tier 0: CERN Computer Centre, ~100 MBytes/sec
Tier 1: France Regional Centre, Germany Regional Centre, Italy Regional Centre, FermiLab (~4 TIPS); ~622 Mbits/sec or Air Freight (deprecated)
Tier 2: Tier2 Centres of ~1 TIPS each, including Caltech (~1 TIPS); ~622 Mbits/sec
Institutes: ~0.25 TIPS, each with a physics data cache; ~622 Mbits/sec
Tier 4: Physicist workstations; ~1 MBytes/sec]
There is a “bunch crossing” every 25 nsecs.
There are 100 “triggers” per second.
Each triggered event is ~1 MByte in size.
1 TIPS is approximately 25,000 SpecInt95 equivalents.
Physicists work on analysis “channels”. Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
Image courtesy Harvey Newman, Caltech

Network for Earthquake Engineering Simulation
NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other
On-demand access to experiments, data streams, computing, archives, collaboration
NEESgrid: Argonne, Michigan, NCSA, UIUC, USC

Home Computers Evaluate AIDS Drugs
Community =
» 1000s of home computer users
» Philanthropic computing vendor (Entropia)
» Research group (Scripps)
Common goal = advance AIDS research

Computational Grids Resources
Global Grid Forum (http://www.gridforum.org/): community-initiated forum of 5000+ individual researchers and practitioners working on distributed computing, or "grid" technologies
GridComputing (http://www.gridcomputing.com/)
myGrid (http://www.mygrid.org.uk/), an EPSRC project
Platforms:
» Globus (http://www.globus.org/)
» Unicore (http://www.unicore.org/)
» Load Sharing Facility (http://www.platform.com/)