Adaptation of Legacy Software to Grid Services

Bartosz Baliś, Marian Bubak, and Michał Węgiel
Institute of Computer Science / ACC CYFRONET AGH, Cracow, Poland
bubak@uci.agh.edu.pl

WS on Component Models and Systems for Grid Applications, St Malo, June 26, 2004

Outline
• Introduction – motivation & objectives
• System architecture – static model (components and their relationships)
• System operation – dynamic model (scenarios and activities)
• System characteristics
• Migration framework (implementation)
• Performance evaluation
• Use case & summary

Introduction
Legacy software
• Validated and optimized code
• Follows the traditional process-based model of computation (language- and system-dependent)
• Scientific libraries (e.g. BLAS, LINPACK)
Service-oriented architecture (SOA)
• Enhanced interoperability
• Language-independent interface (WSDL)
• Execution within a system-neutral runtime environment (virtual machine)

Objectives
Originally: adaptation of the OCM-G to GT 3.0
[Diagram: a tool speaks OMIS to a grid service wrapping the Service Manager (SM), which in turn speaks OMIS to Local Monitors (LM) on the site's nodes]
After generalization:
• design of a versatile architecture bridging legacy software and SOA
• implementation of a framework providing tools that facilitate the migration to SOA

Related Work
There is a lack of comprehensive solutions; existing approaches have numerous limitations and fail to meet grid requirements.
Kuebler D., Eibach W.: Adapting Legacy Applications as Web Services (IBM)
[Diagram: the client calls a web service hosted in the web service container; an adapter forwards the call to the legacy server]
Main disadvantages: insecurity & inflexibility

General Architecture
[Diagram: the service requestor communicates over SOAP with the hosting environment (Registry, Factory, Instance, Proxy Factory, Proxy Instance services), which in turn communicates over SOAP with the processes of the legacy system (Master, Monitor, Slave)]

Service Requestor
• From the client's perspective, cooperation with legacy systems is fully transparent
• Only two services are accessible: factory and instance; the others are hidden
• The standard interaction pattern is followed:
  • First, a new service instance is created
  • Next, method invocations are performed
  • Finally, the service instance is destroyed
• We assume a thin-client approach

Legacy System (1/4)
• Constitutes the environment in which the legacy software resides and is executed
• Responsible for the actual request processing
• Hosts three types of processes: master, monitor and slave, which jointly provide a wrapper encapsulating the legacy code
• Acts as a network client when communicating with the hosting environment (thus no open incoming ports are introduced and process migration is possible)

Legacy System (2/4)
[Diagram: Master creates Monitor and Slave; Monitor controls Slave]
Master: one per host; a permanent process responsible for host registration and for the creation of monitor and slave processes
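To make the master's role concrete, here is a minimal sketch of its volunteer loop, written in Java for consistency with the later examples (the real legacy side would typically be C/C++ with gSOAP). The Registry interface, the timeout constant and the spawn helper are illustrative assumptions, not the framework's actual API.

    import java.util.Optional;

    // Hypothetical registry contract: volunteer for work while reporting CPU load;
    // returns a client id if one is assigned before the timeout, empty otherwise.
    interface Registry {
        Optional<String> volunteer(double cpuLoad, long timeoutMs);
    }

    public class Master {
        static final long TIMEOUT_MS = 30_000; // assumed timeout value

        static void run(Registry registry) {
            while (true) {                     // permanent, one instance per host
                double load = readCpuLoad();
                Optional<String> client = registry.volunteer(load, TIMEOUT_MS);
                // On assignment, create the per-client transient processes;
                // on timeout, simply volunteer again.
                client.ifPresent(id -> {
                    spawn("monitor", id);      // reports on and controls the slave
                    spawn("slave", id);        // converses with the legacy code
                });
            }
        }

        static double readCpuLoad() { return 0.0; }                  // placeholder
        static void spawn(String role, String clientId) { /* fork/exec */ }
    }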
Legacy System (3/4)
Monitor: one per client; a transient process responsible for reporting on and controlling the associated slave process

Legacy System (4/4)
Slave: one per client; a transient process providing the means for an interface-based, stateful conversation with the legacy software

Hosting Environment (1/5)
• Maintains a collection of grid services which encapsulate the interaction with legacy systems
• Provides a layer of indirection shielding the service requestors from collaboration with backend hosts
• Responsible for the mapping between clients and slave processes (a one-to-one relationship)
• Mediates the communication between service requestors and legacy systems

Hosting Environment (2/5)
[Diagram: permanent services (Registry, Factory, Proxy Factory) and transient services (Instance, Proxy Instance)]
Registry: one per service; keeps track of the backend hosts which have registered to participate in computations

Hosting Environment (3/5)
Factory: one per service; responsible for the creation of the corresponding instances

Hosting Environment (4/5)
Instance: one per client; directly called by the client and provides the externally visible functionality

Hosting Environment (5/5)
Proxy Instance: one per client; responsible for the mediation between the backend host and the service client

Resource Management (1/2)
• Resources = processes (master/monitor/slave)
• The registry service maintains a pool of master processes which can be divided into:
  • a static part – configured manually by site administrators (system boot scripts)
  • a dynamic part – managed by means of a job submission facility (GRAM)
• Optimization: coarse-grained allocation and reclamation are performed in advance, in the background (efficiency, smooth operation); a sketch follows the next slide

Resource Management (2/2)
• Coarse-grained resource = master process
• Fine-grained resource = monitor & slave process
[Diagram: coarse-grained allocation (steps c.1–c.5) in which the Registry engages the Resource Broker, Information Services, Data Management and Job Submission to start a Master; fine-grained allocation (steps f.1–f.2) in which the Master creates the Monitor/Slave pair]
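One way the background allocation could be organized is sketched below. The class, the queue-based pool and the target-size policy are assumptions made for illustration, not the registry's actual implementation.

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // Keeps the pool of idle master processes near a target size so that
    // coarse-grained allocation happens ahead of client demand.
    public class MasterPoolManager {
        private final BlockingQueue<String> idleMasters = new LinkedBlockingQueue<>();
        private final int targetIdle;

        MasterPoolManager(int targetIdle) { this.targetIdle = targetIdle; }

        // Runs periodically in the background, so clients never wait for job submission.
        void rebalance() {
            int deficit = targetIdle - idleMasters.size();
            for (int i = 0; i < deficit; i++)
                submitMasterJob();            // dynamic part: start a master as a grid job
            for (int i = 0; i < -deficit; i++) {
                String master = idleMasters.poll();
                if (master != null) reclaim(master); // shut down surplus masters
            }
        }

        // Called when a master volunteers / when a client needs a slave.
        void register(String masterId) { idleMasters.add(masterId); }
        String assign() throws InterruptedException { return idleMasters.take(); }

        private void submitMasterJob() { /* e.g. via a GRAM-like facility (assumed) */ }
        private void reclaim(String masterId) { /* terminate the master */ }
    }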
Invocation patterns
Apart from the synchronous, sequential mode of method invocation, our solution supports:
1. Asynchrony – assumed to be embedded in the legacy software; in our approach the invocation returns immediately and a separate thread blocks on a complementary call, waiting for the output data to appear (see the sketch after this list)
2. Concurrency – slave processes handle each client request in a separate thread
3. Transactions – the most general model, concurrent nested transactions, is assumed
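A minimal client-side sketch of this asynchronous pattern, assuming a hypothetical pair of complementary operations (startCompute/waitForResult) rather than the actual generated interface:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    // Assumed service contract: the request call returns at once, and a
    // complementary blocking call delivers the output when it appears.
    interface ServiceInstance {
        void startCompute(String input);
        String waitForResult();
    }

    public class AsyncClient {
        public static Future<String> invoke(ServiceInstance svc, String input) {
            svc.startCompute(input);                      // non-blocking request
            ExecutorService executor = Executors.newSingleThreadExecutor();
            try {
                return executor.submit(svc::waitForResult); // separate thread blocks for output
            } finally {
                executor.shutdown();                      // executor ends once the result arrives
            }
        }
    }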
Legacy Side Scenarios (1/2)
1. Client assignment – the master process repeatedly volunteers to participate in request processing (reporting the host's CPU load); when the registry service assigns a client before the timeout occurs, new monitor and slave processes are created
2. Request processing – comprises input retrieval, request processing and output delivery
3. System self-monitoring – the monitor process periodically reports to the proxy instance on the status of the slave process and on current CPU load statistics (both system- and slave-related)

Legacy Side Scenarios (2/2)
[Sequence diagram: the Registry assigns a client to the Master (success or timeout); on success the Master creates the Monitor and Slave; the Slave exchanges requests and responses with the Proxy Instance while the Monitor sends heartbeats; on migration or timeout the processes are destroyed]

Client Side Scenarios (1/2)
1. Instance construction – involves two steps:
  • creation of the associated proxy instance
  • assignment of one of the currently registered master processes
2. Method invocation – the client call is forwarded to the proxy instance, from where it is fetched by the associated slave process; the requestor is blocked until the response arrives
3. Instance destruction – the destruction request is forwarded to the associated proxy instance

Client Side Scenarios (2/2)
[Sequence diagram: the Factory creates a new Instance and, via the Proxy Factory, a new Proxy Instance; the Registry assigns a master; subsequent Invoke and Destroy calls are forwarded from the Instance to the Proxy Instance]

Process Migration (1/5)
Indispensable when we need to:
• dynamically offload work onto idle machines (automatic load balancing)
• silently mask recovery from system failures (transparent fail-over)
Challenges: state extraction & reconstruction
Low-level approach
• Suitable only for homogeneous environments (e.g. a cluster of workstations)
• Supported by our solution, since legacy systems act as clients rather than servers

Process Migration (2/5)
High-level approach
• Can be employed in heterogeneous environments
• State restoration is based on a combination of checkpointing and repetition of the short-term method invocation history
• Requires additional development effort (state serialization, snapshot dumping and loading)
The proxy instance initiates high-level recovery upon detection of a failure (lost heartbeat) or an overload; only the slave and monitor processes are transferred to another computing node

Process Migration (3/5)
The optimal state reconstruction scenario is selected on the basis of the transaction flow and the checkpoint sequence: multiple state snapshots are recorded and the one enabling the fastest recovery is chosen.
[Timeline: committed, aborted and unfinished transactions between checkpoints and the failure point; transactions covered by the chosen checkpoint are omitted, the remaining ones are repeated]

Process Migration (4/5)
The CPU load generated by the slave process (as reported by the monitor process) is approximated as a function of time and used to estimate the cost of invocations:

    c = f \int_{t_1}^{t_2} l(t)\,dt

where c is the total cost, f the frequency, l the CPU load and t the time.
[Plot: CPU load [%] versus time [ms], sampled at probe points between t_1 and t_2]

Process Migration (5/5)
In the case of concurrent method invocations, the synchronization mechanisms employed on the client side must be emulated:
• timing data is gathered (method invocation start & end timestamps)
• if two operations overlapped in time, they are re-executed concurrently (otherwise sequentially)
Prerequisite: repeatable invocations (unless the system state has changed, identical results are expected in response to the same input data)

System Features (1/3)
Non-functional requirements:
• QoS-related (the manner in which service provisioning takes place): performance & dependability
• TCO-related (the expenses incurred by system maintenance): scalability & expandability
Efficiency – coarse-grained resource allocation; the pool of master processes always reflects actual needs; the algorithms have linear time complexity; checkpointing and transactions jointly allow the selection of the optimal recovery scenario

System Features (2/3)
Availability – fault tolerance based on both low-level and high-level process migration; failure detection and self-healing; checkpointing allows robust error recovery; in the worst case A = 50% (when the whole call history needs to be repeated we have MTTF = MTTR; see the derivation after this slide)
Security – no open incoming ports are introduced on backend hosts; authentication of legacy systems is possible; we rely on the grid security infrastructure provided by the container
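The worst-case figure follows from the standard steady-state availability formula: when recovery amounts to replaying the whole call history, the mean time to repair equals the mean time to failure, hence:

    A = \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MTTR}}
      = \frac{\mathrm{MTTF}}{2\,\mathrm{MTTF}}
      = 50\%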
System Features (3/3)
Scalability – processing is highly distributed and parallelized (all tasks are always delegated to legacy systems); load balancing is guaranteed (by the registry and the proxy instance); the job submission mechanism is exploited (resource brokering)
Versatility – no assumptions are made regarding the programming language or the run-time platform; portability; non-intrusiveness (no alteration of the legacy code is needed); standards compliance and interoperability

Migration Framework (1/2)
• Code-named L2G (Legacy To Grid)
• Based on GT 3.2 (hosting environment) and gSOAP 2.6 (legacy system)
• Objective: to facilitate the adaptation of legacy C/C++ software to GT 3.2 services by automatic code generation (with particular emphasis on ease of use and universality)
• Structurally and operationally compliant with the proposed architecture
• Served as a proof of concept of our solution

Migration Framework (2/2)
The most typical development cycle:
1. The programmer specifies the interface that will be exposed by the deployed service (Java)
2. Source code generation takes place (Java/C++/XML/shell scripts)
3. The programmer provides the implementation of the methods on the legacy system side (C++)
Support for process migration, checkpointing, transactions and MPI (a parallel application consists of multiple slave processes, one of which is in charge of the communication with the proxy instance)

Performance evaluation (1/5)
• Benchmark: comparison of two functionally equivalent grid services (with the same interface), one of which depended on a legacy system
• Both services exposed a single operation: int length(String s)
• Time measurement was performed on the client side; all components were located on a single machine; no security mechanism was employed; the relative overhead was estimated

Performance evaluation (2/5)
[Plot: method invocation time [ms] versus message length [kB] (0–50 kB) for the legacy and the ordinary service; model: time = length/bandwidth + latency]

Performance evaluation (3/5)
[Plot: instance construction time [s] versus the number of iterations (0–40) for the legacy and the ordinary service; model: time = iterations/throughput]

Performance evaluation (4/5)
[Plot: instance destruction time [s] versus the number of iterations (0–40) for the legacy and the ordinary service; model: time = iterations/throughput]

Performance evaluation (5/5)
Instance construction and destruction:
Scenario     | Ordinary service  | Legacy service    | Relative change
Construction | 6.2 iterations/s  | 2.0 iterations/s  | reduced 3.1x
Destruction  | 25.4 iterations/s | 12.2 iterations/s | reduced 2.1x

Method invocation:
Quantity  | Ordinary service | Legacy service | Relative change
Bandwidth | 909.1 kB/s       | 370.4 kB/s     | reduced 2.5x
Latency   | 15.4 ms          | 37.8 ms        | increased 2.5x
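For illustration, the client-side measurement can be as simple as the following sketch, where the LengthService stub stands in for the generated client proxy (an assumption; the actual harness is not shown in the slides):

    // Stand-in for the client proxy of the benchmarked service.
    interface LengthService {
        int length(String s);   // the single benchmark operation from the slides
    }

    public class Benchmark {
        // Mean invocation time in ms for a payload of the given size;
        // fitting this against length yields latency and bandwidth.
        static double averageInvocationMs(LengthService service, int lengthKb, int repetitions) {
            String payload = "x".repeat(lengthKb * 1024);
            long start = System.nanoTime();
            for (int i = 0; i < repetitions; i++)
                service.length(payload);   // time = length/bandwidth + latency
            return (System.nanoTime() - start) / 1.0e6 / repetitions;
        }
    }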
Use Case: OCM-G
A grid application monitoring system composed of two components, the Service Manager (SM) and the Local Monitor (LM), compliant with the OMIS interface.
[Diagram: at the site, the SM (acting as the slave process) communicates via MCI with the LMs on the nodes and via SOAP with the Proxy Instance; the Instance exposes the service to clients over SOAP]

Summary
• We elaborated a universal architecture enabling the integration of legacy software into the grid services environment
• We demonstrated how to implement our concept on top of existing middleware
• We developed a framework (comprising a set of command-line tools) which automates the migration of C/C++ codes to GT 3.2
• Further work: WSRF, message-level security, optimizations, support for real-time applications

More info
www.icsr.agh.edu.pl/lgf/
See also www.eu-crossgrid.org and www.cyfronet.krakow.pl/ICCS2004/