Sib: Alternatives of stateful replication in application servers
Huaigu WU
McGill University
February, 2003

Purpose of Sib
• An alternative approach to implementing replication in application servers
  – Abstract level (derived from the J2EE specification)
  – Focus on Stateful Session Beans
  – Will be extended to Stateless Session Beans and Entity Beans

Outline
• Sib vs. JBoss implementation
  – Simple calls
  – Calling hierarchies
  – Direct database access
• Compare
  – Normal processing
  – Site crash and failover
[Diagrams, one per scenario; components shown: Client, Application Server, SFB 1, SFB 2, Data]

Assumption
• We do not consider partitions.
• Abstract component architecture
  – The components in this talk are the client, the Stateful Session Bean (SFB) and a third-party database
• The third-party database is used without modification
• Deterministic components for the basic analysis
• One primary, one (or more) backups
• So far, we assume every component is single-threaded

Normal Processing in Simple Calls (JBoss)
• Only the primary receives and processes requests
• The primary's up-to-date state is replicated to the backups at the end of every request, before the reply is returned
• The backup updates its state after every request
[Diagram: Client and a cluster with SFB 1 (Primary) and SFB 1 (Backup); message steps 1 to 6]

Normal Processing in Simple Calls (Sib)
• At the beginning, the states of all replicas are the same
• Each request is broadcast to both primary and backup
• The primary processes requests
• The backup receives requests
• Occasionally (periodically), the primary's state is replicated to the backup
  – Replication point (operational quiescence)
    • Independent for each component
    • Deterministic component: it can be at any point in time
[Diagram: Client and a cluster with SFB 1 (Primary) and SFB 1 (Backup); message steps 0 to 4 and 4']

Comparison
• Network overhead for one component
  – Message types
    • Request (req), Reply (rep), Replicated state information (s)
  – Message size: s >> req
    • JBoss analysis: s = 2-3 KByte
  – Number of messages (N is the number of sites)
    • JBoss: 1 request: 1 req + 1 rep + N*s
    • Sib: 1 request: N req + 1 rep + q*N*s (q is the frequency of replication, q <= 1)
    • Compare: JBoss > Sib
• Backup's CPU overhead
  – JBoss: receive and install state
  – Sib: receive requests
  – Assuming that installing state costs more than receiving a request, JBoss > Sib

Site Crash and Failover for Simple Calls (JBoss)
• Characteristics
  – When the primary fails, the client resends the request to the new primary
• Issues
  – The resend must be repeatable
    • Content
  – Duplicated requests must be identified by the SFB
[Diagrams: Client and a cluster with SFB 1 (Primary) and SFB 1 (Backup); labels: “Resend 1”, “Identify duplicated request 1 (Does JBoss do it?)”]

Site Crash and Failover for Simple Calls (Sib)
• Characteristics
  – When the primary fails, the new primary automatically re-executes all requests starting from the last replication point
  – The client does not need to resend the request
• Issues
  – Duplicated replies must be identified by the client
  – Replies must be deterministic
[Diagrams: Client and a cluster with SFB 1 (Primary) and SFB 1 (Backup); labels: “Execute request 1”, “Identify duplicated reply 2 (testable result)”]

Normal Processing in Calling Hierarchies
[Diagrams, one labeled JBoss and one labeled Sib: Client and a cluster with SFB 1 and SFB 2, each with a Primary and a Backup]

Site Crash and Failover for Calling Hierarchies (JBoss)
• Assumption
  – Node fails (all beans in the node fail)
• Issues
  – Duplicated internal requests must be identified by the SFB
• Characteristics
  – Similar to simple calls for one component
[Diagrams: Client and a cluster with SFB 1 and SFB 2 (Primary and Backup); labels: “Resend 1”, “Duplicated internal request 2”]

Site Crash and Failover for Calling Hierarchies (Sib)
• Assumption
  – Node fails (all beans in the node fail)
• Issues
  – Duplicated internal
    requests must be identified by the SFB
  – Duplicated internal replies must be identified by the SFB
• Characteristics
  – Every backup SFB re-executes requests starting from its own last replication point
[Diagrams: Client and a cluster with SFB 1 and SFB 2 (Primary and Backup); labels: “Re-execute request”, “Discard duplicated internal requests”, “Identify duplicated internal requests”, “Duplicated internal request”, “Duplicated internal replies”, “Duplicated reply”]

Direct Database Access (JBoss)
• Normal processing
• Failover
  – No “exactly-once” semantics
[Diagrams: Client, a cluster with SFB 1 (Primary) and SFB 1 (Backup), Resource Manager (Resource Adapter) and Data; labels: “Write and commit”, “Ack”, “Resend 1”, “Possible duplicated database access”]

Direct Database Access (Sib)
• Normal processing
  – Data is bound to a specified Resource Manager (RM)
  – All requests to a given piece of data are forwarded to its specified RM
  – Focus on write operations
[Diagram: Client, a cluster with SFB 1 and RM (Primary and Backup), and Data; labels: “SQL”, “Write & Commit”, “Ack”]

Direct Database Access (Sib)
• Failover suggestion 1
[Diagram: Client, a cluster with SFB 1 and RM (Primary and Backup), and Data; labels: “Re-execute request 1”, “Duplicated internal SQL request”, “Generate undo”, “Undo to rollback (repeatable)”, “Re-execute SQL request”, “Write & Commit”, “Ack”]
  – Issues
    • Tables must have a primary key so that the undo can be generated
    • If the primary has committed, other accesses between the commit and the undo might be possible
  – Conclusion
    • Simple during normal processing
    • Complex during recovery
  – Characteristics
    • Writes are bound to the commit
    • Converse undo, starting from the last write
    • If the original write was successful, the undo rolls it back.
    • If the original write failed, the undo has no effect.

Direct Database Access (Sib)
• Failover suggestion 2
[Diagram: Client, a cluster with SFB 1 and RM (Primary and Backup), and Data; labels: “Re-execute request 1”, “Duplicated internal SQL request”, “Check mark: exists?”, “Redo if no mark”, “Insert mark”, “Write”, “Commit”, “Ack”]
  – Characteristics
    • Insert an additional mark before every commit
    • If the mark exists, redo is not required
    • If the mark does not exist, redo
  – Issues
    • Additional insert overhead during normal processing
    • The RM must maintain global transaction ids
  – Conclusion
    • Complex during normal processing
    • Simple and correct during recovery

Summary
• Sib's advantages
  – Better performance
    • State information (including response information) is much bigger than a request
    • Infrequent replication points are enough, since EJBs are short-lived and a failure during their lifetime is unlikely
  – More powerful for “exactly-once” database access
• Sib's disadvantages
  – More complicated recovery
  – Duplicated external responses
    • The client needs to remove duplicated responses

Current Work
• Compare schemes for complex transactions
  – A transaction spans several requests
  – Entity bean replication in complex transactions
• Multi-threading: a typical non-deterministic environment
  – Multi-threaded client / single-threaded SFB
    • JBoss:
      – Repeatable requests might be problematic (e.g.
        resend in different order)
    • Sib:
      – Avoids resending requests
      – Total order of requests in all replicas
  – Multi-threaded client / multi-threaded SFB
    • Both JBoss and Sib might be problematic
      – Sib: special concurrency control and synchronization mechanism
• Component failure

Problematic Example under Multi-Thread Environment
[Diagram: Client and a cluster with Primary and Backup; labels: “Resend 1”, “Resend 2”, X=0, S1: X+3, S2: X-3, X=3, X=-3]
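
Illustration: failover suggestion 2 (mark before commit)

The mark-based approach from the “Direct Database Access (Sib), failover suggestion 2” slide can be sketched in a few lines. This is only a minimal sketch under stated assumptions, not the Sib implementation: an in-memory SQLite database stands in for the third-party database, and the table names `account` and `marks`, the function `write_once` and the transaction id strings are all hypothetical. The write and its mark are committed together; on re-execution the mark is checked first, so a duplicated internal SQL request is not redone.

```python
import sqlite3

# Hypothetical schema for illustration only: one data table and one table of
# transaction marks. None of these names come from the slides.
def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("CREATE TABLE marks (txid TEXT PRIMARY KEY)")
    conn.execute("INSERT INTO account (id, balance) VALUES (1, 0)")
    conn.commit()
    return conn


def write_once(conn, txid, delta):
    """Apply the write for txid unless its mark already exists.

    Returns True if the write was executed, False if it was a duplicate.
    """
    row = conn.execute("SELECT 1 FROM marks WHERE txid = ?", (txid,)).fetchone()
    if row is not None:
        return False  # mark exists: redo is not required
    # The write and the mark are committed together, so a crash leaves
    # either both or neither in the database.
    with conn:
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = 1", (delta,)
        )
        conn.execute("INSERT INTO marks (txid) VALUES (?)", (txid,))
    return True


def balance(conn):
    return conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]


if __name__ == "__main__":
    db = make_db()
    write_once(db, "tx-1", 3)  # original execution on the primary
    write_once(db, "tx-1", 3)  # re-execution by the new primary after failover
    print(balance(db))         # prints 3: the write was applied exactly once
```

The second call finds the mark and returns without touching the data. The cost named on the slide is visible here as well: every write carries one extra insert, and the caller must supply a transaction id that is the same on the primary and on the backup.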