Idempotent Transactional Workflow (POPL 2013)
G. Ramalingam, Kapil Vaswani
Microsoft Research India

The Problem
[Diagram labels: Application, scale-out, Partitioned Data]
Can we simplify writing such applications?

    Transfer (amt, acct1, acct2) {
        Debit amt from acct1;
        Credit amt to acct2;
    }

ACID Transaction
+ Strong consistency
− Distributed transaction

    Transfer (amt, acct1, acct2)
        atomic {
            Debit amt from acct1;
            Credit amt to acct2;
        }

Workflow
− Weaker consistency
− No isolation
+ No distributed transaction

    Transfer (amt, acct1, acct2)
        atomic {Debit …};
        atomic {Credit …};

What about process failure?
Claim: Workflows are common in applications over partitioned data

The Problem
Modern Cloud Platforms
[Diagram labels: Application Logic: stopping (non-byzantine) failure; Storage Layer: failures handled by storage layer]
Goal
• Fault-tolerance in application
• A transactional workflow engine
• decentralized!

Making Workflows Fault-Tolerant
[Diagram labels: request, response]

Taking a step back …
Request or response may be lost!

    Transfer (amt, acct1, acct2) {
        Debit amt from acct1;
        Credit amt to acct2;
    }

Resending messages is a critical element of fault-tolerance
Must be Idempotent! (tolerate duplicate messages)

Goal: Idempotent Fault-Tolerance (Idempotent Workflow)
• A program is said to be idempotent & fault-tolerant iff
  – its behavior is unaffected by process failures
  – its behavior is unaffected by duplicate input requests
• Behavioral equivalence:
  – duplicate output responses allowed
  – progress (liveness) conditions slightly weakened

Making Workflows Idempotent & Fault-Tolerant
[Diagram labels: request, response]

Making Computations Idempotent
[Diagram labels: request, response]
Make every effectful step idempotent:
1. Associate unique id with every step
2. Modify step to log execution of step
3. Modify step to check if it has already executed
All must be done atomically!

Automated Idempotent Fault-Tolerance
• As a library
  – In C# & F#
  – Technically, a monad
• As a compiler
• As a programming-language construct

Formal Results
Theorem. A well-typed monadic program is idempotent and fault-tolerant.
Any (well-typed) program e can be automatically translated (compiled) into a program compile[e].
Theorem. compile[e] is an idempotent and fault-tolerant realization of e.

Idempotence: A Language Construct
• “idworkflow uid e”

    transfer (uid, amt, acct1, acct2) {
        idworkflow uid {
            atomic T1 {Debit amt from acct1};
            atomic T2 {Credit amt to acct2}
        }
    }

Extensions
• Compensating actions
  – Undo earlier actions when later actions encounter logical failure
• Automatic retry
  – Detect process failures & restart
• Checkpointing
  – Restart at most recent checkpoint

Questions?

Fault-Tolerance & Idempotence: Simpler Together

Problem Setting
[Diagram labels: client, service, Application Logic, Storage Layer, partitioned data]
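
Illustrative sketch (not from the talk): the three steps listed under "Making Computations Idempotent" can be mocked up in a few lines of Python. Everything named here is hypothetical: `Partition`, `run_once`, `Crash`, and the `crash_after_debit` flag are inventions for illustration, and a `threading.Lock` merely stands in for a partition-local transaction. It is not the authors' C#/F# library, and a real storage layer would also need rollback if the action itself failed midway.

```python
import threading


class Crash(Exception):
    """Simulated process failure (hypothetical, for illustration only)."""


class Partition:
    """Hypothetical in-memory stand-in for one storage partition."""

    def __init__(self, balance):
        self.balance = balance
        self.log = {}                    # step id -> recorded result
        self._lock = threading.Lock()    # stands in for a local transaction

    def run_once(self, step_id, action):
        """Check the log, run the action, and record it, all under one lock."""
        with self._lock:
            if step_id in self.log:      # 3. has this step already executed?
                return self.log[step_id]
            result = action(self)        # the effectful step itself
            self.log[step_id] = result   # 2. log execution of the step
            return result


def debit(amt):
    def action(partition):
        partition.balance -= amt
        return partition.balance
    return action


def credit(amt):
    def action(partition):
        partition.balance += amt
        return partition.balance
    return action


def transfer(uid, amt, acct1, acct2, crash_after_debit=False):
    """Two-step workflow; each step touches exactly one partition."""
    # 1. Unique id per step, derived from the workflow's uid.
    acct1.run_once((uid, "T1"), debit(amt))
    if crash_after_debit:
        raise Crash()                    # process dies between the two steps
    acct2.run_once((uid, "T2"), credit(amt))
```

Under these assumptions, a caller that sees a crash (or a lost response) simply resends the request with the same `uid`: the debit is skipped because its log entry exists, and the credit runs exactly once. Sending the request yet again changes nothing.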