Boosting STM Replication via Speculation Roberto Palmieri, Paolo Romano, Francesco Quaglia, Luis Rodrigues Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Replication • Is a typical way to achieve Fault Tolerance • Several techniques: – Primary Backup approach – Active Replication approach –… Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Why Active Replication • Each replica keeps all data and executes the same transactions in the same order • PRO (+) – Full failure masking – No coordination for processing read-only transactions – Prone to target performance issues • CONS (-) – Agreement on common execution order – Deterministic business logic Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Literature solutions for actively replicated transactional systems Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon State Machine Approach (SM) • • • • Implements Active Replication paradigm Based on Atomic Broadcast as GCS Does not exploit any kind of optimism Requires deterministic Local CC Coordination phase to-broadcast (m) Processing m to-delivery (m) Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Optimistic Approach (OPT) • Based on Optimistic Atomic Broadcast as GCS • It processes in optimistic manner: • At most one conflicting transaction • Any non-conflicting transactions Optimistic processing of m Processing m Coordination phase if(m is non-conflicting transaction) Commit(m) if(opt-delivery order == to-delivery order) Commit(m) if(opt-delivery order <> to-delivery order) Abort & Restart(m) to-broadcast (m) opt-delivery (m) to-delivery (m) Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Critiques to OPT (1) • A-priori knowledge on transaction read/write sets (to detect conflicts) might be infeasible (non-determinism) • Unpredictability of transactions data access pattern imposes safe (over-) estimation of accessed data sets (reduced concurrency) Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Critiques to OPT (2) • Limited overlapping in case of fine-grain transactions Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Critiques to OPT (3) • Ineffective if deployed unpredictable networks on less or – Networks without spontaneous ordering of optimistic deliveries Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Unbalancing Ratio Coordination delay Local Transaction Execution Time Traditional Scenarios Modern (STM) Scenarios ≈ 2 msec ≈ 2 msec ≈ 1/10 msec ≈ 10/100 μsec Long stall periods -> Resources underutilization Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Target: Maximize the overlap • Coordination delay Vs Local transaction processing • Local transaction processing Vs Local transaction processing …… Opt-delivery (m1) to-broadcast (m1) to-broadcast (m2) to-broadcast (m3) Coordination phase m1 Coordination phase m2 Coordination phase m3 …… to-delivery (m1) Multiple or same Serialization Orders Opt-delivery (m3) Opt-delivery (m2) to-delivery (m2) to-delivery (m3) Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon How: Speculative Processing • Basic ideas: – Activate all transactions as soon as they are optimistic delivered – Explore (in depth and/or in breadth) multiple serialization orders opt-del(TB) TA: TB: opt-del(TA) exec TA TA->TB exec TB TB->TA to-del(TA) exec T’A TB->TA exec T’B TA->TB to-del(TB) time abort(T’A) abort(TB) commit(TA) commit(T’B) Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Speculative Processing: desirable properties • Correctness – The history of committed transaction generated is 1-copy serializable • Non-Redundancy – No two speculative instances of the same transaction observe the same snapshot • Completeness – Eventually, every permutation of optimistic delivered transactions (not yet final delivered) that produces a distinct snapshot is explored Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Relaxing completeness • The relevance of the completeness property depends on the likelihood of mismatches between final and optimistic delivery orders completeness min. ✗ Completeness ✗ Completeness [NCA 2011] at-most one speculative order [SRDS 2011] some “opportunistic” speculative orders ✔ Completeness [SPAA2010] [ISPA2010] every distinct speculative order Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Aggressively Optimistic Transaction Processing (AGGRO) • Tailored for networks with spontaneous order • Lock based concurrency control • Optimized lock semantic to reduce the lock wait time of conflicting transactions • No a priori knowledge on transactions read/write sets • No sibling transactions (reduced overhead) Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon AGGRO: key ideas 1. Uncommitted data item versions are aggressively made visible to other transactions (upon transaction completion) independently of whether the creating transactions will be eventually committed 2. Speculative transactions guided to follow the Optimistic delivery order The ordering relations based on OPT Ti → Tj Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Data item version states • Each version of shared objects in the transactional memory could be: – Committed – Uncommitted • Committed – The creator transaction has already committed • Uncommitted – Work-In-Progress (WIP) state, the creator transaction has not reached the complete stage yet – Complete, the creator transaction has reached the complete stage, but is not finalized as committed or aborted yet Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Transactions’ ordering • Two transaction lists: – Opt-Delivered transactions (ODL) – TO (final)-Delivered transactions (FDL) OPT Ti → Tj • Transaction Ti → Tj if – Ti and Tj are both currently recorded within ODL, with Ti ordered before Tj – Ti is currently recorded within FDL, while Tj is currently recorded within ODL – Ti and Tj are both currently recorded within FDL, with Ti ordered before Tj Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Algorithm sketch: Read Read dataitem X by Ti Get version V according to OPT → Yes Wait until the mark is removed V is marked as WiP No Read V and update the read-form set of Ti Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Algorithm sketch: Write Write on dataitem X by Ti The dataitem X is marked as WiP Abort (and restart) all transactions that follow Ti (according to OPT ) and read → X from a different transaction Early abort mechanism Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Algorithm sketch: Complete Ti completes its execution Remove all the WiP marks Make available written dataitems to other speculative transactions Permits to activate next conflicting transactions Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Trace-based Simulator • A discrete event simulator based on JavaSim framework • Realistic timing and data access patterns (coming from execution of JVSTM) • (Optimistic) Atomic Broadcast service entails the possibility of batching messages to improve performance • Atomic Broadcast average delays: – Optimistic delivery: 500 μsec – Final (total ordered) delivery: 2 msec • LAN environment with no mismatch between optimistic and final deliveries (spontaneous order) Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Test bed • Benchmarks: Name Type Mean Transaction Execution Time RB-Tree Micro Benchmark 77 μsec SkipList Micro Benchmark 281 μsec List Micro Benchmark 324 μsec Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Simulation results • Baseline (OPT) Vs AGGRO no speculation speculation speedup Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon CPU Utilization Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Simulation results • Baseline (OPT) Vs AGGRO no speculation speculation Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon CPU Utilization Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Thank you for the attention Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon