palmieri - Distributed Systems Group

advertisement
Boosting STM Replication via
Speculation
Roberto Palmieri, Paolo Romano, Francesco Quaglia, Luis Rodrigues
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Replication
• Is a typical way to achieve Fault Tolerance
• Several techniques:
– Primary Backup approach
– Active Replication approach
–…
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Why Active Replication
• Each replica keeps all data and executes the same
transactions in the same order
• PRO (+)
– Full failure masking
– No coordination for processing read-only transactions
– Prone to target performance issues
• CONS (-)
– Agreement on common execution order
– Deterministic business logic
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Literature solutions for actively
replicated transactional systems
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
State Machine Approach (SM)
•
•
•
•
Implements Active Replication paradigm
Based on Atomic Broadcast as GCS
Does not exploit any kind of optimism
Requires deterministic Local CC
Coordination phase
to-broadcast (m)
Processing
m
to-delivery (m)
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Optimistic Approach (OPT)
• Based on Optimistic Atomic Broadcast as GCS
• It processes in optimistic manner:
• At most one conflicting transaction
• Any non-conflicting transactions
Optimistic processing of m
Processing m
Coordination phase
if(m is non-conflicting transaction)
Commit(m)
if(opt-delivery order == to-delivery order)
Commit(m)
if(opt-delivery order <> to-delivery order)
Abort & Restart(m)
to-broadcast (m) opt-delivery (m)
to-delivery (m)
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Critiques to OPT (1)
• A-priori
knowledge
on
transaction
read/write sets (to detect conflicts) might
be infeasible (non-determinism)
• Unpredictability of transactions data
access pattern imposes safe (over-)
estimation of accessed data sets (reduced
concurrency)
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Critiques to OPT (2)
• Limited overlapping in case of fine-grain
transactions
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Critiques to OPT (3)
• Ineffective if deployed
unpredictable networks
on
less
or
– Networks without spontaneous ordering of
optimistic deliveries
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Unbalancing Ratio
Coordination delay
Local
Transaction Execution Time
Traditional Scenarios
Modern (STM) Scenarios
≈ 2 msec
≈ 2 msec
≈ 1/10 msec
≈ 10/100 μsec
Long stall periods -> Resources underutilization
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Target: Maximize the overlap
• Coordination delay Vs Local transaction processing
• Local transaction processing Vs Local transaction processing
……
Opt-delivery (m1)
to-broadcast (m1)
to-broadcast (m2)
to-broadcast (m3)
Coordination phase m1
Coordination phase m2
Coordination phase m3
……
to-delivery (m1)
Multiple or same
Serialization Orders
Opt-delivery (m3)
Opt-delivery (m2)
to-delivery (m2)
to-delivery (m3)
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
How: Speculative Processing
• Basic ideas:
– Activate all transactions as soon as they are optimistic delivered
– Explore (in depth and/or in breadth) multiple serialization orders
opt-del(TB)
TA:
TB:
opt-del(TA)
exec TA
TA->TB
exec TB
TB->TA
to-del(TA)
exec T’A
TB->TA
exec T’B
TA->TB
to-del(TB)
time
abort(T’A)
abort(TB)
commit(TA)
commit(T’B)
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Speculative Processing:
desirable properties
• Correctness
– The history of committed transaction generated is 1-copy
serializable
• Non-Redundancy
– No two speculative instances of the same transaction
observe the same snapshot
• Completeness
– Eventually, every permutation of optimistic delivered
transactions (not yet final delivered) that produces a
distinct snapshot is explored
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Relaxing completeness
• The relevance of the completeness property depends on
the likelihood of mismatches between final and optimistic
delivery orders
completeness
min.
✗ Completeness
✗ Completeness
[NCA 2011]
at-most one
speculative order
[SRDS 2011]
some “opportunistic”
speculative orders
✔ Completeness
[SPAA2010]
[ISPA2010]
every distinct
speculative order
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Aggressively Optimistic Transaction
Processing (AGGRO)
• Tailored for networks with spontaneous order
• Lock based concurrency control
• Optimized lock semantic to reduce the lock wait
time of conflicting transactions
• No a priori knowledge on transactions read/write
sets
• No sibling transactions (reduced overhead)
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
AGGRO: key ideas
1. Uncommitted data item versions are aggressively
made visible to other transactions (upon transaction
completion) independently of whether the creating
transactions will be eventually committed
2. Speculative transactions guided to follow the
Optimistic delivery order
The ordering relations based on
OPT
Ti → Tj
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Data item version states
• Each version of shared objects in the transactional memory
could be:
– Committed
– Uncommitted
• Committed
– The creator transaction has already committed
• Uncommitted
– Work-In-Progress (WIP) state, the creator transaction has not
reached the complete stage yet
– Complete, the creator transaction has reached the complete
stage, but is not finalized as committed or aborted yet
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Transactions’ ordering
• Two transaction lists:
– Opt-Delivered transactions (ODL)
– TO (final)-Delivered transactions (FDL)
OPT
Ti → Tj
• Transaction Ti → Tj if
– Ti and Tj are both currently recorded within ODL, with Ti
ordered before Tj
– Ti is currently recorded within FDL, while Tj is currently
recorded within ODL
– Ti and Tj are both currently recorded within FDL, with Ti
ordered before Tj
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Algorithm sketch: Read
Read dataitem X by Ti
Get version V according to OPT
→
Yes
Wait until the
mark is removed
V is
marked
as WiP
No
Read V and update the
read-form set of Ti
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Algorithm sketch: Write
Write on dataitem X by Ti
The dataitem X is marked as WiP
Abort (and restart) all transactions that
follow Ti (according to OPT ) and read
→
X from a different transaction
Early abort mechanism
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Algorithm sketch: Complete
Ti completes its execution
Remove all the WiP marks
Make available written dataitems to
other speculative transactions
Permits to activate next conflicting transactions
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Trace-based Simulator
• A discrete event simulator based on JavaSim framework
• Realistic timing and data access patterns (coming from
execution of JVSTM)
• (Optimistic) Atomic Broadcast service entails the possibility of
batching messages to improve performance
• Atomic Broadcast average delays:
– Optimistic delivery: 500 μsec
– Final (total ordered) delivery: 2 msec
• LAN environment with no mismatch between optimistic and
final deliveries (spontaneous order)
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Test bed
• Benchmarks:
Name
Type
Mean Transaction Execution
Time
RB-Tree
Micro Benchmark
77 μsec
SkipList
Micro Benchmark
281 μsec
List
Micro Benchmark
324 μsec
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Simulation results
• Baseline (OPT) Vs AGGRO
no speculation
speculation
speedup
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
CPU Utilization
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Simulation results
• Baseline (OPT) Vs AGGRO
no speculation
speculation
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
CPU Utilization
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Thank you for the attention
Roberto Palmieri – Workshop on Distributed Transactional Memory
(WDTM 2012) - 22/02/2012 Lisbon
Download