Art of Multiprocessor Programming

An Introduction to Software
Transactional Memory
Alessia Milani
Labri, Bordeaux
Popularizing Concurrent
Programming
• A multi-core revolution is underway
• Exploit the power of concurrent
computing, by restructuring applications
• Devising scalable concurrent programs is
hard… unless good abstractions are available:
☞ Transaction
Transaction
A transaction is a sequence of operations by a single
process on a set of shared data items (transactional
objects) that ends
• either by committing: all of its updates take effect
atomically,
• or by aborting: it has no effect (and is typically restarted)
Transactional Memory (TM)
• To simplify : Just wrap
(sequential) code in begin / end
transaction
• TM synchronizes memory
accesses so that each
transaction seems
to execute sequentially
and in isolation
[Figure: a block of sequential code enclosed between begin-transaction and end-transaction]
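The programming model can be sketched as follows. This is only an illustration: the `transaction()` helper and the `accounts` example are hypothetical, and a single global lock stands in for a real TM runtime. It makes each block execute sequentially and in isolation, but, unlike a real TM, it never runs transactions concurrently and never aborts.

```python
import threading
from contextlib import contextmanager

# Hypothetical stand-in for a TM runtime: one global lock.
_tm_lock = threading.Lock()

@contextmanager
def transaction():
    with _tm_lock:   # begin-transaction
        yield        # the wrapped sequential code runs here
                     # end-transaction (lock released on exit)

accounts = {"a": 100, "b": 0}

def transfer(src, dst, amount):
    # Plain sequential code, simply wrapped in a transaction.
    with transaction():
        if accounts[src] >= amount:
            accounts[src] -= amount
            accounts[dst] += amount

threads = [threading.Thread(target=transfer, args=("a", "b", 10))
           for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

A real TM keeps the same interface but lets non-conflicting transactions proceed in parallel.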
Implementing Transactional
Memory
• TM was originally proposed as a hardware
platform [Herlihy and Moss 1993]
– HTM is available in today's hardware platforms, e.g., Intel,
IBM, Sun
• Purely in software :
– First Software Transactional Memory (STM) only
for static transactions [Shavit & Touitou 1995]
– First dynamic STM [Herlihy, Luchangco, Moir and Scherer 2003]
• Hybrid schemes (HyTM) that combine
hardware and software [Moir et al. 2006]
Implementing Transactional
Memory in Software (STM)
• Data representation for
transactions and data items
using base objects
• Algorithms for operations on
data items, applying
primitives to base objects
– registers, CAS, DCAS
[Figure: a transaction (begin-transaction, read, read, …, write, TryCommit, end-transaction) whose operations are implemented by algorithms that access base objects]
Asynchronous processes
execute these algorithms
to execute the operations
of the transactions
3 levels of abstraction
• Transactions
• Operations
– on data items, e.g., read
and write
– tryCommit / tryAbort
• Primitives on base objects
(registers, CAS…)
[Figure: a transaction issuing read, write, and tryC operations]
STM algorithms
Main Techniques
Back to TM Consistency
Serializability: committed transactions
appear to execute sequentially
[Figure: timelines of concurrent transactions, each issuing begin-Tx, read and write operations, TryC, and Commit]
Strict serializability: also preserves the
order of non-overlapping transactions
[Papadimitriou 1979]
Opacity: even transactions that later
abort are (strictly) serializable, i.e., they only observe consistent states
[Guerraoui, Kapalka PPoPP 2008]
Much more …
[Figure: nested consistency conditions: opacity is contained in strict serializability, which is contained in serializability]
Conflicts
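[Figure: two example executions of concurrent transactions by processes p1 and p2.
Execution 1: p1: begin-Tx, Read(x)0, Write(x)1, Commit; p2: begin-Tx, Read(x)0, Write(x)2, Commit.
Execution 2: p1: begin-Tx, Read(x)0, Write(y)1, Commit; p2: begin-Tx, Read(y)0, Write(x)1, Commit.]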
• Two concurrent transactions have a conflict if they
access the same data item and at least one of these
accesses is a Write operation.
• Two transactions that cannot be serialized have a
conflict. (The converse is not true)
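The definition of a conflict can be written down directly. The sketch below assumes a hypothetical representation in which each transaction is a list of `(op, item)` pairs with `op` being `"R"` or `"W"`, and that the two transactions passed in are concurrent.

```python
def conflicts(tx_a, tx_b):
    """Return True if two concurrent transactions conflict.

    Each transaction is a list of (op, item) pairs, op in {"R", "W"}
    (a representation assumed only for this sketch).
    """
    for op_a, item_a in tx_a:
        for op_b, item_b in tx_b:
            # Same data item, and at least one access is a Write.
            if item_a == item_b and "W" in (op_a, op_b):
                return True
    return False

# p1 reads x and writes y, p2 reads y and writes x.
p1 = [("R", "x"), ("W", "y")]
p2 = [("R", "y"), ("W", "x")]
# Two read-only transactions on the same item.
r1 = [("R", "x")]
r2 = [("R", "x")]
```

Here `conflicts(p1, p2)` holds, while `conflicts(r1, r2)` does not, since neither transaction writes.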
Design approaches
• Deferred vs. direct updates: operate on local copies of
the data items and install changes at commit (deferred), or modify in
place and roll back on abort (direct)
• Detect conflicts
– At commit time (lazy)
– At encounter time (eager)
• Resolve a conflict either by aborting one of the
conflicting transactions or by waiting/helping it to
complete
– This depends on the progress you want
– Contention manager
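The two update policies can be contrasted with a minimal, single-threaded sketch. The class names `DeferredTx` and `DirectTx` and the dictionary used as shared memory are assumptions made for illustration; conflict detection and synchronization are deliberately left out.

```python
class DeferredTx:
    """Deferred update: write to a private buffer, install at commit."""
    def __init__(self, memory):
        self.memory = memory
        self.buffer = {}

    def read(self, key):
        # A transaction must see its own tentative writes.
        return self.buffer[key] if key in self.buffer else self.memory[key]

    def write(self, key, value):
        self.buffer[key] = value

    def commit(self):
        self.memory.update(self.buffer)   # install changes

    def abort(self):
        self.buffer.clear()               # nothing to undo in memory


class DirectTx:
    """Direct update: modify in place, keep an undo log for abort."""
    def __init__(self, memory):
        self.memory = memory
        self.undo_log = []

    def read(self, key):
        return self.memory[key]

    def write(self, key, value):
        self.undo_log.append((key, self.memory[key]))  # remember old value
        self.memory[key] = value

    def commit(self):
        self.undo_log.clear()

    def abort(self):
        # Roll back in reverse order so the oldest value wins.
        for key, old in reversed(self.undo_log):
            self.memory[key] = old
        self.undo_log.clear()
```

With deferred update an abort is cheap and a commit does the copying; with direct update it is the other way around.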
Contention Manager Strategies
• Priority to
– Oldest?
– Most work?
…
• No strategy dominates
• Lots of empirical work, but formal work is still in its
infancy
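Two of the priority rules above can be sketched as small policy functions. The `TxInfo` record and its fields are hypothetical; each function returns the transaction that is allowed to continue.

```python
from dataclasses import dataclass

@dataclass
class TxInfo:
    # Hypothetical bookkeeping kept per transaction for this sketch.
    name: str
    start_time: int   # smaller means older
    work_done: int    # e.g., number of operations executed so far

def oldest_first(a, b):
    """Priority to the oldest transaction."""
    return a if a.start_time <= b.start_time else b

def most_work(a, b):
    """Priority to the transaction that has done the most work."""
    return a if a.work_done >= b.work_done else b

t1 = TxInfo("T1", start_time=1, work_done=2)
t2 = TxInfo("T2", start_time=5, work_done=40)
```

On this pair the two policies disagree (`oldest_first` favors T1, `most_work` favors T2), which is one way to see why no single strategy is best for every workload.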
Progress for TM
• Lock-free TM
– Wait-freedom : each non-faulty process completes (successfully) its
transaction within a finite number of steps
– Obstruction-freedom : a process running solo eventually commits
its transaction
• Lock-Based TM
– Weakly progressive: a transaction aborts only if it has conflicts
[Guerraoui, Kapalka POPL 2009]
– Strongly progressive: at least one of the transactions involved in the
conflict commits
– Multi-version permissive: only a writing transaction that conflicts with
another writing transaction may abort
[Perelman, Fan, Keidar PODC 2010]
⇒ Read-only transactions always commit
STM algorithms
Two Case Studies
A lock-based STM : TL2
[Dice et al. DISC 2006]
• Each data item is associated with a version number
– TL2 relies on a global versioning clock
• Transaction keeps
– Read set: data items & values read
– Write set: data items & values to be written
• Deferred update
– Changes installed at commit
• Lazy conflict detection
– Conflicts detected at commit
Read-Only Transactions
• Copy the shared version clock to a local read version clock RV
• Each read operation is post-validated by checking the lock and
version # of the corresponding memory location
– Read the lock and version #, then the memory; check that the location is unlocked and its version # is at most RV
• COMMIT if no post-validation fails. Otherwise ABORT as soon as one read fails
• We have taken a snapshot without
keeping an explicit read set!
[Figure: memory words (Mem) with their versioned locks (Locks); Shared Version Clock = 100, Private Read Version (RV) = 100]
Example Execution: Read-Only Transaction
[Figure: memory words (Mem) with their versioned locks (V#); Shared Version Clock = 100, RV = 100]
1. RV ← Shared Version Clock
2. On Read: read lock, read mem,
read lock: check unlocked,
unchanged, and v# <= RV
3. Commit.
Reads form a snapshot of memory.
No read set!
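The three steps above can be sketched as follows. This is an illustrative, single-threaded model and not the TL2 implementation: `Memory`, `VersionedLock`, `ReadOnlyTx`, and `Abort` are hypothetical names, and a real implementation would pack the lock bit and version into one word that is read atomically.

```python
class Abort(Exception):
    """Raised when a read fails its post-validation."""

class VersionedLock:
    # Hypothetical versioned lock: a lock bit plus a version number.
    def __init__(self):
        self.locked = False
        self.version = 0

class Memory:
    def __init__(self, size):
        self.clock = 0                                   # shared version clock
        self.values = [0] * size                         # Mem
        self.locks = [VersionedLock() for _ in range(size)]  # Locks

class ReadOnlyTx:
    def __init__(self, mem):
        self.mem = mem
        self.rv = mem.clock            # 1. RV <- shared version clock

    def read(self, addr):
        lock = self.mem.locks[addr]
        pre = (lock.locked, lock.version)    # read lock
        value = self.mem.values[addr]        # read mem
        post = (lock.locked, lock.version)   # read lock again
        # Check: unlocked, unchanged, and v# <= RV.
        if pre[0] or pre != post or post[1] > self.rv:
            raise Abort()
        return value

    def commit(self):
        # 3. Nothing to do: every read was validated against RV,
        # so the reads already form a snapshot. No read set is kept.
        return True
```

A read of a location whose version is newer than RV aborts immediately, which is why no read set has to be re-checked at commit.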
Writing Transactions
• Copy the shared version clock to a local read version clock RV
• On read/write, check:
unlocked & version # ≤ RV
• Add to R/W set
[Figure: memory words (Mem) with their versioned locks (Locks); Shared Version Clock = 100, Private Read Version (RV) = 100]
On Commit
• Acquire write locks
• Increment Version Clock
• Check version numbers ≤ RV
• Update memory
• Update write version #s
[Figure: memory words x and y with their versioned locks; the Shared Version Clock goes from 100 to 101, the Private Read Version (RV) stays at 100, and the written locations get version 101]
Example: Writing Transaction
[Figure: memory locations X and Y with their versioned locks (V#), the Shared Version Clock, and RV = 100, before and after Commit]
1. RV ← Shared Version Clock
2. On Read/Write: check
unlocked and v# <= RV then
add to Read/Write-Set
3. Acquire Locks
4. WV = F&I(VClock)
5. Validate each v# <= RV
6. Release locks with v# ← WV
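The six steps can be put together in a short sketch. It follows the slides rather than the TL2 source: `Cell`, `TL2`, `WriteTx`, and `Abort` are hypothetical names, a `threading.Lock` plus a separate version field stands in for the versioned lock, and `increment_and_fetch` is assumed to return the incremented clock value (matching the figure, where the clock goes from 100 to 101 and written locations get version 101).

```python
import threading

class Abort(Exception):
    pass

class Cell:
    """Hypothetical memory word with its versioned lock."""
    def __init__(self, value=0):
        self.value = value
        self.version = 0
        self.lock = threading.Lock()

class TL2:
    def __init__(self):
        self.clock = 0                       # shared version clock
        self._guard = threading.Lock()       # emulates an atomic increment

    def increment_and_fetch(self):
        with self._guard:
            self.clock += 1
            return self.clock

class WriteTx:
    def __init__(self, stm):
        self.stm = stm
        self.rv = stm.clock                  # 1. RV <- shared version clock
        self.read_set = []
        self.write_set = {}                  # cell -> tentative value

    def _check(self, cell):                  # 2. unlocked and v# <= RV
        if cell.lock.locked() or cell.version > self.rv:
            raise Abort()

    def read(self, cell):
        if cell in self.write_set:           # see our own tentative write
            return self.write_set[cell]
        self._check(cell)
        value = cell.value
        self._check(cell)
        self.read_set.append(cell)
        return value

    def write(self, cell, value):
        self._check(cell)
        self.write_set[cell] = value         # deferred update

    def commit(self):
        acquired = []
        try:
            for cell in self.write_set:      # 3. acquire write locks
                if not cell.lock.acquire(blocking=False):
                    raise Abort()
                acquired.append(cell)
            wv = self.stm.increment_and_fetch()   # 4. WV
            for cell in self.read_set:            # 5. validate read set
                held_by_other = cell.lock.locked() and cell not in self.write_set
                if held_by_other or cell.version > self.rv:
                    raise Abort()
            for cell, value in self.write_set.items():
                cell.value = value           # update memory
                cell.version = wv            # update write version #s
        finally:
            for cell in acquired:            # 6. release locks
                cell.lock.release()
```

If validation fails, the locks are released without touching memory, so an aborted transaction leaves no trace apart from the advanced clock.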
A lock-free STM :
Dynamic Software Transactional Memory
• Proposed in [Herlihy et al. PODC 2003]
– Opacity & obstruction freedom
• Transaction keeps
– Read set: data items & values read
• Direct update
– Changes installed when the corresponding Write
is executed
• Eager conflict detection
– Conflicts detected at encountering time
DSTM : transaction and
transactional object representation
• A transaction has
– a status field that is initialized to ACTIVE and is later set to COMMITTED or
ABORTED using a CAS primitive
– a read list to store the data items read together with the values read
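• Each transactional object has the following structure
[Figure: a TMObject whose start field points to a Locator. The Locator has three fields: transaction, new object (pointing to Data), and old object (pointing to Data). The transaction field gives the status of the transaction that most recently accessed the object to write it]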
Current object version
• The current object version is determined by
the status of the transaction that most
recently accessed the object to WRITE :
– committed: the new object is the current
– aborted: the old object is the current
– active: the old object is the current, and the new is
tentative
• The actual version only changes when a
commit is successful
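The rule for picking the current version fits in a few lines. The `Status`, `Transaction`, and `Locator` types below are a hypothetical rendering of the structure described on the slides.

```python
from dataclasses import dataclass
from enum import Enum

class Status(Enum):
    ACTIVE = 1
    COMMITTED = 2
    ABORTED = 3

@dataclass
class Transaction:
    status: Status = Status.ACTIVE

@dataclass
class Locator:
    transaction: Transaction   # most recent transaction that opened the object to WRITE
    new_object: object
    old_object: object

def current_version(locator):
    """Committed writer: new object. Aborted or active writer: old object."""
    if locator.transaction.status is Status.COMMITTED:
        return locator.new_object
    return locator.old_object

writer = Transaction()
loc = Locator(writer, new_object="new", old_object="old")
```

While `writer` is active or aborted, `current_version(loc)` is the old object; flipping its status to `COMMITTED` switches the current version to the new object without touching the locator.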
Write operation : example
• Transaction A tries to write object o. Let B be the transaction that
most recently accessed o to WRITE it
• Case shown: B is committed
1. A creates a new Locator whose transaction field points to A (active)
2. A copies the previous new object, and sets new object to the copy
3. A sets old object to the previous new object
4. Use CAS in order to replace the locator. If CAS fails, A restarts from the beginning
[Figure: object o's start pointer is switched from B's Locator (transaction committed) to A's Locator (transaction active)]
Which is the current version of
the object if B is active?
• A and B are conflicting transactions
that run at the same time
• Use Contention Manager to decide
which should continue and which should
abort
• If B needs to abort, try to change its
status to aborted (using CAS)
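The write operation, including the case where B is still active, can be sketched as below. All names are hypothetical: `AtomicRef` emulates a CAS-able word with a lock, and `abort_other` stands in for the contention manager (by default it always favors the arriving transaction).

```python
import copy
import threading
from enum import Enum

class Status(Enum):
    ACTIVE = 1
    COMMITTED = 2
    ABORTED = 3

class Abort(Exception):
    pass

class AtomicRef:
    """Stand-in for a word supporting CAS (emulated with a lock)."""
    def __init__(self, value):
        self._value = value
        self._guard = threading.Lock()
    def get(self):
        return self._value
    def compare_and_set(self, expected, new):
        with self._guard:
            if self._value is expected:
                self._value = new
                return True
            return False

class Transaction:
    def __init__(self):
        self.status = AtomicRef(Status.ACTIVE)

class Locator:
    def __init__(self, transaction, new_object, old_object):
        self.transaction = transaction
        self.new_object = new_object
        self.old_object = old_object

class TMObject:
    def __init__(self, data):
        init = Transaction()
        init.status.compare_and_set(Status.ACTIVE, Status.COMMITTED)
        self.start = AtomicRef(Locator(init, data, None))

def open_for_write(tx, obj, abort_other=lambda me, other: True):
    while True:
        old_loc = obj.start.get()
        owner = old_loc.transaction
        if owner is tx:                       # already opened by tx
            return old_loc.new_object
        state = owner.status.get()
        if state is Status.ACTIVE:            # conflict with B
            if abort_other(tx, owner):
                owner.status.compare_and_set(Status.ACTIVE, Status.ABORTED)
                continue                      # re-read B's status
            tx.status.compare_and_set(Status.ACTIVE, Status.ABORTED)
            raise Abort()
        current = (old_loc.new_object if state is Status.COMMITTED
                   else old_loc.old_object)
        # New locator: new object = copy of current, old object = current.
        new_loc = Locator(tx, copy.deepcopy(current), current)
        if obj.start.compare_and_set(old_loc, new_loc):
            return new_loc.new_object
        # CAS failed: restart from the beginning.
```

The returned copy is private to the writer until its status becomes committed, at which point it becomes the current version.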
Read operation
• To read object o by a transaction A
– Fetch the current version v just as before
– Add the pair (o, v) to the read set of A
Validating a transaction
• Before returning the value either read or
written, check consistency
• For each pair (o,v) in the read set, verify that v
is still the most recently committed version of
the transactional object o.
• Check that the status of the transaction is still
ACTIVE
Committing a transaction
• The commit needs to do the following:
1. Validate the transaction
2. Change the transaction’s status from
active to committed (using CAS)
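Reading, validating, and committing can be sketched together. The names are hypothetical, `cas_status` emulates CAS with a lock, and two simplifications are assumed: the locator is stored in a plain field, and a read that finds an active writer just takes the old object instead of consulting the contention manager.

```python
import threading
from enum import Enum

class Status(Enum):
    ACTIVE = 1
    COMMITTED = 2
    ABORTED = 3

class Abort(Exception):
    pass

class Transaction:
    def __init__(self):
        self.status = Status.ACTIVE
        self.read_set = []                 # (object, version) pairs
        self._guard = threading.Lock()

    def cas_status(self, expected, new):   # emulated CAS on the status
        with self._guard:
            if self.status is expected:
                self.status = new
                return True
            return False

class Locator:
    def __init__(self, transaction, new_object, old_object):
        self.transaction = transaction
        self.new_object = new_object
        self.old_object = old_object

class TMObject:
    def __init__(self, data):
        init = Transaction()
        init.status = Status.COMMITTED
        self.start = Locator(init, data, None)

def committed_version(obj):
    loc = obj.start
    if loc.transaction.status is Status.COMMITTED:
        return loc.new_object
    return loc.old_object

def validate(tx):
    # Every version read must still be the most recently committed one,
    # and the transaction itself must still be ACTIVE.
    for obj, version in tx.read_set:
        if committed_version(obj) is not version:
            return False
    return tx.status is Status.ACTIVE

def read(tx, obj):
    version = committed_version(obj)
    tx.read_set.append((obj, version))
    if not validate(tx):                   # check before returning the value
        tx.cas_status(Status.ACTIVE, Status.ABORTED)
        raise Abort()
    return version

def try_commit(tx):
    # 1. validate, 2. CAS the status from ACTIVE to COMMITTED.
    return validate(tx) and tx.cas_status(Status.ACTIVE, Status.COMMITTED)
```

The single CAS on the status field is what makes all of a transaction's new objects become current at once.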
That’s it?
[Figure: map of further topics around "You are here": Elastic Txs, Irrevocable transactions, Multiversioning, Privatization, Distributed STM, Nested transactions, Lower bounds]
More references and credits
Many of these slides are taken from, or largely inspired by:
– The slides of “The Art of Multiprocessor
Programming” by Maurice Herlihy and Nir Shavit
– A PODC 2010 talk by Hagit Attiya
– Teaching slides by Danny Hendler
Other reference:
– Transactional Memory. Foundations, Algorithms,
Tools, and Applications. COST Action Euro-TM
IC1001. Lecture Notes in Computer Science,
Springer 2014.