
MLS

An Interim Report from the Trenches: Using STM

Michael L. Scott

University of Rochester www.cs.rochester.edu/research/synchronization/

Microsoft Faculty Summit

July 2007


RSTM









TM library for C++

Outgrowth of 3+ years experience building TM libraries for Java, C, and C++

“Smart pointer” API

» separate “hooks” at first access (initialization) and subsequent access (dereference)

» different pointer types for reading, writing, private access

» checks for common mistakes

Set out to gain some experience (Dec. ’06 – Feb. ’07)
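The "smart pointer" idea above can be sketched roughly as follows. This is a minimal illustration, not RSTM's actual API: the names `Shared`, `ReadPtr`, `WritePtr`, `PrivatePtr`, and `HookStats` are all hypothetical, and the hooks only bump counters where a real runtime would do its bookkeeping.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical counters standing in for the work a TM runtime would do
// in each hook. Not part of RSTM.
struct HookStats {
    std::size_t opens = 0;   // first-access (initialization) hooks fired
    std::size_t derefs = 0;  // subsequent-access (dereference) hooks fired
};

template <typename T> class ReadPtr;
template <typename T> class WritePtr;
template <typename T> class PrivatePtr;

// Hypothetical shared object: the payload is reachable only through
// one of the pointer types below.
template <typename T>
class Shared {
public:
    explicit Shared(T initial) : value_(initial) {}
private:
    T value_;
    template <typename U> friend class ReadPtr;
    template <typename U> friend class WritePtr;
    template <typename U> friend class PrivatePtr;
};

// Read-only pointer: dereference yields const access, so an accidental
// write through it is rejected at compile time.
template <typename T>
class ReadPtr {
public:
    ReadPtr(const Shared<T>& object, HookStats& stats)
        : target_(&object.value_), stats_(&stats) { ++stats_->opens; }
    const T& operator*() const { ++stats_->derefs; return *target_; }
private:
    const T* target_;
    HookStats* stats_;
};

// Writable pointer: same two hooks, but dereference yields mutable access.
template <typename T>
class WritePtr {
public:
    WritePtr(Shared<T>& object, HookStats& stats)
        : target_(&object.value_), stats_(&stats) { ++stats_->opens; }
    T& operator*() const { ++stats_->derefs; return *target_; }
private:
    T* target_;
    HookStats* stats_;
};

// Private-access pointer: no hooks at all, modeling fast uninstrumented
// access to data that no other thread can currently reach.
template <typename T>
class PrivatePtr {
public:
    explicit PrivatePtr(Shared<T>& object) : target_(&object.value_) {}
    T& operator*() const { return *target_; }
private:
    T* target_;
};
```

With these types, `*reader = 0` on a `ReadPtr<int>` fails to compile, which is one example of the kind of "common mistake" a typed pointer API can catch; the cost is that the programmer must choose among several pointer types by hand.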


Delaunay Triangulation











Important problem in computational geometry

Our version distinctive in mix of barriers and transactions

Highlights importance of fast private access (95% of time in barrier-protected private code)

3200 lines of code; 400 within transactions

See papers on the application [IISWC ’07], API experience [TRANSACT ’07], and privatization [URCS TR 915]; 2 brief announcements at PODC ’07


Lots of Annoying Problems







Simple awkwardness

» accessors in C++ (Cf. C#), explicit validators, 4 pointer types, clone/redo/deactivate methods, template-based code sharing

Programming model limitations

» no nontrivial constructors, destructors, non-static methods (can’t use this)

» no non-local exits from transactions

» explicit privatization

» explicit labeling of txnal types — can’t share code

Inscrutable error messages



Compiler can fix all of these easily


And Some Deeper Challenges











(without mentioning nesting, I/O, condition synch, or interoperation with legacy code)

What reverts on abort?

» intuition says “everything”

» library API says “what you label”

At least 4 implementation cases

» actively transactional

» privatized

» long-lived but always thread-local

» thread-local and transient

Should the compiler associate these with types? object instances? references?

How should they propagate/flow?

How much should the programmer have to know/say?

» e.g. wrt privatization
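The gap between "everything reverts" and "what you label reverts" can be illustrated with a small sketch. It assumes an undo-log design purely for simplicity, and makes no claim about how RSTM is implemented; `UndoTxn` and `run_aborted_txn` are hypothetical names.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical undo-log transaction: only stores routed through write()
// are recorded, so only those can be rolled back.
class UndoTxn {
public:
    void write(int& location, int value) {
        undo_log_.emplace_back(&location, location);  // remember old value
        location = value;
    }
    void abort() {
        // Undo in reverse order so the earliest logged value is restored last.
        for (auto it = undo_log_.rbegin(); it != undo_log_.rend(); ++it) {
            *it->first = it->second;
        }
        undo_log_.clear();
    }
    void commit() { undo_log_.clear(); }
private:
    std::vector<std::pair<int*, int>> undo_log_;
};

// Runs one transaction that aborts, and returns {labeled, unlabeled}.
inline std::pair<int, int> run_aborted_txn() {
    int labeled = 1;
    int unlabeled = 1;
    UndoTxn txn;
    txn.write(labeled, 2);  // visible to the runtime: reverts on abort
    unlabeled = 2;          // plain store: invisible to the runtime
    txn.abort();
    return {labeled, unlabeled};
}
```

After the abort, `labeled` is back to 1 while `unlabeled` stays at 2. A compiler that instruments every access inside the transaction could close this gap, at the price of deciding which of the four cases above each access belongs to.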


Conclusions







STM likely to be fast enough to use

» though HW support desirable

Library-based STM

» can be used for nontrivial apps

» can be reasonably fast

» can usefully drive runtime development

» but cannot be given to naive users (“obvious” in hindsight)

Compiler support essential

» for semantics, not just performance

» distinguishing among sharing classes an important open problem


www.cs.rochester.edu/research/synchronization/

The Second ACM SIGPLAN

Workshop on Transactional Computing

To be held in conjunction with PODC 2007

Portland, Oregon, August 16, 2007

Registration deadline: July 25, 2007

www.cs.rochester.edu/meetings/TRANSACT07/

PPoPP'08

The 13th ACM SIGPLAN Symposium on

Principles and Practice of Parallel Programming

20–23 February 2008

Salt Lake City, Utah (co-located with HPCA-14)

Submission deadline: 13 (abstracts) / 20 Aug. 2007

www.ppopp.org


Concurrency Design Space

Credit: Bill Scherer


The Bigger Picture



Parallelism works well for

» High-end scientific computing

» Operating systems

» Internet servers written by experts

(largely) embarrassingly parallel



But

» Hard to find in desktop apps

» Hard to write, debug, maintain, scale

» TM is not a panacea; let’s be careful not to oversell it


Privatization









Allow data to move in/out of transactional world

» producer/consumer, spatial partitioning, …

Conceptually simple

Two implementation problems

» private code fails to see committed but not-yet-cleaned-up updates

» doomed but not-yet-aborted transaction sees private updates and takes erroneous action

Several technical solutions [URCS TR 915; PODC ’07 BA]

⇒ how to present to user?
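The privatization idiom itself is short, as this sketch suggests. It is single-threaded and illustrative only: `atomically` is a hypothetical stand-in that simply runs its body, where a real STM would execute it as a transaction, and `SharedList`, `privatize_all`, and `sum_private` are invented names.

```cpp
#include <cassert>

struct Node {
    int value;
    Node* next;
};

struct SharedList {
    Node* head = nullptr;
};

// Hypothetical placeholder for a transaction. Here it only runs the body;
// a real STM would provide atomicity and isolation.
template <typename Body>
void atomically(Body body) {
    body();
}

// Privatizing transaction: detach the whole list from the shared root.
// After it commits, no other transaction can reach the nodes by
// following list.head.
inline Node* privatize_all(SharedList& list) {
    Node* detached = nullptr;
    atomically([&] {
        detached = list.head;
        list.head = nullptr;
    });
    return detached;
}

// Private phase: plain, uninstrumented accesses to the detached nodes.
inline int sum_private(const Node* node) {
    int total = 0;
    for (; node != nullptr; node = node->next) {
        total += node->value;
    }
    return total;
}
```

Both implementation problems sit on the boundary between `privatize_all` and `sum_private`. If an earlier transaction has committed but not yet cleaned up, the plain reads may miss its updates; if a doomed transaction still holds a reference to a node, it may observe the private phase's plain writes before it aborts.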


Transactional Sharing Models





Contract between the user & the system

» Cf. programmer-centric memory consistency models

» ideally enforced by compiler

Transactions appear to be strongly isolated if programmer follows the rules

» static partition — too restrictive

» partition within global consensus phases

– e.g. via barriers

» privatizing transactions

– multiple possible implementations

» strong isolation

– probably too expensive for software — overkill


Predicting the Future





Multicore hardware

» 8-core, 32-thread processors from Sun now

» dual-core processors from Intel, AMD, Sun, IBM

» quad-core processors from Intel

Transactional memory

» part of the HPCS projects at IBM, Cray, and Sun

» commercial black box from Azul

» software from Microsoft likely soon

» hybrid from Sun likely soon

» something from Intel likely soon

» chicken-and-egg problem with workloads


Where Will All the Threads Come From?









Programming idioms / design patterns

» e.g., futures, dataflow, . . .

Higher-level abstractions

» reduce/map/scan, . . .

Speculative parallelization

» manual or automatic

» transactions for automatic detection and recovery from uncommon data races

(Your silver bullet here)
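As a reminder of what the higher-level abstractions look like, here is a small sequential sketch of map, reduce, and scan using standard C++ algorithms. The helper names `map_square`, `reduce_sum`, and `scan_sum` are invented for illustration. The point is that none of them mentions threads, so a library is free to run them in parallel behind the same interface.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// map: apply a function to every element independently.
inline std::vector<int> map_square(const std::vector<int>& input) {
    std::vector<int> output(input.size());
    std::transform(input.begin(), input.end(), output.begin(),
                   [](int x) { return x * x; });
    return output;
}

// reduce: combine all elements with an associative operation.
inline int reduce_sum(const std::vector<int>& input) {
    return std::accumulate(input.begin(), input.end(), 0);
}

// scan: inclusive prefix sums, output[i] = input[0] + ... + input[i].
inline std::vector<int> scan_sum(const std::vector<int>& input) {
    std::vector<int> output(input.size());
    std::partial_sum(input.begin(), input.end(), output.begin());
    return output;
}
```

These versions run sequentially. C++17 later added `std::reduce` and `std::inclusive_scan`, which accept an execution policy and so allow a parallel implementation without any change to the calling code.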


The Big Question



When will we have lots of multithreaded desktop applications?

» server-class chips will go aggressively multicore soon

– Sun and Azul doing it already

» consumer chips will wait for applications

– until that happens, get used to the performance you have now
