The Future of Distributed Computing: Renaissance or Reformation?
Maurice Herlihy
Brown University
PODC 2008

Le Quatorze Juillet (The Fourteenth of July)
"SAN FRANCISCO, May 7, 2004 - Intel said on Friday that it was scrapping its development of two microprocessors, a move that is a shift in the company's business strategy…."
New York Times

Moore’s Law
Transistor count still rising
Clock speed flattening sharply
(hat tip: Simon Peyton-Jones)

Still on some of your desktops: The Uniprocessor
[Diagram: cpu, memory]
Art of Multiprocessor Programming

In the Enterprise: The Shared Memory Multiprocessor (SMP)
[Diagram: caches connected by a bus to shared memory]

Your New Desktop: The Multicore Processor (CMP)
All on the same chip
[Diagram: caches, bus, shared memory; example: Sun T2000 Niagara]

Multicores are Here
• “Learn how the multi-core processor architecture plays a central role in Intel's platform approach. ….”
• “AMD is leading the industry to multicore technology for the x86 based computing market …”
• “Sun's multicore strategy centers around multi-threaded software. ...”

Why should we care?
• First time ever,
– PODC research relevant to Real World™
• First time ever,
– Real World™ relevant to PODC
[Image: Plato vs Aristotle]

Renaissance?
• World (re)discovers PODC community achievements
• This has already happened (sort of)
[Image: World learns of PODC results]

Reformation?
[Image: Bonfire of the Vanities]
• Can we respond to the Real World’s challenges?
• Are we working on problems that matter?
• Can we recognize what’s going to be important?

In Classic Antiquity
• Time cured software bloat
• Double your path length?
– Wait 6 months, until
– Processor speed catches up

Parallelism Didn’t Matter
• Multiprocessor companies failed in the 80s
• Outstripped by sequential processors
• Field respected, but not taken seriously

The Old Order Lies in Ruins
• Six months means more cores, same clock speed
• Must exploit more parallelism
• No one really knows how to do this

What Keeps Microsoft and Intel awake at Night?
• If more cores do not deliver more value …
• Then why upgrade?

Washing Machine Science?
• Computers could become like washing machines
• You don’t trade it in every 2 years for a cooler model
• You keep it until it breaks.

No Cores Please, we’re Theorists!
• Computer Science is driven by Moore’s law
• Each year we can do things we couldn’t do last year
• Means funding, students, excitement

With Sudden Relevance Comes Great Responsibility
• Many challenges involve
– Concurrent algorithms
– Data structures
– Formal models
– Complexity & lower bounds
– …
• Stuff we’re good at.

Disclaimer
• What follows are my opinions (mine, mine, mine!)
– And prejudices
• Targeted to people
– New in the field
• No offence intended
– In most cases.

Concurrent Programming Today

Coarse-Grained Locking
Easily made correct …
But not scalable.

Fine-Grained Locking
Here comes trouble …

Locks are not Robust
If a thread holding a lock is delayed …
No one else can make progress

Locking Relies on Conventions
• Relation between
– Lock bit and object bits
– Exists only in programmer’s mind
Actual comment from Linux Kernel (hat tip: Bradley Kuszmaul):

    /*
     * When a locked buffer is visible to the I/O layer
     * BH_Launder is set. This means before unlocking
     * we must clear BH_Launder,mb() on alpha and then
     * clear BH_Lock, so no reader can see BH_Launder set
     * on an unlocked buffer and then risk to deadlock.
     */

Sadistic Homework
[Diagram: FIFO queue with enq(x) at one end and deq(y) at the other]
No interference if ends “far enough” apart

Sadistic Homework
[Diagram: FIFO queue with enq(x) at one end and deq(y) at the other]
Interference OK if ends “close enough” together

You Try It …
• One lock?
– Too conservative
• Locks at each end?
– Deadlock, too complicated, etc.
• Publishable result?
– Once, maybe still?

Locks do not compose
Hash table: add(T1, item)
– Must lock T1 before adding item
Move from T1 to T2: delete(T1, item); add(T2, item)
– Must lock T2 before deleting from T1
Exposing lock internals breaks abstraction

The Transactional Manifesto
• What we do now is inadequate to meet the multicore challenge
• Research Agenda
– Replace locking with a transactional API
– Design languages to support this model
– Implement the run-time to be fast enough

Sadistic Homework Revisited
Write sequential code:

    public void enq(Item x) {
      Qnode q = new Qnode(x);
      this.tail.next = q;  // link the new node after the current tail
      this.tail = q;       // the new node becomes the tail
    }

© 2006 Herlihy & Shavit

Sadistic Homework Revisited
Enclose in atomic block:

    public void enq(Item x) {
      atomic {
        Qnode q = new Qnode(x);
        this.tail.next = q;  // link the new node after the current tail
        this.tail = q;       // the new node becomes the tail
      }
    }

Warning
• Not always this simple
– Conditional waits
– Enhanced concurrency
– Complex patterns
• But often it is
– Works for sadistic homework

Composition

    public void Transfer(Queue<T> q1, Queue<T> q2) {
      atomic {
        T x = q1.deq();
        q2.enq(x);
      }
    }

Trivial or what?
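
The Transfer example relies on an atomic block, which standard Java does not provide. As a rough illustration only, the sketch below assumes a stand-in: a hypothetical helper named atomic that runs its body while holding one global reentrant lock. The names AtomicSketch, TxQueue, atomic and transfer are all invented for this sketch, and a single global lock is not how a real transactional memory would be implemented. The point is only the shape of the composition: deq and enq are each atomic on their own, and wrapping both in an outer atomic call makes the move indivisible without exposing any per-queue lock.

```java
import java.util.ArrayDeque;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

public class AtomicSketch {
    // Assumption: one global reentrant lock stands in for the `atomic` block.
    private static final ReentrantLock GLOBAL = new ReentrantLock();

    // Hypothetical helper: run the body while holding the global lock.
    // The lock is reentrant, so atomic calls may nest.
    static <R> R atomic(Supplier<R> body) {
        GLOBAL.lock();
        try {
            return body.get();
        } finally {
            GLOBAL.unlock();
        }
    }

    // Hypothetical queue whose operations are each atomic on their own.
    static final class TxQueue<T> {
        private final ArrayDeque<T> items = new ArrayDeque<>();

        void enq(T x) {
            atomic(() -> { items.addLast(x); return null; });
        }

        T deq() {
            return atomic(items::pollFirst); // null when the queue is empty
        }

        int size() {
            return atomic(items::size);
        }
    }

    // Composition: the inner deq/enq nest inside the outer atomic call,
    // so no other thread can observe the item in neither queue.
    static <T> void transfer(TxQueue<T> q1, TxQueue<T> q2) {
        atomic(() -> {
            T x = q1.deq();
            if (x != null) {
                q2.enq(x);
            }
            return null;
        });
    }

    public static void main(String[] args) throws InterruptedException {
        TxQueue<Integer> q1 = new TxQueue<>();
        TxQueue<Integer> q2 = new TxQueue<>();
        for (int i = 0; i < 1000; i++) {
            q1.enq(i);
        }

        // Four threads each move 250 items from q1 to q2.
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 250; i++) {
                    transfer(q1, q2);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }

        System.out.println("q1=" + q1.size() + " q2=" + q2.size());
    }
}
```

This stand-in serializes every atomic call, so it gives up the scalability that the transactional approach is meant to recover; it shows only why the programming model composes.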

Not All Skittles and Beer
• Algorithmic choices
– Lower bounds
– Better algorithms
• Language design
• Semantic issues
– Like memory models
– Atomicity checking

Contention Management & Scheduling
• How to resolve conflicts?
• Who moves forward and who rolls back?
• Lots of empirical work, but formal work is in its infancy
[Image: Judgment of Solomon]

I/O & System Calls?
• Some I/O revocable
– Provide transaction-safe libraries
– Undoable file system/DB calls
• Some not
– Opening cash drawer
– Firing missile

Privatization
• Transaction makes object inaccessible
• Works on it without synchronization
• Works with locks …
• But not necessarily with transactions …
• Need algorithms and models!

Strong vs Weak Isolation
• How do transactional & non-transactional threads synchronize?
• Similar to memory-model theory?
• Efficient algorithms?

Single Global Lock Semantics?
• Each transaction acts as if it acquires a single global lock (SGL)
• Good:
– Intuitively appealing
• Bad:
– What about aborted transactions?
– Expensive?
• Need better models

Progress, Performance Metrics and Lower Bounds
• Wait-free
– Everyone makes progress
• Lock-free
– Someone makes progress
• Obstruction-free
– Solo threads make progress

Obstruction-Free?
• Experience suggests simpler, more efficient and easier to reason about
• But no real formal justification
• Progress conditions imperfectly understood

Formal Models of Performance
• Asynchrony
• Multi-level Memory
• Contention
• Memory Models
• Reads, writes, CAS, TM and other stuff we may devise …

Formal Verification
• Concurrent algorithms are hard
• Need routine verification of real algorithms
• Model checking?
• Theorem proving?
• Probably both

PODC Victories
• Byzantine agreement
• Paxos, group communication
• Replication algorithms
[Image: Photoshop™ replication algorithm]
• Lock-free & wait-free algorithms
• Formalizing what needs to be formalized!

An Insurmountable Opportunity!
(hat tip: Walt Kelly)
• Multicore forces us to rethink almost everything
• The fate of CS as a vibrant field depends on our success
• PODC community has unique insights & advantages
• Are we equal to the task?

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.