Concurrency in Programming Languages
Matthew J. Sottile, Timothy G. Mattson, Craig E Rasmussen
© 2009 Matthew J. Sottile, Timothy G. Mattson, and Craig E Rasmussen

Chapter 4 Objectives
• Motivate concurrency-aware languages by discussing current techniques available in traditional languages.
• Discuss high-level abstractions for concurrency provided by some modern languages.
• Discuss why sequential languages can be limited with respect to automatic parallelization.

Outline
• Limits of libraries
• Explicit techniques
• High-level techniques
• Limits of explicit concurrency control

Implementation vs Model
• One of the biggest obstacles a programmer faces today is not concurrency models themselves, but their implementations.
  – Especially how an implementation interacts with the language it is used from.
• In many current languages, implementations of concurrent programming models are "black boxes," meaning that compilation and analysis tools cannot see how they interact with other parts of a program.

Limits of libraries
• We call this problem a limitation due to the use of libraries.
  – Modules, components, and libraries are considered good design practice because they expose only what a programmer needs to see while hiding the rest.
  – This encapsulation has the unfortunate effect of hiding properties that can be important for concurrent execution.
• This causes a number of problems.

Message passing libraries
• Message passing is a popular concurrent programming model.
• Compilation tools can do very little to optimize message-passing programs across cores or CPUs if message passing is implemented in a library outside the core language.
• This is why languages in the PGAS family (Titanium, UPC, Co-Array Fortran) introduced syntax for message passing in the language itself.
• Prior approaches, such as the Message Passing Interface (MPI), resided outside the language as libraries.

Thread libraries
• Thread libraries are similar.
• POSIX threads are implemented in a library.
• Compilers for languages like C or C++ can do little to reason about interactions between POSIX threads, for either safety or performance, since the threads are outside the language definition.
• This is why new languages are making threads and the related types and operations part of the core language.

Outline
• Limits of libraries
• Explicit techniques
• High-level techniques
• Limits of explicit concurrency control

Explicit concurrent programming
• Most techniques employed in practice today require the programmer to explicitly manage the concurrent parts of a program:
  – Concurrent threads of execution
  – Shared data
  – Concurrency control mechanisms for synchronization and data protection

Message passing
• One of the oldest models of concurrent programming.
• Independent threads of execution interact by sending and receiving messages.
• Messages are used for:
  – Concurrency control and coordination.
  – Sharing of data.
• This is a very comfortable model for many programmers because it mimics the way groups of people split up and work on problems.
  – Exchanging notes, text messages, talking one-on-one or to groups.

Message passing primitives
• Primitive operations often include:
  – Send and receive
  – Broadcast
  – Collective operations
• Pairwise operations are simple: one process sends, the other receives.
• Broadcast is also simple: one process sends something to every other process.
• Collectives are a little more subtle.

Collectives
• Example: a gather operation, in which one process gathers contributions from all others into an array.

Message passing
• Popular libraries for message passing include:
  – The Message Passing Interface (MPI) in high-performance computing.
  – Remote procedure call systems, such as CORBA, SOAP, Java RMI, etc.
• Some languages, such as Erlang, UPC, Titanium, and Co-Array Fortran, support message passing in the language.

RPC and RMI
• Remote procedure call and remote method invocation systems are a form of message passing.
• They provide an abstraction over the message exchange protocol that gives the appearance of functions being invoked in remote processes.
  – The RPC/RMI layer hides from the user the packing and unpacking of arguments, return values, and object references into the underlying network messages.

Explicitly controlled threads
• Many programming systems provide threads.
• Popular systems like POSIX and Windows threads expose many of the details of thread management and coordination to the user.
  – This is why they are called explicitly controlled threads.
• Other systems, like OpenMP, also provide threads, but their management is deferred to a runtime or compiler.
• Why is it useful to distinguish these?
• Systems that require the user to manage concurrency control can be cumbersome and error prone to use.
  – Programmers must manually manage locks, identify the data to be protected by them, etc.
• It is becoming commonplace to defer this work to the language by integrating threads more tightly into it, leading to safer code.

Outline
• Limits of libraries
• Explicit techniques
• High-level techniques
• Limits of explicit concurrency control

Higher-level techniques
• A number of concurrent programming techniques have emerged that are built on top of message passing and threading and provide convenient abstractions for the programmer.
• These abstractions help make programs:
  – Faster, by allowing a compiler or runtime to make decisions that are difficult to implement by hand.
  – Safer, by eliminating many of the places where concurrency bugs can emerge.
  – Easier to write, by reducing code size.

Transactional memory
• Software transactional memory is a method for dealing with critical sections without explicit locking.
• Programmers define regions of code that must execute atomically and indicate which variables are shared versus thread-private.
• A runtime layer is responsible for ensuring that execution obeys the atomicity and sharing properties.
• A transactional memory system automatically detects when conflicts occur (e.g., two threads attempt to modify shared data from inside critical sections).
• The runtime ensures that the conflict is resolved by forcing one or more participants to redo the computation.
• These systems are well suited to concurrent programs in which the probability of conflict is low; otherwise, the conflict resolution logic (retrying computations) can slow programs down.

Event-driven programs
• User interfaces are a very common concurrent system.
  – Windows update independently; user input occurs simultaneously and asynchronously.
• Concurrent programming of interfaces is often based on asynchronous events.
  – "User clicked mouse," "Window A sent message to Window B."
• An event-driven program often does not know in advance which other programs will produce events for it or consume the events it produces.
• This leads to a model in which programs dynamically register as event providers or event consumers.
• This model is well suited to asynchronous interactions and to systems where participants come and go dynamically.

Outline
• Limits of libraries
• Explicit techniques
• High-level techniques
• Limits of explicit concurrency control

Limits of explicit concurrency control
• The primary complication is that compilers are unable to help optimize the code or detect errors in it.
• Examples:
  – Threading as an add-on library is outside the language definition, and therefore outside the scope where compilers can help.
  – Message passing programmed through libraries with a one-sided view makes it difficult for tools to reason about the interactions on both sides of an exchange.

Inconvenient language features
• Not only is the opacity of libraries a problem for compilers, but languages often have features that directly conflict with the analysis of code for concurrent programming.
• Consider pointers.
  – The aliasing problem makes it difficult to infer whether or not two pointers point at the same location in memory.

The aliasing problem
• If two pointers may point at the same location in memory, this can limit the reordering of statements.
• Can the last three statements be reordered?