IBM Research: Programming Technologies

Panel 1: Hot topics and future directions in programming languages (PL) research
Vivek Sarkar, IBM Research
May 9, 2007
© 2007 IBM Corporation


My Background

Education
• B.Tech., IIT Kanpur, 1981 (Advisor: Keshav Nori)
• M.S., U Wisconsin-Madison, 1982
• Ph.D., Stanford University, 1987 (Advisor: John Hennessy)

Career at IBM
• 1987 - 1990, PTRAN (Manager: Fran Allen)
• 1991 - 1993, ASTI optimizer
• 1994 - 1996, Application Development Technology Institute
• 1997, MIT sabbatical
• 1998 - 2001, Jalapeno / Jikes RVM
• 2002 - present, PERCS (includes X10, Parallel Tools, Productivity)

Family
• Married with two daughters, 18 and 15
• Interests: Hiking, Theater, Horseback riding, Violin

PL Summer School, May 2007 © 2007 IBM Corporation


PL Research Opportunities: Examples

Programming Models and Programming Language Design
• X10 (contact: Vijay Saraswat)

Development Tools
• SAFARI (contact: Robert Fuhrer)
• Parallel Tools Platform (contact: Evelyn Duesterwald)

Compilers, Managed Runtimes, Static & Dynamic Optimization
• Metronome (contact: David Bacon)


X10 Vision: Portable Productive Parallel Programming

• X10 Data Structures: the X10 language defines the mapping from X10 objects & activities to X10 places
• X10 Places: X10 deployment defines the mapping from virtual X10 places to physical processing elements
• Physical PEs: Homogeneous Multi-core, Heterogeneous Accelerators, Clusters

[Figure: block diagrams of the physical PE targets: a multi-core chip (PEs with L1 caches sharing an L2 cache), a heterogeneous accelerator (a PPE and several SPEs, each SPE with SPU, SXU, LS and SMF units, connected by the EIB at up to 96B/cycle, with a MIC), and a cluster of SMP nodes with memory]
[Figure labels, continued: Memory, BIC, PXU, L2 Cache, Dual XDR(TM), FlexIO(TM), Interconnect, 64-bit Power Architecture with VMX]


Overview of X10 (x10.sf.net)

Storage classes:
• Activity-local
• Place-local
• Partitioned global
• Immutable

• Dynamic parallelism with a Partitioned Global Address Space
• Places encapsulate binding of activities and globally addressable data
• async (P) S --- run statement S asynchronously at place P
• finish S --- execute statement S, and wait for descendant asyncs to terminate
• atomic S --- execute statement S atomically
   • No place-remote accesses permitted in an atomic section

Deadlock safety: any X10 program written with async, atomic, and finish can never deadlock


Java Grande Forum Example (Monte Carlo)

Source: http://www.epcc.ed.ac.uk/javagrande/javag.html - The Java Grande Forum Benchmark Suite

Single-Threaded Java

    initTasks() { tasks = new ToTask[nRunsMC]; … }

    public void runSerial() {
        results = new Vector(nRunsMC);
        // Now do the computation.
        PriceStock ps;
        for( int iRun=0; iRun < nRunsMC; iRun++ ) {
            ps = new PriceStock();
            ps.setInitAllTasks(initAllTasks);
            ps.setTask(tasks[iRun]);
            ps.run();
            results.addElement(ps.getResult());
        }
    }

Multi-Threaded Java

    public void runThread() {
        results = new Vector(nRunsMC);
        Runnable thobjects[] = new Runnable [JGFMonteCarloBench.nthreads];
        Thread th[] = new Thread [JGFMonteCarloBench.nthreads];
        // Create (nthreads-1) to share work
        for(int i=1;i<JGFMonteCarloBench.nthreads;i++) {
            thobjects[i] = new AppDemoThread(i,nRunsMC);
            th[i] = new Thread(thobjects[i]);
            th[i].start();
        }
        // Parent thread acts as thread 0
        thobjects[0] = new AppDemoThread(0,nRunsMC);
        thobjects[0].run();
        // Wait for child threads
        for(int i=1;i<JGFMonteCarloBench.nthreads;i++) {
            try { th[i].join();} catch (InterruptedException e) {}
        }
    }

    class AppDemoThread implements Runnable {
        ... // initialization code
        public void run() {
            PriceStock ps;
            int ilow, iupper, slice;
            slice = (nRunsMC+JGFMonteCarloBench.nthreads-1) / JGFMonteCarloBench.nthreads;
            ilow = id*slice;
            iupper = Math.min((id+1)*slice, nRunsMC);
            for( int iRun=ilow; iRun < iupper; iRun++ ) {
                ps = new PriceStock();
                ps.setInitAllTasks(AppDemo.initAllTasks);
                ps.setTask(AppDemo.tasks[iRun]);
                ps.run();
                AppDemo.results.addElement(ps.getResult());
            }
        } // run()
    }

Distributed Multi-Threaded X10

    initTasks() { tasks = new ToTask[dist.block([0:nRunsMC-1])]; … }

    public void runDistributed()
    {
        results = new x10Vector(nRunsMC);
        // Now do the computation
        finish ateach ( point[iRun] : tasks.distribution ) {
            PriceStock ps = new PriceStock();
            ps.setInitAllTasks((ToInitAllTasks) initAllTasks);
            ps.setTask(tasks[iRun]);
            ps.run();
            final ToResult r = ps.getResult(); // ToResult is a value type
            async(results) atomic results.v.addElement(r);
        }
    }


SAFARI Vision: Meta-Tooling for Language-Specific IDEs

Lead: Robert Fuhrer

Problem
• Lack of tool support can be a significant barrier to the adoption of new languages

SAFARI Solution: Meta-tools and framework
• Language generation tools (scanner/parser generator, high quality automatic ASTs)
• Generation of Eclipse toolkit components
   • Encapsulate Eclipse API knowledge
   • Encapsulate common language structure, semantics, processing idioms
• Leverage language inheritance
   • structure/semantics implementation

People
• P. Charles, J. Dolby, R. Fuhrer, S. Sutton, M. Vaziri


SAFARI Target IDE Functionality

• syntax highlighting, compiler annotations, hover help, source folding, formatting…
• navigation (hyperlinks, “Open Type”, …)
• content assist, quick fixes
• structural views
• compiler w/ incremental build, automatic dependency tracking
• analysis & refactoring
• New Project/Type/… creation wizards
• launch & debug: launch configs, breakpoints, backtraces, values, evaluation


Example of SAFARI Challenges: Error Handling

Errors are the norm! They must not cripple the IDE!

    void A() {
        int x= 5;
        foo blah;                            // mangled statement
        for(int i=0; i < a.length; i++) {
            int y= a[i] * a[j];              // dangling ref
            x += y;
        }
    }

[Figure: AST for A(): the body holds "int x= 5;", a BadStmt node and a for node; the for node has a header (int i=0; i < a.length; i++) and a body …]

• SAFARI/LPG: systematic, semi-automatic error recovery for parsing/creating “prosthetic” AST nodes
• Polyglot: ideas for finer-grained dependencies, better robustness, making data dependencies more explicit


Parallel Tools Platform Vision: Integrated Workbench for High-Productivity Parallel Programming

• Parallel Tools Platform (PTP): Open HPC Workbench (runs on Windows, Linux, Mac OS, …)
• PERCS workbench enhancements: MPI tools, OpenMP tools, Remote System Exploration, Performance Exploration, Runtime Error Detection, Team Platform, Productivity measurements
• Remote interface from Eclipse Workbench to HPC system
• Eclipse PTP is the integration hub for all PERCS tools

[Figure: HPC system software architecture, with new additions through PERCS to the HPC SW architecture highlighted. Labels include: User Space, APPLICATION, IBM's MPI, LAPI, SHMEM, UPC, X10, ESSL, PESSL, Compilers, Static Analysis Tools, HPC Toolkit, CPO, LL, COE, ILM, Meiosys, DB, Cache injection, SMT Exploitation, Kernel Space, Operating System, GPFS, VSD/NSD, DD, HYP, HAL, SOCKETS, TCP, UDP, IP, IF_LS, CSM, RSCT, HMC, Network Adapter, Network]


PTP Example: MPI Barrier Verification Tool

• Verify barrier synchronization in C/MPI programs
• Synchronization errors lead to deadlocks and stalls. Programmers may have to spend hours trying to find the source of a deadlock
• Static verification tools help to eliminate errors before the program is executed

[Screenshot callout: Action to run Barrier Verifier]

Contact: Evelyn Duesterwald, Yuan Zhang


MPI Barrier Verification Tool (contd.)

• MPI does not place any constraints on the placement of barriers
• The programmer has to ensure that the number of barriers along concurrent paths is the same
• Synchronization errors in MPI are common and difficult to find
• MPI Barrier Verification: verify that the number of barriers along concurrent paths is the same
   - Match barriers that synchronize
   - For unmatched barriers, report a synchronization error with a counterexample that illustrates the error

[Figure: two control-flow examples. In one, MPI_Comm_rank(com, &rank) is followed by a branch on rank > 2 that guards MPI_Barrier(com): potential deadlock. In the other, a branch on i > 0, with i = F(0), guards MPI_Barrier(com): not a deadlock. Other labels: P(k), i = rank]


Metronome Vision: Transparent Real-time Java

• C++ Application on C++ Runtime System: Manual, Unsafe; Predictable
• Java Application on Java Runtime System (JVM): Automatic, Safe; Unpredictable
• Java Application on Metronome Java Runtime System: Automatic, Safe; Predictable

[Diagram label: Garbage Collection]

www.research.ibm.com/metronome


Real-time Garbage Collection

• Garbage collection is fundamental to Java’s value proposition
   • Safety, reliability, programmer productivity
   • But also causes the most non-determinism (100 ms – 10 s latencies)
• RTSJ standard does not support use of garbage collection for real-time
• Metronome is our hard real-time garbage collector
   • Worst-case 2 ms latencies; high throughput and utilization
   • Research under way to further reduce the real-time guarantee from milliseconds to microseconds
   • 100x better than competitors’ best garbage collection technology

[Figure: space/time diagrams labeled Application, Collector, Base Application Memory and Resulting Schedule]

Garbage Collection Pause Times (Customer application)
• Worst-case: 1.7 ms
• Average: 260 µs


[Diagram: Space vs. Time model relating the Application (Mutator), the Collector and the Scheduler]

Application (Mutator)
• a*(∆GC) = Per-GC Allocate Rate: 50 MB/s
• m = Live Data: 30 MB

Collector
• RT = Trace Rate: 50 MB/s
• RS = Sweep Rate: 300 MB/s

Scheduler
• ∆t = time resolution: 5 ms
• u = utilization: 50%, 75%
• s = used space: 45 MB, 100 MB


PL Research Opportunities

Programming Models and Programming Language Design
• Drivers: Concurrency, Accelerators, Data Access, Web Services, DSLs, …

Development Tools
• Drivers: Program Analysis for Software Quality, Debugging Tools, Performance Tools, Refactorings, Language-Sensitive IDEs, …

Compilers, Managed Runtimes, Static & Dynamic Optimization
• Drivers: Hardware roadmap, PL trends, Virtualization, Embedded systems, Real-time systems, …


Additional Information

• X10, http://x10.sf.net
• SAFARI, http://domino.research.ibm.com/comm/research_projects.nsf/pages/safari.index.html
• Parallel Tools Platform, http://eclipse.org/ptp
• Metronome, http://www.research.ibm.com/metronome/
• IBM Research “Innovating at IBM” video, http://www.research.ibm.com/about/career.shtml
• “Valuing diversity: an ongoing commitment”, http://www.ibm.com/employment/us/diverse
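

Supplementary sketch: finish / async / atomic approximated in plain Java

The X10 overview and the Java Grande Monte Carlo slides describe a pattern of spawning one activity per run (async), updating a shared result collection atomically (atomic), and waiting for all activities to terminate (finish). The sketch below approximates that pattern with the standard java.util.concurrent library. It is an illustration under stated assumptions, not X10 and not the benchmark's code: the class FinishAsyncSketch, the method runAll and the stand-in simulate are hypothetical names, and places and distributions (ateach) are not modeled at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FinishAsyncSketch {

    /** Hypothetical stand-in for PriceStock.run(); not the benchmark's pricing code. */
    static double simulate(int iRun) {
        return iRun * 0.5;
    }

    /** Runs nRuns independent tasks on nWorkers threads and returns all results. */
    static List<Double> runAll(int nRuns, int nWorkers) throws Exception {
        List<Double> results = new ArrayList<>(nRuns);
        ExecutorService pool = Executors.newFixedThreadPool(nWorkers);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < nRuns; i++) {
                final int iRun = i;
                // Roughly "async": hand the loop body to another thread.
                pending.add(pool.submit(() -> {
                    double r = simulate(iRun);
                    // Roughly "atomic": only one thread at a time updates results.
                    synchronized (results) {
                        results.add(r);
                    }
                }));
            }
            // Roughly "finish": wait for every submitted task before continuing.
            for (Future<?> f : pending) {
                f.get();
            }
        } finally {
            pool.shutdown();
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        List<Double> results = runAll(100, 4);
        double sum = 0;
        for (double r : results) {
            sum += r;
        }
        System.out.println(results.size() + " results, sum = " + sum);
    }
}
```

Two differences from X10 are worth keeping in mind. The loop over futures waits only for directly submitted tasks, whereas finish also waits for asyncs spawned transitively by its descendants. And synchronized blocks carry no deadlock-freedom guarantee in general, unlike the deadlock safety property stated for async, atomic, and finish.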