NAIS: Centre for Numerical Algorithms & Intelligent Software

Software Challenges in Computational Science
Taking Scientific Supercomputing out of the Stone Age

A D Kennedy, University of Edinburgh
Future Directions in Tensor-Based Computation and Modelling
Tuesday, 12 July 2016

Overview

We contemplate strongly typed, categorical, efficient, portable, reusable, modular, robust, architecture-neutral, buzzword-compliant software and languages for high performance parallel tensor-based scientific computing.

Problem

- Programs need to run efficiently on multiple architectures
- It must be easy to make high-level algorithmic changes
- Programs must adapt themselves around low-level assembly routines and data layouts
- Programs should use state-of-the-art numerical methods

Tensors?

- Physical problems are highly constrained, or maybe even completely determined, by their symmetries
- Dynamical behaviour realized by multilinear representations of the symmetry, i.e., tensors
- We often need to consider continuous symmetries (Lie groups) and both finite dimensional (spin) and infinite dimensional (space-time) representations

Commonality

- Many (most?) large-scale physics programs are built out of a small set of common numerical “building blocks” such as
  - Linear algebra
  - FFTs
  - Symplectic integrators
  - Discrete differential operators

Numerical libraries?

- Nobody writes their own sqrt or exp routines
- Unambiguous standard definitions
- IEEE 754 (but who uses unnormalized underflow?)
- sin(3.14159e+30) is a random number in [-1,1]
- Efficient implementation provided by vendors for every architecture
- So why don’t we always use the nice libraries that kind numerical analysts write for us?
- Why do we keep using our own implementations of conjugate gradients to solve linear systems?

BLAS and all that

- Our problems involve huge sparse matrices
  - Numerical libraries allow us to use our own “black box” functional linear operators
- But even our vectors are large: they may be distributed over 10^5 processors
- We need implementations of state-of-the-art linear solvers that will use our implementation of underlying vector operations
  - AXPY, ..., inner products, norms, ...
  - BLAS! But with our own strange vector representations

Why not standard classes?

- We must still write our time-critical kernels in assembler
- We even build our own hardware for these kernels
  - QCDSP, QCDOC, Blue Gene, …
- We mainly optimize data motion (prefetching, communication, etc.)
- Flops are cheap, moving data is expensive

Abstraction

- Languages and interfaces that allow abstraction of these building blocks
- Higher-level algorithms expressed in terms of abstract lower-level algebraic structures
- Need languages that allow us to describe the necessary structure
  - The structure of a Hilbert space is a class of classes = a category ≈ an interface
- Need agreement on standard structures

Categories

- A category is a class of classes sharing the same structure
  - But not necessarily having anything else in common
  - SU(2) and SU(3) are both groups, but you can’t multiply their elements together (except in C++ or Java)
- Structure is defined by
  - An explicit set of signatures
  - An implicit set of axioms

Algorithmic categories

- Follow mathematical structure
  - Why not write linear system solvers the way they are expressed in textbooks?
- Select appropriate algorithms using type information
  - At compile time
  - E.g., use CG for positive symmetric matrices
- Mendacity is sometimes useful
  - E.g., use CG for non-positive symmetric matrices

Compilers & Languages

- Syntactic sugar is good
  - It is easier to avoid bugs if we can write z ← A*x + y rather than SAXPY(A,x,y,z)
  - ... or was it SAXPY(A,z,x,y)?
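
As an illustration of the preceding slides, here is a minimal Python sketch of a conjugate gradient solver written the way a textbook states it, against an abstract interface rather than a concrete data layout. Everything named here (InnerProductSpace, ChunkedVector, shifted_laplacian, conjugate_gradient) is hypothetical and invented for this example; none of it comes from the talk or from any library. Python also cannot select algorithms by type at compile time, so this only shows the "explicit signatures, implicit axioms" idea and the operator syntax.

```python
from __future__ import annotations

from typing import Callable, Protocol, TypeVar

V = TypeVar("V", bound="InnerProductSpace")


class InnerProductSpace(Protocol):
    """Explicit signatures only; the vector-space axioms stay implicit."""

    def __add__(self: V, other: V) -> V: ...
    def __sub__(self: V, other: V) -> V: ...
    def __rmul__(self: V, scalar: float) -> V: ...
    def dot(self: V, other: V) -> float: ...


def conjugate_gradient(apply_a: Callable[[V], V], b: V, x0: V,
                       tol: float = 1e-10, max_iter: int = 1000) -> V:
    """Textbook CG for a symmetric positive definite 'black box' operator."""
    x = x0
    r = b - apply_a(x)
    p = r
    rr = r.dot(r)
    for _ in range(max_iter):
        if rr <= tol * tol:
            break
        ap = apply_a(p)
        alpha = rr / p.dot(ap)
        x = x + alpha * p          # reads like the formula, unlike SAXPY(...)
        r = r - alpha * ap
        rr_new = r.dot(r)
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x


class ChunkedVector:
    """Hypothetical 'strange' representation: one chunk per (pretend) node."""

    def __init__(self, chunks):
        self.chunks = [list(c) for c in chunks]

    def _zip(self, other, f):
        return ChunkedVector([[f(a, b) for a, b in zip(ca, cb)]
                              for ca, cb in zip(self.chunks, other.chunks)])

    def __add__(self, other):
        return self._zip(other, lambda a, b: a + b)

    def __sub__(self, other):
        return self._zip(other, lambda a, b: a - b)

    def __rmul__(self, scalar):
        return ChunkedVector([[scalar * a for a in c] for c in self.chunks])

    def dot(self, other):
        # Local partial sums per chunk, then a (mock) global reduction.
        return sum(sum(a * b for a, b in zip(ca, cb))
                   for ca, cb in zip(self.chunks, other.chunks))

    def flat(self):
        return [a for c in self.chunks for a in c]


def shifted_laplacian(v: ChunkedVector) -> ChunkedVector:
    """Matrix-free operator: (A x)_i = 3 x_i - x_(i-1) - x_(i+1), zero boundaries."""
    x = v.flat()
    n = len(x)
    y = [3.0 * x[i]
         - (x[i - 1] if i > 0 else 0.0)
         - (x[i + 1] if i < n - 1 else 0.0) for i in range(n)]
    out, k = [], 0
    for c in v.chunks:             # restore the caller's chunk layout
        out.append(y[k:k + len(c)])
        k += len(c)
    return ChunkedVector(out)


b = ChunkedVector([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
x0 = ChunkedVector([[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]])
x = conjugate_gradient(shifted_laplacian, b, x0)
```

The solver never looks inside a vector, so swapping ChunkedVector for a genuinely distributed type would only require that type to supply the same four operations.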
- Low-level automatic optimization
  - Compilers allocate registers better than humans do
- Automatic parallelization or vectorization
  - Requires programmers to write “recognizable” patterns
  - Work-around for lack of standard structures
- Automatic data pre-fetching perhaps a difficult but possible goal

Memory management

- Memory management for vector temporaries
  - Not enough memory to allocate them statically
  - Heap fragmentation
  - Stacks awkward for common sub-expression elimination and so forth
- New memory management model is needed; neither universal garbage collection nor complete user responsibility is good enough

Serialization

- Portable interchange format for objects
- Data grids: exchange everything as serialised data
  - Many ↔ one instead of many ↔ many
- Usually XML-based
  - XML is verbose, why not use binary format?
  - XML compresses quite well
- Data layout transformations
  - Redistribute grid points for different number of processors
  - FFTs?

Solutions?
- Standard categories for common structures
  - Relatively easy in some areas, e.g., linear algebra
  - Harder to get “correct” abstractions for others
- Language bindings
  - We can’t persuade the world to use a new language, even if it is better
  - Type-checking and optimization better for some languages than others
- Implementation of state-of-the-art algorithms in terms of these categories

Obstacles

- Risk: application scientists are not willing to take on the risk of using experimental software in addition to the risks of any cutting-edge scientific supercomputing project
- Careers: developing scientific software does not get you a permanent academic job
- Expertise: very few application scientists are familiar with modern software techniques; most supercomputer codes are written by graduate students who believe that documentation is an unnecessary evil

Software strategy

- Must be constructed in layers with well-defined categorical interfaces
  - Low-level machine dependent layers
  - Intermediate building blocks (e.g., linear algebra)
  - Top-level application specific
- Categorical interface means that functionality is specified, not implementation or data layout
  - Layers can be changed independently
- Must be well documented

Sociological strategy

- Must be a cooperative venture of application scientists, numerical analysts, computer scientists, vendors, and funders
- Needs international cooperation
- Need to develop real software to get it accepted

Funding requirements

- Funding to encourage people to work on a long-term project that will not get them publications in their own field
- Input from experts in several
  application areas, numerical analysis, and computer science
- People to write and test prototypes
- People to write documentation

Conclusions

- I’ll still write it in Fortran
- Is it really buzzword-compliant?

Centre for Numerical Algorithms and Intelligent Software

At the end of 2008, the UK’s Engineering and Physical Sciences Research Council (EPSRC), together with the Scottish Funding Council (SFC), provided funds to establish this centre as a joint venture of the University of Edinburgh, Heriot-Watt University and the University of Strathclyde. The total budget of NAIS is £7.5M and the duration of the grant is 5 years, with first spending from August 2009.

What is the purpose of NAIS?

NAIS will bridge the gap between numerical analysts, computer scientists and HPC software developers by developing new systems of code annotation, new compilers and efficient implementations for application-oriented computational methods such as adaptive finite elements, multiscale modelling, molecular simulation and optimization.

Who is involved in NAIS?

NAIS is a partnership of the Schools of Mathematics and Informatics and the Edinburgh Parallel Computing Centre (EPCC) at the University of Edinburgh, and the Departments of Mathematics at Heriot-Watt and Strathclyde Universities. In addition, NAIS will include collaborations with researchers in the sciences and engineering at the three universities, and a network of collaborations with other institutions including, so far, the University of Cambridge, the University of Warwick, and Wales Institute for Mathematics and Computer Science. A programme with CERFACS in Toulouse will provide for joint workshop and training activities. Industrial collaborations are to be established with IBM, Schlumberger, D.E.
Shaw, SGI, Cray, Orange/France Telecom, SAS.

An international advisory committee will be established including
- Fran Berman, University of California at San Diego
- Andreas Frommer, Wuppertal
- David Keyes, Columbia and Lawrence Livermore National Laboratory
- J. Tinsley Oden, University of Texas at Austin
- Philippe Toint, University of Namur, Belgium

The first director of NAIS is Benedict Leimkuhler (Mathematics, University of Edinburgh) and the Steering Committee consists of
- Mark Ainsworth (Mathematics, Strathclyde University)
- Murray Cole (Informatics, University of Edinburgh)
- Dugald Duncan (Mathematics, Heriot-Watt University)
- Arthur Trew (Edinburgh Parallel Computing Centre)

What research activities are envisaged in NAIS?

NAIS will operate substantial training, visitor and workshop programmes in all relevant areas of numerical analysis, computer science and HPC software development. There will be additional activities at the NAIS partners (currently Cambridge, Warwick and Wales Institute for Mathematical and Computational Science).

What posts are anticipated in NAIS?

- 6 lectureships (permanent positions) in Mathematics (two each) at Edinburgh, Heriot-Watt and Strathclyde Universities
- A lectureship in Informatics at the University of Edinburgh
- 10 Postdoctoral Research Assistantships (3 years each)
- 24 PhD studentships (typically 4 years duration)

The first posts will be advertised in 2009.

Where can I find out more about NAIS?

See the website at http://www.nais.org.uk or send an email to info@nais.org.uk. PhD student applications are welcome at any time and should be sent to the relevant department with a cover letter that mentions the “Numerical Algorithms and Intelligent Software Science and Innovation Project.”