Architecture Aware Tensor-Based Computing Challenges for the Computer Science and Mathematics Communities
CISE CCF Algorithmic Foundations: Moore’s Law and Verifiable, Scalable, Portable, and Reproducible Matrix and Tensor Software

Lenore Mullin
Program Director, CISE CCF Algorithmic Foundations
National Science Foundation
lmullin@nsf.gov
CoProD 08 Friday, October 3, 2008

Outline
• NSF and CISE
• CCF: Algorithmic Foundations and Beyond
• Challenges and Open Questions
• Conclusions

National Science Foundation
• National Science Board
• Office of Inspector General
• Office of the Director
• Directorate for Biological Sciences
• Directorate for Computer & Information Science & Engineering
• Directorate for Education & Human Resources
• Directorate for Engineering
• Directorate for Geosciences
• Administrative Offices
• Directorate for Mathematical & Physical Sciences
• Directorate for Social, Behavioral & Economic Sciences
• Office of Cyberinfrastructure
• Office of International Science and Engineering
• Office of Polar Programs

CISE Goals
1. Enable the United States to remain competitive in computing, communications, and information science and engineering
2. Promote understanding of the principles and uses of advanced computing, communications, and information systems in service to society
3. Contribute to universal, transparent, and affordable participation in an information-based society

Achieving CISE Goals
• CISE supports investigator-initiated research in all areas of computer and information science and engineering
• CISE helps develop and maintain cutting-edge national computing and information infrastructure for research and education
• CISE contributes to the education and training of the next generation of computer scientists and engineers

CISE Organization
http://www.nsf.gov/cise/about/org_chart.jsp
• Assistant Director: Jeannette Wing
• Deputy Assistant Director: Deborah Crawford
• Division Directors: Sampath Kannan, Taieb Znati, Haym Hirsh

CCF: Computing and Communication Foundations Division
http://www.nsf.gov/div/index.jsp?div=CCF
• Emerging Models and Technologies for Computation
  – Computational biology; quantum computing; nano-scale computing; biologically-inspired computing
• Foundations of Computing Processes and Artifacts
  – Advanced computation research; compilers; computer architecture; design automation (micro/nano); graphics & visualization; software engineering & languages
• Theoretical/Algorithmic Foundations
  – Computer science and communication theory; numeric symbolic/graphic computation; theory of computing; computational algebra and geometry; signal processing

Theoretical/Algorithmic Foundations
Numeric, Symbolic and Algebraic Computing
• Investigations into new data structures and algorithms that yield optimizations for particular applications are encouraged.
• This includes the design and construction of high-quality scientific software, ideally usable across numerous scientific domains. Tensors are pervasive throughout NSF disciplines.
• Specific research topics of interest include, but are not limited to, the following: numerical linear and multilinear algebras; tensor algebras and decompositions used in memory hierarchy mappings; linear and non-linear optimization; modeling and simulation of complex processes; and numerical solutions of differential equations and PDEs. Research in numerical computing and optimization has natural interdisciplinary applications. In fact, this program seeks applications in science and engineering whose basic problems actually require the development of new numerical and optimization methods.

Theoretical/Algorithmic Foundations
Numeric, Symbolic, and Algebraic Computing
• Research is needed on powerful methods for symbolically solving algebraic-numeric systems that combine differential, integral, and polynomial equations. Interests include foundational research in algorithms and their efficient execution.
• Basic research topics include: computational algebra and analysis, computational number theory and algebraic geometry, integration of numeric and symbolic techniques, symbolic scientific applications and software. Fruitful application areas for symbolic computation include the solution of complex equation sets.
• Symbolic/Numeric manipulation and Tensors:
  – Composition of tensor operations (symbolic) and numeric instantiation, e.g. SAGE, Matlab, Mathematica, Maple, Expression Templates, XML, compilers, interpreters, …
  – Tensors are n-d arrays

CCF: Theoretical/Algorithmic Foundations (AF) Cluster
Supports research in the following areas:
• Models of computation
• Computational complexity
• Parallel and distributed computation
• Random and approximate algorithms
• Algorithmic algebra, geometry, topology, and logic
• Computational optimization
• Techniques for representing, coding and transmitting information
• $30M/Year
• New TF Program Solicitation NSF 08-518 – Due Date March 12, 2008 - March 19, 2008
  http://www.nsf.gov/pubs/2008/nsf08518/nsf08518.htm
TF Program Officers: John Cozzens, Lenore Mullin, Richard Beigel, Sirin Tekinay, Robert Grafton, EK Park

Theoretical/Algorithmic Foundations and BEYOND!!!
• How can we create transformational science when we can’t verify scientific software?
• How can domain scientists doing computational experiments achieve reproducibility:
  – Is the answer the same, and is that answer correct?
  – Are the resources used the same?
  – Can the software scale to today’s and tomorrow’s hardware?
  – Can we produce software that is optimal?

Theoretical/Algorithmic Foundations and BEYOND!!!
• Optimality and Large Data Sets
• Optimality and Data Locality across processor/memory hierarchy
• Peta-Scale Computing and Beyond: scalability and portability
• Algebra of Arrays to build ANY Tensor based application
  – Must be a closed algebra without anomalies for verification
  – No language today has such an algebra

Moore’s Law: Data Density Doubles every 18 Months EXCEPT
[Figure: computing performance in MIPS (log scale, 10^-6 to 10^9) versus year (1850 to 2050). Labeled points: Babbage Engine, Differential Analyzer, ENIAC, TX-2, CMOS ICs, General Architecture, Lattice-Gas Architecture, Quantum Dots, Liquid NMR. Annotation: notice flattening of slope due to compilers.]

Proebsting’s Law: Compiler Advances Double Computing Power Every 18 Years
This means that while hardware computing horsepower increases at roughly 60%/year, compiler optimizations contribute only about 4%/year. (Doubling every 18 months is a factor of 2^(1/1.5) ≈ 1.59 per year; doubling every 18 years is a factor of 2^(1/18) ≈ 1.04 per year.)
[Figure: same chart as on the Moore’s Law slide.]

What is Computational Science and Engineering?
Computer Science and Engineering X Mathematics X Physical Sciences and Biological Sciences = The Intersection of Domain Sciences, Mathematics and Computer Science and Engineering

What can we do?
• Recent Award: for mini-symposia at the 2009 SIAM Annual Meeting (Lenore Cowen, Tufts: Uniting Discrete Methods, optimizations and CISE Community with the Community studying Matrix Operations, Tensors, Verifiable Computational Experiments and Scalability), in which Computer Scientists and students will be funded to attend and interact. This was prompted by the numerous tensor sessions at the 2008 SIAM Annual Meeting:
  – Tensor Decompositions Solving Fundamental Problems in Chemistry
  – Tensor Decompositions for Large-Scale Data Applications
  – A Novel Higher-order Generalized Singular Value Decomposition for Comparative Analysis of DNA Microarray Data from Different Organisms
  – Tensor Algebraic Methods and Their Application to High-Dimensional Multi-Modal Data
  – TensorFaces: Multilinear (Tensor) Decomposition of Image Ensembles
  – Multilinear (Tensor) Independent Component Analysis
  – Modeling of Epileptic Seizures using Tensor Analysis
  – On a Generalization of Sylvester Methods for Symmetric Tensor Decomposition

What can we do?
• A mini-symposium at the 2008 SIAM Annual Meeting (MS3) entitled Architecture-Aware Scientific Computing. Organizers and Presenters: L. Mullin (NSF) and Padma Raghavan (NSF PI).
• An invitation-only workshop with Frank Olken (IIS) is planned for spring 2009 to bring together experts in Knowledge Representation, Tensors, Algorithms and other related areas in Computer Science. (Charles Van Loan, Cornell)
• Recent Award: for a workshop at the Courant Institute to bring together Mathematicians and Computer Scientists to discuss scalable algorithms for PDEs on parallel, distributed, and multi-core architectures. ODEs and PDEs can be represented as matrix and tensor operations. Numeric and symbolic environments are growing in popularity as a way to combine verification and optimal implementations.
• Career Application 2008: BIO and CCF AF (Orly Alter: Integrative and Comparative Tensor Algebra Models of DNA Microarray Data from Different Studies of the Cell Cycle)

What can we do?
• Milestones in Computer Algebra 2008 (invitation-only workshop): Systematic Tensor Simplification: a Diagrammatic Approach by A. D. Kennedy and T. Reiter. This workshop illustrated the need for combined numeric and symbolic environments to compute and symbolically prove correctness of designs.
Numerous articles from this workshop discussed the need to combine environments, a need validated by an NSF-supported workshop report written by E. Kaltofen (one of the organizers), November 2007, at NSF in Arlington. Symbolic/Numeric proposals entered in the 2008 NSF solicitation showed 100% growth over 2007.

What can we do?
• SCAN 2008: expected outcomes
  – Hardware and software support for verification tools
  – Theory, algorithms and arithmetic for verified numerical computations
  – Supercomputing and reliability
  – Dynamical systems and verified numerical computation
  – Global optimization and verified numerical computation
  – Programming tools for verified numerical computation
  – Computer aided proofs
  – Industrial and scientific applications of verified numerical computations
• CoProD 08: expected outcomes
  – Definition of new directions for combining numeric and symbolic approaches in solving constraints and optimization problems in particular and in decision making in general.

What can we do?
• Matrix and Tensor operations are pervasive in science and engineering
• Tensors are n-d arrays, but n-d arrays are more general
• Generalized multi-dimensional Inner and Outer products
• Summations of multi-dimensional arrays
• Projection operators
• AX=B like problems
• Coupled differential and integral equations, eigenvalue problems: generally translate to matrix problems
• Even non-linear operations: iterative solutions
• Linear and Multilinear Algebra is not enough!
  – Scalars, anomalies
• Existing languages are not enough!
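
The generalized inner and outer products listed above can be sketched in a few lines. This is only an illustration under stated assumptions, not the array algebra the talk has in mind: the class NDArray and the functions indices, outer, and inner are hypothetical names introduced here, arrays are stored as a shape plus row-major flat data, and the inner product is assumed to contract the last axis of the left argument with the first axis of the right argument. A scalar is treated as a 0-d array, so the same two functions cover scalars, vectors, matrices, and higher-dimensional arrays without special cases.

```python
from itertools import product
from math import prod


class NDArray:
    """Hypothetical minimal n-d array: a shape tuple plus row-major flat data."""

    def __init__(self, shape, data):
        self.shape = tuple(shape)
        self.data = list(data)
        if prod(self.shape) != len(self.data):
            raise ValueError("shape does not match number of elements")

    def __getitem__(self, index):
        # Row-major (C order) offset of a full index tuple.
        if len(index) != len(self.shape):
            raise IndexError("index must have one entry per axis")
        offset = 0
        for i, n in zip(index, self.shape):
            if not 0 <= i < n:
                raise IndexError("index out of range")
            offset = offset * n + i
        return self.data[offset]


def indices(shape):
    """All index tuples of an array with this shape, in row-major order."""
    return product(*(range(n) for n in shape))


def outer(a, b):
    """Outer product: result shape is a.shape + b.shape."""
    return NDArray(a.shape + b.shape, [x * y for x in a.data for y in b.data])


def inner(a, b):
    """Inner product: contract the last axis of a with the first axis of b."""
    if not a.shape or not b.shape or a.shape[-1] != b.shape[0]:
        raise ValueError("contracted axes must exist and have equal length")
    n = a.shape[-1]
    left, right = a.shape[:-1], b.shape[1:]
    data = [
        sum(a[i + (k,)] * b[(k,) + j] for k in range(n))
        for i in indices(left)
        for j in indices(right)
    ]
    return NDArray(left + right, data)


# Usage: a matrix-vector product, and a 3-d array contracted with a vector.
A = NDArray((2, 3), [1, 2, 3, 4, 5, 6])
x = NDArray((3,), [1, 0, -1])
T = NDArray((2, 2, 2), range(8))
ones = NDArray((2,), [1, 1])
print(inner(A, x).shape, inner(A, x).data)
print(inner(T, ones).shape, inner(T, ones).data)
```

Contracting with a vector of ones, as in the last line, is one way to express a summation along an axis in terms of the inner product alone.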

Possible Solutions
• Identify a closed algebra that subsumes important matrix operations
• Augment existing languages with this algebra: optional use
• Solve a few important problems completely
• Use the same algebra to map to processor memory hierarchies
• Use the same algebra to abstract machines
• These concepts were proposed at the Sandia Workshop on Memory Hierarchy Optimizations for Scientific Software, January 2008

Possible Solutions
• Bring Mathematicians, Computer Scientists and Domain Scientists together to collaborate
• Create a new community that solves these open questions
  – Revisit and reinvent
• Community then creates a research base for funding agencies
• Workshops and Colloquia
  – Supplements and/or new grants

Numeric, Symbolic and Algebraic Computing Program in TF
• These issues appear in last year’s and this year’s solicitations
• MPS and CISE cooperating programs
  – Hope to develop new solicitations
• Attend SIAM, SC, APS, MRS, etc.
  – Raise the consciousness of computational scientists in these communities
• After solving a small number of algorithms within this algebra, identify what to do next.
  – Can be used in existing programs.

Thank You
Questions?

Lenore Mullin
Program Director
National Science Foundation
Computer & Information Science & Engineering Directorate
Division of Computing and Communication Foundations
Algorithmic Foundations Cluster
lmullin@nsf.gov