MESQUITE: Mesh Optimization Toolkit
Brian Miller, LLNL
miller125@llnl.gov

A) Project Overview
• Science goal: Algorithms for improving unstructured mesh quality, achieved through optimization techniques.
  – Provide library of high quality mesh optimization tools to simulation code projects (Mesquite).
• Pat Knupp (SNL) project PI
  – Brian Miller, Lori Diachin (LLNL)
  – Carl Ollivier-Gooch (UBC)
• Long history of support through DoE Office of Science
  – Several successful collaborations with both SciDAC and ASC code groups.
• Goals in CScADS context: Apply threaded parallelism to Mesquite optimization solvers.
  – Evolve algorithms and software to take advantage of current and emerging hardware and software capabilities (multicore, many core, etc.)

B) Science Lesson
• MESQUITE poses unstructured mesh quality improvement as an optimization problem.
  – Element Quality:
    • Ideal element as defined by the user drives this.
  – Mesh quality objective function:
    • How local element qualities are summed into the global objective function. Again, user defined.
  – Optimization problem: min(F(x))
• Optimization problems solved using included solvers ranging from simple steepest descent to more sophisticated Feasible Newton and Active Set solvers. Again user chooses solver method.

C/D) Methods and Programming Model
• Pretty basic C++.
• No third party libraries except for unit testing (cppunit).
• MPI parallelism, mostly low volume nearest neighbor communication.
• No threaded parallelism currently – we intend to change this.
• Fairly portable code including recent runs on LLNL dawn BG/P machine.
• Optimization solvers included in the code, no interface to external optimization libraries.
• Designed to meet TSTTM mesh query interface and have demonstrated its use in several code interfaces.

E) I/O and Viz
• I/O:
  – Not really applicable since Mesquite is intended for use within an existing code framework.
  – For standalone use and testing we typically read/write one file per MPI task.
• Viz:
  – VisIt or ParaView for viewing parallel mesh files.
  – Optional Gnuplot output of convergence histories.
• Analysis:
  – Internal mechanism for mesh quality calculations.

G/H) Tools and Performance
• What tools do you use?
  – TAU/OpenSpeedShop/Intel tools for performance analysis and thread checks.
  – TotalView and Valgrind for debugging.
  – Some internal debugging output available.
• What do you believe is your current bottleneck to better performance?
  – Serial performance is sub-optimal. A route is needed from the generic algorithms provided in Mesquite to tight, high performance loops.
• What do you believe is your current bottleneck to better scaling?
  – Scaling hasn't been a problem (yet).
• What features would you like to see in performance tools?
  – Better derived hardware metrics and more sophisticated analysis.

I) Status and Scalability
• Goal in one year: Similar graph but with threads added.
• Top Pains:
  – Must add threading to the existing code; there are not enough resources for a rewrite.
  – Require a portable threading model.
  – Need a way to inherit the host simulation's threading model.

J) Roadmap
• Where will your science take you over the next 2 years?
  – Desire to support runs on significantly larger systems (Sequoia).
• What do you hope to learn / discover?
  – Extent of MPI scalability.
  – Effect of adding threading on MPI scalability.
• What improvements will you need to make?
  – New threaded global solver algorithms.
  – Gradual evolution to threaded implementation.
• What are your plans?
  – OpenMP threading in limited regions of the code, starting with specific algorithms that have good available parallelism.
  – Extending threads to other areas may require algorithmic changes.
  – Explore other threading models: OpenCL, OpenACC, CUDA, etc.
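
Illustrative sketch: the formulation in section B (local element qualities summed into a global objective F(x), then minimized) and the limited-region OpenMP plan in section J can be pictured with a small C++ example. This is not Mesquite code and does not use the Mesquite API. The 1D mesh, the squared-length-deviation quality metric, the fixed step size, and the names `element_quality`, `objective`, `steepest_descent_step`, and `smooth` are all assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical quality metric for a 1D "element" (the interval between two
// neighboring nodes): squared deviation of its length from a user-chosen
// ideal length. Zero means the element matches the ideal.
double element_quality(double left, double right, double ideal_length) {
    const double d = (right - left) - ideal_length;
    return d * d;
}

// Global objective F(x): a plain sum of the local element qualities.
// The element loop is independent per element, so it maps onto an OpenMP
// reduction. Without OpenMP support the pragma is ignored and the loop
// runs serially.
double objective(const std::vector<double>& x, double ideal_length) {
    assert(x.size() >= 2);
    const long n_elems = static_cast<long>(x.size()) - 1;
    double total = 0.0;
    #pragma omp parallel for reduction(+ : total)
    for (long e = 0; e < n_elems; ++e) {
        total += element_quality(x[e], x[e + 1], ideal_length);
    }
    return total;
}

// One fixed-step steepest descent update of the interior nodes.
// The two end nodes are treated as fixed boundary nodes.
void steepest_descent_step(std::vector<double>& x, double ideal_length,
                           double step) {
    assert(x.size() >= 2);
    const long n = static_cast<long>(x.size());
    std::vector<double> grad(x.size(), 0.0);

    // dF/dx_i collects contributions from the two elements sharing node i.
    #pragma omp parallel for
    for (long i = 1; i < n - 1; ++i) {
        grad[i] = 2.0 * ((x[i] - x[i - 1]) - ideal_length)
                - 2.0 * ((x[i + 1] - x[i]) - ideal_length);
    }

    // The gradient is fully computed before any node moves, so the
    // update loop has no cross-iteration dependence.
    #pragma omp parallel for
    for (long i = 1; i < n - 1; ++i) {
        x[i] -= step * grad[i];
    }
}

// Applies a fixed number of descent steps and returns the final F(x).
double smooth(std::vector<double>& x, double ideal_length, double step,
              int iterations) {
    for (int k = 0; k < iterations; ++k) {
        steepest_descent_step(x, ideal_length, step);
    }
    return objective(x, ideal_length);
}
```

As a usage example, calling `smooth` on the nodes {0, 0.1, 0.7, 0.8, 1} with an ideal length of 0.25 and a step of 0.2 should move the interior nodes toward even spacing. For this particular quadratic objective the fixed step has to stay below 0.25 to remain stable; a production solver would choose the step with a line search instead. Computing the whole gradient before updating any node is what keeps both loops free of data races, which is the kind of algorithmic adjustment that threading an existing solver may require.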