AUTOMATIC ONLINE TUNING

AutoTune

Key innovation

Performance analysis and tuning is an important step in programming multicore-based parallel architectures, ranging from small clusters with GPUs (Graphics Processing Units), such as the modern high-end servers used by Google, up to high-performance computing (HPC) systems such as the new SuperMUC supercomputer at Leibniz Computing Centre, Garching. Existing performance analysis tools help the developer analyse application performance, but they give no recommendations on how to tune the code.

AutoTune will extend Periscope, an open source automatic online and distributed performance analysis tool developed by Technische Universität München, with automatic online tuning plugins for performance and energy efficiency tuning. The resulting Periscope Tuning Framework will be able to tune serial and parallel codes, with or without GPU kernels, and will return tuning recommendations that can be integrated into the production version of the code. The whole tuning process, consisting of automatic performance analysis and automatic tuning, will be executed online, i.e. during a single run of the application.

This new productivity tool for parallel programming will increase multi-core programming efficiency by orders of magnitude. Moreover, energy savings of up to 10% can be expected.

The technical approach

AutoTune will develop the Periscope Tuning Framework as an extension of Periscope. It will follow the main Periscope principles: use of formalised expert knowledge in the form of properties and strategies, automatic execution, online search based on program phases, and distributed processing. Periscope will be extended by a number of tuning plugins that fall into two categories: online and semi-online plugins. An online tuning plugin will apply transformations to the application and/or the execution environment without requiring a restart of the application.
A semi-online tuning plugin will require a restart of the application, but not of Periscope.

Contract number: 288038
Project coordinator: Technische Universität München
Contact person: Michael Gerndt, TUM, Boltzmannstr. 3, D-85748 Garching
Tel: +49 89 289 17652
Fax: +49 89 289 17662
Email: gerndt@in.tum.de
Project website: www.autotune-project.eu
Community contribution to the project: €2.35 million
Project start date: 15.10.2011
Duration: 36 months

Demonstration and Use

The results of AutoTune will be demonstrated in the context of GPU programming and High Performance Computing. We will run applications with and without autotuning on different state-of-the-art GPU cards available in 2014 in order to demonstrate the increase in performance portability. We will also demonstrate the increased power efficiency of codes on the Intel Xeon Sandy Bridge-EP nodes of the new SuperMUC supercomputer at Leibniz Computing Centre in Garching. We expect to see a 10% power saving by adapting the node's CPU frequency.

Scientific, Economic and Societal Impact

AutoTune will enhance the international recognition of all participating organizations and facilitate networking between the academic and industrial participants. The scientific impact will include, but not be limited to, extensions of European performance analysis tools towards automatic tuning. The involved universities will incorporate the knowledge gained into their curricula. Future engineers will thus be better trained, which is an important asset for European industry. The industrial partner CAPS will exploit AutoTune results by enhancing its current and future tool offering with new code optimization techniques that give users more portable performance and more automatic adaptation to new hardware target architectures.
Large supercomputing centres such as Leibniz Computing Centre will exploit the energy saving results, which translate directly into a reduction in the operating cost of a high-end system of about €300,000 per year. Because energy savings are of the utmost importance not only for HPC but also for high-end servers, the project will have a big societal impact by potentially saving billions of euros' worth of energy in servers worldwide. This reduction in energy consumption will also contribute significantly to a Greener World. In addition, the European economy will profit through the strengthening of companies that work on desktop accelerated systems as well as those that use such systems in their business.

Key Features

• Automatic performance tuning of parallel codes
• Increased programmer productivity on GPU accelerated systems
• Less power consumption of petascale systems

Project partners                       Country
CAPS Entreprise                        FR
Leibniz Computing Centre               DE
NUI Galway                             IE
Technische Universität München         DE
Universitat Autonoma de Barcelona      ES
Universität Wien                       AT
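
Illustration: frequency tuning over a program phase

The energy tuning idea mentioned under "Demonstration and Use" (adapting the node's CPU frequency), combined with the search over program phases from "The technical approach", can be illustrated with a minimal sketch. Everything in it is hypothetical: the names `toy_phase_energy` and `tune_frequency`, the candidate frequencies, and the energy model are assumptions made for illustration and are not part of Periscope or the Periscope Tuning Framework. A real plugin would measure each phase execution on the hardware instead of evaluating a formula, and might use a smarter search than trying every candidate.

```python
from typing import Callable, Iterable, Tuple


def toy_phase_energy(freq_ghz: float) -> float:
    """Hypothetical energy (joules) for one execution of a program phase.

    Assumption for illustration only: runtime shrinks as 1/f, and power is
    a static part plus a dynamic part that grows as f**3. Real values
    would come from measurements, not from this formula.
    """
    runtime_s = 10.0 / freq_ghz
    power_w = 40.0 + 5.0 * freq_ghz ** 3
    return runtime_s * power_w


def tune_frequency(
    measure: Callable[[float], float],
    candidates: Iterable[float],
) -> Tuple[float, float]:
    """Exhaustive search: evaluate every candidate once, keep the cheapest.

    `measure` stands in for running one phase at the given frequency and
    returning the energy it consumed.
    """
    best_freq = None
    best_energy = float("inf")
    for freq in candidates:
        energy = measure(freq)  # one phase execution per candidate
        if energy < best_energy:
            best_freq, best_energy = freq, energy
    if best_freq is None:
        raise ValueError("no candidate frequencies given")
    return best_freq, best_energy


if __name__ == "__main__":
    # Hypothetical frequency steps in GHz.
    freq, energy = tune_frequency(toy_phase_energy, [1.2, 1.6, 2.0, 2.4, 2.7])
    print(f"recommended frequency: {freq} GHz, energy per phase: {energy:.1f} J")
```

With this toy model the search recommends 1.6 GHz rather than the highest frequency, because the lower power draw outweighs the longer runtime. The sketch evaluates one candidate per phase execution, which mirrors the idea of searching during a single run of the application.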