GRID SERVICES FOR COMPUTATIONAL CHEMISTRY AT THE RUSSIAN GRID POLYGONS: ANALYSIS OF THE STATE AND PERSPECTIVES

Institute of Problems of Chemical Physics, Chernogolovka, Moscow Region
Varlamov Dmitry / Варламов Дмитрий
Volokhov V.M., Volokhov A.V., Pivushkov A.V., Pokatovich G.A., Prokhorov A.I.
Dubna-2012, Distributed Computing and Grid-technologies in Science and Education

Chemistry: the science of materials, i.e. of the world in which we live…

USERS: science, industry, business
• Production of new and modified materials
• New and improved production technologies for materials
• Industrial parameters of reactions
• Reaction rate constants
• Features of the synthesis of new and modified materials
• Optimization of nanomaterials and nanostructures
• Quantum-mechanical and molecular dynamics simulation
• Quantum-chemical calculations: electronic structure of molecules, interaction potentials, structure of substances, etc.

Hierarchical multiscale approach to modeling objects from the quantum to the macro level: all levels demand high-performance calculations

Continuum and structural level
Defining the parameters of the finite element method, which implements the continuum model, and of the structure-level models based on the theory of mechanisms and machines and the theory of complex systems

Mesoscopic level
Viscosity, thermal conductivity, coefficient of friction, wave processes in cells, meso- and macroscopic processes in materials, etc.

Kinetic level
Systems of 10^3 to 10^6 atoms, i.e. objects 100-1000 nm^3 in size
a. evolution of quantum non-equilibrium systems consisting of hundreds of clusters
b.
methods of molecular dynamics (equations of classical mechanics) for atoms and small molecules, treated as classical particles

Quantum-statistical level
Calculation of models taking into account the immediate environment of clusters of atoms and small molecules

Quantum-mechanical level
Modeling of molecules and small clusters of 10-100 atoms (possible phases in a material) by ab initio methods

Typical needs of quantum chemistry for high-performance calculations
1. The first two applied scientific calculations to reach petaflops performance were molecular modeling studies: calculation of the electronic structure of high-temperature superconductors (cuprates) by the quantum Monte Carlo method, and a study of the giant magnetoresistance effect in magnetic nanoparticles ("Jaguar", Oak Ridge, USA, 1.64 Pflops)
2. Resource allocation (examples): studies of the properties of water in low-dimensional systems, up to 8·10^6 CPU (core) hours (Argonne National Laboratory); development of chemical catalysts, up to 30·10^6 CPU-hours per year on "Jaguar" (2010)
3. Typical FULL docking of a protein ligand: 200 atoms x 300 000 configurations x 1000 core-hours, about 300 Pflops!
4. By some estimates, the majority of the supercomputer centers of the USA (San Diego, Ohio, Illinois, etc.) allocate up to 25-30 % of their capacity to biochemistry and molecular modeling, quantum chemistry and nanotechnology calculations

EXAMPLE OF A RESOURCE-INTENSIVE APPLICATION OF QUANTUM CHEMISTRY
Study of catalytic hydrogen dissociation on Pt nanostructures on a crystalline SnO2 surface
[Figure: Pt nanostructure on the SnO2 surface (atoms Sn, O, Pt, H); potential energy curves U(H2) and U(H+H) as functions of the H-H distance rH-H]
SKIF-MSU ("Chebyshev"): VASP, 200 CPUs, 15 hours for 10 optimization steps; total requirement: 100-200 steps

Structure of the Pt19 cluster deposited on the SnO2 surface; distances are given in Å, and the stability of the cluster against separation from the surface is given in eV
[Figure: Pt19 cluster on SnO2 (atoms O, Sn, Pt); labeled distances include 2.60, 2.62 and 2.63 Å; structure II.1: -7.23 eV]
The VASP package, designed for calculations that take translational symmetry into account, is used. Each calculation point requires more than a day of operation on 100 CPUs; the required set of points amounts to more than a hundred days

Dynamics of proton migration on a platinum cluster
[Figure: proton (H) positions on the Pt cluster over SnO2 (atoms H, O, Sn, Pt) with labeled values -1.61, -2.39, -2.56, -2.15, -2.35 and -3.08]

EXAMPLE OF ANOTHER RESOURCE-INTENSIVE APPLICATION
Trajectory calculations of chemical reaction cross sections
Reaction channels: H+H+O+O, H2+O+O, H2O+O, H2+O2, H2O2, H+H+O2, H+HO2, HO+HO, HO+O+H
Σ = Sn/N
Initial conditions:
• 4 angles of mutual orientation
• impact parameters
• 2 vibrational quantum numbers
• 2 rotational quantum numbers
• collision energy (in the center-of-mass frame)
Up to 1-10 million independent trajectories: at least 2·10^6 CPU-hours
Chemical reactions are modeled as the motion of atoms on potential energy surfaces within classical dynamics

Typical examples of supercomputer centers focused on quantum-chemical and molecular dynamics calculations
1.
The EPSRC (Engineering and Physical Sciences Research Council) UK National Service for Computational Chemistry Software (NSCCS, http://www.nsccs.ac.uk): >250 Tflops
2. The National Resource for Biomedical and Chemistry Supercomputing (NRBSC, http://www.nrbsc.org): calculation of molecular systems of 20000 to 120000 atoms; provides external users with up to 100 000 CPU-hours per structure;
3. Chemical Computing Group (http://www.chemcomp.com), Quebec, Canada: >120 Tflops
4. Lawrence Berkeley National Laboratory (USA, http://www.lbl.gov/csd), chemical branch: >450 Tflops;
5. Lehrstuhl für Theoretische Chemie der Technische Universität München, Germany (http://www.lrz.de/services/software/chemie): own Linux cluster (96 Tflops) plus permanently allocated resources of the Supercomputer Center;
6. Swiss National Supercomputing Centre (http://www.cscs.ch): from tens of Tflops to sub-Pflops of resources allocated to chemical calculations

Research directions of IPCP
IPCP RAS is the largest academic institute in Russia carrying out research in the following areas:
• the theory of elementary chemical processes;
• structure of molecules and of solid-state objects;
• creation of materials with predetermined properties;
• kinetics and mechanisms of complex chemical reactions;
• nanotechnologies;
• creation of biologically active substances and medicinal drugs;
• biotechnologies and biochemistry;
• and many other research directions.

PARTICIPATION OF IPCP IN GRID PROJECTS
Federal Target Programs
1. State contract No. 07.514.11.4019 of 23 September 2011 with IPCP RAS under the Federal Target Program "Research and Development in Priority Areas of Development of the Scientific and Technological Complex of Russia for 2007-2013", on the topic "Development of a technology for high-performance calculations in computational chemistry in various distributed environments using methods of application and resource virtualization": principal contractor
2. Federal Target Program "Development of the Nanoindustry Infrastructure in the Russian Federation for 2008-2011", project "Creation of the National Nanotechnological Network (GridNNN)", state contract No. 16.647.12.2031 of 13 May 2011: co-contractor
3. State contract under the Federal Target Program "Research and Development in Priority Areas of Development of the Scientific and Technological Complex of Russia for 2007-2013" on the topic "Development of a pilot zone of the Russian grid system for high-performance computing, including in the interests of the federal nuclear centers"; subject area: "Adaptation of applied software packages for operation in the pilot grid infrastructure. Development of methods for grid computing using dynamically formed execution environments for grid jobs": co-contractor

Fundamental Research Programs of the Presidium of the RAS
1. Program No. 13 of fundamental research of the Presidium of the RAS for 2009-2012, "Problems of creating a national scientific distributed information and computing environment based on the development of GRID technologies and modern telecommunication networks", project "Study of methods for virtualizing computing environments and applications in computational chemistry. Dynamic formation of parallel software environments on distributed resources";
2. Program No. 27 of fundamental research of the Presidium of the RAS for 2009-2012, "Foundations of fundamental research on nanomaterials", project "Self-organization of nanoscale materials and processes of their interaction with adsorbed compounds: computer modeling in parallel and distributed GRID environments of the teraflops level"

Grants of the Russian Foundation for Basic Research (RFBR)
1. RFBR grant No. 11-07-00686-а, "Development of methods for the dynamic formation of parallel computing environments on various types of grid resources (using quantum chemistry applications as an example)"

History of the use of GRID technologies at IPCP
2004 - 2007: Condor
2005 - 2007: Globus 3/4
2006 - 2010: LCG-2, gLite (EGEE-RDIG)
2008 - …: Unicore (SKIF Polygon)
2010 - …: Globus Toolkit 4 (GridNNN)

Quantum-chemical applications adapted for use in distributed GRID environments:
• Gaussian 03 (parallel, taking license restrictions into account)
• GAMESS US (parallel, socket and MPI variants)
• CPMD (parallel)
• Dalton (parallel)
• NAMD (parallel)
• and many others: in-house programs, for example for studying the tunneling properties of structures (time-dependent Schrödinger equation)

Current achievements
The IPCP GRID center permanently operates Resource Sites of several Russian grid polygons (GridNNN, Minsvyaz GRID, SKIF-Polygon, etc.), supporting a number of installed applied quantum-chemical packages (>10) and Virtual Organizations (up to 3) oriented to computational chemistry.
Access to the computing resources of the GridNNN network (up to 40-50 Tflops and up to 10000 CPUs) and to a large volume of applied quantum-chemical software (Firefly, MolPro, LAMMPS, NAMD, AbInit, etc.)
supported by the GridNNN infrastructure is provided.

Virtual Organizations (VO) in the field of computational chemistry within the National Nanotechnological Network (GridNNN)
IPCP created and supports 3 Virtual Organizations, "NanoChem", "Gamess" and "Gaussian", in the field of computational chemistry within the infrastructure of the National Nanotechnological Network (GridNNN), for carrying out calculations with applied quantum-chemical software (Gaussian, GAMESS, NAMD, etc.) on the GRID polygon of the Russian National Nanotechnological Network (http://nanogrid.icp.ac.ru/virtual.php, http://ngrid.ru/ngrid/gridnnn/volist)

Participants of GridNNN and of the VOs mentioned above:
• Research Computing Center of Moscow State University ("Chebyshev")
• Skobeltsyn Institute of Nuclear Physics, Moscow State University
• Joint Institute for Nuclear Research (Dubna)
• RRC "Kurchatov Institute"
• Kazan Scientific Center of the RAS
• Petersburg Nuclear Physics Institute
• IPCP RAS
• Institute of Continuous Media Mechanics, Ural Branch of the RAS (Perm)
• St. Petersburg State University
• Computing Center of the Far Eastern Branch of the RAS (Khabarovsk)
• St. Petersburg Institute for Informatics and Automation of the RAS
• and other universities, RAS research institutes, and innovative and commercial enterprises

Resource site of IPCP in the SKIF grid polygon (category "A" site)
• OS: Linux Ubuntu
• Middleware: Unicore 6.2
  – Gateway (https://unicorgw.icp.ac.ru:8080);
  – Server container (Unicore/X);
  – Target System Interface (TSI);
  – UAS (Unicore Atomic Services);
  – Unicore User Database (XUUDB);
  – PBS/Torque;
  – User interface: command-line client
Access through the distributed environment to the main supercomputer resources of SKIF-Polygon is provided (for example, SKIF-MSU and various Russian sites)
Quantum-chemical applications (GAMESS-US, Gaussian, NAMD, in-house multiparameter tasks) have also been adapted for use in the Unicore environment

Quantum-chemical and molecular dynamics applied packages: features of use
The use of standard (from the point of view of the chemist and any end user) applied programs in grid environments is determined by several factors:
• demand for the applied packages among scientific and industrial researchers, i.e. end users (there is no point in carrying out a very labor-intensive adaptation and the subsequent maintenance in the grid environment for rarely used packages);
• license restrictions (including those that exist for freely distributed packages); for commercial packages the licensing terms can either severely limit the possibility of deploying them in a grid environment or forbid it outright;
• availability of the package source code, which is highly desirable for some package modules (e.g., communication modules or those responsible for implementing parallel protocols);
• ability to work without interactive user involvement and graphical display of information (which can still hardly be realized in a grid environment).

Quantum-chemical and molecular dynamics applied packages available in GridNNN and other grid polygons (all are installed on the grid resource sites of IPCP)
Quantum chemistry:
1. Gaussian (including a high-level web interface)
2. GAMESS-US (including a high-level web interface)
3. FireFly
4. AbInit
5. NAMD
6. VASP
7. PWScf (including a high-level web interface)
8. NWChem (including a high-level web interface)
Molecular dynamics:
1. LAMMPS
2. OpenMX
3. GROMACS

Adaptation of an application package for use in grid environments
1. Configuring the Grid Gateway to serve the applied package(s) and to provide information about them to the resource monitor
2. Creation of low-level interfaces to the applied package(s) for processing incoming jobs
3. Creation of high-level web-oriented user interfaces (POI, WIG)
4.
In some cases: modification of the source code of the applied packages

[Figure: Simplified scheme of grid calculations with high-level interfaces to applied software packages (middleware: GridNNN, gLite, Unicore). Diagram elements: authorized grid user (user certificate, data files); WWW portal with WIG(s) (loading, editing and storing the input data; high-level formation of the job; resource selection; job launch; job monitoring; results); web interfaces, POI, WIG; gateway to the grid environment (authorization and certification; formation of the final result); space of grid polygons; low-level interfaces to applied packages on Resource Sites; return of the received results]

Web interfaces have been created (http://nanogrid.icp.ac.ru, http://webgrid.icp.ac.ru, the grid portal of IPCP) for working with the quantum-chemical software packages GAMESS-US, Gaussian, NAMD, NWChem, PWscf, etc. in the virtual organizations of GridNNN and in the pilot zone of the Russian grid system for high-performance computing.
User-friendly web interfaces to applied software packages considerably reduce the effort required of the end user (researcher) to work with such packages in distributed computing environments. The created interfaces allow the user to work through a web browser and to carry out the following actions:
• to authorize the user to launch a software package and to carry out the user's certification in the Virtual Organization;
• to prepare a job (including creating and editing input data and configuration files) in compliance with the package requirements;
• to launch an applied package in the grid polygon infrastructure (if necessary, on any resource or on a chosen one);
• to monitor the execution of a job (including stopping and restarting it);
• on completion of the job, to receive the results of the distributed calculations.
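
The five user actions listed above can be pictured as a small job lifecycle. The following Python sketch is only an illustration under stated assumptions: it is an in-memory mock, not the IPCP portal code and not a GridNNN, gLite or Unicore client. The names `MockGridPortal`, `GridJob`, `JobState` and all method names are hypothetical and were introduced here for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class JobState(Enum):
    PREPARED = "prepared"
    RUNNING = "running"
    STOPPED = "stopped"
    FINISHED = "finished"


@dataclass
class GridJob:
    package: str                      # e.g. "GAMESS-US"
    input_files: dict                 # file name -> file content
    resource: str = "any"             # "any" or a chosen resource site
    state: JobState = JobState.PREPARED
    results: dict = field(default_factory=dict)


class MockGridPortal:
    """Hypothetical in-memory stand-in for a web portal (no real grid calls)."""

    def __init__(self, vo_members):
        # vo_members: VO name -> set of certificate subjects allowed in that VO
        self.vo_members = vo_members
        self.jobs = []

    def authorize(self, certificate_subject, vo):
        """Action 1: check that the user's certificate is registered in the VO."""
        return certificate_subject in self.vo_members.get(vo, set())

    def prepare_job(self, package, input_files, resource="any"):
        """Action 2: bundle input data and configuration files into a job."""
        if not input_files:
            raise ValueError("a job needs at least one input file")
        job = GridJob(package, dict(input_files), resource)
        self.jobs.append(job)
        return job

    def launch(self, job):
        """Action 3 (and restart): start a prepared or stopped job."""
        if job.state not in (JobState.PREPARED, JobState.STOPPED):
            raise RuntimeError("only prepared or stopped jobs can be launched")
        job.state = JobState.RUNNING

    def stop(self, job):
        """Action 4: stop a running job; monitoring is reading job.state."""
        if job.state is JobState.RUNNING:
            job.state = JobState.STOPPED

    def complete(self, job, results):
        """Simulation hook: the resource site reports that the job finished."""
        if job.state is not JobState.RUNNING:
            raise RuntimeError("only a running job can finish")
        job.results = dict(results)
        job.state = JobState.FINISHED

    def fetch_results(self, job):
        """Action 5: return the output files once the job has finished."""
        if job.state is not JobState.FINISHED:
            raise RuntimeError("results are available only after the job finishes")
        return job.results
```

In a real deployment each method would wrap calls to the middleware behind the gateway; here the state machine only makes the order of the steps explicit (authorize, prepare, launch, monitor with optional stop and restart, fetch results).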

Example of a web interface module for the quantum-chemical applied package GAMESS (loading and editing of input files)

Module of the web interface for full-scale preparation of data and configuration files for the quantum-chemical applied package GAMESS-US in distributed environments

Example of a web interface module for the quantum-chemical applied package GAMESS (resource selection)

Example of a web interface module for the quantum-chemical applied package GAMESS (monitoring of jobs)

Problems in the creation and development of applied GRID services
• High economic costs of creating and operating computing resources and infrastructure; lack of steady support for the Russian grid polygons;
• Absent (or low) interest from industry and business:
  – the generally low culture of technological development in the majority of scientific and industrial organizations;
  – inability and unwillingness to use the results of mathematical modeling and high-performance calculations.
• Lack of skilled professionals both among computer specialists (systems analysts, programmers) and among users (chemists, engineers);
• Practical absence of domestic applied packages; high software cost, license restrictions, a limited area of use and difficulty of development;
• Weak regulation of the legal and economic relations between owners of computing services (including GRID) and consumers

Suggested ways to solve problems
• Stimulating users (industrial ones first of all) to use computing GRID services:
  – free or minimal-fee use of GRID resources at the first stages of innovative development;
  – training of users both in person and interactively online; creation of web information resources for learning to work with the interfaces of applied packages and with GRID resources;
  – introducing into the curricula of applied specialists (engineers, chemists, etc.) additional courses on working with applied software packages, with students carrying out mandatory calculations of real scientific, technical and engineering tasks using supercomputers and GRID infrastructures;
  – creation of user-friendly, web-oriented interfaces to GRID resources with a large number of basic templates;
  – introduction by the Russian state, as a certification condition in the development of certain commodity products and materials, of the mandatory use of mathematical modeling and testing tools

Suggested ways to solve problems (continued)
• Government policy and stimulation of developers of applied GRID services and resources:
  – creation of a state-owned (or shared-ownership), stable, permanently operating computing infrastructure (supercomputers, medium-sized clusters, high-speed communication channels, etc.);
  – compensation by the State of part of the operating costs of supercomputer centers for support of distributed calculations (grants or budget financing);
  – tax privileges for the development of domestic applied software, creation of innovation parks focused on it;
  – improvement of regulatory, legal and economic regulation in the field of strategic information technologies;
  – in the field of State education: (a) expanding the number of state-funded places in universities for specialties such as "system administration/programming", "application programming", etc.; (b) widely involving undergraduate and graduate students in developing the necessary system and application software as part of their studies; (c) training university teachers in these specialties.

Conclusions
The IPCP GRID Resource Center, integrated into the National Nanotechnological Network and into the pilot zone of the Russian grid system for high-performance calculations, provides access to a number of quantum-chemical and molecular dynamics applied packages for carrying out large-scale calculations on the Russian grid polygons, using both simplified and problem-oriented high-level web interfaces.

THANKS FOR YOUR ATTENTION!