Grid services for computational chemistry at the Russian grid polygons

GRID SERVICES FOR COMPUTATIONAL CHEMISTRY
AT THE RUSSIAN GRID POLYGONS:
ANALYSIS OF THE CURRENT STATE AND PROSPECTS
Institute of Problems of Chemical Physics, Moscow Region, Chernogolovka
Varlamov Dmitry
Volokhov V.M., Volokhov A.V., Pivushkov A.V., Pokatovich G.A., Prokhorov A.I.
Dubna-2012, Distributed Computing and Grid-technologies in Science and Education
Chemistry – a science about materials, i.e.
about the world in which we live…
USERS: science, industry, business
Production of new and modified materials
New and improved technologies for producing materials
Industrial parameters of reactions
Reaction rate constants
Features of the synthesis of new and modified materials
Optimization of nanomaterials and nanostructures
Quantum-mechanical and molecular dynamics simulation
Quantum-chemical calculations: electronic structure of molecules, interaction potentials, structure of matter, etc.
Hierarchical multiscale approach to modeling objects from the quantum to the macro level: every level demands high-performance computing
Continuum and structural level
Determining the parameters of the finite element method, which implements the continuum model, and of structure-level models based on the theory of mechanisms and machines and the theory of complex systems
Mesoscopic level
Viscosity, thermal conductivity, friction coefficient, wave processes in cells, meso- and macroscopic processes in materials, etc.
Kinetic level
Systems of 10^3 to 10^6 atoms, i.e. objects with a volume of 100-1000 nm^3
a. evolution of quantum non-equilibrium systems consisting of hundreds of clusters
b. methods of molecular dynamics (equations of classical mechanics) for atoms and small
molecules, considered as classical particles
Quantum-statistical level
Calculation of models taking into account the immediate environment of clusters of atoms and
small molecules
Quantum-mechanical level
Ab initio modeling of molecules and small clusters of 10-100 atoms (possible phases in a material)
Typical needs of quantum chemistry for high-performance computing
1. The first two applied scientific calculations to reach petaflops performance were molecular modeling studies: calculations of the electronic structure of high-temperature superconductors (cuprates) by the quantum Monte Carlo method, and a study of the giant magnetoresistance effect in magnetic nanoparticles ("Jaguar", Oak Ridge, USA, 1.64 Pflops)
2. Resource allocation (examples): studies of the properties of water in low-dimensional systems, up to 8·10^6 CPU (core)-hours (Argonne National Laboratory); development of chemical catalysts, up to 30·10^6 CPU-hours per year on "Jaguar" (2010)
3. A typical FULL protein-ligand docking: 200 atoms × 300 000 configurations × 1000 core-hours, about 300 Pflops!!!
4. By some estimates, most supercomputer centers in the USA (San Diego, Ohio, Illinois, etc.) allocate up to 25-30% of their capacity to biochemistry and molecular modeling, quantum chemistry, and nanotechnology calculations
EXAMPLE OF A RESOURCE-INTENSIVE APPLICATION
OF QUANTUM CHEMISTRY
Study of catalytic hydrogen dissociation on
Pt nanostructures on a SnO2 crystal surface
[Figure: H2 on a Pt nanostructure on the SnO2 surface (atoms: Sn, O, Pt, H), with potential energies U(H2) and U(H+H)]
SKIF-MSU ("Chebyshev"): VASP, 200 CPUs, 15 hours for 10 optimization steps; the total requirement is 100-200 steps
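The figures quoted for the SKIF-MSU run imply a rough total cost for one full optimization. The sketch below only reproduces that arithmetic; it assumes the cost grows linearly with the number of optimization steps, which the slide does not state, and the function names are illustrative.

```python
# Rough CPU-hour estimate extrapolated from the run quoted above
# (200 CPUs, 15 hours, 10 optimization steps).
# Linear scaling with the step count is an assumption for illustration.

def cpu_hours(cpus: int, wall_hours: float) -> float:
    """CPU-hours consumed by a run on `cpus` processors for `wall_hours`."""
    return cpus * wall_hours


def extrapolate(measured_cpu_hours: float, measured_steps: int,
                target_steps: int) -> float:
    """Scale a measured cost to a different number of optimization steps."""
    return measured_cpu_hours / measured_steps * target_steps


measured = cpu_hours(cpus=200, wall_hours=15)   # cost of the 10 measured steps
low = extrapolate(measured, measured_steps=10, target_steps=100)
high = extrapolate(measured, measured_steps=10, target_steps=200)

if __name__ == "__main__":
    print(f"10 steps: {measured:.0f} CPU-hours")
    print(f"100-200 steps: {low:.0f} to {high:.0f} CPU-hours")
```

Under that assumption a complete optimization costs tens of thousands of CPU-hours, which is the scale that motivates distributing such jobs.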
[Plot axis: rH-H, the H-H distance]
Structure of the Pt19 cluster deposited on the SnO2 surface. Distances are given in Å; the stability of the cluster against separation from the surface is given in eV
[Figure: Pt19 cluster on the SnO2 surface (atoms: O, Sn, Pt) with labeled interatomic distances, including 2.60, 2.62 and 2.63 Å; further labels: II.1, -7.23]
The VASP package, which is designed for calculations that take translational symmetry into account, is used. Each calculation point needs days of operation on 100 CPUs, and more than a hundred points are required.
Dynamics of proton migration on a platinum cluster
[Figure: migration of a proton (H) over the Pt cluster on SnO2 (atoms: H, O, Sn, Pt); labeled values: -1.61, -2.39, -2.56, -2.15, -2.35, -3.08]
EXAMPLE OF ANOTHER RESOURCE-INTENSIVE
APPLICATION
Trajectory calculations of chemical reaction cross sections
Reaction channels: H+H+O+O, H2+O+O, H2O+O, H2+O2, H2O2, H+H+O2, H+HO2, HO+HO, HO+O+H
Σ=Sn/N
Start conditions:
• 4 angles of mutual orientation
• impact parameter
• 2 vibrational quantum numbers
• 2 rotational quantum numbers
• collision energy (in the center-of-mass system)
Up to 1-10 million independent trajectories, at least 2·10^6 CPU hours
Chemical reactions are modeled as the motion of atoms on potential energy
surfaces within classical dynamics
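The trajectory method can be illustrated with a toy Monte Carlo estimate. This is not the authors' code. It reads the slide's formula Σ=Sn/N as a cross section equal to S·n/N, where S is the sampled area, n the number of reactive trajectories and N the total number; that reading is an assumption. It also replaces the integration of the classical equations of motion with a hypothetical hard-sphere rule (`d_react`), and samples only the impact parameter rather than the full set of start conditions.

```python
import math
import random


def run_trajectory(b: float, d_react: float) -> bool:
    """Stand-in for integrating one classical trajectory.

    Hypothetical rule: the collision counts as reactive when the impact
    parameter b is smaller than d_react. A real code would propagate the
    atoms on a potential energy surface instead.
    """
    return b < d_react


def cross_section(n_traj: int, b_max: float, d_react: float,
                  seed: int = 0) -> float:
    """Estimate sigma = S * n / N with S = pi * b_max**2."""
    rng = random.Random(seed)
    reactive = 0
    for _ in range(n_traj):
        # Sample b uniformly over the disc area: b = b_max * sqrt(u).
        b = b_max * math.sqrt(rng.random())
        if run_trajectory(b, d_react):
            reactive += 1
    area = math.pi * b_max ** 2
    return area * reactive / n_traj


if __name__ == "__main__":
    sigma = cross_section(n_traj=200_000, b_max=2.0, d_react=1.0)
    # For this toy rule the exact value is pi * d_react**2.
    print(f"estimated sigma = {sigma:.3f}, exact = {math.pi:.3f}")
```

Each trajectory is independent of the others, so batches of them can run on separate resources and only the reactive counts need to be combined at the end.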
Typical examples of supercomputer centers
focused on quantum-chemical and molecular
dynamics calculations
1. The EPSRC (Engineering and Physical Sciences Research Council) UK National Service for Computational Chemistry Software (NSCCS, http://www.nsccs.ac.uk) – >250 Tflops;
2. The National Resource for Biomedical and Chemistry Supercomputing (NRBSC, http://www.nrbsc.org) – calculation of molecular systems of 20000 to 120000 atoms; provides external users with up to 100 000 CPU-hours per structure;
3. Chemical Computing Group (http://www.chemcomp.com), Quebec, Canada – >120 Tflops;
4. Lawrence Berkeley National Laboratory (USA, http://www.lbl.gov/csd), chemical branch – >450 Tflops;
5. Lehrstuhl für Theoretische Chemie der Technische Universität München, Germany (http://www.lrz.de/services/software/chemie) – own Linux cluster (96 Tflops) plus permanently allocated resources of the Supercomputer Center;
6. Swiss National Supercomputing Centre (http://www.cscs.ch) – from tens of Tflops to sub-Pflops of resources allocated to chemical calculations
Research areas of IPCP
IPCP RAS is the largest academic institute in Russia carrying out research in the following areas:
• the theory of elementary chemical processes;
• structure of molecules and of solid-state objects;
• creation of materials with predefined properties;
• kinetics and mechanisms of complex chemical reactions;
• nanotechnologies;
• creation of biologically active substances and medical drugs;
• biotechnologies and biochemistry;
• and many other scientific directions…
IPCP PARTICIPATION IN GRID PROJECTS
Federal target programs
1. State contract No. 07.514.11.4019 of 23 September 2011 with IPCP RAS under the Federal Target Program (FTP) "Research and Development in Priority Areas of Development of the Russian Scientific and Technological Complex for 2007-2013", on the topic "Development of a technology for high-performance calculations in computational chemistry in various distributed environments using application and resource virtualization methods" – principal contractor
2. FTP "Development of the Nanoindustry Infrastructure in the Russian Federation for 2008-2011", project "Creation of the National Nanotechnological Network (GridNNN)", state contract No. 16.647.12.2031 of 13.05.2011 – co-contractor
3. State contract under the FTP "Research and Development in Priority Areas of Development of the Russian Scientific and Technological Complex for 2007-2013" on the topic "Development of the pilot zone of the Russian grid system for high-performance computing, including in the interests of federal nuclear centers", subject: "Adaptation of applied software packages for operation in the pilot grid infrastructure. Development of methods for grid computing using dynamically formed execution environments for grid jobs" – co-contractor
Basic research programs of the Presidium of the RAS
1. Program No. 13 of basic research of the Presidium of the RAS for 2009-2012, "Problems of creating a national scientific distributed information and computing environment based on the development of GRID technologies and modern telecommunication networks", project "Study of methods for virtualizing computing environments and applications in computational chemistry. Dynamic formation of parallel software environments on distributed resources";
2. Program No. 27 of basic research of the Presidium of the RAS for 2009-2012, "Foundations of basic research on nanomaterials", project "Self-organization of nanoscale materials and the processes of their interaction with adsorbed compounds: computer simulation in parallel and distributed GRID environments of the teraflops level"
Grants of the Russian Foundation for Basic Research
1. RFBR grant No. 11-07-00686-a, "Development of methods for the dynamic formation of parallel computing environments on various types of grid resources (using quantum chemistry applications as an example)"
History of GRID technology usage at IPCP
2004 - 2007: Condor
2005 - 2007: Globus 3/4
2006 - 2010: LCG-2 – gLite (EGEE-RDIG)
2008 - …: Unicore (SKIF Polygon)
2010 - …: Globus Toolkit 4 (GridNNN)
Quantum-chemical applications adapted for use in distributed GRID environments:
• Gaussian 03 (parallel, taking license restrictions into account)
• GAMESS US (parallel, socket and MPI variants)
• CPMD (parallel)
• Dalton (parallel)
• NAMD (parallel) and many others
• The authors' own programs, for example for studying the tunneling properties of structures (time-dependent Schrödinger equation)
Current achievements
Based on the IPCP GRID center, resource sites of several Russian grid polygons (GridNNN, Minsvyaz GRID, SKIF-Polygon, etc.) operate continuously. They support more than 10 installed applied quantum-chemical packages and up to 3 Virtual Organizations oriented to computational chemistry
Access is provided to the computing resources of the GridNNN network (up to 40-50 Tflops and up to 10000 CPUs) and to a large body of applied quantum-chemical software (Firefly, MolPro, LAMMPS, NAMD, AbInit, etc.) supported by the GridNNN infrastructure
Virtual Organizations (VO) in the computational chemistry field
within the National Nanotechnological Network (GridNNN)
IPCP created and supports 3 Virtual Organizations, "NanoChem", "Gamess" and "Gaussian", in the field of computational chemistry within the infrastructure of the National Nanotechnological Network (GridNNN). They are intended for calculations with applied quantum-chemical software (Gaussian, GAMESS, NAMD, etc.) on the GRID polygon of the Russian National Nanotechnological Network
(http://nanogrid.icp.ac.ru/virtual.php, http://ngrid.ru/ngrid/gridnnn/volist)
Participants of GridNNN and of the VOs mentioned above:
Research Computing Center of MSU ("Chebyshev")
SINP MSU (Skobeltsyn Institute of Nuclear Physics)
JINR (Dubna)
RRC "Kurchatov Institute"
Kazan Scientific Center of the RAS
PNPI (Petersburg Nuclear Physics Institute)
IPCP RAS
ICMM UB RAS (Perm)
SPbSU (St. Petersburg State University)
Computing Center of the Far Eastern Branch of the RAS (Khabarovsk)
SPIIRAS
and other universities, RAS research institutes, innovative and commercial enterprises
Resource site of IPCP in the SKIF grid polygon
(category "A" site)
• OS: Ubuntu Linux; middleware: Unicore 6.2
– Gateway (https://unicorgw.icp.ac.ru:8080);
– Server container (Unicore/X);
– Target System Interface (TSI);
– UAS (Unicore Atomic Services);
– Unicore User Database (XUUDB);
– PBS/Torque;
– User interface: command-line client
Access through the distributed environment to the main supercomputer resources of SKIF-Polygon (for example, SKIF-MSU and other Russian sites) is provided
Quantum-chemical applications (GAMESS-US, Gaussian, NAMD, the authors' own multiparameter tasks) have also been adapted for use in the Unicore environment
Quantum-chemical and molecular dynamics applied packages:
features of use
Whether a standard applied program (standard from the point of view of the chemist and any end user) can be used in grid environments depends on several factors:
• demand for the package among scientific and industrial researchers, i.e. end users (it makes no sense to carry out very labor-intensive adaptation and subsequent maintenance in the grid environment for rarely used packages);
• license restrictions (including those that exist for freely distributed packages); for commercial packages the licensing conditions can either severely limit their deployment in a grid environment or forbid it outright;
• availability of the package source code, which is highly desirable for some modules (e.g. communication modules or those implementing parallel protocols);
• ability to work without interactive user input and without graphical display of information (both are still hard to provide in a grid environment).
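The four factors above can be read as a simple checklist. The following sketch encodes them as one possible screening function; the class, the field names and the example package are hypothetical, and treating missing source code as a warning rather than a blocker is an interpretation of "highly desirable".

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Package:
    name: str
    in_demand: bool            # requested by enough end users
    license_allows_grid: bool  # license permits distributed deployment
    source_available: bool     # desirable for communication/parallel modules
    batch_capable: bool        # runs without interactive or graphical steps


def assess(pkg: Package) -> Tuple[bool, List[str]]:
    """Return (suitable, remarks) for adapting `pkg` to a grid environment."""
    remarks: List[str] = []
    suitable = True
    if not pkg.in_demand:
        suitable = False
        remarks.append("low demand: adaptation effort is not justified")
    if not pkg.license_allows_grid:
        suitable = False
        remarks.append("license restricts or forbids grid deployment")
    if not pkg.batch_capable:
        suitable = False
        remarks.append("needs interactive or graphical operation")
    if not pkg.source_available:
        # Warning only: source access is desirable, not mandatory.
        remarks.append("no source code: parallel modules cannot be modified")
    return suitable, remarks


if __name__ == "__main__":
    demo = Package("ExamplePkg", True, True, False, True)  # hypothetical
    print(assess(demo))
```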
Quantum-chemical and molecular dynamics applied packages available in
GridNNN and other grid polygons
(all are installed on the resource grid sites of IPCP)
Quantum Chemistry:
1. Gaussian – including high-level web-interface
2. GAMESS-US – including high-level web-interface
3. FireFly
4. AbInit
5. NAMD
6. VASP
7. PWScf – including high-level web-interface
8. NWChem – including high-level web-interface
Molecular dynamics
1. LAMMPS
2. OpenMX
3. GROMACS
Adaptation of an application package for use in grid
environments
1. Configuring the grid gateway to serve the applied package(s) and to publish information about them to the resource monitor
2. Creating low-level interfaces to the applied package(s) for processing incoming jobs
3. Creating high-level web-oriented user interfaces (POI, WIG)
4. In some cases, modifying the source code of the applied packages
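Step 2 can be pictured as a thin wrapper that turns an incoming job description into a batch script for the local resource manager. The sketch below assumes a PBS/Torque batch system and uses only common `#PBS` directives; the function name and the package launch command (`./run_package.sh`) are placeholders, not the interfaces actually deployed at IPCP.

```python
def build_pbs_script(job_name: str, nodes: int, ppn: int,
                     walltime: str, command: str) -> str:
    """Build a PBS/Torque batch script for one incoming job.

    `command` is the package launch line; here it is a placeholder
    supplied by the caller, not a real package invocation.
    """
    if nodes < 1 or ppn < 1:
        raise ValueError("nodes and ppn must be positive")
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",                # job name shown in the queue
        f"#PBS -l nodes={nodes}:ppn={ppn}",   # nodes and cores per node
        f"#PBS -l walltime={walltime}",       # wall-clock limit, HH:MM:SS
        'cd "$PBS_O_WORKDIR"',                # start in the submission dir
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    script = build_pbs_script(
        job_name="demo",
        nodes=2,
        ppn=8,
        walltime="01:00:00",
        command="./run_package.sh input.inp",  # hypothetical launcher
    )
    print(script)
```

Keeping the launch line as a parameter lets one wrapper serve several packages, with the package-specific part confined to the launcher it calls.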
[Figure: Simplified scheme of grid calculations with high-level interfaces to applied software packages (middleware: GridNNN, gLite, Unicore)]
Elements shown in the scheme:
• Authorized grid user, user certificate, data files
• WWW portal, WIG(s): high-level job formation, resource selection, job launch, job monitoring, results
• Return of the received results
• Loading, editing and storing the input data
• Gateway to the grid environment: authorization and certification
• Formation of the final result
• Web interfaces, POI, WIG
• Space of grid polygons
• Low-level interfaces to applied packages on resource sites
• Applied software packages on resource sites
Web interfaces were created (http://nanogrid.icp.ac.ru, http://webgrid.icp.ac.ru – grid portal of IPCP) to work with the quantum-chemical software packages GAMESS-US, Gaussian, NAMD, NWChem, PWscf, etc. in the virtual organizations of GridNNN and in the pilot zone of the Russian grid system for high-performance computing
User-friendly web interfaces to applied software packages considerably reduce the effort a researcher needs to work with such packages in distributed computing environments.
The interfaces let the user work through a web browser and carry out the following actions:
• authorize for launching a software package and be certified in the Virtual Organization;
• prepare a job (including creating and editing input data and configuration files) in compliance with the package requirements;
• start an applied package in the grid polygon infrastructure (if necessary, on any resource or on a chosen one);
• monitor job execution (including stopping and restarting);
• on job completion, receive the results of the distributed calculations.
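The five actions listed above form a small job life cycle. The sketch below models it with an in-memory stand-in; it makes no real grid or portal calls, and the class, method and state names are invented for illustration only.

```python
from enum import Enum


class State(Enum):
    PREPARED = "prepared"
    RUNNING = "running"
    STOPPED = "stopped"
    DONE = "done"


class PortalSession:
    """In-memory stand-in for a web-portal session (hypothetical API)."""

    def __init__(self, user: str, vo_members: set):
        # Action 1: authorization against the Virtual Organization.
        if user not in vo_members:
            raise PermissionError(f"{user} is not a member of the VO")
        self.user = user
        self.jobs = {}

    def prepare(self, job_id: str, input_text: str) -> None:
        # Action 2: store the input data for a new job.
        self.jobs[job_id] = {"input": input_text,
                             "state": State.PREPARED,
                             "resource": None}

    def start(self, job_id: str, resource: str = "any") -> None:
        # Action 3: launch (or restart) on any resource or a chosen one.
        job = self.jobs[job_id]
        if job["state"] not in (State.PREPARED, State.STOPPED):
            raise RuntimeError(f"cannot start from {job['state'].value}")
        job["state"] = State.RUNNING
        job["resource"] = resource

    def stop(self, job_id: str) -> None:
        # Action 4 (part): stop a running job.
        job = self.jobs[job_id]
        if job["state"] is not State.RUNNING:
            raise RuntimeError("only a running job can be stopped")
        job["state"] = State.STOPPED

    def status(self, job_id: str) -> State:
        # Action 4 (part): monitoring.
        return self.jobs[job_id]["state"]

    def finish(self, job_id: str) -> None:
        # Simulates the grid reporting that the job has completed.
        if self.jobs[job_id]["state"] is not State.RUNNING:
            raise RuntimeError("only a running job can finish")
        self.jobs[job_id]["state"] = State.DONE

    def results(self, job_id: str) -> str:
        # Action 5: results are available only after completion.
        job = self.jobs[job_id]
        if job["state"] is not State.DONE:
            raise RuntimeError("job has not finished")
        return f"results for input: {job['input']}"


if __name__ == "__main__":
    session = PortalSession("alice", vo_members={"alice"})
    session.prepare("job1", "example input")
    session.start("job1")
    session.stop("job1")
    session.start("job1", resource="site-a")
    session.finish("job1")
    print(session.status("job1"), session.results("job1"))
```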
Example of a web-interface module for the
GAMESS quantum-chemical package
(loading and editing of input files)
Module of the web interface for full-scale preparation of data and
configuration files for the GAMESS-US quantum-chemical package in
distributed environments
Example of a web-interface module for the
GAMESS quantum-chemical package
(resource selection)
Example of a web-interface module for the
GAMESS quantum-chemical package
(job monitoring)
Problems in creating and
developing applied GRID services
• High costs of creating and operating computing resources and infrastructure; lack of steady support for the Russian grid polygons;
• Absent or low interest from industry and business:
– a generally low culture of technological development in most scientific and industrial organizations;
– inability and unwillingness to use the results of mathematical modeling and high-performance computing.
• Lack of skilled professionals, both among computer specialists (systems analysts, programmers) and among users (chemists, engineers);
• Practical absence of domestic applied packages, which leads to high software cost, license restrictions, a limited scope of use and difficulty of development;
• Poorly settled legal and economic relations between the owners of computing services (including GRID) and their consumers
Suggested ways to solve problems
• Stimulating users (industrial ones first of all) to use computing GRID services:
– free or minimally priced use of GRID resources at the first stages of innovative development;
– training of users, both in person and interactively online; creation of informational web resources for learning to work with the interfaces of applied packages and with GRID resources;
– adding courses on applied software packages to the training plans of applied specialists (engineers, chemists, etc.), with students carrying out mandatory calculations for real scientific, technical and engineering tasks on supercomputers and GRID infrastructures;
– creation of user-friendly, web-oriented interfaces to GRID resources with a large number of base templates;
– introduction by the Russian state, as a certification condition in the development of some commodity products and materials, of the mandatory use of mathematical modeling and testing
Suggested ways to solve problems (continued)
• Government policy and stimulation of developers of applied GRID services and resources:
– creation of a State-owned (or shared-equity), stable, continuously running computing infrastructure (supercomputers, medium-sized clusters, high-speed communication channels, etc.);
– State compensation of part of the operating costs of supercomputer centers for supporting distributed calculations (grants or budgetary financing);
– tax incentives for the development of domestic applied software, creation of innovation parks focused on it;
– improvement of regulatory, legal and economic regulation in the field of strategic information technologies;
– in State education: (a) expanding the number of state-funded university places for professions such as "system administration/programming", "application programming", etc.; (b) involving a wide range of undergraduate and graduate students in developing the necessary system and application software as part of their studies; (c) training university teachers in these specialties.
Conclusions
Within the IPCP GRID Resource Center, integrated into the National Nanotechnological Network and the pilot zone of the Russian grid system for high-performance computing, access is provided to a number of quantum-chemical and molecular dynamics applied packages for large-scale calculations on the Russian grid polygons, using both simplified and problem-oriented high-level web interfaces
THANKS FOR YOUR ATTENTION!!!!