Building a computational cluster managed by ARC NorduGrid

Alexey N. Makarov
MakarovAlexey@gmail.com
Saint-Petersburg State University, Department of Physics
March 2008 - Saint-Petersburg - JASS2008

Outline
- Computational clusters: the necessity of integration
- What is the Grid?
- Aims of the Grid
- Existing Grid projects
- Introduction to ARC NorduGrid
- Creating a computational cluster
- Installation and configuration of the ARC server
- Practical use of the created system

Computational clusters: necessity of integration
- High-performance clusters are deployed to improve performance and availability over that provided by a single computer, and are used to run time-intensive computations.
- Scientists now pose problems that no single cluster can solve in the time available. Such problems can be solved on a set of clusters.
- Clusters, storage elements, specialized software, scientific equipment and other resources located at one university often need to be used in a uniform way by scientists from another university, city or even country.

What is a Grid?
A Grid is a system that
- coordinates resources that are not subject to centralized control,
- using standard, open, general-purpose protocols and interfaces,
- to deliver non-trivial quality of service.
I. Foster, “What is the Grid?” (2002)

Grid is a type of parallel and distributed system that enables the sharing, selection, and aggregation of geographically distributed "autonomous" resources dynamically at runtime depending on their availability, capability, performance, cost, and users' quality-of-service requirements.
Rajkumar Buyya, “Convergence Characteristics for Clusters, Grids, and P2P networks”, Panel at the P2P conference, Linkoping, Sweden

Aims of the Grid
- Provide reliable, stable, universal, inexpensive access to resources
- Distribute resources within dynamic virtual organizations
- Aggregate geographically distributed autonomous resources
- Provide single sign-on and subsequent use of all available resources
- Provide an easy way to perform administration tasks

Are the Web and P2P some kind of Grid?
The answer is NO.
- The Web supports access to distributed resources but does NOT coordinate the use of these resources.
- P2P does NOT provide the necessary QoS.
- A Grid is NOT a big cluster, because of its management system.

What kind of applications require a Grid?
- Computation-intensive (large-scale simulation)
- Data-intensive (experimental data analysis)
- Distributed collaboration (online instrumentation, specialized software)

Existing Grids

Why ARC?
- Stable operation and good testing
- Support and continuous development by the NorduGrid collaboration
- Easy integration with cluster tools and application software
- Lightweight client part

Introduction to ARC NorduGrid
ARC middleware implements fundamental Grid services:
- The usual Grid security: single sign-on, Grid ACLs (GACL), VOs (VOMS)
- Job submission: direct or via matchmaking and brokering
- Information services: resource aggregation, representation, discovery and monitoring
- Core data management functionality:
  - Automated seamless input/output data movement
  - Data indexing (RLS), client-side data movement
- Job monitoring and management
- Logging service
Built upon standard open-source solutions and protocols:
- Globus Toolkit® pre-WS API and libraries (no services!)
- OpenLDAP, OpenSSL, SASL, SOAP, GridFTP, GSI

Introduction to ARC NorduGrid architecture
ARC is based on independent services that interact with each other:
- Grid Manager - handles job management upon client request, interfaces to the LRMS
- GridFTP server - performs most data movement
- Grid Infosys - publishes resource and job information via LDAP
- Httpsd - HTTPS server for the Smart Storage Element, performs secure data movement
- Monitor - Web interface to the NorduGrid Information System
- Client - a lightweight User Interface with a built-in Resource Broker
- Authorization via a certificate scheme
- Certificate Authority centres

ARC components
Picture from “NorduGrid, the middleware and related projects”, Oxana Smirnova, NDGF/Lund University, GRID06, June 26, 2006, Dubna

Creating a computational cluster
- Tuning the operating system (sshd, NFS, firewall)
- Installing and configuring PBS (TORQUE-2.1.6) on the front-end server and nodes
- Installing software that supports MPI2 and OpenMP parallel programming (MPICH-2.0 and gcc-4.2.0)
- Testing the performance of the computational cluster

High Performance Linpack Benchmark
- HPL is a benchmark for testing the performance of computational clusters.
- The Top500 list, a ranking of the highest-performing systems in the world, is compiled from HPL results.
- On our cluster the result is 5.136e+01 Gflops.

Mounting equipment
- Computational nodes (w3, w4, w7, w8): 2 x Intel Xeon Dual Core 3.0GHz, 2 x 2048MB DDR ECC REG
- Front-end server (ap8.gridzone.ru): Intel Pentium 4 Dual Core 3.2GHz, 2 x 1024MB DDR2 ECC
- GigEthernet LAN: Cisco Catalyst 2960G
- OS: Scientific Linux 4.4

Site configuration
Front-end server (ap8.gridzone.ru):
- Local Resource Management System and cluster tools: PBS Server, PBS Scheduler, Ganglia
- External dependencies: GPT, Globus Toolkit® packages, VOMS, Python, MySQL, libxml2 libraries
- ARC middleware: Grid Manager, Grid Infosys, GridFTP, SSE, Grid Monitor, LocalCA
- SimpleCA
Cluster (32 CPU):
- PBS Client, MPICH-2.0, gcc-4.2.0, Ganglia client
Notes:
- The Local Resource Management System and cluster tools are installed independently of ARC.
- ARC middleware is installed on the front-end server only.
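
In the architecture described above, the client passes a job description to the Grid Manager, which hands the job to the LRMS (PBS on this site). ARC job descriptions are written in xRSL. The following is a minimal sketch of how such a description could be assembled. The helper names `xrsl_value` and `to_xrsl`, the job name and the file names are hypothetical illustrations, not part of the original slides, and only a few common attributes (executable, stdout, stderr, jobName, arguments) are shown.

```python
def xrsl_value(value):
    """Render one attribute value as quoted xRSL text.

    Lists and tuples become space-separated quoted strings,
    e.g. ["a", "b"] -> "a" "b".
    """
    if isinstance(value, (list, tuple)):
        return " ".join(xrsl_value(item) for item in value)
    text = str(value)
    if '"' in text:
        # Quote escaping is deliberately left out of this sketch.
        raise ValueError("double quotes are not handled in this sketch")
    return f'"{text}"'


def to_xrsl(attributes):
    """Join (name=value) relations into one '&'-prefixed description."""
    relations = "".join(
        f"({name}={xrsl_value(value)})" for name, value in attributes.items()
    )
    return "&" + relations


# Hypothetical test job: run /bin/hostname on whichever node PBS selects.
job = {
    "executable": "/bin/hostname",
    "stdout": "hostname.txt",
    "stderr": "hostname.err",
    "jobName": "hello-grid",
}
description = to_xrsl(job)
print(description)
```

The resulting string can be saved to a file and handed to the ARC client for submission. The exact client options depend on the installed ARC version and should be taken from its documentation.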
Local Certificate Authority Centre
- User certificates for local resources
- Host certificates make it possible to create a local grid infrastructure
- Simple management of user certificates

Testing the ARC front-end server
- Information system (LdapBrowser, ldapsearch, grid-monitor, ngtest)
- Data management system (ngls, ngcp, ngrm, ngcat, ngtest)
- Job management system (ngsub, ngstat, ngget, ngresub, ngkill, ngrenew, ngtest)

Monitoring systems
- Grid Monitor is a Web interface to the NorduGrid Information System
- Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters
- Nagios is an enterprise-class monitoring solution for hosts, services, and networks
- Logger is an independently developed system for gathering information about Grid jobs

General Grid Monitor

Data from the information system via Ldap Browser

Ganglia

Resource utilization
- Computations done for the «Internet Mathematics 2007» project supported by Yandex®
- Use in the «Grid technology» course for student education
- Research by scientists from the Grid community, e.g. Olav Syljuasen, Pavel Lihatov, Antti Hyvarinen and others

References
1. I. Foster, C. Kesselman, S. Tuecke, «The Anatomy of the Grid», 2000
2. I. Foster, C. Kesselman, S. Tuecke, J. M. Nick, «The Physiology of the Grid», 2002
3. I. Foster, «What is the Grid? A Three Point Checklist», 2002
4. Rajkumar Buyya, «Convergence Characteristics for Clusters, Grids, and P2P networks», Panel at the P2P conference, Linkoping, Sweden
5. Oxana Smirnova, «NorduGrid, the middleware and related projects», NDGF/Lund University, GRID06, June 26, 2006, Dubna
6. M. Ellert, A. Konstantinov, B. Kónya, O. Smirnova, A. Wäänänen, Architecture Proposal, NORDUGRID-TECH-1, 2002
7. http://www.globus.org
8. http://www.nordugrid.org
9. http://www.gridcomputing.com
10. http://www.gridclub.ru

Thank you for your attention!