Introduction to ARSC Systems and Services
Derek Bastille
bastille@arsc.edu
907-450-8643
User Consultant / Group Lead

Outline
• About ARSC
• ARSC Compute Systems
• ARSC Storage Systems
• Available Software
• Account Information
• Utilization / Allocations
• Questions

About ARSC
• We are not a government site
  – Owned and operated by the University of Alaska Fairbanks
  – Focus on Arctic and other polar regions
    • Involved in the International Polar Year
  – Host a mix of HPCMP and non-DoD users
  – On-site staff and faculty perform original research in fields such as Oceanography, Space Weather Physics, Volcanology and Large Text Retrieval

About ARSC
• Part of the HPCMP as an Allocated Distributed Center
  – Participated in various Technology Insertion initiatives
    • New Cray system acquired as part of TI-08
  – Allocate 70% of available cycles to HPCMP projects and users
  – Locally allocate the remaining 30% to non-DoD universities and other government agencies
  – Connectivity is primarily via a DREN OC-12 link
  – Host Service Academy cadets during the summer, along with other academic interns

About ARSC
• An Open Research Center
  – All ARSC systems are 'open research'
  – Only unclassified and non-sensitive data
  – Can host US citizens or foreign nationals who do not have a National Agency Check (NAC)
  – Also host undergraduate and graduate courses through the University of Alaska
    • Happy to work with other universities for student and class system usage

About ARSC
• Wide variety of projects across the Computational Technology Areas (CTAs):

  CTA   CTA Name
  OTH   Other
  CSM   Computational Structural Mechanics
  CFD   Computational Fluid Dynamics
  CCM   Computational Chemistry and Materials Science
  CEA   Computational Electromagnetics and Acoustics
  CWO   Climate/Weather/Ocean Modeling and Simulation
  SIP   Signal/Image Processing
  FMS   Forces Modeling and Simulation/C4I
  EQM   Environmental Quality Modeling and Simulation
  CEN   Computational Electronics and Nanoelectronics
  IMT   Integrated Modeling and Test Environments
  SAP   Space/Astrophysics

About ARSC
[Chart: CTA comparison for CY2007 showing number of projects, number of jobs and CPU hours by CTA]

ARSC Systems
• Iceberg [AK6] – IBM Power4 (800 cores), 5 TFlops peak
  – 92 p655+ nodes (736 1.5 GHz CPUs)
  – 2 p690 nodes (64 1.7 GHz CPUs)
  – 25 TB disk
  – Will be retired on 18 July 2008

ARSC Systems
• Midnight [AK8]
  – SuSE Linux Enterprise 9.3
  – All nodes have 4 GB of memory per core
  – 358 Sun Fire X2200 nodes
    • 2 dual-core 2.6 GHz Opterons per node
  – 55 Sun Fire X4600 nodes
    • 8 dual-core 2.6 GHz Opterons per node
  – Voltaire InfiniBand switch
  – PBS Pro
  – 68 TB Lustre filesystem

ARSC Systems
• Pingo – Cray XT5
  – 3,456 2.6 GHz Opteron cores
    • 4 GB per core, 13.5 TB total memory
  – 432 nodes
  – 31.8 TFlops peak
  – SeaStar interconnect
  – 150 TB storage
  – Working towards FY2009 availability (October 2008)

ARSC Systems - Storage
• Seawolf / Nanook
  – Sun Fire 6800, 8 x 900 MHz CPUs
    • 16 GB total memory
  – 20 TB local disk (seawolf)
  – 10 TB local disk (nanook)
  – Fibre Channel to the STK silo
  – $ARCHIVE NFS mounted
• StorageTek silo
  – SL8500
  – > 3 PB theoretical capacity
  – STK T10000 & T9940 drives
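Because $ARCHIVE is NFS-mounted on the compute systems, moving results to long-term storage is an ordinary copy. A minimal sketch follows; the $WORKDIR scratch variable, the directory names and the tar options are illustrative assumptions rather than ARSC-prescribed settings:

  # Bundle a finished run and copy it to the NFS-mounted archive area.
  # $ARCHIVE is set by the login environment; $WORKDIR is assumed here
  # to point at the job's scratch directory.
  cd $WORKDIR/my_run
  tar czf results_20080315.tar.gz output/
  mkdir -p $ARCHIVE/my_run
  cp results_20080315.tar.gz $ARCHIVE/my_run/
  ls -l $ARCHIVE/my_run/        # verify the copy before cleaning up scratch space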
ARSC Systems - Data Analysis
• Discovery Lab
  – MD Flying Flex
  – Multichannel audio and video
  – Located on the UAF campus
• Other Linux / OS X workstations available for post-production, data analysis, animation and rendering
• Access Grid nodes
• Collaborations with many UAF departments

ARSC Systems - Software
• All the usual suspects
  – Matlab, ABAQUS, NCAR, Fluent, etc.
  – GNU tools, various libraries, etc.
  – Several HPCMP Consolidated Software Initiative packages and tools
• Several compilers on Midnight
  – PathScale, GNU, Sun Studio
• www.arsc.edu/support/resources/software.phtml

Access Policies
• Similar access policies to other HPCMP centers
  – All logins to ARSC systems are via kerberized clients
    • ssh, scp, kftp, krlogin
  – ARSC issues SecurID cards for the ARSC.EDU Kerberos realm
  – Starting to implement PKI infrastructure
    • PKI is still a moving target for HPCMP at this point
  – All ARSC systems undergo regular HPCMP CSA checks and the DITSCAP/DIACAP process

Access Policies
• 'Open Center Access'
  – Only HPCMP center to be Open Access for all systems
  – National Agency Checks not required
  – Nominal restrictions on foreign nationals
    • Must apply from within the US
    • Must not be in the TDOTS
    • Must provide a valid passport and entry status
  – Information Assurance Awareness training is required

Access Policies
• Security policies: www.arsc.edu/support/policy/secpolicy.html
  – Dot-file permissions and some contents are routinely checked by scripts
  – Kerberos passphrases expire every 180 days
  – Accounts are placed in an 'inactive' status after 180 days without a login
  – Please ask us if you have any questions

Application Process - DoD
• HPCMP users need to use pIE and work with their S/AAA
• ARSC has a cross-realm trust with the other MSRCs, so principals such as HPCMP.HPC.MIL can be used
• We are assuming that most UH researchers will be applying for non-DoD accounts

Application Process - Non-DoD
• Non-DoD users and projects are handled internally by ARSC
  – www.arsc.edu/support/accounts/acquire.html
    • Application forms and procedures
    • ARSC will issue and send the SecurID cards
  – Allocations are based on the federal FY (1 Oct - 30 Sep)
  – Granting of resources depends on how much of the 30% allocation remains
    • Preference given to UA researchers and affiliates and/or Arctic-related science

Application Process - Non-DoD
• You may apply for a project if you are a qualified faculty member or researcher
  – Students cannot be a Primary Investigator
    • A faculty sponsor is required, but the sponsor does not need to be an actual 'user' of the systems
    • Students are then added to the project as users
  – PIs are asked to provide a short annual report outlining project progress and any published results
• Allocations of time are granted to projects
  – Start-up accounts have a nominal allocation
  – Production projects have allocations based on need and availability

Application Process - Non-DoD
• Users apply for access as part of the project
  – PIs need to email approval before we add any user to a project
  – ARSC will mail a SecurID card (US Express Mail) once the account has been created
  – A few things are needed to activate the account
    • Signed Account Agreement and SecurID receipt
    • IAA training completion certificate
    • Citizenship/ID verification
      – See: www.arsc.edu/support/accounts/acquire.html#proof_citizenship
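Once an account is active and the SecurID card has arrived, a first session starts by obtaining a Kerberos ticket and then connecting with a kerberized client. The sketch below is only illustrative: it assumes a kerberized kinit/ssh installation on the local workstation and uses midnight.arsc.edu as a placeholder login host, so check the account materials for the actual client names and hostnames:

  # Get a ticket in the ARSC.EDU realm using the SecurID passcode,
  # then connect and copy files with kerberized clients.
  kinit username@ARSC.EDU                      # prompts for PIN + SecurID tokencode
  klist                                        # confirm the ticket was issued
  ssh username@midnight.arsc.edu               # kerberized login, no password over the wire
  scp input.dat username@midnight.arsc.edu:    # kerberized copy works the same way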
ARSC Systems - Utilization
• Job usage is compiled and uploaded to a local database daily
• Allocation changes are posted twice daily
• PIs are automatically notified when their project exceeds 90% of its allocation and when it runs out of allocation
• Users can check usage by invoking show_usage
  – show_usage -s reports on all allocated systems
  – More detailed reports are available upon request

ARSC Systems - Utilization
Sample utilization summary email:

  To: <the PI>
  From: ARSC Accounts <hpc-accounts@arsc.edu>
  Subject: ARSC: midnight Project Utilization and Allocation Summary

  Consolidated CPU Utilization Report
  ========================================
  FY: 2008
  ARSC System: midnight
  ARSC Group ID: <GROUP>
  Primary Investigator: <PI Name>

  Cumulative usage summary for October 1, 2007 through March 15, 2008.

                Foreground   Background        Total
                ----------   ----------   ----------
  Allocation     150000.00
  Hours Used     126432.97         2.59    126435.56
                 ======================================
  Remaining       23567.03 ( 15.71%)

  NOTE: In order to monitor the usage of your project on a regular
  basis, you can invoke the show_usage command on any allocated ARSC
  system. If you have any questions about your allocation and/or
  usage, please contact us.

  Regards,
  ARSC HPC Accounts
  [email] hpc-accounts@arsc.edu
  [voice] 907-450-8602
  [fax] 907-450-8601

ARSC Systems - Queues
• Invoke 'news queues' on iceberg to see current queues
  – LoadLeveler is used for scheduling
  – http://www.arsc.edu/support/howtos/usingloadleveler.html

  Name       MaxJobCPU      MaxProcCPU     Free   Max    Description
             d+hh:mm:ss     d+hh:mm:ss     Slots  Slots
  ---------  -------------  -------------  -----  -----  ------------------------------
  data       00:35:00       00:35:00       14     14     12 hours, 500mb, network nodes
  debug      1+08:05:00     1+08:05:00     16     32     01 hours, 4 nodes, debug
  p690       21+08:05:00    21+08:05:00    64     64     08 hours, 240gb, 64 cpu
  single     56+00:05:00    56+00:05:00    113    704    168 hours, 12gb, 8 cpu
  bkg        85+08:05:00    85+08:05:00    100    704    08 hours, 12gb, 256 cpu
  standard   170+16:05:00   170+16:05:00   113    704    16 hours, 12gb, 256 cpu
  challenge  768+00:05:00   768+00:05:00   113    704    48 hours, 12gb, 384 cpu
  special    unlimited      unlimited      113    736    48 hours, no limits
  cobaltadm  unlimited      unlimited      3      4      cobalt license checking

ARSC Systems - Queues
• Invoke 'news queues' on Midnight to see current queues
  – PBS Pro is used for scheduling (a sample batch script sketch appears at the end of these notes)

  Queue       Min Procs  Max Procs  Max Walltime  Notes
  ----------  ---------  ---------  ------------  -------
  standard    1          16         84:00:00      See (A)
              17         256        16:00:00
              257        512        12:00:00
  challenge   1          16         96:00:00      See (B)
              17         256        96:00:00      See (C)
              257        512        12:00:00
  background  1          512        12:00:00
  debug       1          32         00:30:00      See (D)

ARSC Systems - Help
• Each system has a 'Getting Started' guide
  – www.arsc.edu/support/howtos/usingsun.html
  – www.arsc.edu/support/howtos/usingp6x.html
• The HPC News email letter has many great tips and suggestions
  – www.arsc.edu/support/news/HPCnews.shtml
• Help Desk consultants are quite talented and able to help with a wide variety of issues

Contact Information
ARSC Help Desk
Mon - Fri, 08:00 - 17:00 (Alaska Time)
907-450-8602
consult@arsc.edu
www.arsc.edu/support/support.html

Questions?
[Photo: Michelle Phillips, 2007 Yukon Quest; finished 4th in the 2008 Quest (11 days, 10 hrs, 21 mins). Photo by Carsten Thies, yukonquest.com]
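Appendix: the batch script sketch referenced on the Midnight queue slide. It shows the general shape of a PBS Pro job; the standard queue name comes from the queue table, while the resource-selection syntax, walltime, MPI launcher and program name are assumptions to adapt to your own project:

  #!/bin/bash
  #PBS -N example_job              # job name
  #PBS -q standard                 # queue from the Midnight queue table
  #PBS -l select=4:ncpus=4         # 16 cores total; exact select syntax is an assumption
  #PBS -l walltime=08:00:00        # must fit within the queue's walltime limit
  #PBS -j oe                       # merge stdout and stderr

  cd $PBS_O_WORKDIR                # run from the submission directory
  mpirun -np 16 ./my_app           # launch the (hypothetical) MPI executable

Submit the script with qsub and monitor it with qstat -u $USER; show_usage reflects the charged hours once the accounting database is next updated.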