Introduction to ARSC
Systems and Services
Derek Bastille
bastille@arsc.edu
907-450-8643
User Consultant/Group Lead
Outline
• About ARSC
• ARSC Compute Systems
• ARSC Storage Systems
• Available Software
• Account Information
• Utilization / Allocations
• Questions
About ARSC
• We are not a government site.
– Owned and operated by the University of Alaska Fairbanks
– Focus on the Arctic and other polar regions
• Involved in the International Polar Year
– Host a mix of HPCMP and non-DoD users
– On-site staff and faculty perform original research in fields such as oceanography, space weather physics, volcanology and large text retrieval
About ARSC
• Part of the HPCMP as an Allocated Distributed Center
– Participated in various Technology Insertion initiatives
• New Cray system acquired as part of TI-08
– Allocate 70% of available cycles to HPCMP projects and users
– Locally allocate the remaining 30% to non-DoD universities and other government agencies
– Connectivity is primarily via DREN OC-12
– Host Service Academy cadets during the summer, along with other academic interns
About ARSC
• An Open Research Center
– All ARSC systems are ‘open research’
– Only unclassified and non-sensitive data
– Can host US citizens, as well as foreign nationals who do not have a National Agency Check (NAC)
– Also host undergraduate and graduate courses through the University of Alaska
• Happy to work with other universities for student and class system usage
About ARSC
• Wide variety of projects
– Computational Technology Areas (CTAs):

CTA   CTA Name
OTH   Other
CSM   Computational Structural Mechanics
CFD   Computational Fluid Dynamics
CCM   Computational Chemistry and Materials Science
CEA   Computational Electromagnetics and Acoustics
CWO   Climate/Weather/Ocean Modeling and Simulation
SIP   Signal/Image Processing
FMS   Forces Modeling and Simulation/C4I
EQM   Environmental Quality Modeling and Simulation
CEN   Computational Electronics and Nanoelectronics
IMT   Integrated Modeling and Test Environments
SAP   Space/Astrophysics
About ARSC
[Chart: CTA comparison for CY2007 by number of projects, number of jobs and CPU hours across OTH, CSM, CFD, CCM, CEA, CWO, SIP, FMS, EQM, CEN, IMT and SAP]
ARSC Systems
Iceberg [AK6]
– IBM Power4 (800 cores), 5 TFLOPS peak
– 92 p655+ nodes (736 1.5 GHz CPUs)
– 2 p690 nodes (64 1.7 GHz CPUs)
– 25 TB disk
– Will be retired on 18 July 2008
ARSC Systems
Midnight [AK8]
– SuSE Linux 9.3 Enterprise
– All nodes have 4 GB of memory per core
– 358 Sun Fire X2200 nodes
• 2 dual-core 2.6 GHz Opterons per node
– 55 Sun Fire X4600 nodes
• 8 dual-core 2.6 GHz Opterons per node
– Voltaire InfiniBand switch
– PBS Pro
– 68 TB Lustre filesystem
ARSC Systems
Pingo
– Cray XT5
– 3,456 2.6 GHz Opteron cores
• 4 GB per core
• 13.5 TB total memory
– 432 nodes
– 31.8 TFLOPS peak
– SeaStar interconnect
– 150 TB storage
– Working towards FY2009 availability (October 2008)
ARSC Systems - Storage
Seawolf / Nanook
– Sun Fire 6800
– 8 × 900 MHz CPUs
• 16 GB total memory
– 20 TB local disk (seawolf)
– 10 TB local disk (nanook)
– Fibre Channel to STK silo
– $ARCHIVE NFS mounted
StorageTek Silo
– SL8500
– > 3 PB theoretical capacity
– STK T10000 & T9940 drives
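Because $ARCHIVE is NFS mounted, long-term copies can be made with ordinary shell commands; a minimal sketch (the file and directory names below are placeholders, not ARSC conventions):

  cp results_run42.tar $ARCHIVE/        # copy a results tarball into archival storage
  ls -l $ARCHIVE/                       # confirm the copy landed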
ARSC Systems - Data Analysis
• Discovery Lab
– MD Flying Flex
– Multichannel audio and video
– Located on the UAF campus
• Other Linux / OS X workstations available for post-production, data analysis, animation and rendering
• Access Grid nodes
• Collaborations with many UAF departments
ARSC Systems - Software
• All the usual suspects
– Matlab, ABAQUS, NCAR, Fluent, etc.
– GNU tools, various libraries, etc.
– Several HPCMP Consolidated Software Initiative packages and tools
• Several compilers on Midnight
– PathScale, GNU, Sun Studio
www.arsc.edu/support/resources/software.phtml
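As a quick illustration, a serial build on Midnight might look like the sketch below; the source file name is a placeholder, and pathcc/gcc are the standard PathScale and GNU driver names, so check the software page above for the exact versions and flags ARSC recommends:

  pathcc -O3 -o hello hello.c     # PathScale C compiler
  gcc -O2 -o hello hello.c        # GNU C compiler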
Access Policies
• Similar access policies to other HPCMP centers
– All logins to ARSC systems are via Kerberized clients
• ssh, scp, kftp, krlogin
• ARSC issues SecurID cards for the ARSC.EDU Kerberos realm
• Starting to implement PKI infrastructure
– PKI is still a moving target for the HPCMP at this point
– All ARSC systems undergo regular HPCMP CSA checks and the DITSCAP/DIACAP process
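A typical Kerberized login, as a rough sketch (the username and hostname are illustrative; authenticate with your passphrase and SecurID code when kinit prompts):

  kinit someuser@ARSC.EDU          # obtain a Kerberos ticket in the ARSC.EDU realm
  ssh someuser@midnight.arsc.edu   # Kerberized ssh authenticates with that ticket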
Access Policies
• ‘Open Center Access’
– Only HPCMP center to be Open Access for all systems
– National Agency Checks not required
– Nominal restrictions on foreign nationals
• Must apply from within the US
• Must not be in the TDOTS
• Must provide a valid passport & entry status
– Information Assurance Awareness training is required
Access Policies
– Security Policies
www.arsc.edu/support/policy/secpolicy.html
• Dot file permissions and some contents are routinely checked by scripts
• Kerberos passphrases expire every 180 days
• Accounts are placed in an ‘inactive’ status after 180 days without a login
• Please ask us if you have any questions
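As a rule of thumb, keeping dot files free of group/world write access avoids most findings from those checks; a hedged example of reviewing and tightening permissions:

  ls -la ~/.cshrc ~/.profile ~/.ssh    # review current permissions
  chmod go-w ~/.cshrc ~/.profile       # remove group/world write access
  chmod 700 ~/.ssh                     # keep the ssh configuration directory private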
Application Process - DoD
• HPCMP users need to use pIE and work with their S/AAA
• ARSC has a cross-realm trust with the other MSRCs, so principals such as HPCMP.HPC.MIL can be used
• We are assuming that most UH researchers will be applying for non-DoD accounts
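In practice that means an HPCMP principal can authenticate in its home realm and then reach ARSC hosts directly; a rough sketch (username and hostname are placeholders):

  kinit someuser@HPCMP.HPC.MIL     # ticket from the HPCMP realm
  ssh someuser@midnight.arsc.edu   # accepted at ARSC via the cross-realm trust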
Application Process - Non-DoD
– Non-DoD users and projects are handled internally by ARSC
– www.arsc.edu/support/accounts/acquire.html
• Application forms and procedures
• ARSC will issue and send the SecurID cards
– Allocations are based on the federal FY (1 Oct - 30 Sep)
– Granting of resources depends on how much of the 30% allocation remains
• Preference is given to UA researchers and affiliates and/or Arctic-related science
Application Process - Non-DoD
– You may apply for a project if you are a qualified faculty member or researcher
• Students cannot be a Principal Investigator
– A faculty sponsor is required, but the sponsor does not need to be an actual ‘user’ of the systems
– Students are then added to the project as users
• PIs are requested to provide a short annual report outlining project progress and any published results
• Allocations of time are granted to projects
– Start-up accounts have a nominal allocation
– Production projects have allocations based on need and availability
Application Process - Non-DoD
– Users apply for access as part of the project
– PIs will need to email approval before we add any user to a project
– ARSC will mail a SecurID card (US Express Mail) once the account has been created
– A few things are needed to activate the account
• Signed Account Agreement and SecurID receipt
• IAA training completion certificate
• Citizenship/ID verification
– See: www.arsc.edu/support/accounts/acquire.html#proof_citizenship
ARSC Systems - Utilization
– Job usage is compiled and uploaded to a local database daily
– Allocation changes are posted twice daily
– PIs are automatically notified when their project exceeds 90% of its allocation and when it runs out of allocation
– Users can check usage by invoking show_usage
• show_usage -s for all allocated systems
• More detailed reports are available upon request
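For example, from a login session on any allocated system, the two invocations mentioned above are:

  show_usage        # usage and remaining allocation on the current system
  show_usage -s     # summary across all allocated ARSC systems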
ARSC Systems - Utilization
To: <the PI>
From: ARSC Accounts <hpc-accounts@arsc.edu>
Subject: ARSC: midnight Project Utilization and Allocation Summary
Consolidated CPU Utilization Report
========================================
FY: 2008
ARSC System: midnight
ARSC Group ID: <GROUP>
Primary Investigator: <PI Name>
Cumulative usage summary for October 1, 2007 through 15 Mar 2008.
                    Foreground   Background        Total
                    ----------   ----------   ----------
Allocation           150000.00
Hours Used           126432.97         2.59    126435.56
                                               ==========
Remaining             23567.03 (  15.71%)
NOTE:
In order to monitor the usage of your project on a regular basis, you can invoke
the show_usage command on any allocated ARSC system.
If you have any questions about your allocation and/or usage, please contact us.
Regards,
ARSC HPC Accounts
[email] hpc-accounts@arsc.edu
[voice] 907-450-8602
[fax] 907-450-8601
ARSC Systems - Queues
– Invoke ‘news queues’ on Iceberg to see the current queues
• LoadLeveler is used for scheduling
http://www.arsc.edu/support/howtos/usingloadleveler.html

Name          MaxJobCPU      MaxProcCPU     Free  Max   Description
              d+hh:mm:ss     d+hh:mm:ss     Slots Slots
------------- -------------- -------------- ----- ----- ------------------------
data              00:35:00       00:35:00    14    14   12 hours, 500mb, network nodes
debug           1+08:05:00     1+08:05:00    16    32   01 hours, 4 nodes, debug
p690           21+08:05:00    21+08:05:00    64    64   08 hours, 240gb, 64 cpu
single         56+00:05:00    56+00:05:00   113   704   168 hours, 12gb, 8 cpu
bkg            85+08:05:00    85+08:05:00   100   704   08 hours, 12gb, 256 cpu
standard      170+16:05:00   170+16:05:00   113   704   16 hours, 12gb, 256 cpu
challenge     768+00:05:00   768+00:05:00   113   704   48 hours, 12gb, 384 cpu
special          unlimited      unlimited   113   736   48 hours, no limits
cobaltadm        unlimited      unlimited     3     4   cobalt license checking
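A minimal LoadLeveler script targeting one of these classes might look like the sketch below; the keywords are standard LoadLeveler directives, but the script is illustrative rather than an ARSC template (see the URL above for the site-specific details):

  #!/bin/ksh
  #@ job_type         = parallel
  #@ class            = standard
  #@ node             = 4
  #@ tasks_per_node   = 8
  #@ wall_clock_limit = 8:00:00
  #@ queue
  ./my_program        # placeholder for the actual executable

Submit the script with llsubmit and monitor it with llq.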
ARSC Systems - Queues
– Invoke ‘news queues’ on Midnight to see the current queues
• PBS Pro is used for scheduling

Queue        Min    Max    Max
             Procs  Procs  Walltime   Notes
----------   -----  -----  ---------  -----------
standard         1     16  84:00:00   See (A)
                17    256  16:00:00
               257    512  12:00:00
challenge        1     16  96:00:00   See (B)
                17    256  96:00:00   See (C)
               257    512  12:00:00
background       1    512  12:00:00
debug            1     32  00:30:00   See (D)
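A corresponding PBS Pro sketch for a 16-processor standard-queue job (resource-request syntax varies between PBS versions and sites, so treat this as illustrative and check ‘news queues’ or the Getting Started pages for the exact form ARSC expects):

  #!/bin/bash
  #PBS -q standard
  #PBS -l ncpus=16
  #PBS -l walltime=84:00:00
  #PBS -j oe
  cd $PBS_O_WORKDIR
  mpirun -np 16 ./my_program   # placeholder executable

Submit with qsub and monitor with qstat.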
ARSC Systems - Help
– Each system has a Getting Started guide
www.arsc.edu/support/howtos/usingsun.html
www.arsc.edu/support/howtos/usingp6x.html
– The HPC News email newsletter has many great tips and suggestions
www.arsc.edu/support/news/HPCnews.shtml
– Help Desk consultants are quite talented and able to help with a variety of issues
Contact Information
ARSC Help Desk
Mon - Fri
08:00 - 17:00 AK
907-450-8602
consult@arsc.edu
www.arsc.edu/support/support.html
Questions?
[Photo: Michelle Phillips (2007 Quest), who finished 4th in the 2008 Quest in 11 days, 10 hrs, 21 mins. Photo by Carsten Thies, Yukonquest.com]