Using Gaussian & GaussView on CHPC Resources
Anita M. Orendt
Center for High Performance Computing
anita.orendt@utah.edu
Fall 2012

Purpose of Presentation
• To discuss usage of both Gaussian and GaussView on CHPC systems
• To provide hints on making efficient use of Gaussian on CHPC resources
• To demonstrate functionality of GaussView

CHPC Clusters and Storage (overview)
• Sanddunearch – 156 nodes/624 cores; Infiniband and GigE
• Updraft – 256 nodes/2048 cores; Infiniband and GigE; 85 general usage nodes plus owner nodes
• Ember – 382 nodes/4584 cores; Infiniband and GigE; 67 general usage nodes
• Turret Arch and Telluride – 12 GPU nodes (6 CHPC)
• Meteorology and administrative nodes
• Storage: NFS home directories; scratch systems – serial (all clusters), general (updraft), /scratch/ibrix/chpc_gen (IBRIX; ember, updraft)

Getting Started at CHPC
• Account application – now an online process
  – https://www.chpc.utah.edu/apps/profile/account_request.php
• Username is your unid, with passwords administered by campus
• Interactive nodes – two per cluster (cluster.chpc.utah.edu) with round-robin access to divide the load
• CHPC environment scripts
  – www.chpc.utah.edu/docs/manuals/getting_started/code/chpc.tcshrc
  – www.chpc.utah.edu/docs/manuals/getting_started/code/chpc.bashrc
• Getting started guide – www.chpc.utah.edu/docs/manuals/getting_started
• Problem reporting system – http://jira.chpc.utah.edu or email to issues@chpc.utah.edu

Security Policies (1)
• No clear-text passwords – use ssh and scp
• Do not share your account under any circumstances
• Don't leave your terminal unattended while logged into your account
• Do not introduce classified or sensitive work onto CHPC systems
• Use a good password and protect it – see gate.acs.utah.edu for tips on good passwords

Security Policies (2)
• Do not try to break passwords, tamper with files, look into anyone else's directory, etc. – your privileges do not extend beyond your own directory
• Do not distribute or copy privileged data or software
• Report suspicions to CHPC (security@chpc.utah.edu)
• See http://www.chpc.utah.edu/docs/policies/security.html for more details

.tcshrc/.bashrc
• Gaussian users need the .tcshrc even if they normally use bash – both files are placed in new accounts
  – The Gaussian setup is under the individual compute cluster sections
  – Uncomment (remove the # from the start of the line) EITHER the line that sources g03.login OR the line that sources g09.login – they are mutually exclusive! (see the sketch below this list)
• The script can be modified for individual needs with a source of .aliases at the end and the creation of an .aliases file
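A minimal sketch of what the relevant fragment of the .tcshrc looks like once the G09 line is chosen; the paths are placeholders, not the actual lines in the CHPC-provided file:

    # Gaussian setup, within your cluster's section of ~/.tcshrc (placeholder paths)
    # source <gaussian03 install path>/g03.login    <- leave this one commented out
    source <gaussian09 install path>/g09.login      # <- uncommented to use G09

After editing, log out and back in (or source ~/.tcshrc) so the Gaussian environment is set before running gv or submitting jobs.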
Gaussian Users Group
• Users also need to be in the gaussian group (check the box on the account application form) – otherwise you will not have permission to run gaussian/gaussview
  – The command groups will show your groups; look for g05

Batch System
• All jobs run on compute nodes accessed through the batch system, with Moab & Torque (PBS) as the scheduler/resource manager
• More or less first-in, first-out, but with backfill for best utilization of resources
• Sanddunearch – no longer allocated – 72 hours max walltime
• Ember – 72 hours max walltime on CHPC nodes; long QOS available
  – can also run in smithp-guest mode (#PBS -A smithp-guest); 24 hours max walltime and preemptable on smithp nodes
• Updraft – 24 hours max walltime on CHPC nodes
  – with an allocation, can also run as preemptable (qos=preemptable in your #PBS -l line); charged 0.25 times the normal rate, but preemptable
• No allocation – can still run in freecycle mode; this mode is preemptable on updraft and ember. It is automatic with no allocation – you cannot choose it
• Special needs/time crunch – talk to us; we do set up reservations

Job Control Commands
• qsub script – to submit a job
• qdel jobnumber – to delete a job from the queue (both waiting and running jobs)
• showq – to see jobs in the queue (add -r, -i, or -b for running, idle, or blocked jobs only)
  – Use with | grep username to focus on your jobs only
  – Idle jobs with reservations have * after the job number
  – If a job is in the blocked section there may be problems
• qstat -a – PBS version of showq; has some different information (also -f jobnumber)
• mshow -a --flags=FUTURE – what resources are currently available to you
• qstat -f jobnumber – valuable information on deferred jobs
• showstart jobnumber – estimate of when your job will start (based on the jobs ahead of yours lasting for their entire requested time); only works for jobs with reservations
• checkjob (-v) jobnumber – more detailed information; error messages at the end
• diagnose -n – shows you activity on the nodes
• A typical submit-and-monitor session is sketched after this list
• More info on the web pages and user guides
  – http://www.chpc.utah.edu/docs/manuals/software/maui.html
  – http://www.chpc.utah.edu/docs/manuals/software/pbs.html
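A minimal sketch of a submit-and-monitor session using the commands above; the script name and job number are illustrative:

    qsub rung09.pbs           # submit the job script; prints the job number
    showq | grep $USER        # see whether your job is idle, running, or blocked
    showstart 123456          # rough estimate of when job 123456 will start
    checkjob -v 123456        # detailed state, with any error messages at the end
    qdel 123456               # remove the job if it is no longer needed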
Gaussian03
• Version E.01 (the final G03 release) installed
  – /uufs/chpc.utah.edu/sys/pkg/gaussian03/E.01 for AMD (sanddunearch)
  – /uufs/chpc.utah.edu/sys/pkg/gaussian03/E.01-EMT64 for Intel (updraft/ember/PI-owned nodes on sanddunearch)
• Main web site: www.gaussian.com
• Have site license for both unix and windows versions
• With G03, GaussView4 is standard
• General information on the CHPC installation
  – http://www.chpc.utah.edu/docs/manuals/software/g03.html
  – http://www.chpc.utah.edu/docs/manuals/software/gv.html
  – Has information on licensing restrictions, example batch scripts, and where to get more information on the specific package
• Gaussian use is restricted to academic research only

Gaussian09
• Version C.01 is current (still have B.01 and A.02 if needed)
  – /uufs/chpc.utah.edu/sys/pkg/gaussian09/EM64T for Intel procs
  – /uufs/chpc.utah.edu/sys/pkg/gaussian09/AMD64 for AMD procs
• Have site license for the unix version only
• Standard version of GaussView with G09 is GV5
• Chemistry groups purchased a Windows license for both G09 and GV5
  – Groups can purchase a share to gain access
• General information on the CHPC installation
  – http://www.chpc.utah.edu/docs/manuals/software/g09.html

GaussView
• Molecular builder and viewer for Gaussian input/output files
• CHPC has campus licenses for the linux version
  – For Gaussian03 – standard is version 4
  – For Gaussian09 – standard is version 5
• Access with gv & – provided you have uncommented the Gaussian setup in the standard .tcshrc
• DO NOT submit jobs from within GaussView – instead create and save the input file and use the batch system
• Examples of how to use GaussView to show MOs, electrostatic potentials, NMR tensors, and vibrations are given on the web page http://faculty.ycp.edu/~jforesma/educ/

Highlights of G03/G09 Differences
• G09 does not use nprocl
  – Limit of about 8 nodes due to a line length issue
• New Restart keyword
  – For property and higher-level calculation restarts
  – Still use the old way for opt and scf restarts
• New, easier way for partial opts/scans
  – See the G09 opt keyword for details
• New capabilities, new methods
  – "What's New in G09" at http://www.gaussian.com/g_prod/g09new.htm
• Improved timings

G03/G09 Script
• Two sections for changes:
    #PBS -S /bin/csh
    #PBS -A account
    #PBS -l walltime=02:00:00,nodes=2:ppn=12
    #PBS -N g03job
• And:
    setenv WORKDIR $HOME/g09/project
    setenv FILENAME input
    setenv SCRFLAG LOCAL
    setenv NODES 2
• A sketch of how these two sections fit together is given after the scratch choices below

Scratch Choices
• LOCAL (/scratch/local)
  – Hard drive local to the compute node
  – 60 GB on SDA; 200 GB on updraft; 400 GB on ember
  – Fastest option – recommended IF this is enough space for your job
  – You do not have access to the scratch files during the run, but log/chk files are written to $WORKDIR
  – Automatically scrubbed at the end of the job
• SERIAL (/scratch/serial)
  – NFS mounted on all clusters (interactive and compute)
  – 15 TB
• GENERAL (/scratch/general)
  – NFS mounted on UPDRAFT compute nodes and all interactive nodes
  – 3.5 TB
• IBRIX (/scratch/ibrix/chpc_gen)
  – Parallel scratch file system (HP IBRIX solution)
  – On UPDRAFT and EMBER compute nodes and on all interactive nodes
  – 55 TB
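A minimal sketch of how the two edited sections fit together in a G09 batch script for a 2-node ember job using local scratch; the allocation, job name, directory, and input base name are placeholders, and the rest of the CHPC-provided template (which stages files and launches g09) is assumed to follow unchanged:

    #PBS -S /bin/csh
    #PBS -A myallocation                  # placeholder allocation/account name
    #PBS -l walltime=02:00:00,nodes=2:ppn=12
    #PBS -N valinomycin_opt               # placeholder job name

    setenv WORKDIR $HOME/g09/project      # directory containing the input file
    setenv FILENAME input                 # base name of the .com input file
    setenv SCRFLAG LOCAL                  # LOCAL, SERIAL, GENERAL, or IBRIX (see above)
    setenv NODES 2                        # should match nodes= in the #PBS -l line

    # ... remainder of the CHPC-provided template, unchanged ...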
Functionality
• Energies
  – MM: AMBER (old version), Dreiding, UFF force fields
  – Semi-empirical: CNDO, INDO, MINDO/3, MNDO, AM1, PM3, PM6
  – HF: closed shell, restricted and unrestricted open shell
  – DFT: many functionals, both pure and hybrid, from which to choose
  – MP: 2nd-5th order; direct and semi-direct methods
  – Other high-level methods such as CI, CC, MCSCF, CASSCF
  – High-accuracy methods such as G1, G2, etc. and CBS

Functionality (2)
• Gradients/geometry optimizations
• Frequencies
• Other properties
  – Population analyses
  – Natural Bond Orbital analysis (NBO5 with G03)
  – Electrostatic potentials
  – NMR shielding tensors
  – J coupling tensors

Input File Structure
• Filename.com
• Free format, case insensitive
• Spaces, commas, tabs, or a forward slash as delimiters between keywords
• ! starts a comment line
• Divided into sections (in order)
  – Link 0 commands (%)
  – Route section – what you want the calculation to do
  – Title
  – Molecular specification
  – Optional additional sections

Input File: Link 0 Commands
• First "Link 0" options
  – %chk
  – %mem
  – %nprocs
• Examples
  – %chk=Filename.chk
  – %mem=2gb
  – %nprocs=4
• Note that nprocl is no longer used in G09

Number of Processors
• %nprocs – number of processors on one node
  – sanddunearch – 4; updraft – 8; ember – 12
  – There are owner nodes on sanddunearch with 8 processors per node

Memory Specification
• Memory usage: the default is 6 MW or 48 MB – all nodes have much more than this!
• If you need more, use the %mem directive
  – Units: words (default), KB, KW, MB, MW, GB, GW
  – The number must be an integer
• Methods to estimate memory needs for select applications are given in Chapter 4 of the User's Guide
• The %mem value must be less than the memory of the node
  – Sanddunearch nodes have 8 GB
  – Updraft nodes have 16 GB
  – Ember nodes have 24 GB

Input - Route Specification
• Keyword line(s) – specify the calculation type and other job options
• Starts with the # symbol
  – for control of the print level in the output file use #n, #t, or #p for normal, terse, or more complete output
  – #p is suggested, as it monitors job progress; useful for troubleshooting problems
• Can be multiple lines
• Terminate with a blank line
• Format
  – keyword=option
  – keyword(option)
  – keyword(option1,option2,...)
  – keyword=(option1,option2,...)
• The User's Guide provides a list of keywords, options, and basis set notation
  – http://www.gaussian.com/g_tech/g_ur/l_keywords09.htm

Input - Title Specification
• Brief description of the calculation – for the user's benefit
• Terminate with a blank line

Input - Molecular Specification
• 1st line: charge, multiplicity
• Element labels and locations
  – Cartesian: label x y z
  – Z-matrix: label atom1 bondlength atom2 angle atom3 dihedral
• If parameters are used instead of numerical values, a variables section follows
• Default units are angstroms and degrees
• Again, end with a blank line
• A complete input file illustrating these sections is sketched below
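A minimal sketch of a complete input file putting the sections above together, a B3LYP optimization plus frequencies on water; the route line, coordinates, and file names are illustrative only, and %nprocs follows the Link 0 examples given earlier:

    %chk=water.chk
    %mem=2gb
    %nprocs=4
    #p b3lyp/6-31g(d) opt freq

    water optimization and frequencies

    0 1
    O    0.000000    0.000000    0.117300
    H    0.000000    0.757200   -0.469200
    H    0.000000   -0.757200   -0.469200

The blank lines after the route section, the title, and the molecular specification are required.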
Parallel Nature of Gaussian
• All runs make use of all cores per node (set with %nprocs)
• Only some portions of Gaussian run parallel across multiple nodes (this includes most of the compute-intensive parts involved in single point energies/optimizations for HF/DFT)
• If the time-consuming links are not parallel, the job WILL NOT benefit from running on more than one node
• NMR and MP2 frequency jobs are examples that do not run parallel across nodes; opt and single point energies tend to scale nicely
• Not all job types are restartable, but more are restartable in G09 than in G03 (e.g., frequencies and NMR) – see the new restart keyword
  – Requires the rwf from the previous run
  – Still restart optimizations and single point energies the old way
• CHPC does allow jobs over the standard walltime limit on ember if needed – but first explore using more nodes or the restart options

Timings: G09 with Varying Scratch System
All jobs on 1 ember node, run at the same time, on valinomycin (C54H90N6O18); timings depend strongly on the amount of I/O and on other jobs' usage of the scratch system

Scratch setting | Med MP2 (882 bf; 536 GB rwf file) | Large opt (1284 bf; 51 opt steps; 1 GB rwf file) | Large freq of same (8 GB rwf file)
local           | 4.5 hrs (used maxdisk)            | 10.25 hrs                                        | 10.5 hrs
serial          | >30 hrs (ran out of time)         | 27.5 hrs                                         | 14.75 hrs
ibrix           | 5.75 hrs                          | 11.75 hrs                                        | 10.75 hrs

Scaling G03/G09
B3PW91; 650 bf; 8 opt steps; wall time in hours (G03/G09)

# quad nodes | wall time  | # 8-way nodes | wall time | # 12-way nodes | wall time
1            | 23.25/18.5 | 1             | 5.75/5.5  | 1              | 3.25
2            | 12/9.25    | 2             | 3/3       | 2              | 1.75
4            | 6.25/5     | 4             | 2/1.75    |                |
8            | 4/3        | 8             | 1/1       |                |
16           | 3          |               |           |                |

DFT Frequency of the Same Case

# quad nodes | wall time  | # 8-way nodes | wall time | # 12-way nodes | wall time
1            | 20.25/13.5 | 1             | 5.5/5     | 1              | 3.25
2            | 11/7.25    | 2             | 3/2.75    | 2              | 1.5
4            | 6/7.5      | 4             | 1.75/1.5  |                |
8            | 4.25/6     | 8             | 1/1       |                |
16           | 3.5        |               |           |                |

SDA/UP Scaling MP2
G03/G09; 8 opt steps; 338 bf; wall time in hours (time with a frequency calculation added in parentheses)

# quad nodes | wall time          | # 8-way nodes | wall time  | # 12-way nodes | wall time
1            | 13.5 / 9.75 (32.5) | 1             | 4 (19)     | 1              | 1 (9.5)
2            | 10 / 5.5 (27.5)    | 2             | 1.5 (16.5) | 2              | 1 (7.5)
4            | 8.5 / 3 (24)       | 4             | <1 (7)     |                |

RWF Sizes – Choice of Scratch
• For a DFT optimization with 462 basis functions – 150 MB RWF
• For a DFT frequency of the above structure – 1.1 GB RWF
• For an MP2 optimization with 462 bf – 55 GB RWF AND a 55 GB SCR file
• For an MP2 frequency of the above structure – 247 GB RWF

GaussView Demos
• Building molecules
• Inputting structures
• Setting up jobs
• Looking at different types of output
  – Optimization runs
  – Frequencies
  – MOs
  – Electron density/electrostatic potential

Any questions – contact me
• anita.orendt@utah.edu
• Phone: 801-231-2762
• Office: 422 INSCC