Copyright © 2010 Altair Engineering, Inc. Proprietary and Confidential. All rights reserved.
PBS Professional Administration training
Rajiv Jaisankar
Technical Specialist
Altair APAC

Chapter One: Understanding PBS Professional
• What is PBS Professional?
• History of PBS Professional
• PBS Works Online Store
• PBS Professional Documentation
• Altair Global Offices & Technical Support
• Broad Hardware and Operating System Support
• Supported MPI Libraries
• PBS Professional Components & Roles

What is PBS Professional?
• Workload management solution that maximizes the efficiency and utilization of high-performance computing (HPC) resources and improves job turnaround.

Robust Workload Management
• Floating flex-based licenses
• Scalability, with flexible queues
• Job arrays
• User and administrator interface
• Job suspend/resume
• Application checkpoint/restart
• Automatic file staging
• Accounting logs
• Access control lists

Advanced Scheduling Algorithms
• Resource-based scheduling
• Preemptive scheduling
• Optimized node sorting
• Enhanced job placement
• Advance & standing reservations
• Cycle harvesting across workstations
• Scheduling across multiple complexes
• Network topology scheduling
• Manages both batch and interactive work

Reliability, Availability and Scalability
• Server failover feature
• Automatic job recovery
• Provides system monitoring
• Provides integration with MPI solutions
• Tested to manage 1,000,000+ jobs per day

History of PBS Professional
1993-97: Developed for NASA to replace NQS
2000: Veridian formed commercial version of PBS; released PBS Professional 5.0
2003: Altair acquired PBS Professional technology and engineering; released PBS Professional 5.3
2004: Released PBS Professional 5.4
2005: Released PBS Professional 7.0 and 7.1
2006: Released PBS Professional 8.0
2007: Released PBS Professional 9.0 and 9.1
2008: Released PBS Professional 9.2 and 10.0
2009: Released PBS Professional 10.1 and 10.2
2010: Released PBS Professional 10.4 and 11.0
2011: Released PBS Professional 11.1

Broad Hardware & Operating System Support
• AMD-Linux and Windows
• Intel-Linux and Windows
• IBM AIX on Power
• IBM Linux on Power
• HP-UX on Itanium 2
• Cray X2, XT, XT3, XT4, XT5, and XT6
• SGI Altix ICE, XE, and UV
• Sun Solaris on SPARC
• Windows 7, XP, Vista, Server 2003, and Server 2008
• Red Hat Enterprise 4, 5, and 6
• SLES 9, 10, and 11
Note: For a detailed list of supported systems and operating systems, please refer to the latest release notes.

Supported MPI Libraries
Currently supported MPI libraries integrated with PBS:
• MPICH 1.2.5, 1.2.6 on Linux 2.4 on x86, AMD64, EM64T, Itanium 2
• MPICH 1.2.5, 1.2.6 on Linux 2.6 on x86, AMD64, EM64T
• MPICH 1.2.7 on x86 Linux
• MPICH-GM on Linux
• Intel MPI 2.0.22 on Linux
• MPICH2 1.0.3, 1.0.5, 1.0.7 on Linux
• IBM POE on AIX 5.x and 6.x, including HPS support
• HP MPI 1.08.03 on HP-UX 11 on Itanium 2
• HP MPI 2.0.0 on Linux 2.4 & 2.6 on x86, AMD64, EM64T, Itanium 2
• LAM/MPI 6.5.9/7.0.6/7.1.1 on Linux 2.4/2.6 on x86, AMD64, EM64T, Itanium 2
• SGI MPI (MPT) on Linux on Altix / Itanium 2 / x86_64 and XE
• SGI MPI (MPT) over InfiniBand
• MVAPICH 1.2.7/2.0 on Linux
• OpenMPI 1.4.2 on Linux

PBS Professional Components & Roles

Batch Server
- referred to as the PBS Server
- central focus for a PBS complex
- routes job to compute host *
- processes all PBS related commands *
- provides the basic batch services *
- server maintains its own server and queue settings *
- daemon executes as pbs_server.bin

Scheduler
- referred to as the PBS Scheduler
- queries list of running and queued jobs from the PBS Server *
- queries queue, server, and node properties *
- queries resource consumption and availability from the PBS MOM *
- sorts available jobs according to local scheduling policies
- determines which job is eligible to run next
- daemon executing as pbs_sched

MOM
- referred to as the PBS MOM
- executes jobs at request of PBS Scheduler
- monitors resource usage of running jobs
- enforces resource limits on jobs
- reports system resource limits, configuration *
- daemon executing as pbs_mom

* This information is for debugging purposes only. It may change in future releases and should not be relied upon.

Complex Configurations
Single Execution System
All 3 PBS components (Server, Scheduler, and MOM) on a single host.

Complex Configurations, cont.
Multiple Execution System
[Diagram: a Front End System with the Server and Scheduler, and three MOMs.]
Note: The PBS Server machine may be a different architecture (UNIX/Linux) from the execution hosts.
A PBS complex can be either UNIX/Linux or Windows, but not both.

Chapter Two - Installation of PBS Professional
• Pre-Installation
• Basic Installation
• Post-Installation
• PBS Installed Directory Structure

Post Installation – PBS Configuration File
• How does the PBS init script determine which services to invoke?
  - The init script reads the configuration file “/etc/pbs.conf”
  - Format of a pbs.conf file:
      PBS_EXEC=/opt/pbs/default
      PBS_HOME=/var/spool/PBS
      PBS_START_SERVER=1
      PBS_START_MOM=1
      PBS_START_SCHED=1
      PBS_SERVER=traintb16
      PBS_DATA_SERVICE_USER=pbsuser01
  - For each PBS_START_<daemon> variable:
      0 will prevent init from starting or stopping the daemon
      1 will have init start or stop the daemon

File System and File Transfer
• Sites will need to determine how users will access data files
  - Most common file sharing methods used by PBS customers:
      NFS   Network File System (most widely used)
      GFS   Global File System
• What method of file copy will be used?
      rcp   remote copy (default used by PBS)
      scp   secure copy
      cp    Linux/Unix copy

User’s PBS Environment
• Delivery of STDOUT/STDERR files
  - PBS should be able to copy the user’s STDOUT and STDERR files to the appropriate directory without a password challenge
• Stage input/output files
  - Users may need to import/export files related to the job before/after execution
• Users’ Data Transfer
  - Users should be able to transfer data without having to supply a password (e.g., with rcp/scp)
• Users must have a valid account
  - Users should be able to log on to the execution host(s) and should have a valid username and group

Altair LM-X License Management
• PBS Professional 11.0 is now licensed by Altair License Management System (ALM), based on X-Formation’s LM-X license management system
• Altair’s ALM package for PBS can be downloaded from: https://secure.altair.com/UserArea/
• We recommend that Altair’s ALM be installed and configured before installing PBS Professional v11.0
• For additional information on Altair’s ALM, refer to the Altair License Manager System 11.0 Installation and Operations Guide

Chapter Three - PBS Administration
• Process flow of a PBS job
• PBS installed directory structure
• Directory structure of $PBS_HOME
• Directory structure of $PBS_EXEC
• Understanding the PBS configuration file
• Manually starting and stopping PBS daemons
• Impact of PBS daemon restarts on running jobs
• Network ports used by PBS
• Status of PBS complex

Process Flow of a PBS Job – User Level
1. User submits job
2. PBS Server returns a job ID
3. PBS Scheduler requests a list of resources from the Server *
4. PBS Scheduler sorts all the resources and jobs *
5. PBS Scheduler informs PBS Server which host(s) the job can run on *
6. PBS Server pushes job script to execution host(s)
7. PBS MOM executes job script
8. PBS MOM periodically reports resource usage back to PBS Server *
9. When job is completed PBS MOM kills the job script
10. PBS Server de-queues job from PBS complex
[Diagram: PBS Server, PBS Scheduler, and execution hosts HOST A, HOST B, and HOST C (ncpus, mem, host); job 6.traintb16 runs on HOST A.]
Note: * This information is for debugging purposes only. It may change in future releases.

PBS Installed Directory Structure
• PBS Professional software is installed in two separate directories
  - $PBS_EXEC (“/opt/pbs/default”) contains:
      PBS daemons
      Libraries
      Man pages
      Support tools
      Administrator and user PBS commands
  - $PBS_HOME (“/var/spool/PBS”) contains:
      PBS daemon configurations
      PBS daemon logs
      Other various file-related directories

PBS Directory Structure - PBS_HOME
Directory structure of $PBS_HOME
PBS_HOME
    server_priv, mom_priv, sched_priv       daemon configuration directories
    server_logs, mom_logs, sched_logs       daemon log directories
    spool, undelivered, checkpoint, aux,    misc directories/files
    pbs_environment, pbs_version, datastore
This information is for debugging purposes only. It may change in future releases.

PBS Directory Structure - PBS_EXEC
Directory structure of $PBS_EXEC
PBS_EXEC
    bin, sbin                       binaries of PBS daemons and user/admin PBS commands
    lib, man, include, etc, tcltk   libraries, manual pages, and header files
    unsupported, python, pgsql
This information is for debugging purposes only. It may change in future releases.

Directory Structure of $PBS_HOME/server_priv
Detailed structure of $PBS_HOME/server_priv *
server_priv
    accounting      directory containing daily accounting logs
    db_password     database password - encrypted
    hooks           directory containing custom hook definitions
    jobs            directory containing users’ job scripts
    prov_tracking   OS provisioning directory
    server.lock     PBS server PID lock file
    svrlive         used for failover configuration
    tracking        PBS license related file
    usedlic         PBS license related file
* This information is for debugging purposes only. It may change in future releases.

PBS Configuration File – pbs.conf
• PBS installs a configuration file, “pbs.conf”, in the “/etc/” directory. PBS uses this configuration file to determine:
  - Which daemons to start/stop
  - What PBS server to communicate with
  - What file copy mechanism to use
Default contents of pbs.conf:
    PBS_EXEC=/opt/pbs/default
    PBS_HOME=/var/spool/PBS
    PBS_START_SERVER=1
    PBS_START_MOM=1
    PBS_START_SCHED=1
    PBS_SERVER=hostname.domain
    PBS_DATA_SERVICE_USER=pbsuser01
• Each server/scheduler, execution, and client host has a pbs.conf file installed
• Refer to the Administrator’s Guide, Chapter 13, Section 13.1.3, pages 715-716, for a complete listing of configuration file variables

PBS Configuration File – pbs.conf, cont.
• How pbs.conf differs between the PBS Server and PBS MOM hosts:
PBS SERVER HOST
    PBS_EXEC=/opt/pbs/default
    PBS_HOME=/var/spool/PBS
    PBS_START_SERVER=1
    PBS_START_MOM=0
    PBS_START_SCHED=1
    PBS_SERVER=traintb16
    PBS_DATA_SERVICE_USER=pbsuser01
PBS EXECUTION HOST
    PBS_EXEC=/opt/pbs/default
    PBS_HOME=/var/spool/PBS
    PBS_START_SERVER=0
    PBS_START_MOM=1
    PBS_START_SCHED=0
    PBS_SERVER=traintb16
Note: Only 1 active instance of a PBS Server and PBS Scheduler can be running within a PBS complex

PBS Configuration File – pbs.conf, cont.
• The variable PBS_START_<daemon> sets which daemons are allowed to start when the “/etc/init.d/pbs” script runs.
For example, with the following /etc/pbs.conf, this is the expected behavior when executing “/etc/init.d/pbs start”:
    PBS_EXEC=/opt/pbs/default
    PBS_HOME=/var/spool/PBS
    PBS_START_SERVER=1        pbs_server daemon will be invoked
    PBS_START_MOM=0           pbs_mom daemon will not be invoked
    PBS_START_SCHED=1         pbs_sched daemon will be invoked
    PBS_SERVER=traintb16
    PBS_DATA_SERVICE_USER=pbsuser01
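
The effect of the PBS_START_<daemon> flags can be previewed with a short shell function. This is a minimal sketch, not part of PBS: the function name daemons_to_start is hypothetical, and treating any value other than 1 (including a missing variable) as "not started" is an assumption of this example.

```shell
# Sketch: preview which daemons the init script would start, based on the
# PBS_START_<daemon> flags in a pbs.conf-style file.
# daemons_to_start is a hypothetical helper name, not a PBS command.
daemons_to_start() (
    conf=${1:-/etc/pbs.conf}
    for name in SERVER MOM SCHED; do
        # Use the last assignment of the variable, if there is one.
        value=$(sed -n "s/^PBS_START_${name}=//p" "$conf" | tail -n 1)
        if [ "$value" = "1" ]; then
            echo "PBS_START_${name}=1: init will start this daemon"
        else
            # Assumption: anything other than 1 means "do not start".
            echo "PBS_START_${name}=${value:-unset}: init will not start this daemon"
        fi
    done
)
```

Usage: daemons_to_start /etc/pbs.conf (the argument defaults to /etc/pbs.conf when omitted).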

Starting/Stopping PBS Using Start/Stop Script
• Starting/stopping PBS
  - Why use the start/stop script?
      Vnode definitions are created only when the start script is used; they are not created when the daemons are started manually
      Vnode definitions are required if PBS is to manage cpusets on a machine
      The pbs_mom daemon on the Altix and the Cray must be started via the start script
      Using the PBS start/stop script to stop PBS will preserve jobs (the server gets a ‘qterm -t quick’)
  - Location of start/stop script (Linux)
      /etc/init.d/pbs start

Status of PBS Complex
• Use qstat -Bf to view the status of a PBS complex
Server: traintb16
    server_state = Active
    server_host = traintb16.prog.altair.com
    scheduling = True
    total_jobs = 0
    state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0 Exiting:0 Begun:0
    default_queue = workq
    log_events = 511
    mail_from = adm
    query_other_jobs = True
    resources_default.ncpus = 1
    default_chunk.ncpus = 1
    scheduler_iteration = 600
    FLicenses = 33
    resv_enable = True
    node_fail_requeue = 310
    max_array_size = 10000
    pbs_license_info = 7788@localhost
    pbs_license_min = 1
    pbs_license_max = 2147483647
    pbs_license_linger_time = 3600
    license_count = Avail_Global:32 Avail_Local:1 Used:0 High_Use:0
    pbs_version = PBSPro_11.0.0.103450
    eligible_time_enable = False
    max_concurrent_provision = 5
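
For scripted health checks it can help to pull a single attribute out of this listing. The following is a rough sketch that assumes the "name = value" layout shown above; server_attr is a hypothetical helper name, not a PBS command.

```shell
# Sketch: print the value of one attribute from `qstat -Bf`-style text on stdin.
# server_attr is a hypothetical helper; it assumes lines of the form
# "    name = value" as in the listing above.
server_attr() {
    awk -v name="$1" '
        { line = $0; sub(/^[ \t]+/, "", line) }      # drop leading indentation
        index(line, name " = ") == 1 {               # exact attribute name match
            print substr(line, length(name) + 4)     # text after "name = "
            exit
        }
    '
}
```

Usage: qstat -Bf | server_attr server_state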

Manually Starting/Stopping PBS Daemons
• Manually starting/stopping PBS daemons
  - PBS Server
      Start: $PBS_EXEC/sbin/pbs_server
      Stop:  $PBS_EXEC/bin/qterm -t [quick|delay|immediate]
  - PBS Scheduler
      Start: $PBS_EXEC/sbin/pbs_sched
      Stop:  $PBS_EXEC/bin/qterm -s
             kill -INT <pbs_sched_pid>
  - PBS MOM
      Start: $PBS_EXEC/sbin/pbs_mom
      Stop:  $PBS_EXEC/bin/qterm -m    (this will shut down all the MOMs)
             kill -INT <pbs_mom_pid>

Network Ports Used By PBS Daemons
• UNIX/Linux network ports
Daemon           Port Number   Protocol   Connection
pbs              15001         TCP        Client/Scheduler to Server
pbs_server       15001         UDP        Server to MOM via RPP
pbs_mom          15002         TCP        MOM to/from Server
pbs_resmon       15003         TCP        MOM resource requests
pbs_resmon       15003         UDP        MOM resource requests
pbs_sched        15004         TCP        PBS Scheduler
pbs_mom_globus   15005         TCP        MOM Globus
pbs_mom_globus   15006         TCP        MOM Globus resource requests
pbs_mom_globus   15006         UDP        MOM Globus resource requests

Chapter Four - Job Management
• Defining a Job Script
• Types of Jobs
• Submitting Jobs
• Process Flow of a PBS Job
• Querying PBS Jobs
• Setting Job Attributes
• Requesting Job Resources
• Default Job Attributes
• Order of Default Resources Assigned to Jobs
• Job Exit Codes

Defining a Job Script
• What is a job script?
  - A file that contains a set of instructions to execute a series of commands. Also known as a “batch job”.
Example of a job script (the first line names the shell interpreter; the remaining lines are the commands):
    #!/bin/bash
    sleep 5
    /home/altair/scripts/optistruct -cpu 2 handlebar.fem

Submitting Jobs - Using “qsub”
Submitting a job script to PBS
• Using the “qsub” command
    Usage:   qsub <job_attributes/resources> <job_script>
    Example: qsub -l select=1:ncpus=1 test_script
• If the job is accepted by PBS, a job identifier is returned. This job identifier consists of the job number and the name of the server host the job was submitted to:
    0.traintb16
Note: - If a job is rejected, no job identifier is returned, but the job ID counter is still incremented
      - The largest possible job ID is 7 digits: 9,999,999. Once reached, it resets to zero
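
Scripts that capture the identifier printed by qsub often need its two parts separately. A minimal sketch using shell parameter expansion, assuming the <job number>.<server host> form described above; job_number and job_server are hypothetical helper names.

```shell
# Sketch: split a PBS job identifier such as "0.traintb16" into its parts.
# job_number and job_server are hypothetical helper names.

# Everything before the first dot is the job number.
job_number() { printf '%s\n' "${1%%.*}"; }

# Everything after the first dot is the server host name
# (empty if the identifier contains no dot).
job_server() {
    case $1 in
        *.*) printf '%s\n' "${1#*.}" ;;
        *)   printf '\n' ;;
    esac
}
```

Usage: jobid=$(qsub -l select=1:ncpus=1 test_script); job_number "$jobid"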

Requesting Job Resources – Built in Resources
Resource    Description
arch        System architecture
cput        Amount of CPU time used by the job for all the processes on all the chunks
mem         Amount of physical memory allocated to a job
ncpus       Number of processors requested for a job
walltime    Time requested for the job to run
Note: For a complete listing, refer to the PBS Reference Documentation Guide, pages 336-340

Types of Jobs
• There are two types of PBS jobs
  - Batch Job
      A script that contains commands or tasks to execute site-specific applications
  - Interactive Job
      Runs like a batch job, but when it runs, the user’s terminal input and output are connected to the execution host; similar to a login session.
      Allows users to debug a job script
      Allows users to verify that a new application runs properly

Setting Job Attributes – Using PBS Directives
• Job attributes can be set in 2 different ways:
  - Method 1: on the qsub command line
      qsub -N <job_name> <job_script>
  - Method 2: within a job script as a PBS directive
      #!/bin/bash
      #PBS -N test_run_01
      #PBS -l select=4:ncpus=4:mem=16GB
      #PBS -l place=scatter
      #PBS -j oe
      #PBS -o /home/pbsuser01/OUTPUTS
      optistruct -ncpu 2 handlebar.fem
Note: - PBS expects the directives to begin on the second line and to be on consecutive lines thereafter. Directive processing stops at the first executable line; comment lines are ignored.
      - Command line arguments will override PBS directives.
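
The scanning rule in the note can be illustrated with a short function that lists the directives a script would contribute. This is only a sketch of the rule as stated on the slide, not the actual qsub parser; pbs_directives is a hypothetical helper name, and skipping blank lines is an assumption of this example.

```shell
# Sketch: print the #PBS directives found at the top of a job script,
# stopping at the first executable line, as the note above describes.
# pbs_directives is a hypothetical helper, not the real qsub parser.
pbs_directives() {
    awk '
        NR == 1 && /^#!/ { next }                            # interpreter line
        /^[ \t]*$/       { next }                            # blank line (assumed skipped)
        /^#PBS[ \t]/     { sub(/^#PBS[ \t]+/, ""); print; next }
        /^[ \t]*#/       { next }                            # ordinary comment, ignored
                         { exit }                            # first executable line
    ' "$1"
}
```

Usage: pbs_directives my_job_script prints one line of qsub options per directive, for example "-N test_run_01".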

Requesting Job Resources – Understanding Resources
• What are job resources?
  - Applications sometimes need certain types and amounts of system resources, such as:
      memory
      ncpus
      scratch space
  - During job submission, required resources can be requested
• How can these resources be requested within PBS?
  - PBS defines these resources as chunks or as job-wide resources
What are “chunks”?
  - set of resources that are allocated as a unit to a job
  - smallest set of resources that are allocated to a job
  - for example: ncpus, mem
  - requested in a “select” statement:
      qsub -l select=<#>:ncpus=<#>:mem=<#>
What are “job-wide resources”?
  - resources that are associated with the entire job
  - for example: placement of jobs, walltime

Requesting Job Resources – Using Chunks & Select
• Requesting resources in chunks
  - Resources which are to be allocated as a unit to a job
      Smallest set of resources to be allocated to a single job
      Host/vnode level request
Syntax: qsub -l select=[N:]chunk[+[N:]chunk ...]
For example:
1. Job requesting 3 chunks with 2 CPUs per chunk:
    qsub -l select=3:ncpus=2
2. Job requesting 2 chunks with 1 CPU and 10GB of memory each, and another set of 3 chunks with 2 CPUs and 8GB of memory each:
    qsub -l select=2:ncpus=1:mem=10gb+3:ncpus=2:mem=8gb
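
To see how the N: multipliers and the + separator combine, the total CPU count implied by a select statement can be computed as the sum of N × ncpus over all chunk groups. The sketch below is illustrative only; select_total_ncpus is a hypothetical helper, and it assumes one CPU for a chunk that does not name ncpus (mirroring the default_chunk.ncpus = 1 setting shown earlier).

```shell
# Sketch: total CPUs requested by a select statement, i.e. the sum of
# N * ncpus over the "+"-separated chunk groups.
# select_total_ncpus is a hypothetical helper, not a PBS command.
select_total_ncpus() {
    printf '%s\n' "$1" | awk -F'[+]' '
        {
            total = 0
            for (i = 1; i <= NF; i++) {
                n = split($i, part, ":")
                count = 1; ncpus = 1; start = 1      # assumed defaults
                # A leading all-digit field is the chunk multiplier N.
                if (part[1] ~ /^[0-9]+$/) { count = part[1]; start = 2 }
                for (j = start; j <= n; j++)
                    if (part[j] ~ /^ncpus=[0-9]+$/) ncpus = substr(part[j], 7)
                total += count * ncpus
            }
            print total
        }'
}
```

For the two examples above, select=3:ncpus=2 gives 3 × 2 = 6 CPUs, and select=2:ncpus=1:mem=10gb+3:ncpus=2:mem=8gb gives 2 × 1 + 3 × 2 = 8 CPUs.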

Requesting Job Resources – Job Placement
• Placing jobs on hosts/vnodes
  - Users can specify how their multi-node job is placed within a PBS complex based on the resources requested
  - The place statement controls how the job is placed on the hosts/vnodes from which resources may be allocated for the job
  - Using the “place” statement:
      Usage:   qsub -l place=[<type>][:<sharing>][:<group>]
      Example: qsub -l select=1:ncpus=2:mem=100MB -l place=pack
Type      Value        Description
type      free         place job on any vnode(s), including hosts
          pack         all chunks will be taken from one host
          scatter      each chunk is allocated to a separate host
sharing   excl         only this job uses the vnodes chosen
          shared       this job can share the vnodes chosen
group     <resource>   chunks will be grouped according to a resource

Requesting Job Resources – Job Wide Resources
• Requesting job-wide limits
  - Resources that are requested outside a select statement
      Such as walltime or cput
  - Requesting resources at server or queue level
  - Resources that are not tied to specific host(s)/vnode(s)
For example:
    qsub -l select=1:ncpus=1:mem=100MB -l walltime=01:00:00 myscript
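
When comparing walltime requests in scripts, it is convenient to convert them to seconds. A small sketch, assuming the value is written as [[hours:]minutes:]seconds, as in walltime=01:00:00 above; walltime_seconds is a hypothetical helper name.

```shell
# Sketch: convert a walltime value to seconds.
# Assumes the [[hours:]minutes:]seconds form; walltime_seconds is a
# hypothetical helper, not a PBS command.
walltime_seconds() {
    printf '%s\n' "$1" | awk -F: '
        NF == 3 { print $1 * 3600 + $2 * 60 + $3; next }   # hours:minutes:seconds
        NF == 2 { print $1 * 60 + $2; next }               # minutes:seconds
        NF == 1 { print $1 + 0 }                           # seconds only
    '
}
```

Usage: walltime_seconds 01:00:00 reports the one-hour request from the example as 3600 seconds.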

Requesting Job Resources – SMP Jobs
• SMP jobs are meant to run on a single execution host
• Submitting an SMP PBS job
    qsub -l select=x:ncpus=x -l place=pack
  Note: all chunks will be placed on a single host
• Additional options
  - Place a job on a host that already has a job running on it
      qsub -l select=1:ncpus=2 -l place=pack:shared
  - Place a job on a host on which no other jobs are running and make that host exclusive to it
      qsub -l select=1:ncpus=2 -l place=pack:excl

Requesting Job Resources – MPI Jobs
• MPI jobs run on multiple hosts, using an MPI application
• PBS has tightly integrated wrapper scripts for various MPI implementations
  - Allows PBS to track spawned MPI processes
  - More accurate tracking of all resources being consumed across all the hosts
  - Accurately record CPU accounting utilization on all nodes
  - Accurately enforce requested job limits
  - Automatically "clean up" stray MPI processes on all nodes
  - Require no changes other than wrapping

 1777 ?  Ss  0:00 /opt/pbs/default/sbin/pbs_mom
 1779 ?  Ss  0:00  \_ -bash
 1810 ?  S   0:00      \_ /bin/sh /var/spool/PBS_10.4.0.101257/mom_priv/jobs/1746.rhel5.lab.altair.com.SC
 1812 ?  S   0:00          \_ /opt/mpich2-install/bin/mpirun -f /var/spool/PBS_10.4.0.101257/aux/1746.rhel5.lab.altair.com /usr/local/gromacs_mpich2-1.3.2p1/bi
 1813 ?  S   0:00              \_ /opt/mpich2-install/bin/hydra_pmi_proxy -control-port rhel54:37470 --demux poll --pgid 0 --proxy-id 0
 1814 ?  R   0:14                  \_ /usr/local/gromacs_mpich2-1.3.2p1/bin/mdrun -f /test/bench/d.dppc/grompp.mdp -c /test/bench/d.dppc/conf.gro -p /test/benc

Requesting Job Resources – Submitting MPI Jobs
• Method 1
  - Request: 4-way MPI job with 2 CPUs and 2GB memory per MPI task, with one MPI task per host, where each host has 2 CPUs and 2 GB memory
      qsub -l select=4:ncpus=2:mem=2GB -l place=scatter
  - Variable $PBS_NODEFILE contains list of vnodes
      VnodeA
      VnodeB
      VnodeC
      VnodeD
  - Sample of an MPI job script
      #!/bin/bash
      #PBS -l select=4:mem=2GB:mpiprocs=2
      #PBS -l place=scatter
      mpirun -np 8 -mem 8GB file

Requesting Job Resources – Submitting MPI Jobs, cont.
• Method 2
  - Request: 4-way MPI job with 2 CPUs and 2GB memory per MPI task; request up to 4 hosts, where each host has 4 CPUs and 4 GB memory
      qsub -l select=4:ncpus=2:mem=2GB -l place=free
  - Variable $PBS_NODEFILE contains list of vnodes
      VnodeA
      VnodeB
  - Sample of an MPI job script
      #!/bin/bash
      #PBS -l select=4:mem=2GB:mpiprocs=2
      #PBS -l place=free
      mpirun -np 8 -mem 8GB file
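
Inside a job script, the node file can be summarized before launching mpirun, for example to check how many hosts the placement actually produced. A minimal sketch: nodefile_summary is a hypothetical helper, and reading the file as one host or vnode name per line is an assumption based on the listings above.

```shell
# Sketch: count the entries and the distinct hosts in a PBS node file.
# nodefile_summary is a hypothetical helper; it assumes one host or vnode
# name per line, as in the $PBS_NODEFILE listings above.
nodefile_summary() (
    file=${1:-$PBS_NODEFILE}
    entries=$(grep -c . "$file")             # non-empty lines
    hosts=$(sort -u "$file" | grep -c .)     # distinct names
    echo "entries=$entries hosts=$hosts"
)
```

Usage inside a job script: nodefile_summary (with no argument it reads $PBS_NODEFILE). A node file listing VnodeA through VnodeD once each would report entries=4 hosts=4.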

Requesting Job Resources - Boolean Resources
• A resource that can be requested as true or false
• To request chunks that have the resource ‘optistruct’, the qsub request line would be:
    qsub -l select=1:ncpus=1:optistruct=true
  The scheduler will only place this job on vnodes that have the resource “optistruct” set to “true”
• If a boolean resource is requested as job-wide, e.g.:
    qsub -l select=1:ncpus=1 -l optistruct=true
  PBS will check whether it is available at the server or queue level, not at the vnode/host level

Default Job Attributes
• PBS includes default values for resources that the user doesn’t specify during job submission
• The following are resource defaults assigned to a job:
  - default_chunk.ncpus=1
  - resources_default.ncpus=1
  - resources_default.walltime=<5 years>
Note: Root and managers can specify additional default resources

Querying Jobs – Using “qstat”
• To show the status of current PBS jobs
  - Using the “qstat” command
      Usage:   qstat <-a, -n, -s, -1, -w>
      Example: qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
6.traintb16       test_script      pbsuser01         00:00:00 R workq
7.traintb16       jobA             pbsuser02         00:00:00 R workq
8.traintb16       test_2           pbsuser04                0 Q workq
9.traintb16       test_script      pbsuser01         00:00:00 R workq
Note: If a job was deleted or completed, it can no longer be listed via qstat unless the PBS complex has enabled the job history functionality

Querying Jobs – Additional qstat Options
-a   job name, session id, # nodes req, # ncpus req, req’d mem, req’d time, and elapsed time
                                                           Req'd  Req'd   Elap
Job ID         Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
-------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
8.traintb16    pbsuser0 workq    test_scrip   6556   1   8     --    -- R 00:07

-s   same as option -a, but with comments
                                                           Req'd  Req'd   Elap
Job ID         Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
-------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
8.traintb16    pbsuser0 workq    test_scrip   5556   1   8     --    -- R 00:07
   Job run at Wed Jul 05 at 14:48 on (traintb16:ncpus=8)

-n   same as option -a, but indicates which execution vnode(s) the job is running on
                                                           Req'd  Req'd   Elap
Job ID         Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
-------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
8.traintb16    pbsuser0 workq    test_scrip   5556   1   8     --    -- R 00:07
   traintb16/0

Note: - Adding the option “-1” will output each entry on a single line instead of wrapping around
      - Using “-w” shows the full output of individual fields

Querying Jobs – Job States
State   Description
Q       Job is queued, waiting for execution
R       Job is running
S       Job is suspended
E       Job is exiting after execution
H       Job is held or put on hold
W       Job is waiting for its requested execution time or has been delayed 30 minutes because stage-in failed
T       Job is in transition, being moved between states
F       Job has finished, whether or not it completed successfully
M       Job has moved to another PBS complex
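
When post-processing qstat output in a script, the single-letter state codes can be expanded with a simple lookup. This is only an illustrative sketch covering the states listed in the table above; job_state_name is a hypothetical helper name.

```shell
# Sketch: expand a single-letter job state from qstat into a short
# description, using only the states in the table above.
# job_state_name is a hypothetical helper, not a PBS command.
job_state_name() {
    case $1 in
        Q) echo "queued" ;;
        R) echo "running" ;;
        S) echo "suspended" ;;
        E) echo "exiting" ;;
        H) echo "held" ;;
        W) echo "waiting" ;;
        T) echo "in transition" ;;
        F) echo "finished" ;;
        M) echo "moved" ;;
        *) echo "unknown" ;;     # any state not listed in the table
    esac
}
```

Usage: job_state_name R prints "running".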

Job Attributes - Viewing Job Attributes
• To view job attributes that were assigned to a particular job, use the qstat command.
    Usage:   qstat -f <job_id>
    Example: qstat -f 2.trainhp01
Job Id: 1.traintb16
    Job_Name = sleep_job
    Job_Owner = pbsuser01@traintb16.prog.altair.com
    resources_used.cpupercent = 0
    resources_used.cput = 00:00:00
    resources_used.mem = 1028kb
    resources_used.ncpus = 1
    resources_used.vmem = 18440kb
    resources_used.walltime = 00:00:00
    job_state = R
    queue = workq
    server = traintb16
    Checkpoint = u
    ctime = Tue May  5 17:49:09 2010
    Error_Path = traintb16.prog.altair.com:/home/pbsuser01/boo/sleep_job.e1
    exec_host = traintb16/0
    exec_vnode = (traintb16:ncpus=1)
    Hold_Types = n
    Join_Path = n
    Keep_Files = n
    Mail_Points = a
    mtime = Tue May  5 17:49:09 2010
    Output_Path = traintb16.prog.altair.com:/home/pbsuser01/boo/sleep_job.o1
    Priority = 0
    qtime = Tue May  5 17:49:09 2010
    Rerunable = True

Job Attributes - Viewing Job Attributes, cont.
    Resource_List.ncpus = 1
    Resource_List.nodect = 1
    Resource_List.place = pack
    Resource_List.select = 1:ncpus=1
    stime = Tue May  5 17:49:11 2010
    session_id = 11535
    jobdir = /home/pbsuser01
    substate = 42
    Variable_List = PBS_O_HOME=/home/pbsuser01,PBS_O_LANG=en_US.UTF-8,
        PBS_O_LOGNAME=pbsuser01,
        PBS_O_PATH=/home/pbsuser01/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X
        11:/usr/X11R6/bin:/usr/games:/opt/kde3/bin:/usr/lib/mit/bin:/usr/lib/mi
        t/sbin:/opt/pbs/default/bin:/opt/pbs/default/sbin,
        PBS_O_MAIL=/var/spool/mail/pbsuser01,PBS_O_SHELL=/bin/bash,
        PBS_O_HOST=traintb16.prog.altair.com,
        PBS_O_WORKDIR=/home/pbsuser01/boo,PBS_O_SYSTEM=Linux,PBS_O_QUEUE=workq
    comment = Job run at Tue May 05 at 17:49 on (traintb16:ncpus=1)
    etime = Tue May  5 17:49:09 2010
    Submit_arguments = -l select=1:ncpus=1 my_script
Note: Running as root or PBS Manager will output additional information

Querying Jobs – Using “tracejob”
• Using tracejob to obtain comprehensive information about a job
  - Using the “tracejob” command
      Usage:   tracejob -n<days> <job id>
      Example: tracejob -n4 0.traintb16
Job: 0.traintb16
05/05/2010 17:43:35  S  enqueuing into workq, state 1 hop 1
05/05/2010 17:43:35  S  Job Queued at request of pbsuser01@traintb16.prog.altair.com, owner = pbsuser01@traintb16.prog.altair.com,
                        job name = sleep_job, queue = workq
05/05/2010 17:45:08  L  Considering job to run
05/05/2010 17:45:08  S  Job Run at request of Scheduler@traintb16.prog.altair.com on exec_vnode (traintb16:ncpus=1)
05/05/2010 17:45:08  M  Started, pid = 11491
05/05/2010 17:45:10  S  Job Modified at request of Scheduler@traintb16.prog.altair.com
05/05/2010 17:45:10  L  Job run
05/05/2010 17:45:14  M  task 00000001 terminated
05/05/2010 17:45:14  M  Terminated
05/05/2010 17:45:15  S  Obit received momhop:1 serverhop:1 state:4 substate:42
05/05/2010 17:45:15  S  Exit_status=0 resources_used.cpupercent=0 resources_used.cput=00:00:00 resources_used.mem=3056kb
                        resources_used.ncpus=1 resources_used.vmem=39392kb resources_used.walltime=00:00:07
05/05/2010 17:45:15  M  task 00000001 cput= 0:00:00
05/05/2010 17:45:15  M  traintb16 cput= 0:00:00 mem=3056kb
05/05/2010 17:45:15  M  Obit sent
05/05/2010 17:45:15  M  copy file request received
05/05/2010 17:45:15  M  staged 2 items out over 0:00:00
05/05/2010 17:45:15  M  delete job request received
05/05/2010 17:45:15  S  dequeuing from workq, state 5
05/05/2010 17:45:15  M  kill_job
05/05/2010 17:45:15  M  work proc outstanding
S = Server
L = Scheduler
M = MOM
Note: Information is taken from the server logs, scheduler logs, and MOM logs (local to that machine) for the past 24 hrs
50
Copyright © 2010 Altair Engineering, Inc. Proprietary and Confidential. All rights reserved.
Querying Jobs – Deleting Jobs
 To delete jobs that are listed under qstat
• Using “qdel” command
Usage:
qdel <job id>
Example:
qdel 0.traintb16
 To delete a job from the server regardless of the job’s state
Usage:
qdel -W force <job id>
Example:
qdel -W force 0.traintb16
Note: Users can delete only their own jobs, unless the user’s name is in the managers list
Querying Jobs – Finished Job History
 To view only jobs that have been deleted, moved, or finished
• qstat -H
Job id           Name             User             Time Use S Queue
---------------  ---------------  ---------------  -------- - -----
80.traintb16     sleep5           pbsuser01        00:00:00 F workq
81.traintb16     sleep5           pbsuser01        00:00:00 F workq
82.traintb16     sleep5           pbsuser01        00:00:00 F workq
83.traintb16     sleep5           pbsuser01        00:00:00 F workq
 To view all jobs, regardless of state
• qstat -x
Job id           Name             User             Time Use S Queue
---------------  ---------------  ---------------  -------- - -----
80.traintb16     sleep5           pbsuser01        00:00:00 F workq
81.traintb16     sleep5           pbsuser01        00:00:00 F workq
82.traintb16     sleep5           pbsuser01        00:00:00 F workq
83.traintb16     sleep5           pbsuser01        00:00:00 F workq
84.traintb16     sleep5           pbsuser01               0 Q workq
85.traintb16     sleep5           pbsuser01        00:00:00 R workq
Note: The PBS Server attribute job_history_enable must be set to True in order to use these options
Querying jobs – Estimated start time/start order
 PBS can estimate the start time and start order of jobs using the qstat -T option
• New column: Est Start Time
• Job IDs are displayed in the order of estimated start time
$ qstat -T
traintb16:
                                                                               Est
                                                             Req'd  Req'd      Start
Job ID          Username  Queue    Jobname    SessID NDS TSK Memory Time   S   Time
--------------- --------- -------- ---------- ------ --- --- ------ -----  -   -----
159.traintb16   pbsuser01 workq    STDIN        4302   1   2     -- 00:05  R   --
164.traintb16   pbsuser01 workq    STDIN          --   1   1     -- 01:05  Q   13:36
165.traintb16   pbsuser01 workq    STDIN          --   1   1     -- 01:05  Q   13:36
160.traintb16   pbsuser01 workq    STDIN          --   1   1     -- 01:05  Q   14:41
161.traintb16   pbsuser01 workq    STDIN          --   1   1     -- 01:05  Q   14:41
Note: The sorted job ids are NOT determined by the PBS Scheduler.
Querying Jobs – Re-Queuing Jobs
 To re-queue a running job
• Using “qrerun” command
Usage:
qrerun <job id>
Example:
qrerun 0.traintb16
 To re-queue a job even if that job’s execution host is not reachable
Usage:
qrerun -W force <job id>
Example:
qrerun -W force 0.traintb16
Note: Only root or managers can perform this operation
Job Exit Codes
 The exit code from a batch job is a standard Unix termination status, the same sort
of number you get in a shell script from checking the "$?" variable after executing a
command.
 Typically, exit code 0 (zero) means successful completion.
 Codes 1-127 are typically generated by the job itself calling exit() with a non-zero
value to terminate itself and indicate an error.
 Exit codes in the range 129-255 represent jobs terminated by Unix "signals". Each
type of signal has a number, and what's reported as the job exit code is the signal
number plus 128. Signals can arise from within the process itself (as for SEGV) or be
sent to the process by some external agent (such as the batch control system).
 The specific meanings of the signal numbers are platform-dependent
 Exit codes < 0 are set by PBS
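The ranges above can be turned into a small lookup. The following is only a sketch: describe_exit_status is a hypothetical helper name (not a PBS command), and it simply applies the ranges listed on this slide to an Exit_status value.

```shell
#!/bin/sh
# Illustrative sketch: describe_exit_status is a hypothetical helper,
# not part of PBS. It classifies an Exit_status value using the ranges
# described above.
describe_exit_status() {
    code=$1
    if [ "$code" -lt 0 ]; then
        echo "set by PBS"
    elif [ "$code" -eq 0 ]; then
        echo "successful completion"
    elif [ "$code" -le 127 ]; then
        echo "job called exit($code)"
    elif [ "$code" -ge 129 ] && [ "$code" -le 255 ]; then
        # Signal number is the exit code minus 128
        echo "terminated by signal $((code - 128))"
    else
        echo "outside the ranges described"
    fi
}

describe_exit_status 137
```

The value passed in would typically come from the Exit_status field shown by tracejob or recorded in the accounting log.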
Job Exit Codes, cont.
#    Name                         Description
0    JOB_EXEC_OK                  Job execution was successful
-1   JOB_EXEC_FAIL1               Job execution failed, before files, no retry
-2   JOB_EXEC_FAIL2               Job execution failed, after files, no retry
-3   JOB_EXEC_RETRY               Job execution failed, do retry
-4   JOB_EXEC_INITABT             Job aborted on MOM initialization
-5   JOB_EXEC_INITRST             Job aborted on MOM init, checkpoint, no migrate
-6   JOB_EXEC_INITRMG             Job aborted on MOM init, checkpoint, ok migrate
-7   JOB_EXEC_BADREST             Job restart failed
-8   JOB_EXEC_GLOBUS_INIT_RETRY   Initialization of globus job failed, do retry
-9   JOB_EXEC_GLOBUS_INIT_FAIL    Initialization of globus job failed, no retry
-10  JOB_EXEC_FAILUID             Invalid UID/GID for job
-11  JOB_EXEC_RERUN               Job rerun
-12  JOB_EXEC_CHKP                Job was checkpointed and killed
-13  JOB_EXEC_FAIL_PASSWORD       Job failed due to a bad password
-14  JOB_EXEC_RERUN_ON_SIS_FAIL   Job was re-queued or deleted due to communication failure between 1st head node and a sister node
PBS Accounting Records – Log Information
 PBS accounting logs contain information about job statistics such
as:
• Owner, queue, start time, end time, execution host, resources
requested, exit status, and resources used
 Accounting logs are stored on the machine where the pbs_server
daemon is running
• Location: $PBS_HOME/server_priv/accounting
– A new log file is created every day
—file name format: [YYYYMMDD]
 The accounting logs are only accessible by root
 The accounting logs can be parsed by the “pbs-report” script
PBS Accounting Records – Details of Accounting Log Entry
 Sample of accounting log entry:
05/05/2010 17:45:15;E;0.traintb16;user=pbsuser01 group=users
jobname=sleep_job queue=workq ctime=1241559815 qtime=1241559815
etime=1241559815 start=1241559910 exec_host=traintb16/0
exec_vnode=(traintb16:ncpus=1) Resource_List.ncpus=1
Resource_List.nodect=1 Resource_List.place=pack
Resource_List.select=1:ncpus=1 session=11491 end=1241559915 Exit_status=0
resources_used.cpupercent=0 resources_used.cput=00:00:00
resources_used.mem=3056kb resources_used.ncpus=1
resources_used.vmem=39392kb resources_used.walltime=00:00:07
Syntax: date-time;record_type;id_string;message_text
date-time      Date and time stamp. Format: mm/dd/yyyy hh:mm:ss
record_type    Single character indicating type of record
id_string      Job, reservation or reservation-job identifier
message_text   Detailed information for the job or reservation, as space-separated key=value pairs
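As a rough illustration of the record layout, the sketch below pulls a single key=value pair out of one accounting record. The helper names acct_type and acct_field are hypothetical, and the splitting assumes that no value contains a space, which holds for the sample entry but may not hold for every record.

```shell
#!/bin/sh
# Sketch only: acct_type and acct_field are hypothetical helpers.
# Assumes the layout date-time;record_type;id_string;message_text and
# that values in message_text contain no spaces.

# Print the record type (second semicolon-separated field).
acct_type() {
    printf '%s\n' "$1" | cut -d';' -f2
}

# Print the value of one key from the message text (fourth field onward).
# The body runs in a subshell so that "set -f" does not leak to the caller.
acct_field() (
    set -f    # disable filename expansion while splitting on spaces
    message=$(printf '%s\n' "$1" | cut -d';' -f4-)
    for pair in $message; do
        case $pair in
            "$2"=*) printf '%s\n' "${pair#*=}" ;;
        esac
    done
)

record='05/05/2010 17:45:15;E;0.traintb16;user=pbsuser01 queue=workq exec_vnode=(traintb16:ncpus=1) Exit_status=0 resources_used.walltime=00:00:07'
acct_type "$record"
acct_field "$record" resources_used.walltime
```

For regular reporting, the pbs-report script remains the intended tool; a helper like this is only useful for quick one-off checks.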
Using Accounting Records: Using “pbs-report”
 To parse information from the accounting logs, use the pbs-report script located in the $PBS_EXEC/sbin directory
 Information obtained from pbs-report helps sites to determine how much work was done by PBS jobs at a site during a specified time period
Sample output of pbs-report:
PBS Pro Cluster Accounting Summary Statistics
---------------------------------------------
Report from Thu Sept 15 2010 00:00:00 to Thu Sept 17 2010 12:13:32

              # of   Total      Total               Average
Username      jobs   CPU Time   Wall Time   Efcy.   Wait Time   Muda
------------  -----  ---------- ----------  -----   ----------  -----
TOTAL           132           0     618322  0.000         2108  0.000
pbsuser01       127           0     616328  0.000         2191  0.000
pbsuser02         5           0       1994  0.000            4  0.000
------------
Minimum           5           0       1994  0.000            4  0.000
Maximum         127           0     616328  0.000         2191  0.000
Mean             66           0     309161  0.000         1097  0.000
Deviation        61           0     307167  0.000         1093  0.000
Median            5           0       1994  0.000            4  0.000

Job Set Summary
                                                 Standard
               Minimum     Maximum     Mean      Deviation   Median
               ----------  ----------  --------  ----------  ----------
CPU time                0           0         0           0           0
Wall time               0       78616      4684       17559          60
Wait time               0       67778      2108        2070           2
Suspend time            0           0         0           0           0
Note: All times displayed in seconds.
Moving Jobs Between Queues
 Users can move jobs from one queue to another by using the qmove command
• Using “qmove” command
Usage:
qmove <new_queue> <job_id>
Example:
qmove small_queue 0.traintb16
 Jobs can also be moved to a queue in another PBS complex
Example: qmove small_queue@traintb02 0.traintb16
Note:
• Running or suspended jobs cannot be moved
• Use qstat -H to view a job that was moved
• Specify the full job ID (job_id.server) when querying a moved job with qstat
Holding and Releasing Jobs
 Users can put a hold on their jobs so that PBS will not schedule them for execution
• Using “qhold” command
Usage:
qhold <job_id>
Example:
qhold 0.traintb16
 To release a held job so that PBS can consider it for execution:
• Using “qrls” command
Usage:
qrls <job_id>
Example:
qrls 0.traintb16
Deferring Job Execution
 Users can specify a date/time for their job to be eligible for execution
Usage:
qsub -a <date_time> <job_script>
date_time format: [[[[CC]YY]MM]DD]hhmm[.SS]
Example:
qsub -a 201008281645 my_script
Note: Deferred jobs will be marked with the “W” wait state
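Because every leading part of the date_time value is optional, it is easy to mistype. The sketch below checks only the shape of the string (4, 6, 8, 10 or 12 digits, plus optional .SS), not whether the date is a real calendar date. The function name is_valid_date_time is hypothetical.

```shell
#!/bin/sh
# Sketch only: is_valid_date_time is a hypothetical helper that checks the
# shape [[[[CC]YY]MM]DD]hhmm[.SS]. It does not range-check months, days,
# hours, or minutes.
is_valid_date_time() {
    printf '%s\n' "$1" |
        grep -Eq '^(((([0-9]{2})?[0-9]{2})?[0-9]{2})?[0-9]{2})?[0-9]{4}(\.[0-9]{2})?$'
}

if is_valid_date_time 201008281645; then
    echo "201008281645 has a valid shape"
fi
```

A value that passes this check can still be rejected by qsub if, for example, the month or hour is out of range.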
Specifying Email Notifications
 Users can specify what type of email notification they want, depending on job status
 The default is only to notify the user when the job is aborted or terminated
 Using qsub command with the following options, users can set their own notification:
Usage:
qsub -m <a|b|e|n>
Example: qsub -m abe
Option  Description
a       job is aborted (default)
b       job has begun
e       job has finished execution
n       do not send any email
Chapter Five - Site Specific Configurations
Chapter Five
 Preserving job history
 Prologue/epilogue scripts
 PBS redundancy and failover
Preserving Job History - Concept
 By default, once a job has been de-queued from a PBS complex, the job’s history is no longer retrievable using qstat
 To enable the job history feature, use qmgr:
Qmgr: set server job_history_enable = True
• preserves job attributes
• preserves job resources requested and used
 The default preservation time frame is 14 days; to change it:
Qmgr: set server job_history_duration = <time>
• <time> : [[hours:]minutes:]seconds[.milliseconds]
 To view job history:
• View all job IDs, past and present: qstat -x (optionally combined with -f, -a, -n, or -s)
• View only jobs that were finished, moved, or deleted: qstat -H (optionally combined with -f, -n, or -s)
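Since job_history_duration takes a [[hours:]minutes:]seconds value rather than a number of days, a small conversion can help when a site thinks in days. This is only a sketch; history_duration_from_days is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: history_duration_from_days is a hypothetical helper that
# converts whole days into the hours:minutes:seconds form described above.
history_duration_from_days() {
    printf '%d:00:00\n' "$(( $1 * 24 ))"
}

# Print the qmgr directive for a 14-day retention period.
echo "set server job_history_duration = $(history_duration_from_days 14)"
```

The printed line is intended to be entered at the Qmgr prompt by a PBS manager.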
Prologue & Epilogue Scripts
 Sites can be set up to run custom scripts before each job is executed or after each job is finished or terminated
• These scripts can perform tasks such as network file staging for site-specific applications, file cleanup after a job has been completed, or writing additional information to the user’s job output after completion
• These scripts are known as:
  • Prologue: script executed on the primary execution host before the job is run
    Located in: $PBS_HOME/mom_priv/prologue
  • Epilogue: script executed on the primary execution host after the job is run
    Located in: $PBS_HOME/mom_priv/epilogue
• Each execution host will have its own prologue or epilogue script
  • Only runs on the primary execution host of a multinode job
  • Runs as root
• A timeout period can be set up in $PBS_HOME/mom_priv/config:
$prologalarm <seconds>
Prologue & Epilogue Scripts – Sequence of Events
 Start of a Job (Prologue)
1. Licenses are obtained
2. Files are staged in if needed
3. $TMPDIR is created
4. The prologue script is executed
5. The PBS job script is executed
 End of a Job (Epilogue)
1. The PBS job script finishes
2. The job’s cpusets are destroyed
3. The epilogue script is run
4. The obit is sent to the pbs server
5. Any file stageout takes place – includes STDOUT and STDERR
6. Files staged in or out are removed
7. PBS Job files are deleted
8. FLEX licenses are returned to pool
Prologue & Epilogue Scripts – Sample Prologue Script
 Prologue Script – reordering the vnodes in the PBS_NODEFILE
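#!/bin/bash
# Rewrite the job's node file so that hosts are listed in rotation
# (a b a b) instead of grouped (a a b b). $1 is the job ID.
PBS_NODEFILE="/var/spool/PBS/aux/$1"
lines=$(wc -l < "$PBS_NODEFILE")    # total entries in the node file
nodes=$(uniq "$PBS_NODEFILE")       # distinct hosts (adjacent duplicates removed)
nodect=$(echo $nodes | wc -w)       # number of distinct hosts
loops=$(( lines / nodect ))         # entries per host
nodefile=""
for (( times = 0; times < loops; times++ )); do
    nodefile="$nodefile$nodes "
done
# Unquoted on purpose: word splitting turns newlines into single spaces
echo $nodefile | tr " " "\n" > "$PBS_NODEFILE"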
Prologue & Epilogue Scripts – Sample Epilogue Script
 Epilogue Script – cleaning up files/directories
#!/bin/sh
# $Id: epilogue,v 3.3 2006/07/27 20:48:36
# $1 = job id
# $2 = user name
# $3 = group name
# $4 = job name
# $5 = session id
# $6 = requested resource limits
# $7 = resources used
# $8 = queue name
# $9 = account string
# $10 = exit code from job
UNIX95=XPG4; export UNIX95
jobid=$1
jobname=$4
user=$2
sid=$5
if [ -z "$jobid" -o -z "$jobname" -o -z "$user" ];
then
    echo "`basename $0`: No arguments: exiting."
    exit 1
fi
# Defining a marker for utilization later.
state=/tmp/cleanup${jobid}
# Define the source location
src=/scratch/`hostname`/$user/$jobname-`echo $jobid | cut -d. -f1`
if [ -d $src -a ! -f $state ]; then
    touch $state
    if [ -x $src/pbs-cleanup ]; then
        if [ `whoami` != $user ]; then
            su - $user -c "$src/pbs-cleanup"
        else
            $src/pbs-cleanup
        fi
    fi
    if [ $? -eq 0 ]; then
        cd /
        rm -rf $src
        rmdir `dirname $src` 2>/dev/null
    fi
    rm -f $state
fi
until [ ! -f $state ]; do
    sleep 5
done
exit 0
PBS Redundancy and Failover - Concept
 PBS provides the capability for a backup PBS Server to assume the workload of a failed Primary Server
• Primary Server - is the main PBS server
• Secondary Server - is usually inactive, but starts up when the primary fails
 Requirements for a PBS failover configuration:
• Primary and secondary servers must run on two separate host machines
• Both servers and all the execution hosts must have the same PBS version
• Both servers must be the same architecture (same binary)
• Both servers must be able to communicate with each other and with all the execution hosts
• The primary and secondary servers must share the same PBS_HOME directory
• The PBS_HOME directory should be on a file system that is not local to either of the server hosts
• Root/administrator must have full read/write access to PBS_HOME
Note: It is not advisable to have a MOM running on either server host
PBS Redundancy and Failover – Setting Up
 Configuring Failover on the Primary Server
1. Install PBS on the primary server’s host
2. Check whether PBS is able to run jobs on execution hosts
3. If the test passes, move the $PBS_HOME directory to a shared file system
4. Check whether PBS is able to run jobs on execution hosts using the new directory
5. If the test passes, shut down the pbs_server and pbs_sched daemons
6. Configure the /etc/pbs.conf file to include the following settings:
PBS_PRIMARY=<primary_host>
PBS_SECONDARY=<secondary_host>
PBS_SERVER=<short name for primary host>
7. The primary server is configured to run the scheduler:
PBS_START_MOM=0
PBS_START_SCHED=1
8. Start the PBS daemons by executing: /etc/init.d/pbs start
PBS Redundancy and Failover – Setting Up, cont.
 Configuring Failover on the Secondary Server
1. Install PBS on the secondary server’s host
2. Mount the shared file system that holds the primary’s $PBS_HOME, so that both servers use the same $PBS_HOME directory
3. Configure the /etc/pbs.conf file to include the following settings:
PBS_PRIMARY=<primary_host>
PBS_SECONDARY=<secondary_host>
PBS_SERVER=<short name for primary_host>
4. Since only one instance of the PBS scheduler can be running, only the primary server
is configured to run it; the secondary will not run it
PBS_START_MOM=0
PBS_START_SCHED=0
5. Start the PBS daemons by executing: /etc/init.d/pbs start
PBS Redundancy and Failover – Setting Up, cont.
 Configuring Failover on Execution and Client Hosts
1. Install PBS on each execution host
2. On each execution host, configure the /etc/pbs.conf file to include the following parameters:
PBS_PRIMARY=<primary_host>
PBS_SECONDARY=<secondary_host>
PBS_SERVER=<short name for primary host>
3. Install the client commands on each client host
4. On each client host, configure the /etc/pbs.conf file to include the following parameters:
PBS_PRIMARY=<primary_host>
PBS_SECONDARY=<secondary_host>
PBS_SERVER=<short name for primary host>
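The pbs.conf settings differ only slightly between the four kinds of host, which the sketch below makes explicit. failover_conf is a hypothetical helper, the host names are placeholders, and the short name is assumed to be the host name with its domain removed. Only the failover-related lines from these slides are printed; a real /etc/pbs.conf contains other entries that must be kept.

```shell
#!/bin/sh
# Sketch only: failover_conf is a hypothetical helper that prints the
# failover-related pbs.conf lines for one host role.
# Usage: failover_conf <primary|secondary|execution|client> <primary_host> <secondary_host>
failover_conf() {
    role=$1
    primary=$2
    secondary=$3
    case $role in
        primary)          start_sched=1 ;;   # only the primary runs the scheduler
        secondary)        start_sched=0 ;;
        execution|client) start_sched= ;;    # no daemon start settings shown for these
        *) echo "unknown role: $role" >&2; return 1 ;;
    esac
    echo "PBS_PRIMARY=$primary"
    echo "PBS_SECONDARY=$secondary"
    # Assumption: the short name is the host name up to the first dot
    echo "PBS_SERVER=${primary%%.*}"
    if [ -n "$start_sched" ]; then
        echo "PBS_START_MOM=0"
        echo "PBS_START_SCHED=$start_sched"
    fi
}

failover_conf secondary pbs-a.example.com pbs-b.example.com
```

Printing the fragment rather than editing /etc/pbs.conf in place keeps the sketch harmless; an administrator would merge the lines by hand.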
PBS Redundancy and Failover – Behavior
 What type of communication occurs between the primary and secondary servers when the daemons are running?
• The secondary server will periodically attempt to connect to the primary server
• The primary server will send a “handshake” every few seconds to the secondary server
• Doing a “qstat -Bf” will show which of the two servers is active; look at the “server_host” line
 What happens when the secondary server becomes active?
• PBS will send an email, from the account defined in the server’s “mail_from” attribute, stating that a failover has occurred
• The secondary server will try to communicate with the primary’s scheduler
  • If it cannot communicate, the secondary server will launch its own scheduler process
• The secondary server will inform all the PBS MOMs that it is the active server
 How does a failover impact PBS users?
• Users will not notice when a failover occurs
• When a user runs a PBS command such as qstat, the command will try to connect to the primary server first. If that fails, it will try the secondary server.
  • If the secondary responds to the command, a local file is created so this process doesn’t repeat every time that user sends PBS commands
  • This file is removed after the primary becomes active
Chapter Six: Limiting Resource Usage
Chapter Six
 Concept
 Terminology
 Attributes
 Users
 Groups
Resource Usage: Concept
 PBS allows sites to set up separate resource limits for individual users or groups, for generic users or groups, and for the total used by all users
 Several kinds of resource limits can be set:
• total number of jobs that can run in a PBS complex
• total number of jobs a single user can run (named or generic)
• total number of jobs a group can run (named or generic)
• maximum amount of a resource that a user can request per job
• maximum amount of a resource that a group can request per job
• total number of jobs that can be queued
• total number of jobs that a user can have in a queue
• total number of jobs that a group can have in a queue
 Limit attributes are set within the qmgr utility
• at server level
• at queue level
 PBS managers and operators can set limit attributes
Resource Usage: Terminology
 Terminology
User Limits          limit-spec      Description
All users            o:PBS_ALL       A limit for the total amount of resources allocated to all users combined
Generic users        u:PBS_GENERIC   A limit for any single user
An individual user   u:<username>    A limit for a named user

Group Limits         limit-spec      Description
Generic groups       g:PBS_GENERIC   A limit for any group
An individual group  g:<groupname>   A limit for a named group
Note: <limit-spec> is case-sensitive
Resource Usage: Attributes
 Resource limit attributes
<limit attribute>             Description
max_run                       Maximum number of jobs allowed to be running
max_run_soft                  Soft limit on the number of jobs allowed to be running
max_run_res.<resource>        Maximum amount of the specified resource that can be allocated to running jobs
max_run_res_soft.<resource>   Soft limit on the amount of the specified resource that can be allocated to running jobs
max_queued                    Maximum number of jobs allowed in a queue
max_queued_res.<resource>     Total amount of the specified resource that can be allocated to queued or running jobs
 Syntax
• Server level
set server <limit_attribute> += "[<limit-spec>=<value>]"
• Queue level
set queue <queue_name> <limit_attribute> += "[<limit-spec>=<value>]"
Resource Usage: Users
 Limit the total number of running jobs for all users within a PBS complex to 4 jobs
• set server max_run = "[o:PBS_ALL=4]"
 Limit the number of running jobs for each user to 4 jobs
• set server max_run = "[u:PBS_GENERIC=4]"
 Limit the number of running jobs for user “pbsuser01” to 4 jobs
• set server max_run += "[u:pbsuser01=4]"
 Limit the TOTAL number of running jobs for all users to 7, and limit user “pbsuser01” to 5
• set server max_run += "[o:PBS_ALL=7], [u:pbsuser01=5]"
 Generic users = 3; user “pbsuser01” = 2; user “pbsuser02” = 5
• set server max_run += "[u:PBS_GENERIC=3], [u:pbsuser01=2], [u:pbsuser02=5]"
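The bracket-and-quote syntax of these limit strings is easy to get wrong by hand. The sketch below assembles one from its parts; limit_entry and max_run_command are hypothetical helper names, and the output is a qmgr directive to be reviewed and entered by a manager or operator, not something the script applies itself.

```shell
#!/bin/sh
# Sketch only: limit_entry and max_run_command are hypothetical helpers
# that build the limit strings shown above.

# Build one "[<type>:<name>=<value>]" entry, e.g. [u:pbsuser01=2].
limit_entry() {
    printf '[%s:%s=%s]' "$1" "$2" "$3"
}

# Join any number of entries into a single server-level max_run directive.
max_run_command() {
    specs=""
    for entry in "$@"; do
        if [ -z "$specs" ]; then
            specs=$entry
        else
            specs="$specs, $entry"
        fi
    done
    printf 'set server max_run += "%s"\n' "$specs"
}

max_run_command "$(limit_entry u PBS_GENERIC 3)" \
                "$(limit_entry u pbsuser01 2)" \
                "$(limit_entry u pbsuser02 5)"
```

Remember that the limit-spec is case-sensitive, so PBS_GENERIC and PBS_ALL must be passed exactly as written.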
Resource Usage – Groups
 Limit the number of running jobs for any single group within a PBS complex to 4 jobs
• set server max_run = "[g:PBS_GENERIC=4]"
 Limit the number of running jobs for the named group “opti” to 4 jobs
• set server max_run += "[g:opti=4]"
Chapter Seven - Job Attributes & Selective Query
Chapter Four
 Altering requested job resources
 Handling output and error files
 Job’s staging and execution directory
 File staging
 Sending messages to PBS jobs
 Sending signals to PBS jobs
 Selective job querying
 Job dependencies
 Moving jobs between queues
 Holding and releasing jobs
 Deferring job execution
 Specifying email notifications
 Exercises
Altering Requested Job Resources – Using “qalter”
 A job’s requested resources can be changed even after the job has been submitted
• Using “qalter” command:
Usage:
qalter -l <resource_name>=<new_value> <job_id>
Example:
qalter -l select=1:ncpus=3 0.traintb16
 Can a job’s requested resources be altered once that job has started execution?
• Yes, but only certain types of resources
Resource   Before Execution   After Execution
cputime    YES                YES - smaller amount
walltime   YES                YES
ncpus      YES                NO
memory     YES                NO
Note: Managers and Operators can grant more resources even if the job has started
Handling Output and Error Files
 Users can control how their output and error files are handled when their jobs are completed
• Can be defined on the qsub command line or as a PBS directive
• By default files are copied using rcp; scp can be configured
 Option #1: Specifying the path/filename of STDOUT/STDERR
• -o <path><filename>
• -e <path><filename>
 Option #2: Where to retain STDOUT/STDERR files
Options  Description
-k e     STDERR to be retained in job’s staging/execution directory
-k o     STDOUT to be retained in job’s staging/execution directory
-k oe    Both files to be retained in job’s staging/execution directory
-k n     Neither file is retained
Note:
• Option #1 and #2 cannot be mixed together
• If the .O and .E files cannot be copied back, they are retained on the execution host in the directory $PBS_HOME/undelivered
Job’s Staging and Execution Directory
 The default job staging and execution directory is the user’s home directory
jobdir = <user’s home directory>
 An alternative is to have PBS create a unique directory for each job; this is done by using the sandbox attribute
Usage:
qsub -W sandbox=<HOME | PRIVATE>
Where:
HOME      user’s home directory; default
PRIVATE   PBS will create a job-specific directory
• The PRIVATE directory name has the form:
pbs.<job_id.server_name>.<id_string>
Example: pbs.21.traintb16.x8z
Job’s Staging and Execution Directory, cont
 If using sandbox=PRIVATE:
• jobdir = /home/pbsuser01/pbs.17.traintb16.x8z
• the .O and .E files are copied back to the directory from which qsub was run
• after the job is completed, the PRIVATE ($jobdir) directory is deleted
 If using sandbox=PRIVATE with the -k oe option:
• jobdir = /home/pbsuser01/pbs.17.traintb16.x8z
• the .O and .E files remain in the $jobdir directory
• after the job is completed, the PRIVATE ($jobdir) directory is deleted, including the .O and .E files
File Staging
 Input/Output File Staging
• Users can specify which files/directories are copied onto the execution host before their job executes. This is known as STAGE IN.
• Users can specify which files/directories are returned to the submission host or a specified directory after the job completes. This is known as STAGE OUT.
• After a job is completed, all stage-in and stage-out files are removed from the execution host.
Command line input argument:
qsub -W stagein=<local_path/file>@<server_name>:<remote_path/file>
qsub -W stageout=<local_path/file>@<server_name>:<remote_path/file>
PBS Directive:
#PBS -W stagein=<local_path/file>@<server_name>:<remote_path/file>
#PBS -W stageout=<local_path/file>@<server_name>:<remote_path/file>
Where <local_path/file> is the path on the execution host, and <server_name>:<remote_path/file> is the host and path where the file is stored
Note: By default PBS uses rcp for file copying; scp can be used instead.
Walltime is not charged while files are being staged in and out.
Sending Messages to PBS Jobs – Using “qmsg”
 String messages can be sent to a job’s output (.O) or error (.E) file
 Why?
• To have external events recorded in the job’s files
• Useful for administrators to notify a job that system events occurred on the host where that job was running
• Using “qmsg” command:
Output file:
qmsg -O "<msg>" <job_id>
Error file:
qmsg -E "<msg>" <job_id>
Note: If neither -O nor -E is specified, the message is sent to the error file
Sending Signals to PBS Jobs - Concept
 Why send a signal?
• To force a program to take a specific action
 Most commonly used signals:
Signal    Description
SIGHUP    Hangs up the program process
SIGTERM   Terminates the program process
SIGINT    Interrupts the program process
SIGKILL   Kills now regardless of the state of the program
suspend   Suspends a job process
resume    Resumes a job process
Sending Signals to PBS Jobs – Using “qsig”
 Sending a signal
• Using “qsig” command
Usage:
qsig -s <signal> <job_id>
Example:
qsig -s suspend 0.traintb16
qsig -s resume 0.traintb16
Note: Here, <signal> can be either the name of the signal or its corresponding unsigned number.
Selective Job Querying – Using “qselect”
 Using qstat will output the status of all current jobs
 The qselect command can return a list of job IDs that meet specific criteria
Usage:
qselect -<option> <value>
Option  Value                    Description
-N      <name>                   Job name
-q      <queue>                  Queue name
-s      <job state>              Job states R, Q, etc.
-u      <user name>              User name
-H                               Finished or moved jobs
-l      <res.OP.value>           By resources
-t      <sub_option.OP.value>    By certain time type

OP      Description
.eq.    equal to
.ne.    not equal to
.ge.    greater than or equal to
.gt.    greater than
.le.    less than or equal to
.lt.    less than

sub_option  time_attribute   Description
a           Execution_Time   time after which the job is eligible to execute
c           ctime            job creation time
e           etime            time the job became eligible to run
g           eligible_time    accrued eligible time
m           mtime            modification time
q           qtime            job queued time
s           stime            job start time
Selective Job Querying – Using “qselect”, cont.
 Examples:
• To find job IDs of jobs belonging to a particular user:
qselect -u user01
• To find job IDs of running jobs that have requested more than 4 ncpus:
qselect -s R -l ncpus.gt.4
• To show the status of all running jobs by passing the qselect output to qstat:
qstat `qselect -s R`
• To delete all jobs in a PBS complex by passing the qselect output to qdel:
qdel `qselect`
• To list all jobs in a PBS complex, including finished or moved jobs:
qselect -x
• To list jobs whose start time falls within a given window:
qselect -ts.gt.09251200 -ts.lt.09251500
Note: Using qselect without any options outputs all job IDs
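The time-window example packs the sub-option, the operator and the timestamp into each -t argument. The sketch below builds that pair of arguments from its parts; qselect_time_window is a hypothetical helper name, and it only prints the arguments rather than running qselect.

```shell
#!/bin/sh
# Sketch only: qselect_time_window is a hypothetical helper that prints the
# two -t arguments for "later than <after> and earlier than <before>".
# Usage: qselect_time_window <sub_option> <after> <before>
qselect_time_window() {
    sub_option=$1
    after=$2
    before=$3
    printf '%s\n' "-t${sub_option}.gt.${after} -t${sub_option}.lt.${before}"
}

# Arguments for jobs that started between 12:00 and 15:00 on 25 September
qselect_time_window s 09251200 09251500
```

On a host with PBS installed, the printed arguments could be passed on as `qselect $(qselect_time_window s 09251200 09251500)`.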
Job Dependencies - Concept
 Users can specify dependencies between their jobs, such as:
• Specify order of execution
• Execute the next job only if the previous job finished
• Place jobs on hold until a particular job starts or completes
 Using “qsub” command
Usage:
qsub -W depend=<type>:<arg_list> <job_script>
Example:
qsub -W depend=afterok:1.traintb16 my_script
 To find out if a job has dependencies: qstat -f <jobid>
job_state = H
depend = afterok:1.traintb16@prog.altair.com
Note: Jobs that request a dependency will be placed in the “H” (held) state
Job Dependencies – Dependency Types
Dependency Type          Description
after:<arg_list>         Job may be scheduled for execution after all jobs in <arg_list> have started execution
afterok:<arg_list>       Job may be scheduled for execution only after all jobs in <arg_list> have terminated with no errors
afternotok:<arg_list>    Job may be scheduled for execution only after all jobs in <arg_list> have terminated with errors
afterany:<arg_list>      Job may be scheduled for execution after all jobs in <arg_list> have terminated, with or without errors
before:<arg_list>        Jobs in <arg_list> may begin execution once this job has begun execution
beforeok:<arg_list>      Jobs in <arg_list> may begin execution once this job terminates without errors
beforenotok:<arg_list>   Jobs in <arg_list> may begin execution once this job terminates execution with errors
beforeany:<arg_list>     Jobs in <arg_list> may begin execution once this job terminates execution, with or without errors
on:<count>               Job may be scheduled for execution after <count> dependencies on other jobs have been satisfied
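When a job depends on several others, the job IDs in <arg_list> are joined with colons. The sketch below assembles the depend= value from a dependency type and a list of job IDs; depend_option is a hypothetical helper name, and the second job ID is made up for illustration.

```shell
#!/bin/sh
# Sketch only: depend_option is a hypothetical helper that builds the value
# passed to "qsub -W" from a dependency type and one or more job IDs.
# Usage: depend_option <type> <job_id> [<job_id> ...]
depend_option() {
    dep_type=$1
    shift
    list=""
    for job in "$@"; do
        list="$list:$job"     # each job ID is preceded by a colon
    done
    printf '%s\n' "depend=${dep_type}${list}"
}

# 2.traintb16 is an illustrative job ID
depend_option afterok 1.traintb16 2.traintb16
```

On a host with PBS installed, the result could be used as `qsub -W "$(depend_option afterok 1.traintb16 2.traintb16)" my_script`.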
Chapter Eight - PBS Server & Site Configurations
Chapter Eight
 Viewing and setting server, queue, and vnode attributes
 Server log information
 Creating a backup of the PBS environment
 Exercises
Viewing PBS Server Configuration – Using “qmgr”
 PBS Administrators can use the PBS utility “$PBS_EXEC/bin/qmgr” to view and modify PBS server, queue and vnode attributes.
• The qmgr command prints out the commands to re-create server and queue settings. The values shown below are the defaults.
Default queue settings
create queue workq
set queue workq queue_type = Execution
set queue workq enabled = True
set queue workq started = True
Default server settings
set server scheduling = True
set server default_queue = workq
set server log_events = 511
set server mail_from = adm
set server query_other_jobs = True
set server resources_default.ncpus = 1
set server default_chunk.ncpus = 1
set server scheduler_iteration = 600
set server resv_enable = True
set server node_fail_requeue = 310
set server max_array_size = 10000
set server pbs_license_info = 7788@localhost
set server pbs_license_min = 1
set server pbs_license_max = 2147483647
set server pbs_license_linger_time = 3600
set server license_count = "Avail_Global:32 Avail_Local:1 Used:0 High_Use:0"
set server eligible_time_enable = False
set server max_concurrent_provision = 5
Qmgr Commands
▪ Helpful qmgr commands
• List of qmgr commands and PBS version:
  qmgr: help
• Print out commands to re-create server/queue:
  qmgr: print server|queue @default
• Print server/queue attributes and their values:
  qmgr: list server|queue @default
• Print attributes and values of a specific queue:
  qmgr: list queue <queue_name>
• Print out commands to re-create named queue:
  qmgr: print queue <queue_name>
• To delete a queue:
  qmgr: delete queue <queue_name>
• Print out commands to re-create vnodes:
  qmgr: print nodes @default
• Print attributes and values of a specific vnode:
  qmgr: list node <node_name>
• To set the value of an attribute:
  qmgr: set server|queue|node [<name>] <attribute> = <value>
• To unset the value of an attribute:
  qmgr: unset server|queue|node [<name>] <attribute>
• To create a new queue or vnode:
  qmgr: create queue|node <name>
PBS Server – Understanding PBS Server Attributes
▪ Setting server attributes allows PBS Administrators to specify who can submit jobs, how many jobs can be running, resource limits (min, max, available, and default), reservations, access control lists (ACLs), etc.
▪ Three levels of privilege: User, Operator, and Manager. Managers have the greatest privilege.
• All users can list or print attributes.
• Operators can additionally set or unset attribute values.
• Managers can additionally create or delete queues and vnodes.
▪ The PBS server daemon must be running in order to execute the qmgr utility.
▪ Any changes made to server attributes via qmgr go into effect as soon as they are entered; the pbs_server daemon does not need to be restarted.
PBS Server - Server Configuration Attributes
Attribute                  Description
scheduling                 Specifies whether the scheduler will schedule jobs (True|False)
default_queue              Queue to which jobs are sent when users don’t specify a target queue; set to ‘workq’ by the install script
log_events                 Specifies which events are logged by the server
mail_from                  Username from which the server sends mail. Default: “adm”
query_other_jobs           Specifies whether users can query other users’ job status (True|False)
resources_default.ncpus    Default value for ncpus assigned to a job if not requested at qsub
default_chunk.ncpus        Default value for ncpus per chunk
scheduler_iteration        Time, in seconds, between non-event-driven scheduling iterations
resv_enable                Enables/disables requesting reservations
node_fail_requeue          Time, in seconds, that the server waits for the primary execution vnode to come back up before it re-queues or deletes the vnode’s jobs
max_array_size             Maximum number of subjobs allowed in a job array
eligible_time_enable       Controls whether a job’s eligible_time attribute is used as its starving time
max_concurrent_provision   Maximum number of vnodes allowed to be in the process of being provisioned
PBS Queues – Understanding Queues
▪ PBS uses a resource-based scheduling system, where submitted jobs are held in a container waiting for execution.
▪ This container is known as a “queue”.
▪ There are two types of queues: Execution and Route
• Execution queue – holds jobs waiting for execution and running jobs
• Route queue – routes jobs to either an execution queue or another route queue
▪ Queues can be set up with attributes such as:
• Number of jobs running
• Max queued
• Resources available
• Which users/groups/hosts have access
▪ PBS comes with a predefined default execution queue: workq
PBS Queues – Attributes of an Execution Queue
▪ PBS administrators use the PBS “qmgr” utility to view, modify, and delete queues
▪ To view the attributes of queue workq: list queue workq
Qmgr: list queue workq
Queue workq                     (name of queue)
    queue_type = Execution      (type of queue)
    total_jobs = 0              (number of jobs in queue)
    state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0 Exiting:0 Begun:0    (number of jobs in each state)
    enabled = True              (whether queue accepts new jobs)
    started = True              (whether queue’s jobs can be run)
PBS Queues – Creating an Execution Queue
▪ Only PBS Administrators can create and delete queues
▪ To print out the commands to recreate queue workq: print queue workq
Qmgr: print queue workq
create queue workq                        (creation of a given queue)
set queue workq queue_type = Execution    (indicates what type of queue)
set queue workq enabled = True            (True|False: jobs can be enqueued)
set queue workq started = True            (True|False: jobs can be scheduled for execution)
PBS Queues – Creating an Execution Queue, cont
▪ Creating a new queue named “my_queue”
1. create queue my_queue
   Naming and creating the new queue
2. set queue my_queue queue_type = Execution
   Defining this queue as an Execution (or Route) queue
3. set queue my_queue enabled = True
   Setting the enabled attribute to True allows jobs to be enqueued
4. set queue my_queue started = True
   Setting the started attribute to True allows jobs to run from this queue
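The four steps above always follow the same pattern, so they can be generated for any queue name. This is a minimal Python sketch that only builds the command text and does not talk to PBS. The function name `execution_queue_commands` is hypothetical.

```python
def execution_queue_commands(name, enabled=True, started=True):
    """Return the qmgr directives that create an execution queue."""
    return [
        f"create queue {name}",
        f"set queue {name} queue_type = Execution",
        f"set queue {name} enabled = {enabled}",
        f"set queue {name} started = {started}",
    ]


for command in execution_queue_commands("my_queue"):
    print(command)
```

The printed lines have the same shape as the output of “print queue”, so they could be saved to a file and fed to qmgr on standard input.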
PBS Queues - Execution Queue Attributes
Attribute                        Description
max_queuable                     Maximum number of jobs allowed in the queue
max_running                      Maximum number of jobs allowed to be running
resources_default.<res_name>     Default amount of a resource assigned to a job if that resource is not specified via the qsub command
resources_max.<res_name>         Maximum amount of a resource that a job may request and still be allowed into this queue
resources_min.<res_name>         Minimum amount of a resource that a job must request to be allowed into this queue
resources_available.<res_name>   Maximum amount of a resource allowed to be used by all running jobs in this queue
PBS Queues - Why Use Multiple Execution Queues?
▪ Why would a PBS complex have multiple queues instead of a single queue?
• Having multiple queues could help with the following:
  • Various types of applications
  • Access by different users, groups, or hosts
  • Long, medium, or short running jobs
  • Different architectures
  • Various resources
  • Assigning a dedicated queue to a host/vnode
  • Peering jobs to another PBS complex
PBS Queues – Setting Access Control on Queues
▪ Queues can be configured so that only certain users, groups, or hosts can submit jobs to a particular queue.
• This functionality is called an Access Control List (“ACL”)
• There are 3 types of access level <acl_type>:
  “user”    a list of users who are allowed to enqueue jobs
  “group”   a list of groups that are allowed to enqueue jobs
  “host”    a list of hosts that are allowed to enqueue jobs
To set an ACL on a queue:
1. Enable the ACL functionality for that queue:
   set queue <queue_name> acl_<acl_type>_enable = True
2. Assign the list of UNIX/Linux users, groups, or hosts that will have access:
   set queue <queue_name> acl_<acl_type>s += "<list of users, groups, or hosts>"
3. To restrict a user, use the minus operator symbol:
   set queue <queue_name> acl_<acl_type>s = "-<user>"
PBS Queues – Creating a Routing Queue
▪ Routing queues route jobs to an execution queue or to another routing queue
• How can a routing queue be beneficial?
  • Allows users to submit to one queue instead of specifying a queue at qsub
  • Destination queues can be set up by ACL or resource restrictions
  • Jobs can be routed to another PBS complex
▪ To create a routing queue named “routeq”:
1. create queue routeq
2. set queue routeq queue_type = Route
3. set queue routeq route_destinations += "my_queue"
   - List of execution or route queues to be routed to
   - Comma-separated
4. set queue routeq enabled = True
5. set queue routeq started = True
PBS Queues - Routing Queue Attributes
▪ Routing queues may also be configured with queue attributes such as:
• route_lifetime
• max_queuable
• resources_max
• resources_min
• access control list (ACL)
▪ To prevent users from submitting jobs directly to an execution queue (thus bypassing the route queue), you can set the following attribute on the execution queue:
Usage:
set queue <queue_name> from_route_only = True
▪ To assign multiple execution queues as “route_destinations”:
Usage:
set queue <queue_name> route_destinations += "queue1, queue2, queue3"
PBS Queues – Assigning Queue Priorities
▪ Queues can be assigned a priority level between -1024 and +1023
• By default a new queue has a priority level set to 0
• Setting a non-default priority level serves two functions:
  1) The PBS Scheduler sorts the queues from high to low using this priority level for job sorting
  2) Enables the queue to be an Express Queue (by default, priority >= 150)
     • useful in determining which job to preempt when using Preemptive Scheduling
Usage:
set queue <queue_name> priority = <value>
Example:
set queue my_queue priority = 100
PBS Queues - How Updates Affect Jobs
▪ Any modifications made via the qmgr utility take effect immediately and do not require the pbs_server daemon to be restarted
▪ Certain types of attributes will affect those jobs already queued but not running
▪ Using qmgr to delete a queue that has jobs enqueued or running is not allowed
Alternative Methods:
• Stop jobs from being enqueued into the queue by setting enabled=false, and let the queue drain
• If waiting for the queue to drain is not an option:
  - Option 1: use qdel to delete the jobs
  - Option 2: use qmove to move jobs to a different queue or to another PBS complex
PBS Queues – Queue Status
▪ To obtain status of all the queues within a PBS complex: qstat -Q[f]
Output of: qstat -Q
Queue              Max   Tot Ena Str   Que   Run   Hld   Wat   Trn   Ext Type
---------------- ----- ----- --- --- ----- ----- ----- ----- ----- ----- ----
workq                0     0 yes yes     0     0     0     0     0     0 Exec
Output of: qstat -Qf
Queue: workq
queue_type = Execution
total_jobs = 0
state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0 Exiting:0 Begun:0
resources_assigned.ncpus = 0
resources_assigned.nodect = 0
enabled = True
started = True
PBS Nodes - Understanding PBS Vnodes
▪ What is a host?
• An instance of a single OS running
• A machine
▪ What is a PBS MOM?
• Executes the job script
• Reports back to the server when the job is completed
• Enforces some job resource limits
• Can manage multiple vnodes
• Tracks job resource usage
▪ What are vnodes?
• An abstract object representing a set of resources which form a usable part of a machine
  - Can be one of the following: host, nodeboard, or blade
• A single host can be made up of multiple vnodes
PBS Nodes - Viewing Existing Vnodes
▪ There are two methods to view the list of vnodes and their attributes in a PBS complex
Method 1: within qmgr: list nodes @default
Node traintb16
    Mom = traintb16
    Port = 15002
    pbs_version = PBSPro_11.0.0.103450
    ntype = PBS
    state = free
    pcpus = 1
    resources_available.arch = linux
    resources_available.host = traintb16
    resources_available.mem = 1027124kb
    resources_available.ncpus = 1
    resources_available.vnode = traintb16
    resources_assigned.mem = 0kb
    resources_assigned.ncpus = 0
    resources_assigned.netwins = 0
    resources_assigned.vmem = 0kb
    resv_enable = True
    sharing = default_shared
Method 2: using pbsnodes -av at the command line
traintb16
    Mom = traintb16
    Port = 15002
    pbs_version = PBSPro_11.0.0.103450
    ntype = PBS
    state = free
    pcpus = 1
    resources_available.arch = linux
    resources_available.host = traintb16
    resources_available.mem = 1027124kb
    resources_available.ncpus = 1
    resources_available.vnode = traintb16
    resources_assigned.mem = 0kb
    resources_assigned.ncpus = 0
    resources_assigned.netwins = 0
    resources_assigned.vmem = 0kb
    resv_enable = True
    sharing = default_shared
PBS Nodes – Setting Vnode Attributes
Attribute                   Description
comment                     Assign a comment
max_running                 Maximum number of jobs that can run on this vnode
priority                    Vnodes can be sorted by a priority level
state                       Shows or sets the state of the vnode. Useful for setting a vnode’s state to online/offline
queue                       Associate a queue to a vnode
sharing                     Defines whether more than one job at a time can use this vnode's resources
resources_available.<res>   List of resource amounts available on this vnode. If not explicitly set, amount shown is that reported by pbs_mom running on the vnode.
PBS Nodes – Using “pbsnodes”
▪ Use “pbsnodes” to obtain a detailed listing of all the hosts or vnodes in a PBS complex
Usage: pbsnodes <options>
Example: pbsnodes -a
Option   Description
-a       List all hosts and their attributes
-av      List all vnodes and their attributes
-l       List all hosts or vnodes with state=DOWN or state=OFFLINE
PBS Nodes – Output of “pbsnodes -a”
pbsnodes -a
traintb16
Mom = traintb16
Port = 15002
pbs_version = PBSPro_11.0.0.103450
ntype = PBS
state = free
pcpus = 1
resources_available.arch = linux
resources_available.host = traintb16
resources_available.mem = 1027124kb
resources_available.ncpus = 1
resources_available.vnode = traintb16
resources_assigned.mem = 0kb
resources_assigned.ncpus = 0
resources_assigned.netwins = 0
resources_assigned.vmem = 0kb
resv_enable = True
sharing = default_shared
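Output in this shape is easy to turn into a dictionary for scripting. The sketch below is a minimal Python parser. It assumes that a line without " = " starts a new host and that every "name = value" line belongs to the most recent host. The function name `parse_pbsnodes` is hypothetical, and the sample is a shortened copy of the listing above.

```python
SAMPLE = """traintb16
     Mom = traintb16
     state = free
     resources_available.ncpus = 1
     resources_available.mem = 1027124kb
"""


def parse_pbsnodes(text):
    """Map each host name to a dict of its attribute names and values."""
    nodes = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank separator lines
        if " = " in line:
            key, value = line.strip().split(" = ", 1)
            if current is not None:
                nodes[current][key] = value
        else:
            current = line.strip()  # a bare line is taken to be a host name
            nodes[current] = {}
    return nodes


nodes = parse_pbsnodes(SAMPLE)
print(nodes["traintb16"]["state"])
```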
PBS Server - Server Log Information
▪ Server logs are stored on the host where the pbs_server daemon is running
• Location: $PBS_HOME/server_logs
• A new log file is created every day
  - File name format: [YYYYMMDD]
▪ The logging level is configurable using the qmgr utility
Usage:
set server log_events = <value>
Where <value> can be between 0 and 2047
  - 0      nothing is logged
  - 511    default log level
  - 2047   everything is logged; useful for debugging hooks
Note: When changing the server’s log_events value it is not necessary to restart the pbs_server daemon
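The three values listed (0, 511, 2047) are consistent with log_events being a bitmask in which each bit switches one class of event on. That reading is an assumption here, since the slide gives only the three values. Under it, this minimal Python sketch splits a value into its individual bits. The helper name `set_bits` is hypothetical.

```python
def set_bits(value):
    """Return the powers of two that add up to value, lowest first."""
    return [1 << i for i in range(value.bit_length()) if value & (1 << i)]


for level in (0, 511, 2047):
    print(level, set_bits(level))
```

Under this reading the default of 511 enables the nine lowest bits, and 2047 enables two more.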
PBS Server – Details of Server Log Entry
Sample of Server log entry:
09/14/2010 08:17:31;0002;Server@trainhp01;Svr;Log;Log opened
09/14/2010 08:17:45;0002;Server@trainhp01;Node;traintb16.prog.altair.com;node up
09/14/2010 08:18:36;0040;Server@trainhp01;Svr;traintb16;Scheduler sent command 3
Syntax:
date-time;event_code;server_name;object_type;object_name;message_text
Field          Description
date-time      date and time stamp, format: mm/dd/yyyy hh:mm:ss
event_code     numerical code for type of event
server_name    name of the Server which logged the message
object_type    type of object which the message is about: Svr=server, Que=queue, Job=job, Req=request, Fil=file, Node=vnode, Hook=hooks
object_name    name of the specific object
message_text   text of the log message
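Because every entry uses the same six semicolon-separated fields, a log line can be split mechanically. The following minimal Python sketch uses the field names from the syntax line. Treating event_code as a hexadecimal number is an assumption, and the names `LogEntry` and `parse_log_line` are hypothetical.

```python
from collections import namedtuple
from datetime import datetime

LogEntry = namedtuple(
    "LogEntry",
    "timestamp event_code server_name object_type object_name message_text",
)


def parse_log_line(line):
    """Split one log line into its six fields."""
    # Split at most five times so semicolons inside the message survive.
    fields = line.rstrip("\n").split(";", 5)
    if len(fields) != 6:
        raise ValueError("expected 6 semicolon-separated fields")
    timestamp = datetime.strptime(fields[0], "%m/%d/%Y %H:%M:%S")
    event_code = int(fields[1], 16)  # assumption: the code is written in hex
    return LogEntry(timestamp, event_code, *fields[2:])


entry = parse_log_line(
    "09/14/2010 08:17:45;0002;Server@trainhp01;Node;"
    "traintb16.prog.altair.com;node up"
)
print(entry.object_type, entry.object_name, entry.message_text)
```

The MOM and scheduler logs shown later use the same six-field layout, so the same split should apply to them.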
Backing up Server, Queue & Vnode Settings
▪ PBS Administrators can safely back up their qmgr settings at the command line:
1. Output the server and queue settings:
   qmgr -c "print server" > server_queue_settings
2. Print all attributes for all vnodes:
   qmgr -c "print node @default" > vnodal_settings
3. Print all attributes for all hooks:
   qmgr -c "print hook" > hook_definitions
▪ To restore settings:
   qmgr < <input_file>
Chapter Nine - PBS MOM Configuration
▪ What is the PBS MOM?
▪ Directory structure of $PBS_HOME/mom_priv
▪ Contents of $PBS_HOME/mom_priv/jobs
▪ Configuration parameters
▪ Enforcing resource limits
▪ Restricting user logins
▪ Checkpoint and restart
▪ MOM log information
▪ Details of MOM logs
▪ Exercises
What is the PBS MOM?
▪ The PBS MOM is the component responsible for monitoring and executing PBS jobs, as well as the following:
• Reports resource usage
• Enforces resource usage limits
• Notifies the server when the job has finished
• Executes prologue/epilogue script
▪ Each execution host (MOM) has its own configuration file
• Located in $PBS_HOME/mom_priv/config
• Provides several types of runtime information
  - Access control
  - Static resource names and values
  - External resources provided by a program to be run on request via a shell script
• Each parameter is on a separate line and component parts are separated by white space
• Default contents of mom_priv/config:
  $clienthost traintb16
  $restrict_user_maxsysid 499
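The "one parameter per line, parts separated by white space" rule can be illustrated with a minimal Python sketch that reads the default contents shown above. Skipping lines that begin with "#" is an assumption, and the function name `parse_mom_config` is hypothetical.

```python
DEFAULT_CONFIG = """$clienthost traintb16
$restrict_user_maxsysid 499
"""


def parse_mom_config(text):
    """Return a list of (parameter, [arguments]) pairs, one per line."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # assumption: '#' marks a comment
            continue
        name, *args = line.split()  # parts are separated by white space
        entries.append((name, args))
    return entries


for name, args in parse_mom_config(DEFAULT_CONFIG):
    print(name, args)
```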
Directory Structure of $PBS_HOME/mom_priv *
mom_priv
    config      Configuration file
    jobs        Directory in which the job script is placed while a job is running
    vnodemap    List of vnodes in a PBS complex
    mom.lock    MOM pid lock file
* This information is for debugging purposes only. It may change in future releases.
Contents of $PBS_HOME/mom_priv/jobs *
-rw------- 1 root      root   3427 Jun 10 00:40 2.traintb16.JB
-rwx------ 1 pbsuser01 users    22 Jun 10 00:40 2.traintb16.SC
drwx------ 2 root      root   4096 Jun 10 00:40 2.traintb16.TK
▪ While a job is running on a given host, PBS keeps 2 files and 1 directory for that job in the mom_priv/jobs directory
<job_id>.<server_name>.JB    Contains job information such as resources used
<job_id>.<server_name>.SC    User job script
<job_id>.<server_name>.TK    Directory containing that job’s tasks
* This information is for debugging purposes only. It may change in future releases.
PBS MOM – MOM Configuration Parameters
Parameter                 Description
$clienthost               List of hosts allowed to connect to MOM
$cputmult                 Factor to adjust CPU time used by each job
$ideal_load               Declares the ideal mark for load on a vnode
$max_load                 Declares the high water mark for load on a vnode
$kbd_idle                 Enables idle workstation cycle harvesting
$logevent                 Determines the kind of information logged to MOM logs
$max_check_poll           Maximum time between polling cycles
$min_check_poll           Minimum time between polling cycles
$prologalarm              Timeout period for prologue/epilogue script
$restricted               List of hosts that are allowed to connect to MOM without needing a privileged port
$restrict_user            Controls whether normal users without a job running can log into the host
$restrict_user_maxsysid   Any user with UID less than this value is exempt from $restrict_user
$suspendsig               Alternative signal to suspend job instead of SIGSTOP
$usecp                    Tells MOM to use cp instead of rcp/scp for stdout/err file transfers
$wallmult                 Factor used to adjust walltime usage by a job
$tmpdir                   Specifies location of job scratch directory
$jobdir_root              Specifies location of job-specific staging and execution directories
Note: After modifying the MOM’s config file, a “SIGHUP” must be sent to that pbs_mom daemon
PBS MOM – Enforcing Resource Limits
▪ Each MOM can be configured to enforce job resource limits by setting the $enforce parameter in the mom_priv/config file
Attribute              Type      Description                                                  Default
average_cpufactor      float     Modifies cpuaverage; ncpus limit multiplier                  1.025
average_percent_over   int       Modifies cpuaverage; percentage over ncpus limit to allow    50
average_trialperiod    int       Modifies cpuaverage; minimum walltime before enforcement     120s
cpuaverage             boolean   Enforce this limit                                           off
cpuburst               boolean   Enforce this limit                                           off
delta_cpufactor        float     Modifies cpuburst; ncpus limit multiplier                    1.5
delta_percent_over     int       Modifies cpuburst; percentage over the limit to allow        50
delta_weightup         float     Modifies cpuburst; weighting when average is moving up       0.4
delta_weightdown       float     Modifies cpuburst; weighting when average is moving down     0.1
mem                    boolean   Enforces each job’s memory limit                             off
PBS MOM - Restricting User Logins
▪ PBS Professional can be configured to kill user-owned processes when that user does not have a job running on that host through PBS
• To configure this functionality, add the following parameter to the $PBS_HOME/mom_priv/config file:
  $restrict_user on
  Note: When this feature is turned on, all processes belonging to any user who logs onto that execution host without a running job will be terminated, thus kicking that user off
• To create a list of users who are allowed when this feature is enabled:
  $restrict_user_exceptions userA, userB, userC
  Note: Up to 10 user names are allowed
• To restrict only users whose user ID is greater than a specified number:
  $restrict_user_maxsysid <number>
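Taken together, the three parameters amount to a simple decision about whether a user's processes are left alone. The minimal Python sketch below follows the wording of the slides and is not MOM's actual implementation. The function name `is_exempt` is hypothetical, the default of 499 comes from the default config shown earlier, and the treatment of a UID exactly equal to the limit is an assumption.

```python
def is_exempt(username, uid, has_running_job,
              exceptions=(), maxsysid=499):
    """Return True if $restrict_user would leave this user's processes alone."""
    if has_running_job:
        return True   # users with a job on this host are never restricted
    if username in exceptions:
        return True   # listed in $restrict_user_exceptions
    return uid < maxsysid   # system accounts below $restrict_user_maxsysid


print(is_exempt("root", 0, has_running_job=False))
print(is_exempt("userA", 1001, has_running_job=False, exceptions=("userA",)))
print(is_exempt("userD", 1004, has_running_job=False))
```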
PBS MOM – Checkpoint and Restart
▪ PBS administrators can use their own site-defined external checkpoint facility
• This is useful on systems that don’t support OS-level checkpointing
• Provided by application or other external means
▪ Site-specific checkpointing is configured in the MOM configuration file mom_priv/config by using the $action parameter and an action
Action                  Argument                          Description
checkpoint              TIME_OUT !SCRIPT_PATH ARGS[…]     Specifies that the script in SCRIPT_PATH is run and the job is left running
checkpoint_abort        TIME_OUT !SCRIPT_PATH ARGS[…]     Specifies that the script in SCRIPT_PATH is run and the job is terminated
restart                 TIME_OUT !SCRIPT_PATH ARGS[…]     Specifies the script to be used to restart the job
$restart_background     true|false                        Specifies how the job is restarted
$restart_transmogrify   true|false                        Controls how MOM launches the restart script/program
PBS MOM - MOM Log Information
▪ Each execution host has its own MOM log files
• Location: $PBS_HOME/mom_logs
• A new log file is created every day
  - File name format: [YYYYMMDD]
▪ The logging level is configurable in $PBS_HOME/mom_priv/config
Usage:
$logevent <value>
Where <value> can be between 0 and 0xffffffff
  - 0            Nothing is logged
  - 0xffffffff   All information is logged
Note: After changing the log event value, send a SIGHUP to the pbs_mom daemon so that it rereads the mom_priv/config file
PBS MOM – Details of MOM Log Entry
Sample of MOM log entry:
09/14/2010 11:35:55;0008;pbs_mom;Job;1.traintb16;Started, pid = 24073
09/14/2010 11:36:01;0080;pbs_mom;Job;1.traintb16;task 00000001 terminated
09/14/2010 11:36:01;0008;pbs_mom;Job;1.traintb16;Terminated
Syntax: date-time;event_code;pbs_daemon;object_type;object_name;message_text
Field          Description
date-time      Date and time stamp. Format: mm/dd/yyyy hh:mm:ss
event_code     Numerical code for type of event
pbs_daemon     pbs_mom
object_type    Type of object which the message is about: Svr=server, Que=queue, Job=job, Req=request, Fil=file, Node=vnode
object_name    Name of the specific object
message_text   Text of the log message
Chapter Ten - PBS Scheduler Configuration
▪ What is the PBS scheduler?
▪ Directory Structure of $PBS_HOME/sched_priv
▪ Default behavior of the scheduler
▪ Scheduler configuration file
▪ Default scheduling parameters
▪ Scheduler log information
What is the PBS Scheduler?
• The PBS daemon that is responsible for enforcing site policy, by choosing the order in which jobs are run, and on what resources
• The scheduler provides various scheduling policies such as:
  - First in First Out (FIFO)
  - Sort jobs based on multiple resources (high to low or low to high)
  - Sort nodes based on resources or priority level
  - Sort queues based on priority level or by qstat -Q output order
  - Allow jobs from higher priority queues to be eligible to run first
  - Allow jobs to move between two or more PBS complexes
  - Allow jobs to run in a dedicated time space
  - Enforce fair portions of a site’s resources and usage
Directory Structure of $PBS_HOME/sched_priv *
sched_priv
    sched_config     Scheduler configuration file
    dedicated_time   Specifies dedicated time
    resource_group   Specifies relative percentages between fairshare entities
    holidays         Lists holidays to be treated as “nonprimetime”
    sched_out        Debug messages
    sched.lock       pbs_sched pid lock file
* This information is for debugging purposes only. It may change in future releases.
Default Behavior of the Scheduler
▪ What events happen within a scheduling cycle?
1. Server will send list of MOM resources to the Scheduler
2. Scheduler will sort all the resources based on default scheduling policies
3. Scheduler will sort queue(s)
   - If one or more queues have the priority attribute set, queues are sorted by queue priority
   - If no queue priority is set, queues are sorted randomly or in qstat -Q output order
   - If a queue’s priority is set to 150 or higher, jobs from this queue will be eligible for execution first
   - If a queue’s priority is set to 150 or higher and preemption is enabled, then preemptive scheduling will be enforced, allowing jobs from this queue to preempt other jobs
4. Scheduler will sort the jobs from the first queue
   - Jobs are sorted based on when they were enqueued
   - If a job has been marked “starving” and the help_starving_jobs scheduling policy is turned on, that job is moved up in sort priority
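Steps 3 and 4 can be pictured with a small simulation. The minimal Python sketch below only illustrates the ordering described above and is not the scheduler's actual algorithm. The function names, the dictionary keys, and the sample queues are hypothetical, and 150 is the default express-queue threshold quoted in the slides.

```python
def order_queues(queues, preempt_queue_prio=150):
    """Sort queues from high to low priority and flag express queues."""
    ordered = sorted(queues, key=lambda q: q["priority"], reverse=True)
    return [(q["name"], q["priority"] >= preempt_queue_prio) for q in ordered]


def order_jobs(jobs):
    """Sort the jobs of one queue by the time they were enqueued (FIFO)."""
    return sorted(jobs, key=lambda j: j["queued_at"])


queues = [
    {"name": "workq", "priority": 0},
    {"name": "express", "priority": 200},
    {"name": "my_queue", "priority": 100},
]
for name, is_express in order_queues(queues):
    print(name, "express" if is_express else "normal")
```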
Using the sched_config file
▪ Parameter format:
name: value [prime | non_prime | all | none]
Name        Description                                                                                    Notes
name        Name of the scheduler parameter                                                                non-changeable
value       Type: string, string array, integer, boolean, time                                             case-sensitive
prime       Applies only to primetime period                                                               case-sensitive
non_prime   Applies only to non-primetime period                                                           case-sensitive
all         Applies to both primetime and non-primetime periods; default if prime/nonprime is not specified   case-sensitive
none        Not used
• Primetime and non-primetime periods are set in the sched_priv/holidays file
• Must send a “kill -HUP <pbs_sched_pid>” in order for the Scheduler to re-read the configuration file
• Any modifications may affect not only queued jobs but also running jobs
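The parameter format can be illustrated with a minimal Python parser for a single sched_config line. It assumes that values may be wrapped in double quotes and that a leading "#" comments a line out, which matches how the defaults are listed. The function name `parse_sched_line` is hypothetical.

```python
import shlex

PERIODS = {"prime", "non_prime", "all", "none"}


def parse_sched_line(line):
    """Return (name, value, period) for one line, or None for a comment."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    name, _, rest = line.partition(":")  # split at the first colon only
    tokens = shlex.split(rest)           # keeps quoted values together
    period = "all"                       # default when no period is given
    if len(tokens) > 1 and tokens[-1] in PERIODS:
        period = tokens.pop()
    return name.strip(), " ".join(tokens), period


print(parse_sched_line("backfill_prime: false prime"))
print(parse_sched_line('resources: "ncpus, mem, arch, host, vnode"'))
print(parse_sched_line("max_starve: 24:00:00"))
```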
Default Scheduling Parameters – sched_config
Parameter                        Value
round_robin:                     false
by_queue:                        true
strict_ordering:                 false
help_starving_jobs:              true
max_starve:                      24:00:00
backfill:                        true
backfill_prime:                  false
prime_exempt_anytime_queues:     false
#prime_spill:                    1:00:00
primetime_prefix:                p_
nonprimetime_prefix:             np_
#job_sort_key:                   "cput LOW"
node_sort_key:                   "sort_priority HIGH"
sort_queues:                     true
resources:                       "ncpus, mem, arch, host, vnode"
#sched_cycle_length:             20:00:00
load_balancing:                  false
smp_cluster_dist:                pack
#unknown_shares:                 10
fairshare_usage_res:             cput
fairshare_entity:                euser
half_life:                       24:00:00
sync_time:                       1:00:00
#fairshare_enforce_no_shares:    true
preemptive_sched:                true
preempt_queue_prio:              150
preempt_prio:                    "express_queue, normal_jobs"
preempt_order:                   "SCR"
preempt_sort:                    min_time_since_start
dedicated_prefix:                ded
log_filter:                      3328
PBS Scheduler – Scheduler Log Information
▪ Scheduler logs are stored on the machine where the pbs_sched daemon is running (default)
• Location: $PBS_HOME/sched_logs
• A new log file is created every day
  - File name format: [YYYYMMDD]
▪ The logging level is configurable in $PBS_HOME/sched_priv/sched_config:
Usage:
log_filter: <value>
Where <value> can be between 0 and 4095
• 0      Log everything
• 3328   Default value
• 4095   Log nothing
Note: After changing the scheduler log filter it is necessary to do a kill -HUP on the pbs_sched pid
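Since 0 logs everything and 4095 logs nothing, log_filter appears to work as a mask of event classes to suppress, the reverse of the server's log_events. That reading is an assumption. Under it, this minimal Python sketch shows which bits the default of 3328 filters out and which remain logged. The helper names are hypothetical.

```python
ALL_CLASSES = 4095  # twelve bits, the "log nothing" value from the slide


def filtered_bits(log_filter):
    """Bits that are set in log_filter, i.e. classes that are suppressed."""
    return [1 << i for i in range(12) if log_filter & (1 << i)]


def logged_bits(log_filter):
    """Bits that are clear in log_filter, i.e. classes still logged."""
    return filtered_bits(ALL_CLASSES & ~log_filter)


print(filtered_bits(3328))
print(logged_bits(3328))
```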
PBS Scheduler – Details of Scheduler Log Entry
Sample of scheduler log entry:
09/14/2010 16:48:36;0080;pbs_sched;Req;;Starting Scheduling Cycle
09/14/2010 16:48:36;0080;pbs_sched;Req;;Leaving Scheduling Cycle
09/14/2010 21:45:47;0002;pbs_sched;Svr;Log;Log closed
Syntax: date-time;event_code;pbs_daemon;object_type;object_name;message_text
Field          Description
date-time      Date and time stamp. Format: mm/dd/yyyy hh:mm:ss
event_code     Numerical code for type of event
pbs_daemon     pbs_sched
object_type    Type of object which the message is about: Svr=server, Que=queue, Job=job, Req=request, Fil=file, Node=vnode
object_name    Name of the specific object
message_text   Text of the log message
Chapter Twelve - Scheduling Custom Resources
▪ Custom Resources
▪ Resource Types
▪ Resource Flags
▪ Understanding the resourcedef file
▪ Different examples of using custom resources
• Host/vnode level resource
• Boolean resource
• Server level resource
• Queue level resource
• Query execution hosts
• Query FLEXlm server
Scheduling Resources - Custom Resources
▪ The PBS Scheduler supports arbitrary resources, e.g. to track disk space or application licenses
▪ Limiting resource usage for users, groups, queues, and vnodes influences the order in which jobs are started
▪ Resources may be tracked in two ways:
• Internally by PBS: resources which are consumed by PBS jobs only
• External scripts: resources which might be consumed by PBS jobs and/or outside of PBS
▪ Resources can exist at various levels
• Host (vnode) level
• Server and queue level
▪ Resource matching
• Via arithmetic comparison for number and size type resources
• Via string matching for Boolean and string resources
Scheduling Resources – Resource Types
Data Type      Description                                                                      Consumable/NON
boolean        Defined at vnode level; used within a select statement                           non-consumable
float          Values [+-] 0-9 [[0-9] …][.][[0-9]…]                                             consumable or non-consumable
long           Values 0-9[[0-9]…]                                                               consumable or non-consumable
size           Number of bytes or words                                                         consumable or non-consumable
string         String value                                                                     non-consumable
string_array   Multiple string values separated by comma                                        non-consumable
time           Maximum time period that resource can be used; format: [hh:mm:ss[.ms]]           consumable or non-consumable
Scheduling Resources – Resource Flags
Flag        Description                                        Consumable/Non-consumable
h           host level resource, static or dynamic;            non-consumable
            used within select statement
n           host level resource, "n" means static;             consumable
            must also use flag "h"
f           host level resource;                               consumable at first vnode
            must also use flag "h"
q           server level resource;                             consumable
            queue level resource
<no flag>   server level resource, no flag means dynamic;      non-consumable
            queue level resource
i           invisible;
            users cannot request or qalter this resource;
            users cannot view the value using qstat -f
r           read only;
            users cannot request or qalter this resource;
            users can view the value using qstat -f
Scheduling Resources – resourcedef
 Custom resources are defined in: $PBS_HOME/server_priv/resourcedef
• File needs to be created manually
• Permissions must be set to 644
Format:
resource_name type=<resource type> flag=<flag>
Sample of resourcedef
optistruct    type=long      flag=hn
motionsolve   type=boolean   flag=h
radioss       type=long      flag=q
jobtype       type=string
scratch       type=size      flag=h
gwu           type=long
Note: Any modifications to the resourcedef file require pbs_server to be restarted
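The `resource_name type=<resource type> flag=<flag>` format can be checked with a few lines of Python before the server is restarted. This is a minimal sketch, assuming whitespace-separated fields and using the resource definitions from this chapter's examples. The function name `parse_resourcedef` is hypothetical and is not a PBS tool.

```python
def parse_resourcedef(text):
    """Return {resource_name: {"type": ..., "flag": ...}} for resourcedef-style text."""
    resources = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        name, *attrs = line.split()
        entry = {"type": None, "flag": ""}  # an empty flag means no flag was given
        for attr in attrs:
            key, sep, value = attr.partition("=")
            if not sep or key not in entry:
                raise ValueError("unrecognized attribute %r for %s" % (attr, name))
            entry[key] = value
        resources[name] = entry
    return resources


# Definitions taken from the examples in this chapter.
SAMPLE = """\
optistruct type=long flag=hn
motionsolve type=boolean flag=h
radioss type=long flag=q
jobtype type=string
scratch type=size flag=h
gwu type=long
"""

for name, entry in parse_resourcedef(SAMPLE).items():
    print(name, entry["type"], entry["flag"] or "<no flag>")
```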
Custom Resource – Unset Resources
 Where a resource is not set at the host, server, or queue level, PBS
assigns a default value based on the type of resource
 Host/Vnode Level
Resource Type   Unset Resource   Request Value
boolean         False            False
float           0.0              0.0
long            0                0
size            0                0
string          ""               No match value
string_array    ""               No match value
time            0:00             0:00
 Server/Queue Level
• Numerical resources = infinite
 Unset custom resources can instead be treated as infinite at any level
(host, server, or queue) by setting the scheduler parameter:
resource_unset_infinite
Custom Resource – Host/vnode-Level
 Create a custom resource to be applied at the vnode level to indicate how much
of that resource is available at a given time
• Define the custom resource in resourcedef:
optistruct type=long flag=hn
• Set the value of the custom resource in qmgr:
set node traintb01 resources_available.optistruct=2
set node traintb02 resources_available.optistruct=0
• Add the custom resource to sched_config file:
resources: "ncpus, mem, arch, host, vnode, optistruct"
• Request the custom resource:
qsub -l select=1:ncpus=2:optistruct=1
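To illustrate what "consumable at the host level" means for this example, the sketch below places a chunk on the first vnode that has enough of every requested resource and then subtracts what the chunk uses. It is a simplified model for teaching purposes, not the PBS scheduler's actual algorithm. The `ncpus` values and the function names are assumptions.

```python
def find_vnode(vnodes, chunk):
    """Return the first vnode whose free resources cover the chunk, else None."""
    for name, free in vnodes.items():
        # A resource that is missing on a vnode is counted as 0 here.
        if all(free.get(res, 0) >= amount for res, amount in chunk.items()):
            return name
    return None


def run_chunk(vnodes, chunk):
    """Place the chunk and subtract the consumable amounts it uses."""
    name = find_vnode(vnodes, chunk)
    if name is not None:
        for res, amount in chunk.items():
            vnodes[name][res] = vnodes[name].get(res, 0) - amount
    return name


# optistruct values come from the qmgr commands above; ncpus=4 is assumed.
vnodes = {
    "traintb01": {"ncpus": 4, "optistruct": 2},
    "traintb02": {"ncpus": 4, "optistruct": 0},
}
chunk = {"ncpus": 2, "optistruct": 1}  # select=1:ncpus=2:optistruct=1

print(run_chunk(vnodes, chunk))
print(vnodes["traintb01"])
```

In this model traintb02 can never run the chunk, because its optistruct count is 0.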
Custom Resource – Boolean Resource
 Create a custom resource to be applied at the vnode level. This custom
resource will indicate whether or not that resource is available on a given
vnode
 Define the custom resource in resourcedef:
motionsolve type=boolean flag=h
 Set the value of the custom resource using qmgr:
set node traintb01 resources_available.motionsolve=true
set node traintb02 resources_available.motionsolve=false
 Add the custom resource to sched_config file:
resources: "ncpus, mem, arch, host, vnode, motionsolve"
 Request the custom resource:
qsub -l select=1:ncpus=2:motionsolve=true
Custom Resource – Server Level Resource
 Create a custom resource to be applied at the server, to track how much of that
resource is available globally at a given time
• Define the custom resource within resourcedef:
radioss type=long flag=q
• Set the value of the custom resource using qmgr:
set server resources_available.radioss=8
• Add the custom resource to sched_config file:
resources: "ncpus, mem, arch, host, vnode, radioss"
• Request the custom resource:
qsub -l select=1:ncpus=2 -l radioss=1
Custom Resource – Queue Level Resource
 Create a custom resource to be applied at the queue, to control whether or not a job
can be enqueued based on whether the job requests this resource
• Define the custom resource within resourcedef:
jobtype type=string
• Set the value of the custom resource using qmgr:
set queue radioss resources_available.jobtype=radioss
set queue radioss resources_min.jobtype=radioss
set queue radioss resources_max.jobtype=radioss
set queue radioss resources_default.jobtype=" "
• Add the custom resource to sched_config file:
resources: "ncpus, mem, arch, host, vnode, jobtype"
• Request the custom resource:
qsub -l select=1:ncpus=2 -l jobtype=radioss
Custom Resource – Query Vnodes
 Create a custom resource to query vnodes using a call-out script
• Define the custom resource within resourcedef:
scratch type=size flag=h
• Add the custom resource to sched_config file:
resources: "ncpus, mem, arch, host, vnode, scratch"
mom_resources: "scratch"
• Set the path to the script name in mom_priv/config file:
scratch !/usr/local/bin/scratch.pl
• Request the custom resource:
qsub -l select=1:ncpus=2:scratch=1GB
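The slide refers to a Perl call-out script, scratch.pl, whose contents are not shown. As a rough idea of what such a script might do, here is a Python sketch that reports free disk space. It assumes that the resource value is taken from a single line on the script's standard output and that a `kb` suffix is an acceptable size unit. The temporary directory stands in for the site's scratch file system.

```python
import shutil
import tempfile


def free_scratch_kb(path):
    """Return the free space under path, in whole kilobytes."""
    return shutil.disk_usage(path).free // 1024


# Stand-in for the real scratch directory, which is site specific.
scratch_dir = tempfile.gettempdir()

# Emit one line: the value followed by a unit suffix.
print("%dkb" % free_scratch_kb(scratch_dir))
```

Because the value is re-read by the call-out rather than tracked by PBS, it also reflects space used outside of PBS jobs.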
Custom Resource – Query FLEXlm Server
 Create a custom resource to query the FLEXlm server to determine if enough
FLEX tokens are available for execution
• Define the custom resource within resourcedef:
gwu type=long
• Add the custom resource to sched_config file:
resources: "ncpus, mem, arch, host, vnode, gwu"
• Set the path to the script in sched_config file:
server_dyn_res: "gwu !/var/spool/altair/scripts/lmstat"
• Request the custom resource:
qsub -l select=1:ncpus=2 -l gwu=50
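The contents of the lmstat script referenced above are not shown in this course. The sketch below shows one way such a script could compute the number of free tokens, assuming the license utility prints a usage summary line of the form `Users of <feature>: (Total of N licenses issued; Total of M licenses in use)`. The sample text and the function name `available_tokens` are illustrative assumptions.

```python
import re

# Matches the assumed per-feature summary line of the license status output.
_USAGE = re.compile(
    r"Users of (\S+):\s*\(Total of (\d+) licenses? issued;"
    r"\s*Total of (\d+) licenses? in use\)"
)


def available_tokens(status_output, feature):
    """Return issued minus in-use tokens for feature, or 0 if it is not listed."""
    for match in _USAGE.finditer(status_output):
        if match.group(1) == feature:
            return int(match.group(2)) - int(match.group(3))
    return 0


# Made-up sample output, for illustration only.
sample = "Users of gwu:  (Total of 200 licenses issued;  Total of 160 licenses in use)"

# A dynamic resource script would print this number for the scheduler to read.
print(available_tokens(sample, "gwu"))
```

With these sample numbers, a job requesting gwu=50 would have to wait, because fewer than 50 tokens are free.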
Chapter Eleven - Various Scheduling Policies
Chapter Eleven
 Job priorities in PBS
 Sorting queues
 Helping starving jobs
 Eligible time
 Backfill
 Strict ordering
 True FIFO
 Preemptive scheduling
 Hard & soft limits
 Sorting jobs
 Tunable formula
 Round robin
 SMP cluster scheduling
 Sort execution hosts
 Placement sets
 Primetime & non-primetime
 Dedicated time
 Fairshare
 Peer scheduling
 Advance reservations
 Exercises
Chapter Fifteen – Hooks
Chapter Fifteen
 Concept
 Hook commands
 Setting up a custom hook
 Viewing hook definitions
 Exporting hook contents
 Exercises
Hooks - Concept
 What are hooks?
• Custom call-out executables that give more precise control over
submitting jobs
• Written in the Python programming language
• Example applications of hooks:
- Allow/disallow en-queueing jobs based on user/group ID, amount of requested resources, timeframe
- Allow/disallow modifying job attributes of already-submitted jobs
- Allow/disallow moving jobs to another execution queue or PBS complex
- Allow/disallow requesting an advance/standing reservation
- Look up 3rd party database for credentials
• To view hook logging information within the server logs the server
log_events attribute should be set to: 2047
• Only root can create hooks
Hooks - Commands
 Hook commands in qmgr:
Command                                                                     Description
list hook <hook name>                                                       List a hook's attributes and their values
print hook <hook name>                                                      Print a hook's creation commands
create hook <hook name>                                                     Create a new hook
set hook <hook name> <attribute name> = <value>                             Set a hook's attribute
unset hook <hook name> <attribute name>                                     Unset a hook's attribute
import hook <hook name> <content-type> <content-encoding> <input file>|-    Import a hook's Python script file
export hook <hook name> <content-type> <content-encoding> <output file>|-   Export a hook's Python script to a file
delete hook <hook name>                                                     Remove a hook and its definition
Hooks - Adding a Hook
 Steps to add a hook, using qmgr:
1. Add the hook name
• first character must be alphabetic
Qmgr: create hook <hook_name>
2. Set the type of trigger event
• can have multiple events associated with a single hook, using "+="
Qmgr: set hook <hook_name> event = <event_name>
<event_name>   Description
queuejob       To allow/disallow enqueueing a job into a queue
modifyjob      To allow/disallow modifying job attributes
resvsub        To allow/disallow reservation requests by users
movejob        To allow/disallow moving jobs to another queue or PBS complex
runjob         To allow modifications of running jobs
provision      To provision a host
Hooks - Adding a Hook, cont.
3. Specify the path and name of the Python script
Qmgr: import hook <hook_name> application/x-python <content-encoding> \
<path/filename>
<content-encoding>: default (7bit) or base64
Note: when importing a hook, PBS will try to evaluate the script.
If it cannot, it will report the information at the command
line and in the server logs
Additional options:
4. Relative order of hook execution; default = 1 (highest level)
Qmgr: set hook <hook_name> order = <n>
5. Specify a timeout value for hook execution; default = 30 seconds
Qmgr: set hook <hook_name> alarm = <n>
Hooks - Adding a Hook, cont.
6. Enable or disable a particular hook; default = true
Qmgr: set hook <hook_name> enabled = <Boolean>
Note: The pbs_server daemon does not need to be restarted for a hook to be active
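Steps 1 to 6 can be collected into a short command list. The Python sketch below only builds the qmgr command strings shown on these slides and prints them in `qmgr -c` form. It does not run anything. The helper name `hook_setup_commands` and the hook name `walltime_check` are hypothetical.

```python
def hook_setup_commands(name, event, script_path, encoding="default",
                        order=1, alarm=30, enabled=True):
    """Return the qmgr commands for steps 1 to 6, in order."""
    # Step 1 rule from the slide: the first character must be alphabetic.
    if not name[:1].isalpha():
        raise ValueError("hook name must start with an alphabetic character")
    return [
        "create hook %s" % name,
        "set hook %s event = %s" % (name, event),
        "import hook %s application/x-python %s %s" % (name, encoding, script_path),
        "set hook %s order = %d" % (name, order),
        "set hook %s alarm = %d" % (name, alarm),
        "set hook %s enabled = %s" % (name, "true" if enabled else "false"),
    ]


# The defaults mirror the slide defaults: order 1, alarm 30 seconds, enabled.
for command in hook_setup_commands(
        "walltime_check", "queuejob", "/root/hook_scripts/queuejob.py"):
    print('qmgr -c "%s"' % command)
```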
Hooks - Viewing Hook Information
 Printing hook creation commands:
Qmgr: print hook hookA
create hook hookA
set hook hookA type = site
set hook hookA enabled = true
set hook hookA event = '""'
set hook hookA user = pbsadmin
set hook hookA alarm = 30
set hook hookA order = 1
import hook hookA application/x-python base64 -
 Listing hook attributes:
Qmgr: list hook hookA
Hook hookA
type = site
enabled = true
event = ""
user = pbsadmin
alarm = 30
order = 1
Hooks – Exporting Hook Contents
 Reasons to use the export command
• To view the current script content
• To make a backup of the Python script
• To make modifications to the Python script
• To export a hook's Python script to a file
 To export a hook's Python script to a file:
Qmgr: export hook <hook_name> application/x-python <content-encoding> \
<path/filename>
Note: if no output file is specified, the script is written to stdout
 To back up hook information:
qmgr -c "print hook <hook_name>" > hook_file
Exercises
 Reject a job that doesn't specify a walltime (event = queuejob)
 Prevent users from altering any of their job attributes once submitted (event = modifyjob)
 Prevent users from requesting a reservation (event = resvsub)
 Prevent users from moving their job to another queue or to another PBS complex (event = movejob)
Exercise – queuejob Hook
Objective: To reject jobs at submission time that do not request the walltime resource; the Python
script is already provided
Prerequisites: Disable any existing hooks
PBS Administrator Tasks:
1. Use qmgr to create a hook called queuejob
2. Set the event as queuejob
3. Import the Python script located in /root/hook_scripts/queuejob.py
4. Leave the default attribute values as they are
PBS User Task:
1. Submit a job without requesting any walltime resource
Observation:
• When submitting a job without requesting the walltime resource, what message, if any, appears at the
command line?
• Was the job enqueued or was it rejected?
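The provided queuejob.py is not reproduced in this material. For orientation, a minimal sketch of a hook with the same purpose might look like the following. It assumes the `pbs` module that PBS makes available to hook scripts, with `pbs.event()`, `event.job.Resource_List`, `event.accept()` and `event.reject()`. The decision is kept in a plain function, `missing_walltime` (a hypothetical name), so that it can also be tried outside of PBS.

```python
try:
    import pbs  # available only when the script runs as a PBS hook
except ImportError:
    pbs = None


def missing_walltime(resource_list):
    """Return a rejection message if no walltime was requested, else None."""
    if resource_list.get("walltime") is None:
        return "Job rejected: please request a walltime, e.g. -l walltime=01:00:00"
    return None


if pbs is not None:
    event = pbs.event()
    # An unrequested resource is assumed to read back as None.
    requested = {"walltime": event.job.Resource_List["walltime"]}
    message = missing_walltime(requested)
    if message:
        event.reject(message)  # the message is reported to the submitting user
    else:
        event.accept()
```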
Exercise – modifyjob hook
Objective: To disallow users from qaltering any of their submitted jobs' attributes/resources; the Python
script is already provided
Prerequisites: Disable any existing hooks
PBS Administrator Tasks:
1. Using qmgr create a hook called modifyjob
2. Set the event as modifyjob
3. Import the Python script located in /root/hook_scripts/modifyjob.py
4. Leave the default attribute values as they are
PBS User Task:
1. Submit a job
2. qalter any one of the job's attributes
Observation:
• Was the user able to qalter the job?
• If not, what error message, if any, was output?
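The provided modifyjob.py is likewise not shown here. One possible shape for such a hook is sketched below, assuming the hook `pbs` module and its `event.requestor` attribute. The set of permitted requestors is an assumption: it lets administrators and the PBS daemons through so that only ordinary users are blocked.

```python
try:
    import pbs  # available only when the script runs as a PBS hook
except ImportError:
    pbs = None

# Hypothetical allow-list: administrators plus PBS daemon identities.
ALLOWED = frozenset(["root", "pbsadmin", "PBS_Server", "Scheduler", "pbs_mom"])


def may_alter(requestor, allowed=ALLOWED):
    """Return True if this requestor is permitted to modify a submitted job."""
    return requestor in allowed


if pbs is not None:
    event = pbs.event()
    if may_alter(event.requestor):
        event.accept()
    else:
        event.reject("Altering submitted jobs is not permitted")
```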
Exercise – resvsub hook
Objective: To prevent users from requesting reservations
Prerequisites: Disable any existing hooks
PBS Administrator Tasks:
1. Using qmgr create a hook called resvsub
2. Set the event as resvsub
3. Import the Python script located in /root/hook_scripts/resvsub.py
4. Leave the default attribute values as they are
PBS User Task:
1. Request a reservation using pbs_rsub
Observation:
• Was the user able to request a reservation?
• If not, what error message, if any, was output?
Exercise – movejob hook
Objective: To prevent users from moving their jobs to another queue or PBS complex
Prerequisites:
• Disable any existing hooks
• At least 2 active queues must exist
PBS Administrator Tasks:
1. Using qmgr create a hook called movejob
2. Set the event as movejob
3. Import the Python script located in /root/hook_scripts/movejob.py
4. Leave the default attribute values as they are
PBS User Task:
1. Use qsub to submit a job that will remain queued (not ready for execution)
2. Using the qmove command, try to move it to another queue
Observation:
• When trying to qmove, was the job moved to the queue you requested?
• If not, what error message, if any, was output?
Chapter Sixteen – Miscellaneous
Chapter Sixteen
 PBS user and administrator commands
 PBS_EXEC/etc directory
 PBS_EXEC/unsupported/pbs_diag *
 PBS_EXEC/unsupported/pbs_dtj *
 Re-Installation of PBS Professional
* These scripts are not supported.
PBS User & Administrator Commands
User Commands                            Administrator Commands
Command    Purpose                       Command      Purpose
nqs2pbs    Convert from NQS              pbs-report   Report job statistics
pbs_rdel   Delete Reservation            pbs_hostid   Report host identifier
pbs_rstat  Status Reservation            pbs_hostn    Report host name(s)
pbs_rsub   Submit Reservation            pbs_probe    PBS diagnostic tool
pbsdsh     PBS distributed shell         pbs_rcp      File transfer tool
qalter     Alter job                     pbs_tclsh    TCL with PBS API
qdel       Delete job                    pbsfs        Show fairshare usage
qhold      Hold a job                    pbsnodes     Node manipulation
qmove      Move job                      printjob     Report job details
qmsg       Send message to job           qdisable     Disable a queue
qorder     Reorder jobs                  qenable      Enable a queue
qrls       Release hold on job           qmgr         Manager interface
qselect    Select jobs by criteria       qrerun       Re-queue running job
qsig       Send signal to job            qrun         Manually start a job
qstat      Status job, queue, server     qstart       Start a queue
qsub       Submit a job                  qstop        Stop a queue
tracejob   Report job history            qterm        Shut down PBS
$PBS_EXEC/etc directory
 The directory $PBS_EXEC/etc contains backup copies of PBS configuration
files, such as the following, in case you ever need to revert to the default
configuration:
Filename             Description
pbs_dedicated        Dedicated time file
pbs_holidays         Holidays file
pbs_init.d           PBS init run script
pbs_postinstall      PBS postinstall script
pbs_resource_group   Fairshare resource group file
pbs_sched_config     Scheduler configuration file
$PBS_EXEC/unsupported directory – pbs_diag
 The pbs_diag* script is an interactive script that collects information
from PBS configuration files and job-related history
 Information that is collected:
• qmgr settings for server, queues, and nodes
• pbs_probe information about file permissions
• pbs.conf master configuration information
• pbsnodes node configuration/state information
• qstat information about current state of the queues and server
• information about existing reservations
• pbs_hostn name resolution information
• operating system version information
• server, scheduler, and mom configuration files
• tracejob and logging information for jobs specified by the user
• server, scheduler, and mom logs for dates specified by the user
• cpuset configuration information and current state if on a cpuset-aware system
• vnode definition files
• FLEXlm license server status
* pbs_diag is not supported.
$PBS_EXEC/unsupported directory – pbs_dtj
 pbs_dtj* (Distributed TraceJob) is a command that enables a user to gather
tracejob information from ALL of the nodes where a PBS Professional job ran
 By default, the script uses rsh to connect to the nodes, although it will check
the pbs.conf file to see if PBS_SCP is set, and use ssh in that case
Usage: pbs_dtj <option>
Option          Description
-u <username>   Specify a user name under which to run
-r <rcommand>   Override the rsh/ssh settings in pbs.conf
-n              Number of days of log files to query
* pbs_dtj is not supported.
Re-installation of PBS Professional
 Procedure to re-install PBS Professional on either the server or an
execution host
1. Shut down any PBS daemons running on that host:
/etc/init.d/pbs stop
2. Verify the PBS daemons are no longer running:
ps -ef | grep pbs
3. Obtain the appropriate PBS rpm package name:
rpm -qa | grep pbs
4. Remove the PBS rpm package:
rpm -e pbs-11.0.0.103450
5. Remove the directories $PBS_HOME and $PBS_EXEC
6. Remove the file /etc/pbs.conf
7. Remove the file /etc/init.d/pbs
Refer to the Installation part of the PBS Professional Installation and Upgrade Guide for
the complete installation procedure
Chapter Seventeen - Troubleshooting
Chapter Seventeen
 pbs_probe
 pbs_hostn
Using pbs_probe
 If a site has a post-installation issue, running the pbs_probe command may
help identify the cause and possible fix
 Running the pbs_probe command returns output such as the following:
====== System Information =======
sysname=Linux
nodename=traintb16
release=2.6.22.5-31-default
version=#1 SMP 2007/09/21 22:29:00 UTC
machine=i686
=== No PBS Infrastructure Problems Detected ===
Options:
-v   verbose mode
-f   fix mode (checks & fixes directory permissions)
Using pbs_hostn
 If a PBS site has hostname resolution issues, the pbs_hostn
command can help identify the problem
 The command reports the results of the gethostbyname and
gethostbyaddr calls
Example:
pbs_hostn -v traintb16
primary name: traintb16.prog.altair.com (from gethostbyname())
aliases: traintb16
address length: 4 bytes
address: 204.235.21.130 (33554559 dec) name: traintb16.prog.altair.com
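The same forward and reverse lookups can be tried from Python's standard `socket` module, which is handy when pbs_hostn is not at hand. This is a sketch of the idea, not a reimplementation of pbs_hostn. The function name `resolve_report` is hypothetical, and the demonstration uses stand-in resolvers that echo the example above so that it runs without a network.

```python
import socket


def resolve_report(hostname, by_name=socket.gethostbyname_ex,
                   by_addr=socket.gethostbyaddr):
    """Return report lines for a forward lookup and each reverse lookup."""
    primary, aliases, addresses = by_name(hostname)
    lines = ["primary name: %s" % primary, "aliases: %s" % " ".join(aliases)]
    for address in addresses:
        try:
            reverse_name = by_addr(address)[0]
        except (socket.herror, socket.gaierror):
            reverse_name = "<no reverse entry>"
        lines.append("address: %s name: %s" % (address, reverse_name))
    return lines


# Stand-in resolvers that return the values from the example above.
def fake_by_name(host):
    return ("traintb16.prog.altair.com", ["traintb16"], ["204.235.21.130"])


def fake_by_addr(address):
    return ("traintb16.prog.altair.com", [], [address])


for line in resolve_report("traintb16", fake_by_name, fake_by_addr):
    print(line)
```

Calling `resolve_report("some_host")` without the extra arguments uses the system resolver. A forward name and reverse name that disagree are a common sign of the resolution problems this slide describes.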
Conclusion - Survey Monkey
 Please take a moment to help us by filling out a quick
online survey regarding this training class
 The web link is bookmarked under the Bookmarks pull-down menu in
Firefox
 Please make sure you click on "SUBMIT" when finished
THANK YOU