Batch Jobs Moab
Any calculation on the compute nodes of bwUniCluster must be defined as a batch job: a single command or a sequence of commands, together with the required run time, number of CPU cores, and main memory. The batch job is then submitted to a resource and workload managing software. All bwHPC clusters, including bwUniCluster, have the workload managing software MOAB installed, so every job submission is carried out with MOAB commands. MOAB queues and runs user jobs based on fair-sharing policies.
MOAB commands | Brief explanation |
---|---|
msub | submits a job and queues it in an input queue |
checkjob | displays detailed job state information |
showq | displays information about active, eligible, blocked, and/or recently completed jobs |
showbf | shows what resources are available for immediate use |
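Before using these commands it can help to confirm that they are on the search path of the current shell, which is usually only the case on a cluster login node. The following is a minimal sketch; the function name `check_moab_commands` is hypothetical and not part of MOAB.

```shell
# Report which of the MOAB commands from the table above are available
# in the current shell. check_moab_commands is a hypothetical helper name.
check_moab_commands() {
    for cmd in msub checkjob showq showbf; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd: found"
        else
            echo "$cmd: not found"
        fi
    done
}

check_moab_commands
```

On a machine without MOAB every line reports "not found"; in that case log in to the cluster before submitting jobs.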
1 Job Submission
Batch jobs are submitted using the command msub. The main purpose of the msub command is to specify the resources that are needed to run the job. msub will then queue the job into the input queue. The jobs are organized into different job classes. For each job class there are specific limits for the available resources (number of nodes, number of CPUs, maximum CPU time, maximum memory etc.).
1.1 msub Command
The syntax and use of msub can be displayed via:
$ man msub
msub options can be used from the command line or in your job script.
msub Options

Command line | Script | Purpose |
---|---|---|
-l resources | #MSUB -l resources | Defines the resources that are required by the job. See the description below for this important flag. |
-N name | #MSUB -N name | Gives a user-specified name to the job. |
-I | | Declares that the job is to be run interactively. |
-o filename | #MSUB -o filename | Defines the path to be used for the standard output stream of the batch job. |
-V | #MSUB -V | Declares that all environment variables in the msub environment are exported to the batch job. |
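As an illustration of how the command-line options from the table combine, the sketch below assembles one msub call and prints it instead of running it. The job name, output file name, and script name are hypothetical placeholders; the format of the resource string is described in the next subsection.

```shell
# Hypothetical values for illustration only; adjust them to your job.
job_name="test"
output_file="test.out"
resources="nodes=1:ppn=1,walltime=600"
script="job.sh"

# Combine -N (job name), -o (stdout path), -V (export environment)
# and -l (resources) into a single msub command line.
cmd="msub -N $job_name -o $output_file -V -l $resources $script"

# Print the command for inspection; it is not executed here.
echo "$cmd"
```

Printing the command first makes it easy to check the options before the job is actually queued.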
1.1.1 msub -l resource_list
The -l option is one of the most important msub options. It is used to specify a number of resource requirements for your job. Multiple resource strings are separated by commas.
msub -l resource_list

resource | Purpose |
---|---|
-l nodes=1 | Number of nodes |
-l nodes=2:ppn=8 | Number of nodes and number of processes per node |
-l walltime=600 | Wall-clock time. Default units are seconds. |
-l walltime=1:30:00 | Wall-clock time. HH:MM:SS format is also accepted. |
-l feature=tree | For jobs that span several nodes |
-l feature=blocking | For sequential jobs |
-l feature=fat | For jobs that require up to 1 TB memory |
-l pmem=1000mb | Memory per process; allowed units are kb, mb, gb. |
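The two walltime notations in the table describe the same quantity. The following minimal sketch converts either form to seconds so that limits can be compared; `walltime_to_seconds` is a hypothetical helper, not a MOAB command, and it handles only the two formats listed above.

```shell
# Convert a walltime given in seconds or as HH:MM:SS into seconds.
# walltime_to_seconds is a hypothetical helper, not part of MOAB.
walltime_to_seconds() {
    printf '%s\n' "$1" | awk -F: '
        NF == 1 { print $1 + 0; next }                   # plain seconds
        NF == 3 { print $1 * 3600 + $2 * 60 + $3; next } # HH:MM:SS
        { exit 1 }                                       # any other format
    '
}

walltime_to_seconds 600       # prints 600
walltime_to_seconds 1:30:00   # prints 5400
```

So `-l walltime=1:30:00` requests the same run time as `-l walltime=5400`.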
1.2 msub Examples
1.2.1 Serial Programs
To submit a serial job that runs the script job.sh and requires 5000 MB of main memory and 3 hours of wall-clock time, execute:
$ msub -N test -l nodes=1:ppn=1,walltime=3:00:00,pmem=5000mb job.sh
or add the following lines to the script job.sh:
#MSUB -l nodes=1:ppn=1
#MSUB -l walltime=3:00:00
#MSUB -l pmem=5000mb
#MSUB -N test
and submit the modified script, saved here as job_modified.sh:
$ msub job_modified.sh
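For reference, a complete job_modified.sh might look like the sketch below, which writes the file and submits it only if msub is available. The `echo` payload is a placeholder assumption standing in for the actual serial program.

```shell
# Write a complete job script containing the #MSUB header lines from above.
cat > job_modified.sh <<'EOF'
#!/bin/bash
#MSUB -l nodes=1:ppn=1
#MSUB -l walltime=3:00:00
#MSUB -l pmem=5000mb
#MSUB -N test

# Placeholder payload (assumption); replace with the actual serial program.
echo "Running on $(hostname)"
EOF

# Submit only where MOAB is installed, e.g. on a cluster login node.
if command -v msub >/dev/null 2>&1; then
    msub job_modified.sh
else
    echo "msub not found; job_modified.sh was written but not submitted"
fi
```

Keeping the resource requests inside the script documents them together with the commands they apply to, so the submission itself needs no further options.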