Batch Jobs - bwUniCluster Features

From bwHPC Wiki

Revision as of 17:09, 5 November 2014

This article describes features of the batch job system that apply only to bwUniCluster.

Job Submission

msub Command

The bwUniCluster supports the following additional msub option(s):

bwUniCluster additional msub Options
Command line | Script | Purpose
-I           |        | Declares the job is to be run interactively.



msub -l resource_list

No deviations from or additional features beyond the general batch job settings.

msub -q queues

Compute resources such as walltime, nodes and memory are restricted and must fit into the queues. Since requested compute resources are NOT always mapped automatically to the correct queue class, you must add the correct queue class to your msub command. Details are:

msub -q queue
queue      | node | default resources                      | minimum resources           | maximum resources
develop    | thin | walltime=00:10:00,procs=1,mem=4000mb   | nodes=1                     | walltime=00:30:00,nodes=1:ppn=16
singlenode | thin | walltime=00:30:01,procs=1,mem=4000mb   | walltime=00:30:01,nodes=1   | walltime=3:00:00:00,nodes=1:ppn=16
multinode  | thin | walltime=00:10:00,procs=1,mem=4000mb   | nodes=2                     | walltime=2:00:00:00,nodes=16:ppn=16
verylong   | thin | walltime=3:00:00:01,procs=1,mem=4000mb | walltime=3:00:00:01,nodes=1 | walltime=6:00:00:00,nodes=1:ppn=16
fat        | fat  | walltime=00:10:00,procs=1,mem=32000mb  | nodes=1                     | walltime=3:00:00:00,nodes=1:ppn=32

The default resources of a queue class define walltime, number of processes and memory if these are not given explicitly with the msub command. The resource list acronyms walltime, procs, nodes and ppn are described here.

Queue class examples

  • To run your batch job for longer than 3 days, please use $ msub -q verylong.
  • To run your batch job on one of the fat nodes, please use $ msub -q fat.
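Putting the queue options together, a job can also carry its queue and resource requests as #MSUB directives inside the batch script. The following is only a sketch: the script name, the chosen resource values and the program name (./application) are illustrative assumptions, not site defaults, and must stay within the queue limits of the table above.

```shell
# Write a minimal batch script for the verylong queue (all values are
# illustrative; 4 days lies between the queue's minimum and maximum walltime).
cat > verylong_job.sh <<'EOF'
#!/bin/bash
#MSUB -q verylong
#MSUB -l nodes=1:ppn=16
#MSUB -l walltime=4:00:00:00
#MSUB -l mem=4000mb
./application            # placeholder for your own program
EOF

# The script would then be submitted on a login node with:
#   $ msub verylong_job.sh
```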



Environment Variables for Batch Jobs

The bwUniCluster expands the common set of MOAB environment variables by the following variable(s):

bwUniCluster specific MOAB variables
Environment variables Description
MOAB_SUBMITDIR Directory of job submission
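A typical use of MOAB_SUBMITDIR is to change into the submission directory at the start of a job script, since batch jobs usually start in the home directory. A sketch; the fallback to $PWD is only there so the snippet also works outside a batch job:

```shell
# Change to the directory the job was submitted from; fall back to the
# current directory when MOAB_SUBMITDIR is not set (i.e. outside a job).
cd "${MOAB_SUBMITDIR:-$PWD}" || exit 1
echo "Working directory: $PWD"
```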


Since the workload manager MOAB on bwUniCluster uses the resource manager SLURM, the following SLURM environment variables are added to your environment once your job has started:

SLURM variables
Environment variables Description
SLURM_JOB_CPUS_PER_NODE Number of processes per node dedicated to the job
SLURM_JOB_NODELIST List of nodes dedicated to the job
SLURM_JOB_NUM_NODES Number of nodes dedicated to the job
SLURM_MEM_PER_NODE Memory per node dedicated to the job
SLURM_NPROCS Total number of processes dedicated to the job
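Inside a running job these variables can be read like any other shell variables, for example to log the granted resources at job start. A sketch; the ${VAR:-...} fallbacks only take effect when the snippet is run outside a batch job:

```shell
# Log the job geometry provided by SLURM; useful for debugging and for
# deriving process counts for parallel runs.
echo "Nodes allocated:   ${SLURM_JOB_NUM_NODES:-1}"
echo "Node list:         ${SLURM_JOB_NODELIST:-$(hostname)}"
echo "Cores per node:    ${SLURM_JOB_CPUS_PER_NODE:-1}"
echo "Total processes:   ${SLURM_NPROCS:-1}"
echo "Memory per node:   ${SLURM_MEM_PER_NODE:-unset} MB"
```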



Interactive Jobs

Interactive jobs on bwUniCluster must NOT run on the login nodes; however, resources for interactive jobs can be requested using msub. For a serial application with a graphical frontend that requires 5000 MByte of memory, with the interactive run limited to 2 hours, execute the following:

$ msub  -I  -V  -l nodes=1:ppn=1 -l mem=5000mb -l walltime=0:02:00:00

The option -V exports all environment variables to the compute node of the interactive session. After executing this command, DO NOT CLOSE your current terminal session, but wait until the queueing system MOAB has granted you the requested resources on the compute system. Once granted, you will be automatically logged on to the dedicated resource. You now have an interactive session with 1 core and 5000 MByte of memory on the compute system for 2 hours. Simply execute your application:

$ cd to_path
$ ./application

Note that once the walltime limit has been reached, you will be automatically logged out of the compute system.