Difference between revisions of "Batch Jobs"

From bwHPC Wiki

Revision as of 19:51, 17 January 2014



Important note: bwUniCluster is not in production mode yet.


Any calculation on the compute nodes of bwUniCluster must be defined as a batch job: a single command or a sequence of commands, together with the required run time, number of CPU cores, and main memory. The batch job is then submitted to a resource and workload manager. All bwHPC clusters, including bwUniCluster, use the workload manager MOAB, so every job is submitted with MOAB commands. MOAB queues and runs user jobs based on fair sharing policies.


MOAB commands   Brief explanation
msub            submits a job and queues it in an input queue
checkjob        displays detailed job state information
showq           displays information about active, eligible, blocked, and/or recently completed jobs
showbf          shows what resources are available for immediate use


1 Job Submission

Batch jobs are submitted using the command msub. The main purpose of the msub command is to specify the resources that are needed to run the job. msub will then queue the job into the input queue. The jobs are organized into different job classes. For each job class there are specific limits for the available resources (number of nodes, number of CPUs, maximum CPU time, maximum memory etc.).


1.1 msub Command

The syntax and use of msub can be displayed via:

$ man msub

msub options can be used from the command line or in your job script.

msub Options
Command line   Script               Purpose
-l resources   #MSUB -l resources   Defines the resources that are required by the job. See the description below for this important flag.
-N name        #MSUB -N name        Gives a user-specified name to the job.
-I                                  Declares that the job is to be run interactively.
-o filename    #MSUB -o filename    Defines the file name to be used for the standard output stream of the batch job. By default, the file is placed in the directory from which the job was submitted. To place it elsewhere, prefix filename with the relative or absolute path of the destination.


1.1.1 msub -l resource_list

The -l option is one of the most important msub options. It is used to specify a number of resource requirements for your job. Multiple resource strings are separated by commas.

msub -l resource_list
resource              Purpose
-l nodes=1            Number of nodes
-l nodes=2:ppn=8      Number of nodes and number of processes per node
-l walltime=600       Wall-clock time. Default units are seconds.
-l walltime=1:30:00   HH:MM:SS format is also accepted.
-l feature=tree       For jobs that span over several nodes
-l feature=blocking   For sequential jobs
-l feature=fat        For jobs that require up to 1 TB memory
-l pmem=1000mb        Memory per process, allowed units are kb, mb, gb. Be aware that processes are either MPI tasks if running MPI parallel jobs or threads if running multithreaded jobs.
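
The walltime and pmem entries in the table above involve a little arithmetic that is easy to check in the shell before submitting. The following is a minimal sketch, not part of MOAB: the function names walltime_to_seconds and total_mem_mb are hypothetical helpers, and the total-memory estimate assumes one process per requested ppn slot.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not MOAB commands.

# Convert a walltime given as HH:MM:SS into seconds,
# the default unit of "-l walltime".
walltime_to_seconds() {
    # awk reads each field as a decimal number, so "08" is not treated as octal.
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Estimate the total memory requested by a job.
# $1 = nodes, $2 = ppn, $3 = pmem in MB
# Assumption: nodes * ppn processes, each receiving pmem.
total_mem_mb() {
    echo $(( $1 * $2 * $3 ))
}

walltime_to_seconds 1:30:00   # prints 5400, i.e. the same duration as -l walltime=5400
total_mem_mb 2 8 1000         # prints 16000 for nodes=2:ppn=8 with pmem=1000mb
```

Because pmem is counted per process, raising ppn raises the total memory request even when pmem stays the same.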

1.2 msub Examples

1.2.1 Serial Programs

To submit a serial job that runs the script job.sh and requires 5000 MB of main memory and 3 hours of wall-clock time, execute:

$ msub -N test -l nodes=1:ppn=1,walltime=3:00:00,pmem=5000mb job.sh

or add to the script job.sh the lines:

#MSUB -l nodes=1:ppn=1
#MSUB -l walltime=3:00:00
#MSUB -l pmem=5000mb
#MSUB -N test

and execute the modified script without any msub command line options:

$ msub job.sh

Note that msub command line options overrule the corresponding options in the script.
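
For reference, a complete job.sh with the directives from this example might look like the sketch below. The bash shebang and the echo payload are assumptions for illustration; the real script would start the actual serial program instead. Outside the batch system the #MSUB lines are ordinary shell comments, so the script can be dry-run locally before it is submitted.

```shell
#!/bin/sh
# Write a minimal job script. The quoted 'EOF' keeps $(...) unexpanded
# until the job script itself runs.
cat > job.sh <<'EOF'
#!/bin/bash
#MSUB -l nodes=1:ppn=1
#MSUB -l walltime=3:00:00
#MSUB -l pmem=5000mb
#MSUB -N test

# Placeholder payload (assumption): replace with the actual serial program.
echo "serial job running on $(uname -n)"
EOF

# Local dry run: the #MSUB lines are plain comments to the shell.
sh job.sh

# On the cluster, submit it without further options:
#   msub job.sh
# Following the note above, a command line option such as a shorter
# walltime would take precedence over the matching line in the script:
#   msub -l walltime=1:00:00 job.sh
```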


1.2.2 Multithreaded Programs

Under construction.


1.2.3 MPI parallel Programs

Under construction.


1.2.4 Multithreaded + MPI parallel Programs

Under construction.


1.2.5 Interactive Jobs

Under construction.


2 Display Status of submitted Jobs

Under construction.


3 Environment Variables for Batch Jobs

Under construction.