Difference between revisions of "BwForCluster JUSTUS 2 Slurm HOWTO"

From bwHPC Wiki

Revision as of 14:01, 17 April 2020

The bwForCluster JUSTUS 2 is a state-wide high-performance compute resource dedicated to Computational Chemistry and Quantum Sciences in Baden-Württemberg, Germany.

Slurm Howto

1 Preface

This is a collection of howtos and convenient commands that I initially wrote for internal use at Ulm only. Scripts and commands have been tested within our Slurm test environment at JUSTUS (running Slurm 19.05 at the moment).

You may find this collection useful, but use it at your own risk. Things may behave differently with different Slurm versions and configurations.

2 GENERAL

2.1 How to find the Slurm FAQ?

https://slurm.schedmd.com/faq.html

2.2 How to find a Slurm cheat sheet?

https://slurm.schedmd.com/pdfs/summary.pdf

2.3 How to get more information?

(Almost) every Slurm command has a man page. Use it.

Online versions: https://slurm.schedmd.com/man_index.html

3 JOB SUBMISSION

3.1 How to submit an interactive job?

Use the srun command, e.g.:

$ srun --nodes=1 --ntasks-per-node=8 --pty bash 

3.2 How to enable X11 forwarding for an interactive job?

Use the --x11 flag, e.g.:

$ srun --nodes=1 --ntasks-per-node=8 --pty --x11 bash     # run shell with X11 forwarding enabled
$ srun --nodes=1 --ntasks-per-node=8 --pty --x11 xterm    # directly launch terminal window on node

Note:

  • For X11 forwarding to work, you must also enable X11 forwarding for your ssh login from your local computer to the cluster, i.e.:
local> ssh -X <username>@justus2.uni-ulm.de
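
The note above can be checked before requesting an X11 job. The following is a minimal sketch, not part of the original howto: the helper name x11_ready is hypothetical, and it only tests whether the DISPLAY variable is set in the login shell, which is what a working "ssh -X" login normally provides.

```shell
# Hypothetical helper: succeeds only if an X11 display is known to this shell.
x11_ready() {
    [ -n "${DISPLAY:-}" ]
}

# Launch the terminal on a compute node only if X11 forwarding looks usable
# and srun is actually installed on this machine.
if x11_ready && command -v srun >/dev/null 2>&1; then
    srun --nodes=1 --ntasks-per-node=8 --pty --x11 xterm
else
    echo "DISPLAY is not set or srun is unavailable; log in with 'ssh -X' first" >&2
fi
```

A set DISPLAY variable is a necessary condition only; it does not guarantee that forwarding works end to end.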

3.3 How to submit a batch job?

Use the sbatch command:

 $ sbatch <job-script> 
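
As an illustration, a job script could look like the minimal sketch below. The file name job.sh, the job name, and the resource values are assumptions chosen for this example; the #SBATCH options themselves are standard sbatch options. The snippet writes the script to disk and submits it only if sbatch is available.

```shell
# Write a minimal job script (the file name job.sh is hypothetical).
cat > job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --time=00:10:00

# Slurm sets SLURM_JOB_ID inside a job; the fallback value only
# makes the script runnable outside of Slurm for a dry run.
echo "Job ${SLURM_JOB_ID:-none} running on $(hostname)"
EOF

# Submit only if sbatch is available on this machine.
if command -v sbatch >/dev/null 2>&1; then
    sbatch job.sh
else
    echo "sbatch not found; job.sh was written but not submitted"
fi
```

Options given on the sbatch command line take precedence over the #SBATCH lines in the script, so the same script can be reused with different resource requests.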

3.3.1 How to convert Moab batch job scripts to Slurm?

Replace Moab/Torque job specification flags and environment variables in your job scripts with their corresponding Slurm counterparts.

Commonly used Moab job specification flags and their Slurm equivalents*

| Option | Moab (msub) | Slurm (sbatch) |
|---|---|---|
| Script directive | #MSUB | #SBATCH |
| Job name | -N <name> | --job-name=<name> (-J <name>) |
| Account | -A <account> | --account=<account> (-A <account>) |
| Queue | -q <queue> | --partition=<partition> (-p <partition>) |
| Wall time limit | -l walltime=<hh:mm:ss> | --time=<hh:mm:ss> (-t <hh:mm:ss>) |
| Node count | -l nodes=<count> | --nodes=<count> (-N <count>) |
| Core count | -l procs=<count> | --ntasks=<count> (-n <count>) |
| Process count per node | -l ppn=<count> | --ntasks-per-node=<count> |
| Core count per process | | --cpus-per-task=<count> |
| Memory limit per node | -l mem=<limit> | --mem=<limit> |
| Memory limit per process | -l pmem=<limit> | --mem-per-cpu=<limit> |
| Job array | -t <array indices> | --array=<indices> (-a <indices>) |
| Node exclusive job | -l naccesspolicy=singlejob | --exclusive |
| Initial working directory | -d <directory> (default: $HOME) | --chdir=<directory> (-D <directory>) (default: submission directory) |
| Standard output file | -o <file path> | --output=<file> (-o <file>) |
| Standard error file | -e <file path> | --error=<file> (-e <file>) |
| Combine stdout/stderr to stdout | -j oe | --output=<combined stdout/stderr file> |
| Mail notification events | -m <event> | --mail-type=<events> (valid types include: NONE, BEGIN, END, FAIL, ALL) |
| Export environment to job | -V | --export=ALL (default) |
| Don't export environment to job | (default) | --export=NONE |
| Export environment variables to job | -v <var[=value][,var2=value2[, ...]]> | --export=<var[=value][,var2=value2[,...]]> |
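
To illustrate the table above, the following sketch shows a small job script after conversion, with the Moab directive that each Slurm directive replaces noted in the comment line above it. The job name, resource values, and output file name are hypothetical. SLURM_JOB_ID and SLURM_SUBMIT_DIR are set by Slurm inside a job; the fallback values are only there so the script also runs outside of Slurm.

```shell
#!/bin/bash
# Moab: #MSUB -N myjob
#SBATCH --job-name=myjob
# Moab: #MSUB -l nodes=2:ppn=8
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8
# Moab: #MSUB -l walltime=01:00:00
#SBATCH --time=01:00:00
# Moab: #MSUB -l pmem=2gb
#SBATCH --mem-per-cpu=2G
# Moab: #MSUB -j oe (together with -o <file path>)
#SBATCH --output=myjob-%j.out

# Slurm job ID and submission directory, with fallbacks for a dry run.
job_id="${SLURM_JOB_ID:-unknown}"
submit_dir="${SLURM_SUBMIT_DIR:-$PWD}"

# Slurm starts the job in the submission directory by default,
# so no explicit cd is needed here.
echo "Job $job_id submitted from $submit_dir"
```

The %j in the output file name is replaced by the job ID. Because no separate --error file is given, stderr ends up in the same file as stdout.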