BwUniCluster2.0/Batch System Migration Guide

From bwHPC Wiki
Revision as of 13:14, 13 March 2020 by H Haefner (talk | contribs) (MPI Programs on many nodes)

1 Serial Programs

  • Use the time option -t or --time (instead of -l walltime). A single number after -t is interpreted as minutes.
  • Use the option -n 1 or --ntasks=1 (instead of -l nodes=1,ppn=1).
  • Use the option --mem (instead of -l pmem). The default unit is megabytes. Note that --mem has no short form; in Slurm, -m is the short form of --distribution.
  • If you want to use one node exclusively, you must request the whole memory of the node (--mem=96327).


Example for a serial job

$ sbatch -p single -t 60 -n 1 --mem=96327 ./job.sh

The script job.sh (which runs a serial program) is run with a time limit of 60 minutes and exclusive use of one batch node.
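
The options of the serial example can also be placed inside the job script as #SBATCH directives. The following is a minimal sketch, not an official template of the cluster: it writes such a job.sh to the current directory, and my_serial_program is a hypothetical executable name.

```shell
# Write a job script whose #SBATCH lines mirror the options of the serial example.
cat > job.sh <<'EOF'
#!/bin/bash
#SBATCH --partition=single
# A single number for --time means minutes.
#SBATCH --time=60
#SBATCH --ntasks=1
# Whole node memory in MB, which gives exclusive use of the node.
#SBATCH --mem=96327

# Hypothetical executable name (assumption, not from the guide).
./my_serial_program
EOF
chmod +x job.sh
```

With the directives in the script, the job could be submitted with just sbatch ./job.sh. Options given on the sbatch command line take precedence over #SBATCH lines.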


2 Multithreaded Programs

  • Use the time option -t or --time (instead of -l walltime). A single number after -t is interpreted as minutes.
  • Use the options -N 1 or --nodes=1 and -c x or --cpus-per-task=x (instead of -l nodes=1,ppn=x). x can be a number between 1 and 40 (one node has 40 cores) or between 41 and 80 (hyperthreading is active).
  • Use the option --mem (instead of -l pmem). The default unit is megabytes.
  • Use the option --export to set the required environment variable OMP_NUM_THREADS for the batch job. Adding ALL passes all environment variables set in the interactive shell on to the batch job.
  • If you want to use one node exclusively, you must either request the whole memory of the node (--mem=96327) or set the number of threads to more than 39.


Example for a multithreaded job

$ sbatch -p single -t 1:00:00 -N 1 -c 20 --mem=50gb --export=ALL,OMP_NUM_THREADS=20 ./job_threaded.sh

The script job_threaded.sh (which runs a multithreaded program) is run with a time limit of 1 hour in shared mode on 20 cores of one batch node, requesting 50 GB of memory.
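
As an alternative to passing OMP_NUM_THREADS via --export, the thread count can be derived inside the job script from the allocation. This is a minimal sketch of the top of job_threaded.sh: SLURM_CPUS_PER_TASK is set by Slurm when -c or --cpus-per-task is given, while the fallback value of 1 and the program name my_threaded_program are assumptions made for illustration.

```shell
#!/bin/bash
# Take the thread count from the Slurm allocation so that it always
# matches the value given with -c / --cpus-per-task.
# The fallback of 1 (assumption) covers runs outside a batch job.
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
echo "Running with OMP_NUM_THREADS=${OMP_NUM_THREADS}"

# Hypothetical executable name; uncomment in a real job script.
# ./my_threaded_program
```

Deriving the value this way avoids having to keep the -c value and the OMP_NUM_THREADS value in sync by hand.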


3 MPI Programs within one node

  • Use the time option -t or --time (instead of -l walltime). A single number after -t is interpreted as minutes.
  • Use the option -n x or --ntasks=x (instead of -l nodes=1,ppn=x). x can be a number between 1 and 40 (one node has 40 cores); you shouldn't use hyperthreading.
  • Use the option --mem (instead of -l pmem). The default unit is megabytes.
  • If you want to use one node exclusively, you must either request the whole memory of the node (--mem=96327) or set the number of MPI tasks to more than 39.
  • Don't forget to load the appropriate MPI module in your job script.


Example for an MPI job

$ sbatch -p single -t 600 -n 10 --mem=40000 ./job_mpi.sh

The script job_mpi.sh (which runs an MPI program after loading the appropriate MPI module) is run with a time limit of 10 hours in shared mode on 10 cores of one batch node, requesting 40000 MB of memory.
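
The examples in this guide mix two notations for the time limit: plain minutes (-t 60, -t 600) and hours:minutes:seconds (-t 1:00:00). As a quick cross-check, the following sketch converts both to minutes. slurm_time_to_minutes is a hypothetical helper, not a Slurm command, and it covers only these two forms; seconds are rounded up to a full minute, which is how the sbatch documentation describes the handling of time limits.

```shell
# Convert a time limit to whole minutes.
# Supported forms (only those used in this guide): "M" and "H:M:S".
slurm_time_to_minutes() {
    echo "$1" | awk -F: '
        NF == 1 { print $1 + 0 }
        NF == 3 { print $1 * 60 + $2 + int(($3 + 59) / 60) }
    '
}

slurm_time_to_minutes 60         # prints 60
slurm_time_to_minutes 1:00:00    # prints 60
slurm_time_to_minutes 600        # prints 600 (10 hours)
```

Other notations accepted by Slurm, such as days-hours, are deliberately left out of this sketch and produce no output.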


4 MPI Programs on many nodes

  • Use the time option -t or --time (instead of -l walltime). A single number after -t is interpreted as minutes.
  • Use the options -N y or --nodes=y and --ntasks-per-node=x (instead of -l nodes=y,ppn=x). x can be a number between 1 and 40, or between 1 and 28 on Broadwell nodes (the nodes have 40 and 28 cores, respectively); you shouldn't use hyperthreading.
  • Don't use the option --mem (the replacement for -l pmem): the nodes are used exclusively, so no memory request is needed.
  • You always use the nodes exclusively.
  • Don't forget to load the appropriate MPI module in your job script.


Example for an MPI job

$ sbatch -p multiple -t 48:00:00 -N 10 --ntasks-per-node=40  ./job_mpi.sh 

The script job_mpi.sh (which runs an MPI program after loading the appropriate MPI module) is run with a time limit of 2 days on 400 cores spread over ten batch nodes.
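
For illustration, a job_mpi.sh for the multi-node case might look like the script written by the following sketch. It is a sketch under stated assumptions, not an official template: the module name mpi/openmpi and the executable name my_mpi_program are hypothetical, and mpirun is assumed to take the task layout from the Slurm allocation.

```shell
# Write a multi-node MPI job script; the #SBATCH lines mirror the example above.
cat > job_mpi.sh <<'EOF'
#!/bin/bash
#SBATCH --partition=multiple
#SBATCH --time=48:00:00
#SBATCH --nodes=10
#SBATCH --ntasks-per-node=40
# No --mem line: the nodes are used exclusively.

# Module name is an assumption; check "module avail" for the real one.
module load mpi/openmpi

# Hypothetical executable name. mpirun is assumed to pick up the
# allocation of 10 x 40 = 400 tasks from Slurm.
mpirun ./my_mpi_program
EOF
chmod +x job_mpi.sh
```

On Broadwell nodes, --ntasks-per-node would have to be at most 28 instead of 40.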