BwUniCluster3.0/Batch Queues: Difference between revisions

From bwHPC Wiki

Revision as of 10:52, 4 December 2024

sbatch -p queue

Compute resources such as (wall-)time, nodes and memory are restricted and must fit into queues. Since requested compute resources are NOT always automatically mapped to the correct queue class, you must add the correct queue class to your sbatch command. Specifying a queue is mandatory on BwUniCluster 2.0.
The details are:

queue                   | node                               | default resources           | minimal resources | maximum resources
cpu_il                  | CPU nodes, Ice Lake                | time=10, mem-per-cpu=1125mb |                   | time=30, nodes=1, mem=180000mb, ntasks-per-node=40, (threads-per-core=2)
cpu                     | CPU nodes, Standard                | time=10, mem-per-cpu=1125mb |                   | time=30, nodes=1, mem=180000mb, ntasks-per-node=40, (threads-per-core=2)
highmem                 | CPU nodes, High Memory             | time=10, mem-per-cpu=1125mb |                   | time=30, nodes=1, mem=180000mb, ntasks-per-node=40, (threads-per-core=2)
gpu_h100                | GPU nodes, NVIDIA GPU x4           | time=10, mem-per-cpu=1125mb |                   | time=30, nodes=1, mem=180000mb, ntasks-per-node=40, (threads-per-core=2)
gpu_mi300               | GPU node, AMD GPU x4               | time=10, mem-per-cpu=1125mb |                   | time=30, nodes=1, mem=180000mb, ntasks-per-node=40, (threads-per-core=2)
gpu_a100_il/gpu_h100_il | GPU nodes, Ice Lake, NVIDIA GPU x4 | time=10, mem-per-cpu=1125mb |                   | time=30, nodes=1, mem=180000mb, ntasks-per-node=40, (threads-per-core=2)

Table 1: Regular Queues

queue        | node                     | default resources           | minimal resources | maximum resources
dev_cpu_il   | CPU nodes, Ice Lake      | time=10, mem-per-cpu=1125mb |                   | time=30, nodes=1, mem=180000mb, ntasks-per-node=40, (threads-per-core=2)
dev_cpu      | CPU nodes, Standard      | time=10, mem-per-cpu=1125mb |                   | time=30, nodes=1, mem=180000mb, ntasks-per-node=40, (threads-per-core=2)
dev_highmem  | CPU nodes, High Memory   | time=10, mem-per-cpu=1125mb |                   | time=30, nodes=1, mem=180000mb, ntasks-per-node=40, (threads-per-core=2)
dev_gpu_h100 | GPU nodes, NVIDIA GPU x4 | time=10, mem-per-cpu=1125mb |                   | time=30, nodes=1, mem=180000mb, ntasks-per-node=40, (threads-per-core=2)

Table 2: Development Queues
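
The naming pattern in the two tables can be sketched as a small shell helper: each development queue listed in Table 2 is the regular queue name with a `dev_` prefix. The function name `dev_queue_for` is hypothetical and only serves as an illustration; it covers exactly the four development queues listed above.

```shell
#!/bin/sh
# Map a regular queue to its development queue as listed in Table 2.
# dev_queue_for is a hypothetical helper, not a cluster command.
dev_queue_for() {
    case "$1" in
        cpu_il|cpu|highmem|gpu_h100)
            echo "dev_$1" ;;
        *)
            echo "no development queue listed for: $1" >&2
            return 1 ;;
    esac
}

# Example: print the development queue that belongs to cpu_il.
dev_queue_for cpu_il
```

The result could then be passed on, e.g. as `sbatch -p "$(dev_queue_for cpu_il)"` followed by a job script.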

bwUniCluster 2.0
sbatch -p queue
queue           | node           | default resources                              | minimum resources | maximum resources | remarks
dev_single      | thin           | time=10, mem-per-cpu=1125mb                    |                   | time=30, nodes=1, mem=180000mb, ntasks-per-node=40, (threads-per-core=2) | 6 nodes are reserved for this queue. Only for development, i.e. debugging or performance optimization ...
single          | thin           | time=30, mem-per-cpu=1125mb                    |                   | time=72:00:00, nodes=1, mem=180000mb, ntasks-per-node=40, (threads-per-core=2)
dev_multiple    | hpc            | time=10, mem-per-cpu=1125mb                    | nodes=2           | time=30, nodes=4, mem=90000mb, ntasks-per-node=40, (threads-per-core=2) | 8 nodes are reserved for this queue. Only for development, i.e. debugging or performance optimization ...
multiple        | hpc            | time=30, mem-per-cpu=1125mb                    | nodes=2           | time=72:00:00, mem=90000mb, nodes=80, ntasks-per-node=40, (threads-per-core=2)
dev_multiple_il | IceLake        | time=10, mem-per-cpu=1950mb                    | nodes=2           | time=30, nodes=8, mem=249600mb, ntasks-per-node=64, (threads-per-core=2) | 8 nodes are reserved for this queue. Only for development, i.e. debugging or performance optimization ...
multiple_il     | IceLake        | time=10, mem-per-cpu=1950mb                    | nodes=2           | time=72:00:00, nodes=80, mem=249600mb, ntasks-per-node=64, (threads-per-core=2)
dev_gpu_4_a100  | IceLake + A100 | time=10, mem-per-gpu=127500mb, cpus-per-gpu=16 |                   | time=30, nodes=1, mem=510000mb, ntasks-per-node=64, (threads-per-core=2)
gpu_4_a100      | IceLake + A100 | time=10, mem-per-gpu=127500mb, cpus-per-gpu=16 |                   | time=48:00:00, nodes=9, mem=510000mb, ntasks-per-node=64, (threads-per-core=2)
gpu_4_h100      | IceLake + H100 | time=10, mem-per-gpu=127500mb, cpus-per-gpu=16 |                   | time=48:00:00, nodes=5, mem=510000mb, ntasks-per-node=64, (threads-per-core=2)
fat             | fat            | time=10, mem-per-cpu=18750mb                   | mem=180001mb      | time=72:00:00, nodes=1, mem=3000000mb, ntasks-per-node=80, (threads-per-core=2)
dev_gpu_4       | gpu4           | time=10, mem-per-gpu=94000mb, cpus-per-gpu=10  |                   | time=30, nodes=1, mem=376000, ntasks-per-node=40, (threads-per-core=2) | 1 node is reserved for this queue. Only for development, i.e. debugging or performance optimization ...
gpu_4           | gpu4           | time=10, mem-per-gpu=94000mb, cpus-per-gpu=10  |                   | time=48:00:00, mem=376000, nodes=14, ntasks-per-node=40, (threads-per-core=2)
gpu_8           | gpu8           | time=10, mem-per-cpu=94000mb, cpus-per-gpu=10  |                   | time=48:00:00, mem=752000, nodes=10, ntasks-per-node=40, (threads-per-core=2)

The default resources of a queue class define the time, number of tasks and memory that apply when they are not explicitly given with the sbatch command. The resource options --time, --ntasks, --nodes, --mem and --mem-per-cpu are described here.
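
As an illustration of how a request has to fit into a queue, the following sketch looks up the maximum walltime of the bwUniCluster 2.0 queues from the table above and checks a requested time against it. The function names `queue_max_minutes` and `fits_queue` are hypothetical helpers; Slurm itself performs the authoritative check when the job is submitted.

```shell
#!/bin/sh
# Maximum walltime in minutes per queue, copied from the bwUniCluster 2.0
# table (time=30 -> 30 min, 48:00:00 -> 2880 min, 72:00:00 -> 4320 min).
# Both function names are hypothetical helpers, not Slurm commands.
queue_max_minutes() {
    case "$1" in
        dev_single|dev_multiple|dev_multiple_il|dev_gpu_4|dev_gpu_4_a100)
            echo 30 ;;
        gpu_4|gpu_8|gpu_4_a100|gpu_4_h100)
            echo 2880 ;;
        single|multiple|multiple_il|fat)
            echo 4320 ;;
        *)
            echo "unknown queue: $1" >&2
            return 1 ;;
    esac
}

# Exit status 0 if the requested minutes ($2) fit into the queue ($1).
fits_queue() {
    fq_max=$(queue_max_minutes "$1") || return 1
    [ "$2" -le "$fq_max" ]
}

# Example: a 120-minute job fits into "single" but not into "dev_single".
fits_queue single 120 && echo "single: ok"
fits_queue dev_single 120 || echo "dev_single: too long"
```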

Queue class examples

To submit your batch job to a specific queue, for example dev_multiple, please use:

$ sbatch --partition=dev_multiple
     or 
$ sbatch -p dev_multiple
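
The queue can also be set inside the job script with an `#SBATCH` line. The following is a minimal sketch: the file name `job.sh` and the script body are made up, and the requested values are chosen to fit the dev_multiple limits in the table above (at least 2 nodes, at most 30 minutes). The snippet only writes the script and prints the submit command, so it can be tried without a Slurm installation.

```shell
#!/bin/sh
# Write a small job script; the file name job.sh is hypothetical.
cat > job.sh <<'EOF'
#!/bin/bash
# Queue selection, equivalent to "sbatch -p dev_multiple":
#SBATCH --partition=dev_multiple
# dev_multiple requires at least 2 nodes:
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=40
# Walltime in minutes; the maximum of this queue is 30:
#SBATCH --time=10
echo "Running on $(hostname)"
EOF

# Print the submit command instead of running it (no Slurm needed here).
echo "sbatch job.sh"
```

Options given on the sbatch command line take precedence over `#SBATCH` lines in the script, so `sbatch -p <queue> job.sh` would still override the queue chosen in the file.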


Interactive Jobs

On bwUniCluster 2.0 you are only allowed to run short jobs (much less than 1 hour) with small memory requirements (much less than 8 GByte) on the login nodes. If you want to run longer jobs and/or jobs that request more than 8 GByte of memory, you must allocate resources for a so-called interactive job with the command salloc on a login node. For a serial application that runs on a compute node, requires 5000 MByte of memory and is limited to an interactive run of 2 hours, the following command has to be executed:

$ salloc -p single -n 1 -t 120 --mem=5000

You will then get one core on a compute node within the partition "single". After executing this command, DO NOT CLOSE your current terminal session, but wait until the queueing system Slurm has granted you the requested resources on the compute system. You will be logged in automatically on the granted core! To run a serial program on the granted core, you only have to type the name of the executable.

$ ./<my_serial_program>

Please be aware that your serial job must finish in less than 2 hours in this example; otherwise the system will kill the job during runtime.
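
The 2-hour limit follows from `-t 120`: Slurm interprets a bare number as minutes, and the same limit can be written as hours:minutes:seconds. A minimal sketch of the conversion, with the hypothetical helper name `minutes_to_hms`:

```shell
#!/bin/sh
# Convert a walltime given in minutes to HH:MM:SS notation.
# minutes_to_hms is a hypothetical helper for illustration only.
minutes_to_hms() {
    printf '%02d:%02d:00\n' $(( $1 / 60 )) $(( $1 % 60 ))
}

# The value from the salloc example above:
minutes_to_hms 120   # prints 02:00:00
```

Under that reading, `salloc -p single -n 1 -t 02:00:00 --mem=5000` requests the same walltime as the command shown above.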


You can now also start a graphical X11 terminal that connects you to the dedicated resource, which is available for 2 hours. Start it with the command:

$ xterm

Note that once the walltime limit has been reached, the resources, i.e. the compute node, will automatically be revoked.


An interactive parallel application running on one or more compute nodes (here, for example, 5 nodes) with 40 cores each usually requires a certain amount of memory (e.g. 50 GByte) and a maximum runtime (e.g. 1 hour). For example, 5 nodes can be allocated with the following command:

$ salloc -p multiple -N 5 --ntasks-per-node=40 -t 01:00:00 --mem=50gb

Now you can run parallel jobs on 200 cores with 50 GByte of memory per node. Please be aware that you will be logged in on core 0 of the first node. If you want access to another node, open a new terminal, connect it to BwUniCluster 2.0 as well, and type the following commands to attach to the running interactive job and then to a specific node:

$ srun --jobid=XXXXXXXX --pty /bin/bash
$ srun --nodelist=uc2nXXX --pty /bin/bash

With the command:

$ squeue

the jobid and the nodelist can be shown.
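
To pick the node list of one job out of the squeue output, the format option `-o` with the fields `%i` (job ID) and `%N` (node list) can be combined with a small filter. The sketch below uses a hypothetical helper `nodelist_of_job` and a made-up sample line in place of real squeue output, so that it runs without Slurm.

```shell
#!/bin/sh
# Print the node list of one job from "jobid nodelist" lines on stdin,
# as produced by: squeue -h -u "$USER" -o '%i %N'
# nodelist_of_job is a hypothetical helper, not a Slurm command.
nodelist_of_job() {
    awk -v id="$1" '$1 == id { print $2 }'
}

# On the cluster one would pipe real squeue output into the function:
#   squeue -h -u "$USER" -o '%i %N' | nodelist_of_job 12345678
# Here a made-up sample line stands in for squeue:
printf '%s\n' '12345678 uc2n[001-005]' | nodelist_of_job 12345678
```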

To run MPI programs, simply type mpirun <program_name>; your program will then run on 200 cores. A very simple example for starting a parallel job is:

$ mpirun <my_mpi_program>
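
The figure of 200 cores is the product of the 5 nodes and the 40 tasks per node requested with salloc. Inside an allocation Slurm exports `SLURM_JOB_NUM_NODES` and, when `--ntasks-per-node` was given, `SLURM_NTASKS_PER_NODE`. The following sketch uses the hypothetical helper `total_tasks` and falls back to the values of the salloc example when it runs outside a job.

```shell
#!/bin/sh
# Multiply the number of nodes by the tasks per node.
# total_tasks is a hypothetical helper for illustration only.
total_tasks() {
    echo $(( $1 * $2 ))
}

# Use Slurm's environment variables inside a job; outside a job fall back
# to the values of the salloc example (5 nodes, 40 tasks per node).
total_tasks "${SLURM_JOB_NUM_NODES:-5}" "${SLURM_NTASKS_PER_NODE:-40}"
```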

You can also start the debugger ddt with the commands:

$ module add devel/ddt
$ ddt <my_mpi_program>

The above commands will execute the parallel program <my_mpi_program> on all available cores. You can also start parallel programs on a subset of cores, for example:

$ mpirun -n 50 <my_mpi_program>

If you are using Intel MPI, you must start <my_mpi_program> with the command mpiexec.hydra (instead of mpirun).
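
The launcher rule can be captured in a small wrapper, sketched here under the assumption that the MPI flavour is passed in as a plain label. The function name `mpi_launcher`, the labels `intel` and `openmpi`, and the program path are hypothetical. The snippet only prints the resulting command line.

```shell
#!/bin/sh
# Choose the launcher named in the text: mpiexec.hydra for Intel MPI,
# mpirun otherwise. mpi_launcher and its labels are hypothetical.
mpi_launcher() {
    case "$1" in
        intel) echo mpiexec.hydra ;;
        *)     echo mpirun ;;
    esac
}

# Print (not run) the command for 50 processes with Intel MPI.
echo "$(mpi_launcher intel) -n 50 ./my_mpi_program"
```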