DACHS/Quickstart Slurm


Quickstart

Contact and Further Links to bwHPC wiki

If you don't find the information you need in this wiki, you can always reach us at dachs-admin@hs-esslingen.de.

Typical SBATCH script

This is just a brief guide to get you started. All queues, limits, and hardware are documented in detail on the DACHS/Hardware page.

Our default queue is gpu1, which you specify explicitly with --partition=gpu1. Other queues are gpu4 (4 AMD MI300A APUs) and gpu8 (8 NVIDIA H100 GPUs). For more detailed information, visit DACHS/Queues.
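
To send a job to one of the other queues, you can either add the corresponding directive to your job script or pass it on the command line at submit time (a minimal sketch using the example script testjob.sh shown below; command-line options take precedence over #SBATCH directives in the script):

#SBATCH --partition=gpu8    # inside the job script: request the 8x NVIDIA H100 queue

sbatch --partition=gpu4 testjob.sh    # at submit time: request the 4x AMD MI300A queue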

This is the content of testjob.sh. It is basically a Bash script with some instructions for the Slurm scheduler written as comments.

#!/bin/bash
#SBATCH --partition=gpu1 # select the default gpu1 queue
#SBATCH --ntasks=1       # allocate 1 CPU
#SBATCH --time=30:00     # time limit of 30 min
#SBATCH --mem=42gb       # allocate 42 GB RAM
#SBATCH --gres=gpu:1     # allocate one GPU
#SBATCH --job-name="CHANGEME job name"
# Uncomment the following lines to get email notifications about your job
# #SBATCH --mail-type=ALL  # valid values include BEGIN, END, FAIL, ALL
# #SBATCH --mail-user=CHANGEME@EMAIL.COM

# Your shell script that sets up your job and starts your work goes here.
# That might include loading a module from the provided software (check with
# `module avail`) or sourcing a Python environment.

# Load Python 3.13.3 compiled with GNU 14.2
module load devel/python/3.13.3-gnu-14.2

# Run the Python script that you previously prepared on the login node.
uv run python3 main.py
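
By default, sbatch writes the job's standard output and error to a file named slurm-<jobid>.out in the directory you submitted from. If you prefer a different name, you can add an output directive to the script above (a small sketch; the filename is only an example, and %j is replaced by the job ID):

#SBATCH --output=myjob-%j.out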

For a detailed overview of Slurm, visit the SBATCH options and Slurm wiki pages.

You can display available nodes by running

sinfo_t_idle
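
sinfo_t_idle is a convenience helper provided on the cluster; the standard Slurm command sinfo is also available if you want to see the state of all nodes in a particular queue (gpu1 here is just one of the partitions listed above):

sinfo --partition=gpu1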

Submit your job:

sbatch testjob.sh
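
On success, sbatch prints the ID of the newly queued job; you will need this ID for the commands below. You can also override individual #SBATCH directives at submit time without editing the script (a sketch; the one-hour limit is only an example value):

sbatch --time=1:00:00 testjob.sh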

Once you have queued a job, you can show its status:

squeue
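
On a busy cluster the full queue can be long. To list only your own jobs, or one specific job, you can filter with standard squeue options (<jobid> is a placeholder):

squeue -u $USER
squeue -j <jobid>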

If resources are not immediately available, add --start to show the job's expected start time:

squeue --start
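
If you need more detail than squeue shows, for example the assigned nodes or the reason a job is still pending, the standard scontrol command prints the full job record (<jobid> is a placeholder):

scontrol show job <jobid>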

If you want to cancel your job, run

scancel <jobid>
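
The job ID is shown by squeue and printed by sbatch at submission. If you want to cancel all of your pending and running jobs at once, scancel can also filter by user (a standard option; use with care):

scancel -u $USER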