DACHS/Quickstart Slurm

Quickstart

Contact and Further Links to the bwHPC Wiki

If you don't find the information you need in this wiki, you can always reach us at dachs-admin@hs-esslingen.de.

Typical SBATCH script

This is just a brief guide to get you started. All queues, limits, and hardware are documented in detail on the DACHS/Hardware page.

Our default queue is gpu1, which you specify explicitly with --partition=gpu1. Other queues are gpu4 (4 AMD MI300A APUs) and gpu8 (8 NVIDIA H100 GPUs). For more detailed information, visit DACHS/Queues.
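You can also set the partition directly in your job script instead of on the command line. For example, to request a node from the gpu4 queue (gpu4 is used here only as an illustration; pick the queue that fits your job):

#SBATCH --partition=gpu4   # request a node from the gpu4 queue (4 AMD MI300A APUs)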

This is the content of testjob.sh. It is basically a Bash script whose #SBATCH comments are read as instructions by the Slurm scheduler.

#!/bin/bash
#SBATCH --ntasks=1       # allocate 1 CPU
#SBATCH --time=30:00     # time limit of 30 min
#SBATCH --mem=42gb       # allocate 42 GB RAM
#SBATCH --gres=gpu:1     # allocate one GPU
#SBATCH --job-name="CHANGEME job name"
# Uncomment the following lines to get email notifications about your job
# #SBATCH --mail-type=ALL   # e.g. ALL, BEGIN, END, FAIL
# #SBATCH --mail-user=CHANGEME@EMAIL.COM

# Your shell script that sets up and runs your work goes here.
# That might include loading a module from the provided software (check with
# `module avail`) or sourcing a Python environment.

# Load Python 3.13.3 compiled with gnu 14.2
module load devel/python/3.13.3-gnu-14.2

# Run your Python script that you previously prepared on the Login node.
uv run python3 main.py
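The last line assumes you manage your Python environment with uv. If you instead prepared a plain virtual environment on the login node, a minimal sketch of the last two lines could look like this (the path ~/venvs/myproject is only an example):

module load devel/python/3.13.3-gnu-14.2
source ~/venvs/myproject/bin/activate   # example path; activate the venv created on the login node
python3 main.py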

For a detailed overview of Slurm, visit the SBATCH options and Slurm wiki pages.

You can display idle (immediately available) nodes by running

sinfo_t_idle
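If you prefer plain Slurm commands, sinfo gives a similar view. For example, to list idle nodes in the default queue (partition name from above):

sinfo --partition=gpu1 --states=idle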

Submit your job:

sbatch testjob.sh
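On success, sbatch prints a line containing the assigned job ID. If you want only the numeric ID, for instance to reuse it in a script, the standard --parsable flag does that (shown here as a sketch):

JOBID=$(sbatch --parsable testjob.sh)
echo "Submitted job ${JOBID}"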

Once you have queued a job, you can show its status:

squeue
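Without arguments, squeue lists all jobs on the cluster. To restrict the output to your own jobs, pass your user name (a standard Slurm option):

squeue -u $USER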

If resources are not immediately available, add --start to show the expected start time:

squeue --start

If you want to cancel your job, run

scancel <jobid>
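The job ID is shown in the JOBID column of squeue. If you want to cancel all of your own queued and running jobs at once, scancel also accepts a user filter (use with care):

scancel -u $USER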