<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.bwhpc.de/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=S+Braun</id>
	<title>bwHPC Wiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.bwhpc.de/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=S+Braun"/>
	<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/e/Special:Contributions/S_Braun"/>
	<updated>2026-04-21T02:37:30Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.39.17</generator>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Login&amp;diff=15854</id>
		<title>BwUniCluster3.0/Login</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Login&amp;diff=15854"/>
		<updated>2026-03-20T05:55:00Z</updated>

		<summary type="html">&lt;p&gt;S Braun: /* Login with SSH command (Linux, Mac, Windows) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
Access to bwUniCluster 3.0 is &#039;&#039;&#039;limited to IP addresses from the BelWü network&#039;&#039;&#039;.&lt;br /&gt;
All home institutions of our current users are connected to BelWü, so if you are on your campus network (e.g. in your office or on the Campus WiFi) you should be able to connect to bwUniCluster 3.0 without restrictions.&lt;br /&gt;
If you are outside the BelWü network (e.g. at home), you must first establish a VPN connection to your home institution or connect via an SSH jump host at your home institution.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
The login nodes of the bwHPC clusters are the access point to the compute system, your &amp;lt;code&amp;gt;$HOME&amp;lt;/code&amp;gt; directory and your workspaces.&lt;br /&gt;
All users must log in through these nodes to submit jobs to the cluster.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Prerequisites for successful login:&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
You need to have&lt;br /&gt;
# Completed the 3-step [[registration|&#039;&#039;&#039;registration&#039;&#039;&#039;]] procedure.&lt;br /&gt;
# Set a [[Registration/Password|&#039;&#039;&#039;service password&#039;&#039;&#039;]] for bwUniCluster 3.0.&lt;br /&gt;
# Set up a [[Registration/2FA|&#039;&#039;&#039;second factor&#039;&#039;&#039;]] for the time-based one-time password (TOTP).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
= Login to the bwUniCluster =&lt;br /&gt;
&lt;br /&gt;
Login to bwUniCluster 3.0 is only possible with a Secure Shell (SSH) client; you must know your username on the cluster and the hostname of the login nodes.&lt;br /&gt;
For more general information on SSH clients, visit the [[BwUniCluster3.0/Login/Client|SSH Clients Guide]].&lt;br /&gt;
&lt;br /&gt;
== Username ==&lt;br /&gt;
&lt;br /&gt;
If you want to use bwUniCluster 3.0, you need to add a prefix to your local username.&lt;br /&gt;
&lt;br /&gt;
For prefixes please refer to the [[Registration/Login/Username#Prefix_for_Universities|prefix table]].&lt;br /&gt;
&lt;br /&gt;
Examples:&amp;lt;br/&amp;gt;&lt;br /&gt;
* If your local username at your university is &amp;lt;code&amp;gt;ab123&amp;lt;/code&amp;gt; and you are a user from the University of Freiburg, this combines to &amp;lt;code&amp;gt;fr_ab123&amp;lt;/code&amp;gt;.&lt;br /&gt;
* If your KIT username is &amp;lt;code&amp;gt;ab1234&amp;lt;/code&amp;gt; and you are a user from KIT, this combines to &amp;lt;code&amp;gt;ka_ab1234&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== Hostnames ==&lt;br /&gt;
&lt;br /&gt;
The system has two login nodes.&lt;br /&gt;
The login node is selected automatically.&lt;br /&gt;
If you log in multiple times, different sessions may run on different login nodes.&lt;br /&gt;
&lt;br /&gt;
Login to bwUniCluster 3.0:&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Hostname !! Node type&lt;br /&gt;
|-&lt;br /&gt;
| &#039;&#039;&#039;uc3.scc.kit.edu&#039;&#039;&#039;          || login to one of the two login nodes&lt;br /&gt;
|-&lt;br /&gt;
| &#039;&#039;&#039;bwunicluster.scc.kit.edu&#039;&#039;&#039; || login to one of the two login nodes&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
With the launch of bwUniCluster 3.0, &#039;&#039;&#039;bwunicluster.scc.kit.edu&#039;&#039;&#039; no longer points to &#039;&#039;&#039;uc2.scc.kit.edu&#039;&#039;&#039; but to &#039;&#039;&#039;uc3.scc.kit.edu&#039;&#039;&#039;. To remove the resulting warnings from your SSH client, you can delete the old host key as follows: &amp;lt;code&amp;gt;ssh-keygen -R bwunicluster&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
For the sake of simplicity, &#039;&#039;&#039;we recommend using uc3.scc.kit.edu as the server address&#039;&#039;&#039;: &amp;lt;code&amp;gt;ssh prefix_&amp;lt;username&amp;gt;@uc3.scc.kit.edu&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Until 06.07.2025, login to bwUniCluster 2.0 is still possible analogously via &amp;lt;code&amp;gt;ssh &amp;lt;username&amp;gt;@uc2.scc.kit.edu&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
In general, you should use automatic selection to allow us to balance the load over the two login nodes.&lt;br /&gt;
If you need to connect to a specific login node, you can use the following hostnames:&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Hostname !! Node type&lt;br /&gt;
|-&lt;br /&gt;
| &#039;&#039;&#039;uc3-login1.scc.kit.edu&#039;&#039;&#039; || bwUniCluster 3.0 first login node&lt;br /&gt;
|-&lt;br /&gt;
| &#039;&#039;&#039;uc3-login2.scc.kit.edu&#039;&#039;&#039; || bwUniCluster 3.0 second login node&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
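&lt;br /&gt;
If you log in frequently, a host entry in your OpenSSH client configuration saves typing. The following is only a sketch of such an entry in &amp;lt;code&amp;gt;~/.ssh/config&amp;lt;/code&amp;gt;; the alias &amp;lt;code&amp;gt;uc3&amp;lt;/code&amp;gt; and the username &amp;lt;code&amp;gt;ka_ab1234&amp;lt;/code&amp;gt; are placeholders that you must replace with your own values.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Sketch of an OpenSSH client configuration entry (~/.ssh/config).&lt;br /&gt;
# &amp;quot;uc3&amp;quot; is an arbitrary alias, &amp;quot;ka_ab1234&amp;quot; a placeholder username.&lt;br /&gt;
Host uc3&lt;br /&gt;
    HostName uc3.scc.kit.edu&lt;br /&gt;
    User ka_ab1234&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
With this entry, &amp;lt;code&amp;gt;ssh uc3&amp;lt;/code&amp;gt; is equivalent to the full command shown above.&lt;br /&gt;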
&lt;br /&gt;
== Host Keys ==&lt;br /&gt;
&lt;br /&gt;
When you log in, you may receive the message &amp;lt;code&amp;gt;The authenticity of host &#039;&amp;lt;host address&amp;gt;&#039; can&#039;t be established.&amp;lt;/code&amp;gt; along with the host key fingerprint. This is intended so you can verify the authenticity of the host you are connecting to. Before you continue, you should verify that this fingerprint matches one of the following:&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Algorithm !! Fingerprint (SHA256)&lt;br /&gt;
|-&lt;br /&gt;
| &#039;&#039;&#039;RSA&#039;&#039;&#039; || SHA256:RaE0/tqQMMBmJuDCIo3WZ38YJsz0godVyt6aUOk/E0M&lt;br /&gt;
|-&lt;br /&gt;
| &#039;&#039;&#039;ECDSA&#039;&#039;&#039; || SHA256:LjBYL/x86ZAlL0JdlXrCmPYXvS3DaSiMuvycojBMdwQ&lt;br /&gt;
|-&lt;br /&gt;
| &#039;&#039;&#039;ED25519&#039;&#039;&#039; || SHA256:5mZYEpKigwK5ibBMHRrh3WIkOtCqomJW6H7OMbPk3ec&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
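&lt;br /&gt;
If you want to double-check a fingerprint against this table before accepting it, you can also fetch the host key and print its SHA256 fingerprint with the standard OpenSSH tools. This is only a sketch; note that &amp;lt;code&amp;gt;ssh-keyscan&amp;lt;/code&amp;gt; itself trusts the network, so the table above remains the authoritative reference.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Fetch the ED25519 host key of the login address and print its SHA256 fingerprint.&lt;br /&gt;
# Compare the output with the ED25519 row of the table above.&lt;br /&gt;
ssh-keyscan -t ed25519 uc3.scc.kit.edu 2&amp;gt;/dev/null | ssh-keygen -lf -&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;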
&lt;br /&gt;
== Login with SSH command (Linux, Mac, Windows) ==&lt;br /&gt;
&lt;br /&gt;
Linux, macOS, other Unix-like operating systems, and Microsoft Windows come with a built-in SSH client, most likely provided by the OpenSSH project.&lt;br /&gt;
&lt;br /&gt;
For login, use one of the following ssh commands:&lt;br /&gt;
&lt;br /&gt;
 ssh -l &amp;lt;username&amp;gt; uc3.scc.kit.edu&lt;br /&gt;
 ssh &amp;lt;username&amp;gt;@bwunicluster.scc.kit.edu&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
To run graphical applications, you can use the &amp;lt;code&amp;gt;-X&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;-Y&amp;lt;/code&amp;gt; flag to &amp;lt;code&amp;gt;ssh&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
 ssh -Y -l &amp;lt;username&amp;gt; bwunicluster.scc.kit.edu&lt;br /&gt;
&lt;br /&gt;
For better performance, we recommend using [[VNC]].&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Login with graphical SSH client (Windows) ==&lt;br /&gt;
&lt;br /&gt;
For Windows, we suggest using [[Data_Transfer/Graphical_Clients#MobaXterm|MobaXterm]] for login and file transfer.&lt;br /&gt;
&lt;br /&gt;
Start &#039;&#039;MobaXterm&#039;&#039; and fill in the following fields:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Remote name              : uc3.scc.kit.edu    # or bwunicluster.scc.kit.edu&lt;br /&gt;
Specify user name        : &amp;lt;username&amp;gt;&lt;br /&gt;
Port                     : 22&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
After that, click &#039;OK&#039;. A terminal will open in which you can enter your credentials.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; When using file transfer with MobaXterm version 23.6, the following configuration change has to be made:&lt;br /&gt;
In the settings, in the &amp;quot;SSH&amp;quot; tab, change the option &amp;quot;SSH engine&amp;quot; from &amp;quot;&amp;lt;new&amp;gt;&amp;quot; to &amp;quot;&amp;lt;legacy&amp;gt;&amp;quot;, then restart MobaXterm.&lt;br /&gt;
&lt;br /&gt;
== Login with Jupyterhub ==&lt;br /&gt;
&lt;br /&gt;
Login takes place at:&lt;br /&gt;
* bwUniCluster 3.0: [https://uc3-jupyter.scc.kit.edu uc3-jupyter.scc.kit.edu]&lt;br /&gt;
* SDIL: [https://sdil-jupyter.scc.kit.edu sdil-jupyter.scc.kit.edu]&lt;br /&gt;
&lt;br /&gt;
More information can be found [[BwUniCluster3.0/Jupyter#Login_process|here]].&lt;br /&gt;
&lt;br /&gt;
== Login Example ==&lt;br /&gt;
&lt;br /&gt;
To log in to bwUniCluster 3.0, you must provide your [[Registration/Password|service password]].&lt;br /&gt;
Proceed as follows:&lt;br /&gt;
# Use SSH to connect to a login node.&lt;br /&gt;
# The system will ask for a one-time password &amp;lt;code&amp;gt;Your OTP:&amp;lt;/code&amp;gt;. Please enter your OTP and confirm it with Enter/Return. If you do not have a second factor yet, please create one (see [[Registration/2FA]]).&lt;br /&gt;
# The system will ask you for your service password &amp;lt;code&amp;gt;Password:&amp;lt;/code&amp;gt;. Please enter it and confirm it with Enter/Return. If you do not have a service password yet or have forgotten it, please create one (see [[Registration/Password]]).&lt;br /&gt;
# You will be greeted by the cluster, followed by a shell.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[user@client ~]$ ssh ka_ab1234@uc3.scc.kit.edu&lt;br /&gt;
(ka_ab1234@uc3.scc.kit.edu) Your OTP: cccccctlljdbrjdleujigivvfnkjbucudugjjlutfbrk&lt;br /&gt;
(ka_ab1234@uc3.scc.kit.edu) Password: &lt;br /&gt;
********************************************************************************&lt;br /&gt;
*                                                                              *&lt;br /&gt;
*                   Karlsruher Institut für Technologie (KIT)                  *&lt;br /&gt;
*                                                                              *&lt;br /&gt;
*                       Scientific Computing Center (SCC)                      *&lt;br /&gt;
*                                                                              *&lt;br /&gt;
*                            _    _    _____   ____                            *&lt;br /&gt;
*                           | |  | |  / ____| |___ \                           *&lt;br /&gt;
*                           | |  | | | |        __) |                          *&lt;br /&gt;
*                           | |  | | | |       |__ &amp;lt;                           *&lt;br /&gt;
*                           | |__| | | |____   ___) |                          *&lt;br /&gt;
*                            \____/   \_____| |____/                           *&lt;br /&gt;
*                                                                              *&lt;br /&gt;
*                                                                              *&lt;br /&gt;
*                  (KITE 2.0, RHEL 9.4, Lustre 2.14.0_ddn154)                  *&lt;br /&gt;
*                                                                              *&lt;br /&gt;
*                                                                              *&lt;br /&gt;
********************************************************************************&lt;br /&gt;
Last login: Wed Feb 26 11:08:20 2025 from 2a00:1398:4:181c:2be1:437b:1c36:1337&lt;br /&gt;
&lt;br /&gt;
[ka_ab1234@uc3n990 ~]$&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting ==&lt;br /&gt;
&lt;br /&gt;
See [[BwUniCluster3.0/FAQ#Login|bwUniCluster FAQ]].&lt;br /&gt;
&lt;br /&gt;
= Allowed Activities on Login Nodes =&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#ffa500; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#ffa500; text-align:left&amp;quot;|&lt;br /&gt;
To guarantee usability for all users of the clusters, you must not run your compute jobs on the login nodes.&lt;br /&gt;
Compute jobs must be submitted to the queuing system.&amp;lt;br/&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Any compute job running on the login nodes will be terminated without any notice.&#039;&#039;&#039;&amp;lt;br/&amp;gt;&lt;br /&gt;
Any long-running compilation or any long-running pre- or post-processing of batch jobs must also be submitted to the queuing system.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
The login nodes of the bwHPC clusters are the access point to the compute system, your &amp;lt;code&amp;gt;$HOME&amp;lt;/code&amp;gt; directory and your workspaces.&lt;br /&gt;
These nodes are shared by all users; therefore, your activities on the login nodes are primarily limited to setting up your batch jobs.&lt;br /&gt;
Your activities may also include:&lt;br /&gt;
* &#039;&#039;&#039;short&#039;&#039;&#039; compilation of your program code and&lt;br /&gt;
* &#039;&#039;&#039;lightweight&#039;&#039;&#039; pre- and post-processing of your batch jobs.&lt;br /&gt;
&lt;br /&gt;
We advise users to use [[BwUniCluster3.0/Batch_Queues#Interactive_Jobs|interactive jobs]] for compute- and memory-intensive tasks such as compiling.&lt;br /&gt;
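&lt;br /&gt;
As a rough sketch, such an interactive job can be requested with Slurm&#039;s &amp;lt;code&amp;gt;salloc&amp;lt;/code&amp;gt; command (see the page on running jobs); the resource values below are placeholders and must be adapted to your task and the current queue setup.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Request a short interactive allocation for a compile job (placeholder values).&lt;br /&gt;
salloc --ntasks=1 --cpus-per-task=8 --mem=8G --time=00:30:00&lt;br /&gt;
# Build inside the allocation, then release the resources again.&lt;br /&gt;
make -j 8&lt;br /&gt;
exit&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;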
&lt;br /&gt;
= Related Information =&lt;br /&gt;
&lt;br /&gt;
* If you want to reset your service password, consult the [[Registration/Password|Password Guide]].&lt;br /&gt;
* If you want to register a new token for two-factor authentication (2FA), consult the [[Registration/2FA|2FA Guide]].&lt;br /&gt;
* If you want to de-register, consult the [[Registration/Deregistration|De-registration Guide]].&lt;br /&gt;
* If you need an SSH key for your workflow, read [[Registration/SSH|Registering SSH Keys with your Cluster]].&lt;br /&gt;
* Configuring your shell: [[.bashrc Do&#039;s and Don&#039;ts]]&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15844</id>
		<title>BwUniCluster3.0</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15844"/>
		<updated>2026-03-18T16:58:35Z</updated>

		<summary type="html">&lt;p&gt;S Braun: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## Picture of bwUniCluster - right side  ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## About bwUniCluster                    ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
The &#039;&#039;&#039;bwUniCluster 3.0+KIT-GFA-HPC 3&#039;&#039;&#039; is the joint high-performance computer system of Baden-Württemberg&#039;s Universities and Universities of Applied Sciences for &#039;&#039;&#039;general purpose and teaching&#039;&#039;&#039; and is located at the Scientific Computing Center (SCC) at Karlsruhe Institute of Technology (KIT). The bwUniCluster 3.0 complements the four bwForClusters and their dedicated scientific areas.&lt;br /&gt;
[[File:DSCF6485_rectangled_perspective.jpg|center|600px|frameless|alt=bwUniCluster3.0 |upright=1| bwUniCluster 3.0 ]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Maintenance Section     ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there is no upcoming maintenance&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | SOLVED: Service Incident Notice: bwUniCluster 3.0 Login Not Possible&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
The login issue with bwUniCluster 3.0, which had been occurring since Friday, March 13, 2026, at 10:00 p.m., has been resolved.&lt;br /&gt;
&lt;br /&gt;
The cause was a software error in the parallel file system, which has since been successfully corrected.&lt;br /&gt;
A patch developed for us by the manufacturer has been applied. However, please note that we currently cannot completely rule out that the problem may recur under certain circumstances.&lt;br /&gt;
&lt;br /&gt;
You can now log in as usual. Please check the results of your calculations and resubmit any jobs that were interrupted. &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: News section            ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there is no news&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Transition bwUniCluster 2.0 &amp;amp;rarr; bwUniCluster 3.0&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
&lt;br /&gt;
The HPC cluster bwUniCluster 3.0 is the successor of bwUniCluster 2.0. It features accelerated and CPU-only nodes, with the host system of both node types consisting of classic x86 processor architectures.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
To ensure that you can use the new system successfully and set up your working environment with ease, the following points should be noted.&lt;br /&gt;
&lt;br /&gt;
== Registration ==&lt;br /&gt;
All users who already have an entitlement on bwUniCluster 2.0 are authorized to access bwUniCluster 3.0. The user only needs to &#039;&#039;&#039;register for the new service&#039;&#039;&#039; at https://bwidm.scc.kit.edu .&lt;br /&gt;
&lt;br /&gt;
== Changes ==&lt;br /&gt;
&lt;br /&gt;
Hardware, software and the operating system have been updated and adapted to the latest standards. We would like to draw your attention in particular to the changes in policy, which must also be taken into account.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Changes to hardware, software and policy can be looked up here: [[BwUniCluster3.0/Data_Migration_Guide#Summary_of_changes|Summary of Changes]]&lt;br /&gt;
&lt;br /&gt;
== Migration ==&lt;br /&gt;
bwUniCluster 3.0 features a completely new file system. &#039;&#039;&#039;There is no automatic migration of user data!&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
The file systems of the old system and the login nodes will remain in operation for a period of &#039;&#039;&#039;3 months&#039;&#039;&#039; after the new system goes live (till July 6, 2025).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
In order to move data that is still needed, user software, and user specific settings from the old HOME directory to the new HOME directory, or to new workspaces, instructions are provided here: [[BwUniCluster3.0/Data_Migration_Guide#Migration_of_Data|Data Migration Guide]]&lt;br /&gt;
|}&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Training/Support section##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#eeeefe; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#dedefe; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Training &amp;amp; Support&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
* [[BwUniCluster3.0/Getting_Started|Getting Started]]&lt;br /&gt;
* [https://training.bwhpc.de E-Learning Courses]&lt;br /&gt;
* [[BwUniCluster3.0/Support|Support]]&lt;br /&gt;
* [[BwUniCluster3.0/FAQ|FAQ]]&lt;br /&gt;
* Send [[Feedback|Feedback]] about Wiki pages&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: User Documentation      ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#cef2e0; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | User Documentation&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Access: [[Registration/bwUniCluster|Registration]], [[Registration/Deregistration|Deregistration]], [[BwUniCluster3.0/Policies|Policies]]&lt;br /&gt;
* [[BwUniCluster3.0/Login|Login]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Client|SSH Clients]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Data_Transfer|Data Transfer]]&lt;br /&gt;
* [[BwUniCluster3.0/Hardware_and_Architecture|Hardware and Architecture]]&lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#Compute_resources|Compute Resources]] &lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#File_Systems|File Systems]] &lt;br /&gt;
* [[BwUniCluster3.0/Software|Cluster Specific Software]]&lt;br /&gt;
** [[BwUniCluster3.0/Containers|Using Containers]]&lt;br /&gt;
* [[BwUniCluster3.0/Running_Jobs|Running Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Batch_Jobs:_sbatch|Running Batch Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Interactive_Jobs:_salloc|Running Interactive Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Jupyter|Interactive Computing with Jupyter]]&lt;br /&gt;
* [[BwUniCluster3.0/Maintenance|Operational Changes]]&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Acknowledgement         ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#e6e9eb; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#d1dadf; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Cluster Funding&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Please [[BwUniCluster3.0/Acknowledgement|acknowledge]] bwUniCluster 3.0 in your publications.&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15832</id>
		<title>BwUniCluster3.0</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15832"/>
		<updated>2026-03-17T08:08:32Z</updated>

		<summary type="html">&lt;p&gt;S Braun: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## Picture of bwUniCluster - right side  ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## About bwUniCluster                    ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
The &#039;&#039;&#039;bwUniCluster 3.0+KIT-GFA-HPC 3&#039;&#039;&#039; is the joint high-performance computer system of Baden-Württemberg&#039;s Universities and Universities of Applied Sciences for &#039;&#039;&#039;general purpose and teaching&#039;&#039;&#039; and is located at the Scientific Computing Center (SCC) at Karlsruhe Institute of Technology (KIT). The bwUniCluster 3.0 complements the four bwForClusters and their dedicated scientific areas.&lt;br /&gt;
[[File:DSCF6485_rectangled_perspective.jpg|center|600px|frameless|alt=bwUniCluster3.0 |upright=1| bwUniCluster 3.0 ]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Maintenance Section     ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there is no upcoming maintenance&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Service Incident Notice: bwUniCluster 3.0 Login Not Possible&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
We are currently experiencing an issue on bwUniCluster 3.0 that prevents users from logging in. The disruption is caused by a software error in the filesystem.&lt;br /&gt;
Our team is working intensively to resolve the problem, in close collaboration with the system’s manufacturer. At this time, we are unable to provide an exact estimate for when the issue will be fully resolved.&lt;br /&gt;
&lt;br /&gt;
We do not expect a long‑term outage; therefore, any workspaces that may have expired during the disruption should be easily restorable using ws_restore.&lt;br /&gt;
&amp;lt;!-- Please see the [[BwUniCluster3.0/Maintenance|maintenance]] page for more information about planned upgrades and other changes --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We will inform you via the mailing list as soon as there are any updates.&lt;br /&gt;
|}&lt;br /&gt;
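&lt;br /&gt;
As a rough sketch of the restore step mentioned in the notice above (assuming the usual workspace tools; the workspace names below are placeholders, and the exact syntax should be checked in the workspace documentation):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# List expired workspaces that can still be restored.&lt;br /&gt;
ws_restore -l&lt;br /&gt;
# Restore an expired workspace into an existing target workspace (placeholder names).&lt;br /&gt;
ws_restore my_workspace-1234567890 my_new_workspace&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;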
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: News section            ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there is no news&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Transition bwUniCluster 2.0 &amp;amp;rarr; bwUniCluster 3.0&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
&lt;br /&gt;
The HPC cluster bwUniCluster 3.0 is the successor of bwUniCluster 2.0. It features accelerated and CPU-only nodes, with the host system of both node types consisting of classic x86 processor architectures.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
To ensure that you can use the new system successfully and set up your working environment with ease, the following points should be noted.&lt;br /&gt;
&lt;br /&gt;
== Registration ==&lt;br /&gt;
All users who already have an entitlement on bwUniCluster 2.0 are authorized to access bwUniCluster 3.0. The user only needs to &#039;&#039;&#039;register for the new service&#039;&#039;&#039; at https://bwidm.scc.kit.edu .&lt;br /&gt;
&lt;br /&gt;
== Changes ==&lt;br /&gt;
&lt;br /&gt;
Hardware, software and the operating system have been updated and adapted to the latest standards. We would like to draw your attention in particular to the changes in policy, which must also be taken into account.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Changes to hardware, software and policy can be looked up here: [[BwUniCluster3.0/Data_Migration_Guide#Summary_of_changes|Summary of Changes]]&lt;br /&gt;
&lt;br /&gt;
== Migration ==&lt;br /&gt;
bwUniCluster 3.0 features a completely new file system. &#039;&#039;&#039;There is no automatic migration of user data!&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
The file systems of the old system and the login nodes will remain in operation for a period of &#039;&#039;&#039;3 months&#039;&#039;&#039; after the new system goes live (till July 6, 2025).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
In order to move data that is still needed, user software, and user specific settings from the old HOME directory to the new HOME directory, or to new workspaces, instructions are provided here: [[BwUniCluster3.0/Data_Migration_Guide#Migration_of_Data|Data Migration Guide]]&lt;br /&gt;
|}&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Training/Support section##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#eeeefe; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#dedefe; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Training &amp;amp; Support&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
* [[BwUniCluster3.0/Getting_Started|Getting Started]]&lt;br /&gt;
* [https://training.bwhpc.de E-Learning Courses]&lt;br /&gt;
* [[BwUniCluster3.0/Support|Support]]&lt;br /&gt;
* [[BwUniCluster3.0/FAQ|FAQ]]&lt;br /&gt;
* Send [[Feedback|Feedback]] about Wiki pages&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: User Documentation      ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#cef2e0; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | User Documentation&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Access: [[Registration/bwUniCluster|Registration]], [[Registration/Deregistration|Deregistration]], [[BwUniCluster3.0/Policies|Policies]]&lt;br /&gt;
* [[BwUniCluster3.0/Login|Login]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Client|SSH Clients]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Data_Transfer|Data Transfer]]&lt;br /&gt;
* [[BwUniCluster3.0/Hardware_and_Architecture|Hardware and Architecture]]&lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#Compute_resources|Compute Resources]] &lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#File_Systems|File Systems]] &lt;br /&gt;
* [[BwUniCluster3.0/Software|Cluster Specific Software]]&lt;br /&gt;
** [[BwUniCluster3.0/Containers|Using Containers]]&lt;br /&gt;
* [[BwUniCluster3.0/Running_Jobs|Running Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Batch_Jobs:_sbatch|Running Batch Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Interactive_Jobs:_salloc|Running Interactive Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Jupyter|Interactive Computing with Jupyter]]&lt;br /&gt;
* [[BwUniCluster3.0/Maintenance|Operational Changes]]&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Acknowledgement         ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#e6e9eb; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#d1dadf; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Cluster Funding&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Please [[BwUniCluster3.0/Acknowledgement|acknowledge]] bwUniCluster 3.0 in your publications.&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15831</id>
		<title>BwUniCluster3.0</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15831"/>
		<updated>2026-03-17T08:08:22Z</updated>

		<summary type="html">&lt;p&gt;S Braun: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## Picture of bwUniCluster - right side  ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## About bwUniCluster                    ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
The &#039;&#039;&#039;bwUniCluster 3.0+KIT-GFA-HPC 3&#039;&#039;&#039; is the joint high-performance computer system of Baden-Württemberg&#039;s Universities and Universities of Applied Sciences for &#039;&#039;&#039;general purpose and teaching&#039;&#039;&#039; and is located at the Scientific Computing Center (SCC) at Karlsruhe Institute of Technology (KIT). The bwUniCluster 3.0 complements the four bwForClusters and their dedicated scientific areas.&lt;br /&gt;
[[File:DSCF6485_rectangled_perspective.jpg|center|600px|frameless|alt=bwUniCluster3.0 |upright=1| bwUniCluster 3.0 ]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Maintenance Section     ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there is no upcoming maintenance&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Service Incident Notice: bwUniCluster 3.0 Login Not Possible&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
We are currently experiencing an issue on bwUniCluster 3.0 that prevents users from logging in. The disruption is caused by a software error in the filesystem.&lt;br /&gt;
Our team is working intensively to resolve the problem, in close collaboration with the system’s manufacturer. At this time, we are unable to provide an exact estimate for when the issue will be fully resolved.&lt;br /&gt;
&lt;br /&gt;
We do not expect a long‑term outage; therefore, any workspaces that may have expired during the disruption should be easily restorable using ws_restore.&lt;br /&gt;
&amp;lt;!-- Please see the [[BwUniCluster3.0/Maintenance|maintenance]] page for more information about planned upgrades and other changes --&amp;gt;&lt;br /&gt;
We will inform you via the mailing list as soon as there are any updates.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: News section            ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there is no news&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Transition bwUniCluster 2.0 &amp;amp;rarr; bwUniCluster 3.0&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
&lt;br /&gt;
The HPC cluster bwUniCluster 3.0 is the successor of bwUniCluster 2.0. It features accelerated and CPU-only nodes, with the host system of both node types consisting of classic x86 processor architectures.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
To ensure that you can use the new system successfully and set up your working environment with ease, the following points should be noted.&lt;br /&gt;
&lt;br /&gt;
== Registration ==&lt;br /&gt;
All users who already have an entitlement on bwUniCluster 2.0 are authorized to access bwUniCluster 3.0. The user only needs to &#039;&#039;&#039;register for the new service&#039;&#039;&#039; at https://bwidm.scc.kit.edu .&lt;br /&gt;
&lt;br /&gt;
== Changes ==&lt;br /&gt;
&lt;br /&gt;
Hardware, software and the operating system have been updated and adapted to the latest standards. We would like to draw your attention in particular to the changes in policy, which must also be taken into account.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Changes to hardware, software and policy can be looked up here: [[BwUniCluster3.0/Data_Migration_Guide#Summary_of_changes|Summary of Changes]]&lt;br /&gt;
&lt;br /&gt;
== Migration ==&lt;br /&gt;
bwUniCluster 3.0 features a completely new file system. &#039;&#039;&#039;There is no automatic migration of user data!&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
The file systems of the old system and the login nodes will remain in operation for a period of &#039;&#039;&#039;3 months&#039;&#039;&#039; after the new system goes live (till July 6, 2025).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
In order to move data that is still needed, user software, and user specific settings from the old HOME directory to the new HOME directory, or to new workspaces, instructions are provided here: [[BwUniCluster3.0/Data_Migration_Guide#Migration_of_Data|Data Migration Guide]]&lt;br /&gt;
|}&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Training/Support section##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#eeeefe; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#dedefe; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Training &amp;amp; Support&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
* [[BwUniCluster3.0/Getting_Started|Getting Started]]&lt;br /&gt;
* [https://training.bwhpc.de E-Learning Courses]&lt;br /&gt;
* [[BwUniCluster3.0/Support|Support]]&lt;br /&gt;
* [[BwUniCluster3.0/FAQ|FAQ]]&lt;br /&gt;
* Send [[Feedback|Feedback]] about Wiki pages&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: User Documentation      ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#cef2e0; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | User Documentation&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Access: [[Registration/bwUniCluster|Registration]], [[Registration/Deregistration|Deregistration]], [[BwUniCluster3.0/Policies|Policies]]&lt;br /&gt;
* [[BwUniCluster3.0/Login|Login]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Client|SSH Clients]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Data_Transfer|Data Transfer]]&lt;br /&gt;
* [[BwUniCluster3.0/Hardware_and_Architecture|Hardware and Architecture]]&lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#Compute_resources|Compute Resources]] &lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#File_Systems|File Systems]] &lt;br /&gt;
* [[BwUniCluster3.0/Software|Cluster Specific Software]]&lt;br /&gt;
** [[BwUniCluster3.0/Containers|Using Containers]]&lt;br /&gt;
* [[BwUniCluster3.0/Running_Jobs|Running Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Batch_Jobs:_sbatch|Running Batch Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Interactive_Jobs:_salloc|Running Interactive Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Jupyter|Interactive Computing with Jupyter]]&lt;br /&gt;
* [[BwUniCluster3.0/Maintenance|Operational Changes]]&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Acknowledgement         ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#e6e9eb; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#d1dadf; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Cluster Funding&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Please [[BwUniCluster3.0/Acknowledgement|acknowledge]] bwUniCluster 3.0 in your publications.&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15830</id>
		<title>BwUniCluster3.0</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15830"/>
		<updated>2026-03-17T08:08:07Z</updated>

		<summary type="html">&lt;p&gt;S Braun: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## Picture of bwUniCluster - right side  ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## About bwUniCluster                    ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
The &#039;&#039;&#039;bwUniCluster 3.0+KIT-GFA-HPC 3&#039;&#039;&#039; is the joint high-performance computer system of Baden-Württemberg&#039;s Universities and Universities of Applied Sciences for &#039;&#039;&#039;general purpose and teaching&#039;&#039;&#039; and is located at the Scientific Computing Center (SCC) at Karlsruhe Institute of Technology (KIT). The bwUniCluster 3.0 complements the four bwForClusters and their dedicated scientific areas.&lt;br /&gt;
[[File:DSCF6485_rectangled_perspective.jpg|center|600px|frameless|alt=bwUniCluster3.0 |upright=1| bwUniCluster 3.0 ]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Maintenance Section     ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there is no upcoming maintenance&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Service Incident Notice: bwUniCluster 3.0 Login Not Possible&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
We are currently experiencing an issue on bwUniCluster 3.0 that prevents users from logging in. The disruption is caused by a software error in the filesystem.&lt;br /&gt;
Our team is working intensively to resolve the problem, in close collaboration with the system’s manufacturer. At this time, we are unable to provide an exact estimate for when the issue will be fully resolved.&lt;br /&gt;
&lt;br /&gt;
We do not expect a long‑term outage; therefore, any workspaces that may have expired during the disruption should be easily restorable using ws_restore.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Please see the [[BwUniCluster3.0/Maintenance|maintenance]] page for more information about planned upgrades and other changes --&amp;gt;&lt;br /&gt;
We will inform you via the mailing list as soon as there are any updates.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: News section            ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there is no news&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Transition bwUniCluster 2.0 &amp;amp;rarr; bwUniCluster 3.0&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
&lt;br /&gt;
The HPC cluster bwUniCluster 3.0 is the successor of bwUniCluster 2.0. It features accelerated and CPU-only nodes, with the host system of both node types consisting of classic x86 processor architectures.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
To ensure that you can use the new system successfully and set up your working environment with ease, the following points should be noted.&lt;br /&gt;
&lt;br /&gt;
== Registration ==&lt;br /&gt;
All users who already have an entitlement on bwUniCluster 2.0 are authorized to access bwUniCluster 3.0. The user only needs to &#039;&#039;&#039;register for the new service&#039;&#039;&#039; at https://bwidm.scc.kit.edu .&lt;br /&gt;
&lt;br /&gt;
== Changes ==&lt;br /&gt;
&lt;br /&gt;
Hardware, software and the operating system have been updated and adapted to the latest standards. We would like to draw your attention in particular to the changes in policy, which must also be taken into account.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Changes to hardware, software and policy can be looked up here: [[BwUniCluster3.0/Data_Migration_Guide#Summary_of_changes|Summary of Changes]]&lt;br /&gt;
&lt;br /&gt;
== Migration ==&lt;br /&gt;
bwUniCluster 3.0 features a completely new file system. &#039;&#039;&#039;There is no automatic migration of user data!&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
The file systems of the old system and the login nodes will remain in operation for a period of &#039;&#039;&#039;3 months&#039;&#039;&#039; after the new system goes live (till July 6, 2025).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
In order to move data that is still needed, user software, and user specific settings from the old HOME directory to the new HOME directory, or to new workspaces, instructions are provided here: [[BwUniCluster3.0/Data_Migration_Guide#Migration_of_Data|Data Migration Guide]]&lt;br /&gt;
|}&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Training/Support section##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#eeeefe; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#dedefe; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Training &amp;amp; Support&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
* [[BwUniCluster3.0/Getting_Started|Getting Started]]&lt;br /&gt;
* [https://training.bwhpc.de E-Learning Courses]&lt;br /&gt;
* [[BwUniCluster3.0/Support|Support]]&lt;br /&gt;
* [[BwUniCluster3.0/FAQ|FAQ]]&lt;br /&gt;
* Send [[Feedback|Feedback]] about Wiki pages&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: User Documentation      ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#cef2e0; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | User Documentation&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Access: [[Registration/bwUniCluster|Registration]], [[Registration/Deregistration|Deregistration]], [[BwUniCluster3.0/Policies|Policies]]&lt;br /&gt;
* [[BwUniCluster3.0/Login|Login]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Client|SSH Clients]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Data_Transfer|Data Transfer]]&lt;br /&gt;
* [[BwUniCluster3.0/Hardware_and_Architecture|Hardware and Architecture]]&lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#Compute_resources|Compute Resources]] &lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#File_Systems|File Systems]] &lt;br /&gt;
* [[BwUniCluster3.0/Software|Cluster Specific Software]]&lt;br /&gt;
** [[BwUniCluster3.0/Containers|Using Containers]]&lt;br /&gt;
* [[BwUniCluster3.0/Running_Jobs|Running Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Batch_Jobs:_sbatch|Running Batch Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Interactive_Jobs:_salloc|Running Interactive Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Jupyter|Interactive Computing with Jupyter]]&lt;br /&gt;
* [[BwUniCluster3.0/Maintenance|Operational Changes]]&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Acknowledgement         ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#e6e9eb; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#d1dadf; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Cluster Funding&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Please [[BwUniCluster3.0/Acknowledgement|acknowledge]] bwUniCluster 3.0 in your publications.&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15829</id>
		<title>BwUniCluster3.0</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15829"/>
		<updated>2026-03-17T08:07:24Z</updated>

		<summary type="html">&lt;p&gt;S Braun: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## Picture of bwUniCluster - right side  ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## About bwUniCluster                    ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
The &#039;&#039;&#039;bwUniCluster 3.0+KIT-GFA-HPC 3&#039;&#039;&#039; is the joint high-performance computer system of Baden-Württemberg&#039;s Universities and Universities of Applied Sciences for &#039;&#039;&#039;general purpose and teaching&#039;&#039;&#039; and is located at the Scientific Computing Center (SCC) at Karlsruhe Institute of Technology (KIT). The bwUniCluster 3.0 complements the four bwForClusters and their dedicated scientific areas.&lt;br /&gt;
[[File:DSCF6485_rectangled_perspective.jpg|center|600px|frameless|alt=bwUniCluster3.0 |upright=1| bwUniCluster 3.0 ]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Maintenance Section     ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there is no upcoming maintenance&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Service Incident Notice: bwUniCluster 3.0 Login Not Possible&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
We are currently experiencing an issue on bwUniCluster 3.0 that prevents users from logging in. The disruption is caused by a software error in the filesystem.&lt;br /&gt;
Our team is working intensively to resolve the problem, in close collaboration with the system’s manufacturer. At this time, we are unable to provide an exact estimate for when the issue will be fully resolved.&lt;br /&gt;
&lt;br /&gt;
We do not expect a long‑term outage; therefore, any workspaces that may have expired during the disruption should be easily restorable using ws_restore.&lt;br /&gt;
&lt;br /&gt;
We will keep you updated as soon as new information becomes available.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Please see the [[BwUniCluster3.0/Maintenance|maintenance]] page for more information about planned upgrades and other changes --&amp;gt;&lt;br /&gt;
We will inform you via the mailing list as soon as there are any updates.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: News section            ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there is no news&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Transition bwUniCluster 2.0 &amp;amp;rarr; bwUniCluster 3.0&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
&lt;br /&gt;
The HPC cluster bwUniCluster 3.0 is the successor of bwUniCluster 2.0. It features accelerated and CPU-only nodes, with the host system of both node types consisting of classic x86 processor architectures.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
To ensure that you can use the new system successfully and set up your working environment with ease, the following points should be noted.&lt;br /&gt;
&lt;br /&gt;
== Registration ==&lt;br /&gt;
All users who already have an entitlement on bwUniCluster 2.0 are authorized to access bwUniCluster 3.0. The user only needs to &#039;&#039;&#039;register for the new service&#039;&#039;&#039; at https://bwidm.scc.kit.edu .&lt;br /&gt;
&lt;br /&gt;
== Changes ==&lt;br /&gt;
&lt;br /&gt;
Hardware, software and the operating system have been updated and adapted to the latest standards. We would like to draw your attention in particular to the changes in policy, which must also be taken into account.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Changes to hardware, software and policy can be looked up here: [[BwUniCluster3.0/Data_Migration_Guide#Summary_of_changes|Summary of Changes]]&lt;br /&gt;
&lt;br /&gt;
== Migration ==&lt;br /&gt;
bwUniCluster 3.0 features a completely new file system. &#039;&#039;&#039;There is no automatic migration of user data!&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
The file systems of the old system and the login nodes will remain in operation for a period of &#039;&#039;&#039;3 months&#039;&#039;&#039; after the new system goes live (till July 6, 2025).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
In order to move data that is still needed, user software, and user specific settings from the old HOME directory to the new HOME directory, or to new workspaces, instructions are provided here: [[BwUniCluster3.0/Data_Migration_Guide#Migration_of_Data|Data Migration Guide]]&lt;br /&gt;
|}&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Training/Support section##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#eeeefe; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#dedefe; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Training &amp;amp; Support&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
* [[BwUniCluster3.0/Getting_Started|Getting Started]]&lt;br /&gt;
* [https://training.bwhpc.de E-Learning Courses]&lt;br /&gt;
* [[BwUniCluster3.0/Support|Support]]&lt;br /&gt;
* [[BwUniCluster3.0/FAQ|FAQ]]&lt;br /&gt;
* Send [[Feedback|Feedback]] about Wiki pages&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: User Documentation      ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#cef2e0; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | User Documentation&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Access: [[Registration/bwUniCluster|Registration]], [[Registration/Deregistration|Deregistration]], [[BwUniCluster3.0/Policies|Policies]]&lt;br /&gt;
* [[BwUniCluster3.0/Login|Login]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Client|SSH Clients]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Data_Transfer|Data Transfer]]&lt;br /&gt;
* [[BwUniCluster3.0/Hardware_and_Architecture|Hardware and Architecture]]&lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#Compute_resources|Compute Resources]] &lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#File_Systems|File Systems]] &lt;br /&gt;
* [[BwUniCluster3.0/Software|Cluster Specific Software]]&lt;br /&gt;
** [[BwUniCluster3.0/Containers|Using Containers]]&lt;br /&gt;
* [[BwUniCluster3.0/Running_Jobs|Running Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Batch_Jobs:_sbatch|Running Batch Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Interactive_Jobs:_salloc|Running Interactive Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Jupyter|Interactive Computing with Jupyter]]&lt;br /&gt;
* [[BwUniCluster3.0/Maintenance|Operational Changes]]&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Acknowledgement         ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#e6e9eb; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#d1dadf; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Cluster Funding&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Please [[BwUniCluster3.0/Acknowledgement|acknowledge]] bwUniCluster 3.0 in your publications.&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15735</id>
		<title>BwUniCluster3.0</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15735"/>
		<updated>2026-02-18T14:38:35Z</updated>

		<summary type="html">&lt;p&gt;S Braun: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## Picture of bwUniCluster - right side  ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## About bwUniCluster                    ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
The &#039;&#039;&#039;bwUniCluster 3.0+KIT-GFA-HPC 3&#039;&#039;&#039; is the joint high-performance computer system of Baden-Württemberg&#039;s Universities and Universities of Applied Sciences for &#039;&#039;&#039;general purpose and teaching&#039;&#039;&#039; and located at the Scientific Computing Center (SCC) at Karlsruhe Institute of Technology (KIT). The bwUniCluster 3.0 complements the four bwForClusters and their dedicated scientific areas.&lt;br /&gt;
[[File:DSCF6485_rectangled_perspective.jpg|center|600px|frameless|alt=bwUniCluster3.0 |upright=1| bwUniCluster 3.0 ]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Maintenance Section     ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there no upcoming maintenance&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Maintenance&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
Due to extensive work on the electrical installation, the HPC system bwUniCluster 3.0 and all other HPC services will be unavailable from&lt;br /&gt;
&lt;br /&gt;
09.02.2026 at 06:00 AM until 18.02.2026&lt;br /&gt;
&lt;br /&gt;
Please see the [[BwUniCluster3.0/Maintenance|maintenance]] page for more information about planned upgrades and other changes.&lt;br /&gt;
|}&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: News section            ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there no news&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Transition bwUniCluster 2.0 &amp;amp;rarr; bwUniCluster 3.0&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
&lt;br /&gt;
The HPC cluster bwUniCluster 3.0 is the successor of bwUniCluster 2.0. It features accelerated and CPU-only nodes, with the host system of both node types consisting of classic x86 processor architectures.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
To ensure that you can use the new system successfully and set up your working environment with ease, the following points should be noted.&lt;br /&gt;
&lt;br /&gt;
== Registration ==&lt;br /&gt;
All users who already have an entitlement on bwUniCluster 2.0 are authorized to access bwUniCluster 3.0. The user only needs to &#039;&#039;&#039;register for the new service&#039;&#039;&#039; at https://bwidm.scc.kit.edu .&lt;br /&gt;
&lt;br /&gt;
== Changes ==&lt;br /&gt;
&lt;br /&gt;
Hardware, software and the operating system have been updated and adapted to the latest standards. We would like to draw your attention in particular to the changes in policy, which must also be taken into account.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Changes to hardware, software and policy can be looked up here: [[BwUniCluster3.0/Data_Migration_Guide#Summary_of_changes|Summary of Changes]]&lt;br /&gt;
&lt;br /&gt;
== Migration ==&lt;br /&gt;
bwUniCluster 3.0 features a completely new file system. &#039;&#039;&#039;There is no automatic migration of user data!&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
The file systems of the old system and the login nodes will remain in operation for a period of &#039;&#039;&#039;3 months&#039;&#039;&#039; after the new system goes live (till July 6, 2025).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Instructions for moving data that is still needed, user software, and user-specific settings from the old HOME directory to the new HOME directory or to new workspaces are provided here: [[BwUniCluster3.0/Data_Migration_Guide#Migration_of_Data|Data Migration Guide]]&lt;br /&gt;
|}&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Training/Support section##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#eeeefe; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#dedefe; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Training &amp;amp; Support&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
* [[BwUniCluster3.0/Getting_Started|Getting Started]]&lt;br /&gt;
* [https://training.bwhpc.de E-Learning Courses]&lt;br /&gt;
* [[BwUniCluster3.0/Support|Support]]&lt;br /&gt;
* [[BwUniCluster3.0/FAQ|FAQ]]&lt;br /&gt;
* Send [[Feedback|Feedback]] about Wiki pages&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: User Documentation      ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#cef2e0; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | User Documentation&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Access: [[Registration/bwUniCluster|Registration]], [[Registration/Deregistration|Deregistration]], [[BwUniCluster3.0/Policies|Policies]]&lt;br /&gt;
* [[BwUniCluster3.0/Login|Login]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Client|SSH Clients]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Data_Transfer|Data Transfer]]&lt;br /&gt;
* [[BwUniCluster3.0/Hardware_and_Architecture|Hardware and Architecture]]&lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#Compute_resources|Compute Resources]] &lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#File_Systems|File Systems]] &lt;br /&gt;
* [[BwUniCluster3.0/Software|Cluster Specific Software]]&lt;br /&gt;
** [[BwUniCluster3.0/Containers|Using Containers]]&lt;br /&gt;
* [[BwUniCluster3.0/Running_Jobs|Running Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Batch_Jobs:_sbatch|Running Batch Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Interactive_Jobs:_salloc|Running Interactive Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Jupyter|Interactive Computing with Jupyter]]&lt;br /&gt;
* [[BwUniCluster3.0/Maintenance|Operational Changes]]&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Acknowledgement         ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#e6e9eb; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#d1dadf; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Cluster Funding&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Please [[BwUniCluster3.0/Acknowledgement|acknowledge]] bwUniCluster 3.0 in your publications.&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Maintenance&amp;diff=15717</id>
		<title>BwUniCluster3.0/Maintenance</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Maintenance&amp;diff=15717"/>
		<updated>2026-02-09T08:52:54Z</updated>

		<summary type="html">&lt;p&gt;S Braun: /* Maintenance records of bwUniCluster 3.0 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=== Maintenance records of bwUniCluster 3.0 ===&lt;br /&gt;
&#039;&#039;&#039;2026&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Minor updates to drivers and the kernel.&lt;br /&gt;
&lt;br /&gt;
=== Maintenance records of retired bwUniCluster 2.0 ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2024&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster2.0/Maintenance/2024-05]] from 21.05.2024 to 24.05.2024&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2023&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster2.0/Maintenance/2023-03]] from 20.03.2023 to 24.03.2023&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2022&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster2.0/Maintenance/2022-11]] from 07.11.2022 to 10.11.2022&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster2.0/Maintenance/2022-03]] from 28.03.2022 to 31.03.2022&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2021&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster2.0/Maintenance/2021-10]] from 11.10.2021 to 15.10.2021&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2020&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster2.0/Maintenance/2020-10]] from 06.10.2020 to 13.10.2020&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Maintenance records of retired bwUniCluster 1.0 ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2019&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster/Maintenance/2019-02]] from 02.02.2019 to 08.02.2019&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2017&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster/Maintenance/2017-05]] from 02.05.2017 to 02.05.2017&lt;br /&gt;
* [[BwUniCluster/Maintenance/2017-03]] from 20.03.2017 to 21.03.2017&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2016&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster/Maintenance/2016-10]] from 17.10.2016 to 21.10.2016&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Maintenance&amp;diff=15716</id>
		<title>BwUniCluster3.0/Maintenance</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Maintenance&amp;diff=15716"/>
		<updated>2026-02-09T08:52:45Z</updated>

		<summary type="html">&lt;p&gt;S Braun: /* Maintenance records of bwUniCluster 3.0 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=== Maintenance records of bwUniCluster 3.0 ===&lt;br /&gt;
&#039;&#039;&#039;2026&#039;&#039;&#039;&lt;br /&gt;
Minor updates to drivers and the kernel.&lt;br /&gt;
&lt;br /&gt;
=== Maintenance records of retired bwUniCluster 2.0 ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2024&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster2.0/Maintenance/2024-05]] from 21.05.2024 to 24.05.2024&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2023&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster2.0/Maintenance/2023-03]] from 20.03.2023 to 24.03.2023&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2022&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster2.0/Maintenance/2022-11]] from 07.11.2022 to 10.11.2022&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster2.0/Maintenance/2022-03]] from 28.03.2022 to 31.03.2022&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2021&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster2.0/Maintenance/2021-10]] from 11.10.2021 to 15.10.2021&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2020&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster2.0/Maintenance/2020-10]] from 06.10.2020 to 13.10.2020&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Maintenance records of retired bwUniCluster 1.0 ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2019&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster/Maintenance/2019-02]] from 02.02.2019 to 08.02.2019&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2017&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster/Maintenance/2017-05]] from 02.05.2017 to 02.05.2017&lt;br /&gt;
* [[BwUniCluster/Maintenance/2017-03]] from 20.03.2017 to 21.03.2017&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2016&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster/Maintenance/2016-10]] from 17.10.2016 to 21.10.2016&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Maintenance&amp;diff=15715</id>
		<title>BwUniCluster3.0/Maintenance</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Maintenance&amp;diff=15715"/>
		<updated>2026-02-09T08:52:09Z</updated>

		<summary type="html">&lt;p&gt;S Braun: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=== Maintenance records of bwUniCluster 3.0 ===&lt;br /&gt;
&#039;&#039;&#039;2026&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Maintenance records of retired bwUniCluster 2.0 ===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2024&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster2.0/Maintenance/2024-05]] from 21.05.2024 to 24.05.2024&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2023&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster2.0/Maintenance/2023-03]] from 20.03.2023 to 24.03.2023&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2022&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster2.0/Maintenance/2022-11]] from 07.11.2022 to 10.11.2022&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster2.0/Maintenance/2022-03]] from 28.03.2022 to 31.03.2022&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2021&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster2.0/Maintenance/2021-10]] from 11.10.2021 to 15.10.2021&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2020&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster2.0/Maintenance/2020-10]] from 06.10.2020 to 13.10.2020&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Maintenance records of retired bwUniCluster 1.0 ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2019&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster/Maintenance/2019-02]] from 02.02.2019 to 08.02.2019&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2017&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster/Maintenance/2017-05]] from 02.05.2017 to 02.05.2017&lt;br /&gt;
* [[BwUniCluster/Maintenance/2017-03]] from 20.03.2017 to 21.03.2017&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;2016&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [[BwUniCluster/Maintenance/2016-10]] from 17.10.2016 to 21.10.2016&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15714</id>
		<title>BwUniCluster3.0</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15714"/>
		<updated>2026-02-09T08:51:27Z</updated>

		<summary type="html">&lt;p&gt;S Braun: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## Picture of bwUniCluster - right side  ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## About bwUniCluster                    ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
The &#039;&#039;&#039;bwUniCluster 3.0+KIT-GFA-HPC 3&#039;&#039;&#039; is the joint high-performance computer system of Baden-Württemberg&#039;s Universities and Universities of Applied Sciences for &#039;&#039;&#039;general purpose and teaching&#039;&#039;&#039; and located at the Scientific Computing Center (SCC) at Karlsruhe Institute of Technology (KIT). The bwUniCluster 3.0 complements the four bwForClusters and their dedicated scientific areas.&lt;br /&gt;
[[File:DSCF6485_rectangled_perspective.jpg|center|600px|frameless|alt=bwUniCluster3.0 |upright=1| bwUniCluster 3.0 ]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Maintenance Section     ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there no upcoming maintenance&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Maintenance&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
Due to extensive work on the electrical installation, the HPC system bwUniCluster 3.0 and all other HPC services will be unavailable from&lt;br /&gt;
&lt;br /&gt;
09.02.2026 at 06:00 AM until 18.02.2026&lt;br /&gt;
&lt;br /&gt;
Please see the [[BwUniCluster3.0/Maintenance|maintenance]] page for more information about planned upgrades and other changes.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: News section            ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there no news&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Transition bwUniCluster 2.0 &amp;amp;rarr; bwUniCluster 3.0&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
&lt;br /&gt;
The HPC cluster bwUniCluster 3.0 is the successor of bwUniCluster 2.0. It features accelerated and CPU-only nodes, with the host system of both node types consisting of classic x86 processor architectures.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
To ensure that you can use the new system successfully and set up your working environment with ease, the following points should be noted.&lt;br /&gt;
&lt;br /&gt;
== Registration ==&lt;br /&gt;
All users who already have an entitlement on bwUniCluster 2.0 are authorized to access bwUniCluster 3.0. The user only needs to &#039;&#039;&#039;register for the new service&#039;&#039;&#039; at https://bwidm.scc.kit.edu .&lt;br /&gt;
&lt;br /&gt;
== Changes ==&lt;br /&gt;
&lt;br /&gt;
Hardware, software and the operating system have been updated and adapted to the latest standards. We would like to draw your attention in particular to the changes in policy, which must also be taken into account.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Changes to hardware, software and policy can be looked up here: [[BwUniCluster3.0/Data_Migration_Guide#Summary_of_changes|Summary of Changes]]&lt;br /&gt;
&lt;br /&gt;
== Migration ==&lt;br /&gt;
bwUniCluster 3.0 features a completely new file system. &#039;&#039;&#039;There is no automatic migration of user data!&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
The file systems of the old system and the login nodes will remain in operation for a period of &#039;&#039;&#039;3 months&#039;&#039;&#039; after the new system goes live (till July 6, 2025).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Instructions for moving data that is still needed, user software, and user-specific settings from the old HOME directory to the new HOME directory or to new workspaces are provided here: [[BwUniCluster3.0/Data_Migration_Guide#Migration_of_Data|Data Migration Guide]]&lt;br /&gt;
|}&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Training/Support section##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#eeeefe; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#dedefe; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Training &amp;amp; Support&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
* [[BwUniCluster3.0/Getting_Started|Getting Started]]&lt;br /&gt;
* [https://training.bwhpc.de E-Learning Courses]&lt;br /&gt;
* [[BwUniCluster3.0/Support|Support]]&lt;br /&gt;
* [[BwUniCluster3.0/FAQ|FAQ]]&lt;br /&gt;
* Send [[Feedback|Feedback]] about Wiki pages&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: User Documentation      ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#cef2e0; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | User Documentation&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Access: [[Registration/bwUniCluster|Registration]], [[Registration/Deregistration|Deregistration]], [[BwUniCluster3.0/Policies|Policies]]&lt;br /&gt;
* [[BwUniCluster3.0/Login|Login]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Client|SSH Clients]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Data_Transfer|Data Transfer]]&lt;br /&gt;
* [[BwUniCluster3.0/Hardware_and_Architecture|Hardware and Architecture]]&lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#Compute_resources|Compute Resources]] &lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#File_Systems|File Systems]] &lt;br /&gt;
* [[BwUniCluster3.0/Software|Cluster Specific Software]]&lt;br /&gt;
** [[BwUniCluster3.0/Containers|Using Containers]]&lt;br /&gt;
* [[BwUniCluster3.0/Running_Jobs|Running Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Batch_Jobs:_sbatch|Running Batch Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Interactive_Jobs:_salloc|Running Interactive Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Jupyter|Interactive Computing with Jupyter]]&lt;br /&gt;
* [[BwUniCluster3.0/Maintenance|Operational Changes]]&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Acknowledgement         ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#e6e9eb; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#d1dadf; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Cluster Funding&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Please [[BwUniCluster3.0/Acknowledgement|acknowledge]] bwUniCluster 3.0 in your publications.&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15713</id>
		<title>BwUniCluster3.0</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15713"/>
		<updated>2026-02-09T08:50:29Z</updated>

		<summary type="html">&lt;p&gt;S Braun: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## Picture of bwUniCluster - right side  ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## About bwUniCluster                    ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
The &#039;&#039;&#039;bwUniCluster 3.0+KIT-GFA-HPC 3&#039;&#039;&#039; is the joint high-performance computer system of Baden-Württemberg&#039;s Universities and Universities of Applied Sciences for &#039;&#039;&#039;general purpose and teaching&#039;&#039;&#039; and located at the Scientific Computing Center (SCC) at Karlsruhe Institute of Technology (KIT). The bwUniCluster 3.0 complements the four bwForClusters and their dedicated scientific areas.&lt;br /&gt;
[[File:DSCF6485_rectangled_perspective.jpg|center|600px|frameless|alt=bwUniCluster3.0 |upright=1| bwUniCluster 3.0 ]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Maintenance Section     ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there no upcoming maintenance&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Maintenance&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
Due to extensive work on the electrical installation, the HPC system bwUniCluster 3.0 and all other HPC services will be unavailable from&lt;br /&gt;
&lt;br /&gt;
09.02.2026 at 06:00 AM until 18.02.2026&lt;br /&gt;
&lt;br /&gt;
Please see the [[BwUniCluster3.0/Maintenance|maintenance]] page for more information about planned upgrades and other changes.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: News section            ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there no news&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Transition bwUniCluster 2.0 &amp;amp;rarr; bwUniCluster 3.0&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
&lt;br /&gt;
The HPC cluster bwUniCluster 3.0 is the successor of bwUniCluster 2.0. It features accelerated and CPU-only nodes, with the host system of both node types consisting of classic x86 processor architectures.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
To ensure that you can use the new system successfully and set up your working environment with ease, the following points should be noted.&lt;br /&gt;
&lt;br /&gt;
== Registration ==&lt;br /&gt;
All users who already have an entitlement on bwUniCluster 2.0 are authorized to access bwUniCluster 3.0. The user only needs to &#039;&#039;&#039;register for the new service&#039;&#039;&#039; at https://bwidm.scc.kit.edu .&lt;br /&gt;
&lt;br /&gt;
== Changes ==&lt;br /&gt;
&lt;br /&gt;
Hardware, software and the operating system have been updated and adapted to the latest standards. We would like to draw your attention in particular to the changes in policy, which must also be taken into account.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Changes to hardware, software and policy can be looked up here: [[BwUniCluster3.0/Data_Migration_Guide#Summary_of_changes|Summary of Changes]]&lt;br /&gt;
&lt;br /&gt;
== Migration ==&lt;br /&gt;
bwUniCluster 3.0 features a completely new file system. &#039;&#039;&#039;There is no automatic migration of user data!&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
The file systems of the old system and the login nodes will remain in operation for a period of &#039;&#039;&#039;3 months&#039;&#039;&#039; after the new system goes live (till July 6, 2025).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Instructions for moving data that is still needed, user software, and user-specific settings from the old HOME directory to the new HOME directory or to new workspaces are provided here: [[BwUniCluster3.0/Data_Migration_Guide#Migration_of_Data|Data Migration Guide]]&lt;br /&gt;
|}&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Training/Support section##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#eeeefe; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#dedefe; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Training &amp;amp; Support&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
* [[BwUniCluster3.0/Getting_Started|Getting Started]]&lt;br /&gt;
* [https://training.bwhpc.de E-Learning Courses]&lt;br /&gt;
* [[BwUniCluster3.0/Support|Support]]&lt;br /&gt;
* [[BwUniCluster3.0/FAQ|FAQ]]&lt;br /&gt;
* Send [[Feedback|Feedback]] about Wiki pages&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: User Documentation      ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#cef2e0; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | User Documentation&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Access: [[Registration/bwUniCluster|Registration]], [[Registration/Deregistration|Deregistration]], [[BwUniCluster3.0/Policies|Policies]]&lt;br /&gt;
* [[BwUniCluster3.0/Login|Login]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Client|SSH Clients]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Data_Transfer|Data Transfer]]&lt;br /&gt;
* [[BwUniCluster3.0/Hardware_and_Architecture|Hardware and Architecture]]&lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#Compute_resources|Compute Resources]] &lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#File_Systems|File Systems]] &lt;br /&gt;
* [[BwUniCluster3.0/Software|Cluster Specific Software]]&lt;br /&gt;
** [[BwUniCluster3.0/Containers|Using Containers]]&lt;br /&gt;
* [[BwUniCluster3.0/Running_Jobs|Running Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Batch_Jobs:_sbatch|Running Batch Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Interactive_Jobs:_salloc|Running Interactive Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Jupyter|Interactive Computing with Jupyter]]&lt;br /&gt;
* [[BwUniCluster3.0/Maintenance|Operational Changes]]&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Acknowledgement         ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#e6e9eb; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#d1dadf; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Cluster Funding&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Please [[BwUniCluster3.0/Acknowledgement|acknowledge]] bwUniCluster 3.0 in your publications.&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15712</id>
		<title>BwUniCluster3.0</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15712"/>
		<updated>2026-02-09T08:46:28Z</updated>

		<summary type="html">&lt;p&gt;S Braun: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## Picture of bwUniCluster - right side  ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## About bwUniCluster                    ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
The &#039;&#039;&#039;bwUniCluster 3.0+KIT-GFA-HPC 3&#039;&#039;&#039; is the joint high-performance computer system of Baden-Württemberg&#039;s Universities and Universities of Applied Sciences for &#039;&#039;&#039;general purpose and teaching&#039;&#039;&#039; and located at the Scientific Computing Center (SCC) at Karlsruhe Institute of Technology (KIT). The bwUniCluster 3.0 complements the four bwForClusters and their dedicated scientific areas.&lt;br /&gt;
[[File:DSCF6485_rectangled_perspective.jpg|center|600px|frameless|alt=bwUniCluster3.0 |upright=1| bwUniCluster 3.0 ]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Maintenance Section     ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there no upcoming maintenance&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Next maintenance&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
Due to regular maintenance work, the HPC system bwUniCluster 2.0 will not be available from &lt;br /&gt;
&lt;br /&gt;
21.05.2024 at 08:30 AM until 24.05.2024 at 15:00&lt;br /&gt;
&lt;br /&gt;
Please see the [[BwUniCluster2.0/Maintenance/2024-05|maintenance]] page for more information about planned upgrades and other changes.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: News section            ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there no news&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Transition bwUniCluster 2.0 &amp;amp;rarr; bwUniCluster 3.0&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
&lt;br /&gt;
The HPC cluster bwUniCluster 3.0 is the successor of bwUniCluster 2.0. It features accelerated and CPU-only nodes, with the host system of both node types consisting of classic x86 processor architectures.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
To ensure that you can use the new system successfully and set up your working environment with ease, the following points should be noted.&lt;br /&gt;
&lt;br /&gt;
== Registration ==&lt;br /&gt;
All users who already have an entitlement on bwUniCluster 2.0 are authorized to access bwUniCluster 3.0. The user only needs to &#039;&#039;&#039;register for the new service&#039;&#039;&#039; at https://bwidm.scc.kit.edu .&lt;br /&gt;
&lt;br /&gt;
== Changes ==&lt;br /&gt;
&lt;br /&gt;
Hardware, software and the operating system have been updated and adapted to the latest standards. We would like to draw your attention in particular to the changes in policy, which must also be taken into account.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Changes to hardware, software and policy can be looked up here: [[BwUniCluster3.0/Data_Migration_Guide#Summary_of_changes|Summary of Changes]]&lt;br /&gt;
&lt;br /&gt;
== Migration ==&lt;br /&gt;
bwUniCluster 3.0 features a completely new file system. &#039;&#039;&#039;There is no automatic migration of user data!&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
The file systems of the old system and the login nodes will remain in operation for a period of &#039;&#039;&#039;3 months&#039;&#039;&#039; after the new system goes live (till July 6, 2025).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Instructions for moving data that is still needed, user software, and user-specific settings from the old HOME directory to the new HOME directory or to new workspaces are provided here: [[BwUniCluster3.0/Data_Migration_Guide#Migration_of_Data|Data Migration Guide]]&lt;br /&gt;
|}&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Training/Support section##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#eeeefe; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#dedefe; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Training &amp;amp; Support&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
* [[BwUniCluster3.0/Getting_Started|Getting Started]]&lt;br /&gt;
* [https://training.bwhpc.de E-Learning Courses]&lt;br /&gt;
* [[BwUniCluster3.0/Support|Support]]&lt;br /&gt;
* [[BwUniCluster3.0/FAQ|FAQ]]&lt;br /&gt;
* Send [[Feedback|Feedback]] about Wiki pages&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: User Documentation      ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#cef2e0; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | User Documentation&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Access: [[Registration/bwUniCluster|Registration]], [[Registration/Deregistration|Deregistration]], [[BwUniCluster3.0/Policies|Policies]]&lt;br /&gt;
* [[BwUniCluster3.0/Login|Login]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Client|SSH Clients]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Data_Transfer|Data Transfer]]&lt;br /&gt;
* [[BwUniCluster3.0/Hardware_and_Architecture|Hardware and Architecture]]&lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#Compute_resources|Compute Resources]] &lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#File_Systems|File Systems]] &lt;br /&gt;
* [[BwUniCluster3.0/Software|Cluster Specific Software]]&lt;br /&gt;
** [[BwUniCluster3.0/Containers|Using Containers]]&lt;br /&gt;
* [[BwUniCluster3.0/Running_Jobs|Running Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Batch_Jobs:_sbatch|Running Batch Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Interactive_Jobs:_salloc|Running Interactive Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Jupyter|Interactive Computing with Jupyter]]&lt;br /&gt;
* [[BwUniCluster3.0/Maintenance|Operational Changes]]&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Acknowledgement         ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#e6e9eb; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#d1dadf; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Cluster Funding&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Please [[BwUniCluster3.0/Acknowledgement|acknowledge]] bwUniCluster 3.0 in your publications.&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Running_Jobs&amp;diff=15710</id>
		<title>BwUniCluster3.0/Running Jobs</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Running_Jobs&amp;diff=15710"/>
		<updated>2026-02-03T14:36:30Z</updated>

		<summary type="html">&lt;p&gt;S Braun: /* Interactive Computing with Jupyter */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
= Purpose and function of a queuing system =&lt;br /&gt;
&lt;br /&gt;
All compute activities on bwUniCluster 3.0 have to be performed on the compute nodes. Compute nodes are only available by requesting the corresponding resources via the queuing system. As soon as the requested resources are available, they are either used to execute automated tasks defined in a batch script or can be accessed interactively.&amp;lt;br&amp;gt;&lt;br /&gt;
For the general procedure, see [[Running_Calculations | Running Calculations]].&lt;br /&gt;
&lt;br /&gt;
== Job submission process ==&lt;br /&gt;
&lt;br /&gt;
The workload management software Slurm is installed on bwUniCluster 3.0. Therefore, any job submission by the user must be performed with Slurm commands. Slurm queues and runs user jobs based on fair sharing policies.&lt;br /&gt;
&lt;br /&gt;
== Slurm ==&lt;br /&gt;
&lt;br /&gt;
HPC Workload Manager on bwUniCluster 3.0 is Slurm.&lt;br /&gt;
Slurm is a cluster management and job scheduling system. Slurm has three key functions. &lt;br /&gt;
* It allocates access to resources (compute cores on nodes) to users for some duration of time so they can perform work. &lt;br /&gt;
* It provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. &lt;br /&gt;
* It arbitrates contention for resources by managing a queue of pending work.&lt;br /&gt;
&lt;br /&gt;
Any kind of calculation on the compute nodes of bwUniCluster 3.0 requires the user to define it as a sequence of commands together with the required run time, number of CPU cores and main memory, and to submit all of this, i.e. the &#039;&#039;&#039;batch job&#039;&#039;&#039;, to the resource and workload management software.&lt;br /&gt;
&lt;br /&gt;
== Terms and definitions ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Partitions &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Slurm manages job queues for different &#039;&#039;&#039;partitions&#039;&#039;&#039;. Partitions are used to group similar node types (e.g. nodes with and without accelerators) and to enforce different access policies and resource limits.&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 there are different partitions:&lt;br /&gt;
&lt;br /&gt;
* CPU-only nodes&lt;br /&gt;
** 2-socket nodes, consisting of 2 Intel Ice Lake processors with 32 cores each or 2 AMD processors with 48 cores each&lt;br /&gt;
** 2-socket nodes with very high RAM capacity, consisting of 2 AMD processors with 48 cores each&lt;br /&gt;
* GPU-accelerated nodes&lt;br /&gt;
** 2-socket nodes with 4x NVIDIA A100 or 4x NVIDIA H100 GPUs&lt;br /&gt;
** 4-socket node with 4x AMD Instinct accelerator&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Queues &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Job &#039;&#039;&#039;queues&#039;&#039;&#039; are used to manage jobs that request access to shared but limited computing resources of a certain kind (partition).&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 there are different main types of queues:&lt;br /&gt;
* Regular queues&lt;br /&gt;
** cpu: Jobs that request CPU-only nodes.&lt;br /&gt;
** gpu: Jobs that request GPU-accelerated nodes.&lt;br /&gt;
* Development queues (dev)&lt;br /&gt;
** Short, usually interactive jobs used for developing, compiling and testing code and workflows. The intention behind development queues is to give users immediate access to compute resources without having to wait. This is the place for short but heavy computations that would otherwise disturb other users if run on the login nodes.&lt;br /&gt;
&lt;br /&gt;
Requested compute resources such as (wall-)time, number of nodes and amount of memory are restricted and must fit into the boundaries imposed by the queues. The request for compute resources on the bwUniCluster 3.0 &amp;lt;font color=red&amp;gt;requires at least the specification of the &#039;&#039;&#039;queue&#039;&#039;&#039; and the &#039;&#039;&#039;time&#039;&#039;&#039;&amp;lt;/font&amp;gt;.&lt;br /&gt;
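&lt;br /&gt;
For illustration, a minimal sketch of such a request (the queue is selected with the standard Slurm option &amp;lt;code&amp;gt;--partition&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;-p&amp;lt;/code&amp;gt;; the queue name &amp;lt;code&amp;gt;cpu&amp;lt;/code&amp;gt; and the time limit are placeholders taken from the tables below, and &amp;lt;code&amp;gt;myjob.sh&amp;lt;/code&amp;gt; stands for your own job script):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ sbatch --partition=cpu --time=01:00:00 myjob.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;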
&lt;br /&gt;
&#039;&#039;&#039; Jobs &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Jobs can be run non-interactively as &#039;&#039;&#039;batch jobs&#039;&#039;&#039; or as &#039;&#039;&#039;interactive jobs&#039;&#039;&#039;.&amp;lt;br&amp;gt;&lt;br /&gt;
Submitting a batch job means that all steps of a compute project are defined in a Bash script. This Bash script is queued and executed as soon as the compute resources are available and allocated. Jobs are enqueued with the &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; command.&lt;br /&gt;
For interactive jobs, the resources are requested with the &amp;lt;code&amp;gt;salloc&amp;lt;/code&amp;gt; command. As soon as the computing resources are available and allocated, a command line prompt is returned on a compute node and the user can freely use the allocated resources.&lt;br /&gt;
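&lt;br /&gt;
A minimal sketch of such an interactive request (the development queue &amp;lt;code&amp;gt;dev_cpu&amp;lt;/code&amp;gt; and the resource values are placeholders taken from the tables below):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc --partition=dev_cpu --time=00:30:00 --ntasks=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Exiting the shell that is returned on the compute node releases the allocation.&lt;br /&gt;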
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
&#039;&#039;&#039;Please remember:&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Heavy computations are not allowed on the login nodes&#039;&#039;&#039;.&amp;lt;br&amp;gt;Use a development or a regular job queue instead! Please refer to [[BwUniCluster3.0/Login#Allowed_Activities_on_Login_Nodes|Allowed Activities on Login Nodes]].&lt;br /&gt;
* &#039;&#039;&#039;Development queues&#039;&#039;&#039; are meant for &#039;&#039;&#039;development tasks&#039;&#039;&#039;.&amp;lt;br&amp;gt;Do not misuse these queues for regular short-running jobs or chain jobs! Only one job may run at a time, and the maximum number of queued jobs is limited to 3.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
= Queues on bwUniCluster 3.0 = &lt;br /&gt;
== Policy ==&lt;br /&gt;
&lt;br /&gt;
The computing time is provided in accordance with the &#039;&#039;&#039;fair share policy&#039;&#039;&#039;. The individual investment shares of the respective university and the resources already used by its members are taken into account. Furthermore, the following throttling policy is active: the &#039;&#039;&#039;maximum number of physical cores&#039;&#039;&#039; in use at any given time is &#039;&#039;&#039;1920 per user&#039;&#039;&#039; (aggregated over all running jobs). This number corresponds to 30 nodes on the Ice Lake partition or 20 nodes on the standard partition. The aim is to minimize waiting times and maximize the number of users who can access computing time at the same time.&lt;br /&gt;
&lt;br /&gt;
== Regular Queues ==&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node-Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;cpu_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Ice Lake&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=72:00:00, nodes=30, mem=249600mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;cpu&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Standard&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=72:00:00, nodes=20, mem=380000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;highmem&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;High Memory&lt;br /&gt;
| mem-per-cpu=12090mb&lt;br /&gt;
| mem=380001mb&lt;br /&gt;
| time=72:00:00, nodes=4, mem=2300000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_h100&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=193300mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=72:00:00, nodes=12, mem=760000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_mi300&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU node&amp;lt;br/&amp;gt;AMD GPU x4&lt;br /&gt;
| mem-per-gpu=128200mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=72:00:00, nodes=1, mem=510000mb, ntasks-per-node=40, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_a100_il&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;gpu_h100_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;Ice Lake&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=127500mb&amp;lt;br/&amp;gt;cpus-per-gpu=16&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=48:00:00, nodes=9(A100)/nodes=5(H100) , mem=510000mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|}&lt;br /&gt;
Table 1: Regular Queues&lt;br /&gt;
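&lt;br /&gt;
For example, a single GPU in the &amp;lt;code&amp;gt;gpu_h100&amp;lt;/code&amp;gt; queue could be requested with a sketch like the following (the time limit and the script name &amp;lt;code&amp;gt;myjob.sh&amp;lt;/code&amp;gt; are placeholders):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ sbatch --partition=gpu_h100 --time=04:00:00 --gres=gpu:1 myjob.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;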
&lt;br /&gt;
== Short Queues ==&lt;br /&gt;
&amp;lt;p style=&amp;quot;color:red; &amp;quot;&amp;gt;&amp;lt;b&amp;gt;Queues with a short runtime of 30 minutes.&amp;lt;/b&amp;gt;&amp;lt;/p&amp;gt; &lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_a100_short&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;Ice Lake&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=94000mb&amp;lt;br/&amp;gt;cpus-per-gpu=12&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=12, mem=376000mb, ntasks-per-node=48, (threads-per-core=2)&lt;br /&gt;
|}&lt;br /&gt;
Table 2: Short Queues&lt;br /&gt;
&lt;br /&gt;
== Development Queues ==&lt;br /&gt;
These queues are intended only for development work, i.e. debugging or performance optimization.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_cpu_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Ice Lake&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=30, nodes=8, mem=249600mb, ntasks-per-node=64, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_cpu&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Standard&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=30, nodes=1, mem=380000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_gpu_h100&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=193300mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=1, mem=760000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_gpu_a100_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&amp;lt;br/&amp;gt;&lt;br /&gt;
| mem-per-gpu=127500mb&amp;lt;br/&amp;gt;cpus-per-gpu=16 &lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=1, mem=510000mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|}&lt;br /&gt;
Table 3: Development Queues&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The default resources of a queue class define the number of tasks and the memory if these are not explicitly given with the sbatch command. The resource options &#039;&#039;--time&#039;&#039;, &#039;&#039;--ntasks&#039;&#039;, &#039;&#039;--nodes&#039;&#039;, &#039;&#039;--mem&#039;&#039; and &#039;&#039;--mem-per-cpu&#039;&#039; are described [[BwUniCluster3.0/Running_Jobs/Slurm|here]].&lt;br /&gt;
&lt;br /&gt;
== Check available resources: sinfo_t_idle ==&lt;br /&gt;
The Slurm command sinfo is used to view partition and node information for a system running Slurm. It incorporates down time, reservations, and node state information in determining the available backfill window. The sinfo command can only be used by the administrator.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
SCC has prepared a special script (sinfo_t_idle) to find out how many processors are available for immediate use on the system. It is anticipated that users will use this information to submit jobs that meet these criteria and thus obtain quick job turnaround times. &lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* The following command displays which resources are available for immediate use in each partition.&lt;br /&gt;
&amp;lt;pre&amp;gt;$ sinfo_t_idle &lt;br /&gt;
Partition dev_cpu                 :      1 nodes idle&lt;br /&gt;
Partition cpu                     :      1 nodes idle&lt;br /&gt;
Partition highmem                 :      2 nodes idle&lt;br /&gt;
Partition dev_gpu_h100            :      0 nodes idle&lt;br /&gt;
Partition gpu_h100                :      0 nodes idle&lt;br /&gt;
Partition gpu_mi300               :      0 nodes idle&lt;br /&gt;
Partition dev_cpu_il              :      7 nodes idle&lt;br /&gt;
Partition cpu_il                  :      2 nodes idle&lt;br /&gt;
Partition dev_gpu_a100_il         :      1 nodes idle&lt;br /&gt;
Partition gpu_a100_il             :      0 nodes idle&lt;br /&gt;
Partition gpu_h100_il             :      1 nodes idle&lt;br /&gt;
Partition gpu_a100_short          :      0 nodes idle&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Running Jobs =&lt;br /&gt;
&lt;br /&gt;
== Slurm Commands (excerpt) ==&lt;br /&gt;
Important Slurm commands for non-administrators working on bwUniCluster 3.0.&lt;br /&gt;
{| width=850px class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Slurm commands !! Brief explanation&lt;br /&gt;
|-&lt;br /&gt;
| [[#Batch Jobs: sbatch|sbatch]] || Submits a job and puts it into the queue [[https://slurm.schedmd.com/sbatch.html sbatch]] &lt;br /&gt;
|-&lt;br /&gt;
| [[#Interactive Jobs: salloc|salloc]] || Requests resources for an interactive Job [[https://slurm.schedmd.com/salloc.html salloc]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Monitor and manage jobs |scontrol show job]] || Displays detailed job state information [[https://slurm.schedmd.com/scontrol.html scontrol]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#List of your submitted jobs : squeue|squeue]] || Displays information about active, eligible, blocked, and/or recently completed jobs [[https://slurm.schedmd.com/squeue.html squeue]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#List of your submitted jobs : squeue|squeue --start]] || Returns start time of submitted job [[https://slurm.schedmd.com/squeue.html squeue]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Check available resources: sinfo_t_idle|sinfo_t_idle]] || Shows what resources are available for immediate use [[https://slurm.schedmd.com/sinfo.html sinfo]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Canceling own jobs : scancel|scancel]] || Cancels a job [[https://slurm.schedmd.com/scancel.html scancel]]&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
* [https://slurm.schedmd.com/tutorials.html  Slurm Tutorials]&lt;br /&gt;
* [https://slurm.schedmd.com/pdfs/summary.pdf  Slurm command/option summary (2 pages)]&lt;br /&gt;
* [https://slurm.schedmd.com/man_index.html  Slurm Commands]&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
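A minimal sketch of a typical monitoring workflow with these commands (the job ID 123456 is a placeholder):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue -u $USER             # list your own queued and running jobs&lt;br /&gt;
$ scontrol show job 123456    # detailed state of a single job&lt;br /&gt;
$ scancel 123456              # cancel the job&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;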
&lt;br /&gt;
== Batch Jobs: sbatch ==&lt;br /&gt;
&lt;br /&gt;
Batch jobs are submitted by using the command &#039;&#039;&#039;sbatch&#039;&#039;&#039;. The main purpose of the &#039;&#039;&#039;sbatch&#039;&#039;&#039; command is to specify the resources that are needed to run the job. &#039;&#039;&#039;sbatch&#039;&#039;&#039; will then queue the batch job. However, when the batch job starts depends on the availability of the requested resources and the fair share value.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* The syntax and use of &#039;&#039;&#039;sbatch&#039;&#039;&#039; can be displayed via:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ man sbatch&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;sbatch&#039;&#039;&#039; options can be used from the command line or in your job script. Different defaults for some of these options are set based on the queue and can be found [[BwUniCluster3.0/Slurm | here]].&lt;br /&gt;
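&lt;br /&gt;
For illustration, a minimal sketch of a job script using some of the options from the table below (queue, resources, job name and the application call are placeholders and must be adapted to your job):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --partition=cpu&lt;br /&gt;
#SBATCH --time=02:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --job-name=example&lt;br /&gt;
#SBATCH --output=example_%j.out&lt;br /&gt;
&lt;br /&gt;
# Placeholder: replace with the commands of your compute project&lt;br /&gt;
./my_program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Such a script is then submitted with &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt;, e.g. &amp;lt;code&amp;gt;sbatch myjob.sh&amp;lt;/code&amp;gt;.&lt;br /&gt;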
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! colspan=&amp;quot;3&amp;quot; | sbatch Options&lt;br /&gt;
|-&lt;br /&gt;
! style=&amp;quot;width:8%&amp;quot;| Command line&lt;br /&gt;
! style=&amp;quot;width:9%&amp;quot;| Script&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Purpose&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -t, --time=&#039;&#039;time&#039;&#039;&lt;br /&gt;
| #SBATCH --time=&#039;&#039;time&#039;&#039;&lt;br /&gt;
| Wall clock time limit.&amp;lt;br&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -N, --nodes=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --nodes=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of nodes to be used.&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -n, --ntasks=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --ntasks=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of tasks to be launched.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --ntasks-per-node=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --ntasks-per-node=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Maximum count of tasks per node.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -c, --cpus-per-task=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --cpus-per-task=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of CPUs required per (MPI-)task.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --gres=gpu:&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --gres=gpu:&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of GPUs required per node.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mem=&#039;&#039;value_in_MB&#039;&#039;&lt;br /&gt;
| #SBATCH --mem=&#039;&#039;value_in_MB&#039;&#039; &lt;br /&gt;
| Memory per node in megabytes. (You should normally omit this option.)&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mem-per-cpu=&#039;&#039;value_in_MB&#039;&#039;&lt;br /&gt;
| #SBATCH --mem-per-cpu=&#039;&#039;value_in_MB&#039;&#039; &lt;br /&gt;
| Minimum memory required per allocated CPU, in megabytes. (You should normally omit this option.)&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --exclusive&lt;br /&gt;
| #SBATCH --exclusive &lt;br /&gt;
| The job allocates all CPUs and GPUs of its nodes and does not share them with other running jobs.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mail-type=&#039;&#039;type&#039;&#039;&lt;br /&gt;
| #SBATCH --mail-type=&#039;&#039;type&#039;&#039;&lt;br /&gt;
| Notify user by email when certain event types occur.&amp;lt;br&amp;gt;Valid type values are NONE, BEGIN, END, FAIL, REQUEUE, ALL.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mail-user=&#039;&#039;mail-address&#039;&#039;&lt;br /&gt;
| #SBATCH --mail-user=&#039;&#039;mail-address&#039;&#039;&lt;br /&gt;
|  The specified mail-address receives email notification of state changes as defined by --mail-type.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --output=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --output=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| File in which job output is stored. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --error=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --error=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| File in which job error messages are stored. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -J, --job-name=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --job-name=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| Job name.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --export=[ALL,] &#039;&#039;env-variables&#039;&#039;&lt;br /&gt;
| #SBATCH --export=[ALL,] &#039;&#039;env-variables&#039;&#039;&lt;br /&gt;
| Identifies which environment variables from the submission environment are propagated to the launched application. Default is ALL.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -A, --account=&#039;&#039;group-name&#039;&#039;&lt;br /&gt;
| #SBATCH --account=&#039;&#039;group-name&#039;&#039;&lt;br /&gt;
| Charge the resources used by this job to the specified group. You may need this option if your account is assigned to more than one group. The project group a job is accounted on is shown after &amp;quot;Account=&amp;quot; in the output of &amp;quot;scontrol show job&amp;quot;. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -p, --partition=&#039;&#039;queue-name&#039;&#039;&lt;br /&gt;
| #SBATCH --partition=&#039;&#039;queue-name&#039;&#039;&lt;br /&gt;
| Request a specific queue for the resource allocation.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --reservation=&#039;&#039;reservation-name&#039;&#039;&lt;br /&gt;
| #SBATCH --reservation=&#039;&#039;reservation-name&#039;&#039;&lt;br /&gt;
| Use a specific reservation for the resource allocation.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -C, --constraint=&#039;&#039;LSDF&#039;&#039;&lt;br /&gt;
| #SBATCH --constraint=LSDF&lt;br /&gt;
| Job constraint LSDF filesystems.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -C, --constraint=&#039;&#039;BEEOND (BEEOND_4MDS, BEEOND_MAXMDS)&#039;&#039;&lt;br /&gt;
| #SBATCH --constraint=BEEOND (BEEOND_4MDS, BEEOND_MAXMDS)&lt;br /&gt;
| Job constraint BeeOND filesystem.&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
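A minimal example job script, using only options from the table above and assuming a hypothetical serial program &amp;lt;code&amp;gt;my_program&amp;lt;/code&amp;gt; in the submit directory (all resource values are placeholders and have to be adapted to your job), could look like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --partition=cpu&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --job-name=example&lt;br /&gt;
#SBATCH --output=example-%j.out&lt;br /&gt;
&lt;br /&gt;
# commands executed on the allocated compute node&lt;br /&gt;
./my_program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Such a script is submitted with &amp;lt;code&amp;gt;sbatch my_jobscript.sh&amp;lt;/code&amp;gt;; the &amp;lt;code&amp;gt;%j&amp;lt;/code&amp;gt; in the output file name is replaced by the job ID.&lt;br /&gt;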
&lt;br /&gt;
== Interactive Jobs: salloc ==&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 you are only allowed to run short jobs (&amp;lt;&amp;lt; 1 hour) with small memory requirements (&amp;lt;&amp;lt; 8 GByte) on the login nodes. If you want to run longer jobs and/or jobs that need more than 8 GByte of memory, you must allocate resources for so-called interactive jobs with the command salloc on a login node. For a serial application on a compute node that requires 5000 MByte of memory, with the interactive run limited to 2 hours, the following command has to be executed:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc -p cpu -n 1 -t 120 --mem=5000&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You will then get one core on a compute node within the partition &amp;quot;cpu&amp;quot;. After executing this command, &#039;&#039;&#039;DO NOT CLOSE&#039;&#039;&#039; your current terminal session; wait until the queueing system Slurm has granted you the requested resources on the compute system. You will be logged in automatically on the allocated node. To run a serial program on the granted core, you only have to type the name of the executable.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./&amp;lt;my_serial_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Please be aware that in this example your serial job must finish within 2 hours; otherwise it will be killed by the system at runtime. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
You can now also start a graphical X11 terminal connected to the dedicated resource, which is available for 2 hours, with the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ xterm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that once the walltime limit has been reached, the resources, i.e. the compute node, will automatically be revoked.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
An interactive parallel application running on one or several compute nodes (e.g. here 5 nodes with 96 cores each) usually also requires an amount of memory per node (e.g. 50 GByte) and a maximum time (e.g. 1 hour). For example, 5 such nodes can be allocated with the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc -p cpu -N 5 --ntasks-per-node=96 -t 01:00:00  --mem=50gb&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now you can run parallel jobs on 480 cores requiring 50 GByte of memory per node. Please be aware that you will be logged in on core 0 of the first node.&lt;br /&gt;
If you want to have access to another node, open a new terminal, log in to bwUniCluster 3.0 again and type the following commands to connect to the running interactive job and then to a specific node:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ srun --jobid=XXXXXXXX --pty /bin/bash&lt;br /&gt;
$ srun --nodelist=uc3nXXX --pty /bin/bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
the jobid and the nodelist can be shown.&lt;br /&gt;
&lt;br /&gt;
If you want to run MPI programs, you can do so by simply typing mpirun &amp;lt;program_name&amp;gt;. Your program will then run on all 480 cores. A very simple example of starting a parallel job is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can also start the debugger ddt by the commands:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module add devel/ddt&lt;br /&gt;
$ ddt &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The above commands will execute the parallel program &amp;lt;my_mpi_program&amp;gt; on all available cores. You can also start parallel programs on a subset of cores; an example for this can be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun -n 50 &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you are using Intel MPI, you must start &amp;lt;my_mpi_program&amp;gt; with the command mpiexec.hydra (instead of mpirun).&lt;br /&gt;
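A hypothetical invocation mirroring the mpirun examples above (the program name and process count are again placeholders) would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpiexec.hydra -n 50 ./my_mpi_program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;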
&lt;br /&gt;
== Monitor and manage jobs ==&lt;br /&gt;
&lt;br /&gt;
=== List of your submitted jobs : squeue ===&lt;br /&gt;
Displays information about YOUR active, pending and/or recently completed jobs; only your own jobs are shown. The command squeue is explained in detail on the webpage https://slurm.schedmd.com/squeue.html or via the manpage (man squeue).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;squeue&#039;&#039; example on bwUniCluster 3.0 &amp;lt;small&amp;gt;(Only your own jobs are displayed!)&amp;lt;/small&amp;gt;.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue &lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
              1262       cpu     wrap ka_ab123  R       8:15      1 uc3n002&lt;br /&gt;
              1267 dev_gpu_h     wrap ka_ab123 PD       0:00      1 (Resources)&lt;br /&gt;
              1265   highmem     wrap ka_ab123  R       2:41      1 uc3n084&lt;br /&gt;
$ squeue -l&lt;br /&gt;
             JOBID PARTITION     NAME     USER    STATE       TIME TIME_LIMI  NODES NODELIST(REASON)&lt;br /&gt;
              1262       cpu     wrap ka_ab123  RUNNING       8:55     20:00      1 uc3n002&lt;br /&gt;
              1267 dev_gpu_h     wrap ka_ab123  PENDING       0:00     20:00      1 (Resources)&lt;br /&gt;
              1265   highmem     wrap ka_ab123  RUNNING       3:21     20:00      1 uc3n084&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
If resources are not immediately available, add &amp;lt;code&amp;gt;--start&amp;lt;/code&amp;gt; to show the expected start time of a pending job:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;sh&amp;quot;&amp;gt;squeue --start&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Detailed job information : scontrol show job ===&lt;br /&gt;
scontrol show job displays detailed job state information and diagnostic output for all of your jobs or for one specified job. Detailed information is available for active, pending and recently completed jobs. The command scontrol is explained in detail on the webpage https://slurm.schedmd.com/scontrol.html or via the manpage (man scontrol). &lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Display the state of all your jobs in normal mode: scontrol show job&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Display the state of a job with &amp;lt;jobid&amp;gt; in normal mode: scontrol show job &amp;lt;jobid&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here is an example from bwUniCluster 3.0:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue&lt;br /&gt;
JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
1262       cpu     wrap ka_zs040  R       1:12      1 uc3n002&lt;br /&gt;
&lt;br /&gt;
$&lt;br /&gt;
$ # now, see details of my running job with jobid 1262&lt;br /&gt;
$ &lt;br /&gt;
$ scontrol show job 1262&lt;br /&gt;
&lt;br /&gt;
JobId=1262 JobName=wrap&lt;br /&gt;
   UserId=ka_zs0402(241992) GroupId=ka_scc(12345) MCS_label=N/A&lt;br /&gt;
   Priority=4246 Nice=0 Account=ka QOS=normal&lt;br /&gt;
   JobState=RUNNING Reason=None Dependency=(null)&lt;br /&gt;
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0&lt;br /&gt;
   RunTime=00:00:37 TimeLimit=00:20:00 TimeMin=N/A&lt;br /&gt;
   SubmitTime=2025-04-04T10:01:30 EligibleTime=2025-04-04T10:01:30&lt;br /&gt;
   AccrueTime=2025-04-04T10:01:30&lt;br /&gt;
   StartTime=2025-04-04T10:01:31 EndTime=2025-04-04T10:21:31 Deadline=N/A&lt;br /&gt;
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2025-04-04T10:01:31 Scheduler=Main&lt;br /&gt;
   Partition=cpu AllocNode:Sid=uc3n999:2819841&lt;br /&gt;
   ReqNodeList=(null) ExcNodeList=(null)&lt;br /&gt;
   NodeList=uc3n002&lt;br /&gt;
   BatchHost=uc3n002&lt;br /&gt;
   NumNodes=1 NumCPUs=2 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*&lt;br /&gt;
   ReqTRES=cpu=1,mem=2000M,node=1,billing=1&lt;br /&gt;
   AllocTRES=cpu=2,mem=4000M,node=1,billing=2&lt;br /&gt;
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:1 CoreSpec=*&lt;br /&gt;
   MinCPUsNode=1 MinMemoryCPU=2000M MinTmpDiskNode=0&lt;br /&gt;
   Features=(null) DelayBoot=00:00:00&lt;br /&gt;
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)&lt;br /&gt;
   Command=(null)&lt;br /&gt;
   WorkDir=/pfs/data6/home/ka/ka_scc/ka_zs0402&lt;br /&gt;
   StdErr=/pfs/data6/home/ka/ka_scc/ka_zs0402/slurm-1262.out&lt;br /&gt;
   StdIn=/dev/null&lt;br /&gt;
   StdOut=/pfs/data6/home/ka/ka_scc/ka_zs0402/slurm-1262.out&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Each request to the Slurm workload manager generates load. &amp;lt;p style=&amp;quot;color:red;&amp;quot;&amp;gt;&amp;lt;b&amp;gt;Therefore, do not use &amp;lt;code&amp;gt;squeue&amp;lt;/code&amp;gt; with a simple &amp;lt;code&amp;gt;watch&amp;lt;/code&amp;gt;.&amp;lt;/b&amp;gt;&amp;lt;/p&amp;gt; The smallest allowed time interval is &amp;lt;b&amp;gt;30 seconds&amp;lt;/b&amp;gt;.&amp;lt;br&amp;gt;&lt;br /&gt;
Any violation of this rule will result in the task being terminated without notice.&lt;br /&gt;
&lt;br /&gt;
=== Canceling own jobs : scancel ===&lt;br /&gt;
The scancel command is used to cancel jobs. It is explained in detail on the webpage https://slurm.schedmd.com/scancel.html or via the manpage (man scancel). The basic usage is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ scancel [-i] &amp;lt;job-id&amp;gt;&lt;br /&gt;
$ scancel -t &amp;lt;job_state_name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
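For example, a job with the (hypothetical) job ID 1262 can be cancelled directly, or, using the second form above with a valid state name, jobs can be cancelled by state (as a regular user this only affects your own jobs):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ scancel 1262&lt;br /&gt;
$ scancel -t PENDING&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;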
&lt;br /&gt;
= Slurm Options =&lt;br /&gt;
[[BwUniCluster3.0/Running_Jobs/Slurm | Detailed Slurm usage]]&lt;br /&gt;
&lt;br /&gt;
= Best Practices =&lt;br /&gt;
&lt;br /&gt;
== Step-by-Step example==&lt;br /&gt;
&lt;br /&gt;
== Dos and Don&#039;ts ==&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;| Do not run squeue and other Slurm commands in loops or via &amp;quot;watch&amp;quot;, so as not to saturate the Slurm daemon with RPC requests.&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Running_Jobs&amp;diff=15709</id>
		<title>BwUniCluster3.0/Running Jobs</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Running_Jobs&amp;diff=15709"/>
		<updated>2026-02-03T14:35:52Z</updated>

		<summary type="html">&lt;p&gt;S Braun: /* Monitor and manage jobs */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
= Purpose and function of a queuing system =&lt;br /&gt;
&lt;br /&gt;
All compute activities on bwUniCluster 3.0 have to be performed on the compute nodes. Compute nodes are only available by requesting the corresponding resources via the queuing system. As soon as the requested resources are available, tasks are either executed automatically via a batch script or the resources can be used interactively.&amp;lt;br&amp;gt;&lt;br /&gt;
General procedure: Hint to [[Running_Calculations | Running Calculations]]&lt;br /&gt;
&lt;br /&gt;
== Job submission process ==&lt;br /&gt;
&lt;br /&gt;
bwUniCluster 3.0 uses the workload management software Slurm. Therefore, any job submission by the user must be made with commands of the Slurm software. Slurm queues and runs user jobs based on fair-share policies.&lt;br /&gt;
&lt;br /&gt;
== Slurm ==&lt;br /&gt;
&lt;br /&gt;
HPC Workload Manager on bwUniCluster 3.0 is Slurm.&lt;br /&gt;
Slurm is a cluster management and job scheduling system. Slurm has three key functions. &lt;br /&gt;
* It allocates access to resources (compute cores on nodes) to users for some duration of time so they can perform work. &lt;br /&gt;
* It provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. &lt;br /&gt;
* It arbitrates contention for resources by managing a queue of pending work.&lt;br /&gt;
&lt;br /&gt;
Any kind of calculation on the compute nodes of bwUniCluster 3.0 requires the user to define the calculation as a sequence of commands, together with the required run time, number of CPU cores and main memory, and to submit all of this, i.e. the &#039;&#039;&#039;batch job&#039;&#039;&#039;, to the resource and workload management software.&lt;br /&gt;
&lt;br /&gt;
== Terms and definitions ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Partitions &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Slurm manages job queues for different &#039;&#039;&#039;partitions&#039;&#039;&#039;. Partitions are used to group similar node types (e.g. nodes with and without accelerators) and to enforce different access policies and resource limits.&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 there are different partitions:&lt;br /&gt;
&lt;br /&gt;
* CPU-only nodes&lt;br /&gt;
** 2-socket nodes, consisting of 2 Intel Ice Lake processors with 32 cores each or 2 AMD processors with 48 cores each&lt;br /&gt;
** 2-socket nodes with very high RAM capacity, consisting of 2 AMD processors with 48 cores each&lt;br /&gt;
* GPU-accelerated nodes&lt;br /&gt;
** 2-socket nodes with 4x NVIDIA A100 or 4x NVIDIA H100 GPUs&lt;br /&gt;
** 4-socket node with 4x AMD Instinct accelerator&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Queues &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Job &#039;&#039;&#039;queues&#039;&#039;&#039; are used to manage jobs that request access to shared but limited computing resources of a certain kind (partition).&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 there are different main types of queues:&lt;br /&gt;
* Regular queues&lt;br /&gt;
** cpu: Jobs that request CPU-only nodes.&lt;br /&gt;
** gpu: Jobs that request GPU-accelerated nodes.&lt;br /&gt;
* Development queues (dev)&lt;br /&gt;
** Short, usually interactive jobs that are used for developing, compiling and testing code and workflows. The intention behind development queues is to provide users with immediate access to compute resources without having to wait. This is the place for instantaneous heavy computation that would otherwise affect other users, as would be the case on the login nodes.&lt;br /&gt;
&lt;br /&gt;
Requested compute resources such as (wall-)time, number of nodes and amount of memory are restricted and must fit into the boundaries imposed by the queues. The request for compute resources on the bwUniCluster 3.0 &amp;lt;font color=red&amp;gt;requires at least the specification of the &#039;&#039;&#039;queue&#039;&#039;&#039; and the &#039;&#039;&#039;time&#039;&#039;&#039;&amp;lt;/font&amp;gt;.&lt;br /&gt;
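For illustration, a minimal request that satisfies this rule only names a queue and a time limit, e.g. a 10-minute allocation in the cpu queue (my_program is a hypothetical placeholder):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc -p cpu -t 10&lt;br /&gt;
$ sbatch -p cpu -t 10 --wrap=./my_program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All other resources are then filled in with the defaults of the requested queue (see the queue tables below).&lt;br /&gt;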
&lt;br /&gt;
&#039;&#039;&#039; Jobs &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Jobs can be run non-interactively as &#039;&#039;&#039;batch jobs&#039;&#039;&#039; or as &#039;&#039;&#039;interactive jobs&#039;&#039;&#039;.&amp;lt;br&amp;gt;&lt;br /&gt;
Submitting a batch job means that all steps of a compute project are defined in a Bash script. This Bash script is queued and executed as soon as the compute resources are available and allocated. Jobs are enqueued with the &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; command.&lt;br /&gt;
For interactive jobs, the resources are requested with the &amp;lt;code&amp;gt;salloc&amp;lt;/code&amp;gt; command. As soon as the computing resources are available and allocated, a command line prompt is returned on a compute node and the user can freely use the allocated resources.&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
&#039;&#039;&#039;Please remember:&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Heavy computations are not allowed on the login nodes&#039;&#039;&#039;.&amp;lt;br&amp;gt;Use a development or a regular job queue instead! Please refer to [[BwUniCluster3.0/Login#Allowed_Activities_on_Login_Nodes|Allowed Activities on Login Nodes]].&lt;br /&gt;
* &#039;&#039;&#039;Development queues&#039;&#039;&#039; are meant for &#039;&#039;&#039;development tasks&#039;&#039;&#039;.&amp;lt;br&amp;gt;Do not misuse these queues for regular, short-running jobs or chain jobs! Only one running job at a time is allowed, and the maximum queue length is limited to 3.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
= Queues on bwUniCluster 3.0 = &lt;br /&gt;
== Policy ==&lt;br /&gt;
&lt;br /&gt;
Computing time is provided in accordance with the &#039;&#039;&#039;fair share policy&#039;&#039;&#039;. The individual investment shares of the respective university and the resources already used by its members are taken into account. Furthermore, the following throttling policy is also active: the &#039;&#039;&#039;maximum number of physical cores&#039;&#039;&#039; in use at any given time is &#039;&#039;&#039;1920 per user&#039;&#039;&#039; (aggregated over all running jobs). This number corresponds to 30 nodes on the Ice Lake partition or 20 nodes on the standard partition. The aim is to minimize waiting times and maximize the number of users who can access computing time at the same time.&lt;br /&gt;
&lt;br /&gt;
== Regular Queues ==&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node-Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;cpu_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Ice Lake&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=72:00:00, nodes=30, mem=249600mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;cpu&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Standard&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=72:00:00, nodes=20, mem=380000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;highmem&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;High Memory&lt;br /&gt;
| mem-per-cpu=12090mb&lt;br /&gt;
| mem=380001mb&lt;br /&gt;
| time=72:00:00, nodes=4, mem=2300000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_h100&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=193300mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=72:00:00, nodes=12, mem=760000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_mi300&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU node&amp;lt;br/&amp;gt;AMD GPU x4&lt;br /&gt;
| mem-per-gpu=128200mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=72:00:00, nodes=1, mem=510000mb, ntasks-per-node=40, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_a100_il&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;gpu_h100_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;Ice Lake&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=127500mb&amp;lt;br/&amp;gt;cpus-per-gpu=16&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=48:00:00, nodes=9(A100)/nodes=5(H100) , mem=510000mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|}&lt;br /&gt;
Table 1: Regular Queues&lt;br /&gt;
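As an illustration of Table 1, a hypothetical single-GPU job in the &amp;lt;code&amp;gt;gpu_h100&amp;lt;/code&amp;gt; queue (script name and run time are placeholders) could be submitted as follows; note that at least one GPU has to be requested via &amp;lt;code&amp;gt;--gres=gpu:1&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ sbatch -p gpu_h100 -t 04:00:00 --gres=gpu:1 my_gpu_job.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;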
&lt;br /&gt;
== Short Queues ==&lt;br /&gt;
&amp;lt;p style=&amp;quot;color:red; &amp;quot;&amp;gt;&amp;lt;b&amp;gt;Queues with a short runtime of 30 minutes.&amp;lt;/b&amp;gt;&amp;lt;/p&amp;gt; &lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_a100_short&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;Ice Lake&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=94000mb&amp;lt;br/&amp;gt;cpus-per-gpu=12&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=12, mem=376000mb, ntasks-per-node=48, (threads-per-core=2)&lt;br /&gt;
|}&lt;br /&gt;
Table 2: Short Queues&lt;br /&gt;
&lt;br /&gt;
== Development Queues ==&lt;br /&gt;
Only for development, i.e. debugging or performance optimization ...&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_cpu_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Ice Lake&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=30, nodes=8, mem=249600mb, ntasks-per-node=64, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_cpu&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Standard&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=30, nodes=1, mem=380000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_gpu_h100&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=193300mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=1, mem=760000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_gpu_a100_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&amp;lt;br/&amp;gt;&lt;br /&gt;
| mem-per-gpu=127500mb&amp;lt;br/&amp;gt;cpus-per-gpu=16 &lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=1, mem=510000mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|}&lt;br /&gt;
Table 3: Development Queues&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The default resources of a queue define the number of tasks and the memory if these are not explicitly given with the sbatch command. The resource options &#039;&#039;--time&#039;&#039;, &#039;&#039;--ntasks&#039;&#039;, &#039;&#039;--nodes&#039;&#039;, &#039;&#039;--mem&#039;&#039; and &#039;&#039;--mem-per-cpu&#039;&#039; are described [[BwUniCluster3.0/Running_Jobs/Slurm|here]].&lt;br /&gt;
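For example, a short interactive development session (hypothetical values) can be requested in the &amp;lt;code&amp;gt;dev_cpu&amp;lt;/code&amp;gt; queue; memory is then taken from the queue default of mem-per-cpu=2000mb:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc -p dev_cpu -t 30 -n 4&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;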
&lt;br /&gt;
== Check available resources: sinfo_t_idle ==&lt;br /&gt;
The Slurm command sinfo is used to view partition and node information for a system running Slurm. It incorporates down time, reservations, and node state information in determining the available backfill window. The sinfo command can only be used by the administrator.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
SCC has prepared a special script (sinfo_t_idle) to find out how many processors are available for immediate use on the system. It is anticipated that users will use this information to submit jobs that meet these criteria and thus obtain quick job turnaround times. &lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* The following command displays which resources are available for immediate use in each partition.&lt;br /&gt;
&amp;lt;pre&amp;gt;$ sinfo_t_idle &lt;br /&gt;
Partition dev_cpu                 :      1 nodes idle&lt;br /&gt;
Partition cpu                     :      1 nodes idle&lt;br /&gt;
Partition highmem                 :      2 nodes idle&lt;br /&gt;
Partition dev_gpu_h100            :      0 nodes idle&lt;br /&gt;
Partition gpu_h100                :      0 nodes idle&lt;br /&gt;
Partition gpu_mi300               :      0 nodes idle&lt;br /&gt;
Partition dev_cpu_il              :      7 nodes idle&lt;br /&gt;
Partition cpu_il                  :      2 nodes idle&lt;br /&gt;
Partition dev_gpu_a100_il         :      1 nodes idle&lt;br /&gt;
Partition gpu_a100_il             :      0 nodes idle&lt;br /&gt;
Partition gpu_h100_il             :      1 nodes idle&lt;br /&gt;
Partition gpu_a100_short          :      0 nodes idle&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Running Jobs =&lt;br /&gt;
&lt;br /&gt;
== Slurm Commands (excerpt) ==&lt;br /&gt;
Important Slurm commands for non-administrators working on bwUniCluster 3.0.&lt;br /&gt;
{| width=850px class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Slurm commands !! Brief explanation&lt;br /&gt;
|-&lt;br /&gt;
| [[#Batch Jobs: sbatch|sbatch]] || Submits a job and puts it into the queue [[https://slurm.schedmd.com/sbatch.html sbatch]] &lt;br /&gt;
|-&lt;br /&gt;
| [[#Interactive Jobs: salloc|salloc]] || Requests resources for an interactive Job [[https://slurm.schedmd.com/salloc.html salloc]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Monitor and manage jobs |scontrol show job]] || Displays detailed job state information [[https://slurm.schedmd.com/scontrol.html scontrol]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#List of your submitted jobs : squeue|squeue]] || Displays information about active, eligible, blocked, and/or recently completed jobs [[https://slurm.schedmd.com/squeue.html squeue]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#List of your submitted jobs : squeue|squeue --start]] || Returns start time of submitted job [[https://slurm.schedmd.com/squeue.html squeue]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Check available resources: sinfo_t_idle|sinfo_t_idle]] || Shows what resources are available for immediate use [[https://slurm.schedmd.com/sinfo.html sinfo]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Canceling own jobs : scancel|scancel]] || Cancels a job [[https://slurm.schedmd.com/scancel.html scancel]]&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
* [https://slurm.schedmd.com/tutorials.html  Slurm Tutorials]&lt;br /&gt;
* [https://slurm.schedmd.com/pdfs/summary.pdf  Slurm command/option summary (2 pages)]&lt;br /&gt;
* [https://slurm.schedmd.com/man_index.html  Slurm Commands]&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Batch Jobs: sbatch ==&lt;br /&gt;
&lt;br /&gt;
Batch jobs are submitted by using the command &#039;&#039;&#039;sbatch&#039;&#039;&#039;. The main purpose of the &#039;&#039;&#039;sbatch&#039;&#039;&#039; command is to specify the resources that are needed to run the job. &#039;&#039;&#039;sbatch&#039;&#039;&#039; will then queue the batch job. However, when a batch job starts depends on the availability of the requested resources and your fair-share value.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* The syntax and use of &#039;&#039;&#039;sbatch&#039;&#039;&#039; can be displayed via:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ man sbatch&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;sbatch&#039;&#039;&#039; options can be used on the command line or in your job script. Different defaults for some of these options are set depending on the queue and can be found [[BwUniCluster3.0/Slurm | here]].&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! colspan=&amp;quot;3&amp;quot; | sbatch Options&lt;br /&gt;
|-&lt;br /&gt;
! style=&amp;quot;width:8%&amp;quot;| Command line&lt;br /&gt;
! style=&amp;quot;width:9%&amp;quot;| Script&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Purpose&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -t, --time=&#039;&#039;time&#039;&#039;&lt;br /&gt;
| #SBATCH --time=&#039;&#039;time&#039;&#039;&lt;br /&gt;
| Wall clock time limit.&amp;lt;br&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -N, --nodes=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --nodes=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of nodes to be used.&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -n, --ntasks=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --ntasks=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of tasks to be launched.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --ntasks-per-node=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --ntasks-per-node=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Maximum count of tasks per node.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -c, --cpus-per-task=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --cpus-per-task=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of CPUs required per (MPI-)task.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --gres=gpu:&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --gres=gpu:&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of GPUs required per node.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mem=&#039;&#039;value_in_MB&#039;&#039;&lt;br /&gt;
| #SBATCH --mem=&#039;&#039;value_in_MB&#039;&#039; &lt;br /&gt;
| Memory per node in megabytes. (You should normally omit this option.)&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mem-per-cpu=&#039;&#039;value_in_MB&#039;&#039;&lt;br /&gt;
| #SBATCH --mem-per-cpu=&#039;&#039;value_in_MB&#039;&#039; &lt;br /&gt;
| Minimum memory required per allocated CPU, in megabytes. (You should normally omit this option.)&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --exclusive&lt;br /&gt;
| #SBATCH --exclusive &lt;br /&gt;
| The job allocates all CPUs and GPUs of its nodes and does not share them with other running jobs.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mail-type=&#039;&#039;type&#039;&#039;&lt;br /&gt;
| #SBATCH --mail-type=&#039;&#039;type&#039;&#039;&lt;br /&gt;
| Notify user by email when certain event types occur.&amp;lt;br&amp;gt;Valid type values are NONE, BEGIN, END, FAIL, REQUEUE, ALL.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mail-user=&#039;&#039;mail-address&#039;&#039;&lt;br /&gt;
| #SBATCH --mail-user=&#039;&#039;mail-address&#039;&#039;&lt;br /&gt;
|  The specified mail-address receives email notification of state changes as defined by --mail-type.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --output=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --output=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| File in which job output is stored. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --error=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --error=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| File in which job error messages are stored. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -J, --job-name=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --job-name=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| Job name.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --export=[ALL,] &#039;&#039;env-variables&#039;&#039;&lt;br /&gt;
| #SBATCH --export=[ALL,] &#039;&#039;env-variables&#039;&#039;&lt;br /&gt;
| Identifies which environment variables from the submission environment are propagated to the launched application. Default is ALL.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -A, --account=&#039;&#039;group-name&#039;&#039;&lt;br /&gt;
| #SBATCH --account=&#039;&#039;group-name&#039;&#039;&lt;br /&gt;
| Charge the resources used by this job to the specified group. You may need this option if your account is assigned to more than one group. The project group a job is accounted on is shown after &amp;quot;Account=&amp;quot; in the output of &amp;quot;scontrol show job&amp;quot;. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -p, --partition=&#039;&#039;queue-name&#039;&#039;&lt;br /&gt;
| #SBATCH --partition=&#039;&#039;queue-name&#039;&#039;&lt;br /&gt;
| Request a specific queue for the resource allocation.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --reservation=&#039;&#039;reservation-name&#039;&#039;&lt;br /&gt;
| #SBATCH --reservation=&#039;&#039;reservation-name&#039;&#039;&lt;br /&gt;
| Use a specific reservation for the resource allocation.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -C, --constraint=&#039;&#039;LSDF&#039;&#039;&lt;br /&gt;
| #SBATCH --constraint=LSDF&lt;br /&gt;
| Job constraint LSDF filesystems.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -C, --constraint=&#039;&#039;BEEOND (BEEOND_4MDS, BEEOND_MAXMDS)&#039;&#039;&lt;br /&gt;
| #SBATCH --constraint=BEEOND (BEEOND_4MDS, BEEOND_MAXMDS)&lt;br /&gt;
| Job constraint BeeOND filesystem.&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Interactive Jobs: salloc ==&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 you are only allowed to run short jobs (&amp;lt;&amp;lt; 1 hour) with small memory requirements (&amp;lt;&amp;lt; 8 GByte) on the login nodes. If you want to run longer jobs and/or jobs that need more than 8 GByte of memory, you must allocate resources for so-called interactive jobs with the command salloc on a login node. For a serial application on a compute node that requires 5000 MByte of memory, with the interactive run limited to 2 hours, the following command has to be executed:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc -p cpu -n 1 -t 120 --mem=5000&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You will then get one core on a compute node within the partition &amp;quot;cpu&amp;quot;. After executing this command, &#039;&#039;&#039;DO NOT CLOSE&#039;&#039;&#039; your current terminal session; wait until the queueing system Slurm has granted you the requested resources on the compute system. You will be logged in automatically on the allocated node. To run a serial program on the granted core, you only have to type the name of the executable.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./&amp;lt;my_serial_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Please be aware that in this example your serial job must finish within 2 hours; otherwise it will be killed by the system at runtime. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
You can now also start a graphical X11 terminal connected to the dedicated resource, which is available for 2 hours, with the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ xterm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that once the walltime limit has been reached, the resources, i.e. the compute node, will automatically be revoked.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
An interactive parallel application running on one or several compute nodes (e.g. here 5 nodes with 96 cores each) usually also requires an amount of memory per node (e.g. 50 GByte) and a maximum time (e.g. 1 hour). For example, 5 such nodes can be allocated with the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc -p cpu -N 5 --ntasks-per-node=96 -t 01:00:00  --mem=50gb&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now you can run parallel jobs on 480 cores requiring 50 GByte of memory per node. Please be aware that you will be logged in on core 0 of the first node.&lt;br /&gt;
If you want to have access to another node, open a new terminal, log in to bwUniCluster 3.0 again and type the following commands to connect to the running interactive job and then to a specific node:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ srun --jobid=XXXXXXXX --pty /bin/bash&lt;br /&gt;
$ srun --nodelist=uc3nXXX --pty /bin/bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
the jobid and the nodelist can be shown.&lt;br /&gt;
&lt;br /&gt;
If you want to run MPI programs, you can do so by simply typing mpirun &amp;lt;program_name&amp;gt;. Your program will then run on all 480 cores. A very simple example of starting a parallel job is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can also start the debugger ddt by the commands:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module add devel/ddt&lt;br /&gt;
$ ddt &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The above commands will execute the parallel program &amp;lt;my_mpi_program&amp;gt; on all available cores. You can also start parallel programs on a subset of cores; an example for this can be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun -n 50 &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you are using Intel MPI, you must start &amp;lt;my_mpi_program&amp;gt; with the command mpiexec.hydra (instead of mpirun).&lt;br /&gt;
&lt;br /&gt;
== Interactive Computing with Jupyter ==&lt;br /&gt;
&lt;br /&gt;
== Monitor and manage jobs ==&lt;br /&gt;
&lt;br /&gt;
=== List of your submitted jobs : squeue ===&lt;br /&gt;
Displays information about YOUR active, pending and/or recently completed jobs; only your own jobs are shown. The command squeue is explained in detail on the webpage https://slurm.schedmd.com/squeue.html or via the manpage (man squeue).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;squeue&#039;&#039; example on bwUniCluster 3.0 &amp;lt;small&amp;gt;(Only your own jobs are displayed!)&amp;lt;/small&amp;gt;.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue &lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
              1262       cpu     wrap ka_ab123  R       8:15      1 uc3n002&lt;br /&gt;
              1267 dev_gpu_h     wrap ka_ab123 PD       0:00      1 (Resources)&lt;br /&gt;
              1265   highmem     wrap ka_ab123  R       2:41      1 uc3n084&lt;br /&gt;
$ squeue -l&lt;br /&gt;
             JOBID PARTITION     NAME     USER    STATE       TIME TIME_LIMI  NODES NODELIST(REASON)&lt;br /&gt;
              1262       cpu     wrap ka_ab123  RUNNING       8:55     20:00      1 uc3n002&lt;br /&gt;
              1267 dev_gpu_h     wrap ka_ab123  PENDING       0:00     20:00      1 (Resources)&lt;br /&gt;
              1265   highmem     wrap ka_ab123  RUNNING       3:21     20:00      1 uc3n084&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
If resources are not immediately available, add &amp;lt;code&amp;gt;--start&amp;lt;/code&amp;gt; to show the expected start time of a pending job:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;sh&amp;quot;&amp;gt;squeue --start&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Detailed job information : scontrol show job ===&lt;br /&gt;
scontrol show job displays detailed job state information and diagnostic output for all of your jobs or for one specified job. Detailed information is available for active, pending and recently completed jobs. The command scontrol is explained in detail on the webpage https://slurm.schedmd.com/scontrol.html or via the manpage (man scontrol). &lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Display the state of all your jobs in normal mode: scontrol show job&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Display the state of a job with &amp;lt;jobid&amp;gt; in normal mode: scontrol show job &amp;lt;jobid&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here is an example from bwUniCluster 3.0:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue&lt;br /&gt;
JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
1262       cpu     wrap ka_zs040  R       1:12      1 uc3n002&lt;br /&gt;
&lt;br /&gt;
$&lt;br /&gt;
$ # now, see details of my running job with jobid 1262&lt;br /&gt;
$ &lt;br /&gt;
$ scontrol show job 1262&lt;br /&gt;
&lt;br /&gt;
JobId=1262 JobName=wrap&lt;br /&gt;
   UserId=ka_zs0402(241992) GroupId=ka_scc(12345) MCS_label=N/A&lt;br /&gt;
   Priority=4246 Nice=0 Account=ka QOS=normal&lt;br /&gt;
   JobState=RUNNING Reason=None Dependency=(null)&lt;br /&gt;
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0&lt;br /&gt;
   RunTime=00:00:37 TimeLimit=00:20:00 TimeMin=N/A&lt;br /&gt;
   SubmitTime=2025-04-04T10:01:30 EligibleTime=2025-04-04T10:01:30&lt;br /&gt;
   AccrueTime=2025-04-04T10:01:30&lt;br /&gt;
   StartTime=2025-04-04T10:01:31 EndTime=2025-04-04T10:21:31 Deadline=N/A&lt;br /&gt;
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2025-04-04T10:01:31 Scheduler=Main&lt;br /&gt;
   Partition=cpu AllocNode:Sid=uc3n999:2819841&lt;br /&gt;
   ReqNodeList=(null) ExcNodeList=(null)&lt;br /&gt;
   NodeList=uc3n002&lt;br /&gt;
   BatchHost=uc3n002&lt;br /&gt;
   NumNodes=1 NumCPUs=2 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*&lt;br /&gt;
   ReqTRES=cpu=1,mem=2000M,node=1,billing=1&lt;br /&gt;
   AllocTRES=cpu=2,mem=4000M,node=1,billing=2&lt;br /&gt;
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:1 CoreSpec=*&lt;br /&gt;
   MinCPUsNode=1 MinMemoryCPU=2000M MinTmpDiskNode=0&lt;br /&gt;
   Features=(null) DelayBoot=00:00:00&lt;br /&gt;
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)&lt;br /&gt;
   Command=(null)&lt;br /&gt;
   WorkDir=/pfs/data6/home/ka/ka_scc/ka_zs0402&lt;br /&gt;
   StdErr=/pfs/data6/home/ka/ka_scc/ka_zs0402/slurm-1262.out&lt;br /&gt;
   StdIn=/dev/null&lt;br /&gt;
   StdOut=/pfs/data6/home/ka/ka_scc/ka_zs0402/slurm-1262.out&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Each request to the Slurm workload manager generates load. &amp;lt;p style=&amp;quot;color:red;&amp;quot;&amp;gt;&amp;lt;b&amp;gt;Therefore, do not use &amp;lt;code&amp;gt;squeue&amp;lt;/code&amp;gt; with a simple &amp;lt;code&amp;gt;watch&amp;lt;/code&amp;gt;.&amp;lt;/b&amp;gt;&amp;lt;/p&amp;gt; The smallest allowed time interval is &amp;lt;b&amp;gt;30 seconds&amp;lt;/b&amp;gt;.&amp;lt;br&amp;gt;&lt;br /&gt;
Any violation of this rule will result in the task being terminated without notice.&lt;br /&gt;
&lt;br /&gt;
=== Canceling own jobs : scancel ===&lt;br /&gt;
The scancel command is used to cancel jobs. It is explained in detail on the webpage https://slurm.schedmd.com/scancel.html or via the manpage (man scancel). The basic usage is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ scancel [-i] &amp;lt;job-id&amp;gt;&lt;br /&gt;
$ scancel -t &amp;lt;job_state_name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Slurm Options =&lt;br /&gt;
[[BwUniCluster3.0/Running_Jobs/Slurm | Detailed Slurm usage]]&lt;br /&gt;
&lt;br /&gt;
= Best Practices =&lt;br /&gt;
&lt;br /&gt;
== Step-by-Step example==&lt;br /&gt;
&lt;br /&gt;
== Dos and Don&#039;ts ==&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;| Do not run squeue and other Slurm commands in loops or via &amp;quot;watch&amp;quot;, so as not to saturate the Slurm daemon with RPC requests.&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Running_Jobs&amp;diff=15672</id>
		<title>BwUniCluster3.0/Running Jobs</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Running_Jobs&amp;diff=15672"/>
		<updated>2026-01-07T13:52:09Z</updated>

		<summary type="html">&lt;p&gt;S Braun: /* Batch Jobs: sbatch */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
= Purpose and function of a queuing system =&lt;br /&gt;
&lt;br /&gt;
All compute activities on bwUniCluster 3.0 have to be performed on the compute nodes. Compute nodes are only available by requesting the corresponding resources via the queuing system. As soon as the requested resources are available, tasks are either executed automatically via a batch script or the resources can be used interactively.&amp;lt;br&amp;gt;&lt;br /&gt;
General procedure: Hint to [[Running_Calculations | Running Calculations]]&lt;br /&gt;
&lt;br /&gt;
== Job submission process ==&lt;br /&gt;
&lt;br /&gt;
bwUniCluster 3.0 uses the workload management software Slurm. Therefore, any job submission by the user must be made with commands of the Slurm software. Slurm queues and runs user jobs based on fair-share policies.&lt;br /&gt;
&lt;br /&gt;
== Slurm ==&lt;br /&gt;
&lt;br /&gt;
HPC Workload Manager on bwUniCluster 3.0 is Slurm.&lt;br /&gt;
Slurm is a cluster management and job scheduling system. Slurm has three key functions. &lt;br /&gt;
* It allocates access to resources (compute cores on nodes) to users for some duration of time so they can perform work. &lt;br /&gt;
* It provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. &lt;br /&gt;
* It arbitrates contention for resources by managing a queue of pending work.&lt;br /&gt;
&lt;br /&gt;
Any kind of calculation on the compute nodes of bwUniCluster 3.0 requires the user to define the calculation as a sequence of commands, together with the required run time, number of CPU cores and main memory, and to submit all of this, i.e. the &#039;&#039;&#039;batch job&#039;&#039;&#039;, to the resource and workload management software.&lt;br /&gt;
&lt;br /&gt;
== Terms and definitions ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Partitions &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Slurm manages job queues for different &#039;&#039;&#039;partitions&#039;&#039;&#039;. Partitions are used to group similar node types (e.g. nodes with and without accelerators) and to enforce different access policies and resource limits.&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 there are different partitions:&lt;br /&gt;
&lt;br /&gt;
* CPU-only nodes&lt;br /&gt;
** 2-socket nodes, consisting of 2 Intel Ice Lake processors with 32 cores each or 2 AMD processors with 48 cores each&lt;br /&gt;
** 2-socket nodes with very high RAM capacity, consisting of 2 AMD processors with 48 cores each&lt;br /&gt;
* GPU-accelerated nodes&lt;br /&gt;
** 2-socket nodes with 4x NVIDIA A100 or 4x NVIDIA H100 GPUs&lt;br /&gt;
** 4-socket node with 4x AMD Instinct accelerator&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Queues &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Job &#039;&#039;&#039;queues&#039;&#039;&#039; are used to manage jobs that request access to shared but limited computing resources of a certain kind (partition).&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 there are different main types of queues:&lt;br /&gt;
* Regular queues&lt;br /&gt;
** cpu: Jobs that request CPU-only nodes.&lt;br /&gt;
** gpu: Jobs that request GPU-accelerated nodes.&lt;br /&gt;
* Development queues (dev)&lt;br /&gt;
** Short, usually interactive jobs that are used for developing, compiling and testing code and workflows. The intention behind development queues is to provide users with immediate access to compute resources without having to wait. This is the place for instantaneous heavy computation that would otherwise affect other users, as would be the case on the login nodes.&lt;br /&gt;
&lt;br /&gt;
Requested compute resources such as (wall-)time, number of nodes and amount of memory are restricted and must fit into the boundaries imposed by the queues. The request for compute resources on the bwUniCluster 3.0 &amp;lt;font color=red&amp;gt;requires at least the specification of the &#039;&#039;&#039;queue&#039;&#039;&#039; and the &#039;&#039;&#039;time&#039;&#039;&#039;&amp;lt;/font&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Jobs &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Jobs can be run non-interactively as &#039;&#039;&#039;batch jobs&#039;&#039;&#039; or as &#039;&#039;&#039;interactive jobs&#039;&#039;&#039;.&amp;lt;br&amp;gt;&lt;br /&gt;
Submitting a batch job means that all steps of a compute project are defined in a Bash script. This Bash script is queued and executed as soon as the compute resources are available and allocated. Jobs are enqueued with the &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; command.&lt;br /&gt;
For interactive jobs, the resources are requested with the &amp;lt;code&amp;gt;salloc&amp;lt;/code&amp;gt; command. As soon as the computing resources are available and allocated, a command line prompt is returned on a compute node and the user can freely use the allocated resources.&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
&#039;&#039;&#039;Please remember:&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Heavy computations are not allowed on the login nodes&#039;&#039;&#039;.&amp;lt;br&amp;gt;Use a development or a regular job queue instead! Please refer to [[BwUniCluster3.0/Login#Allowed_Activities_on_Login_Nodes|Allowed Activities on Login Nodes]].&lt;br /&gt;
* &#039;&#039;&#039;Development queues&#039;&#039;&#039; are meant for &#039;&#039;&#039;development tasks&#039;&#039;&#039;.&amp;lt;br&amp;gt;Do not misuse these queues for regular, short-running jobs or chain jobs! Only one running job at a time is allowed, and the maximum queue length is limited to 3.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
= Queues on bwUniCluster 3.0 = &lt;br /&gt;
== Policy ==&lt;br /&gt;
&lt;br /&gt;
Computing time is provided in accordance with the &#039;&#039;&#039;fair share policy&#039;&#039;&#039;. The individual investment shares of the respective university and the resources already used by its members are taken into account. Furthermore, the following throttling policy is also active: the &#039;&#039;&#039;maximum number of physical cores&#039;&#039;&#039; in use at any given time is &#039;&#039;&#039;1920 per user&#039;&#039;&#039; (aggregated over all running jobs). This number corresponds to 30 nodes on the Ice Lake partition or 20 nodes on the standard partition. The aim is to minimize waiting times and maximize the number of users who can access computing time at the same time.&lt;br /&gt;
&lt;br /&gt;
== Regular Queues ==&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node-Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;cpu_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Ice Lake&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=72:00:00, nodes=30, mem=249600mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;cpu&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Standard&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=72:00:00, nodes=20, mem=380000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;highmem&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;High Memory&lt;br /&gt;
| mem-per-cpu=12090mb&lt;br /&gt;
| mem=380001mb&lt;br /&gt;
| time=72:00:00, nodes=4, mem=2300000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_h100&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=193300mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=72:00:00, nodes=12, mem=760000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_mi300&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU node&amp;lt;br/&amp;gt;AMD GPU x4&lt;br /&gt;
| mem-per-gpu=128200mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=72:00:00, nodes=1, mem=510000mb, ntasks-per-node=40, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_a100_il&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;gpu_h100_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;Ice Lake&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=127500mb&amp;lt;br/&amp;gt;cpus-per-gpu=16&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=48:00:00, nodes=9 (A100) / nodes=5 (H100), mem=510000mb, ntasks-per-node=64, (threads-per-core=2)&lt;br /&gt;
|}&lt;br /&gt;
Table 1: Regular Queues&lt;br /&gt;
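&lt;br /&gt;
For illustration (the script name is a placeholder): a job on one of the GPU queues must request at least one GPU, in line with the minimal resources listed above, e.g.:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ sbatch -p gpu_h100 --gres=gpu:1 -t 04:00:00 my_gpu_job.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;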
&lt;br /&gt;
== Short Queues ==&lt;br /&gt;
&amp;lt;p style=&amp;quot;color:red; &amp;quot;&amp;gt;&amp;lt;b&amp;gt;Queues with a short runtime of 30 minutes.&amp;lt;/b&amp;gt;&amp;lt;/p&amp;gt; &lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_a100_short&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;Ice Lake&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=94000mb&amp;lt;br/&amp;gt;cpus-per-gpu=12&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=12, mem=376000mb, ntasks-per-node=48, (threads-per-core=2)&lt;br /&gt;
|}&lt;br /&gt;
Table 2: Short Queues&lt;br /&gt;
&lt;br /&gt;
== Development Queues ==&lt;br /&gt;
These queues are only for development work, i.e. debugging or performance optimization.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_cpu_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Ice Lake&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=30, nodes=8, mem=249600mb, ntasks-per-node=64, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_cpu&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Standard&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=30, nodes=1, mem=380000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_gpu_h100&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=193300mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=1, mem=760000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_gpu_a100_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&amp;lt;br/&amp;gt;&lt;br /&gt;
| mem-per-gpu=127500mb&amp;lt;br/&amp;gt;cpus-per-gpu=16 &lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=1, mem=510000mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|}&lt;br /&gt;
Table 3: Development Queues&lt;br /&gt;
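&lt;br /&gt;
For example (queue and values chosen for illustration only), a short interactive development session within the limits above could be requested with:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc --partition=dev_cpu --time=20 --ntasks=4&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;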
&lt;br /&gt;
&lt;br /&gt;
The default resources of a queue define the number of tasks and the amount of memory that are used if these are not explicitly given with the sbatch command. The resource options &#039;&#039;--time&#039;&#039;, &#039;&#039;--ntasks&#039;&#039;, &#039;&#039;--nodes&#039;&#039;, &#039;&#039;--mem&#039;&#039; and &#039;&#039;--mem-per-cpu&#039;&#039; are described [[BwUniCluster3.0/Running_Jobs/Slurm|here]].&lt;br /&gt;
&lt;br /&gt;
== Check available resources: sinfo_t_idle ==&lt;br /&gt;
The Slurm command sinfo is used to view partition and node information of a system running Slurm. It incorporates downtime, reservations, and node state information when determining the available backfill window. The plain sinfo command can only be used by administrators.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
SCC therefore provides a special script, sinfo_t_idle, that shows how many nodes in each partition are idle and available for immediate use. Users can use this information to submit jobs that fit into these free resources and thus obtain quick job turnaround times.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* The following command displays which resources are available for immediate use in each partition.&lt;br /&gt;
&amp;lt;pre&amp;gt;$ sinfo_t_idle &lt;br /&gt;
Partition dev_cpu                 :      1 nodes idle&lt;br /&gt;
Partition cpu                     :      1 nodes idle&lt;br /&gt;
Partition highmem                 :      2 nodes idle&lt;br /&gt;
Partition dev_gpu_h100            :      0 nodes idle&lt;br /&gt;
Partition gpu_h100                :      0 nodes idle&lt;br /&gt;
Partition gpu_mi300               :      0 nodes idle&lt;br /&gt;
Partition dev_cpu_il              :      7 nodes idle&lt;br /&gt;
Partition cpu_il                  :      2 nodes idle&lt;br /&gt;
Partition dev_gpu_a100_il         :      1 nodes idle&lt;br /&gt;
Partition gpu_a100_il             :      0 nodes idle&lt;br /&gt;
Partition gpu_h100_il             :      1 nodes idle&lt;br /&gt;
Partition gpu_a100_short          :      0 nodes idle&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Running Jobs =&lt;br /&gt;
&lt;br /&gt;
== Slurm Commands (excerpt) ==&lt;br /&gt;
Important Slurm commands for non-administrators working on bwUniCluster 3.0.&lt;br /&gt;
{| width=850px class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Slurm commands !! Brief explanation&lt;br /&gt;
|-&lt;br /&gt;
| [[#Batch Jobs: sbatch|sbatch]] || Submits a job and puts it into the queue [https://slurm.schedmd.com/sbatch.html sbatch]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Interactive Jobs: salloc|salloc]] || Requests resources for an interactive job [https://slurm.schedmd.com/salloc.html salloc]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Monitor and manage jobs |scontrol show job]] || Displays detailed job state information [https://slurm.schedmd.com/scontrol.html scontrol]&lt;br /&gt;
|-&lt;br /&gt;
| [[#List of your submitted jobs : squeue|squeue]] || Displays information about active, eligible, blocked, and/or recently completed jobs [https://slurm.schedmd.com/squeue.html squeue]&lt;br /&gt;
|-&lt;br /&gt;
| [[#List of your submitted jobs : squeue|squeue --start]] || Returns the expected start time of a submitted job [https://slurm.schedmd.com/squeue.html squeue]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Check available resources: sinfo_t_idle|sinfo_t_idle]] || Shows which resources are available for immediate use [https://slurm.schedmd.com/sinfo.html sinfo]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Canceling own jobs : scancel|scancel]] || Cancels a job [https://slurm.schedmd.com/scancel.html scancel]&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
* [https://slurm.schedmd.com/tutorials.html  Slurm Tutorials]&lt;br /&gt;
* [https://slurm.schedmd.com/pdfs/summary.pdf  Slurm command/option summary (2 pages)]&lt;br /&gt;
* [https://slurm.schedmd.com/man_index.html  Slurm Commands]&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Batch Jobs: sbatch ==&lt;br /&gt;
&lt;br /&gt;
Batch jobs are submitted by using the command &#039;&#039;&#039;sbatch&#039;&#039;&#039;. The main purpose of the &#039;&#039;&#039;sbatch&#039;&#039;&#039; command is to specify the resources that are needed to run the job. &#039;&#039;&#039;sbatch&#039;&#039;&#039; will then queue the batch job. However, when the batch job starts depends on the availability of the requested resources and on your fair share value.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* The syntax and use of &#039;&#039;&#039;sbatch&#039;&#039;&#039; can be displayed via:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ man sbatch&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;sbatch&#039;&#039;&#039; options can be used from the command line or in your job script. Different defaults for some of these options are set depending on the queue and can be found [[BwUniCluster3.0/Slurm | here]].&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! colspan=&amp;quot;3&amp;quot; | sbatch Options&lt;br /&gt;
|-&lt;br /&gt;
! style=&amp;quot;width:8%&amp;quot;| Command line&lt;br /&gt;
! style=&amp;quot;width:9%&amp;quot;| Script&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Purpose&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -t, --time=&#039;&#039;time&#039;&#039;&lt;br /&gt;
| #SBATCH --time=&#039;&#039;time&#039;&#039;&lt;br /&gt;
| Wall clock time limit.&amp;lt;br&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -N, --nodes=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --nodes=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of nodes to be used.&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -n, --ntasks=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --ntasks=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of tasks to be launched.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --ntasks-per-node=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --ntasks-per-node=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Maximum count of tasks per node.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -c, --cpus-per-task=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --cpus-per-task=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of CPUs required per (MPI-)task.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --gres=gpu:&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --gres=gpu:&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of GPUs required per node.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mem=&#039;&#039;value_in_MB&#039;&#039;&lt;br /&gt;
| #SBATCH --mem=&#039;&#039;value_in_MB&#039;&#039; &lt;br /&gt;
| Memory in MegaByte per node. (You should omit the setting of this option.)&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mem-per-cpu=&#039;&#039;value_in_MB&#039;&#039;&lt;br /&gt;
| #SBATCH --mem-per-cpu=&#039;&#039;value_in_MB&#039;&#039; &lt;br /&gt;
| Minimum Memory required per allocated CPU. (You should omit the setting of this option.)&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --exclusive&lt;br /&gt;
| #SBATCH --exclusive &lt;br /&gt;
| The job allocates all CPUs and GPUs of the allocated nodes and does not share these nodes with other running jobs.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mail-type=&#039;&#039;type&#039;&#039;&lt;br /&gt;
| #SBATCH --mail-type=&#039;&#039;type&#039;&#039;&lt;br /&gt;
| Notify user by email when certain event types occur.&amp;lt;br&amp;gt;Valid type values are NONE, BEGIN, END, FAIL, REQUEUE, ALL.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mail-user=&#039;&#039;mail-address&#039;&#039;&lt;br /&gt;
| #SBATCH --mail-user=&#039;&#039;mail-address&#039;&#039;&lt;br /&gt;
|  The specified mail-address receives email notification of state changes as defined by --mail-type.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --output=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --output=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| File in which job output is stored. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --error=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --error=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| File in which job error messages are stored. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -J, --job-name=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --job-name=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| Job name.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --export=[ALL,] &#039;&#039;env-variables&#039;&#039;&lt;br /&gt;
| #SBATCH --export=[ALL,] &#039;&#039;env-variables&#039;&#039;&lt;br /&gt;
| Identifies which environment variables from the submission environment are propagated to the launched application. Default is ALL.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -A, --account=&#039;&#039;group-name&#039;&#039;&lt;br /&gt;
| #SBATCH --account=&#039;&#039;group-name&#039;&#039;&lt;br /&gt;
| Charge the resources used by this job to the specified group. You may need this option if your account is assigned to more than one group. The project group a job is accounted on is shown by the command &amp;quot;scontrol show job&amp;quot; behind &amp;quot;Account=&amp;quot;.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -p, --partition=&#039;&#039;queue-name&#039;&#039;&lt;br /&gt;
| #SBATCH --partition=&#039;&#039;queue-name&#039;&#039;&lt;br /&gt;
| Request a specific queue for the resource allocation.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --reservation=&#039;&#039;reservation-name&#039;&#039;&lt;br /&gt;
| #SBATCH --reservation=&#039;&#039;reservation-name&#039;&#039;&lt;br /&gt;
| Use a specific reservation for the resource allocation.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -C, --constraint=&#039;&#039;LSDF&#039;&#039;&lt;br /&gt;
| #SBATCH --constraint=LSDF&lt;br /&gt;
| Job constraint: LSDF file systems.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -C, --constraint=&#039;&#039;BEEOND (BEEOND_4MDS, BEEOND_MAXMDS)&#039;&#039;&lt;br /&gt;
| #SBATCH --constraint=BEEOND (BEEOND_4MDS, BEEOND_MAXMDS)&lt;br /&gt;
| Job constraint: BeeOND file system.&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
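&lt;br /&gt;
As a sketch only (the job name, file names and the e-mail address are placeholders), several of these options combined in a job script could look like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --partition=cpu&lt;br /&gt;
#SBATCH --time=10:00:00&lt;br /&gt;
#SBATCH --nodes=1&lt;br /&gt;
#SBATCH --ntasks-per-node=96&lt;br /&gt;
#SBATCH --job-name=my_analysis&lt;br /&gt;
#SBATCH --output=my_analysis-%j.out&lt;br /&gt;
#SBATCH --mail-type=END&lt;br /&gt;
#SBATCH --mail-user=first.last@example.org&lt;br /&gt;
&lt;br /&gt;
# placeholder executable; replace with your own workflow&lt;br /&gt;
./my_program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The &amp;lt;code&amp;gt;%j&amp;lt;/code&amp;gt; in the output file name is replaced by the job ID, and the script would be submitted with &amp;lt;code&amp;gt;sbatch my_analysis.sh&amp;lt;/code&amp;gt;.&lt;br /&gt;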
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Interactive Jobs: salloc ==&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 you are only allowed to run short jobs (&amp;lt;&amp;lt; 1 hour) with low memory requirements (&amp;lt;&amp;lt; 8 GByte) on the login nodes. If you want to run longer jobs and/or jobs requesting more than 8 GByte of memory, you must allocate resources for so-called interactive jobs with the command salloc on a login node. For example, for a serial application on a compute node that requires 5000 MByte of memory, with the interactive run limited to 2 hours, the following command has to be executed:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc -p cpu -n 1 -t 120 --mem=5000&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then you will get one core on a compute node within the partition &amp;quot;cpu&amp;quot;. After execution of this command &#039;&#039;&#039;DO NOT CLOSE&#039;&#039;&#039; your current terminal session but wait until the queueing system Slurm has granted you the requested resources on the compute system. You will be logged in automatically on the granted core! To run a serial program on the granted core you only have to type the name of the executable.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./&amp;lt;my_serial_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Please be aware that in this example your serial job must finish within 2 hours; otherwise it will be killed by the system at runtime.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
You can also start a graphical X11 terminal connected to the allocated resources, which remain available for 2 hours, with the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ xterm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that once the walltime limit has been reached, the resources, i.e. the compute node, will automatically be revoked.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
An interactive parallel application running on one or several compute nodes (e.g. 5 nodes with 96 cores each) usually also requires a certain amount of memory per node (e.g. 50 GByte) and a maximum time (e.g. 1 hour). For example, 5 such nodes can be allocated with the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc -p cpu -N 5 --ntasks-per-node=96 -t 01:00:00  --mem=50gb&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now you can run parallel jobs on 480 cores with 50 GByte of memory per node. Please be aware that you will be logged in on core 0 of the first node.&lt;br /&gt;
If you want access to another node, open a new terminal, connect it to bwUniCluster 3.0 as well, and type the following commands to&lt;br /&gt;
connect first to the running interactive job and then to a specific node:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ srun --jobid=XXXXXXXX --pty /bin/bash&lt;br /&gt;
$ srun --nodelist=uc3nXXX --pty /bin/bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
the job ID and the node list can be displayed.&lt;br /&gt;
&lt;br /&gt;
If you want to run MPI programs, simply type mpirun &amp;lt;program_name&amp;gt;. Your program will then run on all 480 cores. A very simple example of starting a parallel job is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can also start the debugger ddt with the following commands:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module add devel/ddt&lt;br /&gt;
$ ddt &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The above commands will execute the parallel program &amp;lt;my_mpi_program&amp;gt; on all available cores. You can also start parallel programs on a subset of the cores; an example of this is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun -n 50 &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you are using Intel MPI, you must start &amp;lt;my_mpi_program&amp;gt; with the command mpiexec.hydra (instead of mpirun).&lt;br /&gt;
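&lt;br /&gt;
For illustration (the program name is a placeholder):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpiexec.hydra &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;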
&lt;br /&gt;
== Interactive Computing with Jupyter ==&lt;br /&gt;
&lt;br /&gt;
== Monitor and manage jobs ==&lt;br /&gt;
&lt;br /&gt;
=== List of your submitted jobs : squeue ===&lt;br /&gt;
Displays information about your own active, pending and/or recently completed jobs; jobs of other users are not shown. The command squeue is explained in detail on the webpage https://slurm.schedmd.com/squeue.html or via the manpage (man squeue).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;squeue&#039;&#039; example on bwUniCluster 3.0 &amp;lt;small&amp;gt;(Only your own jobs are displayed!)&amp;lt;/small&amp;gt;.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue &lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
              1262       cpu     wrap ka_ab123  R       8:15      1 uc3n002&lt;br /&gt;
              1267 dev_gpu_h     wrap ka_ab123 PD       0:00      1 (Resources)&lt;br /&gt;
              1265   highmem     wrap ka_ab123  R       2:41      1 uc3n084&lt;br /&gt;
$ squeue -l&lt;br /&gt;
             JOBID PARTITION     NAME     USER    STATE       TIME TIME_LIMI  NODES NODELIST(REASON)&lt;br /&gt;
              1262       cpu     wrap ka_ab123  RUNNING       8:55     20:00      1 uc3n002&lt;br /&gt;
              1267 dev_gpu_h     wrap ka_ab123  PENDING       0:00     20:00      1 (Resources)&lt;br /&gt;
              1265   highmem     wrap ka_ab123  RUNNING       3:21     20:00      1 uc3n084&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Detailed job information : scontrol show job ===&lt;br /&gt;
scontrol show job displays detailed job state information and diagnostic output for all of your jobs or for a single specified job. Detailed information is available for active, pending and recently completed jobs. The command scontrol is explained in detail on the webpage https://slurm.schedmd.com/scontrol.html or via the manpage (man scontrol).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Display the state of all your jobs in normal mode: scontrol show job&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Display the state of a job with &amp;lt;jobid&amp;gt; in normal mode: scontrol show job &amp;lt;jobid&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here is an example from bwUniCluster 3.0:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue&lt;br /&gt;
JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
1262       cpu     wrap ka_zs040  R       1:12      1 uc3n002&lt;br /&gt;
&lt;br /&gt;
$&lt;br /&gt;
$ # now, see what&#039;s up with my running job with jobid 1262&lt;br /&gt;
$ &lt;br /&gt;
$ scontrol show job 1262&lt;br /&gt;
&lt;br /&gt;
JobId=1262 JobName=wrap&lt;br /&gt;
   UserId=ka_zs0402(241992) GroupId=ka_scc(12345) MCS_label=N/A&lt;br /&gt;
   Priority=4246 Nice=0 Account=ka QOS=normal&lt;br /&gt;
   JobState=RUNNING Reason=None Dependency=(null)&lt;br /&gt;
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0&lt;br /&gt;
   RunTime=00:00:37 TimeLimit=00:20:00 TimeMin=N/A&lt;br /&gt;
   SubmitTime=2025-04-04T10:01:30 EligibleTime=2025-04-04T10:01:30&lt;br /&gt;
   AccrueTime=2025-04-04T10:01:30&lt;br /&gt;
   StartTime=2025-04-04T10:01:31 EndTime=2025-04-04T10:21:31 Deadline=N/A&lt;br /&gt;
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2025-04-04T10:01:31 Scheduler=Main&lt;br /&gt;
   Partition=cpu AllocNode:Sid=uc3n999:2819841&lt;br /&gt;
   ReqNodeList=(null) ExcNodeList=(null)&lt;br /&gt;
   NodeList=uc3n002&lt;br /&gt;
   BatchHost=uc3n002&lt;br /&gt;
   NumNodes=1 NumCPUs=2 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*&lt;br /&gt;
   ReqTRES=cpu=1,mem=2000M,node=1,billing=1&lt;br /&gt;
   AllocTRES=cpu=2,mem=4000M,node=1,billing=2&lt;br /&gt;
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:1 CoreSpec=*&lt;br /&gt;
   MinCPUsNode=1 MinMemoryCPU=2000M MinTmpDiskNode=0&lt;br /&gt;
   Features=(null) DelayBoot=00:00:00&lt;br /&gt;
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)&lt;br /&gt;
   Command=(null)&lt;br /&gt;
   WorkDir=/pfs/data6/home/ka/ka_scc/ka_zs0402&lt;br /&gt;
   StdErr=/pfs/data6/home/ka/ka_scc/ka_zs0402/slurm-1262.out&lt;br /&gt;
   StdIn=/dev/null&lt;br /&gt;
   StdOut=/pfs/data6/home/ka/ka_scc/ka_zs0402/slurm-1262.out&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Each request to the Slurm workload manager generates load on the system. &amp;lt;p style=&amp;quot;color:red;&amp;quot;&amp;gt;&amp;lt;b&amp;gt;Therefore, do not use &amp;lt;code&amp;gt;squeue&amp;lt;/code&amp;gt; with a simple &amp;lt;code&amp;gt;watch&amp;lt;/code&amp;gt;.&amp;lt;/b&amp;gt;&amp;lt;/p&amp;gt; The smallest allowed polling interval is &amp;lt;b&amp;gt;30 seconds&amp;lt;/b&amp;gt;.&amp;lt;br&amp;gt;&lt;br /&gt;
Any violation of this rule will result in the offending task being terminated without notice.&lt;br /&gt;
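&lt;br /&gt;
If you do need to check your jobs periodically, use an interval that respects this limit, for example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ watch -n 60 squeue&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;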
&lt;br /&gt;
=== Canceling own jobs : scancel ===&lt;br /&gt;
The scancel command is used to cancel jobs. The command scancel is explained in detail on the webpage https://slurm.schedmd.com/scancel.html or via the manpage (man scancel). Its syntax is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ scancel [-i] &amp;lt;job-id&amp;gt;&lt;br /&gt;
$ scancel -t &amp;lt;job_state_name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
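&lt;br /&gt;
For example (the job ID is a placeholder): &amp;lt;code&amp;gt;scancel 1262&amp;lt;/code&amp;gt; cancels the job with ID 1262, and &amp;lt;code&amp;gt;scancel -t PENDING&amp;lt;/code&amp;gt; cancels all of your pending jobs.&lt;br /&gt;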
&lt;br /&gt;
= Slurm Options =&lt;br /&gt;
[[BwUniCluster3.0/Running_Jobs/Slurm | Detailed Slurm usage]]&lt;br /&gt;
&lt;br /&gt;
= Best Practices =&lt;br /&gt;
&lt;br /&gt;
== Step-by-Step example ==&lt;br /&gt;
&lt;br /&gt;
== Dos and Don&#039;ts ==&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;| Do not run squeue and other Slurm commands in loops or via &amp;quot;watch&amp;quot;, so as not to saturate the Slurm daemon with RPC requests.&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Policies&amp;diff=15560</id>
		<title>BwUniCluster3.0/Policies</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Policies&amp;diff=15560"/>
		<updated>2025-12-02T08:49:50Z</updated>

		<summary type="html">&lt;p&gt;S Braun: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Policies =&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;File system quotas&#039;&#039;&#039;&lt;br /&gt;
** HOME: &#039;&#039;&#039;500GB&#039;&#039;&#039;, &#039;&#039;&#039;5 million files (inodes)&#039;&#039;&#039;&lt;br /&gt;
** Workspace: &#039;&#039;&#039;40TB&#039;&#039;&#039;, &#039;&#039;&#039;20 million files (inodes)&#039;&#039;&#039;&lt;br /&gt;
** Throttling Policies: The &#039;&#039;&#039;maximum number of cores&#039;&#039;&#039; used at any given time by running jobs is 1920 per user (aggregated over all running jobs).&lt;br /&gt;
* &#039;&#039;&#039;Username and HOME directory for KIT users&#039;&#039;&#039;&lt;br /&gt;
** Like everyone else, KIT users&#039; usernames now have the two-character prefix of their home location: &#039;&#039;&#039;&amp;lt;code&amp;gt;ka_&amp;lt;/code&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
** The HOME directory for user &#039;&#039;ab1234&#039;&#039; would be: &#039;&#039;&#039;&amp;lt;code&amp;gt;/home/ka/ka_OE/ka_ab1234&amp;lt;/code&amp;gt;&#039;&#039;&#039; (OE: organizational unit)&lt;br /&gt;
** Login with SSH: &#039;&#039;&#039;&amp;lt;code&amp;gt;ssh ka_ab1234@uc3.scc.kit.edu&amp;lt;/code&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Access for KIT students&#039;&#039;&#039;&lt;br /&gt;
** KIT students can be granted access with their regular u-student account in the context of a lecture (cf. https://www.scc.kit.edu/servicedesk/formulare.php &amp;amp;rarr; Application Form for Students accounts on bwUniCluster).&lt;br /&gt;
** The account is only enabled &#039;&#039;&#039;during the lecture period&#039;&#039;&#039;. After the end of the semester, the accounts will be deprovisioned and the user data is deleted.&lt;br /&gt;
** A guest and partner account (GuP) is required for all other projects of KIT students on bwUniCluster 3.0.&lt;br /&gt;
* &#039;&#039;&#039;Allowed Activities on Login Nodes&#039;&#039;&#039;&lt;br /&gt;
** To guarantee usability for all users of the clusters, you must not run your compute jobs on the login nodes.&lt;br /&gt;
** Compute-intensive jobs must be submitted to the queuing system.&amp;lt;br/&amp;gt;&lt;br /&gt;
** &#039;&#039;&#039;Any compute job running on the login nodes will be terminated without any notice.&#039;&#039;&#039;&lt;br /&gt;
** Any long-running compilation or any long-running pre- or post-processing of batch jobs must also be submitted to the queuing system.&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Policies&amp;diff=15554</id>
		<title>BwUniCluster3.0/Policies</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Policies&amp;diff=15554"/>
		<updated>2025-12-02T08:40:12Z</updated>

		<summary type="html">&lt;p&gt;S Braun: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Policies =&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;File system quotas&#039;&#039;&#039;&lt;br /&gt;
** HOME: &#039;&#039;&#039;500GB&#039;&#039;&#039;, &#039;&#039;&#039;5 million files (inodes)&#039;&#039;&#039;&lt;br /&gt;
** Workspace: &#039;&#039;&#039;40TB&#039;&#039;&#039;, &#039;&#039;&#039;20 million files (inodes)&#039;&#039;&#039;&lt;br /&gt;
** Throttling Policies: The &#039;&#039;&#039;maximum number of cores&#039;&#039;&#039; used at any given time by running jobs is 1920 per user (aggregated over all running jobs).&lt;br /&gt;
* &#039;&#039;&#039;Username and HOME directory for KIT users&#039;&#039;&#039;&lt;br /&gt;
** Like everyone else, KIT users&#039; usernames now have the two-character prefix of their home location: &#039;&#039;&#039;&amp;lt;code&amp;gt;ka_&amp;lt;/code&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
** The HOME directory for user &#039;&#039;ab1234&#039;&#039; would be: &#039;&#039;&#039;&amp;lt;code&amp;gt;/home/ka/ka_OE/ka_ab1234&amp;lt;/code&amp;gt;&#039;&#039;&#039; (OE: organizational unit)&lt;br /&gt;
** Login with SSH: &#039;&#039;&#039;&amp;lt;code&amp;gt;ssh ka_ab1234@uc3.scc.kit.edu&amp;lt;/code&amp;gt;&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Access for KIT students&#039;&#039;&#039;&lt;br /&gt;
** KIT students can be granted access with their regular u-student account in the context of a lecture (cf. https://www.scc.kit.edu/servicedesk/formulare.php &amp;amp;rarr; Application Form for Students accounts on bwUniCluster).&lt;br /&gt;
** The account is only enabled &#039;&#039;&#039;during the lecture period&#039;&#039;&#039;. After the end of the semester, the accounts will be deprovisioned and the user data is deleted.&lt;br /&gt;
** A guest and partner account (GuP) is required for all other projects of KIT students on bwUniCluster 3.0.&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15552</id>
		<title>BwUniCluster3.0</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15552"/>
		<updated>2025-12-02T08:38:19Z</updated>

		<summary type="html">&lt;p&gt;S Braun: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## Picture of bwUniCluster - right side  ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## About bwUniCluster                    ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
The &#039;&#039;&#039;bwUniCluster 3.0+KIT-GFA-HPC 3&#039;&#039;&#039; is the joint high-performance computer system of Baden-Württemberg&#039;s Universities and Universities of Applied Sciences for &#039;&#039;&#039;general purpose and teaching&#039;&#039;&#039; and is located at the Scientific Computing Center (SCC) at Karlsruhe Institute of Technology (KIT). The bwUniCluster 3.0 complements the four bwForClusters and their dedicated scientific areas.&lt;br /&gt;
[[File:DSCF6485_rectangled_perspective.jpg|center|600px|frameless|alt=bwUniCluster3.0 |upright=1| bwUniCluster 3.0 ]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Maintenance Section     ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there is no upcoming maintenance&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Next maintenance&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
Due to regular maintenance work, the HPC system bwUniCluster 2 will not be available from&lt;br /&gt;
&lt;br /&gt;
21.05.2024 at 08:30 AM until 24.05.2024 at 15:00&lt;br /&gt;
&lt;br /&gt;
Please see the [[BwUniCluster2.0/Maintenance/2024-05|maintenance]] page for more information about planned upgrades and other changes.&lt;br /&gt;
|}&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: News section            ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there is no news&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Transition bwUniCluster 2.0 &amp;amp;rarr; bwUniCluster 3.0&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
&lt;br /&gt;
The HPC cluster bwUniCluster 3.0 is the successor of bwUniCluster 2.0. It features accelerated and CPU-only nodes, with the host system of both node types consisting of classic x86 processor architectures.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
To ensure that you can use the new system successfully and set up your working environment with ease, the following points should be noted.&lt;br /&gt;
&lt;br /&gt;
== Registration ==&lt;br /&gt;
All users who already have an entitlement on bwUniCluster 2.0 are authorized to access bwUniCluster 3.0. The user only needs to &#039;&#039;&#039;register for the new service&#039;&#039;&#039; at https://bwidm.scc.kit.edu .&lt;br /&gt;
&lt;br /&gt;
== Changes ==&lt;br /&gt;
&lt;br /&gt;
Hardware, software and the operating system have been updated and adapted to the latest standards. We would like to draw your attention in particular to the changes in policy, which must also be taken into account.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Changes to hardware, software and policy can be looked up here: [[BwUniCluster3.0/Data_Migration_Guide#Summary_of_changes|Summary of Changes]]&lt;br /&gt;
&lt;br /&gt;
== Migration ==&lt;br /&gt;
bwUniCluster 3.0 features a completely new file system. &#039;&#039;&#039;There is no automatic migration of user data!&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
The file systems of the old system and the login nodes will remain in operation for a period of &#039;&#039;&#039;3 months&#039;&#039;&#039; after the new system goes live (till July 6, 2025).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
In order to move data that is still needed, user software, and user specific settings from the old HOME directory to the new HOME directory, or to new workspaces, instructions are provided here: [[BwUniCluster3.0/Data_Migration_Guide#Migration_of_Data|Data Migration Guide]]&lt;br /&gt;
|}&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Training/Support section##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#eeeefe; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#dedefe; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Training &amp;amp; Support&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
* [[BwUniCluster3.0/Getting_Started|Getting Started]]&lt;br /&gt;
* [https://training.bwhpc.de E-Learning Courses]&lt;br /&gt;
* [[BwUniCluster3.0/Support|Support]]&lt;br /&gt;
* [[BwUniCluster3.0/FAQ|FAQ]]&lt;br /&gt;
* Send [[Feedback|Feedback]] about Wiki pages&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: User Documentation      ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#cef2e0; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | User Documentation&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Access: [[Registration/bwUniCluster|Registration]], [[Registration/Deregistration|Deregistration]], [[BwUniCluster3.0/Policies|Policies]]&lt;br /&gt;
* [[BwUniCluster3.0/Login|Login]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Client|SSH Clients]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Data_Transfer|Data Transfer]]&lt;br /&gt;
* [[BwUniCluster3.0/Hardware_and_Architecture|Hardware and Architecture]]&lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#Compute_resources|Compute Resources]] &lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#File_Systems|File Systems]] &lt;br /&gt;
* [[BwUniCluster3.0/Software|Cluster Specific Software]]&lt;br /&gt;
** [[BwUniCluster3.0/Containers|Using Containers]]&lt;br /&gt;
* [[BwUniCluster3.0/Running_Jobs|Running Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Batch_Jobs:_sbatch|Running Batch Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Interactive_Jobs:_salloc|Running Interactive Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Jupyter|Interactive Computing with Jupyter]]&lt;br /&gt;
* [[BwUniCluster3.0/Maintenance|Operational Changes]]&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Acknowledgement         ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#e6e9eb; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#d1dadf; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Cluster Funding&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Please [[BwUniCluster3.0/Acknowledgement|acknowledge]] bwUniCluster 3.0 in your publications.&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Policies&amp;diff=15551</id>
		<title>BwUniCluster3.0/Policies</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Policies&amp;diff=15551"/>
		<updated>2025-12-02T08:37:26Z</updated>

		<summary type="html">&lt;p&gt;S Braun: Created page with &amp;quot;= Policies =&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Policies =&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15549</id>
		<title>BwUniCluster3.0</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15549"/>
		<updated>2025-12-02T08:36:27Z</updated>

		<summary type="html">&lt;p&gt;S Braun: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## Picture of bwUniCluster - right side  ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## About bwUniCluster                    ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
The &#039;&#039;&#039;bwUniCluster 3.0+KIT-GFA-HPC 3&#039;&#039;&#039; is the joint high-performance computer system of Baden-Württemberg&#039;s Universities and Universities of Applied Sciences for &#039;&#039;&#039;general purpose and teaching&#039;&#039;&#039; and is located at the Scientific Computing Center (SCC) at Karlsruhe Institute of Technology (KIT). The bwUniCluster 3.0 complements the four bwForClusters and their dedicated scientific areas.&lt;br /&gt;
[[File:DSCF6485_rectangled_perspective.jpg|center|600px|frameless|alt=bwUniCluster3.0 |upright=1| bwUniCluster 3.0 ]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Maintenance Section     ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there is no upcoming maintenance&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Next maintenance&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
Due to regular maintenance work, the HPC system bwUniCluster 2 will not be available from&lt;br /&gt;
&lt;br /&gt;
21.05.2024 at 08:30 AM until 24.05.2024 at 15:00&lt;br /&gt;
&lt;br /&gt;
Please see the [[BwUniCluster2.0/Maintenance/2024-05|maintenance]] page for more information about planned upgrades and other changes.&lt;br /&gt;
|}&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: News section            ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there is no news&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Transition bwUniCluster 2.0 &amp;amp;rarr; bwUniCluster 3.0&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
&lt;br /&gt;
The HPC cluster bwUniCluster 3.0 is the successor of bwUniCluster 2.0. It features accelerated and CPU-only nodes, with the host system of both node types consisting of classic x86 processor architectures.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
To ensure that you can use the new system successfully and set up your working environment with ease, the following points should be noted.&lt;br /&gt;
&lt;br /&gt;
== Registration ==&lt;br /&gt;
All users who already have an entitlement on bwUniCluster 2.0 are authorized to access bwUniCluster 3.0. The user only needs to &#039;&#039;&#039;register for the new service&#039;&#039;&#039; at https://bwidm.scc.kit.edu .&lt;br /&gt;
&lt;br /&gt;
== Changes ==&lt;br /&gt;
&lt;br /&gt;
Hardware, software and the operating system have been updated and adapted to the latest standards. We would like to draw your attention in particular to the changes in policy, which must also be taken into account.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Changes to hardware, software and policy can be looked up here: [[BwUniCluster3.0/Data_Migration_Guide#Summary_of_changes|Summary of Changes]]&lt;br /&gt;
&lt;br /&gt;
== Migration ==&lt;br /&gt;
bwUniCluster 3.0 features a completely new file system. &#039;&#039;&#039;There is no automatic migration of user data!&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
The file systems of the old system and the login nodes will remain in operation for a period of &#039;&#039;&#039;3 months&#039;&#039;&#039; after the new system goes live (till July 6, 2025).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
In order to move data that is still needed, user software, and user specific settings from the old HOME directory to the new HOME directory, or to new workspaces, instructions are provided here: [[BwUniCluster3.0/Data_Migration_Guide#Migration_of_Data|Data Migration Guide]]&lt;br /&gt;
|}&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Training/Support section##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#eeeefe; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#dedefe; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Training &amp;amp; Support&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
* [[BwUniCluster3.0/Getting_Started|Getting Started]]&lt;br /&gt;
* [https://training.bwhpc.de E-Learning Courses]&lt;br /&gt;
* [[BwUniCluster3.0/Support|Support]]&lt;br /&gt;
* [[BwUniCluster3.0/FAQ|FAQ]]&lt;br /&gt;
* Send [[Feedback|Feedback]] about Wiki pages&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: User Documentation      ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#cef2e0; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | User Documentation&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Access: [[Registration/bwUniCluster|Registration]], [[Registration/Deregistration|Deregistration]], [[Registration/Deregistration|Policies]]&lt;br /&gt;
* [[BwUniCluster3.0/Login|Login]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Client|SSH Clients]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Data_Transfer|Data Transfer]]&lt;br /&gt;
* [[BwUniCluster3.0/Hardware_and_Architecture|Hardware and Architecture]]&lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#Compute_resources|Compute Resources]] &lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#File_Systems|File Systems]] &lt;br /&gt;
* [[BwUniCluster3.0/Software|Cluster Specific Software]]&lt;br /&gt;
** [[BwUniCluster3.0/Containers|Using Containers]]&lt;br /&gt;
* [[BwUniCluster3.0/Running_Jobs|Running Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Batch_Jobs:_sbatch|Running Batch Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Interactive_Jobs:_salloc|Running Interactive Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Jupyter|Interactive Computing with Jupyter]]&lt;br /&gt;
* [[BwUniCluster3.0/Maintenance|Operational Changes]]&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Acknowledgement         ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#e6e9eb; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#d1dadf; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Cluster Funding&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Please [[BwUniCluster3.0/Acknowledgement|acknowledge]] bwUniCluster 3.0 in your publications.&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15547</id>
		<title>BwUniCluster3.0</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0&amp;diff=15547"/>
		<updated>2025-12-02T08:35:01Z</updated>

		<summary type="html">&lt;p&gt;S Braun: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## Picture of bwUniCluster - right side  ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## About bwUniCluster                    ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
The &#039;&#039;&#039;bwUniCluster 3.0+KIT-GFA-HPC 3&#039;&#039;&#039; is the joint high-performance computer system of Baden-Württemberg&#039;s Universities and Universities of Applied Sciences for &#039;&#039;&#039;general purpose and teaching&#039;&#039;&#039; and is located at the Scientific Computing Center (SCC) at Karlsruhe Institute of Technology (KIT). The bwUniCluster 3.0 complements the four bwForClusters and their dedicated scientific areas.&lt;br /&gt;
[[File:DSCF6485_rectangled_perspective.jpg|center|600px|frameless|alt=bwUniCluster3.0 |upright=1| bwUniCluster 3.0 ]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Maintenance Section     ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there is no upcoming maintenance&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Next maintenance&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
Due to regular maintenance work, the HPC system bwUniCluster 2 will not be available from&lt;br /&gt;
&lt;br /&gt;
21.05.2024 at 08:30 AM until 24.05.2024 at 15:00&lt;br /&gt;
&lt;br /&gt;
Please see the [[BwUniCluster2.0/Maintenance/2024-05|maintenance]] page for more information about planned upgrades and other changes.&lt;br /&gt;
|}&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: News section            ##&lt;br /&gt;
###########################################&lt;br /&gt;
## Comment out full section if there is no news&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
{| style=&amp;quot;  background:#FEF4AB; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#FFE856; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Transition bwUniCluster 2.0 &amp;amp;rarr; bwUniCluster 3.0&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
&lt;br /&gt;
The HPC cluster bwUniCluster 3.0 is the successor of bwUniCluster 2.0. It features accelerated and CPU-only nodes, with the host system of both node types consisting of classic x86 processor architectures.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
To ensure that you can use the new system successfully and set up your working environment with ease, the following points should be noted.&lt;br /&gt;
&lt;br /&gt;
== Registration ==&lt;br /&gt;
All users who already have an entitlement on bwUniCluster 2.0 are authorized to access bwUniCluster 3.0. The user only needs to &#039;&#039;&#039;register for the new service&#039;&#039;&#039; at https://bwidm.scc.kit.edu .&lt;br /&gt;
&lt;br /&gt;
== Changes ==&lt;br /&gt;
&lt;br /&gt;
Hardware, software and the operating system have been updated and adapted to the latest standards. We would like to draw your attention in particular to the changes in policy, which must also be taken into account.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Changes to hardware, software and policy can be looked up here: [[BwUniCluster3.0/Data_Migration_Guide#Summary_of_changes|Summary of Changes]]&lt;br /&gt;
&lt;br /&gt;
== Migration ==&lt;br /&gt;
bwUniCluster 3.0 features a completely new file system. &#039;&#039;&#039;There is no automatic migration of user data!&#039;&#039;&#039;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
The file systems of the old system and the login nodes will remain in operation for a period of &#039;&#039;&#039;3 months&#039;&#039;&#039; after the new system goes live (till July 6, 2025).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
In order to move data that is still needed, user software, and user specific settings from the old HOME directory to the new HOME directory, or to new workspaces, instructions are provided here: [[BwUniCluster3.0/Data_Migration_Guide#Migration_of_Data|Data Migration Guide]]&lt;br /&gt;
|}&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Training/Support section##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#eeeefe; width:100%;&amp;quot; &lt;br /&gt;
| style=&amp;quot;padding:8px; background:#dedefe; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Training &amp;amp; Support&lt;br /&gt;
|-&lt;br /&gt;
| &lt;br /&gt;
* [[BwUniCluster3.0/Getting_Started|Getting Started]]&lt;br /&gt;
* [https://training.bwhpc.de E-Learning Courses]&lt;br /&gt;
* [[BwUniCluster3.0/Support|Support]]&lt;br /&gt;
* [[BwUniCluster3.0/FAQ|FAQ]]&lt;br /&gt;
* Send [[Feedback|Feedback]] about Wiki pages&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: User Documentation      ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#cef2e0; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | User Documentation&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Access: [[Registration/bwUniCluster|Registration]], [[Registration/Deregistration|Deregistration]]&lt;br /&gt;
* [[BwUniCluster3.0/Login|Login]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Client|SSH Clients]]&lt;br /&gt;
** [[BwUniCluster3.0/Login/Data_Transfer|Data Transfer]]&lt;br /&gt;
* [[BwUniCluster3.0/Hardware_and_Architecture|Hardware and Architecture]]&lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#Compute_resources|Compute Resources]] &lt;br /&gt;
** [[BwUniCluster3.0/Hardware_and_Architecture#File_Systems|File Systems]] &lt;br /&gt;
* [[BwUniCluster3.0/Software|Cluster Specific Software]]&lt;br /&gt;
** [[BwUniCluster3.0/Containers|Using Containers]]&lt;br /&gt;
* [[BwUniCluster3.0/Running_Jobs|Running Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Batch_Jobs:_sbatch|Running Batch Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Running_Jobs#Interactive_Jobs:_salloc|Running Interactive Jobs]]&lt;br /&gt;
** [[BwUniCluster3.0/Jupyter|Interactive Computing with Jupyter]]&lt;br /&gt;
* [[BwUniCluster3.0/Maintenance|Operational Changes]]&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
###########################################&lt;br /&gt;
## bwUniCluster: Acknowledgement         ##&lt;br /&gt;
###########################################&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
{| style=&amp;quot;  background:#e6e9eb; width:100%;&amp;quot;&lt;br /&gt;
| style=&amp;quot;padding:8px; background:#d1dadf; font-size:120%; font-weight:bold;  text-align:left&amp;quot; | Cluster Funding&lt;br /&gt;
|-&lt;br /&gt;
|&lt;br /&gt;
* Please [[BwUniCluster3.0/Acknowledgement|acknowledge]] bwUniCluster 3.0 in your publications.&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=Registration/SSH&amp;diff=15384</id>
		<title>Registration/SSH</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=Registration/SSH&amp;diff=15384"/>
		<updated>2025-11-07T13:21:13Z</updated>

		<summary type="html">&lt;p&gt;S Braun: /* Minimum requirements for SSH Keys */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
This process is only necessary for the bwUniCluster and the bwForCluster Helix and NEMO2.&lt;br /&gt;
On the other clusters, SSH keys can still be copied to the &amp;lt;code&amp;gt;authorized_keys&amp;lt;/code&amp;gt; file.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
= Registering SSH Keys with your Cluster =&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
Interactive SSH Keys are not valid all the time, but only for a few hours after the last 2-factor login.&lt;br /&gt;
They have to be &amp;quot;unlocked&amp;quot; by entering the OTP and service password.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;SSH Keys&#039;&#039;&#039; are a mechanism for logging into a computer system without having to enter a password. Instead of authenticating yourself with something you know (a password), you prove your identity by showing the server something you have (a cryptographic key).&lt;br /&gt;
&lt;br /&gt;
The usual process is the following:&lt;br /&gt;
&lt;br /&gt;
* The user generates a pair of SSH Keys, a private key and a public key, on their local system. The private key never leaves the local system.&lt;br /&gt;
&lt;br /&gt;
* The user then logs into the remote system using the remote system password and adds the public key to the file &amp;lt;code&amp;gt;~/.ssh/authorized_keys&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
* All following logins will no longer require the entry of the remote system password because the local system can prove to the remote system that it has a private key matching the public key on file.&lt;br /&gt;
&lt;br /&gt;
While SSH Keys have many advantages, the concept also has &#039;&#039;&#039;a number of issues&#039;&#039;&#039; which make it hard to handle them securely:&lt;br /&gt;
&lt;br /&gt;
* The private key on the local system is supposed to be protected by a strong passphrase. There is no possibility for the server to check if this is the case. Many users do not use a strong passphrase or do not use any passphrase at all. If such a private key is stolen, an attacker can immediately use it to access the remote system.&lt;br /&gt;
&lt;br /&gt;
* There is no concept of validity. Users are not forced to regularly generate new SSH Key pairs and replace the old ones. Often the same key pair is used for many years and the users have no overview of how many systems they have stored their SSH Keys on.&lt;br /&gt;
&lt;br /&gt;
* SSH Keys can be restricted so they can only be used to execute specific commands on the server, or to log in from specified IP addresses. Most users do not do this.&lt;br /&gt;
&lt;br /&gt;
To fix these issues &#039;&#039;&#039;it is no longer possible to self-manage your SSH Keys by adding them to the ~/.ssh/authorized_keys file&#039;&#039;&#039; on bwUniCluster/bwForCluster.&lt;br /&gt;
SSH Keys have to be managed through bwIDM/bwServices instead.&lt;br /&gt;
Existing authorized_keys files are ignored.&lt;br /&gt;
&lt;br /&gt;
== Minimum requirements for SSH Keys ==&lt;br /&gt;
&lt;br /&gt;
Algorithms and Key sizes:&lt;br /&gt;
&lt;br /&gt;
* 2048 bits or more for RSA&lt;br /&gt;
* 521 bits for ECDSA&lt;br /&gt;
* 256 bits (default) for ED25519&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Please set a strong passphrase for your private keys.&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
ECDSA-SK and ED25519-SK keys (for use with U2F/FIDO Hardware Tokens like Yubikeys) can currently only be used on NEMO2 and bwUniCluster 3.0.&lt;br /&gt;
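&lt;br /&gt;
A key pair meeting these requirements can, for example, be generated on your local system with OpenSSH (the file name is only an example; you will be prompted for a passphrase):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_bwhpc&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;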
&lt;br /&gt;
= Adding a new SSH Key =&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
* Newly added keys are valid for 180 days. After that, they are revoked and placed on a &amp;quot;revocation list&amp;quot; so that they cannot be reused.&lt;br /&gt;
* Copy only the contents of your public SSH key file to bwIDM/bwServices. The file ends with &amp;lt;code&amp;gt;.pub&amp;lt;/code&amp;gt; (e.g. &amp;lt;code&amp;gt;~/.ssh/&amp;lt;filename&amp;gt;.pub&amp;lt;/code&amp;gt;).&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;SSH keys&#039;&#039;&#039; are generally managed via the &#039;&#039;&#039;My SSH Pubkeys&#039;&#039;&#039; menu entry on the registration pages for the clusters.&lt;br /&gt;
Here you can add and revoke SSH keys. To add an SSH key, please follow these steps:&lt;br /&gt;
&lt;br /&gt;
1. &#039;&#039;&#039;Select the cluster&#039;&#039;&#039; for which you want to add an SSH key:&amp;lt;/br&amp;gt; &amp;amp;rarr; [https://login.bwidm.de/user/ssh-keys.xhtml &#039;&#039;&#039;bwUniCluster 3.0&#039;&#039;&#039;]&amp;lt;/br&amp;gt; &amp;amp;rarr; [https://bwservices.uni-heidelberg.de/user/ssh-keys.xhtml &#039;&#039;&#039;bwForCluster Helix&#039;&#039;&#039;]&amp;lt;/br&amp;gt; &amp;amp;rarr; [https://login.bwidm.de/user/ssh-keys.xhtml &#039;&#039;&#039;bwForCluster NEMO 2&#039;&#039;&#039;]&lt;br /&gt;
[[File:BwIDM-twofa.png|center|600px|thumb|My SSH Pubkeys.]]&lt;br /&gt;
&lt;br /&gt;
2. Click the &#039;&#039;&#039;Add SSH Key&#039;&#039;&#039; or &#039;&#039;&#039;SSH Key Hochladen&#039;&#039;&#039; button.&lt;br /&gt;
[[File:Bwunicluster 2.0 access ssh keys empty.png|center|400px|thumb|Add new SSH key.]]&lt;br /&gt;
&lt;br /&gt;
3. A new window will appear.&lt;br /&gt;
Enter a name for the key and paste your SSH public key (file &amp;lt;code&amp;gt;~/.ssh/&amp;lt;filename&amp;gt;.pub&amp;lt;/code&amp;gt;) into the box labelled &amp;quot;SSH Key:&amp;quot;.&lt;br /&gt;
Click on the button labelled &#039;&#039;&#039;Add&#039;&#039;&#039; or &#039;&#039;&#039;Hinzufügen&#039;&#039;&#039;.&lt;br /&gt;
[[File:Ssh-key.png|center|600px|thumb|Add new SSH key.]]&lt;br /&gt;
&lt;br /&gt;
4. If everything worked fine, your new key will show up in the user interface:&lt;br /&gt;
[[File:Ssh-success.png|center|800px|thumb|New SSH key added.]]&lt;br /&gt;
&lt;br /&gt;
Once you have added SSH keys to the system, you can bind them to one or more services to use either for interactive logins (&#039;&#039;&#039;Interactive key&#039;&#039;&#039;) or for automatic logins (&#039;&#039;&#039;Command key&#039;&#039;&#039;).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Registering an Interactive Key ==&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
Interactive SSH Keys are not valid all the time, but only for a few hours after the last 2-factor login.&lt;br /&gt;
They have to be &amp;quot;unlocked&amp;quot; by entering the OTP and service password.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Interactive Keys&#039;&#039;&#039; can be used to log into a system for interactive use.&lt;br /&gt;
Perform the following steps to register an interactive key:&lt;br /&gt;
&lt;br /&gt;
1. [[Registration/SSH#Adding_a_new_SSH_Key|&#039;&#039;&#039;Add a new interactive SSH key&#039;&#039;&#039;]] if you have not already done so.&lt;br /&gt;
&lt;br /&gt;
2. Select &#039;&#039;&#039;Registered services/Registrierte Dienste&#039;&#039;&#039; from the top menu and click &#039;&#039;&#039;Set SSH Key/SSH Key setzen&#039;&#039;&#039; for the cluster for which you want to use the SSH key.&lt;br /&gt;
[[File:BwIDM-registered.png|center|600px|thumb|Select Cluster for which you want to use the SSH key.]]&lt;br /&gt;
&lt;br /&gt;
3. The upper block displays the SSH keys currently registered for the service.&lt;br /&gt;
The bottom block displays all the public SSH keys associated with your account.&lt;br /&gt;
Find the SSH key you want to use and click &#039;&#039;&#039;Add/Hinzufügen&#039;&#039;&#039;.&lt;br /&gt;
[[File:Ssh-service-int.png|center|800px|thumb|Add SSH key to service.]]&lt;br /&gt;
&lt;br /&gt;
4. A new window appears.&lt;br /&gt;
Select &#039;&#039;&#039;Interactive&#039;&#039;&#039; as the usage type, enter an optional comment and click &#039;&#039;&#039;Add/Hinzufügen&#039;&#039;&#039;.&lt;br /&gt;
[[File:Ssh-int.png|center|600px|thumb|Add interactive SSH key to service.]]&lt;br /&gt;
&lt;br /&gt;
5. Your SSH key is now registered for interactive use with this service.&lt;br /&gt;
[[File:Ssh-service.png|center|800px|thumb|SSH key is now registered for interactive use.]]&lt;br /&gt;
&lt;br /&gt;
=== SSH Interactive Key valid after successful Login ===&lt;br /&gt;
&lt;br /&gt;
Interactive SSH Keys are not valid all the time, but only for a few hours after the last 2-factor login.&lt;br /&gt;
They have to be &amp;quot;unlocked&amp;quot; by entering the OTP and service password.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; style=&amp;quot;text-align:center;&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! style=&amp;quot;width:50%&amp;quot;| Cluster&lt;br /&gt;
! style=&amp;quot;width:50%&amp;quot;| Interactive SSH Key Validity&lt;br /&gt;
|-&lt;br /&gt;
!scope=&amp;quot;column&amp;quot;| bwUniCluster 3.0&lt;br /&gt;
| 8h&lt;br /&gt;
|-&lt;br /&gt;
!scope=&amp;quot;column&amp;quot;| bwForCluster Helix&lt;br /&gt;
| 12h&lt;br /&gt;
|-&lt;br /&gt;
!scope=&amp;quot;column&amp;quot;| bwForCluster NEMO 2&lt;br /&gt;
| 12h&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Registering a Command Key ==&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
SSH command keys are always valid and do not need to be unlocked with a 2-factor login.&lt;br /&gt;
This makes these keys extremely valuable to a potential attacker and poses a security risk.&lt;br /&gt;
Therefore, additional restrictions apply to these keys:&lt;br /&gt;
* They must be limited to a single command to be executed.&lt;br /&gt;
* They must be limited to a single IP address (e.g., the workflow server) or a small number of IP addresses (e.g., the institution&#039;s subnet).&lt;br /&gt;
* They must be reviewed and approved by a cluster administrator before they can be used.&lt;br /&gt;
* Validity is reduced to one month.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Command Keys&#039;&#039;&#039; can be used for automatic workflows.&lt;br /&gt;
If you want to use rsync, please read the [[Registration/SSH/rrsync|rrsync wiki]].&lt;br /&gt;
&lt;br /&gt;
Perform the following steps to register a &amp;quot;Command key&amp;quot; (in this example we use rrsync):&lt;br /&gt;
&lt;br /&gt;
1. [[Registration/SSH#Adding_a_new_SSH_Key|&#039;&#039;&#039;Add a new &amp;quot;SSH key&amp;quot;&#039;&#039;&#039;]] if you have not already done so.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
2. Select &#039;&#039;&#039;Registered services/Registrierte Dienste&#039;&#039;&#039; from the top menu and click &#039;&#039;&#039;Set SSH Key/SSH Key setzen&#039;&#039;&#039; for the cluster for which you want to use the SSH key.&lt;br /&gt;
[[File:BwIDM-registered.png|center|600px|thumb|Select Cluster for which you want to use the SSH key.]]&lt;br /&gt;
&lt;br /&gt;
3. The upper block displays the SSH keys currently registered for the service.&lt;br /&gt;
The bottom block displays all the public SSH keys associated with your account.&lt;br /&gt;
Find the SSH key you want to use and click &#039;&#039;&#039;Add/Hinzufügen&#039;&#039;&#039;.&lt;br /&gt;
[[File:Ssh-service-com.png|center|800px|thumb|Add SSH key to service.]]&lt;br /&gt;
&lt;br /&gt;
4. A new window appears.&lt;br /&gt;
Select &#039;&#039;&#039;Command&#039;&#039;&#039; as the usage type.&lt;br /&gt;
Type the full command with the full path, including all parameters, in the &#039;&#039;&#039;Command&#039;&#039;&#039; text box.&lt;br /&gt;
Specify a network address, list, or range in the &#039;&#039;&#039;From&#039;&#039;&#039; text field (see [https://man.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man8/sshd.8#from=_pattern-list_ man 8 sshd] for more info).&lt;br /&gt;
Please also provide a comment to speed up the approval process.&lt;br /&gt;
Click &#039;&#039;&#039;Add/Hinzufügen&#039;&#039;&#039;.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! | Example&lt;br /&gt;
|-&lt;br /&gt;
| If you want to register a command key to be able to transfer data automatically, please use the following string in the &#039;&#039;&#039;Command&#039;&#039;&#039; text field (please verify the path on the cluster first):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/usr[/local]/bin/rrsync -ro /home/aa/aa_bb/aa_abc1/&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
[[File:Ssh-com.png|center|600px|thumb|Add command SSH key to service.]]&lt;br /&gt;
&lt;br /&gt;
5. After the key has been added, it will be marked as &#039;&#039;&#039;Pending&#039;&#039;&#039;:&lt;br /&gt;
You will receive an e-mail as soon as the key has been approved and can be used.&lt;br /&gt;
[[File:Ssh-service.png|center|800px|thumb|SSH key is now registered for interactive use.]]&lt;br /&gt;
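&lt;br /&gt;
Once the key has been approved, it can only be used from the specified addresses and only for the configured command. As a rough sketch (not an official recipe; hostname, username and key file name are placeholders), a command key restricted to the rrsync example above could then be used from the client side like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ rsync -av -e &amp;quot;ssh -i ~/.ssh/&amp;lt;keyfile&amp;gt;&amp;quot; &amp;lt;username&amp;gt;@&amp;lt;cluster-hostname&amp;gt;:some/subdirectory/ /local/target/&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The path after the colon is interpreted relative to the directory configured in the rrsync command.&lt;br /&gt;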
&lt;br /&gt;
== Revoke/Delete SSH Key ==&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
Revoked keys are locked and can no longer be used.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;SSH keys&#039;&#039;&#039; are generally managed via the &#039;&#039;&#039;My SSH Pubkeys&#039;&#039;&#039; menu entry on the registration pages for the clusters.&lt;br /&gt;
Here you can add and revoke SSH keys. To revoke/delete a ssh key, please follow these steps:&lt;br /&gt;
&lt;br /&gt;
1. &#039;&#039;&#039;Select the cluster&#039;&#039;&#039; for which you want to delete the SSH key:&amp;lt;/br&amp;gt; &amp;amp;rarr; [https://login.bwidm.de/user/ssh-keys.xhtml &#039;&#039;&#039;bwUniCluster 3.0&#039;&#039;&#039;]&amp;lt;/br&amp;gt; &amp;amp;rarr; [https://bwservices.uni-heidelberg.de/user/ssh-keys.xhtml &#039;&#039;&#039;bwForCluster Helix&#039;&#039;&#039;]&amp;lt;/br&amp;gt; &amp;amp;rarr; [https://login.bwidm.de/user/ssh-keys.xhtml &#039;&#039;&#039;bwForCluster NEMO 2&#039;&#039;&#039;]&lt;br /&gt;
[[File:BwIDM-twofa.png|center|600px|thumb|My SSH Pubkeys.]]&lt;br /&gt;
&lt;br /&gt;
2. Click &#039;&#039;&#039;REVOKE/ZURÜCKZIEHEN&#039;&#039;&#039; next to the SSH key you want to revoke.&lt;br /&gt;
[[File:Ssh-success.png|center|800px|thumb|Revoke SSH key.]]&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Running_Jobs&amp;diff=15364</id>
		<title>BwUniCluster3.0/Running Jobs</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Running_Jobs&amp;diff=15364"/>
		<updated>2025-10-24T08:07:44Z</updated>

		<summary type="html">&lt;p&gt;S Braun: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
= Purpose and function of a queuing system =&lt;br /&gt;
&lt;br /&gt;
All compute activities on bwUniCluster 3.0 have to be performed on the compute nodes. Compute nodes are only available by requesting the corresponding resources via the queuing system. As soon as the requested resources are available, either automated tasks are executed via a batch script or the resources can be used interactively.&amp;lt;br&amp;gt;&lt;br /&gt;
For the general procedure, see [[Running_Calculations | Running Calculations]].&lt;br /&gt;
&lt;br /&gt;
== Job submission process ==&lt;br /&gt;
&lt;br /&gt;
bwUniCluster 3.0 uses the workload management software Slurm. Any job submission by the user therefore has to be performed with commands of the Slurm software. Slurm queues and runs user jobs based on fair sharing policies.&lt;br /&gt;
&lt;br /&gt;
== Slurm ==&lt;br /&gt;
&lt;br /&gt;
HPC Workload Manager on bwUniCluster 3.0 is Slurm.&lt;br /&gt;
Slurm is a cluster management and job scheduling system. Slurm has three key functions. &lt;br /&gt;
* It allocates access to resources (compute cores on nodes) to users for some duration of time so they can perform work. &lt;br /&gt;
* It provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. &lt;br /&gt;
* It arbitrates contention for resources by managing a queue of pending work.&lt;br /&gt;
&lt;br /&gt;
Any kind of calculation on the compute nodes of bwUniCluster 3.0 requires the user to define the calculation as a sequence of commands together with the required run time, number of CPU cores and main memory, and to submit all of this, i.e. the &#039;&#039;&#039;batch job&#039;&#039;&#039;, to the resource and workload manager.&lt;br /&gt;
&lt;br /&gt;
== Terms and definitions ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Partitions &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Slurm manages job queues for different &#039;&#039;&#039;partitions&#039;&#039;&#039;. Partitions are used to group similar node types (e.g. nodes with and without accelerators) and to enforce different access policies and resource limits.&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 there are different partitions:&lt;br /&gt;
&lt;br /&gt;
* CPU-only nodes&lt;br /&gt;
** 2-socket nodes, consisting of 2 Intel Ice Lake processors with 32 cores each or 2 AMD processors with 48 cores each&lt;br /&gt;
** 2-socket nodes with very high RAM capacity, consisting of 2 AMD processors with 48 cores each&lt;br /&gt;
* GPU-accelerated nodes&lt;br /&gt;
** 2-socket nodes with 4x NVIDIA A100 or 4x NVIDIA H100 GPUs&lt;br /&gt;
** 4-socket node with 4x AMD Instinct accelerator&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Queues &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Job &#039;&#039;&#039;queues&#039;&#039;&#039; are used to manage jobs that request access to shared but limited computing resources of a certain kind (partition).&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 there are different main types of queues:&lt;br /&gt;
* Regular queues&lt;br /&gt;
** cpu: Jobs that request CPU-only nodes.&lt;br /&gt;
** gpu: Jobs that request GPU-accelerated nodes.&lt;br /&gt;
* Development queues (dev)&lt;br /&gt;
** Short, usually interactive jobs that are used for developing, compiling and testing code and workflows. The intention behind development queues is to provide users with immediate access to compute resources without having to wait. They are the place to run short but compute-heavy tasks without affecting other users, as would happen on the login nodes.&lt;br /&gt;
&lt;br /&gt;
Requested compute resources such as (wall-)time, number of nodes and amount of memory are restricted and must fit into the boundaries imposed by the queues. The request for compute resources on the bwUniCluster 3.0 &amp;lt;font color=red&amp;gt;requires at least the specification of the &#039;&#039;&#039;queue&#039;&#039;&#039; and the &#039;&#039;&#039;time&#039;&#039;&#039;&amp;lt;/font&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Jobs &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Jobs can be run non-interactively as &#039;&#039;&#039;batch jobs&#039;&#039;&#039; or as &#039;&#039;&#039;interactive jobs&#039;&#039;&#039;.&amp;lt;br&amp;gt;&lt;br /&gt;
Submitting a batch job means that all steps of a compute project are defined in a Bash script. This Bash script is queued and executed as soon as the compute resources are available and allocated. Jobs are enqueued with the &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; command.&lt;br /&gt;
For interactive jobs, the resources are requested with the &amp;lt;code&amp;gt;salloc&amp;lt;/code&amp;gt; command. As soon as the computing resources are available and allocated, a command line prompt is returned on a compute node and the user can freely use the allocated resources.&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
&#039;&#039;&#039;Please remember:&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Heavy computations are not allowed on the login nodes&#039;&#039;&#039;.&amp;lt;br&amp;gt;Use a development or a regular job queue instead! Please refer to [[BwUniCluster3.0/Login#Allowed_Activities_on_Login_Nodes|Allowed Activities on Login Nodes]].&lt;br /&gt;
* &#039;&#039;&#039;Development queues&#039;&#039;&#039; are meant for &#039;&#039;&#039;development tasks&#039;&#039;&#039;.&amp;lt;br&amp;gt;Do not misuse these queues for regular, short-running jobs or chain jobs! Only one running job at a time is allowed, and the maximum queue length is reduced to 3.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
= Queues on bwUniCluster 3.0 = &lt;br /&gt;
== Policy ==&lt;br /&gt;
&lt;br /&gt;
The computing time is provided in accordance with the &#039;&#039;&#039;fair share policy&#039;&#039;&#039;. The individual investment shares of the respective universities and the resources already used by their members are taken into account. Furthermore, the following throttling policy is active: the &#039;&#039;&#039;maximum number of physical cores&#039;&#039;&#039; used at any given time by running jobs is &#039;&#039;&#039;1920 per user&#039;&#039;&#039; (aggregated over all running jobs). This corresponds to 30 nodes on the Ice Lake partition or 20 nodes on the standard partition. The aim is to minimize waiting times and to maximize the number of users who can access computing time at the same time.&lt;br /&gt;
&lt;br /&gt;
== Regular Queues ==&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node-Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;cpu_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Ice Lake&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=72:00:00, nodes=30, mem=249600mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;cpu&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Standard&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=72:00:00, nodes=20, mem=380000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;highmem&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;High Memory&lt;br /&gt;
| mem-per-cpu=12090mb&lt;br /&gt;
| mem=380001mb&lt;br /&gt;
| time=72:00:00, nodes=4, mem=2300000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_h100&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=193300mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=72:00:00, nodes=12, mem=760000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_mi300&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU node&amp;lt;br/&amp;gt;AMD GPU x4&lt;br /&gt;
| mem-per-gpu=128200mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=72:00:00, nodes=1, mem=510000mb, ntasks-per-node=40, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_a100_il&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;gpu_h100_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;Ice Lake&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=127500mb&amp;lt;br/&amp;gt;cpus-per-gpu=16&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=48:00:00, nodes=9(A100)/nodes=5(H100) , mem=510000mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|}&lt;br /&gt;
Table 1: Regular Queues&lt;br /&gt;
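&lt;br /&gt;
As an illustration only (the job script name is a placeholder; see the sbatch section below for details), a single H100 GPU could be requested for one hour roughly like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ sbatch -p gpu_h100 --gres=gpu:1 -t 01:00:00 &amp;lt;my_job_script&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;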
&lt;br /&gt;
== Short Queues ==&lt;br /&gt;
&amp;lt;p style=&amp;quot;color:red; &amp;quot;&amp;gt;&amp;lt;b&amp;gt;Queues with a short runtime of 30 minutes.&amp;lt;/b&amp;gt;&amp;lt;/p&amp;gt; &lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_a100_short&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;Ice Lake&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=94000mb&amp;lt;br/&amp;gt;cpus-per-gpu=12&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=12, mem=376000mb, ntasks-per-node=48, (threads-per-core=2)&lt;br /&gt;
|}&lt;br /&gt;
Table 2: Short Queues&lt;br /&gt;
&lt;br /&gt;
== Development Queues ==&lt;br /&gt;
These queues are intended only for development work, i.e. debugging or performance optimization.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_cpu_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Ice Lake&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=30, nodes=8, mem=249600mb, ntasks-per-node=64, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_cpu&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Standard&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=30, nodes=1, mem=380000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_gpu_h100&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=193300mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=1, mem=760000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_gpu_a100_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&amp;lt;br/&amp;gt;&lt;br /&gt;
| mem-per-gpu=127500mb&amp;lt;br/&amp;gt;cpus-per-gpu=16 &lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=1, mem=510000mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|}&lt;br /&gt;
Table 3: Development Queues&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The default resources of a queue define the number of tasks and the amount of memory if these are not explicitly given with the sbatch command. The resource options &#039;&#039;--time&#039;&#039;, &#039;&#039;--ntasks&#039;&#039;, &#039;&#039;--nodes&#039;&#039;, &#039;&#039;--mem&#039;&#039; and &#039;&#039;--mem-per-cpu&#039;&#039; are described [[BwUniCluster3.0/Running_Jobs/Slurm|here]].&lt;br /&gt;
&lt;br /&gt;
== Check available resources: sinfo_t_idle ==&lt;br /&gt;
The Slurm command sinfo is used to view partition and node information for a system running Slurm. It incorporates down time, reservations, and node state information in determining the available backfill window. The sinfo command can only be used by the administrator.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
SCC has prepared a special script (sinfo_t_idle) to find out how many processors are available for immediate use on the system. It is anticipated that users will use this information to submit jobs that meet these criteria and thus obtain quick job turnaround times. &lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* The following command displays what resources are available for immediate use for the whole partition.&lt;br /&gt;
&amp;lt;pre&amp;gt;$ sinfo_t_idle &lt;br /&gt;
Partition dev_cpu                 :      1 nodes idle&lt;br /&gt;
Partition cpu                     :      1 nodes idle&lt;br /&gt;
Partition highmem                 :      2 nodes idle&lt;br /&gt;
Partition dev_gpu_h100            :      0 nodes idle&lt;br /&gt;
Partition gpu_h100                :      0 nodes idle&lt;br /&gt;
Partition gpu_mi300               :      0 nodes idle&lt;br /&gt;
Partition dev_cpu_il              :      7 nodes idle&lt;br /&gt;
Partition cpu_il                  :      2 nodes idle&lt;br /&gt;
Partition dev_gpu_a100_il         :      1 nodes idle&lt;br /&gt;
Partition gpu_a100_il             :      0 nodes idle&lt;br /&gt;
Partition gpu_h100_il             :      1 nodes idle&lt;br /&gt;
Partition gpu_a100_short          :      0 nodes idle&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Running Jobs =&lt;br /&gt;
&lt;br /&gt;
== Slurm Commands (excerpt) ==&lt;br /&gt;
Important Slurm commands for non-administrators working on bwUniCluster 3.0.&lt;br /&gt;
{| width=850px class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Slurm commands !! Brief explanation&lt;br /&gt;
|-&lt;br /&gt;
| [[#Batch Jobs: sbatch|sbatch]] || Submits a job and puts it into the queue [[https://slurm.schedmd.com/sbatch.html sbatch]] &lt;br /&gt;
|-&lt;br /&gt;
| [[#Interactive Jobs: salloc|salloc]] || Requests resources for an interactive Job [[https://slurm.schedmd.com/salloc.html salloc]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Monitor and manage jobs |scontrol show job]] || Displays detailed job state information [[https://slurm.schedmd.com/scontrol.html scontrol]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#List of your submitted jobs : squeue|squeue]] || Displays information about active, eligible, blocked, and/or recently completed jobs [[https://slurm.schedmd.com/squeue.html squeue]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#List of your submitted jobs : squeue|squeue --start]] || Returns start time of submitted job [[https://slurm.schedmd.com/squeue.html squeue]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Check available resources: sinfo_t_idle|sinfo_t_idle]] || Shows what resources are available for immediate use [[https://slurm.schedmd.com/sinfo.html sinfo]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Canceling own jobs : scancel|scancel]] || Cancels a job [[https://slurm.schedmd.com/scancel.html scancel]]&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
* [https://slurm.schedmd.com/tutorials.html  Slurm Tutorials]&lt;br /&gt;
* [https://slurm.schedmd.com/pdfs/summary.pdf  Slurm command/option summary (2 pages)]&lt;br /&gt;
* [https://slurm.schedmd.com/man_index.html  Slurm Commands]&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Batch Jobs: sbatch ==&lt;br /&gt;
&lt;br /&gt;
Batch jobs are submitted by using the command &#039;&#039;&#039;sbatch&#039;&#039;&#039;. The main purpose of the &#039;&#039;&#039;sbatch&#039;&#039;&#039; command is to specify the resources that are needed to run the job. &#039;&#039;&#039;sbatch&#039;&#039;&#039; will then queue the batch job. However, the start of the batch job depends on the availability of the requested resources and the fair share value.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* The syntax and use of &#039;&#039;&#039;sbatch&#039;&#039;&#039; can be displayed via:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ man sbatch&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;sbatch&#039;&#039;&#039; options can be used from the command line or in your job script. Different defaults for some of these options are set based on the queue and can be found [[BwUniCluster3.0/Slurm | here]].&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! colspan=&amp;quot;3&amp;quot; | sbatch Options&lt;br /&gt;
|-&lt;br /&gt;
! style=&amp;quot;width:8%&amp;quot;| Command line&lt;br /&gt;
! style=&amp;quot;width:9%&amp;quot;| Script&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Purpose&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -t, --time=&#039;&#039;time&#039;&#039;&lt;br /&gt;
| #SBATCH --time=&#039;&#039;time&#039;&#039;&lt;br /&gt;
| Wall clock time limit.&amp;lt;br&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -N, --nodes=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --nodes=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of nodes to be used.&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -n, --ntasks=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --ntasks=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of tasks to be launched.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --ntasks-per-node=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --ntasks-per-node=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Maximum count of tasks per node.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -c, --cpus-per-task=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --cpus-per-task=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of CPUs required per (MPI-)task.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mem=&#039;&#039;value_in_MB&#039;&#039;&lt;br /&gt;
| #SBATCH --mem=&#039;&#039;value_in_MB&#039;&#039; &lt;br /&gt;
| Memory in megabytes (MB) per node. (You should omit setting this option.)&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mem-per-cpu=&#039;&#039;value_in_MB&#039;&#039;&lt;br /&gt;
| #SBATCH --mem-per-cpu=&#039;&#039;value_in_MB&#039;&#039; &lt;br /&gt;
| Minimum memory in megabytes (MB) required per allocated CPU. (You should omit setting this option.)&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --exclusive&lt;br /&gt;
| #SBATCH --exclusive &lt;br /&gt;
| The job allocates all CPUs and GPUs on the nodes. It will not share the node with other running jobs.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mail-type=&#039;&#039;type&#039;&#039;&lt;br /&gt;
| #SBATCH --mail-type=&#039;&#039;type&#039;&#039;&lt;br /&gt;
| Notify user by email when certain event types occur.&amp;lt;br&amp;gt;Valid type values are NONE, BEGIN, END, FAIL, REQUEUE, ALL.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mail-user=&#039;&#039;mail-address&#039;&#039;&lt;br /&gt;
| #SBATCH --mail-user=&#039;&#039;mail-address&#039;&#039;&lt;br /&gt;
|  The specified mail-address receives email notification of state changes as defined by --mail-type.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --output=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --output=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| File in which job output is stored. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --error=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --error=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| File in which job error messages are stored. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -J, --job-name=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --job-name=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| Job name.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --export=[ALL,] &#039;&#039;env-variables&#039;&#039;&lt;br /&gt;
| #SBATCH --export=[ALL,] &#039;&#039;env-variables&#039;&#039;&lt;br /&gt;
| Identifies which environment variables from the submission environment are propagated to the launched application. Default is ALL.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -A, --account=&#039;&#039;group-name&#039;&#039;&lt;br /&gt;
| #SBATCH --account=&#039;&#039;group-name&#039;&#039;&lt;br /&gt;
| Charge the resources used by this job to the specified group. You may need this option if your account is assigned to more than one group. The command &amp;quot;scontrol show job&amp;quot; shows the project group the job is accounted on behind &amp;quot;Account=&amp;quot;. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -p, --partition=&#039;&#039;queue-name&#039;&#039;&lt;br /&gt;
| #SBATCH --partition=&#039;&#039;queue-name&#039;&#039;&lt;br /&gt;
| Request a specific queue for the resource allocation.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --reservation=&#039;&#039;reservation-name&#039;&#039;&lt;br /&gt;
| #SBATCH --reservation=&#039;&#039;reservation-name&#039;&#039;&lt;br /&gt;
| Use a specific reservation for the resource allocation.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -C, --constraint=&#039;&#039;LSDF&#039;&#039;&lt;br /&gt;
| #SBATCH --constraint=LSDF&lt;br /&gt;
| Job constraint LSDF filesystems.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -C, --constraint=&#039;&#039;BEEOND (BEEOND_4MDS, BEEOND_MAXMDS)&#039;&#039;&lt;br /&gt;
| #SBATCH --constraint=BEEOND (BEEOND_4MDS, BEEOND_MAXMDS)&lt;br /&gt;
| Job constraint BeeOND filesystem.&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
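As an illustration, a minimal job script might look like the following sketch (queue, resources and file names are only examples and must be adapted to your needs):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --partition=cpu&lt;br /&gt;
#SBATCH --time=00:10:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --mem-per-cpu=2000mb&lt;br /&gt;
#SBATCH --job-name=example_job&lt;br /&gt;
#SBATCH --output=example_job-%j.out&lt;br /&gt;
&lt;br /&gt;
# The actual commands of the job follow the #SBATCH header&lt;br /&gt;
echo &amp;quot;Running on $(hostname)&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Such a script would then be submitted with:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ sbatch example_job.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;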
&lt;br /&gt;
== Interactive Jobs: salloc ==&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 you are only allowed to run short jobs (&amp;lt;&amp;lt; 1 hour) with small memory requirements (&amp;lt;&amp;lt; 8 GByte) on the login nodes. If you want to run longer jobs and/or jobs that request more than 8 GByte of memory, you must allocate resources for so-called interactive jobs with the command salloc on a login node. For a serial application on a compute node that requires 5000 MByte of memory, with the interactive run limited to 2 hours, the following command has to be executed:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc -p cpu -n 1 -t 120 --mem=5000&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then you will get one core on a compute node within the partition &amp;quot;cpu&amp;quot;. After execution of this command &#039;&#039;&#039;DO NOT CLOSE&#039;&#039;&#039; your current terminal session but wait until the queueing system Slurm has granted you the requested resources on the compute system. You will be logged in automatically on the granted core! To run a serial program on the granted core you only have to type the name of the executable.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./&amp;lt;my_serial_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Please be aware that your serial job must run less than 2 hours in this example, else the job will be killed during runtime by the system. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
You can now also start a graphical X11 terminal connected to the dedicated resource, which is available for 2 hours. You can start it with the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ xterm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that, once the walltime limit has been reached the resources - i.e. the compute node - will automatically be revoked.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
An interactive parallel application running on one or several compute nodes (e.g. 5 nodes with 96 cores each) usually requires a certain amount of memory per node (e.g. 50 GByte) and a maximum time (e.g. 1 hour). For example, 5 nodes can be allocated with the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc -p cpu -N 5 --ntasks-per-node=96 -t 01:00:00  --mem=50gb&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now you can run parallel jobs on 480 cores requiring 50 GByte of memory per node. Please be aware that you will be logged in on core 0 of the first node.&lt;br /&gt;
If you want to have access to another node you have to open a new terminal, connect it also to bwUniCluster 3.0 and type the following commands to&lt;br /&gt;
connect to the running interactive job and then to a specific node:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ srun --jobid=XXXXXXXX --pty /bin/bash&lt;br /&gt;
$ srun --nodelist=uc3nXXX --pty /bin/bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
the jobid and the nodelist can be shown.&lt;br /&gt;
&lt;br /&gt;
If you want to run MPI programs, you can do so by simply typing mpirun &amp;lt;program_name&amp;gt;. Your program will then be run on 480 cores. A very simple example for starting a parallel job is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can also start the debugger ddt by the commands:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module add devel/ddt&lt;br /&gt;
$ ddt &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The above commands will execute the parallel program &amp;lt;my_mpi_program&amp;gt; on all available cores. You can also start parallel programs on a subset of cores; an example for this can be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun -n 50 &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you are using Intel MPI you must start &amp;lt;my_mpi_program&amp;gt; by the command mpiexec.hydra (instead of mpirun).&lt;br /&gt;
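&lt;br /&gt;
For example, following the same pattern as the mpirun calls above:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpiexec.hydra -n 50 &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;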
&lt;br /&gt;
== Interactive Computing with Jupyter ==&lt;br /&gt;
&lt;br /&gt;
== Monitor and manage jobs ==&lt;br /&gt;
&lt;br /&gt;
=== List of your submitted jobs : squeue ===&lt;br /&gt;
Displays information about your own active, pending and/or recently completed jobs. The command squeue is explained in detail on the webpage https://slurm.schedmd.com/squeue.html or via manpage (man squeue).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;squeue&#039;&#039; example on bwUniCluster 3.0 &amp;lt;small&amp;gt;(Only your own jobs are displayed!)&amp;lt;/small&amp;gt;.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue &lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
              1262       cpu     wrap ka_ab123  R       8:15      1 uc3n002&lt;br /&gt;
              1267 dev_gpu_h     wrap ka_ab123 PD       0:00      1 (Resources)&lt;br /&gt;
              1265   highmem     wrap ka_ab123  R       2:41      1 uc3n084&lt;br /&gt;
$ squeue -l&lt;br /&gt;
             JOBID PARTITION     NAME     USER    STATE       TIME TIME_LIMI  NODES NODELIST(REASON)&lt;br /&gt;
              1262       cpu     wrap ka_ab123  RUNNING       8:55     20:00      1 uc3n002&lt;br /&gt;
              1267 dev_gpu_h     wrap ka_ab123  PENDING       0:00     20:00      1 (Resources)&lt;br /&gt;
              1265   highmem     wrap ka_ab123  RUNNING       3:21     20:00      1 uc3n084&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Detailed job information : scontrol show job ===&lt;br /&gt;
scontrol show job displays detailed job state information and diagnostic output for all of your jobs or for a specified job. Detailed information is available for active, pending and recently completed jobs. The command scontrol is explained in detail on the webpage https://slurm.schedmd.com/scontrol.html or via manpage (man scontrol). &lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Display the state of all your jobs in normal mode: scontrol show job&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Display the state of a job with &amp;lt;jobid&amp;gt; in normal mode: scontrol show job &amp;lt;jobid&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here is an example from bwUniCluster 3.0:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue&lt;br /&gt;
JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
1262       cpu     wrap ka_zs040  R       1:12      1 uc3n002&lt;br /&gt;
&lt;br /&gt;
$&lt;br /&gt;
$ # now, see what&#039;s up with my pending job with jobid 1262&lt;br /&gt;
$ &lt;br /&gt;
$ scontrol show job 1262&lt;br /&gt;
&lt;br /&gt;
JobId=1262 JobName=wrap&lt;br /&gt;
   UserId=ka_zs0402(241992) GroupId=ka_scc(12345) MCS_label=N/A&lt;br /&gt;
   Priority=4246 Nice=0 Account=ka QOS=normal&lt;br /&gt;
   JobState=RUNNING Reason=None Dependency=(null)&lt;br /&gt;
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0&lt;br /&gt;
   RunTime=00:00:37 TimeLimit=00:20:00 TimeMin=N/A&lt;br /&gt;
   SubmitTime=2025-04-04T10:01:30 EligibleTime=2025-04-04T10:01:30&lt;br /&gt;
   AccrueTime=2025-04-04T10:01:30&lt;br /&gt;
   StartTime=2025-04-04T10:01:31 EndTime=2025-04-04T10:21:31 Deadline=N/A&lt;br /&gt;
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2025-04-04T10:01:31 Scheduler=Main&lt;br /&gt;
   Partition=cpu AllocNode:Sid=uc3n999:2819841&lt;br /&gt;
   ReqNodeList=(null) ExcNodeList=(null)&lt;br /&gt;
   NodeList=uc3n002&lt;br /&gt;
   BatchHost=uc3n002&lt;br /&gt;
   NumNodes=1 NumCPUs=2 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*&lt;br /&gt;
   ReqTRES=cpu=1,mem=2000M,node=1,billing=1&lt;br /&gt;
   AllocTRES=cpu=2,mem=4000M,node=1,billing=2&lt;br /&gt;
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:1 CoreSpec=*&lt;br /&gt;
   MinCPUsNode=1 MinMemoryCPU=2000M MinTmpDiskNode=0&lt;br /&gt;
   Features=(null) DelayBoot=00:00:00&lt;br /&gt;
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)&lt;br /&gt;
   Command=(null)&lt;br /&gt;
   WorkDir=/pfs/data6/home/ka/ka_scc/ka_zs0402&lt;br /&gt;
   StdErr=/pfs/data6/home/ka/ka_scc/ka_zs0402/slurm-1262.out&lt;br /&gt;
   StdIn=/dev/null&lt;br /&gt;
   StdOut=/pfs/data6/home/ka/ka_scc/ka_zs0402/slurm-1262.out&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Each request to the Slurm workload manager generates load. &amp;lt;p style=&amp;quot;color:red;&amp;quot;&amp;gt;&amp;lt;b&amp;gt;Therefore, do not use &amp;lt;code&amp;gt;squeue&amp;lt;/code&amp;gt; with a simple &amp;lt;code&amp;gt;watch&amp;lt;/code&amp;gt;.&amp;lt;/b&amp;gt;&amp;lt;/p&amp;gt; The smallest allowed time interval is &amp;lt;b&amp;gt;30 seconds&amp;lt;/b&amp;gt;.&amp;lt;br&amp;gt;&lt;br /&gt;
Any violation of this rule will result in the task being terminated without notice.&lt;br /&gt;
&lt;br /&gt;
=== Canceling own jobs : scancel ===&lt;br /&gt;
The scancel command is used to cancel jobs. The command scancel is explained in detail on the webpage https://slurm.schedmd.com/scancel.html or via manpage (man scancel). The command is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ scancel [-i] &amp;lt;job-id&amp;gt;&lt;br /&gt;
$ scancel -t &amp;lt;job_state_name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
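&lt;br /&gt;
For example, to cancel the job with job ID 1262 from the listings above:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ scancel 1262&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;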
&lt;br /&gt;
= Slurm Options =&lt;br /&gt;
[[BwUniCluster3.0/Running_Jobs/Slurm | Detailed Slurm usage]]&lt;br /&gt;
&lt;br /&gt;
= Best Practices =&lt;br /&gt;
&lt;br /&gt;
== Step-by-Step example==&lt;br /&gt;
&lt;br /&gt;
== Dos and Don&#039;ts ==&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;| Do not run squeue and other Slurm commands in loops or with &amp;quot;watch&amp;quot;, so as not to saturate the Slurm daemon with RPC requests.&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Running_Jobs&amp;diff=15363</id>
		<title>BwUniCluster3.0/Running Jobs</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Running_Jobs&amp;diff=15363"/>
		<updated>2025-10-24T08:06:40Z</updated>

		<summary type="html">&lt;p&gt;S Braun: /* Detailed job information : scontrol show job */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
= Purpose and function of a queuing system =&lt;br /&gt;
&lt;br /&gt;
All compute activities on bwUniCluster 3.0 have to be performed on the compute nodes. Compute nodes are only available by requesting the corresponding resources via the queuing system. As soon as the requested resources are available, either automated tasks are executed via a batch script or the resources can be used interactively.&amp;lt;br&amp;gt;&lt;br /&gt;
For the general procedure, see [[Running_Calculations | Running Calculations]].&lt;br /&gt;
&lt;br /&gt;
== Job submission process ==&lt;br /&gt;
&lt;br /&gt;
bwUniCluster 3.0 uses the workload management software Slurm. Any job submission by the user therefore has to be performed with commands of the Slurm software. Slurm queues and runs user jobs based on fair sharing policies.&lt;br /&gt;
&lt;br /&gt;
== Slurm ==&lt;br /&gt;
&lt;br /&gt;
HPC Workload Manager on bwUniCluster 3.0 is Slurm.&lt;br /&gt;
Slurm is a cluster management and job scheduling system. Slurm has three key functions. &lt;br /&gt;
* It allocates access to resources (compute cores on nodes) to users for some duration of time so they can perform work. &lt;br /&gt;
* It provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. &lt;br /&gt;
* It arbitrates contention for resources by managing a queue of pending work.&lt;br /&gt;
&lt;br /&gt;
Any kind of calculation on the compute nodes of bwUniCluster 3.0 requires the user to define the calculation as a sequence of commands together with the required run time, number of CPU cores and main memory, and to submit all of this, i.e. the &#039;&#039;&#039;batch job&#039;&#039;&#039;, to the resource and workload manager.&lt;br /&gt;
&lt;br /&gt;
== Terms and definitions ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Partitions &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Slurm manages job queues for different &#039;&#039;&#039;partitions&#039;&#039;&#039;. Partitions are used to group similar node types (e.g. nodes with and without accelerators) and to enforce different access policies and resource limits.&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 there are different partitions:&lt;br /&gt;
&lt;br /&gt;
* CPU-only nodes&lt;br /&gt;
** 2-socket nodes, consisting of 2 Intel Ice Lake processors with 32 cores each or 2 AMD processors with 48 cores each&lt;br /&gt;
** 2-socket nodes with very high RAM capacity, consisting of 2 AMD processors with 48 cores each&lt;br /&gt;
* GPU-accelerated nodes&lt;br /&gt;
** 2-socket nodes with 4x NVIDIA A100 or 4x NVIDIA H100 GPUs&lt;br /&gt;
** 4-socket node with 4x AMD Instinct accelerator&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Queues &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Job &#039;&#039;&#039;queues&#039;&#039;&#039; are used to manage jobs that request access to shared but limited computing resources of a certain kind (partition).&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 there are different main types of queues:&lt;br /&gt;
* Regular queues&lt;br /&gt;
** cpu: Jobs that request CPU-only nodes.&lt;br /&gt;
** gpu: Jobs that request GPU-accelerated nodes.&lt;br /&gt;
* Development queues (dev)&lt;br /&gt;
** Short, usually interactive jobs that are used for developing, compiling and testing code and workflows. The intention behind development queues is to provide users with immediate access to compute resources without having to wait. They are the place to run short but compute-heavy tasks without affecting other users, as would happen on the login nodes.&lt;br /&gt;
&lt;br /&gt;
Requested compute resources such as (wall-)time, number of nodes and amount of memory are restricted and must fit into the boundaries imposed by the queues. The request for compute resources on the bwUniCluster 3.0 &amp;lt;font color=red&amp;gt;requires at least the specification of the &#039;&#039;&#039;queue&#039;&#039;&#039; and the &#039;&#039;&#039;time&#039;&#039;&#039;&amp;lt;/font&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Jobs &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Jobs can be run non-interactively as &#039;&#039;&#039;batch jobs&#039;&#039;&#039; or as &#039;&#039;&#039;interactive jobs&#039;&#039;&#039;.&amp;lt;br&amp;gt;&lt;br /&gt;
Submitting a batch job means that all steps of a compute project are defined in a Bash script. This Bash script is queued and executed as soon as the compute resources are available and allocated. Jobs are enqueued with the &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; command.&lt;br /&gt;
For interactive jobs, the resources are requested with the &amp;lt;code&amp;gt;salloc&amp;lt;/code&amp;gt; command. As soon as the computing resources are available and allocated, a command line prompt is returned on a compute node and the user can freely use the allocated resources.&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
&#039;&#039;&#039;Please remember:&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Heavy computations are not allowed on the login nodes&#039;&#039;&#039;.&amp;lt;br&amp;gt;Use a development or a regular job queue instead! Please refer to [[BwUniCluster3.0/Login#Allowed_Activities_on_Login_Nodes|Allowed Activities on Login Nodes]].&lt;br /&gt;
* &#039;&#039;&#039;Development queues&#039;&#039;&#039; are meant for &#039;&#039;&#039;development tasks&#039;&#039;&#039;.&amp;lt;br&amp;gt;Do not misuse these queues for regular, short-running jobs or chain jobs! Only one running job at a time is allowed, and the maximum queue length is reduced to 3.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
= Queues on bwUniCluster 3.0 = &lt;br /&gt;
== Policy ==&lt;br /&gt;
&lt;br /&gt;
The computing time is provided in accordance with the &#039;&#039;&#039;fair share policy&#039;&#039;&#039;. The individual investment shares of the respective universities and the resources already used by their members are taken into account. Furthermore, the following throttling policy is active: the &#039;&#039;&#039;maximum number of physical cores&#039;&#039;&#039; used at any given time by running jobs is &#039;&#039;&#039;1920 per user&#039;&#039;&#039; (aggregated over all running jobs). This corresponds to 30 nodes on the Ice Lake partition or 20 nodes on the standard partition. The aim is to minimize waiting times and to maximize the number of users who can access computing time at the same time.&lt;br /&gt;
&lt;br /&gt;
== Regular Queues ==&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node-Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;cpu_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Ice Lake&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=72:00:00, nodes=30, mem=249600mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;cpu&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Standard&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=72:00:00, nodes=20, mem=380000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;highmem&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;High Memory&lt;br /&gt;
| mem-per-cpu=12090mb&lt;br /&gt;
| mem=380001mb&lt;br /&gt;
| time=72:00:00, nodes=4, mem=2300000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_h100&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=193300mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=72:00:00, nodes=12, mem=760000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_mi300&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU node&amp;lt;br/&amp;gt;AMD GPU x4&lt;br /&gt;
| mem-per-gpu=128200mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=72:00:00, nodes=1, mem=510000mb, ntasks-per-node=40, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_a100_il&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;gpu_h100_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;Ice Lake&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=127500mb&amp;lt;br/&amp;gt;cpus-per-gpu=16&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=48:00:00, nodes=9(A100)/nodes=5(H100) , mem=510000mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|}&lt;br /&gt;
Table 1: Regular Queues&lt;br /&gt;
&lt;br /&gt;
== Short Queues ==&lt;br /&gt;
&amp;lt;p style=&amp;quot;color:red; &amp;quot;&amp;gt;&amp;lt;b&amp;gt;Queues with a short maximum runtime of 30 minutes.&amp;lt;/b&amp;gt;&amp;lt;/p&amp;gt; &lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_a100_short&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;Ice Lake&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=94000mb&amp;lt;br/&amp;gt;cpus-per-gpu=12&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=12, mem=376000mb, ntasks-per-node=48, (threads-per-core=2)&lt;br /&gt;
|}&lt;br /&gt;
Table 2: Short Queues&lt;br /&gt;
&lt;br /&gt;
== Development Queues ==&lt;br /&gt;
Only for development, i.e. debugging or performance optimization ...&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_cpu_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Ice Lake&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=30, nodes=8, mem=249600mb, ntasks-per-node=64, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_cpu&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Standard&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=30, nodes=1, mem=380000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_gpu_h100&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=193300mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=1, mem=760000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_gpu_a100_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&amp;lt;br/&amp;gt;&lt;br /&gt;
| mem-per-gpu=127500mb&amp;lt;br/&amp;gt;cpus-per-gpu=16 &lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=1, mem=510000mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|}&lt;br /&gt;
Table 3: Development Queues&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The default resources of a queue class define the number of tasks and the memory if these are not explicitly given with the sbatch command. The resource options &#039;&#039;--time&#039;&#039;, &#039;&#039;--ntasks&#039;&#039;, &#039;&#039;--nodes&#039;&#039;, &#039;&#039;--mem&#039;&#039; and &#039;&#039;--mem-per-cpu&#039;&#039; are described [[BwUniCluster3.0/Running_Jobs/Slurm|here]].&lt;br /&gt;
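For example (sketch), a job submitted with &amp;lt;code&amp;gt;sbatch --partition=cpu --time=00:30:00 --ntasks=4 my_job.sh&amp;lt;/code&amp;gt; that sets no memory option will receive the queue default of mem-per-cpu=2000mb from Table 1; the file name my_job.sh is only a placeholder.&lt;br /&gt;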
&lt;br /&gt;
== Check available resources: sinfo_t_idle ==&lt;br /&gt;
The Slurm command sinfo is used to view partition and node information for a system running Slurm. It incorporates down time, reservations, and node state information in determining the available backfill window. The sinfo command can only be used by the administrator.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
SCC has prepared a special script (sinfo_t_idle) to find out how many nodes are available for immediate use on the system. It is anticipated that users will use this information to submit jobs that meet these criteria and thus obtain quick job turnaround times. &lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* The following command displays which resources are available for immediate use in each partition.&lt;br /&gt;
&amp;lt;pre&amp;gt;$ sinfo_t_idle &lt;br /&gt;
Partition dev_cpu                 :      1 nodes idle&lt;br /&gt;
Partition cpu                     :      1 nodes idle&lt;br /&gt;
Partition highmem                 :      2 nodes idle&lt;br /&gt;
Partition dev_gpu_h100            :      0 nodes idle&lt;br /&gt;
Partition gpu_h100                :      0 nodes idle&lt;br /&gt;
Partition gpu_mi300               :      0 nodes idle&lt;br /&gt;
Partition dev_cpu_il              :      7 nodes idle&lt;br /&gt;
Partition cpu_il                  :      2 nodes idle&lt;br /&gt;
Partition dev_gpu_a100_il         :      1 nodes idle&lt;br /&gt;
Partition gpu_a100_il             :      0 nodes idle&lt;br /&gt;
Partition gpu_h100_il             :      1 nodes idle&lt;br /&gt;
Partition gpu_a100_short          :      0 nodes idle&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Running Jobs =&lt;br /&gt;
&lt;br /&gt;
== Slurm Commands (excerpt) ==&lt;br /&gt;
Important Slurm commands for non-administrators working on bwUniCluster 3.0.&lt;br /&gt;
{| width=850px class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Slurm commands !! Brief explanation&lt;br /&gt;
|-&lt;br /&gt;
| [[#Batch Jobs: sbatch|sbatch]] || Submits a job and puts it into the queue [[https://slurm.schedmd.com/sbatch.html sbatch]] &lt;br /&gt;
|-&lt;br /&gt;
| [[#Interactive Jobs: salloc|salloc]] || Requests resources for an interactive Job [[https://slurm.schedmd.com/salloc.html salloc]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Monitor and manage jobs |scontrol show job]] || Displays detailed job state information [[https://slurm.schedmd.com/scontrol.html scontrol]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#List of your submitted jobs : squeue|squeue]] || Displays information about active, eligible, blocked, and/or recently completed jobs [[https://slurm.schedmd.com/squeue.html squeue]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#List of your submitted jobs : squeue|squeue --start]] || Returns start time of submitted job [[https://slurm.schedmd.com/squeue.html squeue]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Check available resources: sinfo_t_idle|sinfo_t_idle]] || Shows what resources are available for immediate use [[https://slurm.schedmd.com/sinfo.html sinfo]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Canceling own jobs : scancel|scancel]] || Cancels a job [[https://slurm.schedmd.com/scancel.html scancel]]&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
* [https://slurm.schedmd.com/tutorials.html  Slurm Tutorials]&lt;br /&gt;
* [https://slurm.schedmd.com/pdfs/summary.pdf  Slurm command/option summary (2 pages)]&lt;br /&gt;
* [https://slurm.schedmd.com/man_index.html  Slurm Commands]&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Batch Jobs: sbatch ==&lt;br /&gt;
&lt;br /&gt;
Batch jobs are submitted with the command &#039;&#039;&#039;sbatch&#039;&#039;&#039;. The main purpose of the &#039;&#039;&#039;sbatch&#039;&#039;&#039; command is to specify the resources that are needed to run the job. &#039;&#039;&#039;sbatch&#039;&#039;&#039; will then queue the batch job. However, when the batch job starts depends on the availability of the requested resources and on the fair share value.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* The syntax and use of &#039;&#039;&#039;sbatch&#039;&#039;&#039; can be displayed via:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ man sbatch&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;sbatch&#039;&#039;&#039; options can be used on the command line or in your job script. Different defaults for some of these options are set per queue and can be found [[BwUniCluster3.0/Slurm | here]]. An illustrative job script using some of these options is shown below the table.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! colspan=&amp;quot;3&amp;quot; | sbatch Options&lt;br /&gt;
|-&lt;br /&gt;
! style=&amp;quot;width:8%&amp;quot;| Command line&lt;br /&gt;
! style=&amp;quot;width:9%&amp;quot;| Script&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Purpose&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -t, --time=&#039;&#039;time&#039;&#039;&lt;br /&gt;
| #SBATCH --time=&#039;&#039;time&#039;&#039;&lt;br /&gt;
| Wall clock time limit.&amp;lt;br&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -N, --nodes=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --nodes=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of nodes to be used.&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -n, --ntasks=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --ntasks=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of tasks to be launched.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --ntasks-per-node=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --ntasks-per-node=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Maximum count of tasks per node.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -c, --cpus-per-task=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --cpus-per-task=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of CPUs required per (MPI-)task.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mem=&#039;&#039;value_in_MB&#039;&#039;&lt;br /&gt;
| #SBATCH --mem=&#039;&#039;value_in_MB&#039;&#039; &lt;br /&gt;
| Memory in MegaByte per node. (You should omit the setting of this option.)&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mem-per-cpu=&#039;&#039;value_in_MB&#039;&#039;&lt;br /&gt;
| #SBATCH --mem-per-cpu=&#039;&#039;value_in_MB&#039;&#039; &lt;br /&gt;
| Minimum Memory required per allocated CPU. (You should omit the setting of this option.)&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --exclusive&lt;br /&gt;
| #SBATCH --exclusive &lt;br /&gt;
| The job allocates all CPUs and GPUs on the nodes and will not share the nodes with other running jobs.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mail-type=&#039;&#039;type&#039;&#039;&lt;br /&gt;
| #SBATCH --mail-type=&#039;&#039;type&#039;&#039;&lt;br /&gt;
| Notify user by email when certain event types occur.&amp;lt;br&amp;gt;Valid type values are NONE, BEGIN, END, FAIL, REQUEUE, ALL.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mail-user=&#039;&#039;mail-address&#039;&#039;&lt;br /&gt;
| #SBATCH --mail-user=&#039;&#039;mail-address&#039;&#039;&lt;br /&gt;
|  The specified mail-address receives email notification of state changes as defined by --mail-type.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --output=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --output=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| File in which job output is stored. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --error=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --error=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| File in which job error messages are stored. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -J, --job-name=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --job-name=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| Job name.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --export=[ALL,] &#039;&#039;env-variables&#039;&#039;&lt;br /&gt;
| #SBATCH --export=[ALL,] &#039;&#039;env-variables&#039;&#039;&lt;br /&gt;
| Identifies which environment variables from the submission environment are propagated to the launched application. Default is ALL.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -A, --account=&#039;&#039;group-name&#039;&#039;&lt;br /&gt;
| #SBATCH --account=&#039;&#039;group-name&#039;&#039;&lt;br /&gt;
| Charge the resources used by this job to the specified group. You may need this option if your account is assigned to more than one group. The output of &amp;quot;scontrol show job&amp;quot; shows the project group the job is accounted to behind &amp;quot;Account=&amp;quot;. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -p, --partition=&#039;&#039;queue-name&#039;&#039;&lt;br /&gt;
| #SBATCH --partition=&#039;&#039;queue-name&#039;&#039;&lt;br /&gt;
| Request a specific queue for the resource allocation.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --reservation=&#039;&#039;reservation-name&#039;&#039;&lt;br /&gt;
| #SBATCH --reservation=&#039;&#039;reservation-name&#039;&#039;&lt;br /&gt;
| Use a specific reservation for the resource allocation.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -C, --constraint=&#039;&#039;LSDF&#039;&#039;&lt;br /&gt;
| #SBATCH --constraint=LSDF&lt;br /&gt;
| Job constraint LSDF filesystems.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -C, --constraint=&#039;&#039;BEEOND (BEEOND_4MDS, BEEOND_MAXMDS)&#039;&#039;&lt;br /&gt;
| #SBATCH --constraint=BEEOND (BEEOND_4MDS, BEEOND_MAXMDS)&lt;br /&gt;
| Job constraint BeeOND filesystem.&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
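As an illustration of how the options from the table above can be combined, a job script header could look like the following sketch; all values and names are only examples:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# Sketch of a job header built from the sbatch options listed above.&lt;br /&gt;
#SBATCH --partition=cpu&lt;br /&gt;
#SBATCH --time=24:00:00&lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
#SBATCH --ntasks-per-node=96&lt;br /&gt;
#SBATCH --job-name=my_job&lt;br /&gt;
#SBATCH --output=my_job_%j.out&lt;br /&gt;
#SBATCH --mail-type=END,FAIL&lt;br /&gt;
#SBATCH --mail-user=&amp;lt;mail-address&amp;gt;&lt;br /&gt;
&lt;br /&gt;
# Commands to be executed with the allocated resources follow here.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The same options can also be given on the command line, e.g. &amp;lt;code&amp;gt;sbatch --partition=dev_cpu --time=10 my_job.sh&amp;lt;/code&amp;gt;; command line options take precedence over the values in the script.&lt;br /&gt;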
&lt;br /&gt;
== Interactive Jobs: salloc ==&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 you are only allowed to run short jobs (&amp;lt;&amp;lt; 1 hour) with small memory requirements (&amp;lt;&amp;lt; 8 GByte) on the login nodes. If you want to run longer jobs and/or jobs that need more than 8 GByte of memory, you must allocate resources for so-called interactive jobs with the command salloc on a login node. For example, for a serial application on a compute node that requires 5000 MByte of memory and an interactive run limited to 2 hours, the following command has to be executed:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc -p cpu -n 1 -t 120 --mem=5000&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You will then get one core on a compute node in the partition &amp;quot;cpu&amp;quot;. After executing this command, &#039;&#039;&#039;DO NOT CLOSE&#039;&#039;&#039; your current terminal session; wait until the queueing system Slurm has granted you the requested resources on the compute system. You will be logged in automatically on the granted core. To run a serial program on the granted core, you only have to type the name of the executable.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./&amp;lt;my_serial_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Please be aware that in this example your serial job must finish within 2 hours; otherwise it will be killed by the system during runtime. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
You can also start a graphical X11 terminal connected to the dedicated resource, which is available for 2 hours. You can start it with the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ xterm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that once the walltime limit has been reached, the resources, i.e. the compute node, will automatically be revoked.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
An interactive parallel application can run on one or on several compute nodes. For example, 5 nodes with 96 cores each, 50 GByte of memory per node and a maximum time of 1 hour can be allocated with the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc -p cpu -N 5 --ntasks-per-node=96 -t 01:00:00  --mem=50gb&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now you can run parallel jobs on 480 cores with 50 GByte of memory per node. Please be aware that you will be logged in on core 0 of the first node.&lt;br /&gt;
If you want to have access to another node, you have to open a new terminal, also connect it to bwUniCluster 3.0 and type the following commands to&lt;br /&gt;
connect to the running interactive job and then to a specific node:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ srun --jobid=XXXXXXXX --pty /bin/bash&lt;br /&gt;
$ srun --nodelist=uc3nXXX --pty /bin/bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
the jobid and the nodelist can be shown.&lt;br /&gt;
&lt;br /&gt;
If you want to run MPI programs, you can do so by simply typing mpirun &amp;lt;program_name&amp;gt;. Your program will then run on all 480 allocated cores. A very simple example of starting a parallel job is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can also start the debugger ddt by the commands:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module add devel/ddt&lt;br /&gt;
$ ddt &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The above commands will execute the parallel program &amp;lt;my_mpi_program&amp;gt; on all available cores. You can also start parallel programs on a subset of the cores; an example of this is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun -n 50 &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you are using Intel MPI, you must start &amp;lt;my_mpi_program&amp;gt; with the command mpiexec.hydra (instead of mpirun).&lt;br /&gt;
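A sketch of the corresponding call, with &amp;lt;my_mpi_program&amp;gt; again being a placeholder:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpiexec.hydra -n 50 &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;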
&lt;br /&gt;
== Interactive Computing with Jupyter ==&lt;br /&gt;
&lt;br /&gt;
== Monitor and manage jobs ==&lt;br /&gt;
&lt;br /&gt;
=== List of your submitted jobs : squeue ===&lt;br /&gt;
squeue displays information about your own active, pending and/or recently completed jobs only. The command squeue is explained in detail at https://slurm.schedmd.com/squeue.html or via the manpage (man squeue).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;squeue&#039;&#039; example on bwUniCluster 3.0 &amp;lt;small&amp;gt;(Only your own jobs are displayed!)&amp;lt;/small&amp;gt;.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue &lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
              1262       cpu     wrap ka_ab123  R       8:15      1 uc3n002&lt;br /&gt;
              1267 dev_gpu_h     wrap ka_ab123 PD       0:00      1 (Resources)&lt;br /&gt;
              1265   highmem     wrap ka_ab123  R       2:41      1 uc3n084&lt;br /&gt;
$ squeue -l&lt;br /&gt;
             JOBID PARTITION     NAME     USER    STATE       TIME TIME_LIMI  NODES NODELIST(REASON)&lt;br /&gt;
              1262       cpu     wrap ka_ab123  RUNNING       8:55     20:00      1 uc3n002&lt;br /&gt;
              1267 dev_gpu_h     wrap ka_ab123  PENDING       0:00     20:00      1 (Resources)&lt;br /&gt;
              1265   highmem     wrap ka_ab123  RUNNING       3:21     20:00      1 uc3n084&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Detailed job information : scontrol show job ===&lt;br /&gt;
scontrol show job displays detailed job state information and diagnostic output for all of your jobs or for a specified job. Detailed information is available for active, pending and recently completed jobs. The command scontrol is explained in detail at https://slurm.schedmd.com/scontrol.html or via the manpage (man scontrol). &lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Display the state of all your jobs in normal mode: scontrol show job&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Display the state of a job with &amp;lt;jobid&amp;gt; in normal mode: scontrol show job &amp;lt;jobid&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here is an example from bwUniCluster 3.0:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue&lt;br /&gt;
JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
1262       cpu     wrap ka_zs040  R       1:12      1 uc3n002&lt;br /&gt;
&lt;br /&gt;
$&lt;br /&gt;
$ # now, see what&#039;s up with my pending job with jobid 1262&lt;br /&gt;
$ &lt;br /&gt;
$ scontrol show job 1262&lt;br /&gt;
&lt;br /&gt;
JobId=1262 JobName=wrap&lt;br /&gt;
   UserId=ka_zs0402(241992) GroupId=ka_scc(12345) MCS_label=N/A&lt;br /&gt;
   Priority=4246 Nice=0 Account=ka QOS=normal&lt;br /&gt;
   JobState=RUNNING Reason=None Dependency=(null)&lt;br /&gt;
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0&lt;br /&gt;
   RunTime=00:00:37 TimeLimit=00:20:00 TimeMin=N/A&lt;br /&gt;
   SubmitTime=2025-04-04T10:01:30 EligibleTime=2025-04-04T10:01:30&lt;br /&gt;
   AccrueTime=2025-04-04T10:01:30&lt;br /&gt;
   StartTime=2025-04-04T10:01:31 EndTime=2025-04-04T10:21:31 Deadline=N/A&lt;br /&gt;
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2025-04-04T10:01:31 Scheduler=Main&lt;br /&gt;
   Partition=cpu AllocNode:Sid=uc3n999:2819841&lt;br /&gt;
   ReqNodeList=(null) ExcNodeList=(null)&lt;br /&gt;
   NodeList=uc3n002&lt;br /&gt;
   BatchHost=uc3n002&lt;br /&gt;
   NumNodes=1 NumCPUs=2 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*&lt;br /&gt;
   ReqTRES=cpu=1,mem=2000M,node=1,billing=1&lt;br /&gt;
   AllocTRES=cpu=2,mem=4000M,node=1,billing=2&lt;br /&gt;
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:1 CoreSpec=*&lt;br /&gt;
   MinCPUsNode=1 MinMemoryCPU=2000M MinTmpDiskNode=0&lt;br /&gt;
   Features=(null) DelayBoot=00:00:00&lt;br /&gt;
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)&lt;br /&gt;
   Command=(null)&lt;br /&gt;
   WorkDir=/pfs/data6/home/ka/ka_scc/ka_zs0402&lt;br /&gt;
   StdErr=/pfs/data6/home/ka/ka_scc/ka_zs0402/slurm-1262.out&lt;br /&gt;
   StdIn=/dev/null&lt;br /&gt;
   StdOut=/pfs/data6/home/ka/ka_scc/ka_zs0402/slurm-1262.out&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Each request to the Slurm workload manager generates a load. &amp;lt;p style=&amp;quot;color:red;&amp;quot;&amp;gt;Therefore, do not use &amp;lt;code&amp;gt;squeue&amp;lt;/code&amp;gt; with a simple &amp;lt;code&amp;gt;watch&amp;lt;/code&amp;gt;.&amp;lt;/p&amp;gt; The smallest allowed time interval is &amp;lt;b&amp;gt;30 seconds&amp;lt;/b&amp;gt;.&amp;lt;br&amp;gt;&lt;br /&gt;
Any violation of this rule will result in the task being terminated without notice.&lt;br /&gt;
&lt;br /&gt;
=== Canceling own jobs : scancel ===&lt;br /&gt;
The scancel command is used to cancel jobs. It is explained in detail at https://slurm.schedmd.com/scancel.html or via the manpage (man scancel). The syntax is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ scancel [-i] &amp;lt;job-id&amp;gt;&lt;br /&gt;
$ scancel -t &amp;lt;job_state_name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
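For example (sketch, with a placeholder job id): &amp;lt;code&amp;gt;scancel 1262&amp;lt;/code&amp;gt; cancels the job with id 1262, and &amp;lt;code&amp;gt;scancel -t PENDING&amp;lt;/code&amp;gt; cancels all of your jobs that are still pending.&lt;br /&gt;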
&lt;br /&gt;
= Slurm Options =&lt;br /&gt;
[[BwUniCluster3.0/Running_Jobs/Slurm | Detailed Slurm usage]]&lt;br /&gt;
&lt;br /&gt;
= Best Practices =&lt;br /&gt;
&lt;br /&gt;
== Step-by-Step example==&lt;br /&gt;
&lt;br /&gt;
== Dos and Don&#039;ts ==&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;| Do not run squeue and other Slurm commands in loops or with &amp;quot;watch&amp;quot;, so as not to saturate the Slurm daemon with RPC requests.&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Running_Jobs&amp;diff=15362</id>
		<title>BwUniCluster3.0/Running Jobs</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Running_Jobs&amp;diff=15362"/>
		<updated>2025-10-24T08:05:14Z</updated>

		<summary type="html">&lt;p&gt;S Braun: /* Detailed job information : scontrol show job */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
= Purpose and function of a queuing system =&lt;br /&gt;
&lt;br /&gt;
All compute activities on bwUniCluster 3.0 have to be performed on the compute nodes. Compute nodes are only available by requesting the corresponding resources via the queuing system. As soon as the requested resources are available, they are either used to execute a batch script automatically or can be accessed interactively.&amp;lt;br&amp;gt;&lt;br /&gt;
General procedure: Hint to [[Running_Calculations | Running Calculations]]&lt;br /&gt;
&lt;br /&gt;
== Job submission process ==&lt;br /&gt;
&lt;br /&gt;
bwUniCluster 3.0 uses the workload management software Slurm. Any job submission by the user therefore has to be performed with Slurm commands. Slurm queues and runs user jobs based on fair sharing policies.&lt;br /&gt;
&lt;br /&gt;
== Slurm ==&lt;br /&gt;
&lt;br /&gt;
HPC Workload Manager on bwUniCluster 3.0 is Slurm.&lt;br /&gt;
Slurm is a cluster management and job scheduling system. Slurm has three key functions. &lt;br /&gt;
* It allocates access to resources (compute cores on nodes) to users for some duration of time so they can perform work. &lt;br /&gt;
* It provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. &lt;br /&gt;
* It arbitrates contention for resources by managing a queue of pending work.&lt;br /&gt;
&lt;br /&gt;
Any kind of calculation on the compute nodes of bwUniCluster 3.0 requires the user to define the calculation as a sequence of commands, together with the required run time, number of CPU cores and main memory, and to submit all of this, i.e. the &#039;&#039;&#039;batch job&#039;&#039;&#039;, to the resource and workload management software.&lt;br /&gt;
&lt;br /&gt;
== Terms and definitions ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Partitions &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Slurm manages job queues for different &#039;&#039;&#039;partitions&#039;&#039;&#039;. Partitions are used to group similar node types (e.g. nodes with and without accelerators) and to enforce different access policies and resource limits.&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 there are different partitions:&lt;br /&gt;
&lt;br /&gt;
* CPU-only nodes&lt;br /&gt;
** 2-socket nodes, consisting of 2 Intel Ice Lake processors with 32 cores each or 2 AMD processors with 48 cores each&lt;br /&gt;
** 2-socket nodes with very high RAM capacity, consisting of 2 AMD processors with 48 cores each&lt;br /&gt;
* GPU-accelerated nodes&lt;br /&gt;
** 2-socket nodes with 4x NVIDIA A100 or 4x NVIDIA H100 GPUs&lt;br /&gt;
** 4-socket node with 4x AMD Instinct accelerator&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Queues &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Job &#039;&#039;&#039;queues&#039;&#039;&#039; are used to manage jobs that request access to shared but limited computing resources of a certain kind (partition).&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 there are different main types of queues:&lt;br /&gt;
* Regular queues&lt;br /&gt;
** cpu: Jobs that request CPU-only nodes.&lt;br /&gt;
** gpu: Jobs that request GPU-accelerated nodes.&lt;br /&gt;
* Development queues (dev)&lt;br /&gt;
** Short, usually interactive jobs that are used for developing, compiling and testing code and workflows. The intention behind the development queues is to give users immediate access to compute resources without having to wait. This is the place to run heavy computations right away without affecting other users, as they would on the login nodes.&lt;br /&gt;
&lt;br /&gt;
Requested compute resources such as (wall-)time, number of nodes and amount of memory are restricted and must fit into the boundaries imposed by the queues. The request for compute resources on the bwUniCluster 3.0 &amp;lt;font color=red&amp;gt;requires at least the specification of the &#039;&#039;&#039;queue&#039;&#039;&#039; and the &#039;&#039;&#039;time&#039;&#039;&#039;&amp;lt;/font&amp;gt;.&lt;br /&gt;
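In practice this means that every request contains at least something like &amp;lt;code&amp;gt;--partition=&amp;lt;queue&amp;gt; --time=&amp;lt;time&amp;gt;&amp;lt;/code&amp;gt;, for example (sketch) &amp;lt;code&amp;gt;salloc -p dev_cpu -t 10&amp;lt;/code&amp;gt; for a short interactive test job.&lt;br /&gt;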
&lt;br /&gt;
&#039;&#039;&#039; Jobs &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Jobs can be run non-interactively as &#039;&#039;&#039;batch jobs&#039;&#039;&#039; or as &#039;&#039;&#039;interactive jobs&#039;&#039;&#039;.&amp;lt;br&amp;gt;&lt;br /&gt;
Submitting a batch job means that all steps of a compute project are defined in a Bash script. This script is queued and executed as soon as the requested compute resources are available and allocated. Jobs are enqueued with the &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; command.&lt;br /&gt;
For interactive jobs, the resources are requested with the &amp;lt;code&amp;gt;salloc&amp;lt;/code&amp;gt; command. As soon as the computing resources are available and allocated, a command line prompt is returned on a compute node and the user can freely use the allocated resources.&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
&#039;&#039;&#039;Please remember:&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Heavy computations are not allowed on the login nodes&#039;&#039;&#039;.&amp;lt;br&amp;gt;Use a development or a regular job queue instead! Please refer to [[BwUniCluster3.0/Login#Allowed_Activities_on_Login_Nodes|Allowed Activities on Login Nodes]].&lt;br /&gt;
* &#039;&#039;&#039;Development queues&#039;&#039;&#039; are meant for &#039;&#039;&#039;development tasks&#039;&#039;&#039;.&amp;lt;br&amp;gt;Do not misuse these queues for regular, short-running jobs or chain jobs! Only one job may run at a time, and at most 3 jobs may be queued.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
= Queues on bwUniCluster 3.0 = &lt;br /&gt;
== Policy ==&lt;br /&gt;
&lt;br /&gt;
Computing time is provided in accordance with the &#039;&#039;&#039;fair share policy&#039;&#039;&#039;, which takes into account the investment share of the respective university and the resources already used by its members. In addition, the following throttling policy is active: the &#039;&#039;&#039;maximum number of physical cores&#039;&#039;&#039; in use at any given time is &#039;&#039;&#039;1920 per user&#039;&#039;&#039;, aggregated over all running jobs. This corresponds to 30 nodes on the Ice Lake partition (64 cores per node) or 20 nodes on the standard partition (96 cores per node). The aim is to minimize waiting times and to maximize the number of users who can access computing time at the same time.&lt;br /&gt;
&lt;br /&gt;
== Regular Queues ==&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node-Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;cpu_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Ice Lake&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=72:00:00, nodes=30, mem=249600mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;cpu&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Standard&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=72:00:00, nodes=20, mem=380000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;highmem&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;High Memory&lt;br /&gt;
| mem-per-cpu=12090mb&lt;br /&gt;
| mem=380001mb&lt;br /&gt;
| time=72:00:00, nodes=4, mem=2300000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_h100&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=193300mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=72:00:00, nodes=12, mem=760000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_mi300&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU node&amp;lt;br/&amp;gt;AMD GPU x4&lt;br /&gt;
| mem-per-gpu=128200mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=72:00:00, nodes=1, mem=510000mb, ntasks-per-node=40, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_a100_il&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;gpu_h100_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;Ice Lake&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=127500mb&amp;lt;br/&amp;gt;cpus-per-gpu=16&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=48:00:00, nodes=9(A100)/nodes=5(H100) , mem=510000mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|}&lt;br /&gt;
Table 1: Regular Queues&lt;br /&gt;
&lt;br /&gt;
== Short Queues ==&lt;br /&gt;
&amp;lt;p style=&amp;quot;color:red; &amp;quot;&amp;gt;&amp;lt;b&amp;gt;Queues with a short maximum runtime of 30 minutes.&amp;lt;/b&amp;gt;&amp;lt;/p&amp;gt; &lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_a100_short&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;Ice Lake&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=94000mb&amp;lt;br/&amp;gt;cpus-per-gpu=12&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=12, mem=376000mb, ntasks-per-node=48, (threads-per-core=2)&lt;br /&gt;
|}&lt;br /&gt;
Table 2: Short Queues&lt;br /&gt;
&lt;br /&gt;
== Development Queues ==&lt;br /&gt;
Only for development, i.e. debugging or performance optimization ...&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_cpu_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Ice Lake&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=30, nodes=8, mem=249600mb, ntasks-per-node=64, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_cpu&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Standard&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=30, nodes=1, mem=380000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_gpu_h100&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=193300mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=1, mem=760000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_gpu_a100_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&amp;lt;br/&amp;gt;&lt;br /&gt;
| mem-per-gpu=127500mb&amp;lt;br/&amp;gt;cpus-per-gpu=16 &lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=1, mem=510000mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|}&lt;br /&gt;
Table 3: Development Queues&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The default resources of a queue class define the number of tasks and the memory if these are not explicitly given with the sbatch command. The resource options &#039;&#039;--time&#039;&#039;, &#039;&#039;--ntasks&#039;&#039;, &#039;&#039;--nodes&#039;&#039;, &#039;&#039;--mem&#039;&#039; and &#039;&#039;--mem-per-cpu&#039;&#039; are described [[BwUniCluster3.0/Running_Jobs/Slurm|here]].&lt;br /&gt;
&lt;br /&gt;
== Check available resources: sinfo_t_idle ==&lt;br /&gt;
The Slurm command sinfo is used to view partition and node information for a system running Slurm. It incorporates down time, reservations, and node state information in determining the available backfill window. The sinfo command can only be used by the administrator.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
SCC has prepared a special script (sinfo_t_idle) to find out how many nodes are available for immediate use on the system. It is anticipated that users will use this information to submit jobs that meet these criteria and thus obtain quick job turnaround times. &lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* The following command displays which resources are available for immediate use in each partition.&lt;br /&gt;
&amp;lt;pre&amp;gt;$ sinfo_t_idle &lt;br /&gt;
Partition dev_cpu                 :      1 nodes idle&lt;br /&gt;
Partition cpu                     :      1 nodes idle&lt;br /&gt;
Partition highmem                 :      2 nodes idle&lt;br /&gt;
Partition dev_gpu_h100            :      0 nodes idle&lt;br /&gt;
Partition gpu_h100                :      0 nodes idle&lt;br /&gt;
Partition gpu_mi300               :      0 nodes idle&lt;br /&gt;
Partition dev_cpu_il              :      7 nodes idle&lt;br /&gt;
Partition cpu_il                  :      2 nodes idle&lt;br /&gt;
Partition dev_gpu_a100_il         :      1 nodes idle&lt;br /&gt;
Partition gpu_a100_il             :      0 nodes idle&lt;br /&gt;
Partition gpu_h100_il             :      1 nodes idle&lt;br /&gt;
Partition gpu_a100_short          :      0 nodes idle&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Running Jobs =&lt;br /&gt;
&lt;br /&gt;
== Slurm Commands (excerpt) ==&lt;br /&gt;
Important Slurm commands for non-administrators working on bwUniCluster 3.0.&lt;br /&gt;
{| width=850px class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Slurm commands !! Brief explanation&lt;br /&gt;
|-&lt;br /&gt;
| [[#Batch Jobs: sbatch|sbatch]] || Submits a job and puts it into the queue [[https://slurm.schedmd.com/sbatch.html sbatch]] &lt;br /&gt;
|-&lt;br /&gt;
| [[#Interactive Jobs: salloc|salloc]] || Requests resources for an interactive Job [[https://slurm.schedmd.com/salloc.html salloc]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Monitor and manage jobs |scontrol show job]] || Displays detailed job state information [[https://slurm.schedmd.com/scontrol.html scontrol]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#List of your submitted jobs : squeue|squeue]] || Displays information about active, eligible, blocked, and/or recently completed jobs [[https://slurm.schedmd.com/squeue.html squeue]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#List of your submitted jobs : squeue|squeue --start]] || Returns start time of submitted job [[https://slurm.schedmd.com/squeue.html squeue]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Check available resources: sinfo_t_idle|sinfo_t_idle]] || Shows what resources are available for immediate use [[https://slurm.schedmd.com/sinfo.html sinfo]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Canceling own jobs : scancel|scancel]] || Cancels a job [[https://slurm.schedmd.com/scancel.html scancel]]&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
* [https://slurm.schedmd.com/tutorials.html  Slurm Tutorials]&lt;br /&gt;
* [https://slurm.schedmd.com/pdfs/summary.pdf  Slurm command/option summary (2 pages)]&lt;br /&gt;
* [https://slurm.schedmd.com/man_index.html  Slurm Commands]&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Batch Jobs: sbatch ==&lt;br /&gt;
&lt;br /&gt;
Batch jobs are submitted with the command &#039;&#039;&#039;sbatch&#039;&#039;&#039;. The main purpose of the &#039;&#039;&#039;sbatch&#039;&#039;&#039; command is to specify the resources that are needed to run the job. &#039;&#039;&#039;sbatch&#039;&#039;&#039; will then queue the batch job. However, when the batch job starts depends on the availability of the requested resources and on the fair share value.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* The syntax and use of &#039;&#039;&#039;sbatch&#039;&#039;&#039; can be displayed via:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ man sbatch&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;sbatch&#039;&#039;&#039; options can be used on the command line or in your job script. Different defaults for some of these options are set per queue and can be found [[BwUniCluster3.0/Slurm | here]].&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! colspan=&amp;quot;3&amp;quot; | sbatch Options&lt;br /&gt;
|-&lt;br /&gt;
! style=&amp;quot;width:8%&amp;quot;| Command line&lt;br /&gt;
! style=&amp;quot;width:9%&amp;quot;| Script&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Purpose&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -t, --time=&#039;&#039;time&#039;&#039;&lt;br /&gt;
| #SBATCH --time=&#039;&#039;time&#039;&#039;&lt;br /&gt;
| Wall clock time limit.&amp;lt;br&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -N, --nodes=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --nodes=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of nodes to be used.&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -n, --ntasks=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --ntasks=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of tasks to be launched.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --ntasks-per-node=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --ntasks-per-node=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Maximum count of tasks per node.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -c, --cpus-per-task=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --cpus-per-task=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of CPUs required per (MPI-)task.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mem=&#039;&#039;value_in_MB&#039;&#039;&lt;br /&gt;
| #SBATCH --mem=&#039;&#039;value_in_MB&#039;&#039; &lt;br /&gt;
| Memory in MegaByte per node. (You should omit the setting of this option.)&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mem-per-cpu=&#039;&#039;value_in_MB&#039;&#039;&lt;br /&gt;
| #SBATCH --mem-per-cpu=&#039;&#039;value_in_MB&#039;&#039; &lt;br /&gt;
| Minimum Memory required per allocated CPU. (You should omit the setting of this option.)&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --exclusive&lt;br /&gt;
| #SBATCH --exclusive &lt;br /&gt;
| The job allocates all CPUs and GPUs on the nodes and will not share the nodes with other running jobs.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mail-type=&#039;&#039;type&#039;&#039;&lt;br /&gt;
| #SBATCH --mail-type=&#039;&#039;type&#039;&#039;&lt;br /&gt;
| Notify user by email when certain event types occur.&amp;lt;br&amp;gt;Valid type values are NONE, BEGIN, END, FAIL, REQUEUE, ALL.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mail-user=&#039;&#039;mail-address&#039;&#039;&lt;br /&gt;
| #SBATCH --mail-user=&#039;&#039;mail-address&#039;&#039;&lt;br /&gt;
|  The specified mail-address receives email notification of state changes as defined by --mail-type.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --output=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --output=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| File in which job output is stored. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --error=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --error=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| File in which job error messages are stored. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -J, --job-name=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --job-name=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| Job name.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --export=[ALL,] &#039;&#039;env-variables&#039;&#039;&lt;br /&gt;
| #SBATCH --export=[ALL,] &#039;&#039;env-variables&#039;&#039;&lt;br /&gt;
| Identifies which environment variables from the submission environment are propagated to the launched application. Default is ALL.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -A, --account=&#039;&#039;group-name&#039;&#039;&lt;br /&gt;
| #SBATCH --account=&#039;&#039;group-name&#039;&#039;&lt;br /&gt;
| Charge the resources used by this job to the specified group. You may need this option if your account is assigned to more than one group. The output of &amp;quot;scontrol show job&amp;quot; shows the project group the job is accounted to behind &amp;quot;Account=&amp;quot;. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -p, --partition=&#039;&#039;queue-name&#039;&#039;&lt;br /&gt;
| #SBATCH --partition=&#039;&#039;queue-name&#039;&#039;&lt;br /&gt;
| Request a specific queue for the resource allocation.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --reservation=&#039;&#039;reservation-name&#039;&#039;&lt;br /&gt;
| #SBATCH --reservation=&#039;&#039;reservation-name&#039;&#039;&lt;br /&gt;
| Use a specific reservation for the resource allocation.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -C, --constraint=&#039;&#039;LSDF&#039;&#039;&lt;br /&gt;
| #SBATCH --constraint=LSDF&lt;br /&gt;
| Job constraint LSDF filesystems.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -C, --constraint=&#039;&#039;BEEOND (BEEOND_4MDS, BEEOND_MAXMDS)&#039;&#039;&lt;br /&gt;
| #SBATCH --constraint=BEEOND (BEEOND_4MDS, BEEOND_MAXMDS)&lt;br /&gt;
| Job constraint BeeOND filesystem.&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Interactive Jobs: salloc ==&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 you are only allowed to run short jobs (&amp;lt;&amp;lt; 1 hour) with small memory requirements (&amp;lt;&amp;lt; 8 GByte) on the login nodes. If you want to run longer jobs and/or jobs that need more than 8 GByte of memory, you must allocate resources for so-called interactive jobs with the command salloc on a login node. For example, for a serial application on a compute node that requires 5000 MByte of memory and an interactive run limited to 2 hours, the following command has to be executed:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc -p cpu -n 1 -t 120 --mem=5000&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You will then get one core on a compute node in the partition &amp;quot;cpu&amp;quot;. After executing this command, &#039;&#039;&#039;DO NOT CLOSE&#039;&#039;&#039; your current terminal session; wait until the queueing system Slurm has granted you the requested resources on the compute system. You will be logged in automatically on the granted core. To run a serial program on the granted core, you only have to type the name of the executable.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./&amp;lt;my_serial_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Please be aware that in this example your serial job must finish within 2 hours; otherwise it will be killed by the system during runtime. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
You can also start a graphical X11 terminal connected to the dedicated resource, which is available for 2 hours. You can start it with the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ xterm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that once the walltime limit has been reached, the resources, i.e. the compute node, will automatically be revoked.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
An interactive parallel application can run on one or on several compute nodes. For example, 5 nodes with 96 cores each, 50 GByte of memory per node and a maximum time of 1 hour can be allocated with the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc -p cpu -N 5 --ntasks-per-node=96 -t 01:00:00  --mem=50gb&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now you can run parallel jobs on 480 cores with 50 GByte of memory per node. Please be aware that you will be logged in on core 0 of the first node.&lt;br /&gt;
If you want to have access to another node, you have to open a new terminal, also connect it to bwUniCluster 3.0 and type the following commands to&lt;br /&gt;
connect to the running interactive job and then to a specific node:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ srun --jobid=XXXXXXXX --pty /bin/bash&lt;br /&gt;
$ srun --nodelist=uc3nXXX --pty /bin/bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
the jobid and the nodelist can be shown.&lt;br /&gt;
&lt;br /&gt;
If you want to run MPI programs, you can do so by simply typing mpirun &amp;lt;program_name&amp;gt;. Your program will then run on all 480 allocated cores. A very simple example of starting a parallel job is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can also start the debugger ddt by the commands:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module add devel/ddt&lt;br /&gt;
$ ddt &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The above commands will execute the parallel program &amp;lt;my_mpi_program&amp;gt; on all available cores. You can also start parallel programs on a subset of the cores; an example of this is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun -n 50 &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you are using Intel MPI, you must start &amp;lt;my_mpi_program&amp;gt; with the command mpiexec.hydra (instead of mpirun).&lt;br /&gt;
&lt;br /&gt;
== Interactive Computing with Jupyter ==&lt;br /&gt;
&lt;br /&gt;
== Monitor and manage jobs ==&lt;br /&gt;
&lt;br /&gt;
=== List of your submitted jobs : squeue ===&lt;br /&gt;
squeue displays information about your own active, pending and/or recently completed jobs only. The command squeue is explained in detail at https://slurm.schedmd.com/squeue.html or via the manpage (man squeue).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;squeue&#039;&#039; example on bwUniCluster 3.0 &amp;lt;small&amp;gt;(Only your own jobs are displayed!)&amp;lt;/small&amp;gt;.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue &lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
              1262       cpu     wrap ka_ab123  R       8:15      1 uc3n002&lt;br /&gt;
              1267 dev_gpu_h     wrap ka_ab123 PD       0:00      1 (Resources)&lt;br /&gt;
              1265   highmem     wrap ka_ab123  R       2:41      1 uc3n084&lt;br /&gt;
$ squeue -l&lt;br /&gt;
             JOBID PARTITION     NAME     USER    STATE       TIME TIME_LIMI  NODES NODELIST(REASON)&lt;br /&gt;
              1262       cpu     wrap ka_ab123  RUNNING       8:55     20:00      1 uc3n002&lt;br /&gt;
              1267 dev_gpu_h     wrap ka_ab123  PENDING       0:00     20:00      1 (Resources)&lt;br /&gt;
              1265   highmem     wrap ka_ab123  RUNNING       3:21     20:00      1 uc3n084&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Detailed job information : scontrol show job ===&lt;br /&gt;
scontrol show job displays detailed job state information and diagnostic output for all of your jobs or for a specified job. Detailed information is available for active, pending and recently completed jobs. The command scontrol is explained in detail at https://slurm.schedmd.com/scontrol.html or via the manpage (man scontrol). &lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Display the state of all your jobs in normal mode: scontrol show job&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Display the state of a job with &amp;lt;jobid&amp;gt; in normal mode: scontrol show job &amp;lt;jobid&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here is an example from bwUniCluster 3.0:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue&lt;br /&gt;
JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
1262       cpu     wrap ka_zs040  R       1:12      1 uc3n002&lt;br /&gt;
&lt;br /&gt;
$&lt;br /&gt;
$ # now, see what&#039;s up with my pending job with jobid 1262&lt;br /&gt;
$ &lt;br /&gt;
$ scontrol show job 1262&lt;br /&gt;
&lt;br /&gt;
JobId=1262 JobName=wrap&lt;br /&gt;
   UserId=ka_zs0402(241992) GroupId=ka_scc(12345) MCS_label=N/A&lt;br /&gt;
   Priority=4246 Nice=0 Account=ka QOS=normal&lt;br /&gt;
   JobState=RUNNING Reason=None Dependency=(null)&lt;br /&gt;
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0&lt;br /&gt;
   RunTime=00:00:37 TimeLimit=00:20:00 TimeMin=N/A&lt;br /&gt;
   SubmitTime=2025-04-04T10:01:30 EligibleTime=2025-04-04T10:01:30&lt;br /&gt;
   AccrueTime=2025-04-04T10:01:30&lt;br /&gt;
   StartTime=2025-04-04T10:01:31 EndTime=2025-04-04T10:21:31 Deadline=N/A&lt;br /&gt;
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2025-04-04T10:01:31 Scheduler=Main&lt;br /&gt;
   Partition=cpu AllocNode:Sid=uc3n999:2819841&lt;br /&gt;
   ReqNodeList=(null) ExcNodeList=(null)&lt;br /&gt;
   NodeList=uc3n002&lt;br /&gt;
   BatchHost=uc3n002&lt;br /&gt;
   NumNodes=1 NumCPUs=2 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*&lt;br /&gt;
   ReqTRES=cpu=1,mem=2000M,node=1,billing=1&lt;br /&gt;
   AllocTRES=cpu=2,mem=4000M,node=1,billing=2&lt;br /&gt;
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:1 CoreSpec=*&lt;br /&gt;
   MinCPUsNode=1 MinMemoryCPU=2000M MinTmpDiskNode=0&lt;br /&gt;
   Features=(null) DelayBoot=00:00:00&lt;br /&gt;
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)&lt;br /&gt;
   Command=(null)&lt;br /&gt;
   WorkDir=/pfs/data6/home/ka/ka_scc/ka_zs0402&lt;br /&gt;
   StdErr=/pfs/data6/home/ka/ka_scc/ka_zs0402/slurm-1262.out&lt;br /&gt;
   StdIn=/dev/null&lt;br /&gt;
   StdOut=/pfs/data6/home/ka/ka_scc/ka_zs0402/slurm-1262.out&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Each request to the Slurm workload manager generates load on the scheduler. Therefore, do not run &amp;lt;code&amp;gt;squeue&amp;lt;/code&amp;gt; under a simple &amp;lt;code&amp;gt;watch&amp;lt;/code&amp;gt;. The smallest allowed polling interval is &amp;lt;b&amp;gt;30 seconds&amp;lt;/b&amp;gt;.&amp;lt;br&amp;gt;&lt;br /&gt;
Any violation of this rule will result in the task being terminated without notice.&lt;br /&gt;
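&lt;br /&gt;
If you need to check the queue repeatedly, use a polling interval of at least 30 seconds; a minimal sketch (the 60-second value is just an example):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ # refresh squeue output at most once per minute&lt;br /&gt;
$ watch -n 60 squeue&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;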
&lt;br /&gt;
=== Canceling own jobs : scancel ===&lt;br /&gt;
The scancel command is used to cancel jobs. The command scancel is explained in detail on the webpage https://slurm.schedmd.com/scancel.html or via manpage (man scancel). The command is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ scancel [-i] &amp;lt;job-id&amp;gt;&lt;br /&gt;
$ scancel -t &amp;lt;job_state_name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
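&lt;br /&gt;
A short sketch of typical calls (job id 1262 is taken from the squeue example above; the -u $USER filter is an assumption to restrict the state-based cancel to your own jobs):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ scancel 1262                  # cancel job 1262 immediately&lt;br /&gt;
$ scancel -i 1262               # ask for confirmation before cancelling&lt;br /&gt;
$ scancel -t PENDING -u $USER   # cancel all of your pending jobs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;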
&lt;br /&gt;
= Slurm Options =&lt;br /&gt;
[[BwUniCluster3.0/Running_Jobs/Slurm | Detailed Slurm usage]]&lt;br /&gt;
&lt;br /&gt;
= Best Practices =&lt;br /&gt;
&lt;br /&gt;
== Step-by-Step example==&lt;br /&gt;
&lt;br /&gt;
== Dos and Don&#039;ts ==&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;| Do not run squeue or other Slurm commands in loops or via &amp;quot;watch&amp;quot;, so as not to saturate the Slurm daemon with RPC requests.&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Running_Jobs&amp;diff=15302</id>
		<title>BwUniCluster3.0/Running Jobs</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Running_Jobs&amp;diff=15302"/>
		<updated>2025-09-23T15:08:16Z</updated>

		<summary type="html">&lt;p&gt;S Braun: /* Dos and Don&amp;#039;ts */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
= Purpose and function of a queuing system =&lt;br /&gt;
&lt;br /&gt;
All compute activities on bwUniCluster 3.0 have to be performed on the compute nodes. Compute nodes are only available by requesting the corresponding resources via the queuing system. As soon as the requested resources are available, tasks are either executed automatically via a batch script or the nodes can be used interactively.&amp;lt;br&amp;gt;&lt;br /&gt;
For the general procedure, see [[Running_Calculations | Running Calculations]].&lt;br /&gt;
&lt;br /&gt;
== Job submission process ==&lt;br /&gt;
&lt;br /&gt;
bwUniCluster 3.0 uses the workload management software Slurm. Therefore, any job submission by the user has to be performed with Slurm commands. Slurm queues and runs user jobs based on fair-share policies.&lt;br /&gt;
&lt;br /&gt;
== Slurm ==&lt;br /&gt;
&lt;br /&gt;
The HPC workload manager on bwUniCluster 3.0 is Slurm.&lt;br /&gt;
Slurm is a cluster management and job scheduling system. Slurm has three key functions. &lt;br /&gt;
* It allocates access to resources (compute cores on nodes) to users for some duration of time so they can perform work. &lt;br /&gt;
* It provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. &lt;br /&gt;
* It arbitrates contention for resources by managing a queue of pending work.&lt;br /&gt;
&lt;br /&gt;
Any kind of calculation on the compute nodes of bwUniCluster 3.0 requires the user to define the calculation as a sequence of commands, together with the required run time, number of CPU cores and main memory, and to submit all of this, i.e. the &#039;&#039;&#039;batch job&#039;&#039;&#039;, to the resource and workload management software.&lt;br /&gt;
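&lt;br /&gt;
As a minimal sketch (the script name, resource values and payload command are illustrative assumptions, not site defaults), such a batch job could look like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cat my_job.sh&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --partition=cpu        # queue/partition to submit to&lt;br /&gt;
#SBATCH --time=00:30:00        # requested wall clock time&lt;br /&gt;
#SBATCH --ntasks=1             # number of tasks&lt;br /&gt;
#SBATCH --mem-per-cpu=2000mb   # memory per allocated CPU&lt;br /&gt;
./my_program&lt;br /&gt;
&lt;br /&gt;
$ sbatch my_job.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;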
&lt;br /&gt;
== Terms and definitions ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Partitions &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Slurm manages job queues for different &#039;&#039;&#039;partitions&#039;&#039;&#039;. Partitions are used to group similar node types (e.g. nodes with and without accelerators) and to enforce different access policies and resource limits.&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 there are different partitions:&lt;br /&gt;
&lt;br /&gt;
* CPU-only nodes&lt;br /&gt;
** 2-socket nodes, consisting of 2 Intel Ice Lake processors with 32 cores each or 2 AMD processors with 48 cores each&lt;br /&gt;
** 2-socket nodes with very high RAM capacity, consisting of 2 AMD processors with 48 cores each&lt;br /&gt;
* GPU-accelerated nodes&lt;br /&gt;
** 2-socket nodes with 4x NVIDIA A100 or 4x NVIDIA H100 GPUs&lt;br /&gt;
** 4-socket node with 4x AMD Instinct accelerator&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Queues &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Job &#039;&#039;&#039;queues&#039;&#039;&#039; are used to manage jobs that request access to shared but limited computing resources of a certain kind (partition).&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 there are different main types of queues:&lt;br /&gt;
* Regular queues&lt;br /&gt;
** cpu: Jobs that request CPU-only nodes.&lt;br /&gt;
** gpu: Jobs that request GPU-accelerated nodes.&lt;br /&gt;
* Development queues (dev)&lt;br /&gt;
** Short, usually interactive jobs that are used for developing, compiling and testing code and workflows. The intention behind development queues is to provide users with immediate access to compute resources without having to wait. This is the place for short but heavy computations that must not be run on the login nodes, where they would affect other users.&lt;br /&gt;
&lt;br /&gt;
Requested compute resources such as (wall-)time, number of nodes and amount of memory are restricted and must fit into the boundaries imposed by the queues. The request for compute resources on the bwUniCluster 3.0 &amp;lt;font color=red&amp;gt;requires at least the specification of the &#039;&#039;&#039;queue&#039;&#039;&#039; and the &#039;&#039;&#039;time&#039;&#039;&#039;&amp;lt;/font&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Jobs &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Jobs can be run non-interactively as &#039;&#039;&#039;batch jobs&#039;&#039;&#039; or as &#039;&#039;&#039;interactive jobs&#039;&#039;&#039;.&amp;lt;br&amp;gt;&lt;br /&gt;
Submitting a batch job means that all steps of a compute project are defined in a Bash script. This Bash script is queued and executed as soon as the compute resources are available and allocated. Jobs are enqueued with the &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; command.&lt;br /&gt;
For interactive jobs, the resources are requested with the &amp;lt;code&amp;gt;salloc&amp;lt;/code&amp;gt; command. As soon as the computing resources are available and allocated, a command line prompt is returned on a compute node and the user can freely use the allocated resources.&lt;br /&gt;
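In short (partition, time and task values below are only placeholders):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ sbatch my_job.sh            # batch job: the script is queued and run later&lt;br /&gt;
$ salloc -p cpu -n 1 -t 30    # interactive job: wait for a prompt on a compute node&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;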
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
&#039;&#039;&#039;Please remember:&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Heavy computations are not allowed on the login nodes&#039;&#039;&#039;.&amp;lt;br&amp;gt;Use a development or a regular job queue instead! Please refer to [[BwUniCluster3.0/Login#Allowed_Activities_on_Login_Nodes|Allowed Activities on Login Nodes]].&lt;br /&gt;
* &#039;&#039;&#039;Development queues&#039;&#039;&#039; are meant for &#039;&#039;&#039;development tasks&#039;&#039;&#039;.&amp;lt;br&amp;gt;Do not misuse these queues for regular, short-running jobs or chain jobs! Only one running job at a time is allowed, and at most 3 jobs may be queued.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
= Queues on bwUniCluster 3.0 = &lt;br /&gt;
== Policy ==&lt;br /&gt;
&lt;br /&gt;
The computing time is provided in accordance with the &#039;&#039;&#039;fair share policy&#039;&#039;&#039;. The individual investment shares of the respective university and the resources already used by its members are taken into account. Furthermore, the following throttling policy is active: the &#039;&#039;&#039;maximum number of physical cores&#039;&#039;&#039; used at any given time by running jobs is &#039;&#039;&#039;1920 per user&#039;&#039;&#039; (aggregated over all running jobs). This number corresponds to 30 nodes on the Ice Lake partition or 20 nodes on the standard partition. The aim is to minimize waiting times and maximize the number of users who can access computing time at the same time.&lt;br /&gt;
&lt;br /&gt;
== Regular Queues ==&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node-Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;cpu_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Ice Lake&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=72:00:00, nodes=30, mem=249600mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;cpu&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Standard&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=72:00:00, nodes=20, mem=380000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;highmem&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;High Memory&lt;br /&gt;
| mem-per-cpu=12090mb&lt;br /&gt;
| mem=380001mb&lt;br /&gt;
| time=72:00:00, nodes=4, mem=2300000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_h100&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=193300mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=72:00:00, nodes=12, mem=760000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_mi300&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU node&amp;lt;br/&amp;gt;AMD GPU x4&lt;br /&gt;
| mem-per-gpu=128200mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=72:00:00, nodes=1, mem=510000mb, ntasks-per-node=40, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_a100_il&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;gpu_h100_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;Ice Lake&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=127500mb&amp;lt;br/&amp;gt;cpus-per-gpu=16&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=48:00:00, nodes=9 (A100) / nodes=5 (H100), mem=510000mb, ntasks-per-node=64, (threads-per-core=2)&lt;br /&gt;
|}&lt;br /&gt;
Table 1: Regular Queues&lt;br /&gt;
&lt;br /&gt;
== Short Queues ==&lt;br /&gt;
&amp;lt;p style=&amp;quot;color:red; &amp;quot;&amp;gt;&amp;lt;b&amp;gt;Queues with a short runtime of 30 minutes.&amp;lt;/b&amp;gt;&amp;lt;/p&amp;gt; &lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_a100_short&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;Ice Lake&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=94000mb&amp;lt;br/&amp;gt;cpus-per-gpu=12&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=12, mem=376000mb, ntasks-per-node=48, (threads-per-core=2)&lt;br /&gt;
|}&lt;br /&gt;
Table 2: Short Queues&lt;br /&gt;
&lt;br /&gt;
== Development Queues ==&lt;br /&gt;
These queues are intended only for development tasks, i.e. debugging or performance optimization.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_cpu_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Ice Lake&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=30, nodes=8, mem=249600mb, ntasks-per-node=64, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_cpu&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Standard&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=30, nodes=1, mem=380000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_gpu_h100&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=193300mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=1, mem=760000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_gpu_a100_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&amp;lt;br/&amp;gt;&lt;br /&gt;
| mem-per-gpu=127500mb&amp;lt;br/&amp;gt;cpus-per-gpu=16 &lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=1, mem=510000mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|}&lt;br /&gt;
Table 3: Development Queues&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The default resources of a queue define the number of tasks and the amount of memory if these are not explicitly given with the sbatch command. The resource options &#039;&#039;--time&#039;&#039;, &#039;&#039;--ntasks&#039;&#039;, &#039;&#039;--nodes&#039;&#039;, &#039;&#039;--mem&#039;&#039; and &#039;&#039;--mem-per-cpu&#039;&#039; are described [[BwUniCluster3.0/Running_Jobs/Slurm|here]].&lt;br /&gt;
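&lt;br /&gt;
For instance, explicitly overriding the queue defaults on the command line could look like this (the values are placeholders, not recommendations):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ sbatch -p cpu -t 02:00:00 --ntasks=4 --mem-per-cpu=4000mb my_job.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;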
&lt;br /&gt;
== Check available resources: sinfo_t_idle ==&lt;br /&gt;
The Slurm command sinfo is used to view partition and node information. It incorporates downtime, reservations, and node state information in determining the available backfill window. On bwUniCluster 3.0, the sinfo command can only be used by administrators.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
SCC has prepared a special script (sinfo_t_idle) to find out how many nodes are available for immediate use on the system. Users can use this information to submit jobs that fit into the idle resources and thus obtain quick job turnaround times.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* The following command displays what resources are available for immediate use for the whole partition.&lt;br /&gt;
&amp;lt;pre&amp;gt;$ sinfo_t_idle &lt;br /&gt;
Partition dev_cpu                 :      1 nodes idle&lt;br /&gt;
Partition cpu                     :      1 nodes idle&lt;br /&gt;
Partition highmem                 :      2 nodes idle&lt;br /&gt;
Partition dev_gpu_h100            :      0 nodes idle&lt;br /&gt;
Partition gpu_h100                :      0 nodes idle&lt;br /&gt;
Partition gpu_mi300               :      0 nodes idle&lt;br /&gt;
Partition dev_cpu_il              :      7 nodes idle&lt;br /&gt;
Partition cpu_il                  :      2 nodes idle&lt;br /&gt;
Partition dev_gpu_a100_il         :      1 nodes idle&lt;br /&gt;
Partition gpu_a100_il             :      0 nodes idle&lt;br /&gt;
Partition gpu_h100_il             :      1 nodes idle&lt;br /&gt;
Partition gpu_a100_short          :      0 nodes idle&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Running Jobs =&lt;br /&gt;
&lt;br /&gt;
== Slurm Commands (excerpt) ==&lt;br /&gt;
Important Slurm commands for non-administrators working on bwUniCluster 3.0.&lt;br /&gt;
{| width=850px class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Slurm commands !! Brief explanation&lt;br /&gt;
|-&lt;br /&gt;
| [[#Batch Jobs: sbatch|sbatch]] || Submits a job and puts it into the queue [https://slurm.schedmd.com/sbatch.html sbatch]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Interactive Jobs: salloc|salloc]] || Requests resources for an interactive job [https://slurm.schedmd.com/salloc.html salloc]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Monitor and manage jobs |scontrol show job]] || Displays detailed job state information [https://slurm.schedmd.com/scontrol.html scontrol]&lt;br /&gt;
|-&lt;br /&gt;
| [[#List of your submitted jobs : squeue|squeue]] || Displays information about active, eligible, blocked, and/or recently completed jobs [https://slurm.schedmd.com/squeue.html squeue]&lt;br /&gt;
|-&lt;br /&gt;
| [[#List of your submitted jobs : squeue|squeue --start]] || Returns the expected start time of a submitted job [https://slurm.schedmd.com/squeue.html squeue]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Check available resources: sinfo_t_idle|sinfo_t_idle]] || Shows what resources are available for immediate use [https://slurm.schedmd.com/sinfo.html sinfo]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Canceling own jobs : scancel|scancel]] || Cancels a job [https://slurm.schedmd.com/scancel.html scancel]&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
* [https://slurm.schedmd.com/tutorials.html  Slurm Tutorials]&lt;br /&gt;
* [https://slurm.schedmd.com/pdfs/summary.pdf  Slurm command/option summary (2 pages)]&lt;br /&gt;
* [https://slurm.schedmd.com/man_index.html  Slurm Commands]&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Batch Jobs: sbatch ==&lt;br /&gt;
&lt;br /&gt;
Batch jobs are submitted with the command &#039;&#039;&#039;sbatch&#039;&#039;&#039;. The main purpose of the &#039;&#039;&#039;sbatch&#039;&#039;&#039; command is to specify the resources that are needed to run the job. &#039;&#039;&#039;sbatch&#039;&#039;&#039; will then queue the batch job. However, when a batch job starts depends on the availability of the requested resources and on your fair-share value.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* The syntax and use of &#039;&#039;&#039;sbatch&#039;&#039;&#039; can be displayed via:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ man sbatch&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;sbatch&#039;&#039;&#039; options can be used from the command line or in your job script. Different defaults for some of these options are set based on the queue and can be found [[BwUniCluster3.0/Slurm | here]]. A minimal example job script using several of these options is shown after the table.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! colspan=&amp;quot;3&amp;quot; | sbatch Options&lt;br /&gt;
|-&lt;br /&gt;
! style=&amp;quot;width:8%&amp;quot;| Command line&lt;br /&gt;
! style=&amp;quot;width:9%&amp;quot;| Script&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Purpose&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -t, --time=&#039;&#039;time&#039;&#039;&lt;br /&gt;
| #SBATCH --time=&#039;&#039;time&#039;&#039;&lt;br /&gt;
| Wall clock time limit.&amp;lt;br&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -N, --nodes=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --nodes=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of nodes to be used.&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -n, --ntasks=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --ntasks=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of tasks to be launched.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --ntasks-per-node=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --ntasks-per-node=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Maximum count of tasks per node.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -c, --cpus-per-task=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --cpus-per-task=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of CPUs required per (MPI-)task.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mem=&#039;&#039;value_in_MB&#039;&#039;&lt;br /&gt;
| #SBATCH --mem=&#039;&#039;value_in_MB&#039;&#039; &lt;br /&gt;
| Memory in MegaByte per node. (You should omit the setting of this option.)&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mem-per-cpu=&#039;&#039;value_in_MB&#039;&#039;&lt;br /&gt;
| #SBATCH --mem-per-cpu=&#039;&#039;value_in_MB&#039;&#039; &lt;br /&gt;
| Minimum Memory required per allocated CPU. (You should omit the setting of this option.)&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --exclusive&lt;br /&gt;
| #SBATCH --exclusive &lt;br /&gt;
| The job allocates all CPUs and GPUs on the nodes. It will not share the nodes with other running jobs.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mail-type=&#039;&#039;type&#039;&#039;&lt;br /&gt;
| #SBATCH --mail-type=&#039;&#039;type&#039;&#039;&lt;br /&gt;
| Notify user by email when certain event types occur.&amp;lt;br&amp;gt;Valid type values are NONE, BEGIN, END, FAIL, REQUEUE, ALL.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mail-user=&#039;&#039;mail-address&#039;&#039;&lt;br /&gt;
| #SBATCH --mail-user=&#039;&#039;mail-address&#039;&#039;&lt;br /&gt;
|  The specified mail-address receives email notification of state changes as defined by --mail-type.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --output=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --output=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| File in which job output is stored. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --error=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --error=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| File in which job error messages are stored. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -J, --job-name=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --job-name=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| Job name.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --export=[ALL,] &#039;&#039;env-variables&#039;&#039;&lt;br /&gt;
| #SBATCH --export=[ALL,] &#039;&#039;env-variables&#039;&#039;&lt;br /&gt;
| Identifies which environment variables from the submission environment are propagated to the launched application. Default is ALL.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -A, --account=&#039;&#039;group-name&#039;&#039;&lt;br /&gt;
| #SBATCH --account=&#039;&#039;group-name&#039;&#039;&lt;br /&gt;
| Charge the resources used by this job to the specified group. You may need this option if your account is assigned to more than one group. With the command &amp;quot;scontrol show job&amp;quot; the project group the job is accounted to is shown after &amp;quot;Account=&amp;quot;.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -p, --partition=&#039;&#039;queue-name&#039;&#039;&lt;br /&gt;
| #SBATCH --partition=&#039;&#039;queue-name&#039;&#039;&lt;br /&gt;
| Request a specific queue for the resource allocation.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --reservation=&#039;&#039;reservation-name&#039;&#039;&lt;br /&gt;
| #SBATCH --reservation=&#039;&#039;reservation-name&#039;&#039;&lt;br /&gt;
| Use a specific reservation for the resource allocation.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -C, --constraint=&#039;&#039;LSDF&#039;&#039;&lt;br /&gt;
| #SBATCH --constraint=LSDF&lt;br /&gt;
| Job constraint for the LSDF Online Storage file systems.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -C, --constraint=&#039;&#039;BEEOND (BEEOND_4MDS, BEEOND_MAXMDS)&#039;&#039;&lt;br /&gt;
| #SBATCH --constraint=BEEOND (BEEOND_4MDS, BEEOND_MAXMDS)&lt;br /&gt;
| Job constraint for the BeeOND on-demand file system.&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
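&lt;br /&gt;
The following sketch combines several of the options above in one job script (job name, mail address, file names and resource values are placeholders, not recommendations):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_parallel_job&lt;br /&gt;
#SBATCH --partition=cpu&lt;br /&gt;
#SBATCH --time=04:00:00&lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
#SBATCH --ntasks-per-node=96&lt;br /&gt;
#SBATCH --output=my_parallel_job_%j.out&lt;br /&gt;
#SBATCH --mail-type=END&lt;br /&gt;
#SBATCH --mail-user=me@example.org&lt;br /&gt;
&lt;br /&gt;
mpirun ./my_mpi_program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Saved e.g. as &amp;lt;code&amp;gt;my_parallel_job.sh&amp;lt;/code&amp;gt;, the script is submitted with &amp;lt;code&amp;gt;sbatch my_parallel_job.sh&amp;lt;/code&amp;gt;.&lt;br /&gt;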
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Interactive Jobs: salloc ==&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 you are only allowed to run short jobs (&amp;lt;&amp;lt; 1 hour) with small memory requirements (&amp;lt;&amp;lt; 8 GByte) on the login nodes. If you want to run longer jobs and/or jobs that require more than 8 GByte of memory, you must allocate resources for so-called interactive jobs with the command salloc on a login node. For a serial application on a compute node that requires 5000 MByte of memory, with the interactive run limited to 2 hours, the following command has to be executed:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc -p cpu -n 1 -t 120 --mem=5000&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then you will get one core on a compute node within the partition &amp;quot;cpu&amp;quot;. After execution of this command &#039;&#039;&#039;DO NOT CLOSE&#039;&#039;&#039; your current terminal session but wait until the queueing system Slurm has granted you the requested resources on the compute system. You will be logged in automatically on the granted core! To run a serial program on the granted core you only have to type the name of the executable.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./&amp;lt;my_serial_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Please be aware that your serial job must run less than 2 hours in this example, else the job will be killed during runtime by the system. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
You can now also start a graphical X11 terminal that connects you to the dedicated resources (available for 2 hours in this example) with the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ xterm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that once the walltime limit has been reached, the resources, i.e. the compute node, will automatically be revoked.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
An interactive parallel application running on one or on many compute nodes (e.g. 5 nodes with 96 cores each) usually requires a certain amount of memory per node (e.g. 50 GByte) and a maximum run time (e.g. 1 hour). For example, 5 such nodes can be allocated by the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc -p cpu -N 5 --ntasks-per-node=96 -t 01:00:00  --mem=50gb&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now you can run parallel jobs on 480 cores requiring 50 GByte of memory per node. Please be aware that you will be logged in on core 0 of the first node.&lt;br /&gt;
If you want to have access to another node, open a new terminal, connect it to bwUniCluster 3.0 as well, and use the following commands to&lt;br /&gt;
connect to the running interactive job and then to a specific node:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ srun --jobid=XXXXXXXX --pty /bin/bash&lt;br /&gt;
$ srun --nodelist=uc3nXXX --pty /bin/bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
the jobid and the nodelist can be shown.&lt;br /&gt;
&lt;br /&gt;
If you want to run MPI programs, you can do so by simply typing mpirun &amp;lt;program_name&amp;gt;; your program will then run on all 480 allocated cores. A very simple example for starting a parallel job is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can also start the debugger ddt by the commands:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module add devel/ddt&lt;br /&gt;
$ ddt &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The above commands will execute the parallel program &amp;lt;my_mpi_program&amp;gt; on all available cores. You can also start parallel programs on a subset of cores; an example for this can be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun -n 50 &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you are using Intel MPI you must start &amp;lt;my_mpi_program&amp;gt; by the command mpiexec.hydra (instead of mpirun).&lt;br /&gt;
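For example (the task count of 50 is arbitrary):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpiexec.hydra -n 50 &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;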
&lt;br /&gt;
== Interactive Computing with Jupyter ==&lt;br /&gt;
&lt;br /&gt;
== Monitor and manage jobs ==&lt;br /&gt;
&lt;br /&gt;
=== List of your submitted jobs : squeue ===&lt;br /&gt;
squeue displays information about your own active, pending and/or recently completed jobs; only your own jobs are listed. The command squeue is explained in detail on the webpage https://slurm.schedmd.com/squeue.html or via manpage (man squeue).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;squeue&#039;&#039; example on bwUniCluster 3.0 &amp;lt;small&amp;gt;(Only your own jobs are displayed!)&amp;lt;/small&amp;gt;.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue &lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
              1262       cpu     wrap ka_ab123  R       8:15      1 uc3n002&lt;br /&gt;
              1267 dev_gpu_h     wrap ka_ab123 PD       0:00      1 (Resources)&lt;br /&gt;
              1265   highmem     wrap ka_ab123  R       2:41      1 uc3n084&lt;br /&gt;
$ squeue -l&lt;br /&gt;
             JOBID PARTITION     NAME     USER    STATE       TIME TIME_LIMI  NODES NODELIST(REASON)&lt;br /&gt;
              1262       cpu     wrap ka_ab123  RUNNING       8:55     20:00      1 uc3n002&lt;br /&gt;
              1267 dev_gpu_h     wrap ka_ab123  PENDING       0:00     20:00      1 (Resources)&lt;br /&gt;
              1265   highmem     wrap ka_ab123  RUNNING       3:21     20:00      1 uc3n084&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Detailed job information : scontrol show job ===&lt;br /&gt;
scontrol show job displays detailed job state information and diagnostic output for all of your jobs or for a specified job. Detailed information is available for active, pending and recently completed jobs. The command scontrol is explained in detail on the webpage https://slurm.schedmd.com/scontrol.html or via manpage (man scontrol).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Display the state of all your jobs in normal mode: &amp;lt;code&amp;gt;scontrol show job&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Display the state of the job with &amp;lt;jobid&amp;gt; in normal mode: &amp;lt;code&amp;gt;scontrol show job &amp;lt;jobid&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Here is an example from bwUniCluster 3.0.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue&lt;br /&gt;
JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
1262       cpu     wrap ka_zs040  R       1:12      1 uc3n002&lt;br /&gt;
&lt;br /&gt;
$&lt;br /&gt;
$ # now, see what&#039;s up with my running job with jobid 1262&lt;br /&gt;
$ &lt;br /&gt;
$ scontrol show job 1262&lt;br /&gt;
&lt;br /&gt;
JobId=1262 JobName=wrap&lt;br /&gt;
   UserId=ka_zs0402(241992) GroupId=ka_scc(12345) MCS_label=N/A&lt;br /&gt;
   Priority=4246 Nice=0 Account=ka QOS=normal&lt;br /&gt;
   JobState=RUNNING Reason=None Dependency=(null)&lt;br /&gt;
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0&lt;br /&gt;
   RunTime=00:00:37 TimeLimit=00:20:00 TimeMin=N/A&lt;br /&gt;
   SubmitTime=2025-04-04T10:01:30 EligibleTime=2025-04-04T10:01:30&lt;br /&gt;
   AccrueTime=2025-04-04T10:01:30&lt;br /&gt;
   StartTime=2025-04-04T10:01:31 EndTime=2025-04-04T10:21:31 Deadline=N/A&lt;br /&gt;
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2025-04-04T10:01:31 Scheduler=Main&lt;br /&gt;
   Partition=cpu AllocNode:Sid=uc3n999:2819841&lt;br /&gt;
   ReqNodeList=(null) ExcNodeList=(null)&lt;br /&gt;
   NodeList=uc3n002&lt;br /&gt;
   BatchHost=uc3n002&lt;br /&gt;
   NumNodes=1 NumCPUs=2 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*&lt;br /&gt;
   ReqTRES=cpu=1,mem=2000M,node=1,billing=1&lt;br /&gt;
   AllocTRES=cpu=2,mem=4000M,node=1,billing=2&lt;br /&gt;
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:1 CoreSpec=*&lt;br /&gt;
   MinCPUsNode=1 MinMemoryCPU=2000M MinTmpDiskNode=0&lt;br /&gt;
   Features=(null) DelayBoot=00:00:00&lt;br /&gt;
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)&lt;br /&gt;
   Command=(null)&lt;br /&gt;
   WorkDir=/pfs/data6/home/ka/ka_scc/ka_zs0402&lt;br /&gt;
   StdErr=/pfs/data6/home/ka/ka_scc/ka_zs0402/slurm-1262.out&lt;br /&gt;
   StdIn=/dev/null&lt;br /&gt;
   StdOut=/pfs/data6/home/ka/ka_scc/ka_zs0402/slurm-1262.out&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
=== Canceling own jobs : scancel ===&lt;br /&gt;
The scancel command is used to cancel jobs. The command scancel is explained in detail on the webpage https://slurm.schedmd.com/scancel.html or via manpage (man scancel). The command is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ scancel [-i] &amp;lt;job-id&amp;gt;&lt;br /&gt;
$ scancel -t &amp;lt;job_state_name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
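&lt;br /&gt;
A short sketch of typical calls (job id 1262 is taken from the squeue example above; the -u $USER filter is an assumption to restrict the state-based cancel to your own jobs):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ scancel 1262                  # cancel job 1262 immediately&lt;br /&gt;
$ scancel -i 1262               # ask for confirmation before cancelling&lt;br /&gt;
$ scancel -t PENDING -u $USER   # cancel all of your pending jobs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;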
&lt;br /&gt;
= Slurm Options =&lt;br /&gt;
[[BwUniCluster3.0/Running_Jobs/Slurm | Detailed Slurm usage]]&lt;br /&gt;
&lt;br /&gt;
= Best Practices =&lt;br /&gt;
&lt;br /&gt;
== Step-by-Step example==&lt;br /&gt;
&lt;br /&gt;
== Dos and Don&#039;ts ==&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;| Do not run squeue or other Slurm commands in loops or via &amp;quot;watch&amp;quot;, so as not to saturate the Slurm daemon with RPC requests.&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Hardware_and_Architecture&amp;diff=15265</id>
		<title>BwUniCluster3.0/Hardware and Architecture</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Hardware_and_Architecture&amp;diff=15265"/>
		<updated>2025-09-02T09:45:21Z</updated>

		<summary type="html">&lt;p&gt;S Braun: /* Compute nodes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Architecture of bwUniCluster 3.0 =&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;bwUniCluster 3.0&#039;&#039;&#039; is a parallel computer with distributed memory. &lt;br /&gt;
It consists of the bwUniCluster 3.0 components procured in 2024 and also includes the additional compute nodes which were procured as an extension to the bwUniCluster 2.0 in 2022.&lt;br /&gt;
 &lt;br /&gt;
Each node of the system consists of two Intel Xeon or AMD EPYC processors, local memory, local storage, network adapters and optional accelerators (NVIDIA A100 and H100, AMD Instinct MI300A). All nodes are connected via a fast InfiniBand interconnect.&lt;br /&gt;
&lt;br /&gt;
The parallel file system (Lustre) is connected to the InfiniBand switch of the compute cluster. This provides a fast and scalable parallel file &lt;br /&gt;
system.&lt;br /&gt;
&lt;br /&gt;
The operating system on each node is Red Hat Enterprise Linux (RHEL) 9.4.&lt;br /&gt;
&lt;br /&gt;
The individual nodes of the system act in different roles. From an end user&#039;s point of view the relevant groups of nodes are login nodes and compute nodes. File server nodes and administrative server nodes are not accessible by users.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Login Nodes&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The login nodes are the only nodes directly accessible by end users. These nodes are used for interactive login, file management, program development, and interactive pre- and post-processing.&lt;br /&gt;
There are two nodes dedicated to this service, and both can be reached via a single address: &amp;lt;code&amp;gt;uc3.scc.kit.edu&amp;lt;/code&amp;gt;. A DNS round-robin alias distributes login sessions to the login nodes.&lt;br /&gt;
To prevent login nodes from being used for activities that are not permitted there and that affect the user experience of other users, &#039;&#039;&#039;long-running and/or compute-intensive tasks are periodically terminated without any prior warning&#039;&#039;&#039;. Please refer to [[BwUniCluster3.0/Login#Allowed_Activities_on_Login_Nodes|Allowed Activities on Login Nodes]].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Compute Nodes&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
The majority of nodes are compute nodes which are managed by a batch system. Users submit their jobs to the SLURM batch system and a job is executed when the required resources become available (depending on its fair-share priority).&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;File Systems&#039;&#039;&#039;&amp;lt;br&amp;gt;&lt;br /&gt;
bwUniCluster 3.0 comprises two parallel file systems based on Lustre.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[File:uc3.png|center|800px]]&lt;br /&gt;
&lt;br /&gt;
= Compute Resources =&lt;br /&gt;
&lt;br /&gt;
== Login nodes ==&lt;br /&gt;
&lt;br /&gt;
After a successful [[BwUniCluster3.0/Login|login]], users find themselves on one of the so-called login nodes. Technically, these largely correspond to a standard CPU node, i.e. users have two AMD EPYC 9454 processors with a total of 96 cores at their disposal. Login nodes are the bridgehead for accessing the computing resources.&lt;br /&gt;
Data and software are organized here, computing jobs are initiated and managed, and computing resources allocated for interactive use can also be accessed from here.&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#ffa500; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#ffa500; text-align:left&amp;quot;|&lt;br /&gt;
&#039;&#039;&#039;Any compute intensive job running on the login nodes will be terminated without any notice.&#039;&#039;&#039;&amp;lt;br/&amp;gt;&lt;br /&gt;
Please refer to [[BwUniCluster3.0/Login#Allowed_Activities_on_Login_Nodes|Allowed Activities on Login Nodes]].&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
== Compute nodes ==&lt;br /&gt;
All compute activities on bwUniCluster 3.0 have to be performed on the compute nodes. Compute nodes are only available by requesting the corresponding resources via the queuing system. As soon as the requested resources are available, tasks are either executed automatically via a batch script or the nodes can be used interactively. Please refer to [[BwUniCluster3.0/Running_Jobs|Running Jobs]] on how to request resources.&amp;lt;br&amp;gt;&lt;br /&gt;
The following compute node types are available:&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;b&amp;gt;CPU nodes&amp;lt;/b&amp;gt;&lt;br /&gt;
* &#039;&#039;&#039;Standard&#039;&#039;&#039;: Two AMD EPYC 9454 processors per node with a total of 96 physical CPU cores or 192 logical cores (Hyper-Threading) per node. These nodes were procured in 2024.&lt;br /&gt;
* &#039;&#039;&#039;Ice Lake&#039;&#039;&#039;: Two Intel Xeon Platinum 8358 processors per node with a total of 64 physical CPU cores or 128 logical cores (Hyper-Threading) per node. These nodes were procured in 2022 as an extension to bwUniCluster 2.0.&lt;br /&gt;
* &#039;&#039;&#039;High Memory&#039;&#039;&#039;: Similar to the standard nodes, but with six times larger memory.&lt;br /&gt;
&amp;lt;b&amp;gt;GPU nodes&amp;lt;/b&amp;gt;&lt;br /&gt;
* &#039;&#039;&#039;NVIDIA GPU x4&#039;&#039;&#039;: Similar to the standard nodes, but with larger memory and four NVIDIA H100 GPUs.&lt;br /&gt;
* &#039;&#039;&#039;AMD GPU x4&#039;&#039;&#039;: AMD&#039;s accelerated processing unit (APU) MI300A with 4 CPU sockets and 4 compute units which share the same high-bandwidth memory (HBM).&lt;br /&gt;
* &#039;&#039;&#039;Ice Lake NVIDIA GPU x4&#039;&#039;&#039;: Similar to the Ice Lake nodes, but with larger memory and four NVIDIA A100 or H100 GPUs.&lt;br /&gt;
* &#039;&#039;&#039;Cascade Lake NVIDIA GPU x4&#039;&#039;&#039;: Nodes with four NVIDIA A100 GPUs.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:10%&amp;quot;| Node Type&lt;br /&gt;
! style=&amp;quot;width:10%&amp;quot;| CPU nodes&amp;lt;br/&amp;gt;Ice Lake&lt;br /&gt;
! style=&amp;quot;width:10%&amp;quot;| CPU nodes&amp;lt;br/&amp;gt;Standard&lt;br /&gt;
! style=&amp;quot;width:10%&amp;quot;| CPU nodes&amp;lt;br/&amp;gt;High Memory&lt;br /&gt;
! style=&amp;quot;width:10%&amp;quot;| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
! style=&amp;quot;width:10%&amp;quot;| GPU node&amp;lt;br/&amp;gt;AMD GPU x4&lt;br /&gt;
! style=&amp;quot;width:10%&amp;quot;| GPU nodes&amp;lt;br/&amp;gt;Ice Lake&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
! style=&amp;quot;width:10%&amp;quot;| GPU nodes&amp;lt;br/&amp;gt;Cascade Lake&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
! style=&amp;quot;width:10%&amp;quot;| Login nodes&lt;br /&gt;
|-&lt;br /&gt;
!scope=&amp;quot;column&amp;quot;| Availability in [[BwUniCluster3.0/Running_Jobs#Queues_on_bwUniCluster_3.0| queues]]&lt;br /&gt;
| &amp;lt;code&amp;gt;cpu_il&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;dev_cpu_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| &amp;lt;code&amp;gt;cpu&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;dev_cpu&amp;lt;/code&amp;gt;&lt;br /&gt;
| &amp;lt;code&amp;gt;highmem&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;dev_highmem&amp;lt;/code&amp;gt;&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_h100&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;dev_gpu_h100&amp;lt;/code&amp;gt;&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_mi300&amp;lt;/code&amp;gt;&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_a100_il&amp;lt;/code&amp;gt; / &amp;lt;code&amp;gt;gpu_h100_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_a100_short&amp;lt;/code&amp;gt;&lt;br /&gt;
| -&lt;br /&gt;
|-&lt;br /&gt;
!scope=&amp;quot;column&amp;quot;| Number of nodes&lt;br /&gt;
| 272&lt;br /&gt;
| 70&lt;br /&gt;
| 4&lt;br /&gt;
| 12&lt;br /&gt;
| 1&lt;br /&gt;
| 15&lt;br /&gt;
| 19&lt;br /&gt;
| 2&lt;br /&gt;
|-&lt;br /&gt;
!scope=&amp;quot;column&amp;quot;| Processors&lt;br /&gt;
| Intel Xeon Platinum 8358&lt;br /&gt;
| AMD EPYC 9454&lt;br /&gt;
| AMD EPYC 9454&lt;br /&gt;
| AMD EPYC 9454&lt;br /&gt;
| AMD Zen 4&lt;br /&gt;
| Intel Xeon Platinum 8358&lt;br /&gt;
| Intel Xeon Gold 6248R&lt;br /&gt;
| AMD EPYC 9454&lt;br /&gt;
|-&lt;br /&gt;
!scope=&amp;quot;column&amp;quot;| Number of sockets&lt;br /&gt;
| 2&lt;br /&gt;
| 2&lt;br /&gt;
| 2&lt;br /&gt;
| 2&lt;br /&gt;
| 4&lt;br /&gt;
| 2&lt;br /&gt;
| 2&lt;br /&gt;
| 2&lt;br /&gt;
|-&lt;br /&gt;
!scope=&amp;quot;column&amp;quot;| Total number of cores&lt;br /&gt;
| 64&lt;br /&gt;
| 96&lt;br /&gt;
| 96&lt;br /&gt;
| 96&lt;br /&gt;
| 96 (4x 24)&lt;br /&gt;
| 64&lt;br /&gt;
| 48&lt;br /&gt;
| 96&lt;br /&gt;
|-&lt;br /&gt;
!scope=&amp;quot;column&amp;quot;| Main memory&lt;br /&gt;
| 256 GB&lt;br /&gt;
| 384 GB&lt;br /&gt;
| 2.3 TB&lt;br /&gt;
| 768 GB&lt;br /&gt;
| 4x 128 GB HBM3&lt;br /&gt;
| 512 GB&lt;br /&gt;
| 384 GB&lt;br /&gt;
| 384 GB&lt;br /&gt;
|-&lt;br /&gt;
!scope=&amp;quot;column&amp;quot;| Local SSD&lt;br /&gt;
| 1.8 TB NVMe&lt;br /&gt;
| 3.84 TB NVMe&lt;br /&gt;
| 15.36 TB NVMe&lt;br /&gt;
| 15.36 TB NVMe&lt;br /&gt;
| 7.68 TB NVMe&lt;br /&gt;
| 6.4 TB NVMe&lt;br /&gt;
| 1.92 TB SATA SSD&lt;br /&gt;
| 7.68 TB SATA SSD&lt;br /&gt;
|-&lt;br /&gt;
!scope=&amp;quot;column&amp;quot;| Accelerators&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| 4x NVIDIA H100 &lt;br /&gt;
| 4x AMD Instinct MI300A&lt;br /&gt;
| 4x NVIDIA A100 / H100 &lt;br /&gt;
| 4x NVIDIA A100&lt;br /&gt;
| -&lt;br /&gt;
|-&lt;br /&gt;
!scope=&amp;quot;column&amp;quot;| Accelerator memory&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| -&lt;br /&gt;
| 94 GB&lt;br /&gt;
| APU&lt;br /&gt;
| 80 GB / 94 GB&lt;br /&gt;
| 40 GB&lt;br /&gt;
| -&lt;br /&gt;
|-&lt;br /&gt;
!scope=&amp;quot;column&amp;quot;| Interconnect&lt;br /&gt;
| IB HDR200 &lt;br /&gt;
| IB 2x NDR200&lt;br /&gt;
| IB 2x NDR200&lt;br /&gt;
| IB 4x NDR200&lt;br /&gt;
| IB 2x NDR200&lt;br /&gt;
| IB 2x HDR200 &lt;br /&gt;
| IB 4x EDR&lt;br /&gt;
| IB 1x NDR200&lt;br /&gt;
|}&lt;br /&gt;
Table 1: Hardware overview and properties&lt;br /&gt;
&lt;br /&gt;
= File Systems =&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 the following file systems are available:&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;$HOME&#039;&#039;&#039;&amp;lt;br&amp;gt;The HOME directory is created automatically after account activation, and the environment variable $HOME holds its name. HOME is the place, where users find themselves after login.&lt;br /&gt;
* &#039;&#039;&#039;Workspaces&#039;&#039;&#039;&amp;lt;br&amp;gt;Users can create so-called workspaces for non-permanent data with temporary lifetime.&lt;br /&gt;
* &#039;&#039;&#039;Workspaces on flash storage&#039;&#039;&#039;&amp;lt;br&amp;gt;A further workspace file system based on flash-only storage is available for special requirements and certain users.&lt;br /&gt;
* &#039;&#039;&#039;$TMPDIR&#039;&#039;&#039;&amp;lt;br&amp;gt;The directory $TMPDIR is only available and visible on the local node during the runtime of a compute job. It is located on fast SSD storage devices.&lt;br /&gt;
* &#039;&#039;&#039;BeeOND&#039;&#039;&#039; (BeeGFS On-Demand)&amp;lt;br&amp;gt;On request a parallel on-demand file system (BeeOND) is created which uses the SSDs of the nodes which were allocated to the batch job.&lt;br /&gt;
* &#039;&#039;&#039;LSDF Online Storage&#039;&#039;&#039;&amp;lt;br&amp;gt;On request the external LSDF Online Storage is mounted on the nodes which were allocated to the batch job. On the login nodes, LSDF is automatically mounted.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Which file system to use?&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
You should separate your data and store it on the appropriate file system.&lt;br /&gt;
Permanently needed data like software or important results should be stored in $HOME but capacity restrictions (quotas) apply.&lt;br /&gt;
In case you accidentally deleted data on $HOME there is a chance that we can restore it from backup.&lt;br /&gt;
Permanent data which is not needed for months or exceeds the capacity restrictions should be sent to the LSDF Online Storage or to the archive and deleted from the file systems. Temporary data which is only needed on a single node and which does not exceed the disk space shown in Table 1 above should be stored&lt;br /&gt;
below $TMPDIR. Data which is read many times on a single node, e.g. if you are doing AI training, &lt;br /&gt;
should be copied to $TMPDIR and read from there. Temporary data which is used from many nodes &lt;br /&gt;
of your batch job and which is only needed during job runtime should be stored on a &lt;br /&gt;
parallel on-demand file system BeeOND. Temporary data which can be recomputed or which is the &lt;br /&gt;
result of one job and input for another job should be stored in workspaces. The lifetime &lt;br /&gt;
of data in workspaces is limited and depends on the lifetime of the workspace which can be &lt;br /&gt;
several months.&lt;br /&gt;
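&lt;br /&gt;
As a sketch inside a job script (the source path, data set name and training script are assumptions for illustration):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# copy the training data once to the node-local SSD ...&lt;br /&gt;
cp -r $HOME/my_dataset $TMPDIR/&lt;br /&gt;
# ... and read it from there during the many training passes&lt;br /&gt;
python train.py --data $TMPDIR/my_dataset&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;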
&lt;br /&gt;
For further details please check: [[BwUniCluster3.0/Hardware_and_Architecture/Filesystem_Details|File System Details]]&lt;br /&gt;
&lt;br /&gt;
== $HOME ==&lt;br /&gt;
&lt;br /&gt;
The $HOME directories of bwUniCluster 3.0 users are located on the parallel file system Lustre.&lt;br /&gt;
You have access to your $HOME directory from all nodes of UC3. A regular backup of these directories &lt;br /&gt;
to tape archive is done automatically. The directory $HOME is used to hold those files that are&lt;br /&gt;
permanently used like source codes, configuration files, executable programs etc.&lt;br /&gt;
&lt;br /&gt;
[[BwUniCluster3.0/Hardware_and_Architecture/Filesystem_Details#$HOME|Detailed information on $HOME]]&lt;br /&gt;
&lt;br /&gt;
== Workspaces ==&lt;br /&gt;
&lt;br /&gt;
On UC3 workspaces should be used to store large non-permanent data sets, e.g. restart files or output&lt;br /&gt;
data that has to be post-processed. The file system used for workspaces is also the parallel file system Lustre. This file system is especially designed for parallel access and for a high throughput to large&lt;br /&gt;
files. It is able to provide high data transfer rates of up to 40 GB/s write and read performance when data access is parallel. &lt;br /&gt;
&lt;br /&gt;
On UC3 there is a default user quota limit of 40 TiB and 20 million inodes (files and directories) per user.&lt;br /&gt;
&lt;br /&gt;
[[BwUniCluster3.0/Hardware_and_Architecture/Filesystem_Details#Workspaces|Detailed information on Workspaces]]&lt;br /&gt;
&lt;br /&gt;
== Workspaces on flash storage ==&lt;br /&gt;
&lt;br /&gt;
Another workspace file system based on flash-only storage is available for special requirements and certain users.&lt;br /&gt;
If possible, this file system should be used from the Ice Lake nodes of bwUniCluster 3.0 (queue &#039;&#039;cpu_il&#039;&#039;). &lt;br /&gt;
It provides high IOPS rates and better performance for small files. The quota limits are lower than on the&lt;br /&gt;
normal workspace file system.&lt;br /&gt;
&lt;br /&gt;
[[BwUniCluster3.0/Hardware_and_Architecture/Filesystem_Details#Workspaces_on_flash_storage|Detailed information on Workspaces on flash storage]]&lt;br /&gt;
&lt;br /&gt;
== $TMPDIR ==&lt;br /&gt;
&lt;br /&gt;
The environment variable $TMPDIR contains the name of a directory which is located on the local SSD of each node. &lt;br /&gt;
This directory should be used for temporary files being accessed from the local node. It should &lt;br /&gt;
also be used if you read the same data many times from a single node, e.g. if you are doing AI training. &lt;br /&gt;
Because of the extremely fast local SSD storage devices performance with small files is much better than on the parallel file systems. &lt;br /&gt;
&lt;br /&gt;
[[BwUniCluster3.0/Hardware_and_Architecture/Filesystem_Details#$TMPDIR|Detailed information on $TMPDIR]]&lt;br /&gt;
&lt;br /&gt;
== BeeOND (BeeGFS On-Demand) ==&lt;br /&gt;
&lt;br /&gt;
Users have the possibility to request a private BeeOND (on-demand BeeGFS) parallel filesystem for each job. The file system is created during job startup and purged when your job completes.&lt;br /&gt;
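&lt;br /&gt;
A BeeOND file system is requested via a job constraint, e.g. in the job script (a minimal sketch; the mount point is provided by the system at job start):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#SBATCH --constraint=BEEOND&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;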
&lt;br /&gt;
[[BwUniCluster3.0/Hardware_and_Architecture/Filesystem_Details#BeeOND_(BeeGFS_On-Demand)|Detailed information on BeeOND]]&lt;br /&gt;
&lt;br /&gt;
== LSDF Online Storage ==&lt;br /&gt;
&lt;br /&gt;
The LSDF Online Storage allows dedicated users to store scientific measurement data and simulation results. BwUniCluster 3.0 has an extremely fast network connection to the LSDF Online Storage. This file system provides external access via different protocols and is only available for certain users.&lt;br /&gt;
&lt;br /&gt;
[[BwUniCluster3.0/Hardware_and_Architecture/Filesystem_Details#LSDF_Online_Storage|Detailed information on LSDF Online Storage]]&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=Talk:Development/Python&amp;diff=15241</id>
		<title>Talk:Development/Python</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=Talk:Development/Python&amp;diff=15241"/>
		<updated>2025-08-21T11:29:08Z</updated>

		<summary type="html">&lt;p&gt;S Braun: Created page with &amp;quot;Samuel: We should add uv to the tools list.&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Samuel: We should add uv to the tools list.&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=Workspace&amp;diff=15209</id>
		<title>Workspace</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=Workspace&amp;diff=15209"/>
		<updated>2025-08-15T11:07:02Z</updated>

		<summary type="html">&lt;p&gt;S Braun: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Workspace tools&#039;&#039;&#039; provide temporary scratch space, so-called &#039;&#039;&#039;workspaces&#039;&#039;&#039;, for your calculations on a central file storage. They are meant to keep data for a limited time – usually longer than the duration of a single job run.&lt;br /&gt;
&lt;br /&gt;
== No Backup ==&lt;br /&gt;
&lt;br /&gt;
Workspaces are not meant for permanent storage, hence data in workspaces is not backed up and may be lost in case of problems on the storage system. Please copy/move important results to $HOME or some disks outside the cluster.&lt;br /&gt;
&lt;br /&gt;
== Create workspace ==&lt;br /&gt;
To create a workspace you need to state the &#039;&#039;name&#039;&#039; of your workspace and its &#039;&#039;lifetime&#039;&#039; in days. A maximum value for the &#039;&#039;lifetime&#039;&#039; and a maximum number of renewals are defined on each cluster. Execution of:&lt;br /&gt;
&lt;br /&gt;
   $ ws_allocate mySpace 30&lt;br /&gt;
&lt;br /&gt;
e.g. returns:&lt;br /&gt;
 &lt;br /&gt;
   Workspace created. Duration is 720 hours. &lt;br /&gt;
   Further extensions available: 3&lt;br /&gt;
   /work/workspace/scratch/username-mySpace-0&lt;br /&gt;
&lt;br /&gt;
For more information read the program&#039;s help, i.e. &#039;&#039;$ ws_allocate -h&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
== List all your workspaces ==&lt;br /&gt;
To list all your workspaces, execute:&lt;br /&gt;
&lt;br /&gt;
   $ ws_list&lt;br /&gt;
&lt;br /&gt;
which will return:&lt;br /&gt;
* Workspace ID&lt;br /&gt;
* Workspace location&lt;br /&gt;
* available extensions&lt;br /&gt;
* creation date and remaining time&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Find workspace location ==&lt;br /&gt;
&lt;br /&gt;
Workspace location/path can be prompted for any workspace &#039;&#039;ID&#039;&#039; using &#039;&#039;&#039;ws_find&#039;&#039;&#039;, in case of workspace &#039;&#039;mySpace&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
   $ ws_find mySpace&lt;br /&gt;
&lt;br /&gt;
returns the one-liner:&lt;br /&gt;
&lt;br /&gt;
   /work/workspace/scratch/username-mySpace-0&lt;br /&gt;
&lt;br /&gt;
 &lt;br /&gt;
&lt;br /&gt;
== Extend lifetime of your workspace ==&lt;br /&gt;
&lt;br /&gt;
A workspace&#039;s lifetime can only be extended a cluster-specific number of times. There are several commands to extend a workspace&#039;s lifetime:&lt;br /&gt;
#&amp;lt;pre&amp;gt;$ ws_extend mySpace 40&amp;lt;/pre&amp;gt; which extends workspace ID &#039;&#039;mySpace&#039;&#039; by &#039;&#039;40&#039;&#039; days from now,&lt;br /&gt;
#&amp;lt;pre&amp;gt;$ ws_extend mySpace&amp;lt;/pre&amp;gt; which extends workspace ID &#039;&#039;mySpace&#039;&#039; by the number of days used previously,&lt;br /&gt;
#&amp;lt;pre&amp;gt;$ ws_allocate -x mySpace 40&amp;lt;/pre&amp;gt; which extends workspace ID &#039;&#039;mySpace&#039;&#039; by &#039;&#039;40&#039;&#039; days from now.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Setting Permissions for Sharing Files ==&lt;br /&gt;
The examples assume that the directory whose permissions you want to change is stored in $DIR. If you want to share a workspace, DIR can be set with &amp;lt;code&amp;gt;DIR=$(ws_find my_workspace)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Regular Unix Permissions ===&lt;br /&gt;
&lt;br /&gt;
Making workspaces world readable/writable using standard unix access rights with &amp;lt;tt&amp;gt;chmod&amp;lt;/tt&amp;gt; is only feasible if you are in a research group and you and your co-workers share a common  (&amp;quot;bwXXXXX&amp;quot;) unix group. It is strongly discouraged to make files readable or even writable to everyone or to large common groups. &lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
!style=&amp;quot;width:45%&amp;quot; | Command&lt;br /&gt;
!style=&amp;quot;width:55%&amp;quot; | Action&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;tt&amp;gt;chgrp -R bw16e001 &amp;quot;$DIR&amp;quot;&amp;lt;/tt&amp;gt;&lt;br /&gt;
&amp;lt;tt&amp;gt;chmod -R g+rX &amp;quot;$DIR&amp;quot;&amp;lt;/tt&amp;gt;&lt;br /&gt;
|Set group ownership and grant read access to group for files in workspace via unix rights to the group &amp;quot;bw16e001&amp;quot; (has to be re-done if files are added)&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;tt&amp;gt;chgrp -R bw16e001 &amp;quot;$DIR&amp;quot;&amp;lt;/tt&amp;gt; &lt;br /&gt;
&amp;lt;tt&amp;gt;chmod -R g+rswX &amp;quot;$DIR&amp;quot;&amp;lt;/tt&amp;gt;&lt;br /&gt;
|Set group ownership and grant read/write access to group for files in workspace via unix rights (has to be re-done if files are added). Group will be inherited by new files, but rights for the group will have to be re-set with chmod for every new file&lt;br /&gt;
|- &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Options used:&lt;br /&gt;
* -R: recursive&lt;br /&gt;
* g+rwx&lt;br /&gt;
** g: group&lt;br /&gt;
** + add permissions (- to remove)&lt;br /&gt;
** rwx: read, write, execute&lt;br /&gt;
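&lt;br /&gt;
Putting these together, a minimal sketch for sharing a workspace read-only with a common unix group (using the example group &amp;quot;bw16e001&amp;quot; from the table above) could look like:&lt;br /&gt;
&lt;br /&gt;
   $ DIR=$(ws_find my_workspace)   # path of the workspace to be shared&lt;br /&gt;
   $ chgrp -R bw16e001 &amp;quot;$DIR&amp;quot;      # hand over the files to the common group&lt;br /&gt;
   $ chmod -R g+rX &amp;quot;$DIR&amp;quot;          # group members may read files and enter directories&lt;br /&gt;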
&lt;br /&gt;
=== ACLs: Access Control Lists ===&lt;br /&gt;
ACLs allow a much more detailed distribution of permissions but are a bit more complicated and not visible in detail via &amp;quot;ls&amp;quot;. They have the additional advantage that you can set a &amp;quot;default&amp;quot; ACL for a directory (with a &amp;lt;tt&amp;gt;-d&amp;lt;/tt&amp;gt; flag or a &amp;lt;tt&amp;gt;d:&amp;lt;/tt&amp;gt; prefix), which will cause all newly created files to inherit the ACLs from the directory. Regular unix permissions only have limited support (only group ownership, not access rights) for this via the setgid bit.&lt;br /&gt;
&lt;br /&gt;
Best practices with respect to ACL usage:&lt;br /&gt;
# Take into account that ACLs take precedence over standard unix access rights&lt;br /&gt;
# The owner of a workspace is responsible for its content and management&lt;br /&gt;
&lt;br /&gt;
Please note that &amp;lt;tt&amp;gt;ls&amp;lt;/tt&amp;gt; (list directory contents) indicates ACLs on directories and files only when run in long format, i.e. as &amp;lt;tt&amp;gt;ls -l&amp;lt;/tt&amp;gt;, by a &amp;quot;+&amp;quot; sign after the standard unix access rights. &lt;br /&gt;
&lt;br /&gt;
Examples with regard to &amp;quot;my_workspace&amp;quot;:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
!style=&amp;quot;width:45%&amp;quot; | Command&lt;br /&gt;
!style=&amp;quot;width:55%&amp;quot; | Action&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;tt&amp;gt;getfacl &amp;quot;$DIR&amp;quot;&amp;lt;/tt&amp;gt;&lt;br /&gt;
|List access rights on $DIR&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;tt&amp;gt;setfacl -Rm u:fr_xy1:rX,d:u:fr_xy1:rX &amp;quot;$DIR&amp;quot;&amp;lt;/tt&amp;gt;&lt;br /&gt;
|Grant user &amp;quot;fr_xy1&amp;quot; read-only access to $DIR&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;tt&amp;gt;setfacl -Rm u:fr_me0000:rwX,d:u:fr_me0000:rwX &amp;quot;$DIR&amp;quot;&amp;lt;/tt&amp;gt;&lt;br /&gt;
&amp;lt;tt&amp;gt;setfacl -Rm u:fr_xy1:rwX,d:u:fr_xy1:rwX &amp;quot;$DIR&amp;quot;&amp;lt;/tt&amp;gt;&lt;br /&gt;
|Grant your own user &amp;quot;fr_me0000&amp;quot; and &amp;quot;fr_xy1&amp;quot; inheritable read and write access to $DIR, so you can also read/write files put into the workspace by a coworker&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;tt&amp;gt;setfacl -Rm g:bw16e001:rX,d:g:bw16e001:rX &amp;quot;$DIR&amp;quot;&amp;lt;/tt&amp;gt;&lt;br /&gt;
|Grant group (Rechenvorhaben) &amp;quot;bw16e001&amp;quot; read-only access to $DIR&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;tt&amp;gt;setfacl -Rb &amp;quot;$DIR&amp;quot;&amp;lt;/tt&amp;gt;&lt;br /&gt;
|Remove all ACL rights. Standard Unix access rights apply again.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Options used:&lt;br /&gt;
* -R: recursive&lt;br /&gt;
* -m: modify&lt;br /&gt;
* u:username:rwX – u: the following name is a user; rwX: read, write, eXecute (execute only where it is already set)&lt;br /&gt;
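&lt;br /&gt;
As a combined sketch (the user name &amp;quot;fr_xy1&amp;quot; is just the example from the table above), granting a coworker inheritable read access to a whole workspace and checking the result could look like:&lt;br /&gt;
&lt;br /&gt;
   $ DIR=$(ws_find my_workspace)&lt;br /&gt;
   $ setfacl -Rm u:fr_xy1:rX,d:u:fr_xy1:rX &amp;quot;$DIR&amp;quot;   # read access, also for newly created files&lt;br /&gt;
   $ getfacl &amp;quot;$DIR&amp;quot;                                  # verify the resulting ACLs&lt;br /&gt;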
&lt;br /&gt;
== Delete a Workspace ==&lt;br /&gt;
&lt;br /&gt;
   $ ws_release mySpace # Manually erase your workspace mySpace&lt;br /&gt;
&lt;br /&gt;
Note: workspaces are kept for some time after release. To immediately delete and free space e.g. for quota reasons, delete the files with rm before release.&lt;br /&gt;
&lt;br /&gt;
Newer versions of workspace tools have a --delete-data flag that immediately deletes data. Note that deleted data from workspaces is permanently lost.&lt;br /&gt;
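&lt;br /&gt;
A minimal sketch of both variants (the &amp;lt;code&amp;gt;--delete-data&amp;lt;/code&amp;gt; flag is only available in newer versions of the workspace tools, so check &amp;lt;code&amp;gt;ws_release -h&amp;lt;/code&amp;gt; on your cluster first):&lt;br /&gt;
&lt;br /&gt;
   $ rm -rf &amp;quot;$(ws_find mySpace)&amp;quot;/*     # free the occupied space immediately&lt;br /&gt;
   $ ws_release mySpace                    # then release the (now empty) workspace&lt;br /&gt;
   $ ws_release --delete-data mySpace      # alternative with newer tools: release and delete in one step&lt;br /&gt;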
&lt;br /&gt;
== Restore an Expired Workspace ==&lt;br /&gt;
&lt;br /&gt;
For a certain (system-specific) grace time following workspace expiration, a workspace can be restored by performing the following steps:&lt;br /&gt;
&lt;br /&gt;
(1) Display restorable workspaces.&lt;br /&gt;
 ws_restore -l&lt;br /&gt;
&lt;br /&gt;
(2) Create a new workspace as the target for the restore:&lt;br /&gt;
 ws_allocate restored 60&lt;br /&gt;
&lt;br /&gt;
(3) Restore:&lt;br /&gt;
 ws_restore &amp;lt;full_name_of_expired_workspace&amp;gt; restored&lt;br /&gt;
&lt;br /&gt;
The expired workspace has to be specified using the &#039;&#039;&#039;full name&#039;&#039;&#039;, including username prefix and timestamp suffix (otherwise, it cannot be uniquely identified).&lt;br /&gt;
The target workspace, on the other hand, must be given with just its short name as listed by &amp;lt;code&amp;gt;ws_list&amp;lt;/code&amp;gt;, without the username prefix.&lt;br /&gt;
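&lt;br /&gt;
For example, the whole sequence might look like this (the expired name shown is only a placeholder; use the exact full name printed by &amp;lt;code&amp;gt;ws_restore -l&amp;lt;/code&amp;gt;):&lt;br /&gt;
&lt;br /&gt;
   $ ws_restore -l                                  # prints e.g. username-mySpace-&amp;lt;timestamp&amp;gt;&lt;br /&gt;
   $ ws_allocate restored 60                        # new target workspace (short name)&lt;br /&gt;
   $ ws_restore username-mySpace-&amp;lt;timestamp&amp;gt; restored&lt;br /&gt;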
&lt;br /&gt;
If the workspace is no longer visible/restorable, it has been &#039;&#039;&#039;permanently deleted&#039;&#039;&#039; and cannot be restored, not even by us. Please always remember that workspaces are intended solely for temporary work data, and there is no backup of data in the workspaces.&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=Workspace&amp;diff=15208</id>
		<title>Workspace</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=Workspace&amp;diff=15208"/>
		<updated>2025-08-15T11:06:29Z</updated>

		<summary type="html">&lt;p&gt;S Braun: /* Create workspace */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Workspace tools&#039;&#039;&#039; provide temporary scratch space, so-called &#039;&#039;&#039;workspaces&#039;&#039;&#039;, for your calculations on a central file storage. They are meant to keep data for a limited time – but usually longer than the time of a single job run. &lt;br /&gt;
&lt;br /&gt;
== No Backup ==&lt;br /&gt;
&lt;br /&gt;
Workspaces are not meant for permanent storage, hence data in workspaces is not backed up and may be lost in case of problems on the storage system. Please copy/move important results to $HOME or some disks outside the cluster.&lt;br /&gt;
&lt;br /&gt;
== Create workspace ==&lt;br /&gt;
To create a workspace you need to state the &#039;&#039;name&#039;&#039; of your workspace and its &#039;&#039;lifetime&#039;&#039; in days. A maximum value for &#039;&#039;lifetime&#039;&#039; and a maximum number of renewals is defined on each cluster. Execution of:&lt;br /&gt;
&lt;br /&gt;
   $ ws_allocate mySpace 30&lt;br /&gt;
&lt;br /&gt;
e.g. returns:&lt;br /&gt;
 &lt;br /&gt;
   Workspace created. Duration is 720 hours. &lt;br /&gt;
   Further extensions available: 3&lt;br /&gt;
   /work/workspace/scratch/username-mySpace-0&lt;br /&gt;
&lt;br /&gt;
For more information read the program&#039;s help, i.e. &#039;&#039;$ ws_allocate -h&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
== List all your workspaces ==&lt;br /&gt;
To list all your workspaces, execute:&lt;br /&gt;
&lt;br /&gt;
   $ ws_list&lt;br /&gt;
&lt;br /&gt;
which will return:&lt;br /&gt;
* Workspace ID&lt;br /&gt;
* Workspace location&lt;br /&gt;
* available extensions&lt;br /&gt;
* creation date and remaining time&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Find workspace location ==&lt;br /&gt;
&lt;br /&gt;
The workspace location/path can be queried for any workspace &#039;&#039;ID&#039;&#039; using &#039;&#039;&#039;ws_find&#039;&#039;&#039;; in case of workspace &#039;&#039;blah&#039;&#039;:&lt;br /&gt;
&lt;br /&gt;
   $ ws_find blah&lt;br /&gt;
&lt;br /&gt;
returns the one-liner:&lt;br /&gt;
&lt;br /&gt;
   /work/workspace/scratch/username-blah-0&lt;br /&gt;
&lt;br /&gt;
 &lt;br /&gt;
&lt;br /&gt;
== Extend lifetime of your workspace ==&lt;br /&gt;
&lt;br /&gt;
Any workspace&#039;s lifetime can only be extended a cluster-specific number of times. There are several commands to extend a workspace&#039;s lifetime:&lt;br /&gt;
#&amp;lt;pre&amp;gt;$ ws_extend blah 40&amp;lt;/pre&amp;gt; which extends workspace ID &#039;&#039;blah&#039;&#039; by &#039;&#039;40&#039;&#039; days from now,&lt;br /&gt;
#&amp;lt;pre&amp;gt;$ ws_extend blah&amp;lt;/pre&amp;gt; which extends workspace ID &#039;&#039;blah&#039;&#039; by the number of days used previously,&lt;br /&gt;
#&amp;lt;pre&amp;gt;$ ws_allocate -x blah 40&amp;lt;/pre&amp;gt; which extends workspace ID &#039;&#039;blah&#039;&#039; by &#039;&#039;40&#039;&#039; days from now.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Setting Permissions for Sharing Files ==&lt;br /&gt;
The examples will assume you want to change the directory in $DIR. If you want to share a workspace, DIR could be set with &amp;lt;code&amp;gt;DIR=$(ws_find my_workspace)&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Regular Unix Permissions ===&lt;br /&gt;
&lt;br /&gt;
Making workspaces readable/writable for others using standard unix access rights with &amp;lt;tt&amp;gt;chmod&amp;lt;/tt&amp;gt; is only feasible if you are in a research group and you and your co-workers share a common (&amp;quot;bwXXXXX&amp;quot;) unix group. It is strongly discouraged to make files readable or even writable to everyone or to large common groups. &lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
!style=&amp;quot;width:45%&amp;quot; | Command&lt;br /&gt;
!style=&amp;quot;width:55%&amp;quot; | Action&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;tt&amp;gt;chgrp -R bw16e001 &amp;quot;$DIR&amp;quot;&amp;lt;/tt&amp;gt;&lt;br /&gt;
&amp;lt;tt&amp;gt;chmod -R g+rX &amp;quot;$DIR&amp;quot;&amp;lt;/tt&amp;gt;&lt;br /&gt;
|Set group ownership and grant read access to group for files in workspace via unix rights to the group &amp;quot;bw16e001&amp;quot; (has to be re-done if files are added)&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;tt&amp;gt;chgrp -R bw16e001 &amp;quot;$DIR&amp;quot;&amp;lt;/tt&amp;gt; &lt;br /&gt;
&amp;lt;tt&amp;gt;chmod -R g+rswX &amp;quot;$DIR&amp;quot;&amp;lt;/tt&amp;gt;&lt;br /&gt;
|Set group ownership and grant read/write access to group for files in workspace via unix rights (has to be re-done if files are added). Group will be inherited by new files, but rights for the group will have to be re-set with chmod for every new file&lt;br /&gt;
|- &lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Options used:&lt;br /&gt;
* -R: recursive&lt;br /&gt;
* g+rwx&lt;br /&gt;
** g: group&lt;br /&gt;
** + add permissions (- to remove)&lt;br /&gt;
** rwx: read, write, execute&lt;br /&gt;
&lt;br /&gt;
=== ACLs: Access Control Lists ===&lt;br /&gt;
ACLs allow a much more detailed distribution of permissions but are a bit more complicated and not visible in detail via &amp;quot;ls&amp;quot;. They have the additional advantage that you can set a &amp;quot;default&amp;quot; ACL for a directory (with a &amp;lt;tt&amp;gt;-d&amp;lt;/tt&amp;gt; flag or a &amp;lt;tt&amp;gt;d:&amp;lt;/tt&amp;gt; prefix), which will cause all newly created files to inherit the ACLs from the directory. Regular unix permissions only have limited support (only group ownership, not access rights) for this via the setgid bit.&lt;br /&gt;
&lt;br /&gt;
Best practices with respect to ACL usage:&lt;br /&gt;
# Take into account that ACLs take precedence over standard unix access rights&lt;br /&gt;
# The owner of a workspace is responsible for its content and management&lt;br /&gt;
&lt;br /&gt;
Please note that &amp;lt;tt&amp;gt;ls&amp;lt;/tt&amp;gt; (list directory contents) indicates ACLs on directories and files only when run in long format, i.e. as &amp;lt;tt&amp;gt;ls -l&amp;lt;/tt&amp;gt;, by a &amp;quot;+&amp;quot; sign after the standard unix access rights. &lt;br /&gt;
&lt;br /&gt;
Examples with regard to &amp;quot;my_workspace&amp;quot;:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
!style=&amp;quot;width:45%&amp;quot; | Command&lt;br /&gt;
!style=&amp;quot;width:55%&amp;quot; | Action&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;tt&amp;gt;getfacl &amp;quot;$DIR&amp;quot;&amp;lt;/tt&amp;gt;&lt;br /&gt;
|List access rights on $DIR&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;tt&amp;gt;setfacl -Rm u:fr_xy1:rX,d:u:fr_xy1:rX &amp;quot;$DIR&amp;quot;&amp;lt;/tt&amp;gt;&lt;br /&gt;
|Grant user &amp;quot;fr_xy1&amp;quot; read-only access to $DIR&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;tt&amp;gt;setfacl -Rm u:fr_me0000:rwX,d:u:fr_me0000:rwX &amp;quot;$DIR&amp;quot;&amp;lt;/tt&amp;gt;&lt;br /&gt;
&amp;lt;tt&amp;gt;setfacl -Rm u:fr_xy1:rwX,d:u:fr_xy1:rwX &amp;quot;$DIR&amp;quot;&amp;lt;/tt&amp;gt;&lt;br /&gt;
|Grant your own user &amp;quot;fr_me0000&amp;quot; and &amp;quot;fr_xy1&amp;quot; inheritable read and write access to $DIR, so you can also read/write files put into the workspace by a coworker&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;tt&amp;gt;setfacl -Rm g:bw16e001:rX,d:g:bw16e001:rX &amp;quot;$DIR&amp;quot;&amp;lt;/tt&amp;gt;&lt;br /&gt;
|Grant group (Rechenvorhaben) &amp;quot;bw16e001&amp;quot; read-only access to $DIR&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;tt&amp;gt;setfacl -Rb &amp;quot;$DIR&amp;quot;&amp;lt;/tt&amp;gt;&lt;br /&gt;
|Remove all ACL rights. Standard Unix access rights apply again.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Options used:&lt;br /&gt;
* -R: recursive&lt;br /&gt;
* -m: modify&lt;br /&gt;
* u:username:rwX – u: the following name is a user; rwX: read, write, eXecute (execute only where it is already set)&lt;br /&gt;
&lt;br /&gt;
== Delete a Workspace ==&lt;br /&gt;
&lt;br /&gt;
   $ ws_release blah # Manually erase your workspace blah&lt;br /&gt;
&lt;br /&gt;
Note: workspaces are kept for some time after release. To immediately delete and free space e.g. for quota reasons, delete the files with rm before release.&lt;br /&gt;
&lt;br /&gt;
Newer versions of workspace tools have a --delete-data flag that immediately deletes data. Note that deleted data from workspaces is permanently lost.&lt;br /&gt;
&lt;br /&gt;
== Restore an Expired Workspace ==&lt;br /&gt;
&lt;br /&gt;
For a certain (system-specific) grace time following workspace expiration, a workspace can be restored by performing the following steps:&lt;br /&gt;
&lt;br /&gt;
(1) Display restorable workspaces.&lt;br /&gt;
 ws_restore -l&lt;br /&gt;
&lt;br /&gt;
(2) Create a new workspace as the target for the restore:&lt;br /&gt;
 ws_allocate restored 60&lt;br /&gt;
&lt;br /&gt;
(3) Restore:&lt;br /&gt;
 ws_restore &amp;lt;full_name_of_expired_workspace&amp;gt; restored&lt;br /&gt;
&lt;br /&gt;
The expired workspace has to be specified using the &#039;&#039;&#039;full name&#039;&#039;&#039;, including username prefix and timestamp suffix (otherwise, it cannot be uniquely identified).&lt;br /&gt;
The target workspace, on the other hand, must be given with just its short name as listed by &amp;lt;code&amp;gt;ws_list&amp;lt;/code&amp;gt;, without the username prefix.&lt;br /&gt;
&lt;br /&gt;
If the workspace is no longer visible/restorable, it has been &#039;&#039;&#039;permanently deleted&#039;&#039;&#039; and cannot be restored, not even by us. Please always remember that workspaces are intended solely for temporary work data, and there is no backup of data in the workspaces.&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster2.0/Software/Start_vnc_desktop&amp;diff=15207</id>
		<title>BwUniCluster2.0/Software/Start vnc desktop</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster2.0/Software/Start_vnc_desktop&amp;diff=15207"/>
		<updated>2025-08-14T11:14:15Z</updated>

		<summary type="html">&lt;p&gt;S Braun: S Braun moved page BwUniCluster2.0/Software/Start vnc desktop to BwUniCluster3.0/Software/Start vnc desktop&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;#REDIRECT [[BwUniCluster3.0/Software/Start vnc desktop]]&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Software/Start_vnc_desktop&amp;diff=15206</id>
		<title>BwUniCluster3.0/Software/Start vnc desktop</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Software/Start_vnc_desktop&amp;diff=15206"/>
		<updated>2025-08-14T11:14:15Z</updated>

		<summary type="html">&lt;p&gt;S Braun: S Braun moved page BwUniCluster2.0/Software/Start vnc desktop to BwUniCluster3.0/Software/Start vnc desktop&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;The Linux 3D graphics stack is based on &#039;&#039;X11&#039;&#039; and &#039;&#039;OpenGL&#039;&#039;. This has some&lt;br /&gt;
drawbacks in conjunction with remote visualization:&lt;br /&gt;
&lt;br /&gt;
* Rendering takes place on the client, not the cluster&lt;br /&gt;
* Whole 3D model must be transferred via network to the client&lt;br /&gt;
* Some OpenGL extensions are not supported when using indirect / client side rendering instead of direct / hardware based rendering&lt;br /&gt;
* Many round trips in the X11 protocol negatively influence interactivity&lt;br /&gt;
* X11 is not available on non-Linux platforms&lt;br /&gt;
* Compatibility problems between client and cluster can occur&lt;br /&gt;
&lt;br /&gt;
To avoid these drawbacks,  &amp;lt;code&amp;gt;start_vnc_desktop&amp;lt;/code&amp;gt; is provided.&lt;br /&gt;
It combines the three open source  products [http://www.turbovnc.org/ TurboVNC], [http://www.virtualgl.org/ VirtualGL] and [http://openswr.org/ OpenSWR].&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;Virtual Network Computing (VNC)&#039;&#039; is a graphical desktop sharing system.&lt;br /&gt;
VNC is platform-independent - there are clients and servers for many&lt;br /&gt;
GUI-based operating systems. The VNC server is the program on the&lt;br /&gt;
machine that shares its screen. The VNC client (or viewer) is the&lt;br /&gt;
program that watches, controls, and interacts with the server. For more&lt;br /&gt;
details see: [https://en.wikipedia.org/wiki/VNC Wikipedia]&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;VirtualGL&#039;&#039; redirects the 3D rendering commands from Linux OpenGL&lt;br /&gt;
applications to 3D accelerator hardware in the cluster. For more details&lt;br /&gt;
see: [https://en.wikipedia.org/wiki/VirtualGL Wikipedia]&lt;br /&gt;
&lt;br /&gt;
When no 3D accelerator hardware is available &#039;&#039;OpenSWR&#039;&#039;, a high&lt;br /&gt;
performance, highly scalable software rasterizer for OpenGL can carry&lt;br /&gt;
out the rendering task. For more details see:  [http://openswr.org OpenSWR]&lt;br /&gt;
&lt;br /&gt;
This script takes a two-step approach to start a VNC server in the&lt;br /&gt;
cluster environment:&lt;br /&gt;
&lt;br /&gt;
In the first step the batch system is used to allocate resources where a&lt;br /&gt;
VNC server can be started.&lt;br /&gt;
&lt;br /&gt;
In the second step the VNC server is launched on the resources granted&lt;br /&gt;
by the batch system. When the VNC server is successfully started, all&lt;br /&gt;
required login credentials and connection parameters will be reported.&lt;br /&gt;
To connect to this VNC server a VNC client installation on the local&lt;br /&gt;
desktop is required. &lt;br /&gt;
&lt;br /&gt;
= Script usage =&lt;br /&gt;
&lt;br /&gt;
* After login the script can simply be called from the command line:&amp;lt;pre&amp;gt;start_vnc_desktop&amp;lt;/pre&amp;gt;&lt;br /&gt;
* To get help on the available options use:&amp;lt;pre&amp;gt;start_vnc_desktop --help&amp;lt;/pre&amp;gt;&lt;br /&gt;
* Hardware rendering is currently only available on FH2 and bwUniCluster, it can be requested with:&amp;lt;pre&amp;gt;start_vnc_desktop --hw-rendering&amp;lt;/pre&amp;gt;&lt;br /&gt;
* Software rendering is available on all clusters, it can be requested with: &amp;lt;pre&amp;gt;start_vnc_desktop --sw-rendering&amp;lt;/pre&amp;gt;&lt;br /&gt;
* There is only a limited number of nodes with hardware rendering support; software rendering runs on all nodes.&lt;br /&gt;
* For large 3D data sets the software renderer may be faster.&lt;br /&gt;
* If neither &amp;lt;code&amp;gt;--hw-rendering&amp;lt;/code&amp;gt; nor &amp;lt;code&amp;gt;--sw-rendering&amp;lt;/code&amp;gt; is selected no 3D rendering support is available.&lt;br /&gt;
&lt;br /&gt;
= VNC client =&lt;br /&gt;
&lt;br /&gt;
In general every VNC client can be used to connect to the VNC server.&lt;br /&gt;
However for best performance and compatibility the use of the&lt;br /&gt;
[http://www.turbovnc.org/ TurboVNC] client is recommended.&lt;br /&gt;
Below you find the necessary steps for different client operating systems.&lt;br /&gt;
&lt;br /&gt;
; Debian, Ubuntu:&lt;br /&gt;
* Download: [https://sourceforge.net/projects/turbovnc/files Download Site] -&amp;gt; latest version -&amp;gt; turbovnc_&amp;lt;VERSION&amp;gt;_amd64.deb&lt;br /&gt;
* Install: &amp;lt;pre&amp;gt; sudo apt-get install ./turbovnc_&amp;lt;VERSION&amp;gt;_amd64.deb&amp;lt;/pre&amp;gt;&lt;br /&gt;
* Execute: &amp;lt;pre&amp;gt;/opt/TurboVNC/bin/vncviewer&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
; Red Hat Enterprise Linux, Fedora:&lt;br /&gt;
* Download: [https://sourceforge.net/projects/turbovnc/files Download Site]  -&amp;gt; latest version -&amp;gt; turbovnc-&amp;lt;VERSION&amp;gt;.x86_64.rpm&lt;br /&gt;
* Install: &amp;lt;pre&amp;gt;sudo yum install ./turbovnc-&amp;lt;VERSION&amp;gt;.x86_64.rpm&amp;lt;/pre&amp;gt;&lt;br /&gt;
* Execute: &amp;lt;pre&amp;gt;/opt/TurboVNC/bin/vncviewer&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
; SUSE Linux Enterprise, openSUSE:&lt;br /&gt;
* Download [https://sourceforge.net/projects/turbovnc/files Download Site]  -&amp;gt; latest version -&amp;gt; turbovnc-&amp;lt;VERSION&amp;gt;.x86_64.rpm&lt;br /&gt;
* Install: &amp;lt;pre&amp;gt;sudo zypper install ./turbovnc-&amp;lt;VERSION&amp;gt;.x86_64.rpm&amp;lt;/pre&amp;gt;&lt;br /&gt;
* Execute: &amp;lt;pre&amp;gt;/opt/TurboVNC/bin/vncviewer&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
; ArchLinux:&lt;br /&gt;
* Download: Can be installed from the AUR&lt;br /&gt;
* Install: &amp;lt;pre&amp;gt;pacaur -S turbovnc&amp;lt;/pre&amp;gt;&lt;br /&gt;
* Execute: &amp;lt;pre&amp;gt;vncviewer&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
; Windows:&lt;br /&gt;
* Download: [https://sourceforge.net/projects/turbovnc/files Download Site] -&amp;gt; latest version -&amp;gt; TurboVNC64-&amp;lt;VERSION&amp;gt;.exe for 64-bit, TurboVNC-&amp;lt;VERSION&amp;gt;.exe for 32-bit&lt;br /&gt;
* Install: Double click on TurboVNC64-&amp;lt;VERSION&amp;gt;.exe / TurboVNC-&amp;lt;VERSION&amp;gt;.exe. Install in the default directory (or choose a different one, if preferred)&lt;br /&gt;
* Execute:  Java TurboVNCviewer (vncviewer-javaw.bat in installation directory)&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster2.0/Software/Python_Dask&amp;diff=15205</id>
		<title>BwUniCluster2.0/Software/Python Dask</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster2.0/Software/Python_Dask&amp;diff=15205"/>
		<updated>2025-08-14T11:13:43Z</updated>

		<summary type="html">&lt;p&gt;S Braun: S Braun moved page BwUniCluster2.0/Software/Python Dask to BwUniCluster3.0/Software/Python Dask&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;#REDIRECT [[BwUniCluster3.0/Software/Python Dask]]&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Software/Python_Dask&amp;diff=15204</id>
		<title>BwUniCluster3.0/Software/Python Dask</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Software/Python_Dask&amp;diff=15204"/>
		<updated>2025-08-14T11:13:43Z</updated>

		<summary type="html">&lt;p&gt;S Braun: S Braun moved page BwUniCluster2.0/Software/Python Dask to BwUniCluster3.0/Software/Python Dask&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!--{| style=&amp;quot;border-style: solid; border-width: 1px&amp;quot;&lt;br /&gt;
! Navigation: [[BwHPC_Best_Practices_Repository|bwHPC BPR]] / [[BwUniCluster_User_Guide|bwUniCluster]] &lt;br /&gt;
|}--&amp;gt;&lt;br /&gt;
This guide explains how to use Python Dask and dask-jobqueue on bwUniCluster2.0.&lt;br /&gt;
&lt;br /&gt;
== Installation and Usage ==&lt;br /&gt;
Please have a look at our [https://github.com/hpcraink/workshop-parallel-jupyter Workshop] on how to use Dask on bwUniCluster2.0 (2_Grundlagen: Environment erstellen and 6_Dask). This is currently only available in German.&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster2.0/Software/OpenFoam&amp;diff=15203</id>
		<title>BwUniCluster2.0/Software/OpenFoam</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster2.0/Software/OpenFoam&amp;diff=15203"/>
		<updated>2025-08-14T11:13:06Z</updated>

		<summary type="html">&lt;p&gt;S Braun: S Braun moved page BwUniCluster2.0/Software/OpenFoam to BwUniCluster3.0/Software/OpenFoam&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;#REDIRECT [[BwUniCluster3.0/Software/OpenFoam]]&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Software/OpenFoam&amp;diff=15202</id>
		<title>BwUniCluster3.0/Software/OpenFoam</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Software/OpenFoam&amp;diff=15202"/>
		<updated>2025-08-14T11:13:06Z</updated>

		<summary type="html">&lt;p&gt;S Braun: S Braun moved page BwUniCluster2.0/Software/OpenFoam to BwUniCluster3.0/Software/OpenFoam&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Softwarepage|cae/openfoam}}&lt;br /&gt;
&lt;br /&gt;
{| width=600px class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Description !! Content&lt;br /&gt;
|-&lt;br /&gt;
| module load&lt;br /&gt;
| cae/openfoam&lt;br /&gt;
|-&lt;br /&gt;
| License&lt;br /&gt;
| [https://www.openfoam.org/licence.php GNU General Public Licence]&lt;br /&gt;
|-&lt;br /&gt;
| Citing&lt;br /&gt;
| n/a&lt;br /&gt;
|-&lt;br /&gt;
| Links&lt;br /&gt;
| [https://www.openfoam.org/ Homepage] &amp;amp;#124; [https://www.openfoam.org/docs/ Documentation]&lt;br /&gt;
|-&lt;br /&gt;
| Graphical Interface&lt;br /&gt;
| No&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
= Description =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;OpenFOAM&#039;&#039;&#039; (Open-source Field Operation And Manipulation) is a free, open-source CFD software package with an extensive range of features to solve anything from complex fluid flows involving chemical reactions, turbulence and heat transfer, to solid dynamics and electromagnetics.&lt;br /&gt;
&lt;br /&gt;
= Adding OpenFOAM to Your Environment =&lt;br /&gt;
&lt;br /&gt;
After loading the desired module, type the following to activate the OpenFOAM applications:&lt;br /&gt;
&amp;lt;pre&amp;gt;$ source $FOAM_INIT&amp;lt;/pre&amp;gt;&lt;br /&gt;
or simply&lt;br /&gt;
&amp;lt;pre&amp;gt;$ foamInit&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Parallel run with OpenFOAM  =&lt;br /&gt;
For better performance when running OpenFOAM jobs in parallel on bwUniCluster, it is recommended to keep the decomposed data in local folders on each node. &lt;br /&gt;
&lt;br /&gt;
Therefore you may use the *HPC scripts, which will copy your data to the node-specific folders after running decomposePar, and copy it back to the local case folder before running reconstructPar.&lt;br /&gt;
&lt;br /&gt;
Don&#039;t forget to allocate enough wall-time for decomposition and reconstruction of your cases, as the data will be processed directly on the nodes and may be lost if the job is cancelled before the data is copied back into the case folder.&lt;br /&gt;
&lt;br /&gt;
The following commands will do that for you: &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;$ decomposeParHPC&lt;br /&gt;
$ reconstructParHPC&lt;br /&gt;
$ reconstructParMeshHPC&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
instead of:&lt;br /&gt;
&amp;lt;pre&amp;gt;$ decomposePar&lt;br /&gt;
$ reconstructPar&lt;br /&gt;
$ reconstructParMesh&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For example, if you want to run&amp;lt;span style=&amp;quot;background:#edeae2;margin:10px;padding:1px;border:1px dotted #808080&amp;quot;&amp;gt;snappyHexMesh&amp;lt;/span&amp;gt;in parallel, you may use the following commands:&lt;br /&gt;
&amp;lt;pre&amp;gt;$ decomposeParMeshHPC&lt;br /&gt;
$ mpirun --bind-to core --map-by core -report-bindings snappyHexMesh -overwrite -parallel&lt;br /&gt;
$ reconstructParMeshHPC -constant&amp;lt;/pre&amp;gt;&lt;br /&gt;
instead of:&lt;br /&gt;
&amp;lt;pre&amp;gt;$ decomposePar&lt;br /&gt;
$ mpirun --bind-to core --map-by core -report-bindings snappyHexMesh -overwrite -parallel&lt;br /&gt;
$ reconstructParMesh -constant&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For running jobs on multiple nodes, OpenFOAM needs passwordless communication between the nodes, to copy data into the local folders.&lt;br /&gt;
&lt;br /&gt;
A small trick using ssh-keygen once will let your nodes communicate freely over rsh. &lt;br /&gt;
&lt;br /&gt;
Do it once (if you didn&#039;t do it already in the past):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ssh-keygen&lt;br /&gt;
$ cat $HOME/.ssh/id_rsa.pub &amp;gt;&amp;gt; $HOME/.ssh/authorized_keys&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Building an OpenFOAM batch file for parallel processing =&lt;br /&gt;
== General information == &lt;br /&gt;
Before running OpenFOAM jobs in parallel, it is necessary to decompose the geometry domain into segments, equal to the number of processors (or threads) you intend to use. &lt;br /&gt;
&lt;br /&gt;
That means, for example, if you want to run a case on 8 processors, you will first have to decompose the mesh into 8 segments. Then, you start the solver in &#039;&#039;parallel&#039;&#039;, letting &#039;&#039;OpenFOAM&#039;&#039; run calculations concurrently on these segments, one processor being responsible for one segment of the mesh and sharing the data with all other processors in between. &lt;br /&gt;
&lt;br /&gt;
There is, of course, a mechanism that properly connects the calculations, so you don&#039;t lose your data or generate wrong results. &lt;br /&gt;
&lt;br /&gt;
The decomposition and segment-building process is handled by the&amp;lt;span style=&amp;quot;background:#edeae2;margin:10px;padding:1px;border:1px dotted #808080&amp;quot;&amp;gt;decomposePar&amp;lt;/span&amp;gt;utility. &lt;br /&gt;
&lt;br /&gt;
The number of subdomains, in which the geometry will be decomposed, is specified in &amp;quot;&#039;&#039;system/decomposeParDict&#039;&#039;&amp;quot;, as well as the decomposition method to use. &lt;br /&gt;
&lt;br /&gt;
The automatic decomposition method is &amp;quot;&#039;&#039;scotch&#039;&#039;&amp;quot;. It trims the mesh, collecting as many cells as possible per processor, trying to avoid empty segments or segments with too few cells. If you want your mesh to be divided in a different way, for example by specifying the number of segments it should be cut into in x, y or z direction, you can use the &amp;quot;simple&amp;quot; or &amp;quot;hierarchical&amp;quot; methods. &lt;br /&gt;
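&lt;br /&gt;
For illustration only, a minimal &amp;quot;&#039;&#039;system/decomposeParDict&#039;&#039;&amp;quot; for 8 subdomains with the &amp;quot;scotch&amp;quot; method, written here directly from the shell, could look like the following sketch (adjust &amp;quot;numberOfSubdomains&amp;quot; to the number of MPI tasks of your job):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ cat &amp;gt; system/decomposeParDict &amp;lt;&amp;lt; &#039;EOF&#039;&lt;br /&gt;
FoamFile&lt;br /&gt;
{&lt;br /&gt;
    version     2.0;&lt;br /&gt;
    format      ascii;&lt;br /&gt;
    class       dictionary;&lt;br /&gt;
    object      decomposeParDict;&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
numberOfSubdomains 8;      // must match the number of processes used for mpirun&lt;br /&gt;
method             scotch; // automatic decomposition&lt;br /&gt;
EOF&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;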
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Wrapper script generation == &lt;br /&gt;
&#039;&#039;&#039;Attention:&#039;&#039;&#039; The &amp;lt;span style=&amp;quot;background:#edeae2;margin:10px;padding:1px;border:1px dotted #808080&amp;quot;&amp;gt;openfoam&amp;lt;/span&amp;gt; module automatically loads the necessary &amp;lt;span style=&amp;quot;background:#edeae2;margin:10px;padding:1px;border:1px dotted #808080&amp;quot;&amp;gt;openmpi&amp;lt;/span&amp;gt; module for parallel runs; do &#039;&#039;&#039;NOT&#039;&#039;&#039; load another version of MPI, as it may conflict with the loaded &amp;lt;span style=&amp;quot;background:#edeae2;margin:10px;padding:1px;border:1px dotted #808080&amp;quot;&amp;gt;openfoam&amp;lt;/span&amp;gt; version. &lt;br /&gt;
&lt;br /&gt;
A job script to submit a batch job called &#039;&#039;job_openfoam.sh&#039;&#039; that runs the &#039;&#039;icoFoam&#039;&#039; solver with OpenFOAM version 8, on 80 processors, on the &#039;&#039;multiple&#039;&#039; partition with a total wall clock time of 4 hours (matching the &#039;&#039;--time&#039;&#039; setting below) looks like: &lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--b)--&amp;gt; &lt;br /&gt;
{| style=&amp;quot;width: 100%; border:1px solid #d0cfcc; background:#f2f7ff;border-spacing: 5px;&amp;quot;&lt;br /&gt;
| style=&amp;quot;width:280px; white-space:nowrap; color:#000;&amp;quot; |&lt;br /&gt;
&amp;lt;source lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# Allocate nodes&lt;br /&gt;
#SBATCH --nodes=2&lt;br /&gt;
# Number of tasks per node&lt;br /&gt;
#SBATCH --ntasks-per-node=40&lt;br /&gt;
# Queue class https://wiki.bwhpc.de/e/BwUniCluster_2.0_Batch_Queues&lt;br /&gt;
#SBATCH --partition=multiple&lt;br /&gt;
# Maximum job run time&lt;br /&gt;
#SBATCH --time=4:00:00&lt;br /&gt;
# Give the job a reasonable name&lt;br /&gt;
#SBATCH --job-name=openfoam&lt;br /&gt;
# File name for standard output (%j will be replaced by job id)&lt;br /&gt;
#SBATCH --output=logs-%j.out&lt;br /&gt;
# File name for error output&lt;br /&gt;
#SBATCH --error=logs-%j.err&lt;br /&gt;
&lt;br /&gt;
# User defined variables&lt;br /&gt;
FOAM_VERSION=&amp;quot;8&amp;quot;&lt;br /&gt;
EXECUTABLE=&amp;quot;icoFoam&amp;quot;&lt;br /&gt;
MPIRUN_OPTIONS=&amp;quot;--bind-to core --map-by core --report-bindings&amp;quot;&lt;br /&gt;
&lt;br /&gt;
module load cae/openfoam/${FOAM_VERSION}&lt;br /&gt;
foamInit&lt;br /&gt;
&lt;br /&gt;
# remove decomposePar if you already decomposed your case beforehand &lt;br /&gt;
decomposeParHPC &amp;amp;&amp;amp;&lt;br /&gt;
&lt;br /&gt;
# starting the solver in parallel. Name of the solver is given in the &amp;quot;EXECUTABLE&amp;quot; variable&lt;br /&gt;
mpirun ${MPIRUN_OPTIONS} ${EXECUTABLE} -parallel &amp;amp;&amp;amp;&lt;br /&gt;
&lt;br /&gt;
reconstructParHPC&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;Attention:&#039;&#039;&#039; The script above will run a parallel OpenFOAM job with pre-installed OpenMPI. If you are using an OpenFOAM version which comes with pre-installed Intel MPI (like, for example,&amp;lt;span style=&amp;quot;background:#edeae2;margin:10px;padding:1px;border:1px dotted #808080&amp;quot;&amp;gt;cae/openfoam/v1712-impi&amp;lt;/span&amp;gt;) you will have to modify the batch script to use all the advantages of Intel MPI for parallel calculations. For details see: &lt;br /&gt;
* [[Batch_Jobs_-_bwUniCluster_Features|Batch Jobs Features]]&lt;br /&gt;
&lt;br /&gt;
= Using I/O and reducing the amount of data and files =&lt;br /&gt;
In OpenFOAM, you can control which variables or fields are written at specific times. For example, for post-processing purposes, you might need only a subset of variables. In order to control which files will be written, there is a function object called &amp;quot;writeObjects&amp;quot;. &lt;br /&gt;
&lt;br /&gt;
An example controlDict file may look like this: At the top of the file (entry &amp;quot;writeControl&amp;quot;) you specify that ALL fields (variables) required for restarting are saved every 12 wall-clock hours. Then, additionally, at the bottom of the controlDict in the &amp;quot;functions&amp;quot; block, you can add a function object of type &amp;quot;writeObjects&amp;quot;. With this function object, you can control the output of specific fields independent of the entry at the top of the file: &lt;br /&gt;
&amp;lt;!--b)--&amp;gt; &lt;br /&gt;
{| style=&amp;quot;width: 100%; border:1px solid #d0cfcc; background:#f2f7ff;border-spacing: 5px;&amp;quot;&lt;br /&gt;
| style=&amp;quot;width:280px; white-space:nowrap; color:#000;&amp;quot; |&lt;br /&gt;
&amp;lt;source lang=&amp;quot;text&amp;quot;&amp;gt;&lt;br /&gt;
/*--------------------------------*- C++ -*----------------------------------*\&lt;br /&gt;
| =========                 |                                                 |&lt;br /&gt;
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |&lt;br /&gt;
|  \\    /   O peration     | Version:  4.1.x                                 |&lt;br /&gt;
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |&lt;br /&gt;
|    \\/     M anipulation  |                                                 |&lt;br /&gt;
\*---------------------------------------------------------------------------*/&lt;br /&gt;
FoamFile&lt;br /&gt;
{&lt;br /&gt;
    version     2.0;&lt;br /&gt;
    format      ascii;&lt;br /&gt;
    class       dictionary;&lt;br /&gt;
    location    &amp;quot;system&amp;quot;;&lt;br /&gt;
    object      controlDict;&lt;br /&gt;
}&lt;br /&gt;
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //&lt;br /&gt;
&lt;br /&gt;
startFrom       latestTime;&lt;br /&gt;
startTime       0;&lt;br /&gt;
stopAt  	endTime;&lt;br /&gt;
endTime         1e2;&lt;br /&gt;
deltaT          1e-5;&lt;br /&gt;
&lt;br /&gt;
writeControl    clockTime;&lt;br /&gt;
writeInterval   43200; // write ALL fields necessary to restart your simulation &lt;br /&gt;
                       // every 43200 wall-clock seconds = 12 hours of real time&lt;br /&gt;
&lt;br /&gt;
purgeWrite      0;&lt;br /&gt;
writeFormat     binary;&lt;br /&gt;
writePrecision  10;&lt;br /&gt;
writeCompression off;&lt;br /&gt;
timeFormat      general;&lt;br /&gt;
timePrecision   10;&lt;br /&gt;
runTimeModifiable false;&lt;br /&gt;
&lt;br /&gt;
functions&lt;br /&gt;
{&lt;br /&gt;
    writeFields // name of the function object&lt;br /&gt;
    {&lt;br /&gt;
        type writeObjects;&lt;br /&gt;
        libs ( &amp;quot;libutilityFunctionObjects.so&amp;quot; );&lt;br /&gt;
&lt;br /&gt;
        objects&lt;br /&gt;
        (&lt;br /&gt;
	    T U rho // list of fields/variables to be written&lt;br /&gt;
        );&lt;br /&gt;
&lt;br /&gt;
        // E.g. write every 1e-5 seconds of simulation time only the specified fields&lt;br /&gt;
        writeControl runTime;&lt;br /&gt;
        writeInterval 1e-5; // write every 1e-5 seconds&lt;br /&gt;
    }&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can also define multiple function objects in order to write different subsets of fields at different times. Wildcards can be used in the list of fields – for example, in order to write out all fields starting with &amp;quot;RR_&amp;quot; you can add&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;quot;RR_.*&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
to the list of objects. You can get a list of valid field names by writing an invalid name such as &amp;quot;banana&amp;quot; in the field list; during the run of the solver all valid field names are then printed.&lt;br /&gt;
The output time can be changed too. Instead of writing at specific times in the simulation, you can also write after a certain number of time steps or depending on the wall clock time:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;// write every 100th simulation time step&lt;br /&gt;
writeControl timeStep;&lt;br /&gt;
writeInterval 100;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;// every 3600 seconds of real wall clock time&lt;br /&gt;
writeControl clockTime;&lt;br /&gt;
writeInterval 3600; &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you use OpenFOAM before version 4.0 or 1606, the type of function object is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
type writeRegisteredObject; // (instead of type writeObjects) &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
If you use OpenFOAM before version 3.0, you have to load the library with&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
functionObjectLibs (&amp;quot;libIOFunctionObjects.so&amp;quot;); // (instead of libs ( &amp;quot;libutilityFunctionObjects.so&amp;quot; )) &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
and exchange the entry &amp;quot;writeControl&amp;quot; with &amp;quot;outputControl&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
= OpenFOAM and ParaView on bwUniCluster=&lt;br /&gt;
ParaView is not directly linked to the OpenFOAM installation on the cluster. Therefore, to visualize OpenFOAM cases with ParaView, they have to be opened manually within the corresponding ParaView module. &lt;br /&gt;
&lt;br /&gt;
1. Load the ParaView module. For example: &lt;br /&gt;
&amp;lt;pre&amp;gt;$ module load cae/paraview/5.9&amp;lt;/pre&amp;gt;&lt;br /&gt;
2. Create a dummy &#039;*.openfoam&#039; file in the OpenFOAM case folder:&lt;br /&gt;
&amp;lt;pre&amp;gt;$ cd &amp;lt;case_folder_path&amp;gt;&lt;br /&gt;
$ touch &amp;lt;case_name&amp;gt;.openfoam&amp;lt;/pre&amp;gt;&lt;br /&gt;
&#039;&#039;&#039;NOTICE:&#039;&#039;&#039; the name of the dummy file should be the same as the name of the OpenFOAM case folder, with &#039;.openfoam&#039; extension.&lt;br /&gt;
&lt;br /&gt;
3. Open ParaView:&lt;br /&gt;
To run ParaView on the bwUniCluster, the VNC system is required.&lt;br /&gt;
On the cluster run: &lt;br /&gt;
&amp;lt;pre&amp;gt;$ start_vnc_desktop --hw-rendering &amp;lt;/pre&amp;gt;&lt;br /&gt;
Start your VNC client on your desktop PC.&lt;br /&gt;
&#039;&#039;&#039;NOTICE&#039;&#039;&#039; Information for remote visualization on KIT HPC system is available on: https://wiki.bwhpc.de/e/BwUniCluster2.0/Software/Start_vnc_desktop&lt;br /&gt;
&lt;br /&gt;
4. In Paraview go to &#039;File&#039; -&amp;gt; &#039;Open&#039;, or press Ctrl+O. Choose to show &#039;All files (*)&#039;, and open your &amp;lt;case_name&amp;gt;.openfoam file. In the pop-up window select OpenFOAM, and press &#039;Ok&#039;.&lt;br /&gt;
&lt;br /&gt;
5. That&#039;s it! Enjoy ParaView and OpenFOAM.&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster2.0/Software/Matlab&amp;diff=15201</id>
		<title>BwUniCluster2.0/Software/Matlab</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster2.0/Software/Matlab&amp;diff=15201"/>
		<updated>2025-08-14T11:11:33Z</updated>

		<summary type="html">&lt;p&gt;S Braun: S Braun moved page BwUniCluster2.0/Software/Matlab to BwUniCluster3.0/Software/Matlab&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;#REDIRECT [[BwUniCluster3.0/Software/Matlab]]&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Software/Matlab&amp;diff=15200</id>
		<title>BwUniCluster3.0/Software/Matlab</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Software/Matlab&amp;diff=15200"/>
		<updated>2025-08-14T11:11:33Z</updated>

		<summary type="html">&lt;p&gt;S Braun: S Braun moved page BwUniCluster2.0/Software/Matlab to BwUniCluster3.0/Software/Matlab&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Softwarepage|math/matlab}}&lt;br /&gt;
&lt;br /&gt;
{| width=600px class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Description !! Content&lt;br /&gt;
|-&lt;br /&gt;
| module load&lt;br /&gt;
| math/matlab&lt;br /&gt;
|-&lt;br /&gt;
| License&lt;br /&gt;
| [https://de.mathworks.com/pricing-licensing/index.html?intendeduse=edu&amp;amp;prodcode=ML Academic License/Commercial]&lt;br /&gt;
|-&lt;br /&gt;
| Citing&lt;br /&gt;
| n/a&lt;br /&gt;
|-&lt;br /&gt;
| Links&lt;br /&gt;
| [https://de.mathworks.com/products/matlab/ MATLAB Homepage] &amp;amp;#124; [https://de.mathworks.com/index.html?s_tid=gn_logo MathWorks Homepage] &amp;amp;#124; [https://de.mathworks.com/support/?s_tid=gn_supp Support and more]&lt;br /&gt;
|-&lt;br /&gt;
| Graphical Interface&lt;br /&gt;
| No&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
= Description =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;MATLAB&#039;&#039;&#039; (MATrix LABoratory) is a high-level programming language and interactive computing environment for numerical calculation and data visualization.&lt;br /&gt;
&lt;br /&gt;
= Loading MATLAB =&lt;br /&gt;
&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
It is not advisable to invoke an interactive MATLAB session on a login node of the cluster. Such sessions will be terminated automatically.&lt;br /&gt;
The recommended way to run a long-duration interactive MATLAB session is to submit an interactive job and start MATLAB from within the dedicated compute node assigned to you by the queueing system (consult the specific cluster users guide on how to submit interactive jobs).&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
An interactive MATLAB session with graphical user interface (GUI) can be started with the command (requires X11 forwarding enabled for your ssh login):&lt;br /&gt;
&amp;lt;pre&amp;gt;$ matlab&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Since graphics rendering can be very slow on remote connections, the preferable way is to run the MATLAB command line interface without GUI:&lt;br /&gt;
&amp;lt;pre&amp;gt;$ matlab -nodisplay&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The following command will execute a MATLAB script or function named &amp;quot;example&amp;quot; &#039;&#039;&#039;on a single thread&#039;&#039;&#039;:&lt;br /&gt;
&amp;lt;pre&amp;gt;$ matlab -nodisplay -singleCompThread -r example &amp;gt; result.out 2&amp;gt;&amp;amp;1&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The output of this session will be redirected to the file result.out. The option &amp;lt;span style=&amp;quot;background:#edeae2;margin:2px;padding:1px;border:1px dotted #808080&amp;quot;&amp;gt;-r&amp;lt;/span&amp;gt; executes the MATLAB statement non-interactively. The option &amp;lt;span style=&amp;quot;background:#edeae2;margin:2px;padding:1px;border:1px dotted #808080&amp;quot;&amp;gt;-singleCompThread&amp;lt;/span&amp;gt; limits MATLAB to a single computational thread. Most of the time, running MATLAB in single-threaded mode will meet your needs. But if you have mathematically intense computations that benefit from the built-in multithreading provided by MATLAB&#039;s BLAS and FFT implementation, then you can experiment with running in multi-threaded mode by omitting this option (see section 4.1 - Implicit Threading).&lt;br /&gt;
&lt;br /&gt;
As with all processes that require more than a few minutes to run, non-trivial MATLAB jobs must be submitted to the cluster queuing system. Example batch scripts are available in the directory pointed to by the environment variable &amp;lt;span style=&amp;quot;background:#edeae2;margin:2px;padding:1px;border:1px dotted #808080&amp;quot;&amp;gt;$MATLAB_EXA_DIR&amp;lt;/span&amp;gt;.&lt;br /&gt;
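&lt;br /&gt;
A rough sketch of such a batch script (partition name, wall time and the script name &amp;quot;example&amp;quot; are placeholders only; prefer the tested example scripts in $MATLAB_EXA_DIR and consult the queue documentation of your cluster):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --ntasks=1                # single-threaded MATLAB run&lt;br /&gt;
#SBATCH --time=01:00:00           # placeholder wall time&lt;br /&gt;
#SBATCH --partition=cpu           # placeholder queue&lt;br /&gt;
#SBATCH --job-name=matlab_example&lt;br /&gt;
&lt;br /&gt;
module load math/matlab           # load the MATLAB module&lt;br /&gt;
&lt;br /&gt;
# run the MATLAB script example.m non-interactively on a single thread&lt;br /&gt;
matlab -nodisplay -singleCompThread -r example &amp;gt; result.out 2&amp;gt;&amp;amp;1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;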
&lt;br /&gt;
= Parallel Computing Using MATLAB =&lt;br /&gt;
&lt;br /&gt;
Parallelization of MATLAB jobs is realized via the built-in multithreading provided by MATLAB&#039;s BLAS and FFT implementation and the parallel computing functionality of MATLAB&#039;s Parallel Computing Toolbox (PCT). The MATLAB Parallel/Distributed Computing Server is not available on the bwHPC-Clusters.&lt;br /&gt;
&lt;br /&gt;
== Implicit Threading ==&lt;br /&gt;
&lt;br /&gt;
A large number of built-in MATLAB functions may utilize multiple cores automatically without any code modifications required. This is referred to as implicit multithreading and must be strictly distinguished from explicit parallelism provided by the Parallel Computing Toolbox (PCT) which requires specific commands in your code in order to create threads.&lt;br /&gt;
&lt;br /&gt;
Implicit threading particularly takes place for linear algebra operations (such as the solution to a linear system A\b or matrix products A*B) and FFT operations. Many other high-level MATLAB functions do also benefit from multithreading capabilities of their underlying routines. However, the user can still enforce single-threaded mode by adding the command line option &amp;lt;span style=&amp;quot;background:#edeae2;margin:2px;padding:1px;border:1px dotted #808080&amp;quot;&amp;gt;-singleCompThread&amp;lt;/span&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Whenever implicit threading takes place, MATLAB will detect the total number of cores that exist on a machine and by default makes use of all of them. This has very important implications for MATLAB jobs in HPC environments with shared-node job scheduling policy (i.e. with multiple users sharing one compute node). Due to this behaviour, a MATLAB job may take over more compute resources than assigned by the queueing system of the cluster (and thereby taking away these resources from all other users with running jobs on the same node - including your own jobs).&lt;br /&gt;
&lt;br /&gt;
Therefore, when running in multi-threaded mode, MATLAB always requires the user&#039;s intervention to not allocate all cores of the machine (unless requested so from the queueing system). The number of threads must be controlled from within the code by means of the &amp;lt;span style=&amp;quot;background:#edeae2;margin:2px;padding:1px;border:1px dotted #808080&amp;quot;&amp;gt;maxNumCompThreads(N)&amp;lt;/span&amp;gt; function (which is supposed to be deprecated) or, alternatively, with the &amp;lt;span style=&amp;quot;background:#edeae2;margin:2px;padding:1px;border:1px dotted #808080&amp;quot;&amp;gt;feature(&#039;numThreads&#039;, N)&amp;lt;/span&amp;gt; function (which is currently undocumented).&lt;br /&gt;
&lt;br /&gt;
== Using the Parallel Computing Toolbox (PCT) ==&lt;br /&gt;
&lt;br /&gt;
By using the PCT one can make explicit use of several cores on multicore processors to parallelize MATLAB applications without MPI programming. Under MATLAB version 8.4 and earlier, this toolbox provides 12 workers (MATLAB computational engines) to execute applications locally on a single multicore node. Under MATLAB version 8.5 and later, the number of workers available is equal to the number of cores on a single node (up to a maximum of 512).&lt;br /&gt;
&lt;br /&gt;
If multiple PCT jobs are running at the same time, they all write temporary MATLAB job information to the same location. This race condition can cause one or more of the parallel MATLAB jobs to fail to use the parallel functionality of the toolbox.&lt;br /&gt;
&lt;br /&gt;
To solve this issue, each MATLAB job should explicitly set a unique location where these files are created. This can be accomplished by the following snippet of code added to your MATLAB script.&lt;br /&gt;
&lt;br /&gt;
{{bwFrameA|&lt;br /&gt;
&amp;lt;source lang=&amp;quot;Matlab&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
% create a local cluster object&lt;br /&gt;
pc = parcluster(&#039;local&#039;)&lt;br /&gt;
&lt;br /&gt;
% get the number of dedicated cores from environment&lt;br /&gt;
pc.NumWorkers = str2num(getenv(&#039;SLURM_NPROCS&#039;))&lt;br /&gt;
&lt;br /&gt;
% explicitly set the JobStorageLocation to the tmp directory that is unique to each cluster job (and is on local, fast scratch)&lt;br /&gt;
parpool_tmpdir = [getenv(&#039;TMP&#039;),&#039;/.matlab/local_cluster_jobs/slurm_jobID_&#039;,getenv(&#039;SLURM_JOB_ID&#039;)]&lt;br /&gt;
mkdir(parpool_tmpdir)&lt;br /&gt;
pc.JobStorageLocation = parpool_tmpdir&lt;br /&gt;
&lt;br /&gt;
% start the parallel pool&lt;br /&gt;
parpool(pc)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
Note: The code snippet also sets the correct number of parallel workers in MATLAB according to the total number of processes dedicated to the job given by the environment variable &amp;lt;span style=&amp;quot;background:#edeae2;margin:2px;padding:1px;border:1px dotted #808080&amp;quot;&amp;gt;$SLURM_NPROCS&amp;lt;/span&amp;gt; in the job submission file.&lt;br /&gt;
&lt;br /&gt;
= General Performance Tips for MATLAB =&lt;br /&gt;
&lt;br /&gt;
MATLAB data structures (arrays or matrices) are dynamic in size, i.e. MATLAB will automatically resize the structure on demand. Although this seems to be convenient, MATLAB continually needs to allocate a new chunk of memory and copy over the data to the new block of memory as the array or matrix grows in a loop. This may take a significant amount of extra time during execution of the program.&lt;br /&gt;
&lt;br /&gt;
Code performance can often be drastically improved by preallocating memory for the final expected size of the array or matrix before actually starting the processing loop. In order to preallocate an array of strings, you can use MATLAB&#039;s built-in cell function. In order to preallocate an array or matrix of numbers, you can use MATLAB&#039;s built-in zeros function.&lt;br /&gt;
&lt;br /&gt;
The performance benefit of preallocation is illustrated with the following example code.&lt;br /&gt;
&lt;br /&gt;
{{bwFrameA|&lt;br /&gt;
&amp;lt;source lang=&amp;quot;Matlab&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
% prealloc.m&lt;br /&gt;
&lt;br /&gt;
clear all;&lt;br /&gt;
&lt;br /&gt;
num=10000000;&lt;br /&gt;
&lt;br /&gt;
disp(&#039;Without preallocation:&#039;)&lt;br /&gt;
tic&lt;br /&gt;
for i=1:num&lt;br /&gt;
    a(i)=i;&lt;br /&gt;
end&lt;br /&gt;
toc&lt;br /&gt;
&lt;br /&gt;
disp(&#039;With preallocation:&#039;)&lt;br /&gt;
tic&lt;br /&gt;
b=zeros(1,num);&lt;br /&gt;
for i=1:num&lt;br /&gt;
    b(i)=i;&lt;br /&gt;
end&lt;br /&gt;
toc&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/source&amp;gt;&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
On a compute node, the result may look like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Without preallocation:&lt;br /&gt;
Elapsed time is 2.879446 seconds.&lt;br /&gt;
With preallocation:&lt;br /&gt;
Elapsed time is 0.097557 seconds.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that the code runs almost 30 times faster with preallocation.&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Running_Jobs&amp;diff=15177</id>
		<title>BwUniCluster3.0/Running Jobs</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Running_Jobs&amp;diff=15177"/>
		<updated>2025-07-25T04:40:24Z</updated>

		<summary type="html">&lt;p&gt;S Braun: /* Short Queues */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
= Purpose and function of a queuing system =&lt;br /&gt;
&lt;br /&gt;
All compute activities on bwUniCluster 3.0 have to be performed on the compute nodes. Compute nodes are only available by requesting the corresponding resources via the queuing system. As soon as the requested resources are available, tasks are either executed automatically via a batch script or the allocated nodes can be used interactively.&amp;lt;br&amp;gt;&lt;br /&gt;
General procedure: see [[Running_Calculations | Running Calculations]]&lt;br /&gt;
&lt;br /&gt;
== Job submission process ==&lt;br /&gt;
&lt;br /&gt;
bwUniCluster 3.0 uses the workload management software Slurm. Therefore any job submission by the user has to be done via commands of the Slurm software. Slurm queues and runs user jobs based on fair sharing policies.&lt;br /&gt;
&lt;br /&gt;
== Slurm ==&lt;br /&gt;
&lt;br /&gt;
HPC Workload Manager on bwUniCluster 3.0 is Slurm.&lt;br /&gt;
Slurm is a cluster management and job scheduling system. Slurm has three key functions. &lt;br /&gt;
* It allocates access to resources (compute cores on nodes) to users for some duration of time so they can perform work. &lt;br /&gt;
* It provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. &lt;br /&gt;
* It arbitrates contention for resources by managing a queue of pending work.&lt;br /&gt;
&lt;br /&gt;
Any kind of calculation on the compute nodes of bwUniCluster 3.0 requires the user to define calculations as a sequence of commands together with required run time, number of CPU cores and main memory and submit all, i.e., the &#039;&#039;&#039;batch job&#039;&#039;&#039;, to a resource and workload managing software.&lt;br /&gt;
&lt;br /&gt;
== Terms and definitions ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Partitions &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Slurm manages job queues for different &#039;&#039;&#039;partitions&#039;&#039;&#039;. Partitions are used to group similar node types (e.g. nodes with and without accelerators) and to enforce different access policies and resource limits.&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 there are different partitions:&lt;br /&gt;
&lt;br /&gt;
* CPU-only nodes&lt;br /&gt;
** 2-socket nodes, consisting of 2 Intel Ice Lake processors with 32 cores each or 2 AMD processors with 48 cores each&lt;br /&gt;
** 2-socket nodes with very high RAM capacity, consisting of 2 AMD processors with 48 cores each&lt;br /&gt;
* GPU-accelerated nodes&lt;br /&gt;
** 2-socket nodes with 4x NVIDIA A100 or 4x NVIDIA H100 GPUs&lt;br /&gt;
** 4-socket node with 4x AMD Instinct accelerator&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Queues &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Job &#039;&#039;&#039;queues&#039;&#039;&#039; are used to manage jobs that request access to shared but limited computing resources of a certain kind (partition).&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 there are different main types of queues:&lt;br /&gt;
* Regular queues&lt;br /&gt;
** cpu: Jobs that request CPU-only nodes.&lt;br /&gt;
** gpu: Jobs that request GPU-accelerated nodes.&lt;br /&gt;
* Development queues (dev)&lt;br /&gt;
** Short, usually interactive jobs that are used for developing, compiling and testing code and workflows. The intention behind development queues is to provide users with immediate access to compute resources without having to wait. They are the place for short but heavy test computations that would disturb other users if run on the login nodes.&lt;br /&gt;
&lt;br /&gt;
Requested compute resources such as (wall-)time, number of nodes and amount of memory are restricted and must fit into the boundaries imposed by the queues. The request for compute resources on the bwUniCluster 3.0 &amp;lt;font color=red&amp;gt;requires at least the specification of the &#039;&#039;&#039;queue&#039;&#039;&#039; and the &#039;&#039;&#039;time&#039;&#039;&#039;&amp;lt;/font&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
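For example, a minimal (hypothetical) submission that satisfies this requirement only specifies the queue and the wall time; the script name is a placeholder for your own batch script (see the sbatch section below):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ sbatch --partition=cpu --time=01:00:00 ./my_jobscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;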
&#039;&#039;&#039; Jobs &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Jobs can be run non-interactively as &#039;&#039;&#039;batch jobs&#039;&#039;&#039; or as &#039;&#039;&#039;interactive jobs&#039;&#039;&#039;.&amp;lt;br&amp;gt;&lt;br /&gt;
Submitting a batch job means that all steps of a compute project are defined in a Bash script. This Bash script is queued and executed as soon as the compute resources are available and allocated. Jobs are enqueued with the &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; command.&lt;br /&gt;
For interactive jobs, the resources are requested with the &amp;lt;code&amp;gt;salloc&amp;lt;/code&amp;gt; command. As soon as the computing resources are available and allocated, a command line prompt is returned on a compute node and the user can freely use the allocated resources.&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
&#039;&#039;&#039;Please remember:&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Heavy computations are not allowed on the login nodes&#039;&#039;&#039;.&amp;lt;br&amp;gt;Use a development or a regular job queue instead! Please refer to [[BwUniCluster3.0/Login#Allowed_Activities_on_Login_Nodes|Allowed Activities on Login Nodes]].&lt;br /&gt;
* &#039;&#039;&#039;Development queues&#039;&#039;&#039; are meant for &#039;&#039;&#039;development tasks&#039;&#039;&#039;.&amp;lt;br&amp;gt;Do not misuse these queues for regular, short-running jobs or chain jobs! Only one job may run at a time, and at most 3 jobs may be queued.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
= Queues on bwUniCluster 3.0 = &lt;br /&gt;
== Policy ==&lt;br /&gt;
&lt;br /&gt;
The computing time is provided in accordance with the &#039;&#039;&#039;fair share policy&#039;&#039;&#039;. The individual investment shares of the respective university and the resources already used by its members are taken into account. Furthermore, the following throttling policy is also active: the &#039;&#039;&#039;maximum number of physical cores&#039;&#039;&#039; in use at any given time by running jobs is &#039;&#039;&#039;1920 per user&#039;&#039;&#039; (aggregated over all running jobs). This number corresponds to 30 nodes on the Ice Lake partition or 20 nodes on the standard partition. The aim is to minimize waiting times and maximize the number of users who can access computing time at the same time.&lt;br /&gt;
&lt;br /&gt;
== Regular Queues ==&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node-Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;cpu_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Ice Lake&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=72:00:00, nodes=30, mem=249600mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;cpu&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Standard&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=72:00:00, nodes=20, mem=380000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;highmem&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;High Memory&lt;br /&gt;
| mem-per-cpu=12090mb&lt;br /&gt;
| mem=380001mb&lt;br /&gt;
| time=72:00:00, nodes=4, mem=2300000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_h100&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=193300mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=72:00:00, nodes=12, mem=760000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_mi300&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU node&amp;lt;br/&amp;gt;AMD GPU x4&lt;br /&gt;
| mem-per-gpu=128200mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=72:00:00, nodes=1, mem=510000mb, ntasks-per-node=40, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_a100_il&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;gpu_h100_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;Ice Lake&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=127500mb&amp;lt;br/&amp;gt;cpus-per-gpu=16&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=48:00:00, nodes=9, mem=510000mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|}&lt;br /&gt;
Table 1: Regular Queues&lt;br /&gt;
&lt;br /&gt;
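For illustration, a job in one of the GPU queues must request at least one GPU (minimal resource gres=gpu:1). A hypothetical interactive allocation on the gpu_h100 queue could look like this; the requested time and GPU count are only examples:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc --partition=gpu_h100 --time=01:00:00 --gres=gpu:1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;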
== Short Queues ==&lt;br /&gt;
&amp;lt;p style=&amp;quot;color:red; &amp;quot;&amp;gt;&amp;lt;b&amp;gt;Queues with a short maximum runtime of 30 minutes.&amp;lt;/b&amp;gt;&amp;lt;/p&amp;gt; &lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_a100_short&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;Ice Lake&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=94000mb&amp;lt;br/&amp;gt;cpus-per-gpu=12&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=12, mem=376000mb, ntasks-per-node=48, (threads-per-core=2)&lt;br /&gt;
|}&lt;br /&gt;
Table 2: Short Queues&lt;br /&gt;
&lt;br /&gt;
== Development Queues ==&lt;br /&gt;
Only for development, i.e. debugging or performance optimization ...&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_cpu_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Ice Lake&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=30, nodes=8, mem=249600mb, ntasks-per-node=64, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_cpu&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Standard&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=30, nodes=1, mem=380000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_gpu_h100&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=193300mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=1, mem=760000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_gpu_a100_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&amp;lt;br/&amp;gt;&lt;br /&gt;
| mem-per-gpu=127500mb&amp;lt;br/&amp;gt;cpus-per-gpu=16 &lt;br /&gt;
| gres=gpu:1&lt;br /&gt;
| time=30, nodes=1, mem=510000mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|}&lt;br /&gt;
Table 3: Development Queues&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The default resources of a queue class define the number of tasks and the memory if these are not explicitly given with the sbatch command. The resource options &#039;&#039;--time&#039;&#039;, &#039;&#039;--ntasks&#039;&#039;, &#039;&#039;--nodes&#039;&#039;, &#039;&#039;--mem&#039;&#039; and &#039;&#039;--mem-per-cpu&#039;&#039; are described [[BwUniCluster3.0/Running_Jobs/Slurm|here]].&lt;br /&gt;
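&lt;br /&gt;
As a sketch, a short interactive test allocation in one of the development queues (limited to 30 minutes, see Table 3) could be requested as follows; the task count is only an example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc --partition=dev_cpu --ntasks=4 --time=30&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;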
&lt;br /&gt;
== Check available resources: sinfo_t_idle ==&lt;br /&gt;
The Slurm command sinfo is used to view partition and node information for a system running Slurm. It incorporates down time, reservations, and node state information in determining the available backfill window. The sinfo command can only be used by the administrator.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
SCC has prepared a special script (sinfo_t_idle) to find out how many processors are available for immediate use on the system. It is anticipated that users will use this information to submit jobs that meet these criteria and thus obtain quick job turnaround times. &lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* The following command displays what resources are available for immediate use in each partition.&lt;br /&gt;
&amp;lt;pre&amp;gt;$ sinfo_t_idle &lt;br /&gt;
Partition dev_cpu                 :      1 nodes idle&lt;br /&gt;
Partition cpu                     :      1 nodes idle&lt;br /&gt;
Partition highmem                 :      2 nodes idle&lt;br /&gt;
Partition dev_gpu_h100            :      0 nodes idle&lt;br /&gt;
Partition gpu_h100                :      0 nodes idle&lt;br /&gt;
Partition gpu_mi300               :      0 nodes idle&lt;br /&gt;
Partition dev_cpu_il              :      7 nodes idle&lt;br /&gt;
Partition cpu_il                  :      2 nodes idle&lt;br /&gt;
Partition dev_gpu_a100_il         :      1 nodes idle&lt;br /&gt;
Partition gpu_a100_il             :      0 nodes idle&lt;br /&gt;
Partition gpu_h100_il             :      1 nodes idle&lt;br /&gt;
Partition gpu_a100_short          :      0 nodes idle&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Running Jobs =&lt;br /&gt;
&lt;br /&gt;
== Slurm Commands (excerpt) ==&lt;br /&gt;
Important Slurm commands for non-administrators working on bwUniCluster 3.0.&lt;br /&gt;
{| width=850px class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Slurm commands !! Brief explanation&lt;br /&gt;
|-&lt;br /&gt;
| [[#Batch Jobs: sbatch|sbatch]] || Submits a job and puts it into the queue [[https://slurm.schedmd.com/sbatch.html sbatch]] &lt;br /&gt;
|-&lt;br /&gt;
| [[#Interactive Jobs: salloc|salloc]] || Requests resources for an interactive Job [[https://slurm.schedmd.com/salloc.html salloc]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Monitor and manage jobs |scontrol show job]] || Displays detailed job state information [[https://slurm.schedmd.com/scontrol.html scontrol]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#List of your submitted jobs : squeue|squeue]] || Displays information about active, eligible, blocked, and/or recently completed jobs [[https://slurm.schedmd.com/squeue.html squeue]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#List of your submitted jobs : squeue|squeue --start]] || Returns start time of submitted job [[https://slurm.schedmd.com/squeue.html squeue]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Check available resources: sinfo_t_idle|sinfo_t_idle]] || Shows what resources are available for immediate use [[https://slurm.schedmd.com/sinfo.html sinfo]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Canceling own jobs : scancel|scancel]] || Cancels a job [[https://slurm.schedmd.com/scancel.html scancel]]&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
* [https://slurm.schedmd.com/tutorials.html  Slurm Tutorials]&lt;br /&gt;
* [https://slurm.schedmd.com/pdfs/summary.pdf  Slurm command/option summary (2 pages)]&lt;br /&gt;
* [https://slurm.schedmd.com/man_index.html  Slurm Commands]&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Batch Jobs: sbatch ==&lt;br /&gt;
&lt;br /&gt;
Batch jobs are submitted by using the command &#039;&#039;&#039;sbatch&#039;&#039;&#039;. The main purpose of the &#039;&#039;&#039;sbatch&#039;&#039;&#039; command is to specify the resources that are needed to run the job. &#039;&#039;&#039;sbatch&#039;&#039;&#039; will then queue the batch job. However, the start of a batch job depends on the availability of the requested resources and on the fair share value.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* The syntax and use of &#039;&#039;&#039;sbatch&#039;&#039;&#039; can be displayed via:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ man sbatch&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;sbatch&#039;&#039;&#039; options can be used from the command line or in your job script. Different defaults for some of these options are set based on the queue and can be found [[BwUniCluster3.0/Slurm | here]].&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! colspan=&amp;quot;3&amp;quot; | sbatch Options&lt;br /&gt;
|-&lt;br /&gt;
! style=&amp;quot;width:8%&amp;quot;| Command line&lt;br /&gt;
! style=&amp;quot;width:9%&amp;quot;| Script&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Purpose&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -t, --time=&#039;&#039;time&#039;&#039;&lt;br /&gt;
| #SBATCH --time=&#039;&#039;time&#039;&#039;&lt;br /&gt;
| Wall clock time limit.&amp;lt;br&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -N, --nodes=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --nodes=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of nodes to be used.&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -n, --ntasks=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --ntasks=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of tasks to be launched.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --ntasks-per-node=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --ntasks-per-node=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Maximum count of tasks per node.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -c, --cpus-per-task=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --cpus-per-task=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of CPUs required per (MPI-)task.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mem=&#039;&#039;value_in_MB&#039;&#039;&lt;br /&gt;
| #SBATCH --mem=&#039;&#039;value_in_MB&#039;&#039; &lt;br /&gt;
| Memory in MegaByte per node. (You should omit the setting of this option.)&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mem-per-cpu=&#039;&#039;value_in_MB&#039;&#039;&lt;br /&gt;
| #SBATCH --mem-per-cpu=&#039;&#039;value_in_MB&#039;&#039; &lt;br /&gt;
| Minimum Memory required per allocated CPU. (You should omit the setting of this option.)&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mail-type=&#039;&#039;type&#039;&#039;&lt;br /&gt;
| #SBATCH --mail-type=&#039;&#039;type&#039;&#039;&lt;br /&gt;
| Notify user by email when certain event types occur.&amp;lt;br&amp;gt;Valid type values are NONE, BEGIN, END, FAIL, REQUEUE, ALL.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mail-user=&#039;&#039;mail-address&#039;&#039;&lt;br /&gt;
| #SBATCH --mail-user=&#039;&#039;mail-address&#039;&#039;&lt;br /&gt;
|  The specified mail-address receives email notification of state changes as defined by --mail-type.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --output=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --output=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| File in which job output is stored. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --error=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --error=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| File in which job error messages are stored. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -J, --job-name=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --job-name=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| Job name.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --export=[ALL,] &#039;&#039;env-variables&#039;&#039;&lt;br /&gt;
| #SBATCH --export=[ALL,] &#039;&#039;env-variables&#039;&#039;&lt;br /&gt;
| Identifies which environment variables from the submission environment are propagated to the launched application. Default is ALL.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -A, --account=&#039;&#039;group-name&#039;&#039;&lt;br /&gt;
| #SBATCH --account=&#039;&#039;group-name&#039;&#039;&lt;br /&gt;
| Charge the resources used by this job to the specified group. You may need this option if your account is assigned to more than one group. With the command &amp;quot;scontrol show job&amp;quot;, the project group the job is accounted to is shown after &amp;quot;Account=&amp;quot;. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -p, --partition=&#039;&#039;queue-name&#039;&#039;&lt;br /&gt;
| #SBATCH --partition=&#039;&#039;queue-name&#039;&#039;&lt;br /&gt;
| Request a specific queue for the resource allocation.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --reservation=&#039;&#039;reservation-name&#039;&#039;&lt;br /&gt;
| #SBATCH --reservation=&#039;&#039;reservation-name&#039;&#039;&lt;br /&gt;
| Use a specific reservation for the resource allocation.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -C, --constraint=&#039;&#039;LSDF&#039;&#039;&lt;br /&gt;
| #SBATCH --constraint=LSDF&lt;br /&gt;
| Job constraint LSDF filesystems.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -C, --constraint=&#039;&#039;BEEOND (BEEOND_4MDS, BEEOND_MAXMDS)&#039;&#039;&lt;br /&gt;
| #SBATCH --constraint=BEEOND (BEEOND_4MDS, BEEOND_MAXMDS)&lt;br /&gt;
| Job constraint BeeOND filesystem.&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
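&lt;br /&gt;
To illustrate how these options are combined, here is a minimal sketch of a batch script. The file name, job name and resource values are only examples and must be adapted to your own application:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --partition=cpu&lt;br /&gt;
#SBATCH --time=02:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --job-name=my_test_job&lt;br /&gt;
#SBATCH --output=my_test_job_%j.out&lt;br /&gt;
&lt;br /&gt;
# Load required software modules here, then start the application.&lt;br /&gt;
./my_program&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The script (here saved as my_jobscript.sh) is then submitted with:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ sbatch my_jobscript.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;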
&lt;br /&gt;
== Interactive Jobs: salloc ==&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 you are only allowed to run short jobs (&amp;lt;&amp;lt; 1 hour) with low memory requirements (&amp;lt;&amp;lt; 8 GByte) on the login nodes. If you want to run longer jobs and/or jobs that request more than 8 GByte of memory, you must allocate resources for so-called interactive jobs using the command salloc on a login node. For a serial application on a compute node that requires 5000 MByte of memory with the interactive run limited to 2 hours, the following command has to be executed:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc -p cpu -n 1 -t 120 --mem=5000&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then you will get one core on a compute node within the partition &amp;quot;cpu&amp;quot;. After execution of this command &#039;&#039;&#039;DO NOT CLOSE&#039;&#039;&#039; your current terminal session but wait until the queueing system Slurm has granted you the requested resources on the compute system. You will be logged in automatically on the granted core! To run a serial program on the granted core you only have to type the name of the executable.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./&amp;lt;my_serial_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Please be aware that in this example your serial job must finish within 2 hours, otherwise it will be killed by the system during runtime. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
You can now also start a graphical X11 terminal connected to the dedicated resource, which is available for 2 hours, with the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ xterm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that once the walltime limit has been reached, the resources - i.e. the compute node - will automatically be revoked.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
An interactive parallel application running on one or several compute nodes (e.g. 5 nodes) with 96 cores each usually requires a certain amount of memory per node in GByte (e.g. 50 GByte) and a maximum time (e.g. 1 hour). For example, 5 such nodes can be allocated with the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc -p cpu -N 5 --ntasks-per-node=96 -t 01:00:00  --mem=50gb&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now you can run parallel jobs on 480 cores requiring 50 GByte of memory per node. Please be aware that you will be logged in on core 0 of the first node.&lt;br /&gt;
If you want to have access to another node you have to open a new terminal, connect it also to bwUniCluster 3.0 and type the following commands to&lt;br /&gt;
connect to the running interactive job and then to a specific node:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ srun --jobid=XXXXXXXX --pty /bin/bash&lt;br /&gt;
$ srun --nodelist=uc3nXXX --pty /bin/bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
the jobid and the nodelist can be shown.&lt;br /&gt;
&lt;br /&gt;
If you want to run MPI programs, you can do so by simply typing mpirun &amp;lt;program_name&amp;gt;. Your program will then be run on all 480 cores. A very simple example for starting a parallel job is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can also start the debugger ddt by the commands:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module add devel/ddt&lt;br /&gt;
$ ddt &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The above commands will execute the parallel program &amp;lt;my_mpi_program&amp;gt; on all available cores. You can also start parallel programs on a subset of cores; an example for this can be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun -n 50 &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you are using Intel MPI, you must start &amp;lt;my_mpi_program&amp;gt; with the command mpiexec.hydra (instead of mpirun).&lt;br /&gt;
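&lt;br /&gt;
For example (a sketch, with the program name as a placeholder):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpiexec.hydra -n 50 &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;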
&lt;br /&gt;
== Interactive Computing with Jupyter ==&lt;br /&gt;
&lt;br /&gt;
== Monitor and manage jobs ==&lt;br /&gt;
&lt;br /&gt;
=== List of your submitted jobs : squeue ===&lt;br /&gt;
Displays information about your own active, pending and/or recently completed jobs. The command squeue is explained in detail on the webpage https://slurm.schedmd.com/squeue.html or via the manpage (man squeue).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;squeue&#039;&#039; example on bwUniCluster 3.0 &amp;lt;small&amp;gt;(Only your own jobs are displayed!)&amp;lt;/small&amp;gt;.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue &lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
              1262       cpu     wrap ka_ab123  R       8:15      1 uc3n002&lt;br /&gt;
              1267 dev_gpu_h     wrap ka_ab123 PD       0:00      1 (Resources)&lt;br /&gt;
              1265   highmem     wrap ka_ab123  R       2:41      1 uc3n084&lt;br /&gt;
$ squeue -l&lt;br /&gt;
             JOBID PARTITION     NAME     USER    STATE       TIME TIME_LIMI  NODES NODELIST(REASON)&lt;br /&gt;
              1262       cpu     wrap ka_ab123  RUNNING       8:55     20:00      1 uc3n002&lt;br /&gt;
              1267 dev_gpu_h     wrap ka_ab123  PENDING       0:00     20:00      1 (Resources)&lt;br /&gt;
              1265   highmem     wrap ka_ab123  RUNNING       3:21     20:00      1 uc3n084&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Detailed job information : scontrol show job ===&lt;br /&gt;
scontrol show job displays detailed job state information and diagnostic output for all of your jobs or for a specified job. Detailed information is available for active, pending and recently completed jobs. The command scontrol is explained in detail on the webpage https://slurm.schedmd.com/scontrol.html or via the manpage (man scontrol). &lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Display the state of all your jobs in normal mode: scontrol show job&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Display the state of a job with &amp;lt;jobid&amp;gt; in normal mode: scontrol show job &amp;lt;jobid&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Here is an example from bwUniCluster 3.0.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue&lt;br /&gt;
JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
1262       cpu     wrap ka_zs040  R       1:12      1 uc3n002&lt;br /&gt;
&lt;br /&gt;
$&lt;br /&gt;
$ # now, see what&#039;s up with my pending job with jobid 1262&lt;br /&gt;
$ &lt;br /&gt;
$ scontrol show job 1262&lt;br /&gt;
&lt;br /&gt;
JobId=1262 JobName=wrap&lt;br /&gt;
   UserId=ka_zs0402(241992) GroupId=ka_scc(12345) MCS_label=N/A&lt;br /&gt;
   Priority=4246 Nice=0 Account=ka QOS=normal&lt;br /&gt;
   JobState=RUNNING Reason=None Dependency=(null)&lt;br /&gt;
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0&lt;br /&gt;
   RunTime=00:00:37 TimeLimit=00:20:00 TimeMin=N/A&lt;br /&gt;
   SubmitTime=2025-04-04T10:01:30 EligibleTime=2025-04-04T10:01:30&lt;br /&gt;
   AccrueTime=2025-04-04T10:01:30&lt;br /&gt;
   StartTime=2025-04-04T10:01:31 EndTime=2025-04-04T10:21:31 Deadline=N/A&lt;br /&gt;
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2025-04-04T10:01:31 Scheduler=Main&lt;br /&gt;
   Partition=cpu AllocNode:Sid=uc3n999:2819841&lt;br /&gt;
   ReqNodeList=(null) ExcNodeList=(null)&lt;br /&gt;
   NodeList=uc3n002&lt;br /&gt;
   BatchHost=uc3n002&lt;br /&gt;
   NumNodes=1 NumCPUs=2 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*&lt;br /&gt;
   ReqTRES=cpu=1,mem=2000M,node=1,billing=1&lt;br /&gt;
   AllocTRES=cpu=2,mem=4000M,node=1,billing=2&lt;br /&gt;
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:1 CoreSpec=*&lt;br /&gt;
   MinCPUsNode=1 MinMemoryCPU=2000M MinTmpDiskNode=0&lt;br /&gt;
   Features=(null) DelayBoot=00:00:00&lt;br /&gt;
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)&lt;br /&gt;
   Command=(null)&lt;br /&gt;
   WorkDir=/pfs/data6/home/ka/ka_scc/ka_zs0402&lt;br /&gt;
   StdErr=/pfs/data6/home/ka/ka_scc/ka_zs0402/slurm-1262.out&lt;br /&gt;
   StdIn=/dev/null&lt;br /&gt;
   StdOut=/pfs/data6/home/ka/ka_scc/ka_zs0402/slurm-1262.out&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
=== Canceling own jobs : scancel ===&lt;br /&gt;
The scancel command is used to cancel jobs. The command scancel is explained in detail on the webpage https://slurm.schedmd.com/scancel.html or via manpage (man scancel). The command is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ scancel [-i] &amp;lt;job-id&amp;gt;&lt;br /&gt;
$ scancel -t &amp;lt;job_state_name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
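For example, to cancel the job with jobid 1262 from the squeue listing above:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ scancel 1262&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
With the option -i, scancel asks for confirmation before cancelling a job.&lt;br /&gt;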
&lt;br /&gt;
= Slurm Options =&lt;br /&gt;
[[BwUniCluster3.0/Running_Jobs/Slurm | Detailed Slurm usage]]&lt;br /&gt;
&lt;br /&gt;
= Best Practices =&lt;br /&gt;
&lt;br /&gt;
== Step-by-Step example==&lt;br /&gt;
&lt;br /&gt;
== Dos and Don&#039;ts ==&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Running_Jobs&amp;diff=15162</id>
		<title>BwUniCluster3.0/Running Jobs</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=BwUniCluster3.0/Running_Jobs&amp;diff=15162"/>
		<updated>2025-07-21T15:18:20Z</updated>

		<summary type="html">&lt;p&gt;S Braun: /* Queues on bwUniCluster 3.0 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
= Purpose and function of a queuing system =&lt;br /&gt;
&lt;br /&gt;
All compute activities on bwUniCluster 3.0 have to be performed on the compute nodes. Compute nodes are only available by requesting the corresponding resources via the queuing system. As soon as the requested resources are available, tasks are either executed automatically via a batch script or the nodes can be used interactively.&amp;lt;br&amp;gt;&lt;br /&gt;
For the general procedure, see [[Running_Calculations | Running Calculations]].&lt;br /&gt;
&lt;br /&gt;
== Job submission process ==&lt;br /&gt;
&lt;br /&gt;
bwUniCluster 3.0 uses the workload management software Slurm. Therefore, any job submission by the user must be performed with commands of the Slurm software. Slurm queues and runs user jobs based on fair-sharing policies.&lt;br /&gt;
&lt;br /&gt;
== Slurm ==&lt;br /&gt;
&lt;br /&gt;
The HPC workload manager on bwUniCluster 3.0 is Slurm.&lt;br /&gt;
Slurm is a cluster management and job scheduling system with three key functions: &lt;br /&gt;
* It allocates access to resources (compute cores on nodes) to users for some duration of time so they can perform work. &lt;br /&gt;
* It provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. &lt;br /&gt;
* It arbitrates contention for resources by managing a queue of pending work.&lt;br /&gt;
&lt;br /&gt;
Any kind of calculation on the compute nodes of bwUniCluster 3.0 requires the user to define the calculation as a sequence of commands together with the required run time, number of CPU cores and main memory, and to submit all of this, i.e. the &#039;&#039;&#039;batch job&#039;&#039;&#039;, to the resource and workload management software.&lt;br /&gt;
&lt;br /&gt;
== Terms and definitions ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Partitions &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Slurm manages job queues for different &#039;&#039;&#039;partitions&#039;&#039;&#039;. Partitions are used to group similar node types (e.g. nodes with and without accelerators) and to enforce different access policies and resource limits.&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 there are different partitions:&lt;br /&gt;
&lt;br /&gt;
* CPU-only nodes&lt;br /&gt;
** 2-socket nodes, consisting of 2 Intel Ice Lake processors with 32 cores each or 2 AMD processors with 48 cores each&lt;br /&gt;
** 2-socket nodes with very high RAM capacity, consisting of 2 AMD processors with 48 cores each&lt;br /&gt;
* GPU-accelerated nodes&lt;br /&gt;
** 2-socket nodes with 4x NVIDIA A100 or 4x NVIDIA H100 GPUs&lt;br /&gt;
** 4-socket node with 4x AMD Instinct accelerator&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Queues &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Job &#039;&#039;&#039;queues&#039;&#039;&#039; are used to manage jobs that request access to shared but limited computing resources of a certain kind (partition).&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 there are different main types of queues:&lt;br /&gt;
* Regular queues&lt;br /&gt;
** cpu: Jobs that request CPU-only nodes.&lt;br /&gt;
** gpu: Jobs that request GPU-accelerated nodes.&lt;br /&gt;
* Development queues (dev)&lt;br /&gt;
** Short, usually interactive jobs that are used for developing, compiling and testing code and workflows. The intention behind development queues is to provide users with immediate access to compute resources without having to wait. They are the place for short but heavy test computations that would disturb other users if run on the login nodes.&lt;br /&gt;
&lt;br /&gt;
Requested compute resources such as (wall-)time, number of nodes and amount of memory are restricted and must fit into the boundaries imposed by the queues. The request for compute resources on the bwUniCluster 3.0 &amp;lt;font color=red&amp;gt;requires at least the specification of the &#039;&#039;&#039;queue&#039;&#039;&#039; and the &#039;&#039;&#039;time&#039;&#039;&#039;&amp;lt;/font&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039; Jobs &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Jobs can be run non-interactively as &#039;&#039;&#039;batch jobs&#039;&#039;&#039; or as &#039;&#039;&#039;interactive jobs&#039;&#039;&#039;.&amp;lt;br&amp;gt;&lt;br /&gt;
Submitting a batch job means that all steps of a compute project are defined in a Bash script. This Bash script is queued and executed as soon as the compute resources are available and allocated. Jobs are enqueued with the &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; command.&lt;br /&gt;
For interactive jobs, the resources are requested with the &amp;lt;code&amp;gt;salloc&amp;lt;/code&amp;gt; command. As soon as the computing resources are available and allocated, a command line prompt is returned on a compute node and the user can freely use the allocated resources.&lt;br /&gt;
{|style=&amp;quot;background:#deffee; width:100%;&amp;quot;&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
[[Image:Attention.svg|center|25px]]&lt;br /&gt;
|style=&amp;quot;padding:5px; background:#cef2e0; text-align:left&amp;quot;|&lt;br /&gt;
&#039;&#039;&#039;Please remember:&#039;&#039;&#039;&lt;br /&gt;
* &#039;&#039;&#039;Heavy computations are not allowed on the login nodes&#039;&#039;&#039;.&amp;lt;br&amp;gt;Use a development or a regular job queue instead! Please refer to [[BwUniCluster3.0/Login#Allowed_Activities_on_Login_Nodes|Allowed Activities on Login Nodes]].&lt;br /&gt;
* &#039;&#039;&#039;Development queues&#039;&#039;&#039; are meant for &#039;&#039;&#039;development tasks&#039;&#039;&#039;.&amp;lt;br&amp;gt;Do not misuse these queues for regular, short-running jobs or chain jobs! Only one job may run at a time, and at most 3 jobs may be queued.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
= Queues on bwUniCluster 3.0 = &lt;br /&gt;
== Policy ==&lt;br /&gt;
&lt;br /&gt;
The computing time is provided in accordance with the &#039;&#039;&#039;fair share policy&#039;&#039;&#039;. The individual investment shares of the respective university and the resources already used by its members are taken into account. Furthermore, the following throttling policy is also active: the &#039;&#039;&#039;maximum number of physical cores&#039;&#039;&#039; in use at any given time by running jobs is &#039;&#039;&#039;1920 per user&#039;&#039;&#039; (aggregated over all running jobs). This number corresponds to 30 nodes on the Ice Lake partition or 20 nodes on the standard partition. The aim is to minimize waiting times and maximize the number of users who can access computing time at the same time.&lt;br /&gt;
&lt;br /&gt;
== Regular Queues ==&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node-Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;cpu_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Ice Lake&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=72:00:00, nodes=30, mem=249600mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;cpu&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Standard&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=72:00:00, nodes=20, mem=380000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;highmem&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;High Memory&lt;br /&gt;
| mem-per-cpu=12090mb&lt;br /&gt;
| mem=380001mb&lt;br /&gt;
| time=72:00:00, nodes=4, mem=2300000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_h100&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=193300mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| &lt;br /&gt;
| time=72:00:00, nodes=12, mem=760000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_mi300&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU node&amp;lt;br/&amp;gt;AMD GPU x4&lt;br /&gt;
| mem-per-gpu=128200mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| &lt;br /&gt;
| time=72:00:00, nodes=1, mem=510000mb, ntasks-per-node=40, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_a100_il&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;gpu_h100_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;Ice Lake&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=127500mb&amp;lt;br/&amp;gt;cpus-per-gpu=16&lt;br /&gt;
| &lt;br /&gt;
| time=48:00:00, nodes=9, mem=510000mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|}&lt;br /&gt;
Table 1: Regular Queues&lt;br /&gt;
&lt;br /&gt;
== Short Queues ==&lt;br /&gt;
Queues with a short maximum runtime of 30 minutes.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;gpu_a100_short&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;Ice Lake&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=127500mb&amp;lt;br/&amp;gt;cpus-per-gpu=16&lt;br /&gt;
| &lt;br /&gt;
| time=30, nodes=8, mem=249600mb, ntasks-per-node=64, (threads-per-core=2)&lt;br /&gt;
|}&lt;br /&gt;
Table 2: Short Queues&lt;br /&gt;
&lt;br /&gt;
== Development Queues ==&lt;br /&gt;
Only for development, i.e. debugging or performance optimization ...&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|- &lt;br /&gt;
! style=&amp;quot;width:5%&amp;quot;| Queue&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Node Type&lt;br /&gt;
! style=&amp;quot;width:23%&amp;quot;| Default Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Minimal Resources&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Maximum Resources&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_cpu_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Ice Lake&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=30, nodes=8, mem=249600mb, ntasks-per-node=64, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_cpu&amp;lt;/code&amp;gt;&lt;br /&gt;
| CPU nodes&amp;lt;br/&amp;gt;Standard&lt;br /&gt;
| mem-per-cpu=2000mb&lt;br /&gt;
| &lt;br /&gt;
| time=30, nodes=1, mem=380000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_gpu_h100&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&lt;br /&gt;
| mem-per-gpu=193300mb&amp;lt;br/&amp;gt;cpus-per-gpu=24&lt;br /&gt;
| &lt;br /&gt;
| time=30, nodes=1, mem=760000mb, ntasks-per-node=96, (threads-per-core=2)&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;dev_gpu_a100_il&amp;lt;/code&amp;gt;&lt;br /&gt;
| GPU nodes&amp;lt;br/&amp;gt;NVIDIA GPU x4&amp;lt;br/&amp;gt;&lt;br /&gt;
| mem-per-gpu=127500mb&amp;lt;br/&amp;gt;cpus-per-gpu=16 &lt;br /&gt;
| &lt;br /&gt;
| time=30, nodes=1, mem=510000mb, ntasks-per-node=64, (threads-per-core=2) &lt;br /&gt;
|}&lt;br /&gt;
Table 3: Development Queues&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The default resources of a queue class define the number of tasks and the memory if these are not explicitly given with the sbatch command. The resource options &#039;&#039;--time&#039;&#039;, &#039;&#039;--ntasks&#039;&#039;, &#039;&#039;--nodes&#039;&#039;, &#039;&#039;--mem&#039;&#039; and &#039;&#039;--mem-per-cpu&#039;&#039; are described [[BwUniCluster3.0/Running_Jobs/Slurm|here]].&lt;br /&gt;
&lt;br /&gt;
== Check available resources: sinfo_t_idle ==&lt;br /&gt;
The Slurm command sinfo is used to view partition and node information for a system running Slurm. It incorporates down time, reservations, and node state information in determining the available backfill window. The sinfo command can only be used by the administrator.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
SCC has prepared a special script (sinfo_t_idle) to find out how many processors are available for immediate use on the system. It is anticipated that users will use this information to submit jobs that meet these criteria and thus obtain quick job turnaround times. &lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* The following command displays what resources are available for immediate use in each partition.&lt;br /&gt;
&amp;lt;pre&amp;gt;$ sinfo_t_idle&lt;br /&gt;
Partition dev_cpu                 :      2 nodes idle&lt;br /&gt;
Partition cpu                     :     68 nodes idle&lt;br /&gt;
Partition highmem                 :      4 nodes idle&lt;br /&gt;
Partition dev_gpu_h100            :      0 nodes idle&lt;br /&gt;
Partition gpu_h100                :     11 nodes idle&lt;br /&gt;
Partition gpu_mi300               :      1 nodes idle&lt;br /&gt;
Partition dev_cpu_il              :      0 nodes idle&lt;br /&gt;
Partition cpu_il                  :      0 nodes idle&lt;br /&gt;
Partition dev_gpu_a100_il         :      0 nodes idle&lt;br /&gt;
Partition gpu_a100_il             :      0 nodes idle&lt;br /&gt;
Partition gpu_h100_il             :      0 nodes idle&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Running Jobs =&lt;br /&gt;
&lt;br /&gt;
== Slurm Commands (excerpt) ==&lt;br /&gt;
Important Slurm commands for non-administrators working on bwUniCluster 3.0.&lt;br /&gt;
{| width=850px class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Slurm commands !! Brief explanation&lt;br /&gt;
|-&lt;br /&gt;
| [[#Batch Jobs: sbatch|sbatch]] || Submits a job and puts it into the queue [[https://slurm.schedmd.com/sbatch.html sbatch]] &lt;br /&gt;
|-&lt;br /&gt;
| [[#Interactive Jobs: salloc|salloc]] || Requests resources for an interactive Job [[https://slurm.schedmd.com/salloc.html salloc]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Monitor and manage jobs |scontrol show job]] || Displays detailed job state information [[https://slurm.schedmd.com/scontrol.html scontrol]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#List of your submitted jobs : squeue|squeue]] || Displays information about active, eligible, blocked, and/or recently completed jobs [[https://slurm.schedmd.com/squeue.html squeue]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#List of your submitted jobs : squeue|squeue --start]] || Returns start time of submitted job [[https://slurm.schedmd.com/squeue.html squeue]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Check available resources: sinfo_t_idle|sinfo_t_idle]] || Shows what resources are available for immediate use [[https://slurm.schedmd.com/sinfo.html sinfo]]&lt;br /&gt;
|-&lt;br /&gt;
| [[#Canceling own jobs : scancel|scancel]] || Cancels a job [[https://slurm.schedmd.com/scancel.html scancel]]&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
* [https://slurm.schedmd.com/tutorials.html  Slurm Tutorials]&lt;br /&gt;
* [https://slurm.schedmd.com/pdfs/summary.pdf  Slurm command/option summary (2 pages)]&lt;br /&gt;
* [https://slurm.schedmd.com/man_index.html  Slurm Commands]&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Batch Jobs: sbatch ==&lt;br /&gt;
&lt;br /&gt;
Batch jobs are submitted by using the command &#039;&#039;&#039;sbatch&#039;&#039;&#039;. The main purpose of the &#039;&#039;&#039;sbatch&#039;&#039;&#039; command is to specify the resources that are needed to run the job. &#039;&#039;&#039;sbatch&#039;&#039;&#039; will then queue the batch job. However, the start of a batch job depends on the availability of the requested resources and on the fair share value.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* The syntax and use of &#039;&#039;&#039;sbatch&#039;&#039;&#039; can be displayed via:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ man sbatch&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;sbatch&#039;&#039;&#039; options can be used from the command line or in your job script. Different defaults for some of these options are set based on the queue and can be found [[BwUniCluster3.0/Slurm | here]].&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! colspan=&amp;quot;3&amp;quot; | sbatch Options&lt;br /&gt;
|-&lt;br /&gt;
! style=&amp;quot;width:8%&amp;quot;| Command line&lt;br /&gt;
! style=&amp;quot;width:9%&amp;quot;| Script&lt;br /&gt;
! style=&amp;quot;width:13%&amp;quot;| Purpose&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -t, --time=&#039;&#039;time&#039;&#039;&lt;br /&gt;
| #SBATCH --time=&#039;&#039;time&#039;&#039;&lt;br /&gt;
| Wall clock time limit.&amp;lt;br&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -N, --nodes=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --nodes=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of nodes to be used.&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -n, --ntasks=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --ntasks=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of tasks to be launched.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --ntasks-per-node=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --ntasks-per-node=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Maximum count of tasks per node.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -c, --cpus-per-task=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| #SBATCH --cpus-per-task=&#039;&#039;count&#039;&#039;&lt;br /&gt;
| Number of CPUs required per (MPI-)task.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mem=&#039;&#039;value_in_MB&#039;&#039;&lt;br /&gt;
| #SBATCH --mem=&#039;&#039;value_in_MB&#039;&#039; &lt;br /&gt;
| Memory in MegaByte per node. (You should omit the setting of this option.)&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mem-per-cpu=&#039;&#039;value_in_MB&#039;&#039;&lt;br /&gt;
| #SBATCH --mem-per-cpu=&#039;&#039;value_in_MB&#039;&#039; &lt;br /&gt;
| Minimum Memory required per allocated CPU. (You should omit the setting of this option.)&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mail-type=&#039;&#039;type&#039;&#039;&lt;br /&gt;
| #SBATCH --mail-type=&#039;&#039;type&#039;&#039;&lt;br /&gt;
| Notify user by email when certain event types occur.&amp;lt;br&amp;gt;Valid type values are NONE, BEGIN, END, FAIL, REQUEUE, ALL.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --mail-user=&#039;&#039;mail-address&#039;&#039;&lt;br /&gt;
| #SBATCH --mail-user=&#039;&#039;mail-address&#039;&#039;&lt;br /&gt;
|  The specified mail-address receives email notification of state changes as defined by --mail-type.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --output=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --output=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| File in which job output is stored. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --error=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --error=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| File in which job error messages are stored. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -J, --job-name=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| #SBATCH --job-name=&#039;&#039;name&#039;&#039;&lt;br /&gt;
| Job name.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --export=[ALL,] &#039;&#039;env-variables&#039;&#039;&lt;br /&gt;
| #SBATCH --export=[ALL,] &#039;&#039;env-variables&#039;&#039;&lt;br /&gt;
| Identifies which environment variables from the submission environment are propagated to the launched application. Default is ALL.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -A, --account=&#039;&#039;group-name&#039;&#039;&lt;br /&gt;
| #SBATCH --account=&#039;&#039;group-name&#039;&#039;&lt;br /&gt;
| Charge the resources used by this job to the specified group. You may need this option if your account is assigned to more than one group. With the command &amp;quot;scontrol show job&amp;quot;, the project group the job is accounted to is shown after &amp;quot;Account=&amp;quot;. &lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -p, --partition=&#039;&#039;queue-name&#039;&#039;&lt;br /&gt;
| #SBATCH --partition=&#039;&#039;queue-name&#039;&#039;&lt;br /&gt;
| Request a specific queue for the resource allocation.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| --reservation=&#039;&#039;reservation-name&#039;&#039;&lt;br /&gt;
| #SBATCH --reservation=&#039;&#039;reservation-name&#039;&#039;&lt;br /&gt;
| Use a specific reservation for the resource allocation.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -C, --constraint=&#039;&#039;LSDF&#039;&#039;&lt;br /&gt;
| #SBATCH --constraint=LSDF&lt;br /&gt;
| Job constraint LSDF filesystems.&lt;br /&gt;
|-&lt;br /&gt;
|- style=&amp;quot;vertical-align:top;&amp;quot;&lt;br /&gt;
| -C, --constraint=&#039;&#039;BEEOND (BEEOND_4MDS, BEEOND_MAXMDS)&#039;&#039;&lt;br /&gt;
| #SBATCH --constraint=BEEOND (BEEOND_4MDS, BEEOND_MAXMDS)&lt;br /&gt;
| Job constraint BeeOND filesystem.&lt;br /&gt;
|-&lt;br /&gt;
|}&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Interactive Jobs: salloc ==&lt;br /&gt;
&lt;br /&gt;
On bwUniCluster 3.0 you are only allowed to run short jobs (&amp;lt;&amp;lt; 1 hour) with low memory requirements (&amp;lt;&amp;lt; 8 GByte) on the login nodes. If you want to run longer jobs and/or jobs that request more than 8 GByte of memory, you must allocate resources for so-called interactive jobs using the command salloc on a login node. For a serial application on a compute node that requires 5000 MByte of memory with the interactive run limited to 2 hours, the following command has to be executed:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc -p cpu -n 1 -t 120 --mem=5000&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then you will get one core on a compute node within the partition &amp;quot;cpu&amp;quot;. After execution of this command &#039;&#039;&#039;DO NOT CLOSE&#039;&#039;&#039; your current terminal session but wait until the queueing system Slurm has granted you the requested resources on the compute system. You will be logged in automatically on the granted core! To run a serial program on the granted core you only have to type the name of the executable.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ./&amp;lt;my_serial_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Please be aware that in this example your serial job must finish within 2 hours, otherwise it will be killed by the system during runtime. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
You can now also start a graphical X11 terminal connected to the dedicated resource, which is available for 2 hours, with the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ xterm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that once the walltime limit has been reached, the resources - i.e. the compute node - will automatically be revoked.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
An interactive parallel application running on one or several compute nodes (e.g. 5 nodes) with 96 cores each usually requires a certain amount of memory per node in GByte (e.g. 50 GByte) and a maximum time (e.g. 1 hour). For example, 5 such nodes can be allocated with the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ salloc -p cpu -N 5 --ntasks-per-node=96 -t 01:00:00  --mem=50gb&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now you can run parallel jobs on 480 cores requiring 50 GByte of memory per node. Please be aware that you will be logged in on core 0 of the first node.&lt;br /&gt;
If you want to have access to another node you have to open a new terminal, connect it also to bwUniCluster 3.0 and type the following commands to&lt;br /&gt;
connect to the running interactive job and then to a specific node:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ srun --jobid=XXXXXXXX --pty /bin/bash&lt;br /&gt;
$ srun --nodelist=uc3nXXX --pty /bin/bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With the command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
the jobid and the nodelist can be shown.&lt;br /&gt;
&lt;br /&gt;
If you want to run MPI programs, you can do so by simply typing mpirun &amp;lt;program_name&amp;gt;. Your program will then be run on all 480 cores. A very simple example for starting a parallel job is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can also start the debugger ddt by the commands:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module add devel/ddt&lt;br /&gt;
$ ddt &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The above commands will execute the parallel program &amp;lt;my_mpi_program&amp;gt; on all available cores. You can also start parallel programs on a subset of cores; an example for this can be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mpirun -n 50 &amp;lt;my_mpi_program&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you are using Intel MPI, you must start &amp;lt;my_mpi_program&amp;gt; with the command mpiexec.hydra (instead of mpirun).&lt;br /&gt;
&lt;br /&gt;
== Interactive Computing with Jupyter ==&lt;br /&gt;
&lt;br /&gt;
== Monitor and manage jobs ==&lt;br /&gt;
&lt;br /&gt;
=== List of your submitted jobs : squeue ===&lt;br /&gt;
Displays information about your own active, pending and/or recently completed jobs. The command squeue is explained in detail on the webpage https://slurm.schedmd.com/squeue.html or via manpage (man squeue).&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;squeue&#039;&#039; example on bwUniCluster 3.0 &amp;lt;small&amp;gt;(Only your own jobs are displayed!)&amp;lt;/small&amp;gt;.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue &lt;br /&gt;
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
              1262       cpu     wrap ka_ab123  R       8:15      1 uc3n002&lt;br /&gt;
              1267 dev_gpu_h     wrap ka_ab123 PD       0:00      1 (Resources)&lt;br /&gt;
              1265   highmem     wrap ka_ab123  R       2:41      1 uc3n084&lt;br /&gt;
$ squeue -l&lt;br /&gt;
             JOBID PARTITION     NAME     USER    STATE       TIME TIME_LIMI  NODES NODELIST(REASON)&lt;br /&gt;
              1262       cpu     wrap ka_ab123  RUNNING       8:55     20:00      1 uc3n002&lt;br /&gt;
              1267 dev_gpu_h     wrap ka_ab123  PENDING       0:00     20:00      1 (Resources)&lt;br /&gt;
              1265   highmem     wrap ka_ab123  RUNNING       3:21     20:00      1 uc3n084&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Detailed job information : scontrol show job ===&lt;br /&gt;
scontrol show job displays detailed job state information and diagnostic output, either for all of your jobs or for one specified job. Detailed information is available for active, pending and recently completed jobs. The command scontrol is explained in detail on the webpage https://slurm.schedmd.com/scontrol.html or via manpage (man scontrol). &lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Display the state of all your jobs in normal mode: scontrol show job&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Display the state of a job with &amp;lt;jobid&amp;gt; in normal mode: scontrol show job &amp;lt;jobid&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* Here is an example from bwUniCluster 3.0.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ squeue&lt;br /&gt;
JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)&lt;br /&gt;
1262       cpu     wrap ka_zs040  R       1:12      1 uc3n002&lt;br /&gt;
&lt;br /&gt;
$&lt;br /&gt;
$ # now, see what&#039;s up with my job with jobid 1262&lt;br /&gt;
$ &lt;br /&gt;
$ scontrol show job 1262&lt;br /&gt;
&lt;br /&gt;
JobId=1262 JobName=wrap&lt;br /&gt;
   UserId=ka_zs0402(241992) GroupId=ka_scc(12345) MCS_label=N/A&lt;br /&gt;
   Priority=4246 Nice=0 Account=ka QOS=normal&lt;br /&gt;
   JobState=RUNNING Reason=None Dependency=(null)&lt;br /&gt;
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0&lt;br /&gt;
   RunTime=00:00:37 TimeLimit=00:20:00 TimeMin=N/A&lt;br /&gt;
   SubmitTime=2025-04-04T10:01:30 EligibleTime=2025-04-04T10:01:30&lt;br /&gt;
   AccrueTime=2025-04-04T10:01:30&lt;br /&gt;
   StartTime=2025-04-04T10:01:31 EndTime=2025-04-04T10:21:31 Deadline=N/A&lt;br /&gt;
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2025-04-04T10:01:31 Scheduler=Main&lt;br /&gt;
   Partition=cpu AllocNode:Sid=uc3n999:2819841&lt;br /&gt;
   ReqNodeList=(null) ExcNodeList=(null)&lt;br /&gt;
   NodeList=uc3n002&lt;br /&gt;
   BatchHost=uc3n002&lt;br /&gt;
   NumNodes=1 NumCPUs=2 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*&lt;br /&gt;
   ReqTRES=cpu=1,mem=2000M,node=1,billing=1&lt;br /&gt;
   AllocTRES=cpu=2,mem=4000M,node=1,billing=2&lt;br /&gt;
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:1 CoreSpec=*&lt;br /&gt;
   MinCPUsNode=1 MinMemoryCPU=2000M MinTmpDiskNode=0&lt;br /&gt;
   Features=(null) DelayBoot=00:00:00&lt;br /&gt;
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)&lt;br /&gt;
   Command=(null)&lt;br /&gt;
   WorkDir=/pfs/data6/home/ka/ka_scc/ka_zs0402&lt;br /&gt;
   StdErr=/pfs/data6/home/ka/ka_scc/ka_zs0402/slurm-1262.out&lt;br /&gt;
   StdIn=/dev/null&lt;br /&gt;
   StdOut=/pfs/data6/home/ka/ka_scc/ka_zs0402/slurm-1262.out&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
=== Canceling own jobs : scancel ===&lt;br /&gt;
The scancel command is used to cancel jobs. It is explained in detail on the webpage https://slurm.schedmd.com/scancel.html or via manpage (man scancel). Its basic usage is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ scancel [-i] &amp;lt;job-id&amp;gt;&lt;br /&gt;
$ scancel -t &amp;lt;job_state_name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
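For example, to cancel the job with job ID 1262 from the squeue listing above, or to cancel all of your pending jobs at once (a sketch; replace the job ID with your own):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ scancel 1262&lt;br /&gt;
$ scancel -t PENDING&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;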
= Slurm Options =&lt;br /&gt;
[[BwUniCluster3.0/Running_Jobs/Slurm | Detailed Slurm usage]]&lt;br /&gt;
&lt;br /&gt;
= Best Practices =&lt;br /&gt;
&lt;br /&gt;
== Step-by-Step example==&lt;br /&gt;
&lt;br /&gt;
== Dos and Don&#039;ts ==&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=Development/R&amp;diff=15069</id>
		<title>Development/R</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=Development/R&amp;diff=15069"/>
		<updated>2025-07-10T12:36:43Z</updated>

		<summary type="html">&lt;p&gt;S Braun: /* Installing R-Packages into your home folder */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Description =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;R&#039;&#039;&#039; is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&amp;amp;T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.&lt;br /&gt;
&lt;br /&gt;
R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.&lt;br /&gt;
&lt;br /&gt;
One of R’s strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.&lt;br /&gt;
&lt;br /&gt;
R is available as Free Software under the terms of the Free Software Foundation’s GNU General Public License in source code form. It compiles and runs on a wide variety of UNIX platforms and similar systems (including FreeBSD and Linux), Windows and MacOS.&lt;br /&gt;
&lt;br /&gt;
The R installation also provides the standalone library &#039;&#039;&#039;libRmath&#039;&#039;&#039;. This library allows you to access R routines from your own C or C++ programs (see section 9 of the &#039;R Installation and Administration&#039; manual).&lt;br /&gt;
&lt;br /&gt;
= Package management =&lt;br /&gt;
&lt;br /&gt;
== Installing R-Packages into your home folder ==&lt;br /&gt;
Since we cannot provide a software module for every R package, we recommend installing specific R packages locally in your home folder. Depending on the package, some dependencies required for the installation may only be available on the login nodes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;gt; library()                                                                # List pre-installed packages&lt;br /&gt;
&amp;gt; install.packages(&#039;package_name&#039;, repos=&amp;quot;http://cran.r-project.org&amp;quot;)      # Install your R package and the dependencies &lt;br /&gt;
&amp;gt; library(package_name)                                                    # Load the package into your R instance&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The package is now installed permanently in your home folder and is available every time you start R. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
By default R uses a version (and platform) specific path for personal libraries, such as &lt;br /&gt;
&amp;quot;$HOME/R/x86_64-pc-linux-gnu-library/x.y&amp;quot; for R version x.y.z. This directory will be created automatically (after confirmation) when installing a personal package for the first time.&lt;br /&gt;
&lt;br /&gt;
Users can customize a common location of their personal library packages, e.g. ~/R_libs, rather than the default location. A customized directory must exist before installing a personal package for the first time, i.e. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mkdir -p ~/R_libs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In order to set the correct path when using R, the location must also be defined in a configuration file &#039;&#039;&#039;~/.Renviron&#039;&#039;&#039; in the home directory containing the following line:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
R_LIBS_USER=&amp;quot;~/R_libs&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
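As a quick check that the custom location is actually picked up (a sketch, assuming ~/R_libs exists and an R module is loaded), you can print the library search path from the shell; ~/R_libs should then be listed first:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ Rscript -e &#039;.libPaths()&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;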
By setting up a (fixed) custom location for personal library packages, any personal package installed into that directory will be visible across different R versions. This may be advantageous if the packages are to be used with different (future) R versions. &lt;br /&gt;
&lt;br /&gt;
A version specific path, such as the default path, allows users to maintain multiple personal library stacks for different (major and minor) R versions and does also prevent users from mixing their stack with libraries built with different R versions. &lt;br /&gt;
&lt;br /&gt;
The drawback is that, whenever switching to a new R release, the personal library stack &#039;&#039;&#039;must&#039;&#039;&#039; be rebuilt with that new R version into the corresponding (version specific) library path. This is considered good practice anyway in order to ensure a consistent personal library stack for any specific R version in use. Mixing libraries built with different major and minor R versions is discouraged, as this may result in unpredictable and subtle errors. Packages that are built and installed with one version of R may be incompatible with a newer version of R, at least when the major or minor version changes. The same is true if several versions are used simultaneously, e.g. a newer R version for a more recently started project and an older version for another project (but possibly picking up libraries built with the newer R version).&lt;br /&gt;
&lt;br /&gt;
Because the default version may change over time, it is highly recommended to always load a specific version, e.g.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load math/R/4.4.2-openblas&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Pre-installed R-packages ==&lt;br /&gt;
* Rmpi&lt;br /&gt;
* iterators&lt;br /&gt;
* foreach&lt;br /&gt;
* doMPI&lt;br /&gt;
* doParallel&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Custom installation options ==&lt;br /&gt;
=== Makevars ===&lt;br /&gt;
You can add custom build and installation flags for new R-packages through the &#039;&#039;~/.R/Makevars&#039;&#039; file. This will take the form of a basic Makefile and will be added to the existing options set by the module. While the module&#039;s defaults should be appropriate for installing most packages, it can be helpful or necessary to include additional options, for example to make locally installed dependencies visible to g++:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
MAKEFLAGS=-j4 &lt;br /&gt;
CXXFLAGS=-I/path/to/dependency/include -L/path/to/dependency/lib -llib&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is also possible to set more targeted instruction sets, e.g. for AMD processors:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CXXFLAGS=-march=znver3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The defaults can be found under &#039;&#039;$R_HOME_DIR/lib64/R/etc/Makeconf&#039;&#039;. It is usually not advisable to change the compilers (gcc, g++) themselves.&lt;br /&gt;
&lt;br /&gt;
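To see which defaults the module already sets before overriding them in ~/.R/Makevars, you can for instance grep the Makeconf file mentioned above (a sketch; the exact variable names may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ grep -E &#039;^(CC|CXX|CFLAGS|CXXFLAGS)&#039; $R_HOME_DIR/lib64/R/etc/Makeconf&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;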
=== Configure arguments ===&lt;br /&gt;
You can set further &#039;&#039;configure.args&#039;&#039; and &#039;&#039;configure.vars&#039;&#039; as character vectors to &#039;&#039;install.packages()&#039;&#039;, e.g.:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
install.packages(&amp;quot;Rjags&amp;quot;, configure.args = &amp;quot;--enable-rpath&amp;quot;)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Tips for local dependencies ===&lt;br /&gt;
&lt;br /&gt;
It can make sense to build dependencies with MKL for BLAS/LAPACK in the same way as it is used for R. Specifically, the MKL libraries and options that were used to install R are the following: &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-Wl,--no-as-needed -lmkl_gf_lp64 -Wl,--start-group -lmkl_gnu_thread -lmkl_core -Wl,--end-group -fopenmp -ldl -lpthread -lm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=Development/R&amp;diff=15068</id>
		<title>Development/R</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=Development/R&amp;diff=15068"/>
		<updated>2025-07-10T12:36:23Z</updated>

		<summary type="html">&lt;p&gt;S Braun: /* Installing R-Packages into your home folder */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Description =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;R&#039;&#039;&#039; is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&amp;amp;T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.&lt;br /&gt;
&lt;br /&gt;
R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.&lt;br /&gt;
&lt;br /&gt;
One of R’s strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.&lt;br /&gt;
&lt;br /&gt;
R is available as Free Software under the terms of the Free Software Foundation’s GNU General Public License in source code form. It compiles and runs on a wide variety of UNIX platforms and similar systems (including FreeBSD and Linux), Windows and MacOS.&lt;br /&gt;
&lt;br /&gt;
The R installation also provides the standalone library &#039;&#039;&#039;libRmath&#039;&#039;&#039;. This library allows you to access R routines from your own C or C++ programs (see section 9 of the &#039;R Installation and Administration&#039; manual).&lt;br /&gt;
&lt;br /&gt;
= Package management =&lt;br /&gt;
&lt;br /&gt;
== Installing R-Packages into your home folder ==&lt;br /&gt;
Since we cannot provide a software module for every R package, we recommend installing specific R packages locally in your home folder. Depending on the package, some dependencies required for the installation may only be available on the login nodes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;gt; library()                                                                # List pre-installed packages&lt;br /&gt;
&amp;gt; install.packages(&#039;package_name&#039;, repos=&amp;quot;http://cran.r-project.org&amp;quot;)      # Install your R package and the dependencies &lt;br /&gt;
&amp;gt; library(package_name)                                                    # Load the package into your R instance&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The package is now installed permanently in your home folder and is available every time you start R. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
By default R uses a version (and platform) specific path for personal libraries, such as &lt;br /&gt;
&amp;quot;$HOME/R/x86_64-pc-linux-gnu-library/x.y&amp;quot; for R version x.y.z. This directory will be created automatically (after confirmation) when installing a personal package for the first time.&lt;br /&gt;
&lt;br /&gt;
Users can customize a common location of their personal library packages, e.g. ~/R_libs, rather than the default location. A customized directory must exist before installing a personal package for the first time, i.e. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mkdir -p ~/R_libs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In order to set the correct path when using R, the location must also be defined in a configuration file &#039;&#039;&#039;~/.Renviron&#039;&#039;&#039; in the home directory containing the following line:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
R_LIBS_USER=&amp;quot;~/R_libs&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By setting up a (fixed) custom location for personal library packages, any personal package installed into that directory will be visible across different R versions. This may be advantageous if the packages are to be used with different (future) R versions. &lt;br /&gt;
&lt;br /&gt;
A version specific path, such as the default path, allows users to maintain multiple personal library stacks for different (major and minor) R versions and does also prevent users from mixing their stack with libraries built with different R versions. &lt;br /&gt;
&lt;br /&gt;
The drawback is that, whenever switching to a new R release, the personal library stack &#039;&#039;&#039;must&#039;&#039;&#039; be rebuilt with that new R version into the corresponding (version specific) library path. This is considered good practice anyway in order to ensure a consistent personal library stack for any specific R version in use. Mixing libraries built with different major and minor R versions is discouraged, as this may result in unpredictable and subtle errors. Packages that are built and installed with one version of R may be incompatible with a newer version of R, at least when the major or minor version changes. The same is true if several versions are used simultaneously, e.g. a newer R version for a more recently started project and an older version for another project (but possibly picking up libraries built with the newer R version).&lt;br /&gt;
&lt;br /&gt;
Because the default version may change over time, it is highly recommended to always load a specific version, e.g.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load math/R/4.3.3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Pre-installed R-packages ==&lt;br /&gt;
* Rmpi&lt;br /&gt;
* iterators&lt;br /&gt;
* foreach&lt;br /&gt;
* doMPI&lt;br /&gt;
* doParallel&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Custom installation options ==&lt;br /&gt;
=== Makevars ===&lt;br /&gt;
You can add custom build and installation flags for new R-packages through the &#039;&#039;~/.R/Makevars&#039;&#039; file. This will take the form of a basic Makefile and will be added to the existing options set by the module. While the module&#039;s defaults should be appropriate for installing most packages, it can be helpful or necessary to include additional options, for example to make locally installed dependencies visible to g++:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
MAKEFLAGS=-j4 &lt;br /&gt;
CXXFLAGS=-I/path/to/dependency/include -L/path/to/dependency/lib -llib&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is also possible to set more targeted instruction sets, e.g. for AMD processors:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CXXFLAGS=-march=znver3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The defaults can be found under &#039;&#039;$R_HOME_DIR/lib64/R/etc/Makeconf&#039;&#039;. It is usually not advisable to change the compilers (gcc, g++) themselves.&lt;br /&gt;
&lt;br /&gt;
=== Configure arguments ===&lt;br /&gt;
You can set further &#039;&#039;configure.args&#039;&#039; and &#039;&#039;configure.vars&#039;&#039; as character vectors to &#039;&#039;install.packages()&#039;&#039;, e.g.:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
install.packages(&amp;quot;Rjags&amp;quot;, configure.args = &amp;quot;--enable-rpath&amp;quot;)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Tips for local dependencies ===&lt;br /&gt;
&lt;br /&gt;
It can make sense to build dependencies with MKL for BLAS/LAPACK in the same way as it is used for R. Specifically, the MKL libraries and options that were used to install R are the following: &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-Wl,--no-as-needed -lmkl_gf_lp64 -Wl,--start-group -lmkl_gnu_thread -lmkl_core -Wl,--end-group -fopenmp -ldl -lpthread -lm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=Development/R&amp;diff=15067</id>
		<title>Development/R</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=Development/R&amp;diff=15067"/>
		<updated>2025-07-10T12:35:42Z</updated>

		<summary type="html">&lt;p&gt;S Braun: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Description =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;R&#039;&#039;&#039; is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&amp;amp;T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.&lt;br /&gt;
&lt;br /&gt;
R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.&lt;br /&gt;
&lt;br /&gt;
One of R’s strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.&lt;br /&gt;
&lt;br /&gt;
R is available as Free Software under the terms of the Free Software Foundation’s GNU General Public License in source code form. It compiles and runs on a wide variety of UNIX platforms and similar systems (including FreeBSD and Linux), Windows and MacOS.&lt;br /&gt;
&lt;br /&gt;
= Usage =&lt;br /&gt;
The R installation also provides the standalone library libRmath. This library allows you to access R routines from your own C or C++ programs (see section 9 of the &#039;R Installation and Administration&#039; manual).&lt;br /&gt;
&lt;br /&gt;
== Installing R-Packages into your home folder ==&lt;br /&gt;
Since we cannot provide a software module for every R package, we recommend installing specific R packages locally in your home folder. One way to do this is from within an interactive R session: &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;gt; library()                                                                # List preinstalled packages&lt;br /&gt;
&amp;gt; install.packages(&#039;package_name&#039;, repos=&amp;quot;http://cran.r-project.org&amp;quot;)      # Installing your R package and the dependencies &lt;br /&gt;
&amp;gt; library(package_name)                                                    # Loading the package into your R instance&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The package is now installed permanently in your home folder and is available every time you start R. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
By default R uses a version (and platform) specific path for personal libraries, such as &lt;br /&gt;
&amp;quot;$HOME/R/x86_64-pc-linux-gnu-library/x.y&amp;quot; for R version x.y.z. This directory will be created automatically (after confirmation) when installing a personal package for the first time.&lt;br /&gt;
&lt;br /&gt;
Users can customize a common location of their personal library packages, e.g. ~/R_libs, rather than the default location. A customized directory must exist before installing a personal package for the first time, i.e. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mkdir -p ~/R_libs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The location must also be defined in a configuration file ~/.Renviron within the home directory containing the following line:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
R_LIBS_USER=&amp;quot;~/R_libs&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By setting up a (fixed) custom location for personal library packages, any personal package installed into that directory will be visible across different R versions. This may be advantageous if the packages are to be used with different (future) R versions. &lt;br /&gt;
&lt;br /&gt;
A version specific path, such as the default path, allows users to maintain multiple personal library stacks for different (major and minor) R versions and does also prevent users from mixing their stack with libraries built with different R versions. &lt;br /&gt;
&lt;br /&gt;
The drawback is that, whenever switching to a new R release, the personal library stack &#039;&#039;&#039;must&#039;&#039;&#039; be rebuilt with that new R version into the corresponding (version specific) library path. This is considered good practice anyway in order to ensure a consistent personal library stack for any specific R version in use. Mixing libraries built with different major and minor R versions is discouraged, as this may result in unpredictable and subtle errors. Packages that are built and installed with one version of R may be incompatible with a newer version of R, at least when the major or minor version changes. The same is true if several versions are used simultaneously, e.g. a newer R version for a more recently started project and an older version for another project (but possibly picking up libraries built with the newer R version).&lt;br /&gt;
&lt;br /&gt;
Special care must also be taken by users who always load the default version, i.e. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load math/R&lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
&lt;br /&gt;
as the default version number may change any time. It is therefore highly recommended to always load a specific version, e.g.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$  module load math/R/4.4.1-mkl-2022.2.1-gnu-13.3 &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=Development/R&amp;diff=15066</id>
		<title>Development/R</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=Development/R&amp;diff=15066"/>
		<updated>2025-07-10T12:33:59Z</updated>

		<summary type="html">&lt;p&gt;S Braun: /* Installing R-Packages into your home folder */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Description =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;R&#039;&#039;&#039; is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&amp;amp;T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.&lt;br /&gt;
&lt;br /&gt;
R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.&lt;br /&gt;
&lt;br /&gt;
One of R’s strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.&lt;br /&gt;
&lt;br /&gt;
R is available as Free Software under the terms of the Free Software Foundation’s GNU General Public License in source code form. It compiles and runs on a wide variety of UNIX platforms and similar systems (including FreeBSD and Linux), Windows and MacOS.&lt;br /&gt;
&lt;br /&gt;
The R installation also provides the standalone library &#039;&#039;&#039;libRmath&#039;&#039;&#039;. This library allows you to access R routines from your own C or C++ programs (see section 9 of the &#039;R Installation and Administration&#039; manual).&lt;br /&gt;
&lt;br /&gt;
= Package management =&lt;br /&gt;
&lt;br /&gt;
== Installing R-Packages into your home folder ==&lt;br /&gt;
Since we cannot provide a software module for every R package, we recommend installing specific R packages locally in your home folder. Depending on the package, some dependencies required for the installation may only be available on the login nodes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;gt; library()                                                                # List pre-installed packages&lt;br /&gt;
&amp;gt; install.packages(&#039;package_name&#039;, repos=&amp;quot;http://cran.r-project.org&amp;quot;)      # Install your R package and the dependencies &lt;br /&gt;
&amp;gt; library(package_name)                                                    # Load the package into your R instance&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The package is now installed permanently in your home folder and is available every time you start R. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
By default R uses a version (and platform) specific path for personal libraries, such as &lt;br /&gt;
&amp;quot;$HOME/R/x86_64-pc-linux-gnu-library/x.y&amp;quot; for R version x.y.z. This directory will be created automatically (after confirmation) when installing a personal package for the first time.&lt;br /&gt;
&lt;br /&gt;
Users can customize a common location of their personal library packages, e.g. ~/R_libs, rather than the default location. A customized directory must exist before installing a personal package for the first time, i.e. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mkdir -p ~/R_libs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In order to set the correct path when using R, the location must also be defined in a configuration file &#039;&#039;&#039;~/.Renviron&#039;&#039;&#039; in the home directory containing the following line:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
R_LIBS_USER=&amp;quot;~/R_libs&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By setting up a (fixed) custom location for personal library packages, any personal package installed into that directory will be visible across different R versions. This may be advantageous if the packages are to be used with different (future) R versions. &lt;br /&gt;
&lt;br /&gt;
A version specific path, such as the default path, allows users to maintain multiple personal library stacks for different (major and minor) R versions and does also prevent users from mixing their stack with libraries built with different R versions. &lt;br /&gt;
&lt;br /&gt;
The drawback is that, whenever switching to a new R release, the personal library stack &#039;&#039;&#039;must&#039;&#039;&#039; be rebuilt with that new R version into the corresponding (version specific) library path. This is considered good practice anyway in order to ensure a consistent personal library stack for any specific R version in use. Mixing libraries built with different major and minor R versions is discouraged, as this may result in unpredictable and subtle errors. Packages that are built and installed with one version of R may be incompatible with a newer version of R, at least when the major or minor version changes. The same is true if several versions are used simultaneously, e.g. a newer R version for a more recently started project and an older version for another project (but possibly picking up libraries built with the newer R version).&lt;br /&gt;
&lt;br /&gt;
Because the default version may change over time, it is highly recommended to always load a specific version, e.g.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load math/R/4.4.2-openblas&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Pre-installed R-packages ==&lt;br /&gt;
* Rmpi&lt;br /&gt;
* iterators&lt;br /&gt;
* foreach&lt;br /&gt;
* doMPI&lt;br /&gt;
* doParallel&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Custom installation options ==&lt;br /&gt;
=== Makevars ===&lt;br /&gt;
You can add custom build and installation flags for new R-packages through the &#039;&#039;~/.R/Makevars&#039;&#039; file. This will take the form of a basic Makefile and will be added to the existing options set by the module. While the module&#039;s defaults should be appropriate for installing most packages, it can be helpful or necessary to include additional options, for example to make locally installed dependencies visible to g++:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
MAKEFLAGS=-j4 &lt;br /&gt;
CXXFLAGS=-I/path/to/dependency/include -L/path/to/dependency/lib -llib&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is also possible to set more targeted instruction sets, e.g. for AMD processors:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CXXFLAGS=-march=znver3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The defaults can be found under &#039;&#039;$R_HOME_DIR/lib64/R/etc/Makeconf&#039;&#039;. It is usually not advisable to change the compilers (gcc, g++) themselves.&lt;br /&gt;
&lt;br /&gt;
=== Configure arguments ===&lt;br /&gt;
You can set further &#039;&#039;configure.args&#039;&#039; and &#039;&#039;configure.vars&#039;&#039; as character vectors to &#039;&#039;install.packages()&#039;&#039;, e.g.:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
install.packages(&amp;quot;Rjags&amp;quot;, configure.args = &amp;quot;--enable-rpath&amp;quot;)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Tips for local dependencies ===&lt;br /&gt;
&lt;br /&gt;
It can make sense to build dependencies with MKL for BLAS/LAPACK in the same way as it is used for R. Specifically, the MKL libraries and options that were used to install R are the following: &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-Wl,--no-as-needed -lmkl_gf_lp64 -Wl,--start-group -lmkl_gnu_thread -lmkl_core -Wl,--end-group -fopenmp -ldl -lpthread -lm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=Development/R&amp;diff=15065</id>
		<title>Development/R</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=Development/R&amp;diff=15065"/>
		<updated>2025-07-10T12:32:57Z</updated>

		<summary type="html">&lt;p&gt;S Braun: Created page with &amp;quot;= Description =  &amp;#039;&amp;#039;&amp;#039;R&amp;#039;&amp;#039;&amp;#039; is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&amp;amp;T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.  R provides a wide variety of statistical (linea...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Description =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;R&#039;&#039;&#039; is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&amp;amp;T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.&lt;br /&gt;
&lt;br /&gt;
R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.&lt;br /&gt;
&lt;br /&gt;
One of R’s strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.&lt;br /&gt;
&lt;br /&gt;
R is available as Free Software under the terms of the Free Software Foundation’s GNU General Public License in source code form. It compiles and runs on a wide variety of UNIX platforms and similar systems (including FreeBSD and Linux), Windows and MacOS.&lt;br /&gt;
&lt;br /&gt;
The R installation also provides the standalone library &#039;&#039;&#039;libRmath&#039;&#039;&#039;. This library allows you to access R routines from your own C or C++ programs (see section 9 of the &#039;R Installation and Administration&#039; manual).&lt;br /&gt;
&lt;br /&gt;
= Package management =&lt;br /&gt;
&lt;br /&gt;
== Installing R-Packages into your home folder ==&lt;br /&gt;
Since we cannot provide a software module for every R package, we recommend installing specific R packages locally in your home folder. Depending on the package, some dependencies required for the installation may only be available on the login nodes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;gt; library()                                                                # List pre-installed packages&lt;br /&gt;
&amp;gt; install.packages(&#039;package_name&#039;, repos=&amp;quot;http://cran.r-project.org&amp;quot;)      # Install your R package and the dependencies &lt;br /&gt;
&amp;gt; library(package_name)                                                    # Load the package into your R instance&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The package is now installed permanently in your home folder and is available every time you start R. &lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
By default R uses a version (and platform) specific path for personal libraries, such as &lt;br /&gt;
&amp;quot;$HOME/R/x86_64-pc-linux-gnu-library/x.y&amp;quot; for R version x.y.z. This directory will be created automatically (after confirmation) when installing a personal package for the first time.&lt;br /&gt;
&lt;br /&gt;
Users can customize a common location of their personal library packages, e.g. ~/R_libs, rather than the default location. A customized directory must exist before installing a personal package for the first time, i.e. &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ mkdir -p ~/R_libs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In order to set the correct path when using R, the location must also be defined in a configuration file &#039;&#039;&#039;~/.Renviron&#039;&#039;&#039; in the home directory containing the following line:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
R_LIBS_USER=&amp;quot;~/R_libs&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
By setting up a (fixed) custom location for personal library packages, any personal package installed into that directory will be visible across different R versions. This may be advantageous if the packages are to be used with different (future) R versions. &lt;br /&gt;
&lt;br /&gt;
A version specific path, such as the default path, allows users to maintain multiple personal library stacks for different (major and minor) R versions and does also prevent users from mixing their stack with libraries built with different R versions. &lt;br /&gt;
&lt;br /&gt;
The drawback is that, whenever switching to a new R release, the personal library stack &#039;&#039;&#039;must&#039;&#039;&#039; be rebuilt with that new R version into the corresponding (version specific) library path. This is considered good practice anyway in order to ensure a consistent personal library stack for any specific R version in use. Mixing libraries built with different major and minor R versions is discouraged, as this may result in unpredictable and subtle errors. Packages that are built and installed with one version of R may be incompatible with a newer version of R, at least when the major or minor version changes. The same is true if several versions are used simultaneously, e.g. a newer R version for a more recently started project and an older version for another project (but possibly picking up libraries built with the newer R version).&lt;br /&gt;
&lt;br /&gt;
Because the default version may change over time, it is highly recommended to always load a specific version, e.g.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module load math/R/4.3.3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Pre-installed R-packages ==&lt;br /&gt;
* Rmpi&lt;br /&gt;
* iterators&lt;br /&gt;
* foreach&lt;br /&gt;
* doMPI&lt;br /&gt;
* doParallel&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Custom installation options ==&lt;br /&gt;
=== Makevars ===&lt;br /&gt;
You can add custom build and installation flags for new R-packages through the &#039;&#039;~/.R/Makevars&#039;&#039; file. This will take the form of a basic Makefile and will be added to the existing options set by the module. While the module&#039;s defaults should be appropriate for installing most packages, it can be helpful or necessary to include additional options, for example to make locally installed dependencies visible to g++:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
MAKEFLAGS=-j4 &lt;br /&gt;
CXXFLAGS=-I/path/to/dependency/include -L/path/to/dependency/lib -llib&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is also possible to set more targeted instruction sets, e.g. for AMD processors:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CXXFLAGS=-march=znver3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The defaults can be found under &#039;&#039;$R_HOME_DIR/lib64/R/etc/Makeconf&#039;&#039;. It is usually not advisable to change the compilers (gcc, g++) themselves.&lt;br /&gt;
&lt;br /&gt;
=== Configure arguments ===&lt;br /&gt;
You can set further &#039;&#039;configure.args&#039;&#039; and &#039;&#039;configure.vars&#039;&#039; as character vectors to &#039;&#039;install.packages()&#039;&#039;, e.g.:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
install.packages(&amp;quot;Rjags&amp;quot;, configure.args = &amp;quot;--enable-rpath&amp;quot;)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Tips for local dependencies ===&lt;br /&gt;
&lt;br /&gt;
It can make sense to build dependencies with MKL for BLAS/LAPACK in the same way as it is used for R. Specifically, the MKL libraries and options that were used to install R are the following: &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-Wl,--no-as-needed -lmkl_gf_lp64 -Wl,--start-group -lmkl_gnu_thread -lmkl_core -Wl,--end-group -fopenmp -ldl -lpthread -lm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=Development&amp;diff=15064</id>
		<title>Development</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=Development&amp;diff=15064"/>
		<updated>2025-07-10T12:31:54Z</updated>

		<summary type="html">&lt;p&gt;S Braun: /* Scripting Languages */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Compiling Software ==&lt;br /&gt;
&lt;br /&gt;
Overview of [[Development/General compiler usage|general compiler usage]]&lt;br /&gt;
&lt;br /&gt;
== Parallel Programming ==&lt;br /&gt;
Overview on [[Development/Parallel_Programming | parallel programming with OpenMP and MPI]].&lt;br /&gt;
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
Compiler, libraries and development tools are provided as environment modules.&lt;br /&gt;
&lt;br /&gt;
Required reading to use: [[Environment Modules]]&lt;br /&gt;
&lt;br /&gt;
== Available Development Software ==&lt;br /&gt;
Visit [https://www.bwhpc.de/software.php https://www.bwhpc.de/software.php], select your cluster and&lt;br /&gt;
* For compiler select &amp;lt;code&amp;gt;Category → compiler&amp;lt;/code&amp;gt;&lt;br /&gt;
* For MPI select &amp;lt;code&amp;gt;Category → mpi&amp;lt;/code&amp;gt;&lt;br /&gt;
* For libraries select &amp;lt;code&amp;gt;Category → lib&amp;lt;/code&amp;gt;&lt;br /&gt;
* For numerical libraries select &amp;lt;code&amp;gt;Category → numlib&amp;lt;/code&amp;gt;&lt;br /&gt;
* For further development tools select &amp;lt;code&amp;gt;Category → devel&amp;lt;/code&amp;gt; &lt;br /&gt;
&lt;br /&gt;
On a cluster use: &amp;lt;code&amp;gt;module avail &amp;lt;Category&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
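For example, to list all available compiler modules (a sketch; the other category names from the list above work the same way):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ module avail compiler&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;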
== Documentation ==&lt;br /&gt;
Available documentation for environment modules: &lt;br /&gt;
* &amp;lt;code&amp;gt;module help&amp;lt;/code&amp;gt;&lt;br /&gt;
* examples in &amp;lt;code&amp;gt;$SOFTNAME_EXA_DIR&amp;lt;/code&amp;gt;&lt;br /&gt;
* additional documentation in this wiki&lt;br /&gt;
&lt;br /&gt;
== Documentation in the Wiki ==&lt;br /&gt;
Environment modules and tools for software development and parallel programming with additional documentation here in the wiki:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Integrated Development Environments ===&lt;br /&gt;
* [[Development/VS_Code|Visual Studio Code]]&lt;br /&gt;
&lt;br /&gt;
=== Compiler and Debugger ===&lt;br /&gt;
* [[Development/GCC|GCC]]&lt;br /&gt;
* [[Development/GDB|GDB]]&lt;br /&gt;
* [[Development/Intel_Compiler|Intel Compiler]]&lt;br /&gt;
&lt;br /&gt;
=== Development Tools ===&lt;br /&gt;
* [[Development/Score-P|Score-P]]:&amp;lt;br /&amp;gt;Tracing of OpenMP-, MPI- and GPU-parallel applications for Vampir and other performance analysis tools.&lt;br /&gt;
* [[Development/Vampir_and_VampirServer|Vampir and VampirServer]]:&amp;lt;br /&amp;gt;Highly scalable Performance Analysis of OpenMP-, MPI- and GPU-parallel applications.&lt;br /&gt;
* [[Development/Pahole|Pahole]]:&amp;lt;br /&amp;gt;Analyse data structures for cache-line alignment and (un)necessary holes that increase data structure size&lt;br /&gt;
* [[Development/Valgrind|Valgrind]]:&amp;lt;br /&amp;gt;Very valuable framework with multiple tools, e.g. to detect memory access errors&lt;br /&gt;
* Forge:&amp;lt;br /&amp;gt;Tools for debugging (arm DDT) and performance analysis (arm MAP)&lt;br /&gt;
&lt;br /&gt;
=== Libraries and Numerical Libraries ===&lt;br /&gt;
* [[Development/GSL|GSL]]&lt;br /&gt;
* [[Development/FFTW|FFTW]]&lt;br /&gt;
* [[Development/MKL|MKL]]&lt;br /&gt;
=== Scripting Languages ===&lt;br /&gt;
* [[Development/Julia|Julia]]&lt;br /&gt;
* [[Development/Python|Python]]&lt;br /&gt;
* [[Development/R|R]]&lt;br /&gt;
&lt;br /&gt;
=== Development Environments ===&lt;br /&gt;
* [[Development/Conda|Conda]]&lt;br /&gt;
* [[Development/Containers|Containers]]&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=File:Uc3.svg&amp;diff=15014</id>
		<title>File:Uc3.svg</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=File:Uc3.svg&amp;diff=15014"/>
		<updated>2025-06-26T08:14:54Z</updated>

		<summary type="html">&lt;p&gt;S Braun: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=User:S_Braun&amp;diff=15013</id>
		<title>User:S Braun</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=User:S_Braun&amp;diff=15013"/>
		<updated>2025-06-26T08:08:35Z</updated>

		<summary type="html">&lt;p&gt;S Braun: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[File:uc3.svg|Optionen|Überschrift]]&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
	<entry>
		<id>https://wiki.bwhpc.de/wiki/index.php?title=User:S_Braun&amp;diff=15012</id>
		<title>User:S Braun</title>
		<link rel="alternate" type="text/html" href="https://wiki.bwhpc.de/wiki/index.php?title=User:S_Braun&amp;diff=15012"/>
		<updated>2025-06-26T08:04:47Z</updated>

		<summary type="html">&lt;p&gt;S Braun: Created page with &amp;quot;Überschrift&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[File:uc3.drawio.svg|Optionen|Überschrift]]&lt;/div&gt;</summary>
		<author><name>S Braun</name></author>
	</entry>
</feed>