Running Calculations: Difference between revisions

From bwHPC Wiki
(35 intermediate revisions by 3 users not shown)
Line 1: Line 1:
== Description ==
[[File:running_calculations_on_cluster.svg|thumb|upright=0.4]]
On your desktop computer, calculations start immediately when you launch them and run until they are finished; afterwards the desktop mostly sits idle until you start the next calculation. A compute cluster consists of several hundred, maybe a thousand, computers (compute nodes). All of them are busy most of the time, and many people want to run a great number of calculations. Running your job therefore includes some extra steps:


# Prepare a script (a set of commands to run, usually a shell script) with all the commands that are necessary to run your calculation from start to finish. In addition to these commands, this ''[[batch script]]'' has a header section in which you specify details such as the required compute cores (processing units within a computer), estimated runtime, memory requirements, disk space needed, etc.
# ''Submit'' the script to a queue, where it becomes your ''job'' (calculation).
# Queueing: Your job waits in line with other compute jobs until the resources you requested in the header become available.
# Execution: Once your job reaches the front of the queue, your script is executed on a compute node. Your calculation runs on that node until it is finished or reaches the specified time limit.
# Save results: At the end of your script, include commands to save the calculation results back to your home directory.


Two [[batch system]]s are currently in use on bwHPC clusters: "Moab" (legacy installations) and "Slurm".

== Example Jobs ==

You are in luck if the software you need is installed on your bwHPC cluster as a software module (list the installed software with "module avail" on the cluster, or see https://www.bwhpc.de/software.html).

For most software that a bwHPC project installed on the cluster, we have prepared an example job script running some example calculation with that exact software.

How to access these examples is described in the "Software job examples" section of the [[Environment_Modules]] page. In short: after loading the module, the examples are available in the directory given by the variable $SOFTWARENAME_EXA_DIR (e.g. $LAMMPS_EXA_DIR for the module chem/lammps).

== Link to Batch System per Cluster ==


Because of differences in configuration (partly due to different available hardware), each cluster has its own batch system documentation:
Line 17: Line 28:
** [[Helix/Slurm | Slurm Helix]]
* Moab systems (legacy systems with deprecated queuing system)
** [[NEMO/Moab|Moab NEMO]]
** [[BinAC/Moab|Moab BinAC]]

{|style="background:#deffee; width:100%;"
|style="padding:5px; background:#cef2e0; text-align:left"|
[[Image:Attention.svg|center|25px]]
|style="padding:5px; background:#cef2e0; text-align:left"|
Scientific software installed on the bwHPC Clusters often comes with simple example jobs (job script and input files). See [[Software Modules]] on how to load examples.
|}

== How to Use Computing Resources Efficiently ==


When you run your calculations, you have to decide how many compute cores your calculation will use simultaneously. For this, your computational problem has to be divided into pieces, which always causes some overhead.

How to find a reasonable number of compute cores for your calculation is described under '''[[Scaling]]'''.

Information on the supported parallel programming paradigms and specific hints on their usage is summarized at '''[[Parallel Programming]]'''.

Running calculations on an HPC node consumes a lot of energy. To make the most of the available resources and keep cluster and energy use as efficient as possible, please also see our advice for '''[[Energy Efficient Cluster Usage]]'''.

Latest revision as of 11:52, 11 September 2024

Description


On your desktop computer, calculations start immediately when you launch them and run until they are finished; afterwards the desktop mostly sits idle until you start the next calculation. A compute cluster consists of several hundred, maybe a thousand, computers (compute nodes). All of them are busy most of the time, and many people want to run a great number of calculations. Running your job therefore includes some extra steps:

  1. Prepare a script (a set of commands to run, usually a shell script) with all the commands that are necessary to run your calculation from start to finish. In addition to these commands, this batch script has a header section in which you specify details such as the required compute cores (processing units within a computer), estimated runtime, memory requirements, disk space needed, etc.
  2. Submit the script to a queue, where it becomes your job (calculation).
  3. Queueing: Your job waits in line with other compute jobs until the resources you requested in the header become available.
  4. Execution: Once your job reaches the front of the queue, your script is executed on a compute node. Your calculation runs on that node until it is finished or reaches the specified time limit.
  5. Save results: At the end of your script, include commands to save the calculation results back to your home directory.
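
As a minimal sketch of steps 1 and 5, a batch script for a Slurm cluster might look like the following. The `#SBATCH` header lines request resources. The job name, the resource values, the `RESULT_DIR` location and the placeholder calculation are assumptions chosen for illustration; the settings that actually apply are described in each cluster's batch system documentation.

```shell
#!/bin/sh
# --- Header section: resource requests read by Slurm (values are examples) ---
# Name shown in the queue
#SBATCH --job-name=example_calc
# Number of compute cores (tasks) requested
#SBATCH --ntasks=1
# Estimated runtime (hh:mm:ss); the job is stopped when this limit is reached
#SBATCH --time=00:10:00
# Memory requirement
#SBATCH --mem=1G

# Stop at the first failing command or unset variable.
set -eu

# Hypothetical locations: a temporary working directory and a results
# directory below the home directory. Adjust both to your cluster.
WORK_DIR="$(mktemp -d)"
RESULT_DIR="${RESULT_DIR:-$HOME/example_calc_results}"
mkdir -p "$RESULT_DIR"
cd "$WORK_DIR"

# Placeholder for the actual calculation: sum the numbers 1..100.
seq 1 100 | awk '{ s += $1 } END { print s }' > result.txt

# Save results: copy the output back before the job ends.
cp result.txt "$RESULT_DIR/"
```

On a Slurm system such a script is typically handed to the queue with `sbatch`, for example `sbatch example_job.sh` (the file name is hypothetical). Because the header lines are ordinary shell comments, the script also runs unchanged on a desktop for a quick test.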

Two batch systems are currently in use on bwHPC clusters: "Moab" (legacy installations) and "Slurm".

Example Jobs

You are in luck if the software you need is installed on your bwHPC cluster as a software module (list the installed software with "module avail" on the cluster, or see https://www.bwhpc.de/software.html).

For most software that a bwHPC project installed on the cluster, we have prepared an example job script running some example calculation with that exact software.

How to access these examples is described in the "Software job examples" section of the Environment_Modules page. In short: after loading the module, the examples are available in the directory given by the variable $SOFTWARENAME_EXA_DIR (e.g. $LAMMPS_EXA_DIR for the module chem/lammps).
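
A possible way to work with these examples is to copy them into a private directory before editing them. The helper `copy_examples` and the target directory `$HOME/lammps_examples` below are hypothetical names introduced for this sketch; only `chem/lammps` and `$LAMMPS_EXA_DIR` come from the description above.

```shell
#!/bin/sh
# copy_examples SRC_DIR TARGET_DIR
# Copies a module's example directory into a private working directory.
copy_examples() {
    src_dir="$1"
    target_dir="$2"
    # The variable is empty if the module has not been loaded yet.
    if [ -z "$src_dir" ] || [ ! -d "$src_dir" ]; then
        echo "example directory not found; is the module loaded?" >&2
        return 1
    fi
    mkdir -p "$target_dir"
    # "src/." copies the directory contents, including hidden files.
    cp -r "$src_dir"/. "$target_dir"/
}

# On the cluster (not executed here):
#   module load chem/lammps
#   copy_examples "$LAMMPS_EXA_DIR" "$HOME/lammps_examples"
```

Working on a copy keeps the installed examples untouched and gives you a writable place for the job output.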

Link to Batch System per Cluster

Because of differences in configuration (partly due to different available hardware), each cluster has its own batch system documentation:


Scientific software installed on the bwHPC Clusters often comes with simple example jobs (job script and input files). See Software Modules on how to load examples.

How to Use Computing Resources Efficiently

When you run your calculations, you have to decide how many compute cores your calculation will use simultaneously. For this, your computational problem has to be divided into pieces, which always causes some overhead.

How to find a reasonable number of compute cores for your calculation is described under Scaling.
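
To illustrate the effect of this overhead, the following sketch computes the speedup (runtime on one core divided by runtime on n cores) and the parallel efficiency (speedup divided by n) from measured runtimes. The function name `parallel_efficiency` and all runtimes are hypothetical and serve only as an illustration; they are not measurements from a bwHPC cluster.

```shell
#!/bin/sh
# parallel_efficiency T1 CORES TN
# T1: runtime on one core, TN: runtime on CORES cores (same unit).
# Prints the speedup (T1/TN) and the parallel efficiency (speedup/CORES).
parallel_efficiency() {
    awk -v t1="$1" -v n="$2" -v tn="$3" 'BEGIN {
        speedup = t1 / tn
        printf "%d cores: speedup %.2f, efficiency %.2f\n", n, speedup, speedup / n
    }'
}

# Hypothetical runtimes in seconds, for illustration only.
parallel_efficiency 1000 4 280
parallel_efficiency 1000 16 80
parallel_efficiency 1000 64 40
```

In this made-up series the efficiency drops as cores are added, which is the typical sign that the overhead of dividing the problem starts to dominate.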

Information on the supported parallel programming paradigms and specific hints on their usage is summarized at Parallel Programming.

Running calculations on an HPC node consumes a lot of energy. To make the most of the available resources and keep cluster and energy use as efficient as possible, please also see our advice for Energy Efficient Cluster Usage.