Running Calculations

On your desktop computer, you start a calculation, it runs until it is finished, and the machine then sits (mostly) idle until you start the next one. A compute cluster, in contrast, has several hundred, maybe a thousand, computers (compute nodes) that many people share for many calculations, so running your job includes some extra steps:
# Prepare a script (usually a shell script) with all the commands that are necessary to run your calculation from start to finish. In addition to these commands, this ''batch script'' has a header section in which you specify details like required compute cores, estimated runtime, memory requirements, disk space needed, etc. (a sketch of such a script follows this list).
# ''Submit'' the script into a queue, where your ''job'' (calculation) waits in line with other compute jobs until the resources you requested in the header become available.
# Execution: Once your job reaches the front of the queue, your script is executed on a compute node. Your calculation runs on that node until it is finished or reaches the specified time limit.
# Save results: At the end of your script, include commands to save the calculation results back to your home directory.
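
What follows is a minimal sketch of such a batch script for a Slurm-managed cluster (the two batch system types in use are described below); the resource values, the program name and the file names are illustrative placeholders, not bwHPC defaults:

<pre>
#!/bin/bash
#SBATCH --job-name=my_calc    # job name shown in the queue
#SBATCH --ntasks=1            # required compute cores
#SBATCH --time=02:00:00       # estimated runtime (hh:mm:ss)
#SBATCH --mem=4gb             # memory requirement

# commands that run the calculation from start to finish
./my_program input.dat > output.log

# save the results back to the home directory
cp output.log "$HOME"/
</pre>

Assuming the script is saved as job.sh, submitting it (step 2) and checking its place in the queue would look like this:

<pre>
sbatch job.sh        # place the job in the queue; prints the job ID
squeue -u $USER      # list your waiting and running jobs
</pre>
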
There are two types of batch systems currently in use on bwHPC clusters: Moab (legacy) and Slurm.
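
From the user's perspective, the two mainly differ in their commands and header syntax. As a rough sketch (the cluster-specific pages below have the exact options), the same kind of script would typically be submitted like this:

<pre>
sbatch job.sh    # Slurm cluster; header lines start with #SBATCH
msub job.sh      # Moab cluster; header lines typically start with #MSUB
</pre>
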
Because of differences in configuration (partly due to different available hardware), each cluster has its own batch system documentation:
* Slurm systems
** [[bwUniCluster_2.0_Slurm_common_Features|Slurm bwUniCluster 2.0]]
** [[JUSTUS2/Slurm | Slurm JUSTUS 2]]
** [[Helix/Slurm | Slurm Helix]]
* Moab systems
** [[Batch Jobs Moab]] (general description, valid for both Moab systems)
** [[bwForCluster NEMO Specific Batch Features|Moab NEMO specific information]]
** [[BinAC/Specific_Batch_Features|Moab BinAC specific information]]
