Running Calculations: Difference between revisions
Edited by K Siegmund (no edit summary)
Revision as of 10:39, 4 July 2023
Description
On your desktop computer, calculations start immediately when you launch them and run until they are finished; afterwards your desktop mostly sits idle until you start another calculation. A compute cluster has several hundred, maybe a thousand, computers (compute nodes). All of them are busy most of the time, and many people want to run a great number of calculations. Running your job therefore includes some extra steps:
- Prepare a script (usually a shell script) containing all the commands necessary to run your calculation from start to finish. In addition to these commands, the batch script has a header section in which you specify details such as required compute cores, estimated runtime, memory requirements, and disk space needed.
- Submit the script to the batch system, which places your job (calculation) in a queue.
- Wait: your job is queued and waits in line with other compute jobs until the resources you requested in the header become available.
- Execution: Once your job reaches the front of the queue, your script is executed on a compute node. Your calculation runs on that node until it is finished or reaches the specified time limit.
- Save results: At the end of your script, include commands to save the calculation results back to your home directory.
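The steps above can be sketched as a minimal batch script. This example assumes a Slurm system; the job name, resource values, the `RESULT_DIR` location, and the stand-in "calculation" are all hypothetical placeholders, and the options that are actually required or allowed differ between clusters, so check the documentation for your cluster.

```shell
#!/bin/bash
# Header section: Slurm reads these "#SBATCH" comment lines at submission.
# All values are placeholders, not recommendations for any specific cluster.
#SBATCH --job-name=example_calc
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
#SBATCH --mem=1G
#SBATCH --output=example_calc_%j.out

set -euo pipefail

# Hypothetical locations: a temporary working directory for the run and
# a results directory below your home directory.
WORK_DIR="$(mktemp -d)"
RESULT_DIR="${RESULT_DIR:-${HOME}/example_calc_results}"

cd "$WORK_DIR"

# Stand-in for the real calculation: sum the integers from 1 to 100.
seq 1 100 | awk '{ s += $1 } END { print s }' > result.txt

# Save results: copy the output back to the home directory.
mkdir -p "$RESULT_DIR"
cp result.txt "$RESULT_DIR/"
```

On a Slurm system, a script like this is typically submitted with `sbatch <scriptname>`, and `squeue -u $USER` lists your waiting and running jobs. Outside the batch system, the `#SBATCH` lines are ordinary comments, so the script can also be tried out directly with `bash`.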
There are two types of batch systems currently used on bwHPC clusters, called "Moab" (legacy installs) and "Slurm".
Link to Batch System per Cluster
Because of differences in configuration (partly due to different available hardware), each cluster has its own batch system documentation:
- Slurm systems
- Moab systems (legacy systems with deprecated queuing system)
When you run your calculations, you have to decide how many compute cores your calculation should use simultaneously. To run on several cores, your computational problem has to be divided into pieces, which always causes some overhead.
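To make this overhead visible, one common approach is to run the same calculation on different core counts and compare the runtimes. The sketch below uses the standard definitions speedup = T(1) / T(n) and efficiency = speedup / n. The timings are invented purely for illustration, and the function name `scaling_table` is a hypothetical helper, not part of any cluster tooling.

```shell
#!/bin/bash
# Hypothetical wall-clock times, one "cores seconds" pair per line.
# These numbers are invented for illustration, not measured on any cluster.
timings="1 1000
2 520
4 280
8 170"

# Print "cores speedup efficiency" for each input line, where
#   speedup    = T(1) / T(n)
#   efficiency = speedup / n
# The first input line must be the single-core run.
scaling_table() {
  awk 'NR == 1 { t1 = $2 }
       { printf "%d %.2f %.2f\n", $1, t1 / $2, t1 / $2 / $1 }'
}

table="$(printf '%s\n' "$timings" | scaling_table)"
printf 'cores speedup efficiency\n%s\n' "$table"
```

An efficiency that drops well below 1 as cores are added indicates that the overhead of splitting the problem is starting to dominate, and that requesting more cores mostly wastes resources.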
Scaling
How to find a reasonable number of compute cores for your calculation is described on the page:
- Scaling
Energy Efficiency
Please also see our advice for
