Running Calculations

From bwHPC Wiki

Revision as of 15:32, 12 September 2023

Description

[Image: Running calculations on cluster.svg]

On your desktop computer, a calculation starts as soon as you launch it, runs until it is finished, and the machine then sits mostly idle until you start the next one. A compute cluster, in contrast, has several hundred, maybe a thousand computers (compute nodes), almost all of them busy most of the time, and many people want to run a great number of calculations. Running your job therefore involves some extra steps:

  1. Preparation: Write a script (usually a shell script) with all the commands that are necessary to run your calculation from start to finish. In addition to those commands, this batch script has a header section in which you specify details such as required compute cores, estimated runtime, memory requirements, disk space needed, etc.
  2. Submission: Submit the script to the batch system, which places your job (calculation) into a queue.
  3. Queueing: Your job waits in line with other compute jobs until the resources you requested in the header become available.
  4. Execution: Once your job reaches the front of the queue, your script is executed on a compute node. Your calculation runs on that node until it is finished or reaches the specified time limit.
  5. Save results: At the end of your script, include commands to save the calculation results back to your home directory.
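A minimal batch script following the steps above might look like the sketch below (Slurm syntax; job name, program, input file, and result path are hypothetical placeholders, and the exact header options and defaults differ per cluster):

```shell
#!/bin/bash
#SBATCH --job-name=my_calc       # hypothetical job name
#SBATCH --ntasks=1               # number of tasks (processes)
#SBATCH --cpus-per-task=4        # compute cores for the job
#SBATCH --time=02:00:00          # estimated runtime (hh:mm:ss)
#SBATCH --mem=8gb                # memory requirement

# Commands that run the calculation from start to finish:
./my_program input.dat > output.log   # hypothetical program and input

# Save results back to your home directory:
cp output.log "$HOME/results/"
```

Consult your cluster's own batch system documentation for the header options and limits that actually apply there.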

Two batch systems are currently in use on bwHPC clusters: "Moab" (on legacy installations) and "Slurm".
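The everyday commands differ between the two systems. Assuming a job script named job.sh, submission and status checks look roughly as follows (these commands only work on a login node of a cluster running the respective batch system):

```shell
# Slurm
sbatch job.sh       # submit the job script
squeue -u "$USER"   # show your queued and running jobs

# Moab (legacy installations)
msub job.sh         # submit the job script
showq -u "$USER"    # show your queued and running jobs
```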

Link to Batch System per Cluster

Because of differences in configuration (partly due to different available hardware), each cluster has its own batch system documentation:

  * Moab BinAC specific information (BinAC/Moab)

How to Use Computing Resources Efficiently

When you run your calculations, you have to decide how many compute cores should work on them in parallel. For this, your computational problem has to be divided into pieces, which always causes some overhead.


Guidance on finding a reasonable number of compute cores for your calculation can be found under Scaling.


Information regarding the supported parallel programming paradigms, along with specific hints on their usage, is summarized at Parallel_Programming.


Running calculations on an HPC node consumes a lot of energy. To make the most of the available resources and keep cluster and energy use as efficient as possible, please also see our advice on Energy Efficient Cluster Usage.

Scaling

When you run your calculations, you have to decide how many compute cores should work on them in parallel. For this, your computational problem has to be divided into pieces, which always causes some overhead.

How to find a reasonable number of compute cores for your calculation is described on the page Scaling.
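The overhead of dividing work means that doubling the cores rarely halves the runtime. A rough illustration of this is Amdahl's law (a standard scaling model, not anything cluster-specific): if a fraction p of the work parallelizes and the rest stays serial, the speedup on n cores is 1 / ((1 - p) + p/n). A small shell/awk sketch, assuming 95% of the work parallelizes:

```shell
p=0.95   # assumed parallelizable fraction of the work
for n in 1 4 16 64; do
  # speedup on n cores: 1 / ((1 - p) + p/n)
  awk -v p="$p" -v n="$n" \
    'BEGIN { printf "%2d cores: speedup %.1f\n", n, 1 / ((1 - p) + p / n) }'
done
```

Even with 64 cores the speedup stays around 15x here: past some point, extra cores mostly wait on the serial part, which is why running a scaling test before large production jobs is worthwhile.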

Energy Efficiency

Please also see our advice for Energy Efficient Cluster Usage.

HPC Glossary

A short definition of the typical elements of an HPC cluster.

HPC
short for High Performance Computing
HPC Cluster
Collection of compute nodes with (usually) high bandwidth and low latency communication. They can be accessed via login nodes.
Node
An individual computer with one or more sockets, part of an HPC cluster.
Socket
Physical connector on the mainboard into which a CPU package is placed.
Core
The physical unit that can independently execute tasks on a CPU. Modern CPUs generally have multiple cores.
Thread
A logical sequence of instructions that can be executed independently.
Hyperthreading
Modern computers can be configured so that one physical compute core appears as two "logical" cores to the system. These two "hyperthreads" can sometimes do computations in parallel, if the calculations use two different sub-units of the core. Most of the time, however, two calculations on two hyperthreads compete for the same physical hardware, and each runs about half as fast as a single thread on a full core. Some programs (e.g. GROMACS) can benefit from running twice as many threads on hyperthreads and finish 10-20% faster that way.
Multithreading
Multithreading means that one program runs calculations on more than one compute core, using several logical "threads" of serial compute instructions to do so (e.g. to work through different, independent data arrays in parallel). OpenMP is a common framework for multithreaded parallelization; MPI, by contrast, parallelizes across separate processes, which may even run on different nodes.
CPU
Central Processing Unit. It performs the actual computation in a compute node. A modern CPU is composed of numerous cores and layers of cache.
GPU
Graphics Processing Unit. GPUs in HPC clusters are used as high-performance accelerators and are particularly useful to process workloads in Machine Learning (ML) and Artificial Intelligence (AI) more efficiently. The software has to be explicitly designed to use GPUs. CUDA and OpenACC are the most popular platforms in scientific computing with GPUs.
RAM
Random Access Memory. It is used as the working memory for the cores.
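The core and thread concepts above can be inspected and controlled from the shell. A minimal sketch (the program name at the end is a hypothetical placeholder):

```shell
# How many logical cores does this node expose, and do two hyperthreads
# share one physical core?
nproc                                          # logical cores visible to the OS
lscpu | grep -i 'thread(s) per core' || true   # "2" means hyperthreading is on

# Multithreaded (OpenMP) programs typically take their thread count from
# the OMP_NUM_THREADS environment variable:
export OMP_NUM_THREADS=4            # e.g. one thread per requested core
echo "running with $OMP_NUM_THREADS threads"
# ./my_openmp_program               # hypothetical program launched here
```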


Categories: Batch System, Moab, Script, Slurm, Shell Script / Bash, Job, Runtime, Scaling, Scheduler, Submit, Parallelization