Energy Efficient Cluster Usage

Poor job efficiency means that hardware resources are wasted and a similar overall result could have been achieved using fewer hardware resources, leaving those for other jobs and reducing the queue wait time for all users. Efficient cluster usage means choosing optimal values for job resources:

CPU cores
GPUs
Memory
(temporary) storage
Time

Motivation

User perspective

Efficient jobs mean short waiting times, short cycles of trial and error, and fast results. The lower the job efficiency, the longer the waiting times relative to the shortest possible time.

Energy perspective

Energy consumption of data centers has been increasing continuously throughout the last decade. In 2020, the energy consumption of all data centers in Germany amounted to around 3 percent of the total electricity produced. Accompanying this large energy consumption are large-scale emissions of CO2 to the atmosphere and thus significant contributions to climate change. To illustrate this, an average compute job running on a single node for one day may easily consume 10 kWh or even more. That translates roughly to brewing 700 cups of coffee. Assuming that a typical bwHPC cluster has a few hundred compute nodes, this amounts to the energy consumption of a village for each cluster. Although a large part of this energy consumption is an intrinsic requirement of running large HPC clusters (even when its processors are idle, a cluster uses a lot of energy), efficient use of the available resources is important.

Something to keep in mind:

Using as many resources as possible does not make a power user. Using them wisely does.

Recommendations

Use the job monitoring options of your respective cluster to see the job efficiency and ways to improve it.
Test new setups first before submitting a lot of similar jobs or resource demanding jobs.
- Run a single job first before sending many.
- Run a simplified problem on a small number of parallel entities (be it processes or threads) first before requesting many resources.
- Run a time intensive task with a short runtime first to see if it already fails in the first minutes.
Follow the best practices of your chosen tool, software or programming language and choose the most efficient algorithms for the given problem. There are several places where you might find related documentation:
- Software Module help.
- Software page of your cluster.
- Pages linked at the Development page, for example information about debugging, performance analysis, and specific programming languages.
→ Use an efficient programming language, that is, a compiled language such as Rust, C, or C++. Avoid interpreted languages like Perl or Python where possible. Since machine learning is a hot topic, it deserves a few words: ML code written in Python with TensorFlow or other libraries makes heavy use of NumPy and other math packages, which rely on C-based implementations. Please make sure you use the provided Python modules, which are optimized to use Intel MKL and other mathematical libraries.

Further reading: Rui Pereira, et al: "Energy efficiency across programming languages: how do energy, time, and memory relate?", SLE 2017: Proc. of the 10th ACM SIGPLAN Int. Conf. on SW Language Eng., Oct. 2017, pp. 256–267, doi:10.1145/3136014.3136031

If you have a task that can be scaled up, please consider a scaling analysis.
Analyse memory access patterns. For small, tight loops that repeatedly check a lock (spin-wait loops), use the pause instruction.
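A scaling analysis, as recommended above, can be as simple as running the same test problem on an increasing number of cores and comparing the wall times. The following is a minimal sketch, not a cluster tool: the function names, the measured wall times, and the 70 percent efficiency threshold are all assumptions made for illustration.

```python
def scaling_table(runtimes):
    """Return (cores, speedup, efficiency) rows from measured wall times.

    runtimes maps a core count to the measured wall time in seconds.
    Speedup and efficiency are relative to the smallest core count measured.
    """
    base_cores = min(runtimes)
    base_time = runtimes[base_cores]
    rows = []
    for cores in sorted(runtimes):
        speedup = base_time / runtimes[cores]
        efficiency = speedup / (cores / base_cores)
        rows.append((cores, speedup, efficiency))
    return rows


def largest_efficient_core_count(runtimes, min_efficiency=0.7):
    """Largest core count whose parallel efficiency is still acceptable.

    The default threshold of 0.7 is an assumption, not a cluster policy.
    """
    efficient = [cores for cores, _, eff in scaling_table(runtimes)
                 if eff >= min_efficiency]
    return max(efficient)


# Hypothetical wall times in seconds for the same test problem.
measured = {1: 1000.0, 2: 520.0, 4: 280.0, 8: 170.0, 16: 130.0}

for cores, speedup, efficiency in scaling_table(measured):
    print(f"{cores:3d} cores: speedup {speedup:5.2f}, efficiency {efficiency:4.0%}")

print("Request at most", largest_efficient_core_count(measured), "cores")
```

With these made-up numbers, going from 8 to 16 cores shortens the run only slightly while doubling the blocked resources, so 8 cores would be the more reasonable request.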

Things to avoid

Poor choice of resources compared to the size of the nodes leaves part of the node blocked, but doing nothing:
- Too much (un-needed) memory or disk space requested
Many small jobs with a short runtime (seconds in extreme cases). A job should run at least 10 minutes.
When a node has 256 GB of memory but only 236 GB are usable (see the Hardware table of your cluster), then requesting 256 GB of memory needs two nodes.
GPU: Do not ask for a specific GPU type when your code would run on any type.
More cores used for a single MPI/OpenMP parallel computation than useful.
Writing temporary files to global filesystems when a RAM disk or local disk can be used.
Do not use more parallel processes than there are actual, reasonable parallel tasks. Examples:
- 4 datasets with values for the same 1000 genes each. Method xx shall be used per gene. This operation needs 10 seconds per gene.
  - Case 1: Provide 64 single-threaded cores within one job and use one process/thread per gene and dataset: This creates a large overhead, as only 64 processes can run in parallel while all 4000 processes compete for computing time. A process might run for 5 seconds and then make room for another process, which also cannot finish because yet another process cuts in.
  - Case 2: Send 1 job per gene so that each job uses method xx on this gene in all 4 datasets: This creates a large overhead, as each job only needs 40 seconds while there is a thousand times the overhead of creating and finishing a job.
  - Case 3: Provide 64 cores in one job. Each dataset uses 16 cores. Each core needs to process 1000/16 ≈ 63 genes. This needs 63*10 seconds = 10.5 minutes job runtime with just the overhead of creating a single job and 16 processes per dataset, without switching between processes.

→ Simple parallelization by hand is advisable. See: A basic introduction to Parallel Programming.
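The arithmetic of Case 3 can be checked with a short sketch that splits the genes of one dataset into one contiguous chunk per core. The helper chunk_evenly and the constant names are hypothetical and exist only for this illustration; the numbers are the ones from the example above.

```python
def chunk_evenly(n_items, n_workers):
    """Split range(n_items) into n_workers contiguous chunks.

    Chunk sizes differ by at most one item.
    """
    base, extra = divmod(n_items, n_workers)
    chunks = []
    start = 0
    for worker in range(n_workers):
        size = base + (1 if worker < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


N_GENES = 1000          # genes per dataset
N_DATASETS = 4
SECONDS_PER_GENE = 10   # runtime of method xx for one gene
CORES = 64              # cores requested in the single job

cores_per_dataset = CORES // N_DATASETS
chunks = chunk_evenly(N_GENES, cores_per_dataset)

# The job finishes when the core with the largest chunk is done.
largest_chunk = max(len(chunk) for chunk in chunks)
runtime_seconds = largest_chunk * SECONDS_PER_GENE

print(f"{cores_per_dataset} cores per dataset, "
      f"up to {largest_chunk} genes per core, "
      f"about {runtime_seconds / 60:.1f} minutes job runtime")
```

Each process works through its own chunk from start to finish, so there are never more processes than cores and no time is lost switching between them.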

Fair Share and Scheduler

Influence of Efficiency on Fair Share and Scheduler:

Fair Share value:
The fair share value represents your priority on the cluster. On a busy cluster a low priority leads to longer waiting times, as other users with a higher priority have a higher chance of getting free resource slots. The fair share value declines when a lot of cluster resources were used recently. What counts as used is illustrated by the following example:
- Requested resources: 4 GPUs for 2 hours
- Actual resource usage: 1 GPU, and the job finishes successfully after 1 hour.
- Used resources as considered by the fair share value: 4 GPUs for 1 hour, because the three idle GPUs couldn't be used by any other cluster user as they were blocked by this job.
Job scheduler:
The longer the requested runtime and the more resources are needed in that time the harder it is to find a fitting timeframe where these resources are free.
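The fair share example above can be written down as a small calculation. This is a simplified sketch of the accounting idea only, under the assumption that usage is charged as allocated GPUs times actual runtime; the function name and the returned keys are hypothetical, and the real fair share formula of your cluster's scheduler is more involved.

```python
def fair_share_usage(requested_gpus, used_gpus, runtime_hours):
    """Compare what a job is charged for with what it actually used.

    Assumption: all requested GPUs are blocked for the actual runtime,
    so they count as used even if they stay idle.
    """
    charged = requested_gpus * runtime_hours
    useful = used_gpus * runtime_hours
    return {
        "charged_gpu_hours": charged,
        "useful_gpu_hours": useful,
        "idle_gpu_hours": charged - useful,
        "efficiency": useful / charged,
    }


# Numbers from the example: 4 GPUs requested, 1 GPU used, finished after 1 hour.
usage = fair_share_usage(requested_gpus=4, used_gpus=1, runtime_hours=1)
print(usage)
```

Requesting only the single GPU that is actually needed would reduce the charged usage to a quarter and leave the other three GPUs free for other jobs.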
