Energy Efficient Cluster Usage
General Issue with Energy Efficiency
You are all aware of the rising energy costs and you are certainly careful to economize your energy consumption at home. But are you aware of the energy consumed by your computational jobs? An average compute job running on just a single node for one day may easily consume 10 kWh or even more.
That translates roughly to one of the following activities:
- toasting about 1330 slices of toast in a toaster
- continuously blow-drying your hair for about 10 hours
- actively working on a laptop for about 500 hours
- brewing 700 cups of coffee
Besides the energy cost, even this single job contributes to climate change by adding around 5 kg of CO2 to the atmosphere (based on the average German power mix), which is roughly equivalent to driving 30 km by car.
You get the point: Please always keep this in mind when submitting tens or even hundreds of jobs to the queue, just like you do when switching on your electrical devices at home. Also, please always think carefully about how many resources your jobs really need and whether your application really benefits from allocating more cores for the jobs. Application speedup is often limited and does not scale linearly with the number of dedicated cores. But energy consumption usually does ...
Using as many resources as possible does not make a power user. Using them wisely does. If in doubt, just ask.
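The point about limited speedup can be illustrated with Amdahl's law, a standard textbook model that is not part of the original recommendations. The sketch below makes two simplifying assumptions: a fixed fraction of the program can be parallelized (the value 0.9 is purely hypothetical), and the power draw grows in proportion to the number of allocated cores. Real applications and real hardware will deviate from both.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup predicted by Amdahl's law for a given parallel fraction."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def relative_energy(parallel_fraction: float, cores: int) -> float:
    """Energy relative to a single-core run.

    Assumption: power is proportional to the number of allocated cores,
    so energy = cores * runtime, and runtime = single-core time / speedup.
    """
    return cores / amdahl_speedup(parallel_fraction, cores)


def main() -> None:
    parallel_fraction = 0.9  # hypothetical value, measure your own code
    print("cores  speedup  relative energy")
    for cores in (1, 2, 4, 8, 16, 32, 64):
        speedup = amdahl_speedup(parallel_fraction, cores)
        energy = relative_energy(parallel_fraction, cores)
        print(f"{cores:5d}  {speedup:7.2f}  {energy:15.2f}")


if __name__ == "__main__":
    main()
```

Under these assumptions the speedup can never exceed 1 / (1 - 0.9) = 10, no matter how many cores are requested, while the modeled energy keeps growing with every additional core.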
General recommendations
- Choose the most efficient algorithms for the given problem
- Run only necessary jobs: Please consider testing new setups and their output for validity before submitting a large number of similar jobs
- Start small: Run your problem on a small number of parallel entities (be it processes or threads) first
- Estimate the runtime of the parallel job as accurately as possible to make scheduling of the whole system more efficient
- Use the proper tools for development: If you develop your own code, please use the proper tools for debugging and parallel performance analysis. More information is available on the bwHPC Wiki.
- A look at the job feedback can help you determine if you are using the cluster efficiently
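The "start small" advice can be turned into a simple scaling check: run the same test problem with a few core counts, record the runtimes, and compare speedup, parallel efficiency and consumed core-hours. The following is a minimal sketch. The function names, the example runtimes and the 70 % efficiency threshold are all hypothetical and only serve as illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ScalingPoint:
    cores: int
    runtime_s: float
    speedup: float
    efficiency: float
    core_hours: float


def analyse_scaling(runtimes: Dict[int, float]) -> List[ScalingPoint]:
    """Compute speedup and efficiency relative to the smallest core count."""
    base_cores = min(runtimes)
    base_time = runtimes[base_cores]
    points = []
    for cores in sorted(runtimes):
        runtime = runtimes[cores]
        speedup = base_time / runtime
        # Efficiency 1.0 means perfect scaling from the baseline run.
        efficiency = speedup * base_cores / cores
        core_hours = cores * runtime / 3600.0
        points.append(ScalingPoint(cores, runtime, speedup, efficiency, core_hours))
    return points


def largest_efficient_core_count(runtimes: Dict[int, float],
                                 min_efficiency: float = 0.7) -> int:
    """Largest tested core count whose efficiency is still acceptable.

    The default threshold of 0.7 is an arbitrary illustrative choice.
    """
    efficient = [p.cores for p in analyse_scaling(runtimes)
                 if p.efficiency >= min_efficiency]
    return max(efficient)


if __name__ == "__main__":
    # Hypothetical measured runtimes in seconds, keyed by core count.
    measured = {1: 3600.0, 2: 1900.0, 4: 1050.0, 8: 650.0, 16: 500.0}
    for p in analyse_scaling(measured):
        print(f"{p.cores:3d} cores: speedup {p.speedup:5.2f}, "
              f"efficiency {p.efficiency:4.2f}, core-hours {p.core_hours:4.2f}")
    print("suggested core count:", largest_efficient_core_count(measured))
```

The core-hours column is a rough proxy for the energy a job consumes: if it rises sharply while the runtime barely improves, the additional cores are mostly wasted.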
Code development recommendations
The recommendations above help you use the cluster resources efficiently. In software development, power efficiency correlates strongly with computing performance, but also with memory usage, i.e. both the amount of memory used and memory efficiency.
Here, we have gathered a few results based on other research:
- Use an efficient programming language such as Rust, C or C++; in general, any compiled language. Do not use an interpreted language like Perl or Python. Since machine learning is a hot topic, this deserves a few words: ML code written in Python with TensorFlow or other libraries makes heavy use of NumPy and other math packages, which rely on C-based implementations. Please make sure you use the provided Python modules, which are optimized to use Intel MKL and other mathematical libraries.
Link to paper.
- Analyse memory access patterns
- For small, tight loops that check for locks, use the pause instruction.
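The remark about NumPy and C-based implementations can be made concrete with a small comparison. The sketch below computes the same dot product once with an explicit Python loop and once with np.dot, which hands the loop to compiled code. It assumes NumPy is installed. The array size is an arbitrary choice, and the timings depend on the machine and on the math library NumPy was built against.

```python
import time

import numpy as np


def dot_pure_python(a, b):
    """Dot product with an explicit loop executed by the interpreter."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


def dot_numpy(a, b):
    """Dot product delegated to NumPy's compiled implementation."""
    return float(np.dot(a, b))


def time_call(func, *args):
    """Return the result of func(*args) and the elapsed wall time in seconds."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def main() -> None:
    rng = np.random.default_rng(seed=0)
    a = rng.random(1_000_000)  # arbitrary illustrative size
    b = rng.random(1_000_000)

    # Convert to lists first so the loop works on plain Python floats.
    slow, t_slow = time_call(dot_pure_python, a.tolist(), b.tolist())
    fast, t_fast = time_call(dot_numpy, a, b)

    print(f"pure Python loop: {t_slow:.4f} s")
    print(f"NumPy np.dot:     {t_fast:.4f} s")
    print(f"relative difference of results: {abs(slow - fast) / abs(fast):.2e}")


if __name__ == "__main__":
    main()
```

Both variants return the same value up to floating-point rounding. The practical takeaway is to keep inner loops inside library calls wherever possible, so that the interpreter only orchestrates the work.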