Software Modules Lmod: Difference between revisions
Revision as of 12:43, 10 December 2021
Software Module System - Lmod
Preface
This guide gives a general overview of, and an introduction to, software management with Lmod on JUSTUS 2. It is intended for new users as well as for experienced users coming from sites that use a different environment modules system.
Scientific Software management through the module system Lmod
JUSTUS 2 runs the Linux operating system. The standard Linux packages are installed on the front end nodes (login and visualisation nodes).
The scientific software is accessible via a so-called module system.
To find and load a scientific application, one needs to use module commands. For example, the following command sequence:
module load chem/gaussian/g16.C.01
module list
module help chem/gaussian/g16.C.01
cp $GAUSSIAN_EXA_DIR/bwforcluster-gaussian-example.sbatch .
1. loads the module with the Gaussian software package, version 16, revision C.01
2. prints the list of currently loaded modules
3. shows the user help for this particular Gaussian module
4. copies the template batch script designed for submitting g16 jobs to the SLURM workload manager on JUSTUS 2
The following sections cover the basic module commands needed to find and load the applications installed on JUSTUS 2.
The module system on JUSTUS 2 is managed by Lmod (https://lmod.readthedocs.io/en/latest/).
The module system incorporates the majority of the available computational software. This includes, among others, compilers, MPI libraries, numerical libraries, computational chemistry packages, and Python-specific libraries.
Programs managed by the module system are not usable by default. A module has to be "loaded" before its software becomes executable.
Thus, one of the main functions of Lmod is to load modules and thereby make the software packages pre-installed on the cluster accessible. This takes only a single command:
module load <module_name>
The activation is done by dynamically modifying the user's shell environment. In the simplest case, this means adding the bin directories of the specific software to the PATH environment variable. Typically, Lmod modifies PATH and LD_LIBRARY_PATH and also sets new variables, for example <SOFTWARE_NAME>_EXA_DIR, which contains the path to the directory with examples for a specific software.
Example: compare the content of the $PATH environment variable before and after loading the gaussian module:
Before the gaussian module is loaded:
echo $PATH
/home/software/common/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
which g16
/usr/bin/which: no g16 in (/home/software/common/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin)
ml chem/gaussian
After gaussian is loaded:
echo $PATH
/.../chem/gaussian/g16.C.01/x86_64-Intel-avx2-source/g16/bsd:/.../chem/gaussian/g16.C.01/x86_64-Intel-avx2-source/g16:/home/software/common/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
which g16
/.../chem/gaussian/g16.C.01/x86_64-Intel-avx2-source/g16/g16
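The effect of a load can be mimicked by hand, which may help to see that nothing more than environment variables is involved. The following is a minimal sketch and not what Lmod actually executes: demo_tool, DEMO_TOOL_EXA_DIR and the temporary install prefix are hypothetical names created only for this demonstration.

```shell
# Hypothetical install prefix, created only for this demonstration
prefix=$(mktemp -d)
mkdir -p "$prefix/bin" "$prefix/examples"
printf '#!/bin/sh\necho "hello from demo_tool"\n' > "$prefix/bin/demo_tool"
chmod +x "$prefix/bin/demo_tool"

# Before the "load": the shell cannot find the program
command -v demo_tool || echo "demo_tool not found before the load"

# The essence of a load: prepend the bin directory to PATH
# and export a helper variable pointing to the examples
PATH="$prefix/bin:$PATH"
export PATH
export DEMO_TOOL_EXA_DIR="$prefix/examples"

# After the "load": the program is found via PATH
command -v demo_tool
demo_tool
```

Unloading a module reverses such changes, which is why the module system has to keep track of what each module added.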
Why do we use a module system?
The module system provides, among others, the following functionalities:
1. When a module is loaded, it automatically sets the environment variables required by the application to run properly.
2. It takes care of module dependencies. It either loads all additional modules required by the application, or it informs the user that additional modules need to be loaded manually.
3. It prevents the loading of modules that conflict with each other and could cause instability or unexpected behavior.
Basic functions of Lmod and commands
The module system offers more than just managing the environment. Its basic functions are:
(i) to list available software
module available
(ii) to load (activate) modules (particular software)
module load
(iii) to unload (inactivate) modules (particular software)
module unload
module purge
(iv) to list currently loaded modules
module list
(v) to search through all packages within the module system
module available
module spider
module keyword
(vi) to provide specific help information for a particular module
module help
module whatis
Elementary Lmod Commands
The module commands below can be used interactively (in the current shell session) as well as in shell scripts, in particular in the sbatch scripts used to submit computational jobs to SLURM (the workload manager on JUSTUS 2).
List of available software
module available
alternatively also in a short form
ml av
Module naming convention: Category/Name/Version
On JUSTUS 2 (as on other bwHPC sites), the software modules are grouped into several categories:
- chem
- compiler
- devel
- lib
- numlib
- phys
- system
- vis
- math
This makes it easier for users to find their way around the module system. For example, the Gaussian 16 program, which calculates the electronic structure of molecules, is found in the category chem (together with other programs used by theoretical chemists).
Each category is further divided by software package, and each package by software version.
The full name of a module always consists of three parts (category, name, and version) separated by slashes: category/name/version. Consequently, the full name of the module with the Gaussian 16 package is chem/gaussian/g16.C.01. Analogously, the GNU compiler, version 10.2, is addressed as compiler/gnu/10.2.
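The three-part convention can be taken apart with plain shell parameter expansion, for instance in a script that logs which software version a job used. This is only an illustrative sketch; split_module_name is a hypothetical helper and not part of Lmod.

```shell
# Hypothetical helper: split category/name/version into its parts.
# Returns 1 if the name has fewer than three parts or an empty part.
split_module_name() {
    local category rest name version
    case $1 in
        /*|*/|*//*) return 1 ;;   # empty component
        */*/*) ;;                 # category/name/version present
        *) return 1 ;;            # version missing
    esac
    category=${1%%/*}             # text before the first slash
    rest=${1#*/}                  # text after the first slash
    name=${rest%%/*}
    version=${rest#*/}
    printf 'category=%s\nname=%s\nversion=%s\n' "$category" "$name" "$version"
}

split_module_name chem/gaussian/g16.C.01
# category=chem
# name=gaussian
# version=g16.C.01
```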
See, for example, all modules of category chem with:
ml av chem
---------------------------------------------- /opt/bwhpc/common/modulefiles/Core ----------------------------------------------------------------------------------
chem/adf/2019.304         chem/gaussian/g16.C.01       chem/molpro/2020.1 (D)     chem/orca/5.0.1-xtb-6.4.1           chem/tmolex/4.6 (D)
chem/ams/2020.101         chem/gaussview/6.1.1         chem/namd/2.14             chem/orca/5.0.1 (D)                 chem/turbomole/7.4.1
chem/ams/2020.103         chem/gromacs/2020.2          chem/nbo/6.0.18_i4         chem/quantum_espresso/6.5           chem/turbomole/7.5 (D)
chem/ams/2021.102 (D)     chem/gromacs/2020.4          chem/nbo/6.0.18_i8 (D)     chem/quantum_espresso/6.7_openmp-5  chem/vasp/5.4.4.3.16052018
chem/cfour/2.1_openmpi    chem/gromacs/2021.1 (D)      chem/openbabel/3.1.1       chem/quantum_espresso/6.7 (D)       chem/vmd/1.9.3
chem/cp2k/7.1             chem/jmol/14.31.3            chem/openmolcas/19.11      chem/schrodinger/2020-2             chem/xtb/6.3.3
chem/cp2k/8.0_devel (D)   chem/lammps/stable_3Mar2020  chem/openmolcas/21.06 (D)  chem/schrodinger/2021-1 (D)         chem/xtb/6.4.1 (D)
chem/dalton/2020.0        chem/molcas/8.4              chem/openmolcas/21.10      chem/siesta/4.1-b4
chem/dftbplus/20.2.1-cpu  chem/molden/5.9              chem/orca/4.2.1-xtb-6.3.3  chem/siesta/4.1.5 (D)
chem/gamess/2020.2        chem/molpro/2019.2.3         chem/orca/4.2.1            chem/tmolex/4.5.2
or, analogously, all available versions of intel compilers:
ml av compiler/intel
-------------------------------------------------------------------------------- /opt/bwhpc/common/modulefiles/Core --------------------------------------------------------------
compiler/intel/19.0   compiler/intel/19.1   compiler/intel/19.1.2 (D)
Load specific software
module load <module_name>
or shortly
ml <module_name>
For example, to load Gaussian 16, revision C.01, run
ml chem/gaussian/g16.C.01
List of the loaded modules
module list
or simply
ml
Default module version
If several versions of a software package are installed, one version is always predefined as the default. To address the default version, the version can be omitted from the module identifier. For example, the default Intel compiler module is loaded via
ml compiler/intel
ml
Currently Loaded Modules:
  1) compiler/intel/19.1.2
Unload a specific software from the environment
module unload <module_name>
or equivalently
ml -<module_name>
For example, to unload the previously loaded vasp module chem/vasp/5.4.4.3.16052018, use
ml -chem/vasp/5.4.4.3.16052018
Unload all the loaded modules
$ module purge
or
ml purge
Providing a specific help for a particular module
module help <module_name>
or
ml help <module_name>
Software examples and batch script templates
Most software modules provide examples, including batch scripts for the SLURM queueing system. The full path to the directory with the examples is normally stored in the <SOFTWARE_NAME>_EXA_DIR environment variable. For example, once the module is loaded, the examples for GROMACS 2021.1 are located in:
ml chem/gromacs/2021.1
echo $GROMACS_EXA_DIR/
/opt/bwhpc/common/chem/gromacs/2021.1-openmpi-4.0/bwhpc-examples/
ls $GROMACS_EXA_DIR
GROMACS_TestCaseA  Performance-Tuning-and-Optimization-of-GROMACS.pdf  README
ls $GROMACS_EXA_DIR/GROMACS_TestCaseA/
gromacs-2021.1_gpu.slurm  gromacs-2021.1.slurm  ion_channel.tpr
Users may copy these examples and use them as templates for their own job scripts:
cp $GROMACS_EXA_DIR/GROMACS_TestCaseA/gromacs-2021.1_gpu.slurm .
Note: All the example batch scripts are fully functional, i.e. they can be submitted directly to the queueing system. Typically, the scripts launch a short, simple calculation with the given software. Moreover, most of the sbatch scripts contain general submit instructions as well as hints specific to the particular program.
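Because the variable name follows the <SOFTWARE_NAME>_EXA_DIR pattern, copying the examples of any loaded module can be wrapped in a small helper. The sketch below is an illustration only: copy_examples is a hypothetical function, and it assumes that the corresponding module is already loaded and really sets such a variable.

```shell
# Hypothetical helper: copy the examples of a loaded module.
# Usage: copy_examples GROMACS ./gromacs-examples
copy_examples() {
    local name=$1 dest=${2:-.} exa_dir
    case $name in
        ''|*[!A-Z0-9_]*)
            echo "invalid software name: '$name'" >&2
            return 2 ;;
    esac
    # Indirect lookup of <SOFTWARE_NAME>_EXA_DIR (name was validated above)
    eval "exa_dir=\${${name}_EXA_DIR:-}"
    if [ -z "$exa_dir" ] || [ ! -d "$exa_dir" ]; then
        echo "${name}_EXA_DIR is not set or not a directory; load the module first" >&2
        return 1
    fi
    mkdir -p "$dest" && cp -r "$exa_dir"/. "$dest"/
}
```

After "ml chem/gromacs/2021.1", calling "copy_examples GROMACS ./gromacs-examples" would place a private, editable copy of the examples in the current working directory.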
Simple searching through names of the modules
module available <module_name>
or shortly
ml av <module_name>
For example, searching for python modules is realized via
ml av python
with the following output:
----------------------------------------------------------------------------------- /opt/bwhpc/common/modulefiles/Core ------------------------------------------------------------------------------------
devel/python/3.8.3  lib/python_matplotlib/3.2.2_numpy-1.19.0_python-3.8.3  numlib/python_numpy/1.19.0_python-3.8.3  numlib/python_scipy/1.5.0_numpy-1.19.0_python-3.8.3

Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys"
Finding detailed information about a specific module
module spider <searching_pattern>
or just
ml spider <searching_pattern>
Extended searching through entire module system
module keyword <searching_pattern>
For example, to find out which modules contain fftw library:
ml keyword fftw
which gives the following info:
--------------------------------------------------------------------------------
The following modules match your search criteria: "fftw"
--------------------------------------------------------------------------------

  numlib/mkl: numlib/mkl/2019, numlib/mkl/2020, numlib/mkl/2020.2

--------------------------------------------------------------------------------
Best Practices when working with modules
Always load modules with the entire module name
The software stack is updated regularly. Adding a new software version usually changes which version is marked as the default. Newer software is not always backwards compatible with existing scripts, workflows, or even input files. It is therefore strongly recommended not to load modules by category and software name alone. Instead, always use the entire module name (including the version) to make sure the same module is loaded each time.
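In job scripts this rule can be enforced with a thin wrapper around "module load". The following is a minimal sketch; load_pinned is a hypothetical helper name, and the check only verifies that the name has the category/name/version form, not that the version exists.

```shell
# Hypothetical wrapper: load modules only if a version is given.
load_pinned() {
    local m
    for m in "$@"; do
        case $m in
            */*/*)
                module load "$m" || return 1 ;;
            *)
                echo "refusing to load '$m': no version given" >&2
                return 2 ;;
        esac
    done
}

# Usage in a job script (module name taken from this page):
#   load_pinned chem/gaussian/g16.C.01
```

A call such as "load_pinned compiler/intel" is rejected with an error message instead of silently picking up whatever version is the default on that day.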
Load only those modules that are needed for the current application
Only load modules that are needed for the current script or workflow you are running, to reduce the chance of unexpected behavior caused by module conflicts.
Do not use module commands in .bashrc, .bash_profile etc. scripts
Avoid including “module load” commands in your .bashrc or .bash_profile files. As an alternative, create a bash script with the module load commands and source it each time, to load the modules needed.
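Such a script could look like the sketch below. The file name modules-myproject.sh and the function name setup_myproject_env are hypothetical; the module names are taken from listings on this page and should be replaced by the ones your workflow needs. The sketch assumes that the module command returns a non-zero status when a load fails.

```shell
# modules-myproject.sh (hypothetical file name)
# Source this file, do not execute it:
#   source ./modules-myproject.sh && setup_myproject_env
setup_myproject_env() {
    # Start from a clean environment
    module purge || return 1
    # Compiler first, then the MPI library built with that compiler
    module load compiler/intel/19.1 || return 1
    module load mpi/openmpi/4.0 || return 1
    # Print what is loaded now, as a record in the job output
    module list
}
```

Sourcing is required because the modules must be loaded into the current shell; running the file as a separate program would change only the environment of that child process.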
Useful Extras
Conflicts between modules
Some modules cannot be loaded at the same time. For example, two different versions of the same package cannot be active simultaneously. The modules may already have this check built in. In such cases, Lmod either prints an error message during the load and loads nothing, or it reloads the module: the old module is unloaded and only the new module becomes active.
Example of two versions of the intel compiler - module reload:
ml compiler/intel/19.1
ml
Currently Loaded Modules:
  1) compiler/intel/19.1
ml compiler/intel/19.1.2
The following have been reloaded with a version change:
  1) compiler/intel/19.1 => compiler/intel/19.1.2
ml
Currently Loaded Modules:
  1) compiler/intel/19.1.2
Example of two different compilers, intel and gnu: loading gnu triggers a module conflict with an error, and the new module is not loaded:
ml compiler/intel/19.1.2
ml
Currently Loaded Modules:
  1) compiler/intel/19.1.2
ml compiler/gnu/10.2
Lmod has detected the following error: Cannot load module "compiler/gnu/10.2" because these module(s) are loaded:
   compiler/intel

While processing the following module(s):
    Module fullname    Module Filename
    ---------------    ---------------
    compiler/gnu/10.2  /opt/bwhpc/common/modulefiles/Core/compiler/gnu/10.2.lua
ml
Currently Loaded Modules:
  1) compiler/intel/19.1.2
Solution: the intel compiler must be unloaded before the gnu module is loaded.
ml -compiler/intel/19.1.2
ml compiler/gnu/10.2
ml
Currently Loaded Modules:
  1) compiler/gnu/10.2
Module dependencies (why is there no MPI module?)
Some modules depend on other modules. Typically, many modules depend on an MPI library, the MPI library depends on a compiler, and so on. Users usually do not need to care about these dependencies, because most modules automatically load all the packages they depend on. There is one prominent exception: the MPI libraries. While most of the installed parallel applications exist for only one compiler-MPI combination, the same version of an MPI library is often built with several different compilers. For example, there are two sets of OpenMPI 4.0 modules, one for the intel and one for the gnu compilers. A user who wants to load a specific MPI library must therefore load the desired compiler before loading the MPI module. Note also that the MPI modules remain invisible to the "module av <mpi_name>" command until a compiler has been loaded. This is due to the module hierarchy of Lmod, which is described in more detail below in https://wiki.bwhpc.de/e/Software_Modules_Lmod#Semi_hierarchical_layout_of_modules_on_JUSTUS_2.
Consequences of the partial module hierarchy for mpi modules
MPI modules remain invisible to the user (i.e. they are not listed by module avail) until a compiler module has been loaded. Once a compiler module has been activated, the corresponding MPI modules, i.e. those built with that particular compiler, become visible.
E.g., with the originally empty list of the loaded modules, the module command
$ module avail
or its shorthand analogue
$ ml av
displays no mpi module available. After running
ml compiler/intel/19.1
ml av
the MPI packages compatible with the intel 19.1 compiler become visible
------------ /opt/bwhpc/common/modulefiles/Compiler/intel/19.1 --------------------
mpi/impi/2019.7   mpi/openmpi/4.0
in the list of the available software.
Online user guide of Lmod
The complete user guide can be found on the Lmod website: https://lmod.readthedocs.io/en/latest/010_user.html
Additional Module System tasks
Lmod offers more than 25 sub-commands plus various options to manage the modulefile system installed on JUSTUS 2. The large majority of users will need only a few of them. The complete list of module sub-commands can be displayed with the "module --help" command or found in the Lmod online documentation. The following text covers only some of them.
Other topics
Which shells support module commands?
So far, Bash is the only shell on JUSTUS 2 that supports module commands.
Semi hierarchical layout of modules on JUSTUS 2
Module hierarchy in Lmod
The software modules on JUSTUS 2 are organised in a "semi" hierarchical structure. This is slightly different from what can be seen on other HPC systems with a "full" hierarchical structure. Typical systems with a full hierarchy put compiler modules (e.g. intel, gcc) in the uppermost (Core) level, libraries that depend on them (e.g. MPI) on the second level, and libraries that depend on those on a third level. As a consequence, not all modules contained in the module system are visible initially, namely those placed in the second and third layer. Only after a compiler module has been loaded do the second-layer modules that depend directly on that compiler become available. Similarly, loading an MPI module makes visible the third-layer modules that depend on the loaded MPI library.
Semi hierarchy of software stack on JUSTUS 2
JUSTUS 2 has adopted the hierarchical module layout only partially. In particular, only the "Core" level and the "second" level are present, and the second level contains only MPI modules. All other modules, for example those from the "chem" category such as vasp, turbomole, or gaussian, or those from the "numlib" category such as mkl or python_numpy, are placed in the "Core" level.
Module dependency
The adopted hierarchy model is not the only tool for handling module dependencies. In fact, most of the modules on JUSTUS 2 require functionality provided by other modules, even though those are located in the "Core" level. This is implemented in the modulefile in one of two ways. Either the dependent modulefile loads all additional modules automatically while it is being loaded, without any action from the user, or it informs the user that additional modules must be loaded first if they have not been activated yet (in this case the user must repeat the load operation). Which of the two solutions is applied is decided by the person who built the particular module.
An example of a module with automated pre-loading is the orca module. Starting from an empty list of loaded modules, i.e.
ml
shows
No modules loaded
, the command sequence
ml chem/orca
ml
shows
Currently Loaded Modules:
  1) compiler/intel/19.1   2) chem/orca/4.2.1
That is, loading the intel compiler is built into the orca module.
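In a script, one way to check which modules ended up loaded, including automatically pre-loaded ones, is to inspect the LOADEDMODULES environment variable, which is assumed here to hold a colon-separated list of the full names of all loaded modules. The sketch below is illustrative; is_loaded is a hypothetical helper and not an Lmod command.

```shell
# Hypothetical helper: succeed if a module is currently loaded.
# Accepts a full name (compiler/intel/19.1) or category/name (compiler/intel).
# Assumption: LOADEDMODULES is a colon-separated list of loaded module names.
is_loaded() {
    case ":${LOADEDMODULES:-}:" in
        *":$1:"*|*":$1/"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage after "ml chem/orca":
#   is_loaded compiler/intel && echo "the intel compiler was pre-loaded"
```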
Complete list of Lmod options and sub-commands
The whole list of module options and all commands available can be displayed by running
man module
or
module --help
How do Modules work?
The default shell on the bwHPC clusters is bash, so explanations and examples will be shown for bash. In general, programs cannot modify the environment of the shell they are being run from, so how can the module command do exactly that?
The module command is not a program, but a bash-function.
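Both the restriction and the way a shell function gets around it can be reproduced with a toy example. The sketch below is not Lmod code; toy_emit, toy_module and DEMO_VAR are hypothetical names used only for illustration.

```shell
unset DEMO_VAR

# A child process gets its own copy of the environment;
# whatever it sets is lost when it exits.
sh -c 'DEMO_VAR=child; export DEMO_VAR'
echo "${DEMO_VAR:-unset}"        # prints: unset

# toy_emit only PRINTS shell code on stdout
toy_emit() {
    printf 'export DEMO_VAR=%s\n' "$1"
}

# toy_module is a shell function, so eval runs the printed code
# inside the current shell and the change persists.
toy_module() {
    eval "$(toy_emit "$@")"
}

toy_module loaded
echo "$DEMO_VAR"                 # prints: loaded
```

The real module function relies on the same eval pattern.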
You can view its content using:
type module
and you will get the following result:
type module
module is a function
module ()
{
    eval $($LMOD_CMD bash "$@");
    [ $? = 0 ] && eval $(${LMOD_SETTARG_CMD:-:} -s sh)
}
In this function, lmod is called. Its output to stdout is then executed inside your current shell using the bash-internal eval command. As a consequence, all output that you see from the module is transmitted via stderr (output handle 2) or in so