Software Modules Lmod

From bwHPC Wiki

Software Module System - Lmod

Preface

This guide gives a general overview of and introduction to software management via Lmod on JUSTUS 2. It is intended for new users as well as for experienced users coming, e.g., from sites that use a different environment modules system.

Scientific Software management through the module system Lmod

The following sections cover the basic module commands needed to find and load the applications installed on JUSTUS 2.

JUSTUS 2 runs a Linux operating system. The standard Linux packages are installed on the front-end nodes (login and visualisation nodes). The scientific software is accessible via the so-called module system.
To find and load a scientific application, one needs to use module commands. For example, the following command sequence performs the four steps listed after it:

module load chem/gaussian/g16.C.01
module list
module help chem/gaussian/g16.C.01
cp $GAUSSIAN_EXA_DIR/bwforcluster-gaussian-example.sbatch .
1. loads the module with the Gaussian software package, version 16, revision C.01
2. prints the list of currently loaded modules
3. displays the user help for this particular Gaussian module
4. copies the template batch script, which was specifically designed for submitting g16 jobs to the SLURM workload manager on JUSTUS 2, into the current directory.


Why do we use a module system? Modules load scientific software

The module system on JUSTUS 2 is managed by Lmod (https://lmod.readthedocs.io/en/latest/). It incorporates the majority of the computational software available, including compilers, MPI libraries, numerical libraries, computational chemistry packages, Python-specific libraries, etc.

Programs managed by the module system are not usable by default. A program has to be "loaded" before it can be executed.

The module system provides, among others, the following functionalities:

1. When a module is loaded, the environment variables that the application requires to run properly are set automatically.
2. It takes care of module dependencies. It either loads all additional modules required by the application, or it informs the user that additional dependency modules need to be loaded manually.
3. It prevents the loading of modules that conflict with each other and could cause instability or unexpected behavior.

One of the main functions of Lmod is to load modules, which makes the variety of software packages pre-installed on the cluster accessible. This requires only a single command:

module load <module_name>

The activation is realized by dynamically modifying the user's shell environment. In the simplest case, this means adding the paths to the bin directories of the specific software to the PATH environment variable. Typically, Lmod modifies PATH and LD_LIBRARY_PATH, and it also sets new variables such as <SOFTWARE_NAME>_EXA_DIR, which contains the path to the directory with examples for the specific software.

Example: compare the content of the $PATH environment variable before and after loading the Gaussian module.
Before the Gaussian module is loaded:

echo $PATH
/home/software/common/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
which g16
/usr/bin/which: no g16 in (/home/software/common/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin)
ml chem/gaussian

After the Gaussian module is loaded:

echo $PATH
/.../chem/gaussian/g16.C.01/x86_64-Intel-avx2-source/g16/bsd:/.../chem/gaussian/g16.C.01/x86_64-Intel-avx2-source/g16:/home/software/common/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
which g16
/.../chem/gaussian/g16.C.01/x86_64-Intel-avx2-source/g16/g16
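
The effect of loading and unloading on PATH can be imitated by hand. The following is only a minimal sketch that does not use Lmod at all: the tool demo_tool and its temporary directory are hypothetical and exist purely for illustration. It shows that a command becomes resolvable once its directory is prepended to PATH and disappears again when the old PATH is restored.

```shell
# Hypothetical tool installed in a throw-away directory (for illustration only).
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho "hello from demo_tool"\n' > "$demo_dir/demo_tool"
chmod +x "$demo_dir/demo_tool"

saved_path=$PATH

# "Load": prepend the tool directory to PATH, similar to what a module load does.
PATH="$demo_dir:$PATH"
export PATH
found_after_load=$(command -v demo_tool || true)
echo "after load:   ${found_after_load:-not found}"

# "Unload": restore the previous PATH.
PATH=$saved_path
export PATH
found_after_unload=$(command -v demo_tool || true)
echo "after unload: ${found_after_unload:-not found}"

rm -rf "$demo_dir"
```

A real module typically does more than this (for example, it may also set LD_LIBRARY_PATH and further variables), so the sketch only illustrates the principle.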

Basic functions of Lmod and commands


The module system has more useful capabilities than just managing the environment:

(i) to list available software

module available


(ii) to load (activate) modules (particular software)

module load


(iii) to unload (inactivate) modules (particular software)

module unload
module purge

(iv) to list currently loaded modules

module list


(v) to search through all packages within the module system

module available
module spider
module keyword


(vi) to provide specific help information for a particular module

module help
module whatis

Elementary Lmod Commands

The module commands below can be used interactively (in the current shell session) as well as in shell scripts, in particular in the sbatch scripts used to submit computational jobs to SLURM (the workload manager on JUSTUS 2).

List of available software modules

module available

or, alternatively, in the short form

ml av

Module naming convention: Category/Name/Version

On JUSTUS 2 (as on other bwHPC sites), the software modules are grouped into several categories:

  • chem
  • compiler
  • devel
  • lib
  • numlib
  • phys
  • system
  • vis
  • math

This makes it easier for users to orient themselves within the module system. For example, the Gaussian 16 program, which calculates the electronic structure of molecules, is found in the category chem (together with other programs used by theoretical chemists).
Each category is further divided by software package, and each package by software version.
The full name of a module always consists of three parts, category, name, and version, separated by slashes: category/name/version. Consequently, the full name of the module with the Gaussian 16 package is chem/gaussian/g16.C.01. Analogously, the GNU compiler version 10.2 is addressed as compiler/gnu/10.2.
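
The naming convention can be illustrated with a few lines of plain shell. This is a minimal sketch, not part of Lmod: the helper split_module_name is a hypothetical function that splits an identifier of the form category/name/version into its three parts and reports a missing version as "(default)".

```shell
# Hypothetical helper: split "category/name/version" into its parts.
split_module_name() {
    full=$1
    category=${full%%/*}        # text before the first slash
    rest=${full#*/}             # text after the first slash
    name=${rest%%/*}            # text up to the next slash
    case $rest in
        */*) version=${rest#*/} ;;      # a version part is present
        *)   version="(default)" ;;     # no version given
    esac
    printf '%s %s %s\n' "$category" "$name" "$version"
}

split_module_name chem/gaussian/g16.C.01
split_module_name compiler/gnu/10.2
split_module_name compiler/intel
```

The last call shows an identifier without a version, which Lmod resolves to the version marked as default.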

See, for example, all modules of category chem with:

ml av chem
---------------------------------------------- /opt/bwhpc/common/modulefiles/Core----------------------------------------------------------------------------------
   chem/adf/2019.304               chem/gaussian/g16.C.01             chem/molpro/2020.1        (D)    chem/orca/5.0.1-xtb-6.4.1                 chem/tmolex/4.6            (D)
   chem/ams/2020.101               chem/gaussview/6.1.1               chem/namd/2.14                   chem/orca/5.0.1                    (D)    chem/turbomole/7.4.1
   chem/ams/2020.103               chem/gromacs/2020.2                chem/nbo/6.0.18_i4               chem/quantum_espresso/6.5                 chem/turbomole/7.5         (D)
   chem/ams/2021.102        (D)    chem/gromacs/2020.4                chem/nbo/6.0.18_i8        (D)    chem/quantum_espresso/6.7_openmp-5        chem/vasp/5.4.4.3.16052018
   chem/cfour/2.1_openmpi          chem/gromacs/2021.1         (D)    chem/openbabel/3.1.1             chem/quantum_espresso/6.7          (D)    chem/vmd/1.9.3
   chem/cp2k/7.1                   chem/jmol/14.31.3                  chem/openmolcas/19.11            chem/schrodinger/2020-2                   chem/xtb/6.3.3
   chem/cp2k/8.0_devel      (D)    chem/lammps/stable_3Mar2020        chem/openmolcas/21.06     (D)    chem/schrodinger/2021-1            (D)    chem/xtb/6.4.1             (D)
   chem/dalton/2020.0              chem/molcas/8.4                    chem/openmolcas/21.10            chem/siesta/4.1-b4
   chem/dftbplus/20.2.1-cpu        chem/molden/5.9                    chem/orca/4.2.1-xtb-6.3.3        chem/siesta/4.1.5                  (D)
   chem/gamess/2020.2              chem/molpro/2019.2.3               chem/orca/4.2.1                  chem/tmolex/4.5.2

or, analogously, all available versions of the Intel compilers:

ml av compiler/intel
--------------------------------------------------------------------------------/opt/bwhpc/common/modulefiles/Core--------------------------------------------------------------
   compiler/intel/19.0    compiler/intel/19.1    compiler/intel/19.1.2 (D)

Load specific software

module load <module_name>

or, in short,

ml <module_name>

For example, to load Gaussian version 16, run

ml chem/gaussian/g16.C.01

List of the loaded modules

module list

or simply

ml

Default module version

If multiple versions of a software package exist, one version is always predetermined as the default. To address the default version, the version can be omitted from the module identifier. For example, the default Intel compiler module is loaded via

ml compiler/intel
ml

Currently Loaded Modules:
  1) compiler/intel/19.1.2

Unload a specific software from the environment

module unload <module_name>

or equivalently

ml -<module_name>

For example, to unload the previously loaded VASP module chem/vasp/5.4.4.3.16052018, use

ml -chem/vasp/5.4.4.3.16052018

Unload all the loaded modules

module purge

or

ml purge

Providing a specific help for a particular module

module help <module_name>

or

ml help <module_name>

Software job examples and batch script templates

The majority of the software modules provide examples, including example batch scripts for the Slurm job queueing system. The full path to the directory with the examples is normally contained in the <SOFTWARE_NAME>_EXA_DIR environment variable. For example, after the module is loaded, the examples for GROMACS 2021.1 can be located as follows:

ml chem/gromacs/2021.1
echo $GROMACS_EXA_DIR/
/opt/bwhpc/common/chem/gromacs/2021.1-openmpi-4.0/bwhpc-examples/
ls $GROMACS_EXA_DIR
GROMACS_TestCaseA  Performance-Tuning-and-Optimization-of-GROMACS.pdf  README
ls $GROMACS_EXA_DIR/GROMACS_TestCaseA/
gromacs-2021.1_gpu.slurm  gromacs-2021.1.slurm  ion_channel.tpr

Users may copy these examples and use them as templates for their own job scripts:

cp $GROMACS_EXA_DIR/GROMACS_TestCaseA/gromacs-2021.1_gpu.slurm .

Note: All batch script examples are fully functional, i.e. the example scripts can be submitted directly to the queueing system to launch a test job. Typically, the scripts launch a short, simple calculation with the given software. Moreover, most of the sbatch scripts contain general submission instructions as well as hints specific to the particular program.
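
Copying a whole example directory can be wrapped in a small helper that first checks whether the directory exists, which catches the common case of a module that has not been loaded yet. The following is a minimal sketch: the function copy_example_dir and the destination directory gromacs_test are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: copy an example directory into a working directory.
copy_example_dir() {
    src=$1     # e.g. "$GROMACS_EXA_DIR/GROMACS_TestCaseA"
    dest=$2    # destination directory, created if missing
    if [ -z "$src" ] || [ ! -d "$src" ]; then
        echo "example directory not found; is the module loaded?" >&2
        return 1
    fi
    mkdir -p "$dest" && cp -r "$src"/. "$dest"/
}

# Only attempt the copy if the GROMACS module has set its example variable.
if [ -n "${GROMACS_EXA_DIR:-}" ]; then
    copy_example_dir "$GROMACS_EXA_DIR/GROMACS_TestCaseA" "$HOME/gromacs_test"
fi
```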

Searching through module names

module available <module_name>

or, in short,

ml av <module_name>

For example, a search for Python modules is performed via

ml av python

with the following output:

----------------------------------------------------------------------------------- /opt/bwhpc/common/modulefiles/Core ------------------------------------------------------------------------------------
   devel/python/3.8.3    lib/python_matplotlib/3.2.2_numpy-1.19.0_python-3.8.3    numlib/python_numpy/1.19.0_python-3.8.3    numlib/python_scipy/1.5.0_numpy-1.19.0_python-3.8.3

Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys"

What does this software do? A command for software you do not know

ml whatis <modulename>

provides a short description of the software package.

Finding detailed information about a specific module

module spider <searching_pattern>

or just

ml spider <searching_pattern>

Extended searching through entire module system

module keyword <searching_pattern>

For example, to find out which modules contain the FFTW library:

ml keyword fftw

which gives the following output:

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

The following modules match your search criteria: "fftw"
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

  numlib/mkl: numlib/mkl/2019, numlib/mkl/2020, numlib/mkl/2020.2

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Best Practices when working with modules

Always load modules with the entire module name

The software stack is updated regularly. Adding a new software version usually changes which version is marked as the default. Newer software is not always backwards compatible with existing scripts, workflows, or even input files. Therefore it is strongly recommended not to load modules by category and software name alone. Instead, always use the entire module name (including the version) to make sure the same module is loaded each time.

Load only those modules that are needed for the current application

Only load modules that are needed for the current script or workflow you are running, to reduce the chance of unexpected behavior caused by module conflicts.

Do not use module commands in .bashrc, .bash_profile etc. scripts

Avoid including "module load" commands in your .bashrc or .bash_profile files. As an alternative, create a bash script with the module load commands and source it whenever you need to load those modules.
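
Such a script could look like the following minimal sketch. The file name modules_gaussian.sh and the function name load_gaussian_modules are assumptions made for illustration; the module name is the Gaussian module used elsewhere in this guide.

```shell
# Contents of a hypothetical file ~/modules_gaussian.sh (the name is an assumption).
# Source it in the current shell with:  . ~/modules_gaussian.sh
load_gaussian_modules() {
    module purge                          # start from a clean environment
    module load chem/gaussian/g16.C.01    # full name, including the version
    module list                           # show what is active now
}

# Run only where the module command exists, so sourcing elsewhere is harmless.
if command -v module >/dev/null 2>&1; then
    load_gaussian_modules
fi
```

The script has to be sourced rather than executed, because only then do the environment changes apply to the current shell.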

Use 'module help' command

See the section "Providing a specific help for a particular module" (https://wiki.bwhpc.de/e/Software_Modules_Lmod#Providing_a_specific_help_for_a_particular_module).

Check content of $<SOFTWARE_NAME>_EXA_DIR folder

See the section "Software job examples and batch script templates" above.

Use 'ml purge' in sbatch scripts before the first 'ml load'

The environment in effect at the time the sbatch, salloc, or srun command is executed is propagated to the spawned processes, i.e. also to the job script. Consequently, if a module is loaded at the time 'sbatch <job-script>' is executed, its "loaded" state as well as the values of the environment variables it set are propagated with the job.

Thus, consider putting the 'ml purge' command as the first module command in your job scripts. This can prevent a variety of module conflicts.

Imagine, for example, the following scenario.

On the login node:

ml compiler/intel/19.1.2
salloc --nodes=1 --ntasks-per-node=1

... waiting for the allocation of the resources ...
Once on the compute node, execute

ml compiler/gnu/10.2

The load of the compiler/gnu/10.2 module on the compute node fails with the following error:

Lmod has detected the following error:  Cannot load module "compiler/gnu/10.2" because these module(s) are loaded:
   compiler/intel

While processing the following module(s):
    Module fullname    Module Filename
    ---------------    ---------------
    compiler/gnu/10.2  /opt/bwhpc/common/modulefiles/Core/compiler/gnu/10.2.lua

[ul_l_tkz12@login02 ~]$ ml

Currently Loaded Modules:
  1) compiler/intel/19.1.2
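
A job script that follows this recommendation could be laid out as in the minimal sketch below. The file name demo_job.sbatch, the resource requests, and the chosen module are assumptions for illustration only. The snippet merely writes the job script to the current directory and does not submit it.

```shell
# Write a hypothetical job script; the quoted 'EOF' keeps the content literal.
cat > demo_job.sbatch <<'EOF'
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --time=00:10:00

# Drop every module inherited from the submitting shell first.
module purge
# Then load exactly what the job needs, with full module names.
module load chem/gaussian/g16.C.01
module list
EOF

echo "Submit with: sbatch demo_job.sbatch"
```

Because the purge comes first, the job starts from the same module state regardless of what was loaded in the shell that ran sbatch.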

Useful Extras

Conflicts between modules

Some modules cannot be loaded at the same time. For example, two different versions of the same package cannot be active simultaneously. Modules may have this restriction built in. In such circumstances, Lmod either prints an error message during loading and loads no module, or it reloads the module: the old module is unloaded and only the new module becomes active.

Example with two versions of the Intel compiler (module reload):

ml compiler/intel/19.1
ml

Currently Loaded Modules:
  1) compiler/intel/19.1
ml compiler/intel/19.1.2 

The following have been reloaded with a version change:
  1) compiler/intel/19.1 => compiler/intel/19.1.2
ml

Currently Loaded Modules:
  1) compiler/intel/19.1.2



Example with two different compilers, Intel and GNU: loading the GNU module triggers a module conflict error, and the new module is not loaded:

ml compiler/intel/19.1.2
ml

Currently Loaded Modules:
  1) compiler/intel/19.1.2
ml compiler/gnu/10.2 
Lmod has detected the following error:  Cannot load module "compiler/gnu/10.2" because these module(s) are loaded:
   compiler/intel

While processing the following module(s):
    Module fullname    Module Filename
    ---------------    ---------------
    compiler/gnu/10.2  /opt/bwhpc/common/modulefiles/Core/compiler/gnu/10.2.lua
ml

Currently Loaded Modules:
  1) compiler/intel/19.1.2


Solution: the Intel compiler must be unloaded prior to loading the GNU module.

ml -compiler/intel/19.1.2
ml compiler/gnu/10.2
ml

Currently Loaded Modules:
  1) compiler/gnu/10.2
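
In scripts it can be useful to check for a potentially conflicting module before attempting a load. The following is a minimal sketch, assuming that the module system keeps the colon-separated list of loaded modules in the LOADEDMODULES environment variable; the helper is_module_loaded is a hypothetical function, not an Lmod command.

```shell
# Hypothetical helper: succeed if module $1 (with or without version) is loaded.
# Assumes LOADEDMODULES holds a colon-separated list of full module names.
is_module_loaded() {
    case ":${LOADEDMODULES:-}:" in
        *":$1:"*|*":$1/"*) return 0 ;;   # exact name, or name followed by a version
        *)                 return 1 ;;
    esac
}

if is_module_loaded compiler/intel; then
    echo "compiler/intel is loaded; unload it before loading compiler/gnu"
else
    echo "compiler/intel is not loaded"
fi
```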

Module dependencies (why is there no MPI module?)

Some modules depend on other modules. Typically, many modules depend on an MPI library, an MPI library depends on a compiler, etc. The user does not need to care about these fundamental dependencies: the majority of modules automatically load all packages they depend on. However, there is one prominent exception, the MPI library. While most of the installed parallel applications exist for only one compiler-MPI combination, a variety of MPI libraries of the same version are built with different compilers. For example, there are two sets of OpenMPI 4.0 modules, one for the Intel and one for the GNU compilers. Thus, a user who wants to load a specific MPI library must choose (load) a particular compiler prior to loading the MPI module. Note that the MPI modules also remain "invisible" to the "module av <mpi_name>" command until a compiler is loaded. This is due to the module hierarchy of Lmod. More details about the hierarchy are given below in https://wiki.bwhpc.de/e/Software_Modules_Lmod#Semi_hierarchical_layout_of_modules_on_JUSTUS_2.


Consequences of the partial module hierarchy for mpi modules

MPI modules remain invisible to the user (e.g. in the output of module avail) until a compiler module has been loaded. Once a compiler module has been activated, the corresponding MPI modules, i.e. those built with that particular compiler, become visible.

E.g., with an initially empty list of loaded modules, the module command

module avail

or its shorthand analogue

ml av

displays no MPI modules. After running

ml compiler/intel/19.1
ml av

the MPI packages compatible with the Intel 19.1 compiler become visible

------------ /opt/bwhpc/common/modulefiles/Compiler/intel/19.1 --------------------
   mpi/impi/2019.7    mpi/openmpi/4.0

in the list of the available software.
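
The required load order, compiler first and MPI second, can be captured in a small wrapper. This is only a minimal sketch: the function load_mpi_stack is a hypothetical helper, and the module names are the ones shown in the listing above.

```shell
# Hypothetical helper: load a compiler and then an MPI library built with it.
load_mpi_stack() {
    compiler=$1    # e.g. compiler/intel/19.1
    mpi=$2         # e.g. mpi/openmpi/4.0
    # The MPI module only becomes visible after the compiler is loaded,
    # so stop early if the compiler load fails.
    module load "$compiler" || return 1
    module load "$mpi"
}

# Run only where the module command exists.
if command -v module >/dev/null 2>&1; then
    load_mpi_stack compiler/intel/19.1 mpi/openmpi/4.0
    module list
fi
```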

Online user guide of Lmod

The complete user guide can be found on the Lmod website: https://lmod.readthedocs.io/en/latest/010_user.html

Additional Module System tasks

Lmod offers more than 25 sub-commands plus various options to manage the modulefile system installed on JUSTUS 2. The large majority of users will use only a couple of them. A complete list of module sub-commands can be displayed by entering the "module --help" command or found in the Lmod online documentation. The following text lists only a couple of them.

Other topics

Which shells support module commands?

So far, Bash is the only shell supported on JUSTUS 2 for interpreting module commands.

Semi hierarchical layout of modules on JUSTUS 2

Module hierarchy in Lmod

The structure of software modules on JUSTUS 2 follows a "semi" hierarchical layout. This is slightly different from what can be seen on other HPC systems with a "full" hierarchical structure. Typical systems with a full hierarchy put compiler modules (e.g., intel, gcc) on the uppermost (Core) level, libraries depending on them (e.g., MPI) on the second level, and libraries depending on those on a third level. As a consequence, not all modules contained in the module system are initially visible, namely those placed in the second and third layers. Only after a compiler module is loaded do the modules of the second layer that directly depend on that compiler become available. Similarly, loading an MPI module makes the modules of the third layer that depend on the loaded MPI library visible.

Semi hierarchy of software stack on JUSTUS 2

JUSTUS 2 adopted the hierarchical module layout only partially. In particular, only the "Core" and the "second" level are present, and the second level contains only MPI modules. All other modules, e.g. those from the "chem" category such as vasp, turbomole, or gaussian, or those in the "numlib" category such as mkl or python_numpy, are located in the "Core" level.

Module dependency

The adopted hierarchy model is not the only tool for handling module dependencies. As a matter of fact, most of the modules on JUSTUS 2 require functionality provided by other modules, even though those are located in the "Core" level. Such provisioning is implemented in a modulefile in one of two ways: either automatically, without any action from the user (the dependent modulefile loads all additional modules while it is being loaded), or the dependent modulefile informs the user that additional modules must be pre-loaded if they have not been activated yet (in this case the user must repeat the load operation). Which solution is applied is decided by the person who built the particular module.

An example of a module with automated pre-loading is the orca module. Starting from an empty list of loaded modules, i.e. when

ml

shows

No modules loaded

the command sequence

ml chem/orca
ml

shows

Currently Loaded Modules:
  1) compiler/intel/19.1   2) chem/orca/4.2.1 

I.e., loading the Intel compiler is built into the orca module.

Complete list of Lmod options and sub-commands

The whole list of module options and all commands available can be displayed by running

man module

or

module --help

How do Modules work?

The default shell on the bwHPC clusters is bash, so explanations and examples are shown for bash. In general, programs cannot modify the environment of the shell they are run from, so how can the module command do exactly that?
The module command is not a program but a bash function. You can view its content using:

type module

and you will get the following result:

type module
module is a function
module ()
{
    eval $($LMOD_CMD bash "$@");
    [ $? = 0 ] && eval $(${LMOD_SETTARG_CMD:-:} -s sh)
}

In this function, lmod is called. Its output to stdout is then executed inside your current shell using the bash-internal eval command. As a consequence, all output that you see from the module is transmitted via stderr (output handle 2) or in so
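
The eval mechanism can be tried out without Lmod. In the minimal sketch below, fake_lmod is a hypothetical stand-in for $LMOD_CMD, and DEMO_TOOL_HOME is an invented variable: the function only prints shell code on stdout and a user message on stderr, and the calling shell changes its own environment by evaluating that output.

```shell
unset DEMO_TOOL_HOME

# Hypothetical stand-in for $LMOD_CMD: shell code on stdout, messages on stderr.
fake_lmod() {
    echo "export DEMO_TOOL_HOME=/opt/demo_tool;"
    echo "loading demo_tool" >&2
}

# Merely running the generator changes nothing in the current shell ...
fake_lmod >/dev/null 2>&1
echo "before eval: ${DEMO_TOOL_HOME:-unset}"

# ... but evaluating its stdout does; the stderr message still reaches the user.
eval "$(fake_lmod)"
echo "after eval:  ${DEMO_TOOL_HOME:-unset}"
```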

