Software Modules Lmod: Difference between revisions
Software Module System - Lmod |
Software Module System - Lmod |
||
= |
= Preface = |
||
This guide provides a general overview of and introduction to software management via Lmod on JUSTUS 2, both for new users and for experienced users coming, e.g., from sites that use a different environment modules system.
||
= |
= Scientific Software management through the module system Lmod = |
||
The following sections cover the basic module commands needed to find and load the scientific applications installed on JUSTUS 2.<br>
|||
JUSTUS 2 runs the Linux operating system. The standard Linux packages are installed on the front-end nodes (login and visualisation nodes).
||
The '''scientific software is accessible via a so-called module system'''.<br>
||
'''To find and load a scientific application, one needs to use module commands.''' For example, the following command sequence:
|||
<pre> |
|||
module load chem/gaussian/g16.C.01 |
|||
module list |
|||
module help chem/gaussian/g16.C.01 |
|||
cp $GAUSSIAN_EXA_DIR/bwforcluster-gaussian-example.sbatch . |
|||
</pre> |
|||
<pre> |
|||
1. loads the module with gaussian software package of version 16, revision C.01 |
|||
2. prints out the list of currently loaded modules
|||
3. provides the user help for the particular gaussian module |
|||
4. copies the template batch script which was specifically designed for submission of g16 jobs into SLURM workload manager on JUSTUS 2. |
|||
</pre> |
|||
=== Why do we use a module system? Modules load scientific software ===
|||
The module system on JUSTUS 2 is managed by '''[https://lmod.readthedocs.io/en/latest/ Lmod]''' (https://lmod.readthedocs.io/en/latest/). |
|||
The module system incorporates the majority of the computational software available, including compilers, MPI libraries, numerical libraries, computational chemistry packages, and Python-specific libraries.<br><br>'''Programs managed by the module system are not usable by default. A program has to be "loaded" to become executable.'''
||
<br><br> |
|||
The main functionality of Lmod is to load modules by means of
|||
Using the '''module system''' provides, among others, the following '''functionalities''':

'''1.''' When '''loading a module''', it automatically '''sets the appropriate environment variables required by the application to run properly'''.<br>

'''2.''' It also takes care of '''module dependencies'''. It either loads all additional modules required for the application, or it informs the user if additional dependency modules need to be loaded manually.<br>

'''3.''' It '''prevents the loading of modules''' that would be '''in conflict''' and could '''cause instability or unexpected behavior'''.<br>

<br>

One of the main functions of Lmod is '''module load''', which makes the variety of software packages pre-installed on the cluster accessible. '''This takes only a single command''':
|||
<br> |
<br> |
||
<pre> |
<pre> |
||
module load <module_name> |
|||
</pre> |
|||
The activation is realized by '''dynamically modifying the user's shell environment'''. This mainly means '''adding new paths''' to the bin directories of the specific software '''to the PATH environment variable'''. Typically, Lmod modifies '''PATH''' and '''LD_LIBRARY_PATH''' and '''sets new variables''' such as '''<SOFTWARE_NAME>_EXA_DIR''', which contains the path to the directory with examples for the specific software.
|||
<br><br> |
|||
Example: compare the content of $PATH environmental variable before and after the load of the gaussian module: |
|||
<br>Before the load of gaussian module: |
|||
<pre> |
|||
echo $PATH |
|||
/home/software/common/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin |
|||
</pre> |
|||
<pre> |
|||
which g16 |
|||
/usr/bin/which: no g16 in (/home/software/common/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin) |
|||
</pre> |
|||
<pre> |
|||
ml chem/gaussian |
|||
</pre> |
|||
After gaussian is loaded: |
|||
<pre> |
|||
echo $PATH |
|||
/.../chem/gaussian/g16.C.01/x86_64-Intel-avx2-source/g16/bsd:/.../chem/gaussian/g16.C.01/x86_64-Intel-avx2-source/g16:/home/software/common/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin |
|||
</pre> |
|||
<pre> |
|||
which g16 |
|||
/.../chem/gaussian/g16.C.01/x86_64-Intel-avx2-source/g16/g16 |
|||
</pre> |
|||
=== Basic functions of Lmod and commands === |
|||
<br>The module system has other useful capabilities than just managing the environment.<br><br>
|||
''' (i) to list available software'''<br> |
|||
<pre> |
|||
module available |
|||
</pre> |
|||
'''<br> (ii) to load (activate) modules (particular software)'''<br> |
|||
<pre> |
|||
module load |
|||
</pre> |
|||
'''<br> (iii) to unload (deactivate) modules (particular software)'''<br>
|||
<pre> |
|||
module unload |
|||
</pre> |
|||
<pre> |
|||
module purge |
|||
</pre> |
|||
''' (iv) to list currently loaded modules'''<br> |
|||
<pre> |
|||
module list |
|||
</pre> |
|||
'''<br> (v) to search through all packages within the module system'''<br> |
|||
<pre> |
|||
module available |
|||
</pre> |
|||
<pre> |
|||
module spider |
|||
</pre> |
|||
<pre> |
|||
module keyword |
|||
</pre> |
|||
'''<br> (vi) to provide specific help information for a particular module<br>'''
|||
<pre> |
|||
module help |
|||
</pre> |
|||
<pre> |
|||
module whatis |
|||
</pre> |
</pre> |
||
command. The activation is realized by modifying the shell environment, including adding new paths to the bin directories of the specific software to the PATH environment variable.
|||
Although, the module system has other useful capabilities. |
|||
Basic functions of Lmod are: |
|||
'''<br> (i) to list available software''' |
|||
'''<br> (ii) to load (activate) modules (particular software)''' |
|||
'''<br> (iii) to search through all packages within the module system''' |
|||
'''<br> (iv) to provide specific help information for a particular module system<br>''' |
|||
== Elementary Lmod Commands == |
== Elementary Lmod Commands == |
||
The module commands below can be used interactively (in the current shell session) as well as in shell scripts, in particular in the sbatch scripts used to submit computational jobs to SLURM (the workload manager on JUSTUS 2).
|||
=== List of available software === |
|||
=== List of available software modules === |
|||
<pre> |
<pre> |
||
module available |
|||
</pre> |
</pre> |
||
alternatively also in a short form |
alternatively also in a short form |
||
<pre> |
<pre> |
||
ml av |
|||
</pre> |
</pre> |
||
Line 62: | Line 147: | ||
Each category is further divided according to software packages, and those in turn according to software versions.
||
<br> |
<br> |
||
The full '''name of a module''' always consists of three parts, '''category, name, and version''', separated by slashes: '''category/name/version'''. Consequently, the full name of the module with the Gaussian 16 package is '''chem/gaussian/g16.C.01'''. Analogously, the GNU compiler version 10.2 is addressed as '''compiler/gnu/10.2'''.
||
<br> |
<br><br> |
||
See, for example, all modules of category chem with: |
See, for example, all modules of category chem with: |
||
<pre> |
<pre> |
||
Line 92: | Line 177: | ||
</pre> |
</pre> |
||
=== Load |
=== Load specific software === |
||
<pre> |
<pre> |
||
module load <module_name> |
|||
</pre> |
</pre> |
||
or shortly |
or shortly |
||
<pre> |
<pre> |
||
ml <module_name> |
|||
</pre> |
</pre> |
||
For example, to load Gaussian 16, run
||
<pre> |
<pre> |
||
ml chem/gaussian/g16.C.01 |
|||
</pre> |
|||
=== List of the loaded modules === |
|||
<pre> |
|||
module list |
|||
</pre> |
|||
or simply |
|||
<pre> |
|||
ml |
|||
</pre> |
</pre> |
||
Line 121: | Line 214: | ||
=== Unload a specific software from the environment === |
=== Unload a specific software from the environment === |
||
<pre> |
<pre> |
||
module unload <module_name> |
|||
</pre> |
</pre> |
||
or equivalently |
or equivalently |
||
<pre> |
<pre> |
||
ml -<module_name> |
|||
</pre> |
</pre> |
||
For example, to unload the previously loaded VASP module chem/vasp/5.4.4.3.16052018, use
||
<pre> |
<pre> |
||
ml -chem/vasp/5.4.4.3.16052018 |
|||
</pre> |
</pre> |
||
=== Unload all the loaded modules === |
=== Unload all the loaded modules === |
||
Line 138: | Line 231: | ||
or |
or |
||
<pre> |
<pre> |
||
ml purge |
|||
</pre> |
</pre> |
||
=== Providing a specific help for a particular module === |
=== Providing a specific help for a particular module === |
||
<pre> |
<pre> |
||
module help <module_name> |
|||
</pre> |
</pre> |
||
or |
or |
||
<pre> |
<pre> |
||
ml help <module_name> |
|||
</pre> |
</pre> |
||
=== Software examples and batch script templates === |
=== Software job examples and batch script templates === |
||
The majority of the software modules provide examples, including job queueing system examples (batch scripts) for Slurm. The full path to the directory with examples is normally contained in the <SOFTWARE_NAME>_EXA_DIR environment variable. For example, after loading the module, the examples for Gromacs-2021.1 can be located as follows:
||
<pre> |
<pre> |
||
ml chem/gromacs/2021.1 |
|||
</pre> |
</pre> |
||
<pre> |
<pre> |
||
Line 167: | Line 260: | ||
gromacs-2021.1_gpu.slurm gromacs-2021.1.slurm ion_channel.tpr |
gromacs-2021.1_gpu.slurm gromacs-2021.1.slurm ion_channel.tpr |
||
</pre> |
</pre> |
||
Users may make a copy of these examples and use them as templates for their own job scripts:
|||
=== List of the loaded modules === |
|||
<pre> |
<pre> |
||
cp $GROMACS_EXA_DIR/GROMACS_TestCaseA/gromacs-2021.1_gpu.slurm . |
|||
$ module list |
|||
</pre> |
</pre> |
||
'''Note: All the batch script examples are fully functional, i.e. the example scripts can be submitted directly to the queuing system to launch a test job.'''
|||
or simply |
|||
Typically, the scripts launch a short, simple calculation of the given software. Moreover, most of the sbatch scripts contain |
|||
<pre> |
|||
general submit instructions, as well as hints specific for the particular program. |
|||
$ ml |
|||
</pre> |
|||
=== |
=== Searching through module names === |
||
<pre> |
<pre> |
||
module available <module_name> |
|||
</pre> |
</pre> |
||
or shortly |
or shortly |
||
<pre> |
<pre> |
||
ml av <module_name> |
|||
</pre> |
</pre> |
||
For example, searching for python modules is realized via |
For example, searching for python modules is realized via |
||
<pre> |
<pre> |
||
ml av python |
|||
</pre> |
</pre> |
||
with the following output: |
with the following output: |
||
Line 196: | Line 288: | ||
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys" |
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys" |
||
</pre> |
</pre> |
||
=== What does this software do? Command when you don't know this software === |
|||
<pre> |
|||
ml whatis <modulename> |
|||
</pre> |
|||
provides a short description of the software package.
|||
=== Finding detailed information about a specific module === |
=== Finding detailed information about a specific module === |
||
<pre> |
<pre> |
||
module spider <searching_pattern> |
|||
</pre> |
</pre> |
||
or just |
or just |
||
<pre> |
<pre> |
||
ml spider <searching_pattern> |
|||
</pre> |
</pre> |
||
=== Extended searching through entire module system === |
=== Extended searching through entire module system === |
||
<pre> |
<pre> |
||
module keyword <searching_pattern> |
|||
</pre> |
</pre> |
||
For example, to find out which modules contain fftw library: |
For example, to find out which modules contain fftw library: |
||
Line 223: | Line 322: | ||
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
||
</pre> |
|||
== Best Practices when working with modules == |
|||
=== Always load modules with the entire module name === |
|||
The software stack is updated regularly. Adding a new software version usually changes which version is marked as the default.

Newer software is not always backwards compatible with existing scripts, workflows, or even input files.

Therefore, it is strongly recommended not to load modules by category and software name alone. Instead, always use the entire module name (including the version) to make sure the same module is loaded each time.
|||
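A script can guard against accidentally relying on the default version. The following is only a sketch: the helper name `is_full_module_name` is hypothetical, and it assumes that full names always have exactly the three parts category/name/version, as described for JUSTUS 2.

```shell
# Succeed only if the argument has the form category/name/version.
# (Hypothetical helper, not part of Lmod.)
is_full_module_name() {
    case "$1" in
        */*/*/*) return 1 ;;   # more than three parts
        */*/?*)  return 0 ;;   # category/name/version
        *)       return 1 ;;   # version (or more) is missing
    esac
}

for m in chem/vasp/5.4.4.3.16052018 compiler/intel; do
    if is_full_module_name "$m"; then
        echo "$m: full name"
    else
        echo "$m: version missing"
    fi
done
```

In a job script, such a check could run before each load command, so that a name without a version is reported instead of silently picking up a new default.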
=== Load only those modules that are needed for the current application === |
|||
Only load modules that are needed for the current script or workflow you are running, to reduce the chance of unexpected behavior caused by module conflicts. |
|||
A '''typical error''' sometimes seen on the cluster when loading the vasp module is this:
|||
<pre> |
|||
ml compiler/intel/19.1.2 |
|||
ml mpi/impi/2019.8 |
|||
ml chem/vasp/5.4.4.3.16052018 |
|||
</pre> |
|||
'''The correct way''' is simply:
|||
<pre> |
|||
ml chem/vasp/5.4.4.3.16052018 |
|||
</pre> |
|||
=== Do not use module commands in .bashrc, .bash_profile etc. scripts === |
|||
Avoid including “module load” commands in your .bashrc or .bash_profile files. As an alternative, create a bash script with the module load commands and source it whenever you need those modules.
|||
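The alternative described above can be sketched as follows. The setup file is created in a temporary location purely for illustration (in practice it would be a file of your choice in your home directory), and the fallback `module` function is an assumption added only so that the sketch also runs on a machine without Lmod.

```shell
# Hypothetical setup file holding the module commands for one workflow.
setup_file="$(mktemp)"
cat > "$setup_file" <<'EOF'
module purge
module load compiler/gnu/10.2
EOF

# Fallback for machines without Lmod (illustration only).
if ! command -v module >/dev/null 2>&1; then
    module() { echo "[no Lmod here] module $*"; }
fi

# Source the file explicitly whenever these modules are needed.
. "$setup_file"
```

Sourcing the file on demand keeps login shells clean and makes it obvious which modules a given workflow depends on.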
=== Use 'module help' command === |
|||
see [[Software_Modules_Lmod#Providing_a_specific_help_for_a_particular_module]] |
|||
=== Check content of $<SOFTWARE_NAME>_EXA_DIR folder === |
|||
see [[//wiki.bwhpc.de/e/Software_Modules_Lmod#Software_job_examples_and_batch_script_templates]] |
|||
=== Use 'ml purge' in sbatch scripts before the first 'ml load' === |
|||
The environment in effect at the time the sbatch, salloc, or srun command is executed is propagated to the spawned processes, i.e. also to the job script. Consequently, if a module is loaded when the 'sbatch <job-script>' command is executed, its "loaded" state, as well as the values of the environment variables it set, will be propagated with the job.<br><br>
|||
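The same inheritance can be observed with any exported variable and a child process. This sketch does not involve SLURM or Lmod; `DEMO_LOADED_MODULE` is a hypothetical variable used only to illustrate how exported settings travel to spawned processes.

```shell
# An exported variable stands in for settings made by a loaded module.
export DEMO_LOADED_MODULE="compiler/intel/19.1.2"

# The child shell did not set the variable itself but still sees it.
sh -c 'echo "child sees: $DEMO_LOADED_MODULE"'
# prints: child sees: compiler/intel/19.1.2
```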
Thus, consider putting the 'ml purge' command as the first module command when designing your job scripts.

This can prevent a variety of module conflict situations.
|||
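A job script following this advice might start as sketched below. The resource requests are placeholders, the module name is taken from the examples in this guide, and the fallback `module` function is an assumption added only so the sketch can be tried on a machine without Lmod.

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1

# Fallback for machines without Lmod (illustration only).
if ! command -v module >/dev/null 2>&1; then
    module() { echo "[no Lmod here] module $*"; }
fi

module purge                              # drop whatever was inherited from the submitting shell
module load chem/vasp/5.4.4.3.16052018    # full module name, including the version
module list                               # record the loaded modules in the job output
```

Printing the module list at the start of the job output makes it easier to reconstruct later which software versions a calculation used.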
Consider, for example, the following scenario:
|||
<pre> |
|||
On the login node: |
|||
ml compiler/intel/19.1.2 |
|||
salloc --nodes=1 --ntasks-per-node=1 |
|||
... waiting for the allocation of the resources ... |
|||
Once on the compute node execute |
|||
ml compiler/gnu/10.2 |
|||
</pre> |
|||
Loading the compiler/gnu/10.2 module on the compute node fails with the following error:
|||
<pre> |
|||
Lmod has detected the following error: Cannot load module "compiler/gnu/10.2" because these module(s) are loaded: |
|||
compiler/intel |
|||
While processing the following module(s): |
|||
Module fullname Module Filename |
|||
--------------- --------------- |
|||
compiler/gnu/10.2 /opt/bwhpc/common/modulefiles/Core/compiler/gnu/10.2.lua |
|||
[ul_l_tkz12@login02 ~]$ ml |
|||
Currently Loaded Modules: |
|||
1) compiler/intel/19.1.2 |
|||
</pre> |
</pre> |
||
Line 229: | Line 394: | ||
Some modules cannot be loaded at the same time. For example, two different versions of the same package cannot be active simultaneously. Modules may have this restriction built in. In such circumstances, Lmod either prints an error message during loading and loads no module, or it reloads the module: the old module is unloaded and only the new module becomes active.<br><br>
||
'''Example of two versions of the intel compiler - module reload:''' |
|||
<pre> |
|||
ml compiler/intel/19.1 |
|||
</pre> |
|||
<pre> |
|||
ml |
|||
Currently Loaded Modules: |
|||
1) compiler/intel/19.1 |
|||
</pre> |
|||
<pre> |
|||
ml compiler/intel/19.1.2 |
|||
The following have been reloaded with a version change: |
|||
1) compiler/intel/19.1 => compiler/intel/19.1.2 |
|||
</pre> |
|||
<pre> |
|||
ml |
|||
Currently Loaded Modules: |
|||
1) compiler/intel/19.1.2 |
|||
</pre> |
|||
<br><br> |
|||
'''Example of two different compilers, intel and gnu: loading gnu triggers a module conflict error, and the new module is not loaded:'''
|||
<pre> |
|||
ml compiler/intel/19.1.2 |
|||
</pre> |
|||
<pre> |
|||
ml |
|||
Currently Loaded Modules: |
|||
1) compiler/intel/19.1.2 |
|||
</pre> |
|||
<pre> |
|||
ml compiler/gnu/10.2 |
|||
Lmod has detected the following error: Cannot load module "compiler/gnu/10.2" because these module(s) are loaded: |
|||
compiler/intel |
|||
While processing the following module(s): |
|||
Module fullname Module Filename |
|||
--------------- --------------- |
|||
compiler/gnu/10.2 /opt/bwhpc/common/modulefiles/Core/compiler/gnu/10.2.lua |
|||
</pre> |
|||
<pre> |
|||
ml |
|||
Currently Loaded Modules: |
|||
1) compiler/intel/19.1.2 |
|||
</pre> |
|||
<br> |
|||
'''Solution:''' the intel compiler must be unloaded before the gnu module is loaded.
|||
<pre> |
|||
ml -compiler/intel/19.1.2 |
|||
</pre> |
|||
<pre> |
|||
ml compiler/gnu/10.2 |
|||
</pre> |
|||
<pre> |
|||
ml |
|||
Currently Loaded Modules: |
|||
1) compiler/gnu/10.2 |
|||
</pre> |
|||
=== Module dependencies (why there is no mpi module?) === |
=== Module dependencies (why there is no mpi module?) === |
||
Some modules depend on other modules. Typically, many modules depend on an MPI library, the MPI library depends on a compiler, and so on. The user does not need
||
Line 237: | Line 467: | ||
Thus, a user who wants to load a specific MPI must choose (load) a particular compiler before loading the MPI module. Note that the MPI modules also remain "invisible" to the "module av <mpi_name>" command until a compiler is loaded. This is due to the module hierarchy of Lmod. More details about the hierarchy are given below in [[Software_Modules_Lmod#Semi_hierarchical_layout_of_modules_on_JUSTUS_2]]
||
Line 264: | Line 494: | ||
=== Online user guide of Lmod === |
=== Online user guide of Lmod === |
||
The complete user guide can be found on the Lmod website: [https://lmod.readthedocs.io/en/latest/010_user.html]
||
=== Additional Module System tasks === |
=== Additional Module System tasks === |
||
Lmod offers more than 25 sub-commands plus various options to manage the modulefile system installed on JUSTUS 2. The large majority of users will need only a few of them. A complete list of module sub-commands can be displayed with the "module --help" command or found in the [https://lmod.readthedocs.io/en/latest/010_user.html Lmod online documentation]. The following text covers only a few of them.
||
Line 286: | Line 517: | ||
An example of a module with automated pre-loading is the ''orca'' module. Starting with an empty list of loaded modules, i.e.
||
<pre> |
<pre> |
||
ml |
|||
</pre> |
</pre> |
||
shows |
shows |
||
Line 294: | Line 525: | ||
, the command sequence |
, the command sequence |
||
<pre> |
<pre> |
||
ml chem/orca |
|||
ml |
|||
</pre> |
</pre> |
||
shows |
shows |
||
Line 309: | Line 540: | ||
man module |
man module |
||
</pre> |
</pre> |
||
or |
|||
=== Module categories, versions and defaults === |
|||
Software stack on bwHPC systems is commonly classified into following categories: |
|||
<!--* [[:Category:Chemistry_software|chem]]--> |
|||
* chem |
|||
<!--* [[:Category:Compiler_software|compiler]]--> |
|||
* compiler |
|||
<!--* [[:Category:Debugger_software|devel]]--> |
|||
* devel |
|||
<!--* [[:Category:Libraries|lib]]--> |
|||
* lib |
|||
<!--* [[:Category:Numerical libraries|numlib]]--> |
|||
* numlib |
|||
<!--* [[:Category:Physics software|phys]]--> |
|||
* phys |
|||
<!--* [[:Category:System software|system]]--> |
|||
* system |
|||
<!--* [[:Category:Visualization|vis]]--> |
|||
* vis |
|||
<!--* [[:Category:Mathematical ecosystems|math]]--> |
|||
* math |
|||
Each category is further divided according to software packages, and those in turn according to software versions.
|||
Similarly, the module identifier has the format: ''category/softwarename/version'' |
|||
For instance, the ''gnu'' compiler version ''10.1'' is unambiguously addressed as ''compiler/gnu/10.1'' in, e.g., the load command:
|||
<pre> |
<pre> |
||
module --help |
|||
$ ml compiler/gnu/10.1 |
|||
</pre> |
</pre> |
||
<br> |
|||
If there are multiple versions of a software package, one version is always predetermined as the '''default''' version. To address the default version, ''version'' can be omitted from the module identifier.

For example, the default intel compiler module is loaded via
|||
<pre> |
|||
$ ml compiler/intel |
|||
</pre> |
|||
=== Selective searching === |
|||
It is possible to search only a selected category by adding the name of the category to the ''module avail'' or ''ml av'' command.

For example, to explore only the compilers, use
|||
<pre> |
|||
$ module avail compiler/ |
|||
</pre> |
|||
=== Extended searching through help documentation === |
|||
There is a keyword search tool, 'module keyword word1 word2'. With this command one can search through the help messages and whatis documentation. For example:
|||
<pre> |
|||
$ module keyword fftw |
|||
</pre> |
|||
prints all modules that contain the string "fftw" in their help or whatis descriptions.
|||
=== How do Modules work? === |
=== How do Modules work? === |
||
Line 368: | Line 551: | ||
You can view its content using: |
You can view its content using: |
||
<pre> |
<pre> |
||
type module |
|||
</pre> |
</pre> |
||
and you will get the following result: |
and you will get the following result: |
||
<pre> |
<pre> |
||
type module |
|||
module is a function |
module is a function |
||
module () |
module () |
||
Line 384: | Line 567: | ||
<br> |
<br> |
||
---- |
---- |
||
[[Category:bwForCluster_JUSTUS_2|JUSTUS 2]] |
|||
[[#top|Back to top]] |
[[#top|Back to top]] |
Latest revision as of 14:38, 10 October 2024
Software Module System - Lmod
Preface
This guide provides a general overview of and introduction to software management via Lmod on JUSTUS 2, both for new users and for experienced users coming, e.g., from sites that use a different environment modules system.
Scientific Software management through the module system Lmod
The following sections cover the basic module commands needed to find and load the scientific applications installed on JUSTUS 2.
JUSTUS 2 runs the Linux operating system. The standard Linux packages are installed on the front-end nodes (login and visualisation nodes).
The scientific software is accessible via a so-called module system.
To find and load a scientific application, one needs to use module commands. For example, the following command sequence:
module load chem/gaussian/g16.C.01
module list
module help chem/gaussian/g16.C.01
cp $GAUSSIAN_EXA_DIR/bwforcluster-gaussian-example.sbatch .
1. loads the module with the gaussian software package of version 16, revision C.01
2. prints out the list of currently loaded modules
3. provides the user help for the particular gaussian module
4. copies the template batch script which was specifically designed for submission of g16 jobs to the SLURM workload manager on JUSTUS 2.
Why do we use a module system? Modules load scientific software
The module system on JUSTUS 2 is managed by Lmod (https://lmod.readthedocs.io/en/latest/).
The module system incorporates the majority of the computational software available, including compilers, MPI libraries, numerical libraries, computational chemistry packages, and Python-specific libraries.
Programs managed by the module system are not usable by default. A program has to be "loaded" to become executable.
Using the module system provides, among others, the following functionalities:
1. When loading a module, it automatically sets the appropriate environment variables required by the application to run properly.
2. It also takes care of module dependencies. It either loads all additional modules required for the application, or it informs the user if additional dependency modules need to be loaded manually.
3. It prevents the loading of modules that would be in conflict and could cause instability or unexpected behavior.
One of the main functions of Lmod is module load, which makes the variety of software packages pre-installed on the cluster accessible. This takes only a single command:
module load <module_name>
The activation is realized by dynamically modifying the user's shell environment. This mainly means adding new paths to the bin directories of the specific software to the PATH environment variable. Typically, Lmod modifies PATH and LD_LIBRARY_PATH and sets new variables such as <SOFTWARE_NAME>_EXA_DIR, which contains the path to the directory with examples for the specific software.
Example: compare the content of the $PATH environment variable before and after loading the gaussian module:
Before loading the gaussian module:
echo $PATH
/home/software/common/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
which g16
/usr/bin/which: no g16 in (/home/software/common/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin)
ml chem/gaussian
After gaussian is loaded:
echo $PATH
/.../chem/gaussian/g16.C.01/x86_64-Intel-avx2-source/g16/bsd:/.../chem/gaussian/g16.C.01/x86_64-Intel-avx2-source/g16:/home/software/common/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
which g16
/.../chem/gaussian/g16.C.01/x86_64-Intel-avx2-source/g16/g16
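The effect of loading and unloading on PATH can be imitated without Lmod. The following is a minimal sketch, not what Lmod literally executes: `demo_dir` and `demo_tool` are hypothetical stand-ins for a software bin directory and its program.

```shell
# Create a throwaway "software installation" with one executable.
demo_dir="$(mktemp -d)"
printf '#!/bin/sh\necho "hello from demo tool"\n' > "$demo_dir/demo_tool"
chmod +x "$demo_dir/demo_tool"

old_path="$PATH"

# "Load": prepend the directory so the shell can find the program.
PATH="$demo_dir:$PATH"
command -v demo_tool

# "Unload": restore the previous value; the program is no longer found.
PATH="$old_path"
command -v demo_tool || echo "demo_tool is no longer on PATH"
```

A real module typically does more than this (for example, it may also adjust LD_LIBRARY_PATH and set variables such as <SOFTWARE_NAME>_EXA_DIR), but the principle is the same.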
Basic functions of Lmod and commands
The module system has other useful capabilities than just managing the environment.
(i) to list available software
module available
(ii) to load (activate) modules (particular software)
module load
(iii) to unload (deactivate) modules (particular software)
module unload
module purge
(iv) to list currently loaded modules
module list
(v) to search through all packages within the module system
module available
module spider
module keyword
(vi) to provide specific help information for a particular module
module help
module whatis
Elementary Lmod Commands
The module commands below can be used interactively (in the current shell session) as well as in shell scripts, in particular in the sbatch scripts used to submit computational jobs to SLURM (the workload manager on JUSTUS 2).
List of available software modules
module available
alternatively also in a short form
ml av
Module naming convention: Category/Name/Version
On JUSTUS 2 (as on other bwHPC sites), the software modules are grouped into several categories:
- chem
- compiler
- devel
- lib
- numlib
- phys
- system
- vis
- math
This makes it easier for users to find their way around the module system. For example, the Gaussian 16 program, which calculates the electronic structure of molecules, is found in the category chem (together with other programs used by theoretical chemists).
Each category is further divided according to software packages, and those in turn according to software versions.
The full name of a module always consists of three parts, category, name, and version, separated by slashes: category/name/version. Consequently, the full name of the module with the Gaussian 16 package is chem/gaussian/g16.C.01. Analogously, the GNU compiler version 10.2 is addressed as compiler/gnu/10.2.
See, for example, all modules of category chem with:
ml av chem
---------------------------------------------- /opt/bwhpc/common/modulefiles/Core----------------------------------------------------------------------------------
chem/adf/2019.304         chem/gaussian/g16.C.01       chem/molpro/2020.1 (D)     chem/orca/5.0.1-xtb-6.4.1           chem/tmolex/4.6 (D)
chem/ams/2020.101         chem/gaussview/6.1.1         chem/namd/2.14             chem/orca/5.0.1 (D)                 chem/turbomole/7.4.1
chem/ams/2020.103         chem/gromacs/2020.2          chem/nbo/6.0.18_i4         chem/quantum_espresso/6.5           chem/turbomole/7.5 (D)
chem/ams/2021.102 (D)     chem/gromacs/2020.4          chem/nbo/6.0.18_i8 (D)     chem/quantum_espresso/6.7_openmp-5  chem/vasp/5.4.4.3.16052018
chem/cfour/2.1_openmpi    chem/gromacs/2021.1 (D)      chem/openbabel/3.1.1       chem/quantum_espresso/6.7 (D)       chem/vmd/1.9.3
chem/cp2k/7.1             chem/jmol/14.31.3            chem/openmolcas/19.11      chem/schrodinger/2020-2             chem/xtb/6.3.3
chem/cp2k/8.0_devel (D)   chem/lammps/stable_3Mar2020  chem/openmolcas/21.06 (D)  chem/schrodinger/2021-1 (D)         chem/xtb/6.4.1 (D)
chem/dalton/2020.0        chem/molcas/8.4              chem/openmolcas/21.10      chem/siesta/4.1-b4
chem/dftbplus/20.2.1-cpu  chem/molden/5.9              chem/orca/4.2.1-xtb-6.3.3  chem/siesta/4.1.5 (D)
chem/gamess/2020.2        chem/molpro/2019.2.3         chem/orca/4.2.1            chem/tmolex/4.5.2
or, analogously, all available versions of intel compilers:
ml av compiler/intel
---------------------------------------------- /opt/bwhpc/common/modulefiles/Core ----------------------------------------------
   compiler/intel/19.0    compiler/intel/19.1    compiler/intel/19.1.2 (D)
Load specific software
module load <module_name>
or, in short,
ml <module_name>
For example, to load Gaussian 16, run
ml chem/gaussian/g16.C.01
List of the loaded modules
module list
or simply
ml
Default module version
If multiple versions of a software package are installed, one version is always predetermined as the default. To address the default version, the version can be omitted from the module identifier. For example, the default Intel compiler module is loaded via
ml compiler/intel
ml
Currently Loaded Modules:
  1) compiler/intel/19.1.2
Unload a specific software from the environment
module unload <module_name>
or equivalently
ml -<module_name>
For example, to unload the previously loaded VASP module chem/vasp/5.4.4.3.16052018, use
ml -chem/vasp/5.4.4.3.16052018
Unload all the loaded modules
$ module purge
or
ml purge
Providing a specific help for a particular module
module help <module_name>
or
ml help <module_name>
Software job examples and batch script templates
The majority of the software modules provide examples, including batch scripts for the Slurm queueing system. The full path to the directory with the examples is normally stored in the <SOFTWARE_NAME>_EXA_DIR environment variable. For example, after loading the module, the examples for GROMACS 2021.1 are located as follows:
ml chem/gromacs/2021.1
echo $GROMACS_EXA_DIR/
/opt/bwhpc/common/chem/gromacs/2021.1-openmpi-4.0/bwhpc-examples/
ls $GROMACS_EXA_DIR
GROMACS_TestCaseA  Performance-Tuning-and-Optimization-of-GROMACS.pdf  README
ls $GROMACS_EXA_DIR/GROMACS_TestCaseA/
gromacs-2021.1_gpu.slurm  gromacs-2021.1.slurm  ion_channel.tpr
Users may copy these examples and use them as templates for their own job scripts:
cp $GROMACS_EXA_DIR/GROMACS_TestCaseA/gromacs-2021.1_gpu.slurm .
Note: All the batch script examples are fully functional, i.e. they can be submitted directly to the queueing system to launch a test job. Typically, the scripts launch a short, simple calculation with the given software. Moreover, most of the sbatch scripts contain general submission instructions as well as hints specific to the particular program.
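The lookup of the example directory can be wrapped in a small function. The sketch below assumes that the variable name is simply the upper-cased package name followed by _EXA_DIR, which may not hold for every package; list_module_examples is a hypothetical helper, not something provided on JUSTUS 2.

```shell
#!/usr/bin/env bash
# list_module_examples is a hypothetical helper: it looks up the
# <SOFTWARE_NAME>_EXA_DIR variable of a package and lists its contents.
# Assumption: the variable name is the upper-cased package name.
list_module_examples() {
    local pkg="$1"
    local var="${pkg^^}_EXA_DIR"      # e.g. gromacs -> GROMACS_EXA_DIR
    local dir="${!var:-}"             # indirect expansion of that variable
    if [[ -z "$dir" || ! -d "$dir" ]]; then
        echo "$var is not set or is not a directory (is the module loaded?)" >&2
        return 1
    fi
    ls "$dir"
}
```

After "ml chem/gromacs/2021.1", calling "list_module_examples gromacs" would list the contents of $GROMACS_EXA_DIR.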
Searching through module names
module available <module_name>
or, in short,
ml av <module_name>
For example, to search for Python modules, run
ml av python
with the following output:
----------------------------------- /opt/bwhpc/common/modulefiles/Core -----------------------------------
   devel/python/3.8.3    lib/python_matplotlib/3.2.2_numpy-1.19.0_python-3.8.3    numlib/python_numpy/1.19.0_python-3.8.3    numlib/python_scipy/1.5.0_numpy-1.19.0_python-3.8.3

Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys"
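When the search result is to be processed further, e.g. with grep, it helps to know that Lmod by default prints such listings to stderr rather than stdout. The following sketch merges the two streams before filtering; find_modules is a hypothetical wrapper, and the default stderr behaviour is an assumption about the local Lmod configuration.

```shell
#!/usr/bin/env bash
# find_modules is a hypothetical wrapper around "module avail".
# Assumption: Lmod prints its listing to stderr (the default), so stderr
# is merged into stdout before filtering.
find_modules() {
    local pattern="$1"
    # -t (terse) prints one entry per line, which is easier to filter;
    # grep -F treats the pattern as a fixed string and drops header lines.
    module -t avail "$pattern" 2>&1 | grep -F -- "$pattern"
}
```

For instance, "find_modules python" would print only the lines of the listing that contain the string python.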
What does this software do? Command when you don't know this software
ml whatis <modulename>
provides a short description of the software package.
Finding detailed information about a specific module
module spider <searching_pattern>
or just
ml spider <searching_pattern>
Extended searching through entire module system
module keyword <searching_pattern>
For example, to find out which modules contain the FFTW library:
ml keyword fftw
which gives the following info:
----------------------------------------------------------------------------------------------------
The following modules match your search criteria: "fftw"
----------------------------------------------------------------------------------------------------

  numlib/mkl: numlib/mkl/2019, numlib/mkl/2020, numlib/mkl/2020.2

----------------------------------------------------------------------------------------------------
Best Practices when working with modules
Always load modules with the entire module name
The software stack is updated regularly. Adding a new software version usually changes which version is marked as the default. Newer software is not always backwards compatible with existing scripts, workflows, or even input files. Therefore, it is strongly recommended not to load modules by category and software name alone. Instead, always use the entire module name (including the version) to make sure the same module is loaded each time.
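Existing job scripts can be checked for loads without a version using a few lines of awk. This is only a sketch: find_unversioned_loads is a hypothetical helper, and it inspects only explicit "module load" and "ml load" lines, not the bare "ml <module_name>" form.

```shell
#!/usr/bin/env bash
# find_unversioned_loads is a hypothetical lint helper, not part of Lmod.
# It reports "module load" / "ml load" arguments that lack the version part,
# i.e. names with fewer than three slash-separated fields.
# Limitation: the bare "ml <name>" form is not checked.
find_unversioned_loads() {
    local script="$1"
    awk '
        ($1 == "module" || $1 == "ml") && $2 == "load" {
            for (i = 3; i <= NF; i++) {
                if ($i ~ /^#/) break             # rest of the line is a comment
                if (split($i, parts, "/") < 3)   # category/name/version expected
                    printf "%d: %s\n", NR, $i
            }
        }
    ' "$script"
}
```

Each reported line has the form "line number: module name", so the offending load can be located and pinned to a full name.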
Load only those modules that are needed for the current application
Only load modules that are needed for the current script or workflow you are running, to reduce the chance of unexpected behavior caused by module conflicts.
A typical mistake seen on the cluster when loading the VASP module is:
ml compiler/intel/19.1.2
ml mpi/impi/2019.8
ml chem/vasp/5.4.4.3.16052018
The correct way is simply:
ml chem/vasp/5.4.4.3.16052018
Do not use module commands in .bashrc, .bash_profile etc. scripts
Avoid including “module load” commands in your .bashrc or .bash_profile files. As an alternative, create a bash script with the module load commands and source it each time, to load the modules needed.
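One possible shape for such a script is sketched below. The file name ~/load_gromacs_env.sh and the function name load_gromacs_env are illustrative only; the module name is taken from the examples in this guide.

```shell
#!/usr/bin/env bash
# Contents of a hypothetical file ~/load_gromacs_env.sh, meant to be sourced.
# The file name and the function name are illustrative only.
load_gromacs_env() {
    module purge &&                        # start from a clean module list
    module load chem/gromacs/2021.1 &&     # full name, version included
    module list                            # show what is loaded now
}
```

It would be used as "source ~/load_gromacs_env.sh" followed by "load_gromacs_env". Wrapping the commands in a function means that sourcing the file alone changes nothing until the function is called.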
Use 'module help' command
see Software_Modules_Lmod#Providing_a_specific_help_for_a_particular_module
Check content of $<SOFTWARE_NAME>_EXA_DIR folder
see the section "Software job examples and batch script templates"
Use 'ml purge' in sbatch scripts before the first 'ml load'
The environment in effect at the time the sbatch, salloc, or srun command is executed is propagated
to the spawned processes, i.e. also to the job script. Consequently, if a module is loaded at the
time the 'sbatch <job-script>' command is executed, its state, i.e. "loaded", as well as the values of
the environment variables it has set will be propagated with the job.
Thus, consider putting the 'ml purge' command first among the module commands when designing your job scripts.
This can prevent a variety of module conflicts.
For example, in the following scenario
On the login node:
ml compiler/intel/19.1.2
salloc --nodes=1 --ntasks-per-node=1
... waiting for the allocation of the resources ...
Once on the compute node, execute
ml compiler/gnu/10.2
the load of the compiler/gnu/10.2 module on the compute node fails with the following error:
Lmod has detected the following error:  Cannot load module "compiler/gnu/10.2" because these module(s) are loaded:
   compiler/intel

While processing the following module(s):
    Module fullname    Module Filename
    ---------------    ---------------
    compiler/gnu/10.2  /opt/bwhpc/common/modulefiles/Core/compiler/gnu/10.2.lua

[ul_l_tkz12@login02 ~]$ ml
Currently Loaded Modules:
  1) compiler/intel/19.1.2
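A job script that follows this advice might look like the skeleton below. It is wrapped in a small hypothetical function, write_job_skeleton, which writes the skeleton to a file so that it can be inspected without submitting anything. The resource values and the final command are placeholders, not site recommendations.

```shell
#!/usr/bin/env bash
# write_job_skeleton is a hypothetical helper that writes a minimal job script.
# The resource values and the program call are placeholders, not site defaults.
write_job_skeleton() {
    local target="$1"
    cat > "$target" <<'EOF'
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --time=00:10:00

# Drop whatever was inherited from the submitting shell ...
module purge
# ... then load exactly what this job needs, with full module names.
module load compiler/gnu/10.2

# Placeholder for the actual work of the job.
gcc --version
EOF
}
```

Running "write_job_skeleton job.slurm" creates the file, which can then be adapted and submitted with "sbatch job.slurm".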
Useful Extras
Conflicts between modules
Some modules cannot be loaded together at the same time. For example, two different versions of the same package cannot
be activated simultaneously. Modules may already have this check built in. In such circumstances, Lmod either
prints an error message during loading and no module is loaded, or the module is reloaded: the old module is unloaded and only the new module becomes active.
Example of two versions of the intel compiler - module reload:
ml compiler/intel/19.1
ml
Currently Loaded Modules:
  1) compiler/intel/19.1
ml compiler/intel/19.1.2
The following have been reloaded with a version change:
  1) compiler/intel/19.1 => compiler/intel/19.1.2
ml
Currently Loaded Modules:
  1) compiler/intel/19.1.2
Example with two different compilers, Intel and GNU: loading the GNU module triggers a module conflict with an error, and the new module is not loaded:
ml compiler/intel/19.1.2
ml
Currently Loaded Modules:
  1) compiler/intel/19.1.2
ml compiler/gnu/10.2
Lmod has detected the following error:  Cannot load module "compiler/gnu/10.2" because these module(s) are loaded:
   compiler/intel

While processing the following module(s):
    Module fullname    Module Filename
    ---------------    ---------------
    compiler/gnu/10.2  /opt/bwhpc/common/modulefiles/Core/compiler/gnu/10.2.lua
ml
Currently Loaded Modules:
  1) compiler/intel/19.1.2
Solution: the Intel compiler must be unloaded before loading the GNU module.
ml -compiler/intel/19.1.2
ml compiler/gnu/10.2
ml
Currently Loaded Modules:
  1) compiler/gnu/10.2
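The unload-then-load sequence can be wrapped in a function. The sketch below is a hypothetical helper, not an Lmod sub-command, and it assumes that the currently loaded modules are exported as a colon-separated list in $LOADEDMODULES. It does not consider other loaded modules that may depend on the old compiler.

```shell
#!/usr/bin/env bash
# switch_compiler is a hypothetical helper, not an Lmod sub-command.
# It unloads every loaded compiler/* module and then loads the requested one.
# Assumption: the loaded modules are listed, colon-separated,
# in $LOADEDMODULES.
switch_compiler() {
    local wanted="$1" m
    local -a loaded
    IFS=':' read -r -a loaded <<< "${LOADEDMODULES:-}"
    for m in "${loaded[@]}"; do
        if [[ "$m" == compiler/* ]]; then
            module unload "$m"
        fi
    done
    module load "$wanted"
}
```

With compiler/intel/19.1.2 loaded, "switch_compiler compiler/gnu/10.2" would run the same two commands as shown in the solution above.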
Module dependencies (why there is no mpi module?)
Some modules depend on other modules. Typically, many modules depend on an MPI library, the MPI library depends on a compiler, etc. The user does not need to take care of these fundamental dependencies: the majority of modules automatically load all the packages they depend on. However, there is one notable exception, the MPI library. While most of the installed parallel applications exist for only one compiler-MPI combination, there is a variety of MPI libraries of the same version built with different compilers. For example, there are two sets of OpenMPI 4.0 modules, one for the Intel and one for the GNU compilers. Thus, a user who wants to load a specific MPI module must choose (load) a particular compiler before loading the MPI module. Note that the MPI modules also remain "invisible" to the "module av <mpi_name>" command until a compiler is loaded. This is due to the module hierarchy of Lmod. More details about the hierarchy are given below in Software_Modules_Lmod#Semi_hierarchical_layout_of_modules_on_JUSTUS_2
Consequences of the partial module hierarchy for mpi modules
MPI modules remain invisible to the user (i.e. to module avail) until a compiler module has been loaded. Once a compiler module has been activated, the corresponding MPI modules, i.e. those built with that particular compiler, become visible.
E.g., with the originally empty list of the loaded modules, the module command
$ module avail
or its shorthand analogue
$ ml av
displays no MPI modules. After running
ml compiler/intel/19.1
ml av
the MPI packages compatible with the Intel 19.1 compiler become visible
------------ /opt/bwhpc/common/modulefiles/Compiler/intel/19.1 --------------------
   mpi/impi/2019.7    mpi/openmpi/4.0
in the list of the available software.
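The required order (compiler first, MPI second) can be captured in a short function. This is a sketch only: load_mpi_stack is a hypothetical helper, and it assumes that the module command returns a non-zero status when a load fails.

```shell
#!/usr/bin/env bash
# load_mpi_stack is a hypothetical helper: compiler first, MPI second.
# Assumption: "module load" returns a non-zero status on failure.
load_mpi_stack() {
    local compiler="$1" mpi="$2"
    # The MPI modules only become visible once a compiler is loaded,
    # so stop if the compiler cannot be loaded.
    module load "$compiler" || return 1
    module load "$mpi"
}
```

A call such as "load_mpi_stack compiler/intel/19.1 mpi/openmpi/4.0" corresponds to the two modules shown in the listing above.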
Online user guide of Lmod
The complete user guide can be found in the online Lmod documentation.
Additional Module System tasks
Lmod offers more than 25 sub-commands plus various options to manage the modulefile system installed on JUSTUS 2. The large majority of users will need only a few of them. A complete list of module sub-commands can be displayed with the "module --help" command or found in the Lmod online documentation. The following text lists only a few of them.
Other topics
Which shells supports module commands?
So far, Bash is the only shell on JUSTUS 2 that supports module commands.
Semi hierarchical layout of modules on JUSTUS 2
Module hierarchy in Lmod
The software modules on JUSTUS 2 use a "semi" hierarchical structure. This is slightly different from what can be seen on other HPC systems with a "full" hierarchical structure. Typical systems with a full hierarchy put compiler modules (e.g., intel, gcc) on the uppermost (Core) level, dependent libraries (e.g., MPI) on the second level, and libraries depending on those on a third level. As a consequence, not all modules contained in the module system are initially visible, namely those placed in the second and third layers. Only after loading a compiler module do the second-layer modules that depend directly on that compiler become available. Similarly, loading an MPI module makes the third-layer modules that depend on the loaded MPI library visible.
Semi hierarchy of software stack on JUSTUS 2
JUSTUS 2 has adopted the hierarchical module layout only partially. In particular, only the "Core" level and the "second" level are present, and the second level contains only MPI modules. All other modules, for example those from the "chem" sub-category such as vasp, turbomole, or gaussian, or those from the "numlib" sub-category such as mkl or python_numpy, reside in the "Core" level.
Module dependency
The adopted hierarchy model is not the only mechanism that handles module dependencies. As a matter of fact, most of the modules on JUSTUS 2 require functionality provided by other modules, even though those are located in the "Core" level. Such provisioning is implemented in a modulefile in one of two ways. Either it happens automatically, without any action from the user (the dependent modulefile loads all additional modules while it is being loaded), or the dependent modulefile informs the user during loading that additional modules must be pre-loaded if they have not been activated yet (in this case the user must repeat the load operation). Which solution is applied is decided by the person who built the particular module.
An example of a module with automated pre-loading is the ORCA module. Starting from an empty list of loaded modules, i.e. when
ml
shows
No modules loaded
the command sequence
ml chem/orca
ml
shows
Currently Loaded Modules:
  1) compiler/intel/19.1   2) chem/orca/4.2.1
That is, loading the Intel compiler is built into the ORCA module.
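To see which modules a single load pulled in, the list of loaded modules can be compared before and after the load. The following sketch uses a hypothetical helper, load_and_report, and assumes that Lmod maintains the colon-separated list of loaded modules in $LOADEDMODULES.

```shell
#!/usr/bin/env bash
# load_and_report is a hypothetical helper. It compares $LOADEDMODULES
# (assumed to be the colon-separated list maintained by Lmod) before and
# after a load and prints every module that was added.
load_and_report() {
    local before=":${LOADEDMODULES:-}:" m
    local -a after
    module load "$1" || return 1
    IFS=':' read -r -a after <<< "${LOADEDMODULES:-}"
    for m in "${after[@]}"; do
        # Print only modules that were not in the list before the load.
        [[ "$before" == *":$m:"* ]] || echo "$m"
    done
}
```

Called as "load_and_report chem/orca" with no modules loaded, it would print both the compiler module and the ORCA module from the example above.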
Complete list of Lmod options and sub-commands
The whole list of module options and all commands available can be displayed by running
man module
or
module --help
How do Modules work?
The default shell on the bwHPC clusters is bash, so explanations and examples will be shown for bash. In general, programs cannot modify the environment of the shell they are being run from, so how can the module command do exactly that?
The module command is not a program, but a Bash function.
You can view its content using:
type module
and you will get the following result:
type module
module is a function
module ()
{
    eval $($LMOD_CMD bash "$@");
    [ $? = 0 ] && eval $(${LMOD_SETTARG_CMD:-:} -s sh)
}
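The same mechanism can be tried out without Lmod. In the sketch below, emit_env_change stands in for the Lmod executable and my_module for the module function; both names, and the variable DEMO_TOOL_HOME, are hypothetical and exist only for this demonstration.

```shell
#!/usr/bin/env bash
# Stand-in for the Lmod executable: it only PRINTS shell code on stdout.
# emit_env_change and my_module are hypothetical names for this demonstration.
emit_env_change() {
    echo "export DEMO_TOOL_HOME=/opt/demo/$1;"
}

# A child process cannot alter the variables of the shell that started it.
bash -c 'export DEMO_TOOL_HOME=/opt/demo/child'
echo "after child process: ${DEMO_TOOL_HOME:-<unset>}"

# A shell function runs inside the current shell, so eval-ing the printed
# code changes the current environment.
my_module() {
    eval "$(emit_env_change "$1")"
}
my_module 1.0
echo "after shell function: ${DEMO_TOOL_HOME:-<unset>}"
```

Provided DEMO_TOOL_HOME was not set beforehand, the first echo reports the variable as unset, while the second one shows the value set by the function.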
In the module function, Lmod is called. Its output to stdout is then executed inside your current shell using the Bash built-in eval command. As a consequence, all output that you see from the module is transmitted via stderr (file descriptor 2) or in so