NEMO2/Easybuild Modules/EB Build Module


EasyBuild Module Builder Script

The eb-build-module.sh script is a wrapper around EasyBuild that simplifies building software modules with customizable configurations. It supports parallel builds, custom walltime limits, architecture-specific builds, and SLURM job submission.

This script is provided system-wide and can be called directly without specifying a path.

Features

  • Architecture-specific builds: Support for genoa, milan, mi300a, l40s, and h200 architectures
  • Automatic GPU allocation: Detects GPU architectures and allocates GPUs accordingly
  • Flexible configuration: Customizable cores, walltime, and installation prefix
  • Multiple build modes: Standard build, rebuild, module-only, and dry-run
  • EasyConfig search: Built-in search functionality for finding available easyconfigs
  • Environment integration: Automatically sources local EasyBuild environment
  • Additional options: Pass any EasyBuild option directly to the eb command

Usage

Basic Syntax

eb-build-module.sh [-c cores] [-w walltime] [-a arch] [-p prefix] [-r] [-o] [-d] [-s pattern] -m module [-- extra_eb_options]

Options

Option       Description                                                     Default
-m module    Name of the module to build (required unless using -s)          -
-c cores     Number of cores to allocate for the job                         20
-w walltime  Maximum walltime in hours                                       12
-a arch      Architecture/partition: genoa, milan, mi300a, l40s, h200        genoa
-p prefix    Installation prefix directory                                   ~/.local/easybuild
-r           Rebuild software, even if module already exists (--rebuild)     -
-o           Only generate module file(s), skip build steps (--module-only)  -
-d           Perform a dry-run, print build overview (--dry-run)             -
-s pattern   Search for easyconfig files matching pattern (--search-short)   -
-h           Display help message                                            -
--           Pass everything after this to the eb command                    -
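
The option set above maps naturally onto the shell's getopts builtin. The following is a minimal sketch of how such a command line could be parsed; it is not the script's actual source. The function name parse_build_args and all variable names are hypothetical, while the defaults are taken from the table above.

```shell
# Hypothetical sketch: parse the documented options with getopts.
# Results are stored in plain shell variables so the caller can read them.
parse_build_args() {
  module="" cores=20 walltime=12 arch="genoa"
  prefix="${PREFIX:-$HOME/.local/easybuild}"
  rebuild=0 module_only=0 dry_run=0 search="" extra_opts=""
  OPTIND=1   # reset so the function can be called more than once
  while getopts "m:c:w:a:p:rods:h" opt; do
    case "$opt" in
      m) module="$OPTARG" ;;
      c) cores="$OPTARG" ;;
      w) walltime="$OPTARG" ;;
      a) arch="$OPTARG" ;;
      p) prefix="$OPTARG" ;;
      r) rebuild=1 ;;
      o) module_only=1 ;;
      d) dry_run=1 ;;
      s) search="$OPTARG" ;;
      h) echo "usage: eb-build-module.sh [options] -m module [-- extra_eb_options]"
         return 0 ;;
      *) return 2 ;;   # unknown option; getopts has already printed a message
    esac
  done
  # getopts stops at "--"; whatever follows is meant for eb itself.
  shift $((OPTIND - 1))
  extra_opts="$*"
}
```

Calling parse_build_args -c 32 -a milan -m Python-3.11.3-GCCcore-12.3.0 -- --force --debug would leave cores=32, arch=milan, and extra_opts set to "--force --debug" in this sketch.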

Environment Variables

The script respects the following environment variables:

Variable                        Description
PREFIX                          Installation prefix (can be overridden with -p)
EASYBUILD_MODULE_NAMING_SCHEME  Module naming scheme
EASYBUILD_ACCEPT_EULA_FOR       Comma-separated list of EULAs to accept
EB_COMSOL_LICENSE_FILE          COMSOL license file path
EB_MATLAB_KEY                   MATLAB license key
EB_MATLAB_LICENSE_SERVER        MATLAB license server hostname
EB_MATLAB_LICENSE_SERVER_PORT   MATLAB license server port
EB_MATHEMATICA_LICENSE_SERVER   Mathematica license server

Note: If ~/.local/easybuild/env exists, it will be sourced automatically.
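
The automatic sourcing described in the note can be pictured with the short sketch below. It is an illustration under assumptions, not the script's code: the function name load_eb_env is hypothetical, and this page does not state whether the real script lets the file override a PREFIX that is already exported (in this sketch the file wins, because it is sourced last).

```shell
# Hypothetical sketch: source an optional env file, then fall back to the
# documented default prefix if PREFIX is still unset.
load_eb_env() {
  local env_file="${1:-$HOME/.local/easybuild/env}"
  if [ -f "$env_file" ]; then
    # The file is plain shell, e.g. lines such as: export PREFIX=/some/path
    . "$env_file"
  fi
  : "${PREFIX:=$HOME/.local/easybuild}"
  export PREFIX
}
```

After load_eb_env returns, PREFIX holds either the value set in the file or the default ~/.local/easybuild.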

Examples

Basic Build

Build a module with default settings:

eb-build-module.sh -m Python-3.11.3-GCCcore-12.3.0

Custom Configuration

Build with custom cores, walltime, and architecture:

eb-build-module.sh -c 32 -w 24 -a genoa -m Python-3.11.3-GCCcore-12.3.0

Rebuild Existing Module

Force rebuild of an existing module:

eb-build-module.sh -r -m Python-3.11.3-GCCcore-12.3.0

Module File Only

Generate only the module file without building:

eb-build-module.sh -o -m Python-3.11.3-GCCcore-12.3.0

Dry-Run

Preview what would be built without actually building:

eb-build-module.sh -d -m Python-3.11.3-GCCcore-12.3.0

Search for EasyConfigs

Search for available easyconfig files:

# Simple search
eb-build-module.sh -s Python

# Search with regex pattern
eb-build-module.sh -s "GCC.*12.3"

Pass Additional EasyBuild Options

Pass extra options directly to EasyBuild:

# Build with force and debug options
eb-build-module.sh -m Python-3.11.3-GCCcore-12.3.0 -- --force --debug

# Skip test step
eb-build-module.sh -m Python-3.11.3-GCCcore-12.3.0 -- --skip-test-step

Using Environment Variables

Set custom prefix and accept EULAs:

export PREFIX=/opt/easybuild
export EASYBUILD_ACCEPT_EULA_FOR="CUDA,cuDNN,Intel-oneAPI"
eb-build-module.sh -m CUDA-12.0.0

Architecture Support

The script supports the following architectures:

Architecture  Type                            GPU Support            Notes
genoa         CPU (AMD EPYC Genoa)            No                     Default architecture
milan         CPU (AMD EPYC Milan)            No                     Separate installation directory
mi300a        APU (AMD MI300A - CPU+GPU)      Yes (1 GPU allocated)  Separate installation directory
l40s          GPU (NVIDIA L40S, Intel-based)  Yes (1 GPU allocated)  Separate installation directory
h200          GPU (NVIDIA H200, Genoa-based)  Yes (1 GPU allocated)  Separate installation directory

Note: GPU architectures (mi300a, l40s, h200) automatically allocate 1 GPU for the build job.
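
The mapping from architecture to GPU allocation in the table above can be written as a small lookup. This is a sketch only; the function name gpus_for_arch is hypothetical and the real script may implement the rule differently.

```shell
# Hypothetical sketch: number of GPUs requested for a build job per architecture,
# following the architecture table (CPU partitions 0, GPU/APU partitions 1).
gpus_for_arch() {
  case "$1" in
    genoa|milan)      echo 0 ;;
    mi300a|l40s|h200) echo 1 ;;
    *) echo "unsupported architecture: $1" >&2
       return 1 ;;
  esac
}
```

For example, gpus_for_arch h200 prints 1 and gpus_for_arch milan prints 0; any other name returns a non-zero status.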

When building your own modules, each architecture gets its own separate installation directory:

  • ~/.local/easybuild/milan/
  • ~/.local/easybuild/genoa/
  • ~/.local/easybuild/mi300a/
  • ~/.local/easybuild/l40s/
  • ~/.local/easybuild/h200/

Optional: You can manually create symbolic links to mirror the global NEMO2 module structure (where mi300a and h200 link to genoa), but this is not required.

Installation Directories

Module Installation Paths

By default, modules are installed to:

~/.local/easybuild/<arch>/

For example:

  • ~/.local/easybuild/milan/ - Milan-specific modules
  • ~/.local/easybuild/genoa/ - Genoa-specific modules
  • ~/.local/easybuild/mi300a/ - MI300A-specific modules
  • ~/.local/easybuild/l40s/ - L40S-specific modules
  • ~/.local/easybuild/h200/ - H200-specific modules

Module files location: The actual module files that you load with module load are placed in:

~/.local/easybuild/modules/all/          # Architecture-independent modules
~/.local/easybuild/<arch>/modules/all/   # Architecture-specific modules

Example module paths:

~/.local/easybuild/modules/all/lang/python/3.11.3-gcccore-12.3.0.lua
~/.local/easybuild/genoa/modules/all/lang/python/3.11.3-gcccore-12.3.0.lua
~/.local/easybuild/mi300a/modules/all/ai/pytorch/2.1.2-foss-2023a.lua

Important: On NEMO2, the module naming scheme converts all names to lowercase. While EasyBuild uses files like Python-3.11.3-GCCcore-12.3.0.eb, the resulting modules are named in lowercase: lang/python/3.11.3-gcccore-12.3.0. This applies to all modules.
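
To predict the lowercase name/version part of a module from an easyconfig name, a heuristic like the one below can help. It is a sketch under assumptions: the function name easyconfig_to_module_suffix is hypothetical, the name is split from the version at the first hyphen that is followed by a digit (which can misfire for software names that themselves contain such a sequence), and the leading category (lang/, system/, ai/) cannot be derived from the file name at all.

```shell
# Hypothetical sketch: Python-3.11.3-GCCcore-12.3.0.eb -> python/3.11.3-gcccore-12.3.0
# The category prefix (e.g. lang/) is not part of the easyconfig file name.
easyconfig_to_module_suffix() {
  printf '%s\n' "${1%.eb}" \
    | sed 's|-\([0-9]\)|/\1|' \
    | tr '[:upper:]' '[:lower:]'
}
```

The result can then be passed to module avail to check whether the module is visible.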

These paths are automatically added to your MODULEPATH when you load an architecture with module load arch/<arch>, making your custom modules visible with module avail.
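
If a freshly built module does not show up in module avail, it can be worth checking whether its directory is actually on MODULEPATH. The helper below is a minimal sketch; the function name on_modulepath is hypothetical, and it compares paths literally, so a trailing slash or a symbolic link would not match.

```shell
# Hypothetical sketch: succeed if the given directory is an entry of MODULEPATH.
on_modulepath() {
  case ":${MODULEPATH:-}:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}
```

A possible use is: on_modulepath "$HOME/.local/easybuild/genoa/modules/all" || echo "run: module load arch/genoa"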

You can change the installation location with the -p option or by setting the PREFIX environment variable.

Workflow

Standard Build Workflow

  1. Search for available versions: eb-build-module.sh -s Python
  2. Dry-run to check dependencies: eb-build-module.sh -d -m Python-3.11.3-GCCcore-12.3.0
  3. Build the module: eb-build-module.sh -m Python-3.11.3-GCCcore-12.3.0
  4. Verify module is installed: module avail python

GPU Software Workflow

  1. Choose appropriate GPU architecture and framework:
    • AMD MI300A APU: Use ROCm-based software
    • NVIDIA GPUs (L40S, H200): Use CUDA-based software
  2. Build with GPU arch:
    • For AMD MI300A APU: eb-build-module.sh -a mi300a -m PyTorch-2.1.2-foss-2023a
    • For NVIDIA L40S: eb-build-module.sh -a l40s -m CUDA-12.0.0
    • For NVIDIA H200: eb-build-module.sh -a h200 -m CUDA-12.0.0
  3. The module is installed under the architecture-specific prefix (e.g., ~/.local/easybuild/mi300a/)
  4. After building, module names are lowercase (e.g., system/cuda/12.0.0)

Build Process

The script performs the following steps:

  1. Sources local EasyBuild environment if available (~/.local/easybuild/env)
  2. Validates input parameters (module name, architecture)
  3. Configures architecture-specific settings (including GPU allocation)
  4. Automatically appends .eb extension to module name if not present
  5. Creates log directory: ~/.eb/robot/{architecture}/logs
  6. Purges existing modules to ensure clean build environment
  7. Loads the specified architecture module (module load arch/{architecture})
  8. Constructs and executes EasyBuild command with all specified options
  9. Reports build status (success or failure with exit code)
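
Steps 4 and 5 of this list are simple enough to sketch in shell. The helper names normalize_easyconfig and ensure_log_dir are hypothetical, and the optional second argument of ensure_log_dir (a base directory other than ~/.eb/robot) exists only so the sketch can be tried safely; the real script may do this differently.

```shell
# Hypothetical sketch of step 4: append .eb unless it is already there.
normalize_easyconfig() {
  case "$1" in
    *.eb) printf '%s\n' "$1" ;;
    *)    printf '%s.eb\n' "$1" ;;
  esac
}

# Hypothetical sketch of step 5: create <base>/<architecture>/logs and print it.
ensure_log_dir() {
  local arch="$1"
  local base="${2:-$HOME/.eb/robot}"
  local log_dir="$base/$arch/logs"
  mkdir -p "$log_dir" || return 1
  printf '%s\n' "$log_dir"
}
```

Both helpers print their result, so they can be used as easyconfig=$(normalize_easyconfig "$module") and log_dir=$(ensure_log_dir "$arch").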

Script Output

The script provides detailed output including:

  • Module name and architecture
  • Number of cores and walltime
  • Installation prefix
  • Log directory
  • Build mode (rebuild, module-only, dry-run)
  • Full EasyBuild command being executed
  • Build status (success or failure with exit code)

Example output:

==========================================
EasyBuild Module Build Configuration
==========================================
Module:        Python-3.11.3-GCCcore-12.3.0.eb
Architecture:  genoa
Cores:         20
Walltime:      12 hours
Prefix:        /home/user/.local/easybuild/genoa
Log directory: /home/user/.eb/robot/genoa/logs
==========================================

Starting EasyBuild...
Command: eb Python-3.11.3-GCCcore-12.3.0.eb --prefix /home/user/.local/easybuild/genoa --robot --module-extensions --job --job-cores 20 --job-max-walltime 12

...

==========================================
Build completed successfully!
==========================================

Note: After building, the module will be available as lang/python/3.11.3-gcccore-12.3.0 (lowercase) due to NEMO2's module naming scheme, even though the easyconfig file uses capitalized names.
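
The "Command:" line in the example output suggests how the final eb call is assembled from the settings. The sketch below reproduces that line for display purposes; the function name compose_eb_command is hypothetical, and appending further flags (such as --rebuild) at the end is an assumption about ordering, not something this page confirms.

```shell
# Hypothetical sketch: build the eb command line shown in the example output.
# Arguments: easyconfig, prefix, cores, walltime (hours), then optional extra flags.
# The result is meant for printing; it does no quoting for paths with spaces.
compose_eb_command() {
  local easyconfig="$1"
  local prefix="$2"
  local cores="$3"
  local walltime="$4"
  shift 4
  local cmd="eb $easyconfig --prefix $prefix --robot --module-extensions --job --job-cores $cores --job-max-walltime $walltime"
  if [ "$#" -gt 0 ]; then
    cmd="$cmd $*"
  fi
  printf '%s\n' "$cmd"
}
```

With the values from the example (Python-3.11.3-GCCcore-12.3.0.eb, /home/user/.local/easybuild/genoa, 20, 12) the function prints the same command as in the output above.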

Error Handling

The script checks its inputs before starting a build and reports failures explicitly:

  • Validates module name is provided (unless searching)
  • Validates architecture is one of the supported values
  • Ensures log directory can be created
  • Exits with appropriate error codes
  • Provides clear error messages
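
The first two checks in this list could look like the sketch below. The function name validate_build_request, the error texts, and the return codes 2 and 3 are illustrative assumptions; the exit codes actually used by the script are not documented on this page.

```shell
# Hypothetical sketch: validate module name and architecture before building.
# Arguments: module name, architecture, optional search pattern.
validate_build_request() {
  local module="$1"
  local arch="$2"
  local search="${3:-}"
  if [ -z "$module" ] && [ -z "$search" ]; then
    echo "error: -m module is required unless -s is used" >&2
    return 2
  fi
  case "$arch" in
    genoa|milan|mi300a|l40s|h200) ;;
    *) echo "error: unsupported architecture '$arch'" >&2
       return 3 ;;
  esac
  return 0
}
```

A caller would typically run validate_build_request "$module" "$arch" "$search" || exit $? before doing any other work.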

Tips and Best Practices

1. Use Dry-Run First

Always preview what will be built before starting a build:

eb-build-module.sh -d -m Python-3.11.3-GCCcore-12.3.0

This shows all dependencies and build steps without actually building.

2. Search Before Building

Search for available versions to ensure you're building the correct module:

eb-build-module.sh -s Python

3. Set Up Local Environment

Create ~/.local/easybuild/env to set default environment variables:

# Example ~/.local/easybuild/env
export EASYBUILD_MODULE_NAMING_SCHEME=CategorizedModuleNamingScheme
export EASYBUILD_ACCEPT_EULA_FOR="CUDA,cuDNN,Intel-oneAPI"
export PREFIX=/custom/path/easybuild

This file is sourced automatically by the script.

4. Monitor Build Logs

Build logs are stored in:

~/.eb/robot/{architecture}/logs/

Check these logs if a build fails or behaves unexpectedly.
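
When a build fails, the most recently modified file in the log directory is usually the first place to look. The helper below is a small sketch; the function name latest_build_log is hypothetical, it makes no assumption about how the log files are named, and it relies on ls -t, which is adequate for ordinary file names without newlines.

```shell
# Hypothetical sketch: print the newest entry in a build log directory.
# The default path uses the genoa architecture as an example.
latest_build_log() {
  local log_dir="${1:-$HOME/.eb/robot/genoa/logs}"
  local newest
  newest=$(ls -t "$log_dir" 2>/dev/null | head -n 1)
  [ -n "$newest" ] || return 1
  printf '%s/%s\n' "$log_dir" "$newest"
}
```

For example, less "$(latest_build_log ~/.eb/robot/milan/logs)" opens the newest log of the milan builds.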

5. Choose Correct Architecture

Use the architecture that matches your target hardware:

  • CPU-only software: genoa or milan
  • GPU software: mi300a, l40s, or h200

6. Use Appropriate Resources

Adjust cores and walltime based on the software being built:

  • Small packages: Default settings (20 cores, 12 hours) are usually sufficient
  • Large packages (e.g., GCC, LLVM): Increase cores and walltime, for example:

eb-build-module.sh -c 64 -w 48 -m GCC-13.2.0

Troubleshooting

Module Not Found

If EasyBuild cannot find the module:

# Search for available versions
eb-build-module.sh -s "ModuleName"

Note: The script automatically adds the .eb extension if not present, so you don't need to specify it.

Build Fails

Check the log files in ~/.eb/robot/{architecture}/logs/ for detailed error messages.

Permission Issues

Ensure you have write permissions to:

  • Installation prefix (PREFIX)
  • Log directory (~/.eb/robot/{architecture}/logs)
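
Both locations may not exist yet, so a useful check is whether the nearest existing parent directory is writable. The following is a minimal sketch with a hypothetical function name, can_create_in; it only tests permission bits as seen by the shell and does not account for quotas or read-only file systems.

```shell
# Hypothetical sketch: succeed if the directory exists and is writable, or if it
# could be created because its nearest existing parent is a writable directory.
can_create_in() {
  local dir="$1"
  while [ ! -e "$dir" ] && [ "$dir" != "/" ] && [ "$dir" != "." ]; do
    dir=$(dirname "$dir")
  done
  [ -d "$dir" ] && [ -w "$dir" ]
}
```

It can be used as can_create_in "$PREFIX" || echo "no write access to $PREFIX" and likewise for ~/.eb/robot/{architecture}/logs.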

EULA Acceptance

For software requiring EULA acceptance:

export EASYBUILD_ACCEPT_EULA_FOR="CUDA,cuDNN,Intel-oneAPI"
eb-build-module.sh -m CUDA-12.0.0

Version Information

  • Script version: 1.0.0
  • Last updated: November 2025
  • Default architecture: genoa
  • Default cores: 20
  • Default walltime: 12 hours