BinAC2/Software/Nextflow
Description
Nextflow is a scientific workflow system primarily used for bioinformatics data analysis. This documentation also introduces nf-core, a community-driven initiative that maintains a curated collection of analysis pipelines built with Nextflow.
The documentation in the bwHPC Wiki serves as a getting-started guide for installing and using both Nextflow and nf-core pipelines on the bwForCluster BinAC 2. Additionally, the nf-core documentation provides an overview of the available pipelines.
Please note that this documentation does not cover how to develop your own pipelines. For that, refer to the official Nextflow documentation.
Installation
We recommend installing Nextflow using Conda.
Install Nextflow
The following commands will create a new Conda environment and install Nextflow in it.
# Load Miniforge and create a Conda environment with Nextflow pre-installed
module load devel/miniforge
conda create --name nextflow nextflow
conda activate nextflow
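To confirm that the environment works, you can check that the nextflow executable is on your PATH after activating the environment. The following is a minimal sketch; the helper name require_tool is hypothetical and not part of Nextflow or Conda.

```shell
# Hypothetical helper: succeed only if the given command is on the PATH.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found: $(command -v "$1")"
    else
        echo "$1 not found; is the Conda environment activated?" >&2
        return 1
    fi
}

# After 'conda activate nextflow', this should report the executable
# inside the Conda environment. '|| true' keeps a script going either way.
require_tool nextflow || true
```

Once the executable is found, nextflow -version prints the installed version.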
Update Nextflow
To update Nextflow in an existing environment, run the following commands:
module load devel/miniforge
conda activate nextflow
conda update nextflow
Specific Nextflow version
You may want to install a specific version of Nextflow if your pipeline was developed some time ago with an older version in mind. In this example, we will install Nextflow version 20.07:
module load devel/miniforge
conda create --name nextflow_20.07 nextflow=20.07
conda activate nextflow_20.07
Configuration
There are some BinAC 2-specific configurations that you may want to use.
BinAC 2 Nextflow profile
Nextflow configuration files can define one or more profiles, which instruct Nextflow on how to execute pipeline processes on specific systems, such as HPC clusters. The nf-core project maintains a collection of these profiles, including a binac2 profile that runs your pipeline as SLURM jobs on BinAC 2.
nf-core pipelines
If you are using an nf-core pipeline, you can specify the profile with the following command:
nextflow run <pipeline> -profile binac2,<other profiles> [...]
Other Nextflow pipelines
If you are writing your own pipeline or using one that is not based on nf-core, you will need to manually include the nf-core profiles in the pipeline configuration.
Add the following to your nextflow.config:
params {
    [...]
    custom_config_version = 'master'
    custom_config_base = "https://raw.githubusercontent.com/nf-core/configs/${params.custom_config_version}"
}

[...]

// Load nf-core custom profiles from different Institutions
includeConfig !System.getenv('NXF_OFFLINE') && params.custom_config_base ? "${params.custom_config_base}/nfcore_custom.config" : "/dev/null"

// Load nf-core/demo custom profiles from different institutions.
// nf-core: Optionally, you can add a pipeline-specific nf-core config at https://github.com/nf-core/configs
includeConfig !System.getenv('NXF_OFFLINE') && params.custom_config_base ? "${params.custom_config_base}/pipeline/demo.config" : "/dev/null"
Now, your pipeline should be able to find the binac2 profile, and you can run the following command:
nextflow run <pipeline> -profile binac2,<other profiles> [...]
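Before running a pipeline that is not based on nf-core, it can help to check that its nextflow.config actually references the nf-core custom config loader. The following is a minimal sketch; the helper name has_nfcore_profiles is hypothetical, and the check only looks for the file name nfcore_custom.config, not for a syntactically valid includeConfig statement.

```shell
# Hypothetical helper: report whether a Nextflow config file references the
# nf-core custom config loader (nfcore_custom.config).
has_nfcore_profiles() {
    config_file=$1
    [ -f "$config_file" ] && grep -q 'nfcore_custom\.config' "$config_file"
}

# Check the nextflow.config in the current directory.
if has_nfcore_profiles nextflow.config; then
    echo "nextflow.config loads the nf-core institutional profiles"
else
    echo "nextflow.config does not load the nf-core institutional profiles"
fi
```

If the check passes, running nextflow config -profile binac2 in the pipeline directory should print the resolved configuration, assuming the cluster can reach raw.githubusercontent.com.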
Apptainer
Nextflow uses either Conda packages or container images to deploy tools in a pipeline. On BinAC 2, Apptainer is installed on every node, and the binac2 profile automatically enables Apptainer and specifies a cache directory for your images.
apptainer {
    enabled = true
    autoMounts = true
    pullTimeout = '120m'
    cacheDir = "/pfs/10/project/apptainer_cache/${USER}"
    envWhitelist = 'CUDA_VISIBLE_DEVICES'
}
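Container images can take up a lot of space, so you may want to check how large your personal cache has become. The following is a minimal sketch that uses the cacheDir pattern from the profile; the helper name apptainer_cache_dir is hypothetical.

```shell
# Hypothetical helper: print the Apptainer cache directory for a user,
# following the cacheDir pattern from the binac2 profile.
# Without an argument, the current user ($USER) is used.
apptainer_cache_dir() {
    echo "/pfs/10/project/apptainer_cache/${1:-$USER}"
}

cache_dir=$(apptainer_cache_dir)
if [ -d "$cache_dir" ]; then
    # Show the total size of all cached images.
    du -sh "$cache_dir"
else
    echo "No cache directory yet at $cache_dir"
fi
```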
Usage
Nextflow pipelines do not run in the background by default, so it's recommended to use a terminal multiplexer (such as screen or tmux) on the login node when running a pipeline. Terminal multiplexers allow you to have multiple windows within a single terminal. The advantage of using them for running Nextflow pipelines is that you can detach from the terminal and later reattach to check on the pipeline’s progress. This ensures the pipeline continues to run even if you disconnect from the cluster, as the detached session will keep running.
Start a screen session:
screen
Since this is a new terminal session, you will need to load the Conda environment again.
module load devel/miniforge
conda activate nextflow
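If you run several pipelines at once, giving each screen session a name makes it easier to find again. The following is a minimal sketch; the helper session_name and its naming scheme are hypothetical, while -S and -r are standard screen options for naming and reattaching a session.

```shell
# Hypothetical helper: derive a screen session name from a pipeline name.
# Every character other than letters, digits, '_', '.' and '-' becomes '_'.
session_name() {
    printf 'nf_%s\n' "$1" | tr -c 'A-Za-z0-9_.\n-' '_'
}

name=$(session_name nf-core/hlatyping)
echo "Start a named session with:  screen -S $name"
echo "Reattach to it later with:   screen -r $name"
```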
nf-core Pipelines
If you plan to use an nf-core pipeline, please run it once with the test profile.
This will download the pipeline, execute it, and pull all the required containers into the Apptainer cache.
You should always specify two directories when running the pipeline so that you know exactly where the results are stored.
One is the output directory (--outdir), which is where Nextflow stores the final pipeline results.
The other is the work directory (-work-dir), where Nextflow stores intermediate results and job scripts. Please set a work directory on either the work or project file system. Otherwise, it may clutter the backed-up home directory.
nextflow run nf-core/hlatyping -profile binac2,test --outdir <your output directory> -work-dir <your work directory>
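To avoid accidentally placing the work directory in your home directory, you can test the path before starting the pipeline. The following is a minimal sketch; the helper workdir_outside_home is hypothetical, expects an absolute path, and the path shown is a placeholder.

```shell
# Hypothetical helper: fail if the given absolute path is $HOME or lies inside it.
workdir_outside_home() {
    case "$1" in
        "$HOME"|"$HOME"/*) return 1 ;;
        *) return 0 ;;
    esac
}

# Placeholder path; replace it with your own directory on the work or project file system.
work_dir="/path/to/your/work/directory"

if workdir_outside_home "$work_dir"; then
    echo "OK: $work_dir is not in the home directory"
else
    echo "Please choose a work directory outside $HOME" >&2
fi
```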
As mentioned, the pipeline runs in a screen session.
You can detach from the screen session, and the pipeline will continue to run.
The keyboard shortcut for detaching is CTRL + a, followed by d.
This means you press the CTRL and a keys simultaneously, then release them and press d.
You should now be detached from the screen session and back in your login terminal.
While in your login terminal (or another window within your screen session), you can observe that Nextflow has submitted a job to the cluster for each pipeline process execution.
Your output may differ, but it should show some pipeline jobs whose names begin with nf-NFCORE.
[tu_iioba01@login03 ~]$ squeue -u $USER
JOBID   PARTITION  NAME      USER      ST  TIME  NODES  NODELIST(REASON)
162040  compute    nf-NFCOR  tu_iioba  PD  0:00  1      (None)
162039  compute    nf-NFCOR  tu_iioba  R   0:01  1      node1-083
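For larger pipelines, the job list quickly becomes long, and a summary per job state is easier to read. The following is a minimal sketch that assumes the default squeue column layout shown in the output above (NAME in column 3, ST in column 5); the helper count_nf_jobs is hypothetical.

```shell
# Hypothetical helper: read squeue output on stdin and print the number of
# jobs per state (column ST) whose name (column NAME) starts with "nf-".
count_nf_jobs() {
    awk 'NR > 1 && $3 ~ /^nf-/ { n[$5]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Only query the scheduler where squeue is available.
if command -v squeue >/dev/null 2>&1; then
    squeue -u "$USER" | count_nf_jobs
fi
```

Each output line contains a state code such as PD (pending) or R (running), followed by the number of Nextflow jobs in that state.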
Now, we return to the Nextflow process in the screen session where the pipeline is running.
You can list your screen sessions and their IDs with the command screen -ls.
(nf-core) [tu_iioba01@login03 nextflow_tests]$ screen -ls
There is a screen on:
        <screen session ID>.pts-2.login03 (Detached)
1 Socket in /var/run/screen/S-tu_iioba01.
If there is only one screen session, you can reattach using the following command:
screen -r
Otherwise, you will need to specify the screen session ID:
screen -r <screen session ID>
You can monitor the pipeline's execution progress. The test profile typically runs for less than 10 minutes. In the end, it should look like this:
-[nf-core/hlatyping] Pipeline completed successfully-
Completed at: 16-Apr-2025 16:18:26
Duration    : 9m 16s
CPU hours   : 0.4
Succeeded   : 7
The test run was successful. You can now run the pipeline with your own data.
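If you saved the terminal output of a run to a file, you can also check for the success message without reattaching to the screen session. The following is a minimal sketch; the helper pipeline_succeeded and the file name pipeline_output.log are assumptions, and the check relies on the nf-core message text shown above.

```shell
# Hypothetical helper: succeed if the file contains the nf-core success message.
pipeline_succeeded() {
    [ -f "$1" ] && grep -q 'Pipeline completed successfully' "$1"
}

# 'pipeline_output.log' is an assumed file name for saved terminal output.
if pipeline_succeeded pipeline_output.log; then
    echo "Pipeline finished successfully"
else
    echo "No success message found (yet)"
fi
```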
Run nf-core pipeline with your own data
Typically, you specify your input files for nf-core pipelines in a samplesheet and run the pipeline with the parameter --input <your samplesheet>. You can also override any nf-core pipeline default settings according to your needs by using a custom configuration file. For more information, please refer to the pipelines' documentation.
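Putting the pieces together, a run with your own data might combine the samplesheet, both directories, and a custom configuration file. The following is a minimal sketch that only prints the command line so you can review it first; the helper build_run_command and all file and directory names are placeholders, and -c is the Nextflow option for supplying an additional configuration file.

```shell
# Hypothetical helper: print (not run) a nextflow command line for an nf-core pipeline.
# Arguments: pipeline, samplesheet, output directory, work directory, custom config.
build_run_command() {
    printf 'nextflow run %s -profile binac2 --input %s --outdir %s -work-dir %s -c %s\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# All file and directory names below are placeholders.
build_run_command nf-core/hlatyping samplesheet.csv results /path/to/work custom.config
```

The samplesheet format and the available parameters differ between pipelines, so check the printed command against the documentation of the pipeline you are using.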
As usual, you can contact BinAC support if you have any problems or questions.