BinAC/Software/Alphafold
Revision as of 12:11, 20 February 2024
The main documentation is available via:

{| class="wikitable"
|-
! Description !! Content
|-
| module load || <code>bio/alphafold</code>
|-
| License || Apache License 2.0 - see [1]
|-
| Citing || See [2]
|-
| Links || DeepMind AlphaFold Website: [3]
|}
= Description =
AlphaFold, developed by DeepMind, predicts protein structures from the amino acid sequence at or near experimental resolution.
= Usage =
BinAC provides AlphaFold via an Apptainer container. Both the container and the AlphaFold databases are stored on the WORK filesystem.

The module <code>bio/alphafold</code> provides a wrapper script called <code>alphafold</code>. Upon loading the module, the <code>alphafold</code> wrapper is in <code>PATH</code> and can be used directly. The wrapper behaves like the run script from DeepMind's AlphaFold GitHub repository; thus, all options explained in that repository also apply to our <code>alphafold</code> wrapper.
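For example, a typical invocation might look like the following sketch. The FASTA file name and output directory are placeholders; <code>--fasta_paths</code>, <code>--output_dir</code>, and <code>--max_template_date</code> are standard AlphaFold run-script options.

```shell
# Load the module; this puts the 'alphafold' wrapper in PATH.
module load bio/alphafold

# Run a prediction. File names and the date below are placeholders.
alphafold \
  --fasta_paths=target.fasta \
  --output_dir=$HOME/alphafold_output \
  --max_template_date=2024-01-01
```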
= Parallel Computing =
AlphaFold's algorithm results in two optimal resource profiles regarding the number of cores and GPUs, depending on how you run AlphaFold. The memory requirement depends on the protein size.

== Compute MSAs ==
In the beginning, AlphaFold computes three multiple sequence alignments (MSAs). These MSAs are computed sequentially on the CPU, and the number of threads is hard-coded:

* jackhmmer on UniRef90 using 8 threads
* jackhmmer on MGnify using 8 threads
* HHblits on BFD + Uniclust30 using 4 threads

After computing the MSAs, AlphaFold performs model inference on the GPU. Only one GPU is used. This use case has the following optimal resource profile:

<pre>
#PBS -l nodes=1:ppn=8:gpus=1
</pre>

The three MSAs are stored in the directory specified by <code>--output_dir</code> and can be reused with <code>--use_precomputed_msas=true</code>.
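The steps above can be put together in a job script. The following is a minimal sketch, assuming a hypothetical FASTA file, output directory, and walltime:

```shell
#!/bin/bash
#PBS -N alphafold_full
#PBS -l nodes=1:ppn=8:gpus=1
#PBS -l walltime=24:00:00

# Paths and walltime are placeholders; adjust to your data.
cd $PBS_O_WORKDIR
module load bio/alphafold

# Computes the three MSAs on 8 CPU cores, then runs inference on one GPU.
alphafold \
  --fasta_paths=target.fasta \
  --output_dir=$PBS_O_WORKDIR/alphafold_output \
  --max_template_date=2024-01-01
```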
== Use existing MSAs ==
The switch <code>--use_precomputed_msas=true</code> lets you reuse MSAs that were computed by an earlier AlphaFold run. As AlphaFold skips the computation of the MSAs, only the model inference step runs, on a single GPU. Thus the optimal resource profile is:

<pre>
#PBS -l nodes=1:ppn=1:gpus=1
</pre>
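A job script for this profile might look like the following sketch. Paths and walltime are placeholders; it assumes <code>--output_dir</code> points at the directory of the earlier run so the stored MSAs are found:

```shell
#!/bin/bash
#PBS -N alphafold_reuse
#PBS -l nodes=1:ppn=1:gpus=1
#PBS -l walltime=04:00:00

cd $PBS_O_WORKDIR
module load bio/alphafold

# Reuse the MSAs from a previous run; only model inference runs on the GPU.
alphafold \
  --fasta_paths=target.fasta \
  --output_dir=$PBS_O_WORKDIR/alphafold_output \
  --max_template_date=2024-01-01 \
  --use_precomputed_msas=true
```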
= Example on BinAC =
= Benchmark on BinAC =
We ran some CASP14 targets with the <code>--benchmark=true</code> flag on BinAC. The following table gives you some guidance for choosing meaningful memory and walltime values.

{| class="wikitable" style="margin:auto"
|+ Benchmark results on BinAC (work in progress)
|-
! Target !! #Residues !! jackhmmer UniRef90 [s] !! jackhmmer MGnify [s] !! HHblits on BFD [s] !! Inference [s] !! Memory Usage [GB]
|-
| ... || ... || ... || ... || ... || ... || ...
|-
| ... || ... || ... || ... || ... || ... || ...
|-
| ... || ... || ... || ... || ... || ... || ...
|}