Memory Usage

1 Memory Usage

Compute jobs are often aborted because the compute node does not have enough memory to hold all the data. This page gives guidelines and tips for selected tools: how much memory to request for them, and how the required memory scales with the amount of data and the number of processors.

1.1 Specify memory in the job script

Please remember to specify the memory your job needs in the job script. Otherwise the scheduler may cancel the job when the job tries to allocate memory.

With the mem option you can request xx GB of memory for the whole job.

#PBS -l mem=xxgb

With the pmem option you can request memory per core. This example requests 4 * xx GB in total, because the job runs on 4 cores.

#PBS -l nodes=1:ppn=4
#PBS -l pmem=xxgb
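
The arithmetic behind pmem can be checked with a small shell function. This is only a sketch: the helper name total_mem_gb and the values used (4 cores, 2 GB per core) are illustrative assumptions and do not come from the cluster documentation.

```shell
# Hypothetical helper: total memory in GB that a job requests via pmem.
# Arguments: number of cores (the ppn value), memory per core in GB (the pmem value).
total_mem_gb() {
    local cores=$1 pmem_gb=$2
    echo $(( cores * pmem_gb ))
}

# Illustrative values only: 4 cores with 2 GB each.
total_mem_gb 4 2   # prints 8
```

The result is the amount the whole job may use on the node, so it should not exceed the memory physically available there.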

After the job has finished, a feedback script reports the memory your job used. This helps you make a more educated guess about how much memory your jobs really need. In the example below, the job used 420 MB of the 10220 MB requested.

Memory job, MB                              | 420         [req.     10220]
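
One way to turn such a feedback line into a tighter request is sketched below. The helper name suggest_mem_mb and the 20 % headroom are assumptions made for illustration, not a site recommendation, and the parsing relies only on the single line format shown above.

```shell
# Hypothetical helper: read one "Memory job, MB" feedback line and print
# "used requested suggested" (all in MB), where suggested is the used
# memory plus 20 % headroom. The 20 % margin is an assumption.
suggest_mem_mb() {
    local line=$1
    printf '%s\n' "$line" | awk -F'|' '{
        # The second field looks like: " 420         [req.     10220]"
        gsub(/[^0-9 ]/, "", $2)   # keep only digits and blanks
        split($2, n, " ")         # n[1] = used, n[2] = requested
        # Integer arithmetic: used * 1.2, rounded to the nearest MB.
        printf "%d %d %d\n", n[1], n[2], int((n[1] * 12 + 5) / 10)
    }'
}

suggest_mem_mb 'Memory job, MB                              | 420         [req.     10220]'
# prints: 420 10220 504
```

Memory use can vary between runs and input sizes, so treat the suggested value as a starting point and check the feedback of the next job again.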

1.2 Tools

1.2.1 Mothur

When analyzing reads with Mothur, a distance matrix is created at some point. These pairwise distances are saved in a file with the .dist suffix. The matrix is held in memory when you run Mothur commands such as cluster or cluster.split.

As stated in this Mothur forum post, the required memory increases with the number of processes used. If you experience memory problems with your Mothur script, you may have to use fewer processes for the clustering.

Note: The more processors are used, the more memory is required, because each process loads a distance matrix into memory (RAM).
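
The note above suggests a rough upper bound on the number of processes. The sketch below assumes that every process holds one complete copy of the matrix and ignores all other memory use. The helper name max_processes and the numbers are hypothetical, and the in-memory size of the matrix is an estimate you have to supply yourself.

```shell
# Hypothetical helper: largest number of processes whose matrix copies
# fit into the requested memory.
# Arguments: requested memory in GB, estimated in-memory matrix size in GB
# (both positive integers).
# A result of 0 means that not even a single copy fits.
max_processes() {
    local mem_gb=$1 matrix_gb=$2
    echo $(( mem_gb / matrix_gb ))
}

# Illustrative values only: 64 GB requested, matrix estimated at 12 GB.
max_processes 64 12   # prints 5
```

Because the estimate leaves out Mothur's other memory needs, staying below the computed number is the safer choice.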