Latest revision as of 10:13, 4 August 2022
= Memory Usage =
Often compute jobs are aborted because there is not enough memory available on the compute node to hold all the data. This page gives guidelines and tips on how much memory to request for selected tools and how the required memory scales with the data and the number of processors.
== Specify memory in the job script ==
Please remember to specify the memory your job needs in the job script. Otherwise the job may be cancelled by the scheduler when it tries to allocate memory.
With the <code>mem</code> option you can request xx GB of memory for the whole job.

<source lang="bash">
#PBS -l mem=xxgb
</source>
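As a rough, purely illustrative sketch of sizing the request: if your tool holds the whole input in memory, you can start from the input size and an assumed overhead factor. Both numbers below are made up, and the factor depends entirely on the tool.

<source lang="bash">
# Illustrative only: derive a mem request from the input size.
input_bytes=$((3 * 1024 ** 3))   # e.g. a 3 GB input file (in practice: stat -c %s <file>)
overhead=2                       # assumed safety factor; tool dependent
need_gb=$(( (input_bytes / 1024 ** 3) * overhead ))
echo "#PBS -l mem=${need_gb}gb"
</source>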
With the <code>pmem</code> option you can request memory per core. This example will request 4 * xx GB in total, as the job runs on 4 cores.

<source lang="bash">
#PBS -l nodes=1:ppn=4
#PBS -l pmem=xxgb
</source>
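The relation above can be checked with a quick shell sketch (the values are illustrative, not recommendations):

<source lang="bash">
# Illustrative: the total memory available to the job is ppn * pmem.
ppn=4        # cores, as in '#PBS -l nodes=1:ppn=4'
pmem_gb=8    # per-core request, as in '#PBS -l pmem=8gb'
total_gb=$((ppn * pmem_gb))
echo "The job may use up to ${total_gb} GB in total"
</source>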
After the job has finished, a feedback script reports the memory used by your job. This can help you make a more educated guess about how much memory your jobs really need.

<source lang="bash">
Memory job, MB | 420 [req. 10220]
</source>
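If you want to compare used and requested memory programmatically, here is a small sketch that parses a line in the format shown above (the format is taken from the example; check it against your actual feedback output):

<source lang="bash">
# Parse 'Memory job, MB | 420 [req. 10220]' into used and requested MB.
line='Memory job, MB | 420 [req. 10220]'
used_mb=$(echo "$line" | awk -F'|' '{print $2}' | awk '{print $1}')
req_mb=$(echo "$line" | grep -o '[0-9]*]$' | tr -d ']')
echo "used ${used_mb} MB of ${req_mb} MB requested"
</source>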
== Tools ==
=== Mothur ===
When analyzing reads with Mothur, at some point a distance matrix is created. The pairwise distances are saved in a file with the <code>.dist</code> extension. This matrix is held in memory when you run Mothur commands like <code>cluster</code> or <code>cluster.split</code>.
As stated in [https://forum.mothur.org/t/cluster-split-and-computer-characteristics/20147 this Mothur forum post], the required memory increases with the number of processes used. If you experience memory problems with your Mothur script, you may have to use fewer processes for the clustering.
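As a hedged sketch, lowering the process count for the clustering step could look like this in a Mothur batch file (the file names and <code>taxlevel</code> are placeholders from a typical pipeline, not values from this page):

<source lang="bash">
# Hypothetical Mothur batch line: fewer processors means fewer in-memory
# copies of the distance matrix, at the cost of a longer runtime.
cluster.split(fasta=final.fasta, count=final.count_table, taxonomy=final.taxonomy, taxlevel=4, processors=2)
</source>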
<source lang="bash">
Note: The more processors used the more memory is required. Each process will load a distance matrix into memory (RAM).
</source>