JUSTUS2/Software/Gaussian: Difference between revisions
Revision as of 13:38, 18 April 2015
Key facts | |
---|---|
Module name | chem/gaussian |
Availability | bwForCluster_Chemistry |
License | commercial |
Citing | See Gaussian manual |
Links | Homepage; Manual; IOps Reference |
Graphical interface | See Gaussview |
Description
Gaussian is a general purpose quantum chemistry software package for ab initio electronic structure calculations. It provides:
- ground state calculations for methods such as HF, many DFT functionals, MP2/3/4 or CCSD(T);
- basic excited state calculations such as TDHF or TDDFT;
- coupled multi-shell QM/MM calculations (ONIOM);
- geometry optimizations, transition state searches, molecular dynamics calculations;
- property and spectra calculations such as IR, UV/VIS, Raman or CD; as well as
- shared-memory parallel versions for almost all kinds of jobs.
For more information on features please visit Gaussian's Overview of Capabilities and Features web page.
Versions and Availability
A list of versions currently available on the bwForCluster Chemistry can be obtained from the Cluster Information System (CIS):
https://cis-hpc.uni-konstanz.de/prod.cis/Justus/chem/gaussian
On the command line of a particular bwHPC cluster, a list of all available Gaussian versions is displayed by the command
$ module avail chem/gaussian
Parallel computing
The binaries of the Gaussian module can run in serial and shared-memory parallel mode. Switching between the serial and the parallel version is done via the statement
%NProcShare=PPN
in the Link 0 section, which precedes the route section at the beginning of the Gaussian input file. Replace PPN with the number of parallel cores. This value must be identical to the ppn value specified when requesting resources from the queueing system. The installed Gaussian binaries are shared-memory parallel, so only single-node jobs make sense. Without an NProcShare statement, the serial version of Gaussian is selected.
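Because the NProcShare value and the ppn value must agree, a small consistency check before submitting can be useful. The following is only a sketch: check_nprocshare is a hypothetical helper that is not part of the Gaussian module, and it assumes the statement is written on a line of its own as %NProcShare=N.

```shell
# Sketch: verify that %NProcShare in a Gaussian input file matches the
# number of cores (ppn) requested from the queueing system.
# check_nprocshare is a hypothetical helper, not provided by the module.
check_nprocshare() {
    input_file=$1
    ppn=$2
    # Read the value of the first %NProcShare (or %NProcShared) line,
    # ignoring upper/lower case and surrounding whitespace.
    nproc=$(awk -F= 'tolower($1) ~ /^%nprocshared?$/ { gsub(/[ \t\r]/, "", $2); print $2; exit }' "$input_file")
    # No statement at all means the serial version is used, i.e. one core.
    nproc=${nproc:-1}
    if [ "$nproc" = "$ppn" ]; then
        echo "OK: NProcShare=$nproc matches ppn=$ppn"
    else
        echo "MISMATCH: NProcShare=$nproc but ppn=$ppn"
        return 1
    fi
}

# Example usage (file name and ppn value are placeholders):
#   check_nprocshare my_job.com 8
```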
Usage
Loading the module
You can load the default version of Gaussian with the command:
$ module load chem/gaussian
The Gaussian module does not depend on any other module (no dependencies).
If you wish to load a specific version you may do so by specifying the version explicitly, e.g.
$ module load chem/gaussian/g09.D.01
to load version g09.D.01 of Gaussian.
Running Gaussian interactively
After loading the Gaussian module you can run a quick interactive example by executing
$ time g09 < $GAUSSIAN_EXA_DIR/test0553-8core-parallel.com
In most cases running Gaussian requires setting up the command input file and piping that input into g09.
Creating Gaussian input files
For documentation on how to construct input files, see the Gaussian manual. In addition, the program Gaussview is a very good graphical user interface for constructing molecules and setting up calculations. These calculation setups can be saved as Gaussian command files and then submitted to the cluster with the help of the queueing system examples below.
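For orientation, a very small input file can also be written by hand. The sketch below is illustrative only: write_water_input is a hypothetical helper, and the method (HF/STO-3G), the memory value and the water geometry are arbitrary choices for a quick serial test, not recommendations.

```shell
# Sketch: write a minimal Gaussian input file for a serial single point
# calculation on water. All values are illustrative assumptions.
write_water_input() {
    cat > "$1" <<'EOF'
%Mem=1000MB
# HF/STO-3G SP

Water single point energy (illustrative example)

0 1
O   0.000000   0.000000   0.117790
H   0.000000   0.755453  -0.471161
H   0.000000  -0.755453  -0.471161

EOF
}

# Example usage (requires the loaded Gaussian module):
#   write_water_input water.com
#   g09 < water.com > water.log
```

The layout follows the usual order of a Gaussian input file: Link 0 commands, route section, blank line, title, blank line, charge and multiplicity, geometry, and a terminating blank line.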
Disk usage
By default, scratch files of Gaussian are placed in GAUSS_SCRDIR as displayed when loading the Gaussian module. In most cases the module load command of Gaussian should set GAUSS_SCRDIR to point to an optimal node-local file system. When running multiple Gaussian jobs together on one node, a user may want to add one more sub-directory level containing e.g. the job id and job name for clarity, if the queueing system has not done so already.
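Such an extra sub-directory level could be set up roughly as follows. This is a sketch under assumptions: job_scratch_dir is a hypothetical helper, and it assumes that the queueing system exports MOAB_JOBID and MOAB_JOBNAME inside the job; the fallback values are placeholders for interactive use.

```shell
# Sketch: build a per-job scratch path below the current GAUSS_SCRDIR.
# MOAB_JOBID and MOAB_JOBNAME are assumed to be set by the queueing
# system; the fallbacks are placeholders for interactive sessions.
job_scratch_dir() {
    base=${GAUSS_SCRDIR:-${TMPDIR:-/tmp}}
    job_id=${MOAB_JOBID:-interactive}
    job_name=${MOAB_JOBNAME:-gaussian}
    printf '%s/%s.%s\n' "$base" "$job_id" "$job_name"
}

# Example usage inside a job script, after loading the module:
#   GAUSS_SCRDIR=$(job_scratch_dir)
#   mkdir -p "$GAUSS_SCRDIR"
#   export GAUSS_SCRDIR
```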
Predicting how much disk space a specific Gaussian calculation requires is a very difficult task. It requires experience with the methods, the basis sets, the calculated properties and the system you are investigating. The best advice is probably to start with small basis sets and small example systems, run such example calculations and observe their (hopefully small) disk usage while the job is running. Then read the Gaussian documentation about scaling behaviour and basis set sizes (the basis set size of the current calculation is printed at the beginning of the output of the Gaussian job). Finally try to extrapolate to your desired final system and basis set.
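Observing the disk usage of a running test job can be done with standard tools. The following sketch uses du on the scratch directory; scratch_usage_kb is a hypothetical helper, and the 60 second polling interval in the usage comment is an arbitrary choice.

```shell
# Sketch: print the current size of a scratch directory in kilobytes.
scratch_usage_kb() {
    # du -sk prints "<size in KB><TAB><path>"; keep only the size.
    du -sk "$1" | cut -f1
}

# Example usage while a test job is running on the node:
#   while sleep 60; do
#       echo "$(date +%T) $(scratch_usage_kb "$GAUSS_SCRDIR") KB"
#   done
```

The largest value observed during the run gives a data point for the extrapolation to larger systems and basis sets.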
You can also try to specify a fixed amount of disk space for a calculation. This is done by adding a keyword like
MaxDisk=50000MB
to the route section of the Gaussian input file. But please be aware that (a) Gaussian does not necessarily obey the specified value and (b) you might force Gaussian to select a slower algorithm when specifying an inappropriate value.
In any case, please make sure that you request sufficient, but not excessive, node-local disk space from the queueing system. For information on how much node-local disk space is available at the cluster and how to request a certain amount of node-local disk space for a calculation from the queueing system, please consult the cluster-specific queueing system documentation as well as the queueing system examples of the Gaussian module as described below.
Except for very short interactive test jobs please never run Gaussian calculations in any globally mounted directory like your $HOME or $WORK directory.
Memory usage
Predicting the memory requirements of a job is nearly as difficult as predicting the disk requirements, but the strategy is similar: start with small test systems and small basis sets and then extrapolate.
You may specify the memory for a calculation explicitly in the Link 0 section of the Gaussian input file (before the route section), for example
%Mem=10000MB
Gaussian usually obeys this value rather well. We have seen calculations that exceed the Mem value by at most 2 GB. Therefore it is usually sufficient to request Mem + 2 GB from the queueing system.
But please carefully monitor the output of Gaussian when restricting the memory in the input file. Gaussian automatically switches between algorithms (e.g. recalculating values instead of storing them) when the specified memory value is too low. So if the output indicates that with more memory e.g. the integrals could be kept in memory, the calculation might be much faster when more memory is assigned.
In shared-memory parallel jobs the number of workers has only a minor influence on the memory consumption (maybe up to 10%). This is because all workers work together on one common data set.
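The Mem + 2 GB rule of thumb can be written as a one-line calculation. This is a minimal sketch: queue_mem_mb is a hypothetical helper, and it assumes that 2 GB is counted as 2048 MB.

```shell
# Sketch: memory to request from the queueing system for a given %Mem
# value in MB, using the Mem + 2 GB rule of thumb (2 GB taken as 2048 MB).
queue_mem_mb() {
    mem_mb=$1
    echo $(( mem_mb + 2048 ))
}

# Example: for %Mem=10000MB request
#   queue_mem_mb 10000    # prints 12048
```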
Using SSD systems efficiently
Compared with conventional disks, SSDs are far more than 1000 times faster when serving random-IO requests. Therefore some of the default strategies of Gaussian, e.g. recalculating some values instead of storing them on disk, might not be optimal in all cases. Of course this is only relevant when there is not enough RAM to store the intermediate values, e.g. two-centre integrals.
So if you plan to do many huge calculations that do not fit into the RAM, you may want to compare the execution time of a job that recalculates the intermediate values whenever needed with that of a job that forces these values to be written to and read from the node-local SSDs. Depending on how much time it costs to recalculate the intermediate values, using the SSDs can be much faster.
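Such a comparison can be timed with a small wrapper. The sketch below only measures wall-clock time; elapsed_seconds is a hypothetical helper, and the two input file names in the usage comment are placeholders for inputs that you have prepared yourself with the two strategies.

```shell
# Sketch: run a command and print its wall-clock time in whole seconds.
# Standard input is passed through, so "g09 < input.com" style calls work.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1 || echo "command failed: $*" >&2
    end=$(date +%s)
    echo $(( end - start ))
}

# Example usage (input file names are placeholders):
#   t_recalc=$(elapsed_seconds g09 < job_recalculate.com)
#   t_stored=$(elapsed_seconds g09 < job_store_on_ssd.com)
#   echo "recalculate: ${t_recalc}s, store on SSD: ${t_stored}s"
```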
Examples
Single node jobs
Queueing system template provided by Gaussian module
The Gaussian module provides a simple Moab example of hexanitroethane (C2N6O12) that runs an 8-core parallel single-point energy calculation using method B3LYP and basis set 6-31g(df,pd). To submit the example, do the following steps:
$ ws_allocate calc_repo 30; cd $(ws_find calc_repo)
$ mkdir my_first_job; cd my_first_job
$ module load chem/gaussian
$ cp -v ${GAUSSIAN_EXA_DIR}/{bwforcluster-gaussian-example.moab,test0553-*.com} ./
$ msub bwforcluster-gaussian-example.moab
The last step submits the job example script bwforcluster-gaussian-example.moab to the queueing system. Once started on a compute node, all calculations will be done under a unique directory on the local file system ($TMPDIR) of that particular compute node. Please carefully read the local file system documentation as well as the comments in the queueing system example script bwforcluster-gaussian-example.moab.
Version-Specific Information
For specific information about version VERSION see the information available via the module system with the command
$ module help chem/gaussian/VERSION
Please read the local module help documentation before using the software. The module help contains links to additional documentation and resources as well as information about support contact.