JUSTUS2/Software/Gaussian

From bwHPC Wiki
Description: Content
module load: chem/gaussian
Availability: BwForCluster_Chemistry
License: Commercial. See: Pricing for Gaussian Products
Citing: See Gaussian manual
Links: Homepage | Manual | IOps Reference
Graphical Interface: Yes. See Gaussview.

Description

Gaussian is a general purpose quantum chemistry software package for ab initio electronic structure calculations. It provides:

  • ground state calculations for methods such as HF, many DFT functionals, MP2/3/4 or CCSD(T);
  • basic excited state calculations such as TDHF or TDDFT;
  • coupled multi-shell QM/MM calculations (ONIOM);
  • geometry optimizations, transition state searches, molecular dynamics calculations;
  • property and spectra calculations such as IR, UV/VIS, Raman or CD; as well as
  • shared-memory parallel versions for almost all kinds of jobs.

For more information on features please visit Gaussian's Overview of Capabilities and Features web page.

Versions and Availability

A list of versions currently available on the bwForCluster Chemistry can be obtained from the Cluster Information System (CIS): https://cis-hpc.uni-konstanz.de/prod.cis/Justus/chem/gaussian

On the command line of a particular bwHPC cluster, a list of all available versions is displayed by the command

$ module avail chem/gaussian

Parallel computing

The binaries of the Gaussian module can run in serial and shared-memory parallel mode. Switching between the serial and the parallel version is done via the statement

%NProcShare=PPN

in the Link 0 section, which precedes the route section at the beginning of the Gaussian input file. Replace PPN with the number of parallel cores. This value must be identical to the ppn value specified when requesting resources from the queueing system. The installed Gaussian binaries are shared-memory parallel, so only single-node jobs make sense. Without an NProcShare statement, the serial version of Gaussian is selected.
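The consistency requirement between the input file and the resource request can be checked before submission. The following is a minimal sketch, not part of the Gaussian module: the function names (write_input, input_cores, check_cores), the water geometry, and the HF/STO-3G route line are illustrative assumptions.

```shell
# Write a small Gaussian input file with a Link 0 line for $2 cores.
# The molecule and route line are placeholders for illustration only.
write_input() {
    # $1 = output file, $2 = number of cores
    cat > "$1" <<EOF
%NProcShare=$2
# HF/STO-3G

water single point (illustrative placeholder)

0 1
O   0.000000   0.000000   0.117300
H   0.000000   0.757200  -0.469200
H   0.000000  -0.757200  -0.469200

EOF
}

# Print the core count given in the Link 0 section (empty if absent).
input_cores() {
    grep -i '^%nprocshare' "$1" | head -n 1 | sed 's/^[^=]*=//' | tr -d ' \r'
}

# Compare the input file with the core count requested from the
# queueing system. A missing statement counts as 1 core (serial).
check_cores() {
    # $1 = input file, $2 = cores requested from the queueing system
    found=$(input_cores "$1")
    if [ "${found:-1}" -eq "$2" ]; then
        echo "ok: input uses ${found:-1} core(s)"
    else
        echo "mismatch: input uses ${found:-1} core(s), job requests $2" >&2
        return 1
    fi
}

demo_dir=$(mktemp -d)
write_input "$demo_dir/water.com" 8
check_cores "$demo_dir/water.com" 8
```

Running check_cores with the same number that is passed to the queueing system as ppn catches the common mistake of editing one value but not the other.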

Usage

Loading the module

You can load the default version of Gaussian with the command:

$ module load chem/gaussian

The Gaussian module does not depend on any other module (no dependencies).

If you wish to load a specific version you may do so by specifying the version explicitly, e.g.

$ module load chem/gaussian/g09.D.01

to load version g09.D.01 of Gaussian.

Running Gaussian interactively

After loading the Gaussian module you can run a quick interactive example by executing

$ time g09 < $GAUSSIAN_EXA_DIR/test0553-8core-parallel.com

In most cases, running Gaussian requires setting up an input (command) file and redirecting that file to the standard input of g09.

Creating Gaussian input files

For documentation on how to construct input files, see the Gaussian manual. In addition, the program Gaussview is a very good graphical user interface for constructing molecules and setting up calculations. These calculation setups can be saved as Gaussian command files and then submitted to the cluster with the help of the queueing system examples below.

Disk usage

By default, Gaussian places its scratch files in the directory given by GAUSS_SCRDIR, whose value is displayed when loading the Gaussian module. In most cases the module load command sets GAUSS_SCRDIR to point to an optimal node-local file system. When running multiple Gaussian jobs together on one node, you may want to add one more sub-directory level containing, for example, the job id and job name for clarity, if the queueing system has not already done so.
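Adding such a sub-directory level could look like the sketch below. JOB_ID and JOB_NAME are hypothetical placeholder variables, and make_job_scratch is an illustrative helper, not something the module provides; substitute the job variables your queueing system actually exports.

```shell
# Create a per-job scratch sub-directory and print its path.
make_job_scratch() {
    # $1 = base scratch directory, $2 = job id, $3 = job name
    dir="$1/${2}_${3}"
    mkdir -p "$dir" && printf '%s\n' "$dir"
}

# Fall back to TMPDIR or /tmp if the module has not set GAUSS_SCRDIR.
base="${GAUSS_SCRDIR:-${TMPDIR:-/tmp}}"

# JOB_ID and JOB_NAME are placeholders (assumptions), with dummy defaults.
GAUSS_SCRDIR=$(make_job_scratch "$base" "${JOB_ID:-12345}" "${JOB_NAME:-test0553}")
export GAUSS_SCRDIR
echo "Gaussian scratch files go to: $GAUSS_SCRDIR"
```

The export has to happen after the module is loaded and before g09 is started, so that Gaussian picks up the new location.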

Predicting how much disk space a specific Gaussian calculation requires is a very difficult task. It requires experience with the methods, the basis sets, the calculated properties and the system you are investigating. The best advice is probably to start with small basis sets and small example systems, run such example calculations and observe their (hopefully small) disk usage while the job is running. Then read the Gaussian documentation about scaling behaviour and basis set sizes (the basis set size of the current calculation is printed at the beginning of the output of the Gaussian job). Finally try to extrapolate to your desired final system and basis set.
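Observing the disk usage of a running test job can be done with standard tools. The sketch below samples the size of a scratch directory with du; the function names and the dummy file are illustrative assumptions, and in practice the directory would be the GAUSS_SCRDIR of the running job.

```shell
# Print the current size of a directory in KB.
scratch_usage_kb() {
    du -sk "$1" | cut -f1
}

# Sample the directory size several times with a pause in between.
watch_scratch() {
    # $1 = directory, $2 = seconds between samples, $3 = number of samples
    i=0
    while [ "$i" -lt "$3" ]; do
        printf '%s %s KB\n' "$(date +%H:%M:%S)" "$(scratch_usage_kb "$1")"
        i=$((i + 1))
        if [ "$i" -lt "$3" ]; then sleep "$2"; fi
    done
}

# Demonstration on a temporary directory holding a dummy scratch file.
demo_scr=$(mktemp -d)
printf 'dummy scratch data\n' > "$demo_scr/Gau-0001.rwf"
watch_scratch "$demo_scr" 1 2
```

For a real job, a call such as watch_scratch "$GAUSS_SCRDIR" 60 30 on the compute node would record one sample per minute for half an hour; the largest value seen gives a lower bound for the disk space to request.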

You can also try to specify a fixed amount of disk space for a calculation. This is done by adding a keyword like

MaxDisk=50000MB

to the route section of the Gaussian input file. But please be aware that (a) Gaussian does not necessarily obey the specified value and (b) you might force Gaussian to select a slower algorithm when specifying an inappropriate value.

In any case, please make sure that you request sufficient, but not excessive, node-local disk space from the queueing system. For information on how much node-local disk space is available on the cluster and how to request a certain amount of it for a calculation, please consult the cluster-specific queueing system documentation as well as the queueing system examples of the Gaussian module as described below.

Except for very short interactive test jobs please never run Gaussian calculations in any globally mounted directory like your $HOME or $WORK directory.

Memory usage

Predicting the memory requirements of a job is nearly as difficult as predicting the disk requirements, but the strategy is very similar: start with small test systems and small basis sets, then extrapolate.

You may specify the memory for a calculation explicitly in the Link 0 section of the Gaussian input file, for example

%Mem=10000MB

Gaussian usually obeys this value rather well. We have seen calculations exceed the Mem value by at most 2 GB. Therefore it is usually sufficient to request Mem + 2 GB from the queueing system.
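The arithmetic behind this rule of thumb can be scripted. The following sketch converts a Mem value to MB and adds the 2 GB margin; the helper names are illustrative, only the MB and GB suffixes are handled, and 1 GB is taken as 1024 MB (an assumption made for this example).

```shell
# Convert a Mem value with an MB or GB suffix to MB.
mem_to_mb() {
    case "$1" in
        *[Gg][Bb]) echo $(( ${1%??} * 1024 )) ;;
        *[Mm][Bb]) echo "${1%??}" ;;
        *) echo "unsupported unit in '$1'" >&2; return 1 ;;
    esac
}

# Memory to request from the queueing system: Mem plus a 2 GB margin.
request_mb() {
    mb=$(mem_to_mb "$1") || return 1
    echo $(( mb + 2048 ))
}

request_mb 10000MB   # prints 12048
```

For the %Mem=10000MB example above, this yields a request of 12048 MB.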

Please monitor the output of Gaussian carefully when restricting the memory in the input file. When the specified memory is too low, Gaussian automatically switches to other algorithms (e.g. recalculating values instead of storing them). So if the output indicates that, with more memory, the integrals could for example be kept in memory, the calculation might run much faster when more memory is assigned.

In the case of shared-memory parallel jobs, the number of workers has only a minor influence on the memory consumption (maybe up to 10%). This is because all workers work together on one common data set.

Using SSD systems efficiently

Compared with conventional disks, SSDs are far more than 1000 times faster when serving random I/O requests. Therefore some of the default strategies of Gaussian, e.g. recalculating some values instead of storing them on disk, might not be optimal in all cases. Of course this is only relevant when there is not enough RAM to store the intermediate values, e.g. two-centre integrals.

So if you plan to run many huge calculations that do not fit into RAM, you may want to compare the execution time of a job that recalculates the intermediate values whenever needed with that of a job that forces these values to be written to and read from the node-local SSDs. Depending on how much time it costs to recalculate the intermediate values, using the SSDs can be much faster.

Examples

Queueing system template provided by Gaussian module

The Gaussian module provides a simple Moab example for hexanitroethane (C2N6O12) that runs an 8-core parallel single-point energy calculation using the B3LYP method and the 6-31g(df,pd) basis set. To submit the example, do the following steps:

$ ws_allocate calc_repo 30; cd $(ws_find calc_repo)
$ mkdir my_first_job; cd my_first_job
$ module load chem/gaussian
$ cp -v ${GAUSSIAN_EXA_DIR}/{bwforcluster-gaussian-example.moab,test0553-*.com} ./
$ msub bwforcluster-gaussian-example.moab

The last step submits the example job script bwforcluster-gaussian-example.moab to the queueing system. Once started on a compute node, all calculations are done in a unique directory on the local file system ($TMPDIR) of that particular compute node. Please carefully read the local file system documentation as well as the comments in the queueing system example script bwforcluster-gaussian-example.moab.

Direct submission of Gaussian command files

For users who do not want to deal with queueing system scripts we have created a submit command that automatically creates and submits queueing system scripts for Gaussian. For example:

$ ws_allocate calc_repo 30; cd $(ws_find calc_repo)
$ mkdir my_first_job; cd my_first_job
$ module load chem/gaussian
$ cp $GAUSSIAN_EXA_DIR/test0553-8core-parallel.com ./
$ gauss_sub test0553-8core-parallel.com

Caveat for Windows users

If you have transferred the Gaussian input file from a Windows computer to Unix, make sure to convert the Windows line breaks (<CR>+<LF>) to Unix line breaks (<LF> only). Otherwise Gaussian may fail with confusing error messages. The usual Unix command for this conversion is 'dos2unix' ('unix2dos' converts in the opposite direction). Example:

$ dos2unix test0553-8core-parallel.com
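Whether a file needs converting at all can be checked first. The sketch below detects carriage returns with grep and, as an alternative for systems where dos2unix is not installed, strips them with tr; the function names has_crlf and strip_cr are illustrative assumptions.

```shell
# Succeed if the file contains at least one carriage return.
has_crlf() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Remove all carriage returns from the file, keeping its permissions.
strip_cr() {
    tmpf=$(mktemp) || return 1
    tr -d '\r' < "$1" > "$tmpf" && cat "$tmpf" > "$1"
    rm -f "$tmpf"
}

# Demonstration on a temporary file with Windows line endings.
demo_file=$(mktemp)
printf '%%NProcShare=8\r\n# B3LYP/6-31G(d)\r\n' > "$demo_file"
if has_crlf "$demo_file"; then strip_cr "$demo_file"; fi
has_crlf "$demo_file" || echo "line endings are Unix style"
```

Note that strip_cr removes every carriage return in the file, not only those at line ends, which is harmless for ordinary Gaussian input files.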

Version-specific information

For information specific to a particular version (written VERSION here), see the information available via the module system with the command

$ module help chem/gaussian/VERSION

Please read the local version-specific module help documentation before using the software. The module help contains links to additional documentation and resources as well as information about support contact.