JUSTUS2/Software/Gaussian
The main documentation is available via |
Description | Content |
---|---|
module load | chem/gaussian |
License | Commercial - see Pricing for Gaussian Products |
Citing | See Gaussian Citation |
Links | Homepage; Reference Manual; Running Gaussian; Keywords, IOps Reference |
Graphical Interface | See Gaussview |
Description
Gaussian is a general purpose quantum chemistry software package for ab initio electronic structure calculations. It provides:
- ground state calculations for methods such as HF, many DFT functionals, MP2/3/4 or CCSD(T);
- basic excited state calculations such as TDHF or TDDFT;
- coupled multi-shell QM/MM calculations (ONIOM);
- geometry optimizations, transition state searches, molecular dynamics calculations;
- property and spectra calculations such as IR, UV/VIS, Raman or CD; as well as
- shared-memory parallel versions for almost all kinds of jobs.
For more information on features, please visit Gaussian's Overview of Capabilities and Features and Release Notes web pages.
Parallel computing
The binaries of the Gaussian module can run in serial and in shared-memory parallel mode. The parallel version is selected with the statement
%NProcShared=N
in the Link 0 section, which precedes the route section at the beginning of the Gaussian input file. The number of cores requested from the queueing system (i.e. --ntasks-per-node=N) must be identical to the value of %NProcShared=N in the Gaussian input file. The installed Gaussian binaries are shared-memory parallel only, so only single-node jobs make sense. Without %NProcShared, Gaussian uses only one core by default.
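As an illustration of where the Link 0 line sits relative to the route section, the following sketch writes a tiny input file and compares its core count with the Slurm request. The file name, the water geometry and the HF/3-21G route are placeholders chosen for this example and do not come from the Gaussian module; outside a Slurm job the comparison falls back to the value used to write the file.

```shell
# Hypothetical file name and core count, chosen for illustration only.
input=water-sp.com
ncores=4

# Link 0 commands (lines starting with %) come first, then the route section (#),
# a blank line, a title, a blank line, charge and multiplicity, and the geometry.
cat > "$input" <<EOF
%NProcShared=$ncores
# HF/3-21G

water single point (illustrative)

0 1
O   0.000000   0.000000   0.117300
H   0.000000   0.757200  -0.469200
H   0.000000  -0.757200  -0.469200

EOF

# Extract the core count from the input file.
nproc=$(sed -n 's/^%[Nn][Pp]roc[Ss]hared=\([0-9][0-9]*\).*/\1/p' "$input" | head -n 1)

# SLURM_NTASKS_PER_NODE is set by Slurm when --ntasks-per-node is given;
# the fallback to $ncores only keeps the sketch runnable outside a job.
requested="${SLURM_NTASKS_PER_NODE:-$ncores}"
if [ "$nproc" -eq "$requested" ]; then
    echo "OK: input file and queueing system both use $nproc cores"
else
    echo "Mismatch: input file uses $nproc cores, queueing system grants $requested" >&2
fi
```

A check like this could be placed in a job script before the call to g16, so that a mismatch is noticed before any compute time is spent.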
Usage
Loading the module
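The module name is listed in the table at the top of this page. A minimal sketch of loading it and checking that the g16 command is available afterwards; the guards are only there so that the snippet also runs on machines without a module system and are not needed on the cluster itself:

```shell
# `module` is provided by the cluster environment; skip the call where it does not exist.
if command -v module >/dev/null 2>&1; then
    module load chem/gaussian || echo "could not load chem/gaussian" >&2
fi

# After a successful load, g16 should be on the PATH and GAUSS_SCRDIR should be set.
if command -v g16 >/dev/null 2>&1; then
    g16_available=yes
    echo "g16 found at $(command -v g16)"
    echo "GAUSS_SCRDIR=${GAUSS_SCRDIR:-unset}"
else
    g16_available=no
    echo "g16 is not available in this shell"
fi
```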
Running Gaussian interactively
After loading the Gaussian module you can run a quick interactive example by executing
$ time g16 < $GAUSSIAN_EXA_DIR/test0553-4core-parallel.com
In most cases running Gaussian requires setting up the command input file and redirecting that input into command g16.
Creating Gaussian input files
For documentation on how to construct input files, see the Gaussian manual. In addition, the program Gaussview is a very good graphical user interface for constructing molecules and setting up calculations. These calculation setups can be saved as Gaussian command files and then submitted to the cluster with the help of the queueing system examples below.
Disk usage
By default, Gaussian places its scratch files in GAUSS_SCRDIR, whose value is displayed when the Gaussian module is loaded. In most cases the module load command should set GAUSS_SCRDIR to point to an optimal node-local file system. When running multiple Gaussian jobs together on one node, a user may want to add one more sub-directory level containing e.g. the job id and job name for clarity, if the queueing system has not already done so.
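The extra sub-directory level could be set up as in the following sketch. It assumes the job runs under Slurm, which exports SLURM_JOB_ID and SLURM_JOB_NAME; the fallback values are placeholders that only keep the snippet runnable outside a job.

```shell
# GAUSS_SCRDIR is normally set by the Gaussian module; fall back to a temp dir for illustration.
base="${GAUSS_SCRDIR:-${TMPDIR:-/tmp}}"

# Build a per-job directory name from the Slurm job id and job name.
jobdir="$base/${SLURM_JOB_ID:-nojobid}.${SLURM_JOB_NAME:-noname}"
mkdir -p "$jobdir"

# Point Gaussian at the per-job directory for the rest of this shell session.
export GAUSS_SCRDIR="$jobdir"
echo "Gaussian scratch files go to $GAUSS_SCRDIR"
```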
Predicting how much disk space a specific Gaussian calculation needs can be a difficult task. It requires experience with the methods, the basis sets, the calculated properties and the system you are investigating. The best advice is probably to start with small basis sets and small example systems, run such example calculations and observe their (hopefully small) disk usage while the job is running. Then read the Gaussian documentation about scaling behaviour and basis set sizes (the basis set size of the current calculation is printed at the beginning of the output of the Gaussian job). Finally, try to extrapolate to your desired final system and basis set.
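One simple way to observe the disk usage of a running test job is to measure the scratch directory from a shell on the compute node. This is a minimal sketch; the temporary directory is only a stand-in for the case that GAUSS_SCRDIR is not set.

```shell
# Use the Gaussian scratch directory if set, otherwise an empty temp dir as a stand-in.
scr="${GAUSS_SCRDIR:-$(mktemp -d)}"

# Total size of the scratch directory in kilobytes.
used_kb=$(du -sk "$scr" | cut -f1)
echo "Scratch usage in $scr: ${used_kb} kB"
```

Repeating the measurement every few minutes during a test run gives a rough picture of the peak usage to extrapolate from.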
You can also try to specify a fixed amount of disk space for a calculation. This is done by adding a keyword like
MaxDisk=50000MB
to the route section of the Gaussian input file. But please be aware that (a) Gaussian does not necessarily obey the specified value and (b) you might force Gaussian to select a slower algorithm when specifying an inappropriate value.
In any case, please make sure that you request more node-local disk space from the queueing system than you have specified in the Gaussian input file. For information on how much node-local disk space is available on the cluster and how to request a certain amount of it for a calculation, please consult the cluster-specific queueing system documentation as well as the queueing system examples of the Gaussian module described below.
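The arithmetic behind "request more than you specified" can be sketched as follows. The input file, the 10 GB margin and the conversion 1 GB = 1024 MB are assumptions made for this example, and the sketch assumes MaxDisk appears as a keyword on the route line with a value in MB. The unit expected by the queueing system's scratch option should be taken from the cluster documentation.

```shell
# Hypothetical input file, created here so the sketch runs on its own.
input=disk-example.com
printf '# B3LYP/6-31G(d) MaxDisk=50000MB\n' > "$input"

# Read the MaxDisk value (this sketch only handles values given in MB).
maxdisk_mb=$(sed -n 's/.*[Mm]ax[Dd]isk=\([0-9][0-9]*\)[Mm][Bb].*/\1/p' "$input" | head -n 1)

# Round up to whole GB (assuming 1 GB = 1024 MB) and add an arbitrary 10 GB margin.
headroom_gb=10
request_gb=$(( (maxdisk_mb + 1023) / 1024 + headroom_gb ))
echo "MaxDisk is ${maxdisk_mb} MB; request at least ${request_gb} GB of node-local scratch"
```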
Except for very short interactive test jobs please never run Gaussian calculations in any globally mounted directory like your $HOME or $WORK directory.
Memory usage
Predicting the memory requirements of a job is as difficult as predicting the disk requirements. The strategies are very similar. So for a large new unknown system, start with smaller test systems and smaller basis sets and then extrapolate.
You may specify the memory for a calculation explicitly with a Link 0 command at the beginning of the Gaussian input file, for example
%Mem=10000MB
Gaussian usually obeys this value rather well. We have seen calculations that exceed the Mem value by at most 2 GB. Therefore it is usually sufficient to request Mem + 2 GB from the queueing system.
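The Mem + 2 GB rule can be turned into a small calculation. The sketch below reads the %Mem line from a hypothetical input file and prints a matching value for the --mem option of sbatch; it only handles values given in MB and takes 2 GB as 2048 MB.

```shell
# Hypothetical input file with a Link 0 memory line, created here so the sketch runs on its own.
input=mem-example.com
printf '%%Mem=10000MB\n# HF/3-21G\n' > "$input"

# Read the numeric part of %Mem (this sketch only handles values given in MB).
mem_mb=$(sed -n 's/^%[Mm][Ee][Mm]=\([0-9][0-9]*\)[Mm][Bb].*$/\1/p' "$input" | head -n 1)

# Add the 2 GB safety margin mentioned above (2 GB = 2048 MB).
slurm_mem_mb=$((mem_mb + 2048))
echo "Gaussian uses ${mem_mb} MB; request --mem=${slurm_mem_mb}M from Slurm"
```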
Please monitor the output of Gaussian carefully when restricting the memory in the input file. Gaussian automatically switches between algorithms (e.g. recalculating values instead of storing them) when the specified memory is too low. So if the output indicates, for example, that the integrals could be kept in memory if more memory were available, the calculation will run faster with a larger memory setting.
For shared-memory parallel jobs the number of workers has only a minor influence on the memory consumption (maybe up to 10%), because all workers operate on one common data set.
Using SSD systems efficiently
Compared with conventional disks, SSDs are far more than 1000 times faster at serving random I/O requests. Therefore some of Gaussian's default strategies, e.g. recalculating some values instead of storing them on disk, might not be optimal in all cases. Of course this is only relevant when there is not enough RAM to hold the intermediate values, e.g. two-centre integrals.
So if you plan to do many huge calculations that do not fit into RAM, you may want to compare the execution time of a job that recalculates the intermediate values whenever needed with that of a job that forces these values to be written to and read from the node-local SSDs. Depending on how much time the recalculation costs, using the SSDs can be much faster.
Examples
Queueing system template provided by Gaussian module
The Gaussian module provides a simple Slurm example (hexanitroethane, C2N6O12) that runs a 4-core parallel single-point energy calculation using method B3LYP and basis set 6-31g(df,pd). To submit the example, do the following steps:
$ module load chem/gaussian
$ cp -v ${GAUSSIAN_EXA_DIR}/bwforcluster-gaussian-example.sbatch ./
$ sbatch bwforcluster-gaussian-example.sbatch
The last step submits the example job script bwforcluster-gaussian-example.sbatch to the queueing system. Once the job has started, all temporary files are kept below the directory $SCRATCH, which is only visible on the compute node where the job is running. If the option --gres=scratch:nnn was specified when submitting the job script, $SCRATCH points to the node-local SSDs. Otherwise (option --gres=scratch:nnn not specified) $SCRATCH points to a RAM disk. Please carefully read the local file system documentation as well as the comments in the queueing system example script bwforcluster-gaussian-example.sbatch.
Direct submission of Gaussian command files
For users who do not want to deal with queueing system scripts we have created a submit command that automatically creates and submits queueing system scripts for Gaussian. For example:
$ module load chem/gaussian
$ cp -v $GAUSSIAN_EXA_DIR/test0553-4core-parallel.com ./
$ gauss_sub test0553-4core-parallel.com
Caveat for Windows users
If you have transferred the Gaussian input file from a Windows computer to Unix, make sure to convert the Windows line breaks (<CR>+<LF>) to Unix line breaks (only <LF>). Otherwise Gaussian will write strange error messages. The usual Unix commands for this are 'dos2unix' (Windows to Unix) and 'unix2dos' (the reverse direction). Example:
$ dos2unix test0553-4core-parallel.com
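To find out whether a file still contains Windows line breaks, one can count the lines that end in a carriage return. The sketch below does this with grep and, as an alternative for systems where dos2unix is not installed, strips the carriage returns with tr. The file name and its contents are placeholders created only for this example.

```shell
# Create a small file with Windows line endings so the sketch runs on its own (hypothetical name).
f=crlf-example.com
printf '%%NProcShared=4\r\n# HF/3-21G\r\n' > "$f"

# Count lines that end in a carriage return; a non-zero count means Windows line breaks.
cr=$(printf '\r')
before=$(grep -c "${cr}\$" "$f" || true)

# Strip all carriage returns with tr and replace the original file.
tr -d '\r' < "$f" > "$f.unix" && mv "$f.unix" "$f"

# Count again; grep exits non-zero when nothing matches, hence the "|| true".
after=$(grep -c "${cr}\$" "$f" || true)
echo "Lines with CRLF before: $before, after: $after"
```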