Difference between revisions of "Development/MKL"

From bwHPC Wiki
Line 4: Line 4:
   
   
= Compilers =
 
   
  +
= Math Kernel Library (MKL) =
== Intel ==
 
 
bwUniCluster currently provides Intel compiler suite versions:
 
* 13.1 (default)
 
* 12.1
 
<br>
 
To load the default compiler suite execute in your terminal session:
 
<pre>
 
$ module load compiler/intel
 
</pre>
 
To load your preferred version, e.g. 12.1, enter:
 
<pre>
 
$ module load compiler/intel/12.1
 
</pre>
 
For details about unloading or switching compiler suites, please see chapter [[BwUniCluster_Environment_Modules| environment modules]].
 
 
 
=== Default Intel compiler suite - version 13.1 ===
 
 
This module provides the Intel® compiler suite version 13.1.3 via
 
commands 'icc', 'icpc' and 'ifort', the debugger 'idb' as well as the Intel®
 
Threading Building Blocks TBB and the Integrated Performance Primitives IPP
 
libraries (for details see also http://software.intel.com/en-us/intel-compilers/).
 
 
The related Math Kernel Library MKL module is 'numlib/mkl/11.0.5'.
 
The related Intel MPI module is 'mpi/impi/4.1.1-intel-13.1'.
 
The Intel icpc should work well with GNU compiler 4.7.
 
 
The compiler suite contains:
 
<pre>
 
icc # Intel® C compiler
 
icpc # Intel® C++ compiler
 
ifort # Intel® Fortran compiler
 
idb # Intel® debugger in GUI mode
 
idbc # Intel® debugger in console mode
 
</pre>
 
 
For local documentation consult the module help:
 
<pre>
 
$ module help compiler/intel/13.1
 
</pre>
 
or the '''man pages''' of each compiler:
 
<pre>
 
$ man icc
 
$ man icpc
 
$ man ifort
 
 
</pre>
 
For further online documentation visit:
 
* [http://software.intel.com/sites/products/search/search.php?q=&x=27&y=4&product=composerxef&version=2013&docos=lin Intel® Fortran Composer XE Version 2013]
 
* [http://software.intel.com/sites/products/search/search.php?q=&x=25&y=6&product=composerxec&version=2013&docos=lin Intel® C++ Composer XE Version 2013]
 
<br>
 
For some Intel® compiler option examples, hints on how to compile 32bit code
 
and solutions for less common problems see the tips and troubleshooting doc under:
 
<pre>
 
$INTEL_DOC_DIR/intel-compiler-tips-and-troubleshooting.txt
 
</pre>
 
<br>
 
For details on library and include directories of this compiler suite please enter:
 
<pre>
 
$ module show compiler/intel/13.1
 
</pre>
 
<br>
 
Note that the environment variables and commands are '''only''' available after loading this module.
 
 
 
 
== GCC ==
 
 
bwUniCluster currently provides GNU compiler suite versions:
 
* 4.5
 
* 4.7 (default)
 
* 4.8
 
<br>
 
To load the default compiler suite execute in your terminal session:
 
<pre>
 
$ module load compiler/gnu
 
</pre>
 
To load your preferred version, e.g. 4.5, enter:
 
<pre>
 
$ module load compiler/gnu/4.5
 
</pre>
 
For details about unloading or switching compiler suites, please see chapter [[BwUniCluster_Environment_Modules| environment modules]].
 
 
 
=== Default GNU compiler suite - version 4.7 ===
 
 
This module provides the GNU compiler suite version 4.7.3
 
via commands 'gcc', 'g++' and 'gfortran' (see also 'http://gcc.gnu.org/').
 
The GNU compiler has been built with gmp-4.3.2, mpfr-2.4.2 and mpc-0.8.1.
 
 
The compiler suite contains:
 
<pre>
 
cpp # GNU preprocessor
 
gcc # GNU C compiler
 
g++ # GNU C++ compiler
 
gfortran # GNU Fortran compiler (Fortran 77, 90 and 95)
 
</pre>
 
<br>
 
Libraries can be found under:
 
<pre>
 
$GNU_LIB_DIR = /opt/bwhpc/common/compiler/gnu/4.7.3/x86_64/lib64
 
</pre>
 
<br>
 
For local documentation consult the module help:
 
<pre>
 
$ module help compiler/gnu/4.7
 
</pre>
 
or the '''man pages''' of each compiler:
 
<pre>
 
$ man cpp
 
$ man gcc
 
$ man g++
 
$ man gfortran
 
</pre>
 
<br>
 
For further online documentation visit:
 
* [http://gcc.gnu.org/onlinedocs/ http://gcc.gnu.org/onlinedocs/]
 
<br>
 
For details on library and include directories of this compiler suite please enter:
 
<pre>
 
$ module show compiler/gnu/4.7
 
</pre>
 
<br>
 
Please do not add the gnu compiler module to any automatic environment setup
 
procedure (neither to ~/.profile nor to ~/.bashrc).
 
<br>
 
Please note that the environment variables and commands are '''only''' available after loading this module.
 
 
 
= Debugging =
 
 
== Only for employees of KIT ==
 
On bwUniCluster the GUI based distributed debugging tool (ddt) may be used to debug serial as
 
well as parallel applications. For serial applications also the GNU gdb or Intel idb debugger
 
may be used. The Intel idb comes with the compiler and information on this tool is available
 
together with the compiler documentation. In order to debug your program it must be
 
compiled and linked using the -g compiler option. This will force the compiler to add additional information to the object code which is used by the debugger at runtime.
 
 
== Parallel Debugger ddt ==
 
 
ddt consists of a graphical frontend and a backend serial debugger which controls the
 
application program. One instance of the serial debugger controls one MPI process. Via the
 
frontend the user interacts with the debugger to select the program that will be debugged,
 
to specify different options and to monitor the execution of the program. Debugging
 
commands may be sent to one, all or a subset of the MPI processes.
 
 
Before the parallel debugger ddt can be used, it is necessary to load the corresponding
 
module file:
 
<pre>
 
$ module use /opt/bwhpc/ka/modulefiles (only available for employees of KIT)
 
$ module add debugger/ddt
 
</pre>
 
 
Now ddt may be started with the command
 
<pre>
 
$ ddt program
 
</pre>
 
 
where program is the name of your program that you want to debug.
 
 
[[File:ddt1_750.jpg]]
 
 
Figure: DDT startup window
 
 
The above figure shows ddt’s startup window. Before actually starting the debugging session
 
you should check the contents of several fields in this window:
 
 
1. The top line shows the executable file that will be run under control of the debugger. In
 
the following lines you may input some options that are passed to your program or to the
 
MPI environment.
 
 
2. If your program reads data from stdin you can specify an input file in the startup window.
 
 
3. Before starting an MPI program you should check that "Open MPI (Compatibility)" or
 
"Intel MPI" is the MPI implementation that has been selected. If this is not the case, you
 
have to change this. Otherwise ddt may not be able to run your program. In order to debug
 
serial programs, the selected MPI implementation should be "none". You may also change
 
the underlying serial debugger using the "change" button. By default ddt uses its own serial debugger, but it may also use the Intel idb debugger.
 
 
4. Select the number of MPI processes that will be started by ddt. If you are using ddt within
 
a batch job, replace mpirun by ddt in the command line of ????? and make sure that the
 
chosen number of MPI processes is identical to the number of MPI tasks (-p option ???) that
 
you selected with the ?????? command. When you debug a serial program, select 1.
 
 
5. After you have checked all inputs in the ddt startup window, you can start the debugging
 
session by pressing the "run" button.
 
 
 
The ddt window now shows the source code of the program that is being debugged and breakpoints can be set by just pointing to the corresponding line and pressing the right
 
mouse button. So you may step through your program, display the values of variables
 
and arrays and look at the message queues.
 
 
[[File:ddt2_750.jpg]]
 
 
 
= Numerical Libraries =
 
 
 
== Math Kernel Library (MKL) ==
 
 
'''Intel MKL (Math Kernel Library)''' is a library of optimized math routines for numerical computations such as linear algebra (using BLAS, LAPACK, ScaLAPACK) and discrete Fourier Transformation.
 
 
With its standard interface in matrix computation and the interface of the popular fast Fourier transformation library fftw, MKL can be used to replace other libraries with minimal code changes. In fact a program which uses FFTW without MPI doesn't need to be changed at all. Just recompile it with the MKL linker flags.
 
Line 318: Line 118:
   
   
== FFTW ==
+
= FFTW =
 
'''FFTW''' is a C subroutine library for computing the discrete Fourier transform
 
 
(DFT) in one or more dimensions, of arbitrary input size, and of both real and
 
Line 404: Line 204:
   
   
== GNU Scientific Library (GSL) ==
+
= GNU Scientific Library (GSL) =
 
The '''GNU Scientific Library''' (or '''GSL''') is a software library for numerical computations in applied mathematics and science. The GSL is written in the C programming language, but bindings exist for other languages as well.
 
   

Revision as of 20:13, 22 January 2014



1 Math Kernel Library (MKL)

Intel MKL (Math Kernel Library) is a library of optimized math routines for numerical computations such as linear algebra (using BLAS, LAPACK, ScaLAPACK) and discrete Fourier transforms. Because it provides the standard interfaces for matrix computation as well as the interface of the popular fast Fourier transform library FFTW, MKL can replace other libraries with minimal code changes. In fact, a program that uses FFTW without MPI does not need to be changed at all: just recompile it with the MKL linker flags.

Online documentation: http://software.intel.com/en-us/articles/intel-math-kernel-library-documentation

Local documentation: There is some information in the module help file accessible via

$ module help numlib/mkl

and after loading the module, the environment variable $MKL_DOC_DIR points to the local documentation folder. Various examples can be found in $MKLROOT/examples.

Compiling and linking: After loading the module with

$ module load numlib/mkl

you can include the MKL header file in your program:

#include <mkl.h>

Compilation is simple:

$ icc -c example_mkl.c

When linking the program you have to tell the compiler to link against the mkl library:

$ icc example_mkl.o -mkl

With the -mkl switch the Intel compiler automatically sets the correct linker flags, but you can also specify them explicitly, for example to enable static linking or when non-Intel compilers are used. Information about the different options can be found at http://software.intel.com/en-us/node/438568; the MKL link line advisor at http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor is especially helpful.

By default $MKL_NUM_THREADS is set to 1, so only one thread is created. If benchmarking shows that your computation benefits from more cores, you can set $MKL_NUM_THREADS to a higher number.

Examples: To help you get started we provide two C++ examples. The first one computes the square of a 2x2 matrix:

#include <iostream>
#include <mkl.h>
using namespace std;

int main()
{
    double m[2][2] = {{2,1}, {0,2}};
    double c[2][2];

    for(int i = 0; i < 2; ++i)
    {
        for(int j = 0; j < 2; ++j)
            cout << m[i][j] << " ";

        cout << endl;
    }

    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, 2, 2, 2, 1.0, &m[0][0], 2, &m[0][0], 2, 0.0, &c[0][0], 2);

    cout << endl;

    for(int i = 0; i < 2; ++i)
    {
        for(int j = 0; j < 2; ++j)
            cout << c[i][j] << " ";

        cout << endl;
    }

    return 0;
}

The second one performs a fast Fourier transform using the Intel MKL interface (DFTI):

#include <iostream>
#include <complex>
#include <cmath>
#include <mkl.h>
using namespace std;

int main()
{
    const MKL_LONG N = 3;   // DFTI takes the transform length as MKL_LONG
    complex<double> x[N] = {2, -1, 0.5};

    cout << "Input: " << endl;

    for(int i = 0; i < N; i++)
        cout << x[i] << endl;

    DFTI_DESCRIPTOR_HANDLE desc;

    DftiCreateDescriptor(&desc, DFTI_DOUBLE, DFTI_COMPLEX, 1, N);
    DftiCommitDescriptor(desc);
    DftiComputeForward(desc, x);
    DftiFreeDescriptor(&desc);

    cout << "\nOutput: " << endl;

    for(int i = 0; i < N; i++)
        cout << x[i] << endl;

    cout << "\nTest the interpolation function f:" << endl;

    for(int i = 0; i < N; i++)
    {
        double t = i/(double)N;
        complex<double> u(0, 2*M_PI*t);
        complex<double> z = exp(u);
        complex<double> w = 1.0/N * (x[0] + x[1]*z + x[2]*z*z);

        cout << "f(" << t << ") = " << w << endl;
    }

    return 0;
}



2 FFTW

FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, i.e. the discrete cosine/sine transforms or DCT/DST).

This package provides three versions of the fftw3 library, one per precision: libfftw3, libfftw3f and libfftw3l for double, single and long-double precision, respectively.

Online Documentation: http://www.fftw.org/fftw3_doc/

Local documentation:

See 'info fftw3', 'man fftw-wisdom' and 'man fftw-wisdom-to-conf'. See also the documentation folder pointed to by the shell variable $FFTW_DOC_DIR.

Hints for compiling and linking:

Load the fftw module, and, if needed, the corresponding openmpi module.

After having loaded the appropriate module(s), you can use several environment variables to compile and link your application.

  • Compile serial program:
 $ gcc example.c -o example -I$FFTW_INC_DIR -L$FFTW_LIB_DIR -lfftw3 -lm
  • Compile program with support for POSIX threads:
 $ gcc example.c -o example -I$FFTW_INC_DIR -L$FFTW_LIB_DIR -lfftw3_threads -lfftw3 -lpthread -lm
  • Compile program with support for OpenMP threads:
 $ gcc example.c -o example -fopenmp -I$FFTW_INC_DIR -L$FFTW_LIB_DIR -lfftw3_omp -lfftw3 -lm
  • Compile program with support for MPI:
 $ mpicc example.c -o example -I$FFTW_INC_DIR -L$FFTW_LIB_DIR -lfftw3_mpi -lfftw3 -lm 
  • Run program with MPI support:
 $ mpirun -n <ncpu> ./example 

(Replace <ncpu> by number of processor cores.)

Replace -lfftw3, -lfftw3_threads, etc. by -lfftw3f, -lfftw3f_threads, etc. for single precision and by -lfftw3l, -lfftw3l_threads etc. for long-double precision codes, respectively.

These commands link your program against the dynamic fftw libraries, so the fftw module must also be loaded when you run the program. Alternatively, you can link your program against the static fftw libraries; in that case the fftw module is needed only for compiling, not for running the program.

  • Compile program with static fftw library versions (example for POSIX threads support):
 $ gcc example.c -o example -I$FFTW_INC_DIR $FFTW_LIB_DIR/{libfftw3_threads.a,libfftw3.a} -lpthread -lm 

or:

 $ gcc example.c -o example -I$FFTW_INC_DIR -L$FFTW_LIB_DIR -Wl,-Bstatic -lfftw3_threads -lfftw3 \
       -Wl,-Bdynamic -lpthread -lm 

Environment variables $FFTW_INC_DIR, $FFTW_LIB_DIR etc. are available after loading the module.

Sample code for various test cases is provided in the folder pointed to by the environment variable $FFTW_EXA_DIR.


3 GNU Scientific Library (GSL)

The GNU Scientific Library (or GSL) is a software library for numerical computations in applied mathematics and science. The GSL is written in the C programming language, but bindings exist for other languages as well.

Online-Documentation: http://www.gnu.org/software/gsl/

Local-Documentation:

See 'info gsl', 'man gsl' and 'man gsl-config'.

Tips for compiling and linking:

Load the gsl module. After having loaded the gsl environment module, you can use several environment variables to compile and link your application with the gsl library.

Your source code should contain preprocessor include statements with a gsl/ prefix, such as

 #include <gsl/gsl_math.h>

A typical compilation command for a source file example.c with the Intel C compiler icc is

 $ icc -Wall -I$GSL_INC_DIR  -c example.c 

The $GSL_INC_DIR environment variable points to the location of the gsl header files.

The following command can be used to link the application with the gsl libraries,

 $ icc -L$GSL_LIB_DIR -o example example.o -lgsl -lgslcblas -lm 

The $GSL_LIB_DIR environment variable points to the location of the gsl libraries.

Also make sure to have the gsl module loaded before running applications built with this library.

Example

Create source code file 'intro.c':

#include <stdio.h>
#include <gsl/gsl_sf_bessel.h>

int main (void)
{
  double x = 5.0;
  double y = gsl_sf_bessel_J0 (x);
  printf ("J0(%g) = %.18e\n", x, y);
  return 0;
}

Load the gsl module for the Intel compiler, compile, link and run the program:

$ module load numlib/gsl/1.16-intel-13.1
Loading module dependency 'compiler/intel/13.1'.
$ icc -Wall -I$GSL_INC_DIR  -c intro.c
$ icc -L$GSL_LIB_DIR -o intro intro.o -lgsl -lgslcblas -lm
$ ./intro
J0(5) = -1.775967713143382642e-01