Development/Intel Compiler

From bwHPC Wiki

Latest revision as of 15:17, 9 October 2024

The main documentation is available via module help compiler/intel on the cluster. Most software modules for applications provide working example batch scripts.


Description | Content
module load | compiler/intel/VERSION and compiler/intel/VERSION_llvm
License | Commercial. See $INTEL_HOME/install-doc/EULA.txt and the Intel Product Licensing FAQ (https://software.intel.com/en-us/faq/licensing)
Citing | n/a
Links | Intel C-Compiler Homepage (https://software.intel.com/en-us/c-compilers)
Graphical Interface | Yes (Intel Debugger, GUI version)
Included modules | icc, icpc, ifort, idb, gdb-ia


Introduction

The Intel Compiler consists of tools to compile and debug C, C++ and Fortran programs. It is currently in a transition phase between two compilers: the so-called legacy compiler (an Intel in-house development with many optimization hints) and the newer LLVM-based compiler (to which many of these optimizations and hints have been ported). To make this transition smooth, we offer the standard legacy compiler as well as the new LLVM-based compiler, whose module name carries the _llvm suffix. The following table shows the preferred names:

Tool | Legacy name | LLVM-based name
Intel C compiler | icc | icx
Intel C++ compiler | icpc | icpx
Intel Fortran compiler | ifort | ifx
Intel debugger in GUI mode (until version 14 only) | idb | N/A
Intel GNU debugger in console mode (from version 15) | gdb-ia | gdb-oneapi
Intel debugger in console mode (until version 14 only) | idbc | N/A

The Intel compiler suite also includes the TBB (Threading Building Blocks), IPP (Integrated Performance Primitives) and oneAPI libraries.
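
The naming scheme above can be captured in a small shell helper, for example in a build script that should work with either compiler flavour. This is only a sketch: the function names intel_module and intel_driver and the version string 2024.0 are made up for illustration and are not part of the module itself. The module names and driver names follow the tables above.

```shell
# Compose the module name for a given version and flavour (legacy or llvm).
intel_module() {
    version=$1
    flavour=${2:-legacy}
    if [ "$flavour" = llvm ]; then
        echo "compiler/intel/${version}_llvm"
    else
        echo "compiler/intel/${version}"
    fi
}

# Return the compiler driver for a language (c, c++, fortran) and flavour.
intel_driver() {
    case "$1:${2:-legacy}" in
        c:legacy)       echo icc ;;
        c:llvm)         echo icx ;;
        c++:legacy)     echo icpc ;;
        c++:llvm)       echo icpx ;;
        fortran:legacy) echo ifort ;;
        fortran:llvm)   echo ifx ;;
        *) echo "unknown language or flavour: $1 ${2:-legacy}" >&2; return 1 ;;
    esac
}

# Example: print the commands for an LLVM-based C build.
# 2024.0 is a placeholder version, not necessarily installed on the cluster.
echo "module load $(intel_module 2024.0 llvm)"
echo "$(intel_driver c llvm) -O2 ex.c -o ex"
```

Printing the commands instead of running them keeps the sketch usable on machines where the module system or the compilers are not installed.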

More information about the MPI versions of the Intel Compiler is available in the Best Practices Guide for Parallel Programming (wiki page Development/Parallel_Programming).


Documentation

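Online documentation

* Intel® C-Compiler Documentation: https://software.intel.com/en-us/articles/intel-c-composer-xe-documentation
* Intel® Software Documentation Library: https://software.intel.com/en-us/intel-software-technical-documentation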

Optimizations

You can turn on various optimization options to improve the performance of your program. Which options work best depends on the specific program and can be determined by benchmarking your code. A command that gives good performance and a reasonable file size is icx -xHost -O2 ex.c. The option -xHost generates instructions for the highest instruction set available on the compilation host processor. To generate optimal code on bwUniCluster for both the Sandy Bridge nodes and the Broadwell nodes, compile with the options -xAVX -axCORE-AVX2 instead of -xHost.
More aggressive optimization flags and levels exist (e.g. -O3, or -fast and the options it implies), but the compiled programs can become quite large due to inlining, and compilation will probably take longer. The resulting program may even run slower, or may require additional statically linked libraries to be installed. An example of such a command is icx -fast ex.c.
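
The choice between host-specific and portable flags can be kept in one place in a build script. The following is a minimal sketch; the helper name opt_flags and its two targets (host, portable) are hypothetical, while the flags themselves are the ones given in the text above.

```shell
# Choose optimization flags depending on where the binary will run.
#   host     - nodes with the same CPU type as the compilation host
#   portable - both Sandy Bridge and Broadwell nodes
opt_flags() {
    case "${1:-host}" in
        host)     echo "-xHost -O2" ;;
        portable) echo "-xAVX -axCORE-AVX2 -O2" ;;
        *) echo "unknown target: $1" >&2; return 1 ;;
    esac
}

# Print the resulting compile command (ex.c is the example source from the text).
echo "icx $(opt_flags portable) ex.c -o ex"
```

Since the best flags depend on the program, a helper like this is mainly useful for switching quickly between variants while benchmarking.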

Profiling

Profiling an application means augmenting the compiled binary so that it records execution counts per source line (and per basic block); for example, you can see how many times an if-statement evaluated to true. To do so, compile your code with the profile flag: icx -p ex.c -o ex.
After running the program, you can inspect the execution counts with the gprof tool.
To let the compiler optimize based on a profile, first compile your source with icx -prof-gen ex.c -o ex, then run the most common and typical use case of your application, and finally recompile using the generated profile data together with optimization: icx -prof-use -O2 ex.c -o ex.
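
The two workflows above can be written down as a short script. This is a sketch under assumptions: SRC, EXE and the helpers run, gprof_build and pgo_build are illustrative names, and the gprof step assumes that the instrumented run writes its data to gmon.out in the current directory, which is the usual gprof behaviour. The compiler flags are taken from the text.

```shell
SRC=ex.c
EXE=ex

# Print each command; execute it only when icx is actually available,
# so the script acts as a dry run on machines without the compiler module.
run() {
    echo "+ $*"
    if command -v icx >/dev/null 2>&1; then
        "$@"
    fi
}

# Inspect execution counts with gprof.
gprof_build() {
    run icx -p "$SRC" -o "$EXE"       # build with the profile flag
    run "./$EXE"                      # run the program to collect data
    run gprof "./$EXE" gmon.out       # assumes the run wrote gmon.out
}

# Profile-guided optimization in three steps.
pgo_build() {
    run icx -prof-gen "$SRC" -o "$EXE"        # 1. instrumented build
    run "./$EXE"                              # 2. typical use case
    run icx -prof-use -O2 "$SRC" -o "$EXE"    # 3. optimized rebuild
}

pgo_build
```

For a meaningful profile, step 2 should be replaced by a run with realistic input and arguments.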

Further literature

A tutorial on optimization is available at Vectorization Essentials (https://www.intel.com/content/www/us/en/developer/articles/technical/vectorization-essential.html). To list the available optimization options, run icx -help opt or icx -help advanced, or use the catch-all option -v --help.