BwUniCluster2.0/Software/R/terra

Latest revision as of 17:44, 28 October 2024

Note that the instructions provided below refer to R 4.4.1 (but not R 4.2.1)!

General information

terra is an R package for spatial data analysis with vector (points, lines, polygons) and raster (grid) data.

sf is an R package that provides simple feature access for R.

To install them, the following system requirements must be met:

  • GDAL 2.2.3 or higher
  • PROJ 4.9.3 or higher
  • GEOS 3.4.0 or higher

These packages are not available centrally on the cluster, but can be installed manually into your $HOME directory. To do so, they need to be built from source.


Installation

Please enter (or copy & paste) the code presented in the boxes below directly into your shell/command line on bwUniCluster. The whole process takes approximately 45 minutes.

First, for compilation we obtain an interactive session with multiple cores (on a compute node):

# Obtain interactive session
salloc -n 8 -t 60 -p single
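
The build commands further down use a fixed `-j 8`, which matches the 8 tasks requested here. As a minimal sketch, assuming the session was started with `salloc -n ...` so that Slurm sets `SLURM_NTASKS`, the job count can instead be derived from the allocation. The variable name `JOBS` is an assumption made for this illustration and is not used elsewhere on this page.

```shell
# Use the number of tasks granted by Slurm; outside a Slurm session fall back
# to the local CPU count (or to 1 if nproc is unavailable).
JOBS="${SLURM_NTASKS:-$(nproc 2>/dev/null || echo 1)}"
echo "Building with $JOBS parallel jobs"

# Later build steps could then use, for example:
#   cmake --build . -j "$JOBS"
```

This keeps the build parallelism in sync if you request a different number of cores.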


Preparations

Prepare an .R/Makevars file (if it does not already exist). This file specifies how R should compile the packages (i.e., sets some 'compiler flags').

If an .R/Makevars file is present in your home directory ($HOME), check whether the flags displayed below are already set and apply adjustments, if necessary:

cat $HOME/.R/Makevars

CXX14=g++
CXX17=g++
CXXFLAGS = -O3 -fPIC -march=cascadelake -ffp-contract=off -fno-fast-math -fno-signed-zeros -fopenmp -Wno-unknown-warning-option
CXX14FLAGS += -std=c++14
CXX17FLAGS += -std=c++17

If necessary, run the following lines of code to set the flags. Note that the first echo command uses > and therefore overwrites an existing Makevars file:

mkdir -p ~/.R
echo "CXX14=g++" > ~/.R/Makevars
echo "CXX17=g++" >> ~/.R/Makevars
echo "CXXFLAGS = -O3 -fPIC -march=cascadelake -ffp-contract=off -fno-fast-math -fno-signed-zeros -fopenmp -Wno-unknown-warning-option" >> ~/.R/Makevars
echo "CXX14FLAGS += -std=c++14" >> ~/.R/Makevars
echo "CXX17FLAGS += -std=c++17" >> ~/.R/Makevars
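
If you already have a Makevars file with other settings, overwriting it may not be what you want. The following is a sketch of a non-destructive alternative: it appends each line only if an identical line is not yet present. The helper name `add_line_once` and the `MAKEVARS` variable are hypothetical and introduced only for this example. Note that it does not replace conflicting entries (for example an older `CXX14=icpc` line), so the file should still be reviewed afterwards.

```shell
# Hypothetical helper: append LINE to FILE unless an identical line already exists.
add_line_once() {
    file="$1"
    line="$2"
    mkdir -p "$(dirname "$file")"
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Assumption: set MAKEVARS beforehand to try this on a scratch file first.
MAKEVARS="${MAKEVARS:-$HOME/.R/Makevars}"
add_line_once "$MAKEVARS" 'CXX14=g++'
add_line_once "$MAKEVARS" 'CXX17=g++'
add_line_once "$MAKEVARS" 'CXXFLAGS = -O3 -fPIC -march=cascadelake -ffp-contract=off -fno-fast-math -fno-signed-zeros -fopenmp -Wno-unknown-warning-option'
add_line_once "$MAKEVARS" 'CXX14FLAGS += -std=c++14'
add_line_once "$MAKEVARS" 'CXX17FLAGS += -std=c++17'
```

Running the block a second time leaves the file unchanged.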


Next, we create the directories for the source code and installation targets, respectively. Furthermore, we load all (software) modules relevant for compilation and ensure that the compilers are found:

# We install the libraries into the ~/sw/R directory.
mkdir -p ~/sw/R

# Source directory.
mkdir -p ~/src

# Load required modules.
module purge
module load devel/cmake/3.29.3
module load devel/python/3.12.3_gnu_13.3

# Check that the GNU compiler 13.3 is loaded.
gcc --version

# Set compiler for cmake and make.
export CC=$(which gcc)
export CXX=$(which g++)

The Python 3.12.3 module is missing some required packages which are included in the Python default module. Therefore, we install them in the user environment:

pip3 install --user numpy setuptools

Install external programs

Next, we download the sources of PROJ, GDAL and GEOS (and their dependencies) and install them:

Install PROJ

# Download and unpack the PROJ source code:
PROJ_VER=9.4.1
cd $HOME/src
wget http://download.osgeo.org/proj/proj-$PROJ_VER.tar.gz
tar xf proj-$PROJ_VER.tar.gz

cd proj-$PROJ_VER
mkdir build && cd build

# Compile and install PROJ:
cmake -DCMAKE_INSTALL_PREFIX="$HOME/sw/R" ..
cmake --build . -j 8
cmake --build . --target install

Install GDAL

Building GDAL requires newer versions of OpenEXR and libdeflate than those available on the system.

# Download and unpack the libdeflate source code:
cd $HOME/src
git clone https://github.com/ebiggers/libdeflate

cd libdeflate
mkdir build && cd build

# Compile and install libdeflate:
cmake -DCMAKE_INSTALL_PREFIX="$HOME/sw/R" \
      -DCMAKE_PREFIX_PATH="$HOME/sw/R" \
      -DCMAKE_BUILD_TYPE=Release ..
cmake --build . -j 8
cmake --build . --target install


# Download and unpack the OpenEXR source code:
OPENEXR_VER=3.3.1
cd $HOME/src
wget https://github.com/AcademySoftwareFoundation/openexr/releases/download/v$OPENEXR_VER/openexr-$OPENEXR_VER.tar.gz
tar xf openexr-$OPENEXR_VER.tar.gz

cd openexr-$OPENEXR_VER/
mkdir build && cd build

# Compile OpenEXR:
cmake -DCMAKE_INSTALL_PREFIX="$HOME/sw/R" -DCMAKE_BUILD_TYPE=Release ..
cmake --build . -j 8
cmake --build . --target install

Now, all necessary dependencies are available and we can build GDAL:

# Download and unpack the GDAL source code:
cd $HOME/src
GDAL_VER=3.9.3
wget http://download.osgeo.org/gdal/$GDAL_VER/gdal-$GDAL_VER.tar.gz
tar xf gdal-$GDAL_VER.tar.gz

cd gdal-$GDAL_VER
mkdir build && cd build

# Compile and install GDAL:
cmake -DCMAKE_INSTALL_PREFIX="$HOME/sw/R" \
      -DCMAKE_PREFIX_PATH="$HOME/sw/R" \
      -DCMAKE_BUILD_TYPE=Release ..
cmake --build . -j 8
cmake --build . --target install
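
The PROJ, libdeflate, OpenEXR and GDAL builds above all repeat the same three steps: configure, build, install. Purely as an illustration, these steps could be wrapped in a small shell function. The function name `cmake_install` and the variables `PREFIX` and `JOBS` are hypothetical. The sketch always passes the flag set used for the GDAL build, which is slightly more than the PROJ commands above specify.

```shell
PREFIX="${PREFIX:-$HOME/sw/R}"   # installation target used throughout this page
JOBS="${JOBS:-8}"                # parallel build jobs, matching "salloc -n 8"

# Hypothetical helper: configure, build and install the CMake project in SRC_DIR.
# Any further arguments are passed on to the configure step.
cmake_install() {
    src_dir="$1"
    shift
    if [ ! -d "$src_dir" ]; then
        echo "cmake_install: no such directory: $src_dir" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is left unchanged.
    (
        mkdir -p "$src_dir/build" && cd "$src_dir/build" || exit 1
        cmake -DCMAKE_INSTALL_PREFIX="$PREFIX" \
              -DCMAKE_PREFIX_PATH="$PREFIX" \
              -DCMAKE_BUILD_TYPE=Release "$@" .. &&
        cmake --build . -j "$JOBS" &&
        cmake --build . --target install
    )
}

# Example call (commented out, requires the unpacked sources):
#   cmake_install "$HOME/src/gdal-3.9.3"
```

Because the steps are chained with `&&`, a failed configure or build stops the sequence before the install step.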

Install GEOS

The last external package that needs to be compiled and installed is GEOS:

# Download and unpack the GEOS source code.
cd $HOME/src
GEOS_VER=3.13.0
wget http://download.osgeo.org/geos/geos-$GEOS_VER.tar.bz2
tar xf geos-$GEOS_VER.tar.bz2


cd geos-$GEOS_VER
mkdir build && cd build

# Compile, test and install GEOS:
cmake -DCMAKE_INSTALL_PREFIX="$HOME/sw/R" \
      -DCMAKE_PREFIX_PATH="$HOME/sw/R" \
      -DCMAKE_BUILD_TYPE=Release ..
make -j 8
ctest -j 8
make install

Install the R packages

In order to install the two R packages, we need to inform R where to find PROJ, GDAL and GEOS, so we export the necessary paths:

export LD_LIBRARY_PATH=$HOME/sw/R/lib64:$LD_LIBRARY_PATH
export PATH=$PATH:$HOME/sw/R/bin
export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:$HOME/sw/R/lib64/pkgconfig
export GDAL_DATA=$HOME/sw/R/share/gdal
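
With these paths exported, it can be worth confirming that the freshly built libraries are found and satisfy the minimum versions listed under "General information". The following is a sketch, assuming GNU `sort -V` is available and that the builds installed the usual `gdal-config`, `geos-config` and `proj.pc` files. The helper names `version_ge` and `check_min` are hypothetical.

```shell
# version_ge A B: succeeds if version A is greater than or equal to version B.
# Relies on version sort (sort -V), so that e.g. 3.13.0 ranks above 3.4.0.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# check_min NAME MINIMUM COMMAND...: run COMMAND to obtain the installed
# version and compare it against MINIMUM.
check_min() {
    name="$1"
    min="$2"
    shift 2
    if ! found="$("$@" 2>/dev/null)"; then
        echo "$name: not found (check PATH and PKG_CONFIG_PATH)"
        return 1
    fi
    if version_ge "$found" "$min"; then
        echo "$name $found (>= $min): OK"
    else
        echo "$name $found is older than the required $min"
        return 1
    fi
}

status=0
check_min GDAL 2.2.3 gdal-config --version        || status=1
check_min GEOS 3.4.0 geos-config --version        || status=1
check_min PROJ 4.9.3 pkg-config --modversion proj || status=1
```

A non-zero `status` afterwards indicates that at least one library is missing or too old.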


Additionally, installing the R packages involves compiling their bundled C++ code, for which we specify compilation options ('compiler flags'):

export CFLAGS=-I$HOME/sw/R/include


Now, we install terra and sf from within R.

module load math/R/4.4.1-mkl-2022.2.1-gnu-13.3


R -q
R> install.packages("terra")
R> library(terra)
terra 1.7.83



R> install.packages("sf")
R> library(sf)
Linking to GEOS 3.13.0, GDAL 3.9.3, PROJ 9.4.1; sf_use_s2() is TRUE

Preparations to use the terra and sf packages

Since terra and sf depend on the external programs we installed, several environment variables have to be set before using the packages to allow R to address these programs.

We recommend adding the export commands

export LD_LIBRARY_PATH=$HOME/sw/R/lib64:$LD_LIBRARY_PATH
export PATH=$PATH:$HOME/sw/R/bin
export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:$HOME/sw/R/lib64/pkgconfig
export GDAL_DATA=$HOME/sw/R/share/gdal

to your batch job scripts that use terra and sf, or running them directly on the command line if you work in an interactive session.
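
To avoid repeating these four lines in every script, one option is to keep them in a single file and load it where needed. The sketch below assumes a hypothetical file such as `~/sw/R/env.sh` that is loaded with `. ~/sw/R/env.sh`. The variable `SW_PREFIX` is also an assumption of this example. Unlike the plain exports above, it uses the `${VAR:+...}` expansion so that no empty path entry is added when a variable was previously unset.

```shell
# Hypothetical helper file, e.g. saved as ~/sw/R/env.sh and loaded with:
#   . ~/sw/R/env.sh
SW_PREFIX="${SW_PREFIX:-$HOME/sw/R}"

# Prepend the library directory; add the ':' separator only if the variable was set.
export LD_LIBRARY_PATH="$SW_PREFIX/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
# Append the directory containing gdal-config, geos-config and related tools.
export PATH="$PATH:$SW_PREFIX/bin"
# Let pkg-config find the .pc files of the self-built libraries.
export PKG_CONFIG_PATH="${PKG_CONFIG_PATH:+$PKG_CONFIG_PATH:}$SW_PREFIX/lib64/pkgconfig"
# Location of the GDAL support files.
export GDAL_DATA="$SW_PREFIX/share/gdal"
```

In a batch job script, the file would be loaded after the R module and before R is called.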