BwUniCluster2.0/Software/R/terra

From bwHPC Wiki

Latest revision as of 17:44, 28 October 2024

Note that the instructions below apply to R 4.4.1, not to R 4.2.1!

General information

terra is an R package for spatial data analysis with vector (points, lines, polygons) and raster (grid) data.

sf is an R package that provides simple feature access for R.

To install them, the following system requirements must be met:

  • GDAL 2.2.3 or higher
  • PROJ 4.9.3 or higher
  • GEOS 3.4.0 or higher
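
The minimum versions above can be compared against the versions built later in this guide with a small helper. This is only a sketch: the function names `version_ge` and `check_min` are made up for illustration, and the comparison assumes a `sort` that supports the `-V` (version sort) option, as GNU coreutils does.

```shell
#!/bin/sh
# version_ge A B: succeed if version A is greater than or equal to version B.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_min NAME HAVE NEED: print whether HAVE satisfies the minimum NEED.
check_min() {
    name=$1; have=$2; need=$3
    if version_ge "$have" "$need"; then
        echo "$name $have satisfies >= $need"
    else
        echo "$name $have is older than required $need"
        return 1
    fi
}

# Versions built in this guide, checked against the documented minimums.
check_min GDAL 3.9.3 2.2.3
check_min PROJ 9.4.1 4.9.3
check_min GEOS 3.13.0 3.4.0
```

A plain string comparison would get this wrong for GEOS, because 3.13.0 sorts before 3.4.0 alphabetically; the version sort compares the numeric components instead.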

These libraries are not available centrally on the cluster, but they can be built from source and installed manually into your $HOME directory.


Installation

Enter (or copy and paste) the code shown in the boxes below directly into your shell on bwUniCluster. The whole process takes approximately 45 minutes.

First, for compilation we obtain an interactive session with multiple cores (on a compute node):

# Obtain interactive session
salloc -n 8 -t 60 -p single


Preparations

Prepare an .R/Makevars file (if it does not already exist). This file specifies how R should compile the packages (i.e., sets some 'compiler flags').

If an .R/Makevars file is present in your home directory ($HOME), check whether the flags displayed below are already set and apply adjustments, if necessary:

cat $HOME/.R/Makevars

CXX14=g++
CXX17=g++
CXXFLAGS = -O3 -fPIC -march=cascadelake -ffp-contract=off -fno-fast-math -fno-signed-zeros -fopenmp -Wno-unknown-warning-option
CXX14FLAGS += -std=c++14
CXX17FLAGS += -std=c++17

If the flags are not set yet, run the following lines of code. Note that the first echo command overwrites an existing ~/.R/Makevars file:

mkdir -p ~/.R
echo "CXX14=g++" > ~/.R/Makevars
echo "CXX17=g++" >> ~/.R/Makevars
echo "CXXFLAGS = -O3 -fPIC -march=cascadelake -ffp-contract=off -fno-fast-math -fno-signed-zeros -fopenmp -Wno-unknown-warning-option" >> ~/.R/Makevars
echo "CXX14FLAGS += -std=c++14" >> ~/.R/Makevars
echo "CXX17FLAGS += -std=c++17" >> ~/.R/Makevars
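
If a Makevars file with other settings already exists, appending only the missing lines may be preferable to overwriting it. The following is a minimal sketch of that idea, not part of the original instructions: `ensure_line` is a hypothetical helper, and `MAKEVARS` is a hypothetical variable that defaults to a scratch file so the sketch can be tried safely. Set `MAKEVARS=$HOME/.R/Makevars` to apply it to the real file.

```shell
#!/bin/sh
# ensure_line FILE LINE: append LINE to FILE unless an identical line exists.
ensure_line() {
    file=$1; line=$2
    mkdir -p "$(dirname "$file")"
    touch "$file"
    # -x: match the whole line, -F: fixed string (no regex), -q: quiet
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Hypothetical override: defaults to a scratch file, not the real Makevars.
MAKEVARS=${MAKEVARS:-$(mktemp -d)/Makevars}

ensure_line "$MAKEVARS" 'CXX14=g++'
ensure_line "$MAKEVARS" 'CXX17=g++'
ensure_line "$MAKEVARS" 'CXXFLAGS = -O3 -fPIC -march=cascadelake -ffp-contract=off -fno-fast-math -fno-signed-zeros -fopenmp -Wno-unknown-warning-option'
ensure_line "$MAKEVARS" 'CXX14FLAGS += -std=c++14'
ensure_line "$MAKEVARS" 'CXX17FLAGS += -std=c++17'

echo "Contents of $MAKEVARS:"
cat "$MAKEVARS"
```

Running the sketch a second time leaves the file unchanged, because every line is already present. It only detects exact duplicates; a line such as `CXX14=clang++` would not be replaced, so an existing file should still be reviewed by hand.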


Next, we create the directories for the source code and installation targets, respectively. Furthermore, we load all (software) modules relevant for compilation and ensure that the compilers are found:

# We install the libraries into the ~/sw/R directory.
mkdir -p ~/sw/R

# Source directory.
mkdir -p ~/src

# Load required modules.
module purge
module load devel/cmake/3.29.3
module load devel/python/3.12.3_gnu_13.3

# Check that the GNU compiler 13.3 is loaded.
gcc --version

# Set compiler for cmake and make.
export CC=$(which gcc)
export CXX=$(which g++)
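
Before starting the long builds, it can help to confirm that every tool used in this guide is actually on the PATH. The sketch below uses a hypothetical helper named `require_cmds`; the list of tools is taken from the commands that appear in the boxes of this page.

```shell
#!/bin/sh
# require_cmds CMD...: report each command that is not on PATH and
# return non-zero if at least one is missing.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found:   $cmd -> $(command -v "$cmd")"
        else
            echo "missing: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Tools used by the build steps in this guide.
require_cmds gcc g++ cmake make wget tar git pip3 ||
    echo "Some tools are missing; load the required modules and retry." >&2
```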

The Python 3.12.3 module is missing some required packages which are included in the Python default module. Therefore, we install them in the user environment:

pip3 install --user numpy setuptools

Install external programs

First, we download the sources of GDAL, PROJ, GEOS (and their dependencies) and install them:

Install PROJ

# Download and unpack the PROJ source code:
PROJ_VER=9.4.1
cd $HOME/src
wget http://download.osgeo.org/proj/proj-$PROJ_VER.tar.gz
tar xf proj-$PROJ_VER.tar.gz

cd proj-$PROJ_VER
mkdir build && cd build

# Compile and install PROJ:
cmake -DCMAKE_INSTALL_PREFIX="$HOME/sw/R" ..
cmake --build . -j 8
cmake --build . --target install

Install GDAL

Building GDAL requires newer versions of OpenEXR and libdeflate than those available on the system.

# Download and unpack the libdeflate source code:
cd $HOME/src
git clone https://github.com/ebiggers/libdeflate

cd libdeflate
mkdir build && cd build

# Compile libdeflate
cmake -DCMAKE_INSTALL_PREFIX="$HOME/sw/R" \
      -DCMAKE_PREFIX_PATH="$HOME/sw/R" \
      -DCMAKE_BUILD_TYPE=Release ..
cmake --build . -j 8
cmake --build . --target install


# Download and unpack the OpenEXR source code:
OPENEXR_VER=3.3.1
cd $HOME/src
wget https://github.com/AcademySoftwareFoundation/openexr/releases/download/v$OPENEXR_VER/openexr-$OPENEXR_VER.tar.gz
tar xf openexr-$OPENEXR_VER.tar.gz

cd openexr-$OPENEXR_VER/
mkdir build && cd build

# Compile OpenEXR:
cmake -DCMAKE_INSTALL_PREFIX="$HOME/sw/R" -DCMAKE_BUILD_TYPE=Release ..
cmake --build . -j 8
cmake --build . --target install

Now, all necessary dependencies are available and we can build GDAL:

# Download and unpack the GDAL source code:
cd $HOME/src
GDAL_VER=3.9.3
wget http://download.osgeo.org/gdal/$GDAL_VER/gdal-$GDAL_VER.tar.gz
tar xf gdal-$GDAL_VER.tar.gz

cd gdal-$GDAL_VER
mkdir build && cd build

cmake -DCMAKE_INSTALL_PREFIX="$HOME/sw/R" \
      -DCMAKE_PREFIX_PATH="$HOME/sw/R" \
      -DCMAKE_BUILD_TYPE=Release ..
cmake --build . -j 8
cmake --build . --target install

Install GEOS

The last external package that needs to be compiled and installed is GEOS:

# Download and unpack the GEOS source code.
cd $HOME/src
GEOS_VER=3.13.0
wget http://download.osgeo.org/geos/geos-$GEOS_VER.tar.bz2
tar xf geos-$GEOS_VER.tar.bz2


cd geos-$GEOS_VER
mkdir build && cd build

# Compile GEOS:
cmake -DCMAKE_INSTALL_PREFIX="$HOME/sw/R" \
      -DCMAKE_PREFIX_PATH="$HOME/sw/R" \
      -DCMAKE_BUILD_TYPE=Release ..
make -j 8
ctest -j 8
make install
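
After the three builds, a quick look into the installation prefix shows whether each library was installed. The sketch below is an assumption-laden check rather than an official procedure: `check_prefix` is a hypothetical helper, and it assumes that the builds place the programs `proj`, `gdal-config` and `geos-config` in the `bin` directory of the prefix.

```shell
#!/bin/sh
# check_prefix PREFIX: verify that the expected helper programs exist and are
# executable below PREFIX/bin. The list of programs is an assumption.
check_prefix() {
    prefix=$1
    status=0
    for tool in proj gdal-config geos-config; do
        if [ -x "$prefix/bin/$tool" ]; then
            echo "ok:      $prefix/bin/$tool"
        else
            echo "missing: $prefix/bin/$tool"
            status=1
        fi
    done
    return "$status"
}

check_prefix "$HOME/sw/R" || echo "At least one library is not installed yet."
```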

Install the R packages

In order to install the two R packages, we need to inform R where to find PROJ, GDAL and GEOS, so we export the necessary paths:

export LD_LIBRARY_PATH=$HOME/sw/R/lib64:$LD_LIBRARY_PATH
export PATH=$PATH:$HOME/sw/R/bin
export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:$HOME/sw/R/lib64/pkgconfig
export GDAL_DATA=$HOME/sw/R/share/gdal
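
When these exports are run repeatedly in the same shell, the variables accumulate duplicate entries. A guarded variant could look like the sketch below. `path_add` is a hypothetical helper, not something provided by the cluster; it adds a directory only if it is not yet listed, at the front or the back as in the exports above.

```shell
#!/bin/sh
# path_add VARNAME DIR [front|back]: add DIR to the colon-separated variable
# VARNAME unless it is already listed. The default position is the back.
path_add() {
    var=$1; dir=$2; where=${3:-back}
    eval "current=\${$var:-}"
    case ":$current:" in
        *":$dir:"*) return 0 ;;   # already present, nothing to do
    esac
    if [ -z "$current" ]; then
        new=$dir
    elif [ "$where" = front ]; then
        new=${dir}:${current}
    else
        new=${current}:${dir}
    fi
    eval "export $var=\"\$new\""
}

PREFIX=$HOME/sw/R
path_add LD_LIBRARY_PATH "$PREFIX/lib64" front
path_add PATH "$PREFIX/bin"
path_add PKG_CONFIG_PATH "$PREFIX/lib64/pkgconfig"
export GDAL_DATA=$PREFIX/share/gdal
```

Unlike the plain exports, this variant does not leave a leading colon in PKG_CONFIG_PATH when the variable was previously empty.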


Additionally, installing the R packages involves compiling their bundled C++ code. We therefore set a compilation option ('compiler flag') that points the compiler to the headers of the libraries installed above:

export CFLAGS=-I$HOME/sw/R/include


Now, we install terra and sf from within R.

module load math/R/4.4.1-mkl-2022.2.1-gnu-13.3


R -q
R> install.packages("terra")
R> library(terra)
terra 1.7.83



R> install.packages("sf")
R> library(sf)
Linking to GEOS 3.13.0, GDAL 3.9.3, PROJ 9.4.1; sf_use_s2() is TRUE

Preparations to use the terra and sf packages

Since terra and sf depend on the external libraries we installed, several environment variables have to be set before using the packages so that R can locate these libraries.

We recommend adding the export commands

export LD_LIBRARY_PATH=$HOME/sw/R/lib64:$LD_LIBRARY_PATH
export PATH=$PATH:$HOME/sw/R/bin
export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:$HOME/sw/R/lib64/pkgconfig
export GDAL_DATA=$HOME/sw/R/share/gdal

to your batch job scripts that use terra and sf, or running them on the command line before starting R in an interactive session.
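
As an illustration, the following sketch writes a batch job script that contains these exports. Several details are assumptions made for the example: the file name `terra_job.sh`, the R script name `analysis.R`, and the requested resources (one task, 30 minutes) are placeholders to adapt. The partition and the R module are the ones used earlier on this page.

```shell
#!/bin/sh
# Write an example job script. JOB_FILE and analysis.R are hypothetical names.
JOB_FILE=${JOB_FILE:-terra_job.sh}

cat > "$JOB_FILE" <<'EOF'
#!/bin/bash
#SBATCH --partition=single
#SBATCH --ntasks=1
#SBATCH --time=00:30:00

# Same R module that was used to install terra and sf.
module load math/R/4.4.1-mkl-2022.2.1-gnu-13.3

# Make the self-built PROJ, GDAL and GEOS visible to R.
export LD_LIBRARY_PATH=$HOME/sw/R/lib64:$LD_LIBRARY_PATH
export PATH=$PATH:$HOME/sw/R/bin
export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:$HOME/sw/R/lib64/pkgconfig
export GDAL_DATA=$HOME/sw/R/share/gdal

# Placeholder for your own R script that loads terra and/or sf.
Rscript analysis.R
EOF

echo "Wrote $JOB_FILE; submit it with: sbatch $JOB_FILE"
```

The quoted heredoc delimiter ('EOF') keeps `$HOME` and the other variables unexpanded in the written file, so they are resolved when the job runs rather than when the script is generated.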