Helix/Software/R
Latest revision as of 14:28, 27 January 2025
| The main documentation is available  on the cluster via  | 
| Description | Content | 
|---|---|
| module load | math/R | 
| License | GPL | 
| Citing | n/a | 
| Links | Homepage, Documentation |
| Graphical Interface | No | 
| Plugins | User dependent | 
Description
R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.
R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.
One of R’s strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.
R is available as Free Software under the terms of the Free Software Foundation’s GNU General Public License in source code form. It compiles and runs on a wide variety of UNIX platforms and similar systems (including FreeBSD and Linux), Windows and MacOS.
The R installation also provides the standalone library libRmath. This library allows you to access R routines from your own C or C++ programs (see section 9 of the 'R Installation and Administration' manual).
Package management
Installing R-Packages into your home folder
Since we cannot provide a software module for every R package, we recommend installing additional R packages locally into your home folder. Depending on the package, some dependencies required for the installation may only be available on the login nodes.
> library()                                                                # List pre-installed packages
> install.packages('package_name', repos="http://cran.r-project.org")      # Install your R package and the dependencies 
> library(package_name)                                                    # Load the package into your R instance
The package is now installed permanently in your home folder and is available every time you start R.
Note:
By default R uses a version (and platform) specific path for personal libraries, such as "$HOME/R/x86_64-pc-linux-gnu-library/x.y" for R version x.y.z. This directory will be created automatically (after confirmation) when installing a personal package for the first time.
Users can customize a common location of their personal library packages, e.g. ~/R_libs, rather than the default location. A customized directory must exist before installing a personal package for the first time, i.e.
$ mkdir -p ~/R_libs
For R to pick up this location, it must also be defined in the configuration file ~/.Renviron in your home directory, which must contain the following line:
R_LIBS_USER="~/R_libs"
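The two steps above (creating the directory and adding the line to ~/.Renviron) can be combined in a small shell helper. This is only a minimal sketch: the function name setup_r_libs and its two optional arguments are hypothetical and not part of the cluster setup. It creates the library directory and appends the R_LIBS_USER line only if ~/.Renviron does not define one yet.

```shell
# Hypothetical helper, for illustration only (not a cluster-provided tool).
# $1: name of the library directory below the home directory (default: R_libs)
# $2: home directory that holds .Renviron (default: $HOME)
setup_r_libs() {
    lib_name="${1:-R_libs}"
    home_dir="${2:-$HOME}"
    renviron="$home_dir/.Renviron"

    # The directory must exist before the first package is installed.
    mkdir -p "$home_dir/$lib_name" || return 1

    # Append the setting only if no R_LIBS_USER line is present yet,
    # so that running the helper twice does not duplicate the entry.
    if ! grep -q '^R_LIBS_USER=' "$renviron" 2>/dev/null; then
        printf 'R_LIBS_USER="~/%s"\n' "$lib_name" >> "$renviron"
    fi
}

# Example call using the defaults (~/R_libs and ~/.Renviron):
# setup_r_libs
```

If ~/.Renviron already contains a different R_LIBS_USER entry, the helper leaves it untouched; in that case the file should be edited by hand.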
By setting up a (fixed) custom location for personal library packages, any personal package installed into that directory will be visible across different R versions. This may be advantageous if the packages are to be used with different (future) R versions.
A version-specific path, such as the default path, allows users to maintain multiple personal library stacks for different (major and minor) R versions and also prevents them from mixing their stack with libraries built with different R versions.
The drawback is that, whenever switching to a new R release, the personal library stack must be rebuilt with that new R version into the corresponding (version-specific) library path. This is considered good practice anyway in order to ensure a consistent personal library stack for any specific R version in use. Mixing libraries built with different major and minor R versions is discouraged, as this may result in unpredictable and subtle errors. Packages that are built and installed with one version of R may be incompatible with a newer version of R, at least when the major or minor version changes. The same is true if several versions are used simultaneously, e.g. a newer R version for a more recently started project and an older version for another project (which may then pick up libraries built with the newer R version).
Because the default version may change over time, it is highly recommended to always load a specific version, e.g.
$ module load math/R/4.3.3
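To check which default personal library path belongs to a pinned version, the naming scheme from the note above can be reproduced in the shell. This is a minimal sketch under stated assumptions: the function name r_default_user_lib is hypothetical, and the platform string x86_64-pc-linux-gnu is copied from the example path in the note and assumed to match the cluster.

```shell
# Hypothetical helper, for illustration only.
# $1: full R version in the form x.y.z, e.g. 4.3.3
# Prints the default personal library path, which depends only on x.y.
r_default_user_lib() {
    version="$1"
    major_minor="${version%.*}"   # drop the patch level: 4.3.3 -> 4.3
    printf '%s/R/x86_64-pc-linux-gnu-library/%s\n' "$HOME" "$major_minor"
}

# Example: list the personal packages that belong to R 4.3.x
# ls "$(r_default_user_lib 4.3.3)"
```

Inside a running R session, .libPaths() shows the library paths that are actually in use, which is the authoritative check.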
Pre-installed R-packages
- Rmpi
- iterators
- foreach
- doMPI
- doParallel
Custom installation options
Makevars
You can add custom build and installation flags for new R packages through the ~/.R/Makevars file. This file takes the form of a basic Makefile, and its contents are added to the existing options set by the module. While the module's defaults should be appropriate for installing most packages, it can be helpful or necessary to include additional options, for example to point g++ to locally installed dependencies:
MAKEFLAGS=-j4
CXXFLAGS=-I/path/to/dependency/include -L/path/to/dependency/lib -llib
It is also possible to set more targeted instruction sets, e.g. for AMD processors:
CXXFLAGS=-march=znver3
The defaults can be found under $R_HOME_DIR/lib64/R/etc/Makeconf. It is usually not advisable to change the compilers (gcc, g++) themselves.
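Creating the Makevars file can also be scripted. The following is a minimal sketch, not a cluster-provided tool: the function name write_makevars and its optional argument are hypothetical, and the flags written to the file are simply the example values from this section, which should be adapted to the package being built. An existing file is copied to Makevars.bak before it is replaced.

```shell
# Hypothetical helper, for illustration only.
# $1: home directory that should contain .R/Makevars (default: $HOME)
write_makevars() {
    home_dir="${1:-$HOME}"
    makevars="$home_dir/.R/Makevars"

    mkdir -p "$home_dir/.R" || return 1

    # Keep a copy of an existing file instead of silently overwriting it.
    if [ -f "$makevars" ]; then
        cp "$makevars" "$makevars.bak" || return 1
    fi

    # Example values taken from this page; adjust them as needed.
    cat > "$makevars" <<'EOF'
MAKEFLAGS=-j4
CXXFLAGS=-march=znver3
EOF
}

# Example call using the default location (~/.R/Makevars):
# write_makevars
```

Since the file applies to every subsequent package installation, it is worth removing or reverting package-specific flags once the package in question has been built.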
Configure arguments
You can pass further configure.args and configure.vars as character vectors to install.packages(), e.g.:
install.packages("rjags", configure.args = "--enable-rpath")
Tips for local dependencies
It can make sense to build dependencies with MKL for BLAS/LAPACK in the same way that it is used for R. Specifically, the MKL libraries and options that were used to install R are the following:
-Wl,--no-as-needed -lmkl_gf_lp64 -Wl,--start-group -lmkl_gnu_thread -lmkl_core -Wl,--end-group -fopenmp -ldl -lpthread -lm