Difference between revisions of "BwUniCluster2.0/Software/R"

From bwHPC Wiki

Revision as of 14:54, 21 April 2021

Description            Content
module load            math/R
License                GPL
Citing                 n/a
Links                  Homepage | Documentation
Graphical Interface    No
Plugins                User dependent

1 Description

R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.

R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.

One of R’s strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.

R is available as Free Software under the terms of the Free Software Foundation’s GNU General Public License in source code form. It compiles and runs on a wide variety of UNIX platforms and similar systems (including FreeBSD and Linux), Windows, and macOS.

2 Availability

R is available on selected bwHPC-Clusters. A complete list of versions currently installed on the bwHPC-Clusters can be obtained from the Cluster Information System (CIS).

In order to check which versions of R are installed on the compute cluster, run the following command:

$ module avail math/R

3 Usage

The R installation also provides the standalone library libRmath. This library allows you to access R routines from your own C or C++ programs (see section 9 of the 'R Installation and Administration' manual).

3.1 Loading the module

You can load the default version of R with the following command:

$ module load math/R

The module will try to load all modules it needs to function (e.g. compiler/intel). If loading the module fails, check whether you have already loaded one of those modules in a version other than the one required by R.

If you wish to load another (older) version of R, you can do so using

$ module load math/R/<version>

with <version> specifying the desired version.

3.2 Program Binaries

Standard usage:

Usage: R [options] [< infile] [> outfile]
    R CMD command [arguments]
  
Example: R CMD BATCH script.R

Executing R in batch mode:

R CMD BATCH --no-save --no-restore <INPUT_FILE>.R
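
The batch call above can be wrapped in a small shell function so that it is easier to reuse. The following is only a sketch: the function name run_r_batch and the file name analysis.R are made up for illustration and are not provided by the module. It relies on the documented behaviour that R CMD BATCH writes its output to a file named after the input, with the .R extension replaced by .Rout.

```shell
#!/bin/bash
# Sketch: run an R script in batch mode and report where the output went.
# run_r_batch and analysis.R are illustrative names, not part of the R module.

run_r_batch() {
    local script="$1"
    local log="${script%.R}.Rout"    # same name R CMD BATCH would pick by default

    # Fail early with a hint if R is not on the PATH (module not loaded yet).
    if ! command -v R >/dev/null 2>&1; then
        echo "R not found; load the module first (module load math/R)" >&2
        return 1
    fi

    R CMD BATCH --no-save --no-restore "$script" "$log" \
        && echo "output written to $log"
}
```

After loading the module, calling run_r_batch analysis.R would run the script and leave its output in analysis.Rout.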

For help, run

R --help

For help on a specific command, run

R CMD command --help

Further information and help

Man pages:          man R               man Rscript
Info pages, e.g.:   info R-intro        info R-FAQ
Manuals:            $R_DOC_DIR/manual

3.3 Multithreading in R

An easy way to use multiple cores on a single node with R is to use the doParallel package in combination with foreach.
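
As a rough illustration of this approach, the following shell snippet writes a small R script that runs a foreach loop on several cores of one node. It is a sketch under assumptions: the file name foreach_demo.R is made up, the doParallel package must already be installed (for example in your personal library), and the core count is read from SLURM_CPUS_PER_TASK, which assumes a Slurm batch system; outside of a job it falls back to 2.

```shell
#!/bin/bash
# Sketch: create a small R script that uses doParallel together with foreach.
# foreach_demo.R is an illustrative file name.

cat > foreach_demo.R <<'EOF'
library(foreach)
library(doParallel)

# Assumption: Slurm sets SLURM_CPUS_PER_TASK inside a job; use 2 cores otherwise.
n_cores <- as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", unset = "2"))

cl <- makeCluster(n_cores)     # start worker processes on this node
registerDoParallel(cl)         # make them the backend for %dopar%

# Each iteration runs on one of the workers; results are combined with c().
squares <- foreach(i = 1:8, .combine = c) %dopar% {
    i^2
}
print(squares)

stopCluster(cl)                # always shut the workers down again
EOF

# Run it after loading the module and installing doParallel:
#   Rscript foreach_demo.R
```

Stopping the cluster at the end matters on a shared node, since worker processes that are left running keep occupying cores.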

4 Examples

As with all processes that require more than a few minutes to run, non-trivial compute jobs must be submitted to the cluster queuing system.

Example scripts are available in the directory $R_EXA_DIR:

$ module show math/R                      # show environment variables, which will be available after 'module load'
$ module load math/R                      # load module
$ ls $R_EXA_DIR                           # show content of directory $R_EXA_DIR
$ cat $R_EXA_DIR/README                   # show examples README
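
For orientation, a job script for an R batch run might look like the sketch below. It is not taken from $R_EXA_DIR and rests on several assumptions: that the cluster uses Slurm (job submission with sbatch), that the resource values shown suit your job, that the version 3.6.3 is installed (check with module avail math/R), and that my_script.R is a placeholder for your own script. The snippet only writes the job script to a file named r_job.sh; it does not submit anything.

```shell
#!/bin/bash
# Sketch: write a minimal job script for an R batch run.
# Assumes Slurm; r_job.sh and my_script.R are illustrative names.

cat > r_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=r-example
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --time=00:10:00

# Load a fixed R version so the job does not change when the default moves.
module load math/R/3.6.3

# Run the R script non-interactively; output goes to my_script.Rout.
R CMD BATCH --no-save --no-restore my_script.R
EOF

echo "wrote r_job.sh; submit with: sbatch r_job.sh"
```

The README in $R_EXA_DIR remains the authoritative source for the options that apply to a particular cluster.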


5 Installing R-Packages into your home folder

Since we cannot provide a software module for every R package, we recommend installing additional R packages locally in your home folder. One way to do this is from within an interactive R session:

> library()                                                                # List preinstalled packages
> install.packages('package_name', repos="http://cran.r-project.org")      # Install your R package and its dependencies
> library(package_name)                                                    # Load the package into your R session

The package is now installed permanently in your home folder and is available every time you start R.

Note:

By default, R uses a version- and platform-specific path for personal libraries, such as "$HOME/R/x86_64-pc-linux-gnu-library/x.y" for R version x.y.z. This directory is created automatically (after confirmation) when you install a personal package for the first time.

Users can instead choose a common location for their personal library packages, e.g. ~/R_libs. A custom directory must exist before a personal package is installed for the first time, so create it first:

$ mkdir -p ~/R_libs

The location must also be defined in the configuration file ~/.Renviron in your home directory, which must contain the following line:

R_LIBS_USER="~/R_libs"
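
The two steps above (creating the directory and adding the line to ~/.Renviron) can be combined in a small shell function. This is a sketch only: the function name setup_r_libs is made up for illustration, and the directory ~/R_libs follows the example used in the text. The entry is appended only if it is not already present, so running the function twice does not duplicate it.

```shell
#!/bin/bash
# Sketch: create a custom personal library and register it in ~/.Renviron.
# setup_r_libs is an illustrative name; ~/R_libs follows the example above.

setup_r_libs() {
    local lib_dir="$HOME/R_libs"
    local renviron="$HOME/.Renviron"
    local entry='R_LIBS_USER="~/R_libs"'

    # The directory must exist before the first package is installed.
    mkdir -p "$lib_dir"

    # Append the entry only if that exact line is not in the file yet.
    if ! grep -qxF "$entry" "$renviron" 2>/dev/null; then
        echo "$entry" >> "$renviron"
    fi
}
```

Call setup_r_libs once from a login shell; R sessions started afterwards should then list the new directory in the output of .libPaths().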

By setting up a (fixed) custom location for personal library packages, any personal package installed into that directory will be visible across different R versions. This may be advantageous if the packages are to be used with different (future) R versions.

A version-specific path, such as the default path, allows users to maintain separate personal library stacks for different (major and minor) R versions and also prevents them from mixing their stack with libraries built with different R versions.

The drawback is that, whenever you switch to a new R release, the personal library stack must be rebuilt with that new R version in the corresponding (version-specific) library path. However, this is considered good practice anyway, as it ensures a consistent personal library stack for each R version in use. Mixing libraries built with different major and minor R versions is discouraged, as this may result in unpredictable and subtle errors. Packages built and installed with one version of R may be incompatible with a newer version of R, at least when the major or minor version changes. The same applies if several versions are used side by side, e.g. a newer R version for a recently started project and an older version for another project, which may then pick up libraries built with the newer R version.

Users who always load the default version should also take special care, i.e.

$ module load math/R

as the default version may change at any time. It is therefore highly recommended to always load a specific version, e.g.

$ module load math/R/3.6.3

5.1 Optional packages for R

The following guides provide detailed instructions on how to build optional R packages. Please open a ticket if the instructions do not work for you or are outdated.

* Rgdal
* Rjags
* Rstan

6 Version-specific Information

For specific help about a particular R version, check the information available via the module system with the following command:

$ module help math/R

7 Installed R plugins

* Bioinformatics: cummeRbund