SAS - bwHPC Wiki

SAS

From bwHPC Wiki
Description | Content
module load | math/sas
Availability | bwForCluster_MLS&WISO_Production
License | Commercial
Citing | n/a
Links | Homepage, Documentation
Graphical Interface | Yes
Comments | The usage of SAS may be restricted to certain user communities according to the terms of the license.

1 Description

SAS is a software suite developed by SAS Institute GmbH for advanced analytics, multivariate analyses, business intelligence, data management, and predictive analytics.

2 Versions and Availability

A list of the versions currently available on all bwHPC-C5 clusters can be obtained from the Cluster Information System (CIS).

3 Loading

To check whether SAS is available, execute

$ module avail math/sas

If SAS is available, you can load a specific version, or load the default version with

$ module load math/sas

4 General Usage

SAS can be used interactively with a graphical front end, or it can run a script in batch mode, which is useful when submitting batch jobs to the cluster. After loading the module, the different modes are used as follows.

Interactively with GUI (needs X11 forwarding):
$ sas
Batch mode. Input is taken from script example.sas:
$ sas example.sas
It is recommended to use the local SSD disk on the compute nodes for the SAS work directory with the following option:
$ sas -work $TMPDIR
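
The batch mode and the work-directory option can be combined in a single call. The following is a minimal sketch, assuming the script is named example.sas as above. The -log and -print options make the destinations of the log and listing files explicit; the file names example.log and example.lst and the fallback /tmp are illustrative assumptions. The command is only executed if sas is found on the PATH.

```shell
#!/bin/sh
# Sketch: run example.sas in batch mode with the SAS work directory on node-local storage.
# Assumption: TMPDIR points to the local disk inside a job; /tmp is only an illustrative fallback.
work_dir="${TMPDIR:-/tmp}"

# Collect the arguments as positional parameters so that paths with spaces stay intact.
set -- example.sas -work "$work_dir" -log example.log -print example.lst

# Show the command, then run it only if SAS is on the PATH (module loaded).
cmd_line="sas $*"
echo "$cmd_line"
if command -v sas >/dev/null 2>&1; then
    sas "$@"
fi
```

Printing the command first makes it easy to check in the job output which work directory was actually used.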

For an introduction to SAS, see the online documentation (SAS Documentation).

5 Parallel Computation

The SAS options CPUCOUNT and THREADS control the parallel execution of procedures such as SORT or MEANS on several CPU cores of a compute node. On nodes with 16 cores, use CPUCOUNT=16. The option THREADS is active by default. To check the settings, use:

Proc Options Option=(Cpucount Threads); Run;
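
CPUCOUNT can also be set when SAS is started, so that the value follows the cores available to the job instead of a hard-coded 16. The following is a minimal sketch under several assumptions: the job runs on a single node, the batch system sets MOAB_PROCCOUNT (otherwise nproc from GNU coreutils is used, with 1 as a last resort), and example.sas is the script from the General Usage section.

```shell
#!/bin/sh
# Sketch: derive CPUCOUNT from the cores available to the job.
# Assumptions: single-node job; MOAB_PROCCOUNT is set by the batch system when present;
# otherwise nproc reports the cores usable by this process, and 1 is the last fallback.
cores="${MOAB_PROCCOUNT:-$(nproc 2>/dev/null || echo 1)}"

# -cpucount and -threads are the invocation forms of the CPUCOUNT and THREADS options.
echo "sas example.sas -cpucount $cores -threads"

# Run only if SAS is on the PATH (i.e. the module has been loaded).
if command -v sas >/dev/null 2>&1; then
    sas example.sas -cpucount "$cores" -threads
fi
```

For jobs that span several nodes, the total processor count is not the right value, since CPUCOUNT refers to the cores of one node.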

In addition, MP CONNECT, which is part of SAS/CONNECT, allows several program steps to be executed in parallel. With RSUBMIT, workload can be sent to parallel sessions.

The following SAS programs show how to use the two possibilities for parallelization. Both examples use a single node with up to 16 cores. The second example could use more than one node by employing signon procedures.

5.1 Example 1

Simple sort and print step. Requirement: The folder 'daten' must exist.

/* Library for the output data set; the folder ~/daten must already exist. */
Libname meine "~/daten";
/* Allow threaded procedures to use up to 16 cores. */
Options Cpucount=16 Threads;
Proc Sort Data=sashelp.class Out=meine.class;
   By name;
Run;
Title "Table class sorted by name";
Footnote "Calculated on %sysfunc(date(), ddmmyyp10.) at %sysfunc(time(), time5.)";
Proc Print Data=meine.class Label;
   Var name age;
   Label age="Age";
Run;
/* Reset title and footnote. */
Title; Footnote;

5.2 Example 2

Two data and sort steps are executed in parallel (taska and taskb, each started with rsubmit) and merged afterwards (normal submit). The path of each remote work library is saved in a macro variable. For further information see: http://www2.sas.com/proceedings/sugi29/124-29.pdf

Options Sascmd="sas -Nosyntaxcheck" Autosignon Cpucount=16;

Rsubmit taska Wait=No Sysrputsync=Yes;
Data aaa;
   Do i=1 To 1000;
     x=Rannor(-123);
     group=1;
     Output;
   End;
Run;
Proc Sort Data=aaa;
   By x;
Run;
%Sysrput patha=%Sysfunc(Pathname(Work));
Endrsubmit;

Rsubmit taskb Wait=No Sysrputsync=Yes;
Data bbb;
   Do i=1 To 1000;
     x=3+Rannor(-123);
     group=2;
     Output;
   End;
Run;
Proc Sort Data=bbb;
   By x;
Run;
%Sysrput pathb=%Sysfunc(Pathname(Work));
Endrsubmit;

Waitfor _all_ taska taskb;
Libname worka "&patha";
Libname workb "&pathb";

Libname neu "~";
Data neu.gesamt;
   Set worka.aaa workb.bbb;
Run;
Title "Mean and further statistics of random numbers";
Footnote "Calculated on %sysfunc(date(), ddmmyyp10.) at %sysfunc(time(), time5.)";
Proc Means Data=neu.gesamt;
   Class group;
Run;
Title; Footnote;

Signoff taska;
Signoff taskb;

6 Batch Example

As with all processes that require more than a few minutes to run, non-trivial compute jobs must be submitted to the cluster queuing system.

An example script is available in the directory $SAS_EXA_DIR:

$ module show math/sas             # show environment variables, which will be available after 'module load'
$ module load math/sas             # load module
$ ls $SAS_EXA_DIR                  # show content of directory $SAS_EXA_DIR

Run a first simple example job:

$ module load math/sas                # load module
$ mkdir sastest                       # create test directory
$ cp -r $SAS_EXA_DIR/*  sastest/      # copy example files to test directory
$ cd sastest/                         # change to directory
$ nano bwhpc-sas.moab                 # change job options if desired, quit with 'CTRL+X'
$ msub bwhpc-sas.moab                 # submit job
$ checkjob -v <JOBID>                 # check state of job
$ ls                                  # when job finishes the results will be visible in this directory
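
The contents of bwhpc-sas.moab are not reproduced on this page. As a rough orientation, a Moab job script for a single-node SAS run could look like the following sketch. The job name, the resource values, and the script name example.sas are assumptions for illustration and do not describe the shipped example. The guards around module and sas only keep the script harmless when it is run outside the cluster environment.

```shell
#!/bin/bash
# Hypothetical job name and resource requests; adjust to the actual job.
#MSUB -N sas_example
#MSUB -l nodes=1:ppn=16
#MSUB -l walltime=00:10:00

# Load SAS if the module system is available in this shell.
if type module >/dev/null 2>&1; then
    module load math/sas
fi

# Run in the directory the job was submitted from (set by Moab); stay put otherwise.
cd "${MOAB_SUBMITDIR:-$PWD}" || exit 1

# Use the node-local disk for the SAS work directory; /tmp is only an illustrative fallback.
work_dir="${TMPDIR:-/tmp}"

if command -v sas >/dev/null 2>&1; then
    sas example.sas -work "$work_dir" -cpucount 16
else
    echo "sas not found; load the module first" >&2
fi
```

Such a script would be submitted with msub in the same way as bwhpc-sas.moab above.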

7 References

Support for Parallel Processing, excerpt from SAS 9.4 Language Reference: Concepts, Second Edition
http://documentation.sas.com/?docsetId=lrcon&docsetTarget=n0z5kinpzecv9nn1s45yam93tf6z.htm&docsetVersion=9.4&locale=de

M.M. Buchecker: Parallel Processing Hands-On Workshop. SUGI 29, Paper 124-29,
http://www2.sas.com/proceedings/sugi29/124-29.pdf

G. Jacobsen, R. Lavery: A Parallel Processing Primer. NESUG 17 (Northeast SAS Users Group), Administration & Support,
http://www.lexjansen.com/nesug/nesug04/as/as02.pdf

SAS Techies: SAS/CONNECT Parallel Processing on a SAS SMP Machine.
http://sastechies.blogspot.com/2010/01/sasconnect-parallel-processing-on-sas.html

Parallelisierung mit MP Connect (Parallelization with MP Connect). In: German-language SAS wiki.
http://saswiki.org/wiki/Parallelisierung_mit_MP_Connect