Helix/Software/SAS
The main documentation is available on the cluster via |
SAS | |
---|---|
module load | math/SAS |
License | commercial (restricted to certain user communities) |
Links | Homepage | Documentation |
Graphical Interface | Yes |
Introduction
SAS is a software suite developed by SAS Institute GmbH for advanced analytics, multivariate analyses, business intelligence, data management, and predictive analytics.
License
The software is available to users of Heidelberg University. It can also be provided to users of other universities if they can prove that they hold a valid license.
General Usage
SAS can be used interactively through a graphical front end, or it can run a script in batch mode, which is the appropriate choice when submitting batch jobs to the cluster. After loading the SAS module, the two modes are used as follows.
Interactively with GUI (needs X11 forwarding):
$ sas
Batch mode (input is taken from the script example.sas):
$ sas example.sas
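For cluster jobs, the batch-mode call can be wrapped in a job script. The following is a minimal sketch, assuming the scheduler is Slurm; the resource values, the helper name `run_sas_batch`, and the `SAS_BIN` variable are illustrative assumptions, not part of the SAS installation. `-sysin`, `-log`, and `-print` are SAS invocation options that name the program, the log file, and the listing file.

```shell
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --time=00:30:00

# Hypothetical helper: run one SAS program in batch mode and write
# the log and listing next to it (example.sas -> example.log, example.lst).
run_sas_batch() {
    local program="$1"
    local base="${program%.sas}"
    # SAS_BIN can be overridden for a dry run, e.g. SAS_BIN=echo
    "${SAS_BIN:-sas}" -sysin "$program" -log "${base}.log" -print "${base}.lst"
}

# Run automatically only inside a Slurm job.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    module load math/SAS
    run_sas_batch example.sas
fi
```

Setting `SAS_BIN=echo` before calling `run_sas_batch example.sas` prints the command line instead of starting SAS, which is a cheap way to check the file names before submitting.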
For an introduction to SAS we refer to the online documentation (SAS Documentation).
Parallel Computation
The SAS options CPUCOUNT and THREADS control the parallel execution of procedures such as SORT or MEANS on several CPU cores of a compute node. Start with CPUCOUNT=2 and double the value step by step as long as the runtime still improves noticeably. The THREADS option is enabled by default. To check the current settings, use:
Proc Options Option=(Cpucount Threads); Run;
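The doubling strategy described above can be automated from the shell, because CPUCOUNT and THREADS can also be set as options when SAS is invoked. The following is a sketch only: the function name `scan_cpucount`, the `SAS_BIN` variable, and the log file naming are hypothetical, and timing is limited to whole seconds.

```shell
#!/bin/bash
# Hypothetical helper: run the same SAS program with CPUCOUNT=2,4,8,...
# up to max_cores and report the wall-clock time of each run.
scan_cpucount() {
    local program="$1" max_cores="$2"
    local base="${program%.sas}"
    local n=2 start elapsed
    while [ "$n" -le "$max_cores" ]; do
        start=$(date +%s)
        # SAS_BIN can be overridden for a dry run, e.g. SAS_BIN=echo
        "${SAS_BIN:-sas}" -cpucount "$n" -threads \
            -sysin "$program" -log "${base}_cpu${n}.log"
        elapsed=$(( $(date +%s) - start ))
        echo "CPUCOUNT=${n} elapsed=${elapsed}s"
        n=$(( n * 2 ))
    done
}
```

A call such as `scan_cpucount example.sas 16` runs the program four times (2, 4, 8, and 16 cores). The scan is only meaningful if the program itself does not set Cpucount in an Options statement, since that would override the value given at invocation.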
In addition, MPCONNECT from the module SAS/CONNECT allows the parallel execution of several program steps. With RSUBMIT it is possible to send workload to parallel sessions.
The following SAS programs show how to use the two approaches to parallelization. Both examples use up to 16 cores on a compute node. The second example could use more than one node by employing signon procedures.
Example 1
Simple sort and print step. Requirement: The folder 'daten' must exist.
Libname meine "~/daten";
Options Cpucount=16 Threads;

Proc Sort Data=sashelp.class Out=meine.class;
  By name;
Run;

Title "Table class sorted by name";
Footnote "Calculated on %sysfunc(date(), ddmmyyp10.) at %sysfunc(time(), time5.)";
Proc Print Data=meine.class Label;
  Var name age;
  Label age="Age";
Run;
Title; Footnote;
Example 2
Two data and sort steps are executed in parallel (taska and taskb, each started with Rsubmit) and merged afterwards in the main session (normal submit). The paths of the remote Work libraries are passed back as macro variables. For further information see: http://www2.sas.com/proceedings/sugi29/124-29.pdf
Options Sascmd="sas -Nosyntaxcheck" Autosignon Cpucount=16;

Rsubmit taska Wait=No Sysrputsync=Yes;
  Data aaa;
    Do i=1 To 1000;
      x=Rannor(-123); group=1; Output;
    End;
  Run;
  Proc Sort Data=aaa; By x; Run;
  %Sysrput patha=%Sysfunc(Pathname(Work));
Endrsubmit;

Rsubmit taskb Wait=No Sysrputsync=Yes;
  Data bbb;
    Do i=1 To 1000;
      x=3+Rannor(-123); group=2; Output;
    End;
  Run;
  Proc Sort Data=bbb; By x; Run;
  %Sysrput pathb=%Sysfunc(Pathname(Work));
Endrsubmit;

Waitfor _all_ taska taskb;

Libname worka "&patha";
Libname workb "&pathb";
Libname neu "~";
Data neu.gesamt;
  Set worka.aaa workb.bbb;
Run;

Title "Means and further statistics of random numbers";
Footnote "Calculated on %sysfunc(date(), ddmmyyp10.) at %sysfunc(time(), time5.)";
Proc Means Data=neu.gesamt;
  Class group;
Run;
Title; Footnote;

Signoff taska;
Signoff taskb;
References
Support for Parallel Processing, excerpt from SAS 9.4 Language Reference: Concepts, Second Edition,
http://documentation.sas.com/?docsetId=lrcon&docsetTarget=n0z5kinpzecv9nn1s45yam93tf6z.htm&docsetVersion=9.4&locale=de
M.M. Buchecker: Parallel Processing Hands-On Workshop. SUGI 29, Paper 124-29,
http://www2.sas.com/proceedings/sugi29/124-29.pdf
G. Jacobsen, R. Lavery: A Parallel Processing Primer. NESUG 17 (Northeast SAS Users Group), Administration & Support,
http://www.lexjansen.com/nesug/nesug04/as/as02.pdf
SAS Techies: SAS/CONNECT Parallel Processing on a SAS SMP Machine.
http://sastechies.blogspot.com/2010/01/sasconnect-parallel-processing-on-sas.html