Workspace: Difference between revisions

From bwHPC Wiki
m (table with site features)
Tag: Reverted
Line 1: Line 1:
'''Workspace tools''' provide temporary scratch space so calles '''workspaces''' for your calculation on a central file storage. They are meant to keep data for a limited time – but usually longer than the time of a single job run.
'''Workspace tools''' provide temporary scratch space called '''workspaces''' for your calculations on a central file storage. They are meant to keep data for a limited time – but usually longer than the time of a single job run.


== No Backup ==
== Important ==


* '''No Backup:''' Data in workspaces is '''not backed up''' and will be '''automatically deleted''' after expiration
Workspaces are not meant for permanent storage, hence data in workspaces is not backed up and may be lost in case of problems on the storage system. Please copy/move important results to $HOME or some disks outside the cluster.
* '''Time-limited:''' Every workspace has a limited lifetime (typically 30-100 days depending on cluster)
* '''Automatic Email Reminders:''' You will receive email notifications before expiration
* '''Backup Important Data:''' Copy important results to appropriate permanent storage before expiration (location depends on your cluster/site policies)


== Quick Start - Most Common Commands ==
== Create workspace ==
To create a workspace you need to state ''name'' of your workspace and ''lifetime'' in days. A maximum value for ''lifetime'' and a maximum number of renewals is defined on each cluster. Execution of:


{| class="wikitable"
$ ws_allocate mySpace 30
|-
!style="width:40%" | Task
!style="width:60%" | Command
|-
|Create workspace for 30 days
|<tt>ws_allocate myWs 30</tt>
|-
|Create group-writable workspace
|<tt>ws_allocate -G groupname myWs 30</tt>
|-
|List all your workspaces
|<tt>ws_list</tt>
|-
|Find workspace path (for scripts)
|<tt>ws_find myWs</tt>
|-
|Check which expire soon
|<tt>ws_list -R</tt>
|-
|Extend workspace by 30 days
|<tt>ws_extend myWs 30</tt>
|-
|Delete/release workspace
|<tt>ws_release myWs</tt>
|-
|Restore released workspace
|<tt>ws_restore -l</tt> then <tt>ws_restore oldname newname</tt>
|}

== Create Workspace ==

To create a workspace you need to specify a '''name''' and '''lifetime''' in days:

$ ws_allocate myWs 30


e.g. returns:
This returns:
Workspace created. Duration is 720 hours.
Workspace created. Duration is 720 hours.
Further extensions available: 3
Further extensions available: 3
/work/workspace/scratch/username-mySpace-0
/work/workspace/scratch/username-myWs-0


'''Important:''' Creating a workspace a second time with the same command is safe - it always returns the same path. This makes it perfect for batch job scripts.
For more information read the program's help, i.e. ''$ ws_allocate -h''.


'''Capture the path in a variable:'''
== List all your workspaces ==


$ WORKSPACE=$(ws_allocate myWs 30)
To list all your workspaces, execute:
$ cd $WORKSPACE

'''For all options and advanced usage,''' see the [[Workspaces/Advanced_Features#Detailed_ws_allocate_Options|Advanced Features guide]].

== List Your Workspaces ==

To see all your workspaces:


$ ws_list
$ ws_list


Shows:
which will return:
* Workspace ID
* Workspace ID
* Workspace location
* Workspace location
* available extensions
* Available extensions
* creation date and remaining time
* Creation date and remaining time


'''Useful options:'''
To list expired workspaces, see [[Workspace#Restore_an_Expired_Workspace|Restore an Expired Workspace]].
* <tt>ws_list -R</tt> - Sort by remaining time (see what expires soon)
* <tt>ws_list -s</tt> - Short format (only names, good for scripts)


== Find workspace location ==
== Find Workspace Path ==


Get the path to a workspace for use in scripts:
Workspace location/path can be prompted for any workspace ''ID'' using '''ws_find''', in case of workspace ''mySpace'':


$ ws_find mySpace
$ ws_find myWs


Returns:
returns the one-liner:


/work/workspace/scratch/username-mySpace-0
/work/workspace/scratch/username-myWs-0


'''In scripts:'''


$ cd $(ws_find myWs)
== Extend lifetime of your workspace ==
$ WORKSPACE=$(ws_find myWs)


== Extend Workspace Lifetime ==
Any workspace's lifetime can be only extended a cluster-specific number of times. There several commands to extend workspace lifetime
#<pre>$ ws_extend mySpace 40</pre> which extends workspace ID ''mySpace'' by ''40'' days from now,
#<pre>$ ws_extend mySpace</pre> which extends workspace ID ''mySpace'' by the number days used previously
#<pre>$ ws_allocate -x mySpace 40</pre> which extends workspace ID ''mySpace'' by ''40'' days from now.
<br>


Extend a workspace before it expires:
== Setting Permissions for Sharing Files ==
The examples will assume you want to change the directory in $DIR. If you want to share a workspace, DIR could be set with <code>DIR=$(ws_find my_workspace)</code>


$ ws_extend myWs 30 # Extend by 30 days from now
=== Workspace Tools ===


Or use:
* ws_share
<code syntax=bash>ws_share share workspacename username</code>


$ ws_allocate -x myWs 30 # Alternative command
allows you to grant the user username read access to the workspace.


'''Note:''' Each extension consumes one of your available extensions (typically 3-100 depending on cluster).
Newer versions of the workspace tools have sharing options to ws_allocate:


== Release (Delete) Workspace ==
* -G option of ws_allocate
<code syntax=bash>ws_allocate -G groupname workspacename duration</code>


{| class="wikitable"
** groupname: name of the group you want to share with
|-
** workspacename: what you want to call your workspace
!style="width:40%" | Works on cluster
** duration: how long the workspace is supposed to last in days
!style="width:10%" | bwUC 3.0
!style="width:10%" | BinAC2
!style="width:10%" | Helix
!style="width:10%" | JUSTUS 2
!style="width:10%" | NEMO2
|-
|<tt>ws_release --delete-data</tt> (immediate deletion)
|style="background-color:#90EE90; text-align:center;" | ✓
| style="text-align:center;" |
| style="text-align:center;" |
| style="text-align:center;" |
|style="background-color:#90EE90; text-align:center;" | ✓
|}


When you no longer need a workspace:
Essentially this tool sets regular unix rwx permissions for the group plus the "suid" bit on the directory to make the permission inheritable.


$ ws_release myWs
=== Regular Unix Permissions ===


'''What happens:'''
Making workspaces world readable/writable using standard unix access rights with <tt>chmod</tt> is only feasible if you are in a research group and you and your co-workers share a common ("bwXXXXX") unix group.
* Workspace becomes inaccessible
* Data is kept for a grace period (can be restored, see below)
* Real deletion happens later (typically during nighttime)


'''To free quota immediately:'''
Do '''not''' make files readable or even writable to everyone or to large common groups ("all students").

$ ws_release --delete-data myWs # Immediate deletion (WARNING: cannot be recovered!)

Or with older workspace tools:

$ WSDIR=$(ws_find myWs) && [ -n "$WSDIR" ] && rm -rf "$WSDIR" # Delete data first (with safety check)
$ ws_release myWs # Then release

== Restore Workspace ==


{| class="wikitable"
{| class="wikitable"
|-
|-
!style="width:45%" | Command
!style="width:40%" | Works on cluster
!style="width:55%" | Action
!style="width:10%" | bwUC 3.0
!style="width:10%" | BinAC2
!style="width:10%" | Helix
!style="width:10%" | JUSTUS 2
!style="width:10%" | NEMO2
|-
|-
|<tt>chgrp -R bw16e001 "$DIR"</tt>
|<tt>ws_restore</tt>
|style="background-color:#90EE90; text-align:center;" | ✓
<tt>chmod -R g+rX "$DIR"</tt>
|style="background-color:#90EE90; text-align:center;" | ✓
|Set group ownership and grant read access to group for files in workspace via unix rights to the group "bw16e001" (has to be re-done if files are added)
|style="background-color:#90EE90; text-align:center;" | ✓
|-
|style="background-color:#90EE90; text-align:center;" | ✓
|<tt>chgrp -R bw16e001 "$DIR"</tt>
|style="background-color:#90EE90; text-align:center;" | ✓
<tt>chmod -R g+rswX "$DIR"</tt>
|Set group ownership and grant read/write access to group for files in workspace via unix rights (has to be re-done if files are added). Group will be inherited by new files, but rights for the group will have to be re-set with chmod for every new file
|-
|}
|}


If you released a workspace by accident or need to recover an expired one, you can restore it within a grace period:
Options used:
* -R: recursive
* g+rwx
** g: group
** + add permissions (- to remove)
** rwx: read, write, execute


'''(1) List restorable workspaces:'''
=== "ACL"s: Access Crontrol Lists ===
ACLs allow a much more detailed distribution of permissions but are a bit more complicated and not visible in detail via "ls". They have the additional advantage that you can set a "default" ACL for a directory, (with a <tt>-d</tt> flag or a <tt>d:</tt> prefix) which will cause all newly created files to inherit the ACLs from the directory. Regular unix permissions only have limited support (only group ownership, not access rights) for this via the suid bit.


$ ws_restore -l
Best practices with respect to ACL usage:
# Take into account that ACL take precedence over standard unix access rights
# The owner of a workspace is responsible for its content and management


'''(2) Create a new target workspace:'''
Please note that <tt>ls</tt> (List directory contents) shows ACLs on directories and files only when run as <tt>ls -l</tt> as in long format, as "plus" sign after the standard unix access rights.

$ ws_allocate restored 60

'''(3) Restore the expired workspace:'''

$ ws_restore username-myWs-0 restored

'''Note:''' Use the '''full name''' from <tt>ws_restore -l</tt> (including username and timestamp), not the short name from <tt>ws_list</tt>.

'''For detailed restore options,''' see the [[Workspaces/Advanced_Features#Restore_an_Expired_Workspace|Advanced Features guide]].

== Share Workspace ==


Examples with regard to "my_workspace":
{| class="wikitable"
{| class="wikitable"
|-
|-
!style="width:45%" | Command
!style="width:40%" | Works on cluster
!style="width:55%" | Action
!style="width:10%" | bwUC 3.0
!style="width:10%" | BinAC2
!style="width:10%" | Helix
!style="width:10%" | JUSTUS 2
!style="width:10%" | NEMO2
|-
|-
|<tt>getfacl "$DIR"</tt>
|<tt>-g</tt> option (group-readable)
| style="text-align:center;" |
|List access rights on $DIR
| style="text-align:center;" |
| style="text-align:center;" |
| style="text-align:center;" |
|style="background-color:#90EE90; text-align:center;" | ✓
|-
|-
|<tt>-G</tt> option (group-writable)
|<tt>setfacl -Rm user:fr_xy1:rX,default:user:fr_xy1:rX "$DIR"</tt>
| style="text-align:center;" |
|Grant user "fr_xy1" read-only access to $DIR
| style="text-align:center;" |
|-
| style="text-align:center;" |
|<tt>setfacl -R -m user:fr_me0000:rwX,default:user:fr_me0000:rwX "$DIR"</tt>
| style="text-align:center;" |
<tt>setfacl -R -m user:fr_xy1:rwX,default:user:fr_xy1:rwX "$DIR"</tt>
|style="background-color:#90EE90; text-align:center;" | ✓
|Grant your own user "fr_me0000" and "fr_xy1" inheritable ("default") read and write access to $DIR, so you can also read/write files put into the workspace by a coworker
|-
|<tt>setfacl -Rm group:bw16e001:rX,default:group:bw16e001:rX "$DIR"</tt>
|Grant group (Rechenvorhaben) "bw16e001" read-only access to $DIR
|-
|<tt>setfacl -Rb "$DIR"</tt>
|Remove all ACL rights. Standard Unix access rights apply again.
|}
|}


You can share workspaces with team members:
Options used:

* -R: recursive
'''Important:''' Not all sharing options are available on all clusters. ACL-based methods like <tt>ws_share</tt> require filesystem support and may not work everywhere. If one method doesn't work, try an alternative approach.
* -m: modify

* user:username:rwX user: next name is a user; rwX read, write, eXecute (only where execute is set for user)
'''Group-readable workspace''' (read-only for group):
* default:[user|group] set the default for user or group for new files or dierctories

$ ws_allocate -g myWs 30

'''Group-writable workspace''' (read-write for group, recommended):

$ ws_allocate -G projectgroup myWs 30

'''Recommended approach:'''
* Use <tt>-g</tt> or <tt>-G</tt> flags during workspace creation
* For read-only sharing: use <tt>-g</tt>
* For collaborative work (read-write): use <tt>-G groupname</tt>
* Set <tt>groupname</tt> in <tt>~/.ws_user.conf</tt> if you always work with the same group

'''For advanced sharing options''' (ACL-based, read-only, less common), see the [[Workspaces/Advanced_Features#Cooperative_Usage_.28Group_Workspaces_and_Sharing.29|Advanced Features guide]].

== Command Overview ==

The workspace tools consist of several commands:

* <tt>ws_allocate</tt> - Create or extend a workspace
* <tt>ws_list</tt> - List all your workspaces
* <tt>ws_find</tt> - Find the path to a workspace
* <tt>ws_extend</tt> - Extend the lifetime of a workspace
* <tt>ws_release</tt> - Release (delete) a workspace
* <tt>ws_restore</tt> - Restore an expired or released workspace
* <tt>ws_register</tt> - Create symbolic links to workspaces

All commands support <tt>-h</tt> or <tt>--help</tt> to show detailed usage information.

== Using Workspaces in Batch Jobs ==

'''Recommended approach:''' Create your workspace manually before submitting jobs, then reference it in your job scripts using <tt>ws_find</tt>.

'''(1) Create workspace once (on login node):'''

$ ws_allocate myProject 60

'''(2) Use in job scripts with ws_find:'''

<pre>
#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --time=24:00:00

# Find existing workspace
WORKSPACE=$(ws_find myProject)

# Change to workspace
cd $WORKSPACE

# Your computation here
./my_program --input input.dat --output results.dat
</pre>

'''Warning:''' Avoid using <tt>ws_allocate</tt> directly in job scripts that run frequently. While <tt>ws_allocate</tt> is safe to call multiple times on the same workspace name (it returns the existing workspace), you should not create too many workspaces unnecessarily. Create workspaces manually when needed, then use <tt>ws_find</tt> in your job scripts to locate them.

== Advanced Features ==

For detailed information about advanced workspace features, configuration options, and less frequently used commands, see the separate [[Workspaces/Advanced_Features]] guide.

Topics covered in the advanced guide include:
* Complete command reference with all options
* Multiple filesystem locations
* Detailed options for ws_allocate, ws_list, ws_find, ws_extend
* Email and calendar reminders configuration
* Group workspaces and cooperative usage
* Advanced sharing with ws_share (ACL-based, read-only)
* Setting permissions (ACLs and Unix permissions)
* Deleting and restoring workspaces in detail
* Cluster-specific limits and quotas
* Checking workspace quotas
* Registering workspace links

== Best Practices and Recommendations ==

=== For All Users ===

# '''Set up ~/.ws_user.conf''' - Configure default reminder timing, duration, and groupname to avoid typing them repeatedly (see [[Workspaces/Advanced_Features#Example_.7E.2F.ws_user.conf_Configuration|example configuration]])
# '''Email reminders are automatic''' - Notifications are sent automatically using your identity provider email; only use <tt>-r</tt> to customize reminder timing if needed
# '''Custom email only if needed''' - Only use <tt>-m</tt> option to override the email address from your identity provider
# '''Use ws_register''' - Create symbolic links to your workspaces in a convenient directory: <tt>ws_register ~/workspaces</tt>
# '''Create workspaces manually''' - Create workspaces on the login node before submitting jobs, then use <tt>ws_find</tt> in your job scripts
# '''Track your workspaces''' - Regularly run <tt>ws_list -R</tt> to see which workspaces will expire soon
# '''Backup important data''' - Workspaces are temporary and not backed up - copy results to appropriate permanent storage (check your cluster/site policies for backup locations)
# '''Clean up regularly''' - Release workspaces you no longer need to keep filesystems organized

=== For Short-term Jobs (hours to days) ===


# Use default or short durations (1-7 days)
== Delete a Workspace ==
# Consider using a single workspace for a series of related jobs
# Use <tt>ws_find</tt> in job scripts to locate the workspace


=== For Long-term Campaigns (weeks to months) ===
$ ws_release mySpace # Manually erase your workspace mySpace


# Request maximum allowed duration
Note: workspaces are kept for some time after release. To immediately delete and free space e.g. for quota reasons, delete the files with rm before release.
# Email reminders are sent automatically; optionally customize reminder timing with <tt>-r</tt> option
# Use <tt>ws_list -R</tt> regularly to monitor remaining time
# Plan data archival to appropriate permanent storage before expiration (check cluster/site policies)


=== For Collaborative Work ===
Newer versions of workspace tools have a --delete-data flag that immediately deletes data. Note that deleted data from workspaces is permanently lost.


# Use <tt>ws_allocate -G groupname</tt> for shared write access (recommended)
== Restore an Expired Workspace ==
# Set <tt>groupname</tt> in <tt>~/.ws_user.conf</tt> if you always work with the same group
# Use <tt>ws_allocate -g</tt> for read-only sharing within group
# Document the workspace location for your team members
# For advanced sharing scenarios, see the [[Workspaces/Advanced_Features#Cooperative_Usage_.28Group_Workspaces_and_Sharing.29|Advanced Features guide]]


=== For Managing Multiple Filesystems ===
For a certain (system-specific) grace time following workspace expiration, a workspace can be restored by performing the following steps:


# '''Note:''' Most clusters have only one default filesystem - the <tt>-F</tt> option is rarely needed
(1) Display restorable workspaces.
# Use <tt>ws_list -l</tt> first to check if multiple filesystems are available on your cluster
ws_restore -l
# Use <tt>-F</tt> option only if you need specific filesystem for performance or capacity needs (see [[Workspaces/Advanced_Features#Multiple_Filesystem_Locations|filesystem options]])
# '''bwUniCluster 3.0 filesystems:'''
#* '''Default Lustre filesystem:''' Standard workspace location, best for large files and sequential I/O
#* '''Flash filesystem (ffuc):''' SSD-based storage for KIT/HoreKa users, shared between bwUniCluster 3.0 and HoreKa
#* Use flash filesystem for workloads with many small files, random I/O, AI/ML training, or compilation
#* Balance load: use <tt>-F ffuc</tt> when appropriate to reduce load on default filesystem
# '''General guidelines:'''
#* Flash-based filesystems (SSD/NVMe): Use for many small files, low-latency requirements, random I/O
#* Standard Lustre/parallel filesystems: Best for large files and sequential I/O patterns


=== For Different Data Types ===
(2) Create a new workspace as the target for the restore:
ws_allocate restored 60


# '''Large sequential I/O:''' Use standard workspace filesystem (Lustre best for very large files, Weka excellent for both large and small)
(3) Restore:
# '''Many small files or random access:''' Use flash-based workspace filesystem like Weka (NEMO2) or bwUniCluster ffuc, or stage to <tt>$TMPDIR</tt>
ws_restore <full_name_of_expired_workspace> restored
# '''Data read multiple times on single node:''' Copy to <tt>$TMPDIR</tt> at job start for best performance
# '''Temporary data for single node:''' Always use <tt>$TMPDIR</tt>, not workspaces
# '''Multi-node temporary data:''' Use workspaces (not suitable for <tt>$TMPDIR</tt>)
# '''AI/ML training data:''' Use Weka (NEMO2) or flash filesystems for best performance, or stage to <tt>$TMPDIR</tt> for repeated access
# '''Compilation/build directories:''' Use flash-based filesystems (Weka, ffuc) or <tt>$TMPDIR</tt> for better performance


=== For Quota Management ===
The expired workspace has to be specified using the '''full name''', including username prefix and timestamp suffix (otherwise, it cannot be uniquely identified).
The target workspace, on the other hand, must be given with just its short name as listed by <code>ws_list</code>, without the username prefix.


# Delete data before releasing if you need immediate quota relief: <tt>WSDIR=$(ws_find workspace) && [ -n "$WSDIR" ] && rm -rf "$WSDIR"</tt> then <tt>ws_release workspace</tt>
If the workspace is no visible/restorable, it has been '''permanently deleted''' and cannot be restored, not even by us. Please always remember, that workspaces are intended solely for temporary work data, and there is no backup of data in the workspaces.
# Use <tt>ws_release --delete-data</tt> (newer versions) for immediate deletion
# Remember: released workspaces may still count toward quota during grace period

Revision as of 12:01, 18 November 2025

Workspace tools provide temporary scratch space called workspaces for your calculations on a central file storage. They are meant to keep data for a limited time – but usually longer than the time of a single job run.

Important

  • No Backup: Data in workspaces is not backed up and will be automatically deleted after expiration
  • Time-limited: Every workspace has a limited lifetime (typically 30-100 days depending on cluster)
  • Automatic Email Reminders: You will receive email notifications before expiration
  • Back Up Important Data: Copy important results to appropriate permanent storage before expiration (location depends on your cluster/site policies)

Quick Start - Most Common Commands

  Task                                 Command
  Create workspace for 30 days         ws_allocate myWs 30
  Create group-writable workspace      ws_allocate -G groupname myWs 30
  List all your workspaces             ws_list
  Find workspace path (for scripts)    ws_find myWs
  Check which expire soon              ws_list -R
  Extend workspace by 30 days          ws_extend myWs 30
  Delete/release workspace             ws_release myWs
  Restore released workspace           ws_restore -l, then ws_restore oldname newname

Create Workspace

To create a workspace you need to specify a name and lifetime in days:

  $ ws_allocate myWs 30

This returns:

  Workspace created. Duration is 720 hours. 
  Further extensions available: 3
  /work/workspace/scratch/username-myWs-0

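Important: Running the same ws_allocate command a second time is safe: it does not create another workspace but returns the path of the existing one. The command can therefore be repeated in scripts, although ws_find is the preferred way to look up an existing workspace in job scripts (see Using Workspaces in Batch Jobs).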

Capture the path in a variable:

  $ WORKSPACE=$(ws_allocate myWs 30)
  $ cd "$WORKSPACE"

For all options and advanced usage, see the Advanced Features guide.

List Your Workspaces

To see all your workspaces:

  $ ws_list

Shows:

  • Workspace ID
  • Workspace location
  • Available extensions
  • Creation date and remaining time

Useful options:

  • ws_list -R - Sort by remaining time (see what expires soon)
  • ws_list -s - Short format (only names, good for scripts)
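
The short format combines naturally with ws_find. The following is a minimal sketch, assuming that ws_list -s prints one short workspace name per line; the function name list_workspace_paths is hypothetical and not part of the workspace tools.

```shell
# Sketch: print "name -> path" for every workspace you own.
# Assumption: `ws_list -s` prints one short workspace name per line.
list_workspace_paths() {
    ws_list -s | while IFS= read -r ws_name; do
        # Skip empty lines, if any
        [ -n "$ws_name" ] || continue
        printf '%s -> %s\n' "$ws_name" "$(ws_find "$ws_name")"
    done
}
```

Calling list_workspace_paths on a login node would then print one line per workspace, which can be handy when several workspaces exist on the same cluster.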

Find Workspace Path

Get the path to a workspace for use in scripts:

  $ ws_find myWs

Returns:

  /work/workspace/scratch/username-myWs-0

In scripts:

  $ cd "$(ws_find myWs)"
  $ WORKSPACE=$(ws_find myWs)
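
In scripts it can be worth checking the result before using it, because an expired or mistyped workspace name would otherwise leave the variable empty. The following is a minimal sketch, assuming that ws_find prints the path on stdout and nothing usable on failure; the helper name workspace_dir is hypothetical.

```shell
# Sketch: resolve a workspace path and refuse to continue if it is unusable.
# Assumption: ws_find prints the path on stdout and nothing usable on failure.
workspace_dir() {
    ws_dir=$(ws_find "$1" 2>/dev/null)
    if [ -z "$ws_dir" ] || [ ! -d "$ws_dir" ]; then
        echo "workspace '$1' not found or not a directory" >&2
        return 1
    fi
    printf '%s\n' "$ws_dir"
}
```

A script could then use WORKSPACE=$(workspace_dir myWs) || exit 1 followed by cd "$WORKSPACE", so that nothing runs in the wrong directory.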

Extend Workspace Lifetime

Extend a workspace before it expires:

  $ ws_extend myWs 30              # Extend by 30 days from now

Or use:

  $ ws_allocate -x myWs 30         # Alternative command

Note: Each extension consumes one of your available extensions (typically 3-100 depending on cluster).
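
Because extensions are a limited resource, it can help to read the remaining count from the tool output before extending again. The following is a minimal sketch, assuming the message has the same format as the sample output in Create Workspace ("Further extensions available: 3"); the function name extensions_left is hypothetical, and on some installations this message may be written to stderr rather than stdout.

```shell
# Sketch: read tool output on stdin and print the number of remaining extensions.
# Assumption: the message format matches the sample output shown on this page.
extensions_left() {
    sed -n 's/^[[:space:]]*Further extensions available:[[:space:]]*\([0-9][0-9]*\).*$/\1/p'
}
```

The function prints nothing if the line is absent, so a caller can treat an empty result as "unknown" rather than as zero.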

Release (Delete) Workspace

  Works on cluster                                bwUC 3.0  BinAC2  Helix  JUSTUS 2  NEMO2
  ws_release --delete-data (immediate deletion)   ✓         -       -      -         ✓

When you no longer need a workspace:

  $ ws_release myWs

What happens:

  • Workspace becomes inaccessible
  • Data is kept for a grace period (can be restored, see below)
  • Real deletion happens later (typically during nighttime)

To free quota immediately:

  $ ws_release --delete-data myWs  # Immediate deletion (WARNING: cannot be recovered!)

Or with older workspace tools:

  $ WSDIR=$(ws_find myWs) && [ -n "$WSDIR" ] && rm -rf "$WSDIR"    # Delete data first (with safety check)
  $ ws_release myWs                                                # Then release

Restore Workspace

  Works on cluster   bwUC 3.0  BinAC2  Helix  JUSTUS 2  NEMO2
  ws_restore         ✓         ✓       ✓      ✓         ✓

If you released a workspace by accident or need to recover an expired one, you can restore it within a grace period:

(1) List restorable workspaces:

  $ ws_restore -l

(2) Create a new target workspace:

  $ ws_allocate restored 60

(3) Restore the expired workspace:

  $ ws_restore username-myWs-0 restored

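Note: Specify the expired workspace by its full name exactly as printed by ws_restore -l (including username prefix and timestamp suffix), not the short name from ws_list; username-myWs-0 above is only a placeholder. The target workspace, in contrast, is given by its short name.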

For detailed restore options, see the Advanced Features guide.

Share Workspace

  Works on cluster              bwUC 3.0  BinAC2  Helix  JUSTUS 2  NEMO2
  -g option (group-readable)    -         -       -      -         ✓
  -G option (group-writable)    -         -       -      -         ✓

You can share workspaces with team members:

Important: Not all sharing options are available on all clusters. ACL-based methods like ws_share require filesystem support and may not work everywhere. If one method doesn't work, try an alternative approach.

Group-readable workspace (read-only for group):

  $ ws_allocate -g myWs 30

Group-writable workspace (read-write for group, recommended):

  $ ws_allocate -G projectgroup myWs 30

Recommended approach:

  • Use -g or -G flags during workspace creation
  • For read-only sharing: use -g
  • For collaborative work (read-write): use -G groupname
  • Set groupname in ~/.ws_user.conf if you always work with the same group
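
Before allocating a group workspace, a quick membership check can catch a mistyped group name. The following is a minimal sketch using the standard id utility; the function name in_group is hypothetical, and the check assumes group names without spaces.

```shell
# Sketch: succeed only if the current user is a member of the given Unix group.
# Assumption: group names contain no spaces.
in_group() {
    id -Gn | tr ' ' '\n' | grep -Fqx -- "$1"
}
```

For example, in_group projectgroup && ws_allocate -G projectgroup myWs 30 would only attempt the allocation if the membership check succeeds.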

For advanced sharing options (ACL-based, read-only, less common), see the Advanced Features guide.

Command Overview

The workspace tools consist of several commands:

  • ws_allocate - Create or extend a workspace
  • ws_list - List all your workspaces
  • ws_find - Find the path to a workspace
  • ws_extend - Extend the lifetime of a workspace
  • ws_release - Release (delete) a workspace
  • ws_restore - Restore an expired or released workspace
  • ws_register - Create symbolic links to workspaces

All commands support -h or --help to show detailed usage information.

Using Workspaces in Batch Jobs

Recommended approach: Create your workspace manually before submitting jobs, then reference it in your job scripts using ws_find.

(1) Create workspace once (on login node):

  $ ws_allocate myProject 60

(2) Use in job scripts with ws_find:

#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --time=24:00:00

# Find existing workspace
WORKSPACE=$(ws_find myProject)

# Change to workspace; stop the job if it is missing
cd "$WORKSPACE" || exit 1

# Your computation here
./my_program --input input.dat --output results.dat

Warning: Avoid calling ws_allocate directly in job scripts that run frequently. Calling it repeatedly with the same workspace name is safe (it returns the existing workspace), but job scripts should not create workspaces unnecessarily. Create workspaces manually when needed, then use ws_find in your job scripts to locate them.

Advanced Features

For detailed information about advanced workspace features, configuration options, and less frequently used commands, see the separate Workspaces/Advanced_Features guide.

Topics covered in the advanced guide include:

  • Complete command reference with all options
  • Multiple filesystem locations
  • Detailed options for ws_allocate, ws_list, ws_find, ws_extend
  • Email and calendar reminders configuration
  • Group workspaces and cooperative usage
  • Advanced sharing with ws_share (ACL-based, read-only)
  • Setting permissions (ACLs and Unix permissions)
  • Deleting and restoring workspaces in detail
  • Cluster-specific limits and quotas
  • Checking workspace quotas
  • Registering workspace links

Best Practices and Recommendations

For All Users

  1. Set up ~/.ws_user.conf - Configure default reminder timing, duration, and groupname to avoid typing them repeatedly (see example configuration)
  2. Email reminders are automatic - Notifications are sent automatically using your identity provider email; only use -r to customize reminder timing if needed
  3. Custom email only if needed - Only use -m option to override the email address from your identity provider
  4. Use ws_register - Create symbolic links to your workspaces in a convenient directory: ws_register ~/workspaces
  5. Create workspaces manually - Create workspaces on the login node before submitting jobs, then use ws_find in your job scripts
  6. Track your workspaces - Regularly run ws_list -R to see which workspaces will expire soon
  7. Back up important data - Workspaces are temporary and not backed up, so copy results to appropriate permanent storage (check your cluster/site policies for backup locations)
  8. Clean up regularly - Release workspaces you no longer need to keep filesystems organized

For Short-term Jobs (hours to days)

  1. Use default or short durations (1-7 days)
  2. Consider using a single workspace for a series of related jobs
  3. Use ws_find in job scripts to locate the workspace

For Long-term Campaigns (weeks to months)

  1. Request maximum allowed duration
  2. Email reminders are sent automatically; optionally customize reminder timing with -r option
  3. Use ws_list -R regularly to monitor remaining time
  4. Plan data archival to appropriate permanent storage before expiration (check cluster/site policies)

For Collaborative Work

  1. Use ws_allocate -G groupname for shared write access (recommended)
  2. Set groupname in ~/.ws_user.conf if you always work with the same group
  3. Use ws_allocate -g for read-only sharing within group
  4. Document the workspace location for your team members
  5. For advanced sharing scenarios, see the Advanced Features guide

For Managing Multiple Filesystems

  1. Note: Most clusters have only one default filesystem - the -F option is rarely needed
  2. Use ws_list -l first to check if multiple filesystems are available on your cluster
  3. Use the -F option only if you need a specific filesystem for performance or capacity reasons (see filesystem options)
  4. bwUniCluster 3.0 filesystems:
    • Default Lustre filesystem: Standard workspace location, best for large files and sequential I/O
    • Flash filesystem (ffuc): SSD-based storage for KIT/HoreKa users, shared between bwUniCluster 3.0 and HoreKa
    • Use flash filesystem for workloads with many small files, random I/O, AI/ML training, or compilation
    • Balance load: use -F ffuc when appropriate to reduce load on default filesystem
  5. General guidelines:
    • Flash-based filesystems (SSD/NVMe): Use for many small files, low-latency requirements, random I/O
    • Standard Lustre/parallel filesystems: Best for large files and sequential I/O patterns

For Different Data Types

  1. Large sequential I/O: Use standard workspace filesystem (Lustre best for very large files, Weka excellent for both large and small)
  2. Many small files or random access: Use flash-based workspace filesystem like Weka (NEMO2) or bwUniCluster ffuc, or stage to $TMPDIR
  3. Data read multiple times on single node: Copy to $TMPDIR at job start for best performance
  4. Temporary data for single node: Always use $TMPDIR, not workspaces
  5. Multi-node temporary data: Use workspaces (not suitable for $TMPDIR)
  6. AI/ML training data: Use Weka (NEMO2) or flash filesystems for best performance, or stage to $TMPDIR for repeated access
  7. Compilation/build directories: Use flash-based filesystems (Weka, ffuc) or $TMPDIR for better performance
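
The staging pattern mentioned above (copy to $TMPDIR at job start, copy results back at the end) can be sketched as follows. This is only an illustration under stated assumptions: the subdirectory names input and results are hypothetical, and the local directory is expected to be node-local scratch such as $TMPDIR.

```shell
# Sketch: stage data between a workspace and node-local storage.
# Assumptions: "input" and "results" are hypothetical subdirectory names;
# the local directory is node-local scratch such as $TMPDIR.

# stage_in <workspace_dir> <local_dir>: copy the input directory to local storage
stage_in() {
    mkdir -p "$2" && cp -r "$1/input" "$2/"
}

# stage_out <local_dir> <workspace_dir>: copy results back into the workspace
stage_out() {
    mkdir -p "$2/results" && cp -r "$1/results/." "$2/results/"
}
```

In a job script this might be used as stage_in "$WORKSPACE" "$TMPDIR" before the computation and stage_out "$TMPDIR" "$WORKSPACE" afterwards, so that repeated reads hit local storage while the results still end up in the workspace.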

For Quota Management

  1. Delete data before releasing if you need immediate quota relief: WSDIR=$(ws_find workspace) && [ -n "$WSDIR" ] && rm -rf "$WSDIR" then ws_release workspace
  2. Use ws_release --delete-data (newer versions) for immediate deletion
  3. Remember: released workspaces may still count toward quota during grace period