Workspaces: Difference between revisions

From bwHPC Wiki
Latest revision as of 15:49, 21 November 2025

New Workspace Page

WARNING: This is a new Workspaces page. The old, safe-to-use page can be found here: Workspace.

The workspace tools provide temporary scratch spaces, called workspaces, for your calculations on central file storage. They are meant to keep data for a limited time, but usually longer than the duration of a single job run.

Important

  • No Backup: Data in workspaces is not backed up and will be automatically deleted after expiration
  • Time-limited: Every workspace has a limited lifetime (typically 30-100 days depending on cluster, see the Cluster-Specific Workspace Limits)
  • Automatic Email Reminders: You will receive email notifications before expiration
  • Backup Important Data: Copy important results to appropriate permanent storage before expiration (location depends on your cluster/site policies)

Quick Start - Most Common Commands

Task                              | Command
Create workspace for 30 days      | ws_allocate myWs 30
Create group-writable workspace   | ws_allocate -G groupname myWs 30
List all your workspaces          | ws_list
Find workspace path (for scripts) | ws_find myWs
Check which expire soon           | ws_list -R
Extend workspace by 30 days       | ws_extend myWs 30
Delete/release workspace          | ws_release myWs
Restore released workspace        | ws_restore -l then ws_restore oldname newname

Create Workspace

To create a workspace you need to specify a name and lifetime in days:

Basic usage:

  $ ws_allocate myWs 30                    # Create workspace for 30 days

Common variations:

  $ ws_allocate -G groupname myWs 30           # Create group-writable workspace (recommended for teams)
  $ ws_allocate -g myWs 30                     # Create group-readable workspace
  $ ws_allocate -r 7 myWs 30                   # Set reminder 7 days before expiration
  $ ws_allocate -m email@example.com myWs 30   # Use custom email for reminders
  $ ws_allocate -F ffuc myWs 30                # bwUniCluster 3.0: Use flash filesystem for better performance

This returns:

  Workspace created. Duration is 720 hours. 
  Further extensions available: 3
  /work/workspace/scratch/username-myWs-0
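
The reported duration is the requested lifetime expressed in hours: 30 days × 24 = 720 hours. A minimal sketch of that conversion follows; days_to_hours is a hypothetical helper for illustration and is not part of the workspace tools.

```shell
# days_to_hours DAYS: convert a lifetime in days to hours
# (hypothetical helper, not part of the workspace tools)
days_to_hours() {
    echo $(( $1 * 24 ))
}

days_to_hours 30   # prints 720
```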

Important: Creating a workspace a second time with the same command is safe - it always returns the same path.

Capture the path in a variable:

  $ WORKSPACE=$(ws_allocate myWs 30)
  $ cd "$WORKSPACE"

For all options and advanced usage, see the Advanced Features guide.

List Your Workspaces

Basic usage:

  $ ws_list                                # List all your workspaces

Shows:

  • Workspace ID
  • Workspace location
  • Available extensions
  • Creation date and remaining time

Common variations:

  $ ws_list -R                             # Sort by remaining time (see what expires soon)
  $ ws_list -s                             # Short format (only names, good for scripts)
  $ ws_list -g                             # List group workspaces (if you're in the same group)

Note: To list expired workspaces that can be restored, see Restore Workspace.

For all options, see the Advanced Features guide.
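
Because the short format prints only names, it combines naturally with ws_find. The following is a sketch under the assumption that ws_list -s prints one workspace name per line; print_workspace_paths is a hypothetical helper name, not a workspace command.

```shell
# print_workspace_paths: print "name<TAB>path" for each of your workspaces.
# Assumption: ws_list -s prints one workspace name per line.
print_workspace_paths() {
    ws_list -s | while IFS= read -r name; do
        [ -n "$name" ] || continue          # skip empty lines
        printf '%s\t%s\n' "$name" "$(ws_find "$name")"
    done
}
```

Call print_workspace_paths on a login node to get a quick name-to-path overview.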

Find Workspace Path

Get the path to a workspace for use in scripts:

Basic usage:

  $ ws_find myWs                           # Get path to workspace

Returns:

  /work/workspace/scratch/username-myWs-0

In scripts:

  $ cd "$(ws_find myWs)"
  $ WORKSPACE=$(ws_find myWs)

For all options, see the Advanced Features guide.
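
In scripts it can help to stop early when the workspace does not exist (for example, because it has expired). A minimal sketch, assuming only that ws_find prints the path when the workspace exists; require_workspace is a hypothetical helper name.

```shell
# require_workspace NAME: print the workspace path, or fail with a message.
# Hypothetical helper; only ws_find itself comes from the workspace tools.
require_workspace() {
    local path
    path=$(ws_find "$1") || path=""
    if [ -z "$path" ] || [ ! -d "$path" ]; then
        echo "Workspace '$1' not found; create it with ws_allocate first." >&2
        return 1
    fi
    printf '%s\n' "$path"
}
```

A script could then use: WORKSPACE=$(require_workspace myWs) || exit 1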

Extend Workspace Lifetime

Extend a workspace before it expires:

Basic usage:

  $ ws_extend myWs 30                      # Extend by 30 days from now

Alternative commands:

  $ ws_allocate -x myWs 30                 # Same as ws_extend

Update reminder only (without extending):

  $ ws_allocate -r 7 -x myWs 0             # Update reminder time to 7 days

Note: Each extension consumes one of your available extensions (see the Cluster-Specific Workspace Limits).

For all options, see the Advanced Features guide.

Release (Delete) Workspace

When you no longer need a workspace:

Basic usage:

  $ ws_release myWs                        # Release workspace (recoverable during grace period)

What happens:

  • Workspace becomes inaccessible immediately
  • Data is kept for a short grace period (typically 1 hour) and can be restored with ws_restore
  • Final deletion happens automatically during the next cleanup run (usually at night)
  • Released data may still count toward quota until final deletion

Works on cluster: bwUC 3.0, BinAC2, Helix, JUSTUS 2, NEMO2
  • ws_release --delete-data (immediate deletion)

If you need to free quota immediately:

  $ ws_release --delete-data myWs          # WARNING: Permanently deletes data, cannot be recovered!

On other clusters (or if --delete-data is not available), use the manual deletion method.

IMPORTANT: Only use immediate deletion if you're absolutely certain you don't need the data and have verified any backups. Regular ws_release is safer and recommended for most cases.

For all options, see the Advanced Features guide.
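
Since immediate deletion cannot be undone, one possible safeguard is to require the workspace name to be retyped first. This is only a sketch: confirm_and_delete is a hypothetical wrapper, and the only workspace command it relies on is ws_release --delete-data as described above.

```shell
# confirm_and_delete NAME: ask the user to retype the workspace name before
# running ws_release --delete-data. Hypothetical wrapper for illustration.
confirm_and_delete() {
    local name answer
    name=$1
    answer=""
    printf 'Type "%s" to permanently delete this workspace: ' "$name" >&2
    read -r answer || true
    if [ "$answer" != "$name" ]; then
        echo "Aborted, nothing was deleted." >&2
        return 1
    fi
    ws_release --delete-data "$name"
}
```

Any input other than the exact workspace name aborts without calling ws_release.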

Restore Workspace

Works on cluster: bwUC 3.0, BinAC2, Helix, JUSTUS 2, NEMO2
  • ws_restore

If you released a workspace by accident or need to recover an expired one, you can restore it within a grace period:

Basic workflow:

  $ ws_restore -l                          # (1) List restorable workspaces
  $ ws_allocate restored 60                # (2) Create a new target workspace
  $ ws_restore username-myWs-0 restored    # (3) Restore the expired workspace

Note: Use the full name from ws_restore -l (including username and timestamp), not the short name from ws_list.

For detailed restore options, see the Advanced Features guide.

Work with Groups (Share Workspaces)

Works on cluster: bwUC 3.0, BinAC2, Helix, JUSTUS 2, NEMO2
  • -g option (group-readable)
  • -G option (group-writable)

Working with team members is simple using group workspaces:

Create Group Workspace

Group-readable workspace (team can read):

  $ ws_allocate -g myWs 30

Group-writable workspace (team can read and write, recommended):

  $ ws_allocate -G projectgroup myWs 30

Replace projectgroup with your actual group name (e.g., bw11a000).

Tip: Set your default group in ~/.ws_user.conf to avoid typing it every time:

groupname: projectgroup

Then you only need: ws_allocate myWs 30
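
The entry can be added with any text editor. As a sketch, the hypothetical helper below appends the groupname line shown above to the configuration file and leaves an existing entry untouched; the optional second argument (a file path) exists only so the helper can be tried on a scratch file first.

```shell
# set_default_group GROUP [FILE]: add a "groupname:" entry to the user config.
# FILE defaults to ~/.ws_user.conf; an existing groupname entry is not changed.
# Hypothetical helper, not part of the workspace tools.
set_default_group() {
    local group file
    group=$1
    file=${2:-$HOME/.ws_user.conf}
    if [ -f "$file" ] && grep -q '^groupname:' "$file"; then
        echo "groupname already set in $file" >&2
        return 1
    fi
    printf 'groupname: %s\n' "$group" >> "$file"
}
```

Example: set_default_group projectgroup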

List Group Workspaces

See all workspaces from your group:

  $ ws_list -g

This shows workspaces that were created with -g or -G by anyone in your group.

Extend Group Workspace

Anyone in the group can extend a group-writable workspace (-G):

  $ ws_extend myWs 30                      # If you created it
  $ ws_allocate -x -u username myWs 30     # If a colleague created it

Replace username with the workspace owner's username. This is useful when they're unavailable.

Manage Reminders for Group Workspace

You can update reminder settings and take over responsibility for reminders on a colleague's workspace:

  $ ws_allocate -r 7 -u username -x myWs 0     # Update reminder time and take over
                                               # another user's workspace reminders

This changes the reminder timing to 7 days before expiration and redirects reminder emails to you instead of the original creator. Useful when you're taking over responsibility for a shared workspace.

Why Use Group Workspaces?

  • Simple collaboration: Everyone can access the same data
  • No permission problems: Files automatically get group permissions
  • Independent extensions: Team members can extend without the original creator
  • Easy to find: Use ws_list -g to see all team workspaces

For advanced sharing options (sharing with specific users outside your group, ACL-based methods), see the Advanced Features guide.
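
To check whether a workspace directory is in fact group-writable, its permission bits can be inspected. The sketch below uses a hypothetical helper, is_group_writable, and looks only at the classic Unix mode string reported by ls; it does not consider ACLs.

```shell
# is_group_writable DIR: succeed if DIR has the group write permission bit set.
# Hypothetical helper; inspects Unix mode bits only, not ACLs.
is_group_writable() {
    # Character 6 of the ls mode string (e.g. "drwxrwx---") is the group write bit.
    [ "$(ls -ld -- "$1" 2>/dev/null | cut -c6)" = "w" ]
}
```

Example: is_group_writable "$(ws_find myWs)" && echo "group-writable"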

Command Overview

The workspace tools consist of several commands:

  • ws_allocate - Create or extend a workspace
  • ws_list - List all your workspaces
  • ws_find - Find the path to a workspace
  • ws_extend - Extend the lifetime of a workspace
  • ws_release - Release (delete) a workspace
  • ws_restore - Restore an expired or released workspace
  • ws_register - Create symbolic links to workspaces

All commands support -h or --help to show detailed usage information.

Using Workspaces in Batch Jobs

Recommended approach: Create your workspace manually before submitting jobs, then reference it in your job scripts using ws_find.

(1) Create workspace once (on login node):

  $ ws_allocate myProject 60

(2) Use in job scripts with ws_find:

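#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --time=24:00:00

# Find existing workspace
WORKSPACE=$(ws_find myProject)

# Change to workspace; abort if it does not exist or cannot be entered
if [ -z "$WORKSPACE" ] || ! cd "$WORKSPACE"; then
    echo "Workspace myProject not found" >&2
    exit 1
fi

# Your computation here
./my_program --input input.dat --output results.dat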

Warning: Avoid using ws_allocate directly in job scripts that run frequently. While ws_allocate is safe to call multiple times on the same workspace name (it returns the existing workspace), you should not create too many workspaces unnecessarily. Create workspaces manually when needed, then use ws_find in your job scripts to locate them.

Advanced Features

For detailed information about advanced workspace features, configuration options, and less frequently used commands, see the separate Workspaces/Advanced Features guide.

Topics covered in the advanced guide include:

  • Complete command reference with all options
  • Multiple filesystem locations
  • Detailed options for ws_allocate, ws_list, ws_find, ws_extend
  • Email and calendar reminders configuration
  • Group workspaces and cooperative usage
  • Advanced sharing with ws_share (ACL-based, read-only)
  • Setting permissions (ACLs and Unix permissions)
  • Deleting and restoring workspaces in detail
  • Cluster-specific limits and quotas
  • Checking workspace quotas
  • Registering workspace links

Best Practices and Recommendations

For All Users

  1. Set up ~/.ws_user.conf - Configure default reminder timing, duration, and groupname to avoid typing them repeatedly (see configuration guide)
  2. Email reminders are automatic - Notifications are sent automatically using your identity provider email; only use -r to customize reminder timing if needed
  3. Custom email only if needed - Only use the -m option to override the email address from your identity provider
  4. Use ws_register - Create symbolic links to your workspaces in a convenient directory (see ws_register guide)
  5. Create workspaces manually - Create workspaces on the login node before submitting jobs, then use ws_find in your job scripts (see Using Workspaces in Batch Jobs)
  6. Track your workspaces - Regularly run ws_list -R to see which workspaces will expire soon
  7. Backup important data - Workspaces are temporary and not backed up - copy results to appropriate permanent storage (check your cluster/site policies for backup locations)
  8. Clean up regularly - Release workspaces you no longer need to keep filesystems organized

For Short-term Jobs (hours to days)

  1. Use default or shorter durations
  2. Consider using a single workspace for a series of related jobs
  3. Use ws_find in job scripts to locate the workspace (see Using Workspaces in Batch Jobs)
  4. Copy results to permanent storage when jobs complete
  5. Release workspace when no longer needed (see Release Workspace)

For Long-term Campaigns (weeks to months)

  1. Request maximum allowed duration (see Cluster-Specific Workspace Limits)
  2. Email reminders are sent automatically; optionally customize reminder timing with the -r option
  3. Use ws_list -R regularly to monitor remaining time (see List Your Workspaces)
  4. Plan data archival to appropriate permanent storage before expiration (check cluster/site policies)

For Collaborative Work

  1. Use ws_allocate -G groupname for shared write access (see Create Group Workspace)
  2. Set groupname in ~/.ws_user.conf if you always work with the same group (see configuration guide)
  3. Use ws_allocate -g for read-only sharing within group
  4. Use ws_list -g to see all group workspaces (see List Group Workspaces)
  5. Team members can extend group workspaces (see Extend Group Workspace)
  6. Take over reminder responsibility when a colleague is unavailable (see Manage Reminders)
  7. Document the workspace location for your team members
  8. For advanced sharing scenarios (ACL-based, ws_share), see the Advanced Features guide

For Managing Multiple Filesystems

  1. Note: Most clusters have only one default filesystem - the -F option is rarely needed
  2. Use ws_list -l first to check if multiple filesystems are available on your cluster
  3. Use the -F option only if you need a specific filesystem for performance or capacity reasons (see filesystem options)
  4. bwUniCluster 3.0 filesystems:
    • Default Lustre filesystem: Standard workspace location, best for large files and sequential I/O
    • Flash filesystem (ffuc): SSD-based storage for KIT/HoreKa users, shared between bwUniCluster 3.0 and HoreKa
    • Use flash filesystem for workloads with many small files, random I/O, AI/ML training, or compilation
    • Balance load: use -F ffuc when appropriate to reduce load on default filesystem
  5. General guidelines:
    • Flash-based filesystems (SSD/NVMe): Use for many small files, low-latency requirements, random I/O
    • Standard Lustre/parallel filesystems: Best for large files and sequential I/O patterns

For Different Data Types

  1. Large sequential I/O: Use the standard workspace filesystem (Lustre is best for very large files; Weka is excellent for both large and small files)
  2. Many small files or random access: Use flash-based workspace filesystem like Weka (NEMO2) or bwUniCluster ffuc, or stage to $TMPDIR
  3. Data read multiple times on single node: Copy to $TMPDIR at job start for best performance
  4. Temporary data for single node: Always use $TMPDIR, not workspaces
  5. Multi-node temporary data: Use workspaces (not suitable for $TMPDIR)
  6. AI/ML training data: Use Weka (NEMO2) or flash filesystems for best performance, or stage to $TMPDIR for repeated access
  7. Compilation/build directories: Use flash-based filesystems (Weka, ffuc) or $TMPDIR for better performance
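
Staging data to $TMPDIR, as recommended in several items above, could look like the following inside a single-node job. This is a sketch only: stage_in is a hypothetical helper, and it assumes the batch system sets $TMPDIR to a node-local directory.

```shell
# stage_in SRC_DIR: copy SRC_DIR into node-local $TMPDIR and print the local path.
# Hypothetical helper; fails if TMPDIR is not set.
stage_in() {
    local src dst
    src=$1
    [ -n "${TMPDIR:-}" ] || { echo "TMPDIR is not set" >&2; return 1; }
    dst=$TMPDIR/$(basename "$src")
    mkdir -p "$dst" || return 1
    cp -a "$src/." "$dst/" || return 1
    printf '%s\n' "$dst"
}
```

In a job script: DATA=$(stage_in "$(ws_find myProject)/input") || exit 1. Remember to copy results back to the workspace before the job ends.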

For Quota Management

  • Use ws_release --delete-data for immediate deletion (see Release Workspace)
  • For clusters without the --delete-data option, use the manual deletion method
  • Remember: released workspaces may still count toward quota during grace period (~1 hour)