NEMO2/Workspaces: Difference between revisions
|||
| (38 intermediate revisions by 3 users not shown) | |||
| Line 1: | Line 1: | ||
<div style="border: 3px solid #ffc107; padding: 15px; background-color: #fff3cd; margin: 10px 0;"> |
|||
'''Note:''' This is the updated Workspaces guide for NEMO2. For other clusters please use: [[Workspace]]. |
|||
| style="padding:8px; background:#FFC05C; font-size:120%; font-weight:bold; text-align:left" | New Workspace Page |
|||
</div> |
|||
|- |
|||
| |
|||
'''Workspace tools''' provide temporary storage on NEMO's fast parallel filesystem (Weka). |
|||
'''WARNING:''' This is a new Workspaces page, the old safe-to-use page can be found here: [[Workspace]]. |
|||
They are meant for data that needs to persist longer than a single job, but not permanently. |
|||
|} |
|||
For advanced features — user config (<tt>~/.ws_user.conf</tt>), reminders, quotas, workspace handover, and more — see [[NEMO2/Workspaces/Advanced_Features|Advanced Features]]. |
|||
== What are Workspaces? == |
|||
'''Use workspaces for:''' |
|||
'''Workspace tools''' provide temporary scratch space called '''workspaces''' for your calculations on a central file storage. They are meant to keep data for a limited time – but usually longer than the time of a single job run. |
|||
* Jobs generating intermediate data |
|||
* Data shared between multiple compute nodes |
|||
* Multi-step workflows |
|||
'''Don't use workspaces for:''' |
|||
== Important == |
|||
* Permanent storage (use HOME or project directories) |
|||
* Single-node temporary files (use <tt>$TMPDIR</tt> instead) |
|||
== Important - Read First == |
|||
* '''No Backup:''' Data in workspaces is '''not backed up''' and will be '''automatically deleted''' after expiration |
|||
* '''Time-limited:''' Every workspace has a limited lifetime (typically 30-100 days depending on cluster, see the [[Workspaces/Advanced_Features#Cluster-Specific_Workspace_Limits|Cluster-Specific Workspace Limits]]) |
|||
* '''Automatic Email Reminders:''' You will receive email notifications before expiration |
|||
* '''Backup Important Data:''' Copy important results to appropriate permanent storage before expiration (location depends on your cluster/site policies) |
|||
* No Backup: Data is '''not backed up''' and will be '''automatically deleted''' after expiration |
|||
== Quick Start - Most Common Commands == |
|||
* Time-limited: Maximum lifetime is 100 days, up to 100 extensions |
|||
* Email Reminders: You receive email notifications before expiration |
|||
* Backup Important Data: Copy results to permanent storage before expiration |
|||
== Command Overview == |
|||
* <tt>ws_allocate</tt> - Create or extend workspace |
|||
* <tt>ws_list</tt> - List your workspaces |
|||
* <tt>ws_find</tt> - Find workspace path (for scripts) |
|||
* <tt>ws_extend</tt> - Extend workspace lifetime |
|||
* <tt>ws_release</tt> - Release (delete) workspace |
|||
* <tt>ws_restore</tt> - Restore expired/released workspace |
|||
* <tt>ws_register</tt> - Create symbolic links |
|||
All commands support <tt>-h</tt> for help. |
|||
== Quick Start == |
|||
{| class="wikitable" |
{| class="wikitable" |
||
| Line 23: | Line 45: | ||
!style="width:60%" | Command |
!style="width:60%" | Command |
||
|- |
|- |
||
|Create workspace |
|Create workspace (100 days) |
||
|<tt>ws_allocate myWs |
|<tt>ws_allocate myWs 100</tt> |
||
|- |
|- |
||
|Create group |
|Create group workspace |
||
|<tt>ws_allocate -G groupname myWs |
|<tt>ws_allocate -G groupname myWs 100</tt> |
||
|- |
|- |
||
|List all |
|List all workspaces |
||
|<tt>ws_list</tt> |
|<tt>ws_list</tt> |
||
|- |
|- |
||
|See what expires soon |
|||
|Find workspace path (for scripts) |
|||
|<tt>ws_list -Rr</tt> |
|||
|- |
|||
|Find path (for scripts) |
|||
|<tt>ws_find myWs</tt> |
|<tt>ws_find myWs</tt> |
||
|- |
|- |
||
|Extend by 100 days |
|||
|Check which expire soon |
|||
|<tt> |
|<tt>ws_extend myWs 100</tt> |
||
|- |
|- |
||
| |
|Delete workspace (permanent, next nightly run) |
||
|<tt>ws_extend myWs 30</tt> |
|||
|- |
|||
|Delete/release workspace |
|||
|<tt>ws_release myWs</tt> |
|<tt>ws_release myWs</tt> |
||
|- |
|- |
||
|Restore |
|Restore expired workspace (30d grace) |
||
|<tt>ws_restore -l</tt> then <tt>ws_restore oldname newname</tt> |
|<tt>ws_restore -l</tt> then <tt>ws_restore oldname newname</tt> |
||
|} |
|} |
||
== |
== Creating Workspaces == |
||
Create a workspace with a '''name''' and '''lifetime''' in days: |
|||
$ ws_allocate myWs |
$ ws_allocate myWs 100 |
||
This returns: |
|||
Workspace created. Duration is 720 hours. |
|||
Further extensions available: 3 |
|||
/work/workspace/scratch/username-myWs-0 |
|||
'''Important:''' Creating a workspace a second time with the same command is safe - it always returns the same path. |
|||
'''Capture the path in a variable:''' |
|||
$ WORKSPACE=$(ws_allocate myWs 30) |
|||
$ cd "$WORKSPACE"
|||
'''For all options and advanced usage,''' see the [[Workspaces/Advanced_Features#Detailed_ws_allocate_Options|Advanced Features guide]]. |
|||
== List Your Workspaces == |
|||
To see all your workspaces: |
|||
$ ws_list |
|||
Shows: |
|||
* Workspace ID |
|||
* Workspace location |
|||
* Available extensions |
|||
* Creation date and remaining time |
|||
'''Useful options:''' |
|||
* <tt>ws_list -R</tt> - Sort by remaining time (see what expires soon) |
|||
* <tt>ws_list -s</tt> - Short format (only names, good for scripts) |
|||
== Find Workspace Path == |
|||
Get the path to a workspace for use in scripts: |
|||
$ ws_find myWs |
|||
Returns: |
Returns: |
||
/work/ |
/work/classic/$USER-myWs |
||
''' |
'''Capture path in variable:''' |
||
$ |
$ WORKSPACE=$(ws_allocate myWs 100) |
||
$ cd "$WORKSPACE" |
|||
$ WORKSPACE=$(ws_find myWs) |
|||
'''Important:''' Running the same command again is safe - returns the existing workspace path. |
|||
== Extend Workspace Lifetime == |
|||
== Listing Workspaces == |
|||
Extend a workspace before it expires: |
|||
$ |
$ ws_list # List all workspaces |
||
$ ws_list -Rr # Sort by remaining time, soonest first |
|||
$ ws_list -g # Show group workspaces |
|||
== Extending Workspaces == |
|||
Or use: |
|||
$ |
$ ws_extend myWs 100 # Extend by 100 days from now |
||
'''Alternative:''' <tt>ws_allocate -x myWs 100</tt> |
|||
'''Note:''' Each extension consumes one of your available extensions (see the [[Workspaces/Advanced_Features#Cluster-Specific_Workspace_Limits|Cluster-Specific Workspace Limits]]). |
|||
Each extension consumes one of your available extensions (100 total). |
|||
== Release (Delete) Workspace == |
|||
== Releasing Workspaces == |
|||
When you no longer need a workspace: |
|||
$ ws_release myWs |
$ ws_release myWs |
||
The workspace becomes inaccessible immediately and is permanently deleted at the next nightly expirer run. '''Do not rely on recovering a released workspace.''' |
|||
'''What happens:''' |
|||
* Workspace becomes inaccessible |
|||
* Data is kept for a grace period (can be restored, see below) |
|||
* Real deletion happens later (typically during nighttime) |
|||
== Restoring Workspaces == |
|||
'''To free quota immediately:''' |
|||
Recover workspaces that '''expired naturally''' (reached end of lifetime) within the 30-day grace period: |
|||
$ ws_release --delete-data myWs # Immediate deletion (WARNING: cannot be recovered!) |
|||
$ ws_restore -l # (1) List restorable workspaces |
|||
Or with older workspace tools: |
|||
$ ws_allocate restored 100 # (2) Create target workspace |
|||
$ ws_restore username-myWs-0 restored # (3) Restore |
|||
'''Important:''' Use the '''full name''' from <tt>ws_restore -l</tt> (with username and timestamp), not the short name. |
|||
$ WSDIR=$(ws_find myWs) && [ -n "$WSDIR" ] && rm -rf "$WSDIR" # Delete data first (with safety check) |
|||
Released workspaces (via <tt>ws_release</tt>) can also be restored, but only until the next nightly expirer run — after that they are permanently deleted. |
|||
$ ws_release myWs # Then release |
|||
== |
== Sharing Workspaces == |
||
=== Group workspace (recommended) === |
|||
If you released a workspace by accident or need to recover an expired one, you can restore it within a grace period: |
|||
$ ws_allocate -g myWs 100 # Group-readable (read-only for group) |
|||
'''(1) List restorable workspaces:''' |
|||
$ ws_allocate -G projectgroup myWs 100 # Group-writable (recommended for teams) |
|||
Anyone in the group can use <tt>ws_list -g</tt> to see the workspace and extend it with <tt>ws_allocate -x -u owner myWs 100</tt>. |
|||
$ ws_restore -l |
|||
Using <tt>-G</tt> also enables smooth handover when team members leave — see [[NEMO2/Workspaces/Advanced_Features#Workspace_Handover|Workspace Handover]]. |
|||
''' |
'''Set default group in <tt>~/.ws_user.conf</tt>:''' |
||
$ ws_allocate restored 60 |
|||
'''(3) Restore the expired workspace:''' |
|||
$ ws_restore username-myWs-0 restored |
|||
'''Note:''' Use the '''full name''' from <tt>ws_restore -l</tt> (including username and timestamp), not the short name from <tt>ws_list</tt>. |
|||
'''For detailed restore options,''' see the [[Workspaces/Advanced_Features#Restore_an_Expired_Workspace|Advanced Features guide]]. |
|||
== Share Workspace == |
|||
You can share workspaces with team members: |
|||
'''Important:''' Not all sharing options are available on all clusters. ACL-based methods like <tt>ws_share</tt> require filesystem support and may not work everywhere. If one method doesn't work, try an alternative approach. |
|||
'''Group-readable workspace''' (read-only for group): |
|||
$ ws_allocate -g myWs 30 |
|||
'''Group-writable workspace''' (read-write for group, recommended): |
|||
$ ws_allocate -G projectgroup myWs 30 |
|||
'''Recommended approach:''' |
|||
* Use <tt>-g</tt> or <tt>-G</tt> flags during workspace creation |
|||
* For read-only sharing: use <tt>-g</tt> |
|||
* For collaborative work (read-write): use <tt>-G groupname</tt> |
|||
* Set <tt>groupname</tt> in <tt>~/.ws_user.conf</tt> if you always work with the same group |
|||
'''For advanced sharing options''' (ACL-based, read-only, less common), see the [[Workspaces/Advanced_Features#Cooperative_Usage_.28Group_Workspaces_and_Sharing.29|Advanced Features guide]]. |
|||
== Command Overview == |
|||
The workspace tools consist of several commands: |
|||
* <tt>ws_allocate</tt> - Create or extend a workspace |
|||
* <tt>ws_list</tt> - List all your workspaces |
|||
* <tt>ws_find</tt> - Find the path to a workspace |
|||
* <tt>ws_extend</tt> - Extend the lifetime of a workspace |
|||
* <tt>ws_release</tt> - Release (delete) a workspace |
|||
* <tt>ws_restore</tt> - Restore an expired or released workspace |
|||
* <tt>ws_register</tt> - Create symbolic links to workspaces |
|||
All commands support <tt>-h</tt> or <tt>--help</tt> to show detailed usage information. |
|||
== Using Workspaces in Batch Jobs == |
|||
'''Recommended approach:''' Create your workspace manually before submitting jobs, then reference it in your job scripts using <tt>ws_find</tt>. |
|||
'''(1) Create workspace once (on login node):''' |
|||
$ ws_allocate myProject 60 |
|||
'''(2) Use in job scripts with ws_find:''' |
|||
<pre> |
<pre> |
||
groupname: projectgroup |
|||
#!/bin/bash |
|||
#SBATCH --job-name=my_job |
|||
#SBATCH --time=24:00:00 |
|||
# Find existing workspace |
|||
WORKSPACE=$(ws_find myProject) |
|||
# Change to workspace |
|||
cd "$WORKSPACE" || exit 1
|||
# Your computation here |
|||
./my_program --input input.dat --output results.dat |
|||
</pre> |
</pre> |
||
=== Share after creation === |
|||
'''Warning:''' Avoid using <tt>ws_allocate</tt> directly in job scripts that run frequently. While <tt>ws_allocate</tt> is safe to call multiple times on the same workspace name (it returns the existing workspace), you should not create too many workspaces unnecessarily. Create workspaces manually when needed, then use <tt>ws_find</tt> in your job scripts to locate them. |
|||
== Advanced Features == |
|||
For detailed information about advanced workspace features, configuration options, and less frequently used commands, see the separate [[Workspaces/Advanced_Features]] guide. |
|||
Topics covered in the advanced guide include: |
|||
* Complete command reference with all options |
|||
* Multiple filesystem locations |
|||
* Detailed options for ws_allocate, ws_list, ws_find, ws_extend |
|||
* Email and calendar reminders configuration |
|||
* Group workspaces and cooperative usage |
|||
* Advanced sharing with ws_share (ACL-based, read-only) |
|||
* Setting permissions (ACLs and Unix permissions) |
|||
* Deleting and restoring workspaces in detail |
|||
* Cluster-specific limits and quotas |
|||
* Checking workspace quotas |
|||
* Registering workspace links |
|||
== Best Practices and Recommendations == |
|||
=== For All Users === |
|||
# '''Set up ~/.ws_user.conf''' - Configure default reminder timing, duration, and groupname to avoid typing them repeatedly (see [[Workspaces/Advanced_Features#Example_.7E.2F.ws_user.conf_Configuration|example configuration]]) |
|||
# '''Email reminders are automatic''' - Notifications are sent automatically using your identity provider email; only use <tt>-r</tt> to customize reminder timing if needed |
|||
# '''Custom email only if needed''' - Only use <tt>-m</tt> option to override the email address from your identity provider |
|||
# '''Use ws_register''' - Create symbolic links to your workspaces in a convenient directory: <tt>ws_register ~/workspaces</tt> |
|||
# '''Create workspaces manually''' - Create workspaces on the login node before submitting jobs, then use <tt>ws_find</tt> in your job scripts |
|||
# '''Track your workspaces''' - Regularly run <tt>ws_list -R</tt> to see which workspaces will expire soon |
|||
# '''Backup important data''' - Workspaces are temporary and not backed up - copy results to appropriate permanent storage (check your cluster/site policies for backup locations) |
|||
# '''Clean up regularly''' - Release workspaces you no longer need to keep filesystems organized |
|||
=== For Short-term Jobs (hours to days) === |
|||
# Use default or short durations (1-7 days) |
|||
# Consider using a single workspace for a series of related jobs |
|||
# Use <tt>ws_find</tt> in job scripts to locate the workspace |
|||
=== For Long-term Campaigns (weeks to months) === |
|||
# Request maximum allowed duration |
|||
# Email reminders are sent automatically; optionally customize reminder timing with <tt>-r</tt> option |
|||
# Use <tt>ws_list -R</tt> regularly to monitor remaining time |
|||
# Plan data archival to appropriate permanent storage before expiration (check cluster/site policies) |
|||
=== For Collaborative Work === |
|||
# Use <tt>ws_allocate -G groupname</tt> for shared write access (recommended) |
|||
# Set <tt>groupname</tt> in <tt>~/.ws_user.conf</tt> if you always work with the same group |
|||
# Use <tt>ws_allocate -g</tt> for read-only sharing within group |
|||
# Document the workspace location for your team members |
|||
# For advanced sharing scenarios, see the [[Workspaces/Advanced_Features#Cooperative_Usage_.28Group_Workspaces_and_Sharing.29|Advanced Features guide]] |
|||
=== For Managing Multiple Filesystems === |
|||
# '''Note:''' Most clusters have only one default filesystem - the <tt>-F</tt> option is rarely needed |
|||
# Use <tt>ws_list -l</tt> first to check if multiple filesystems are available on your cluster |
|||
# Use <tt>-F</tt> option only if you need specific filesystem for performance or capacity needs (see [[Workspaces/Advanced_Features#Multiple_Filesystem_Locations|filesystem options]]) |
|||
# '''bwUniCluster 3.0 filesystems:''' |
|||
#* '''Default Lustre filesystem:''' Standard workspace location, best for large files and sequential I/O |
|||
#* '''Flash filesystem (ffuc):''' SSD-based storage for KIT/HoreKa users, shared between bwUniCluster 3.0 and HoreKa |
|||
#* Use flash filesystem for workloads with many small files, random I/O, AI/ML training, or compilation |
|||
#* Balance load: use <tt>-F ffuc</tt> when appropriate to reduce load on default filesystem |
|||
# '''General guidelines:''' |
|||
#* Flash-based filesystems (SSD/NVMe): Use for many small files, low-latency requirements, random I/O |
|||
#* Standard Lustre/parallel filesystems: Best for large files and sequential I/O patterns |
|||
=== For Different Data Types === |
|||
If you didn't use <tt>-g</tt>/<tt>-G</tt> at creation, share read-only with <tt>ws_share</tt>: |
|||
# '''Large sequential I/O:''' Use standard workspace filesystem (Lustre best for very large files, Weka excellent for both large and small) |
|||
# '''Many small files or random access:''' Use flash-based workspace filesystem like Weka (NEMO2) or bwUniCluster ffuc, or stage to <tt>$TMPDIR</tt> |
|||
# '''Data read multiple times on single node:''' Copy to <tt>$TMPDIR</tt> at job start for best performance |
|||
# '''Temporary data for single node:''' Always use <tt>$TMPDIR</tt>, not workspaces |
|||
# '''Multi-node temporary data:''' Use workspaces (not suitable for <tt>$TMPDIR</tt>) |
|||
# '''AI/ML training data:''' Use Weka (NEMO2) or flash filesystems for best performance, or stage to <tt>$TMPDIR</tt> for repeated access |
|||
# '''Compilation/build directories:''' Use flash-based filesystems (Weka, ffuc) or <tt>$TMPDIR</tt> for better performance |
|||
$ ws_share share myWs alice bob # Grant read access |
|||
=== For Quota Management === |
|||
$ ws_share list myWs # Show who has access |
|||
$ ws_share unshare myWs alice # Remove access |
|||
'''Advanced sharing:''' [[NEMO2/Workspaces/Advanced_Features#Sharing|Sharing guide]] for ACL-based per-user permissions. |
|||
# Delete data before releasing if you need immediate quota relief: <tt>WSDIR=$(ws_find workspace) && [ -n "$WSDIR" ] && rm -rf "$WSDIR"</tt> then <tt>ws_release workspace</tt> |
|||
# Use <tt>ws_release --delete-data</tt> (newer versions) for immediate deletion |
|||
# Remember: released workspaces may still count toward quota during grace period |
|||
Latest revision as of 17:37, 12 May 2026
Note: This is the updated Workspaces guide for NEMO2. For other clusters please use: Workspace.
Workspace tools provide temporary storage on NEMO's fast parallel filesystem (Weka). They are meant for data that needs to persist longer than a single job, but not permanently.
For advanced features — user config (~/.ws_user.conf), reminders, quotas, workspace handover, and more — see Advanced Features.
What are Workspaces?
Use workspaces for:
- Jobs generating intermediate data
- Data shared between multiple compute nodes
- Multi-step workflows
Don't use workspaces for:
- Permanent storage (use HOME or project directories)
- Single-node temporary files (use $TMPDIR instead)
Important - Read First
- No Backup: Data is not backed up and will be automatically deleted after expiration
- Time-limited: Maximum lifetime is 100 days, up to 100 extensions
- Email Reminders: You receive email notifications before expiration
- Backup Important Data: Copy results to permanent storage before expiration
Command Overview
- ws_allocate - Create or extend workspace
- ws_list - List your workspaces
- ws_find - Find workspace path (for scripts)
- ws_extend - Extend workspace lifetime
- ws_release - Release (delete) workspace
- ws_restore - Restore expired/released workspace
- ws_register - Create symbolic links
All commands support -h for help.
Quick Start
| Task | Command |
|---|---|
| Create workspace (100 days) | ws_allocate myWs 100 |
| Create group workspace | ws_allocate -G groupname myWs 100 |
| List all workspaces | ws_list |
| See what expires soon | ws_list -Rr |
| Find path (for scripts) | ws_find myWs |
| Extend by 100 days | ws_extend myWs 100 |
| Delete workspace (permanent, next nightly run) | ws_release myWs |
| Restore expired workspace (30d grace) | ws_restore -l then ws_restore oldname newname |
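For scripts, the ws_find lookup from the table can be wrapped in a small guard. The following is a minimal sketch, assuming a workspace named myWs has already been created on the login node. The function name enter_workspace is hypothetical and not part of the workspace tools. It refuses to continue if ws_find returns nothing, so a job never runs in an unintended directory.

```shell
# Hypothetical helper (not part of the workspace tools): look up a workspace
# with ws_find and change into it, failing loudly if it cannot be found.
enter_workspace() {
    ws_name=$1
    # ws_find prints the workspace path; treat a failed lookup as "no path".
    ws_path=$(ws_find "$ws_name") || ws_path=""
    if [ -z "$ws_path" ] || [ ! -d "$ws_path" ]; then
        echo "workspace '$ws_name' not found" >&2
        return 1
    fi
    cd "$ws_path" || return 1
}

# Usage in a job script (assumes the workspace myWs already exists):
#   enter_workspace myWs || exit 1
#   ./my_program --input input.dat --output results.dat
```

Creating the workspace once by hand and only looking it up in jobs keeps frequently running scripts from allocating workspaces by accident.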
Creating Workspaces
Create a workspace with a name and lifetime in days:
$ ws_allocate myWs 100
Returns:
/work/classic/$USER-myWs
Capture path in variable:
$ WORKSPACE=$(ws_allocate myWs 100)
$ cd "$WORKSPACE"
Important: Running the same command again is safe - returns the existing workspace path.
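Because repeating the command returns the existing path, allocation can be wrapped in an idempotent helper. This is a sketch under assumptions: ensure_workspace is a hypothetical name, the default of 100 days is taken from the maximum lifetime stated on this page, and the path is assumed to arrive on standard output as in the capture example above.

```shell
# Hypothetical helper: allocate (or re-use) a workspace and print its path.
# Relies on the documented behaviour that repeating ws_allocate with the
# same name returns the existing workspace path.
ensure_workspace() {
    ws_name=$1
    ws_days=${2:-100}   # assumed default: the 100-day maximum from this page
    ws_path=$(ws_allocate "$ws_name" "$ws_days") || return 1
    # Refuse to continue with an empty or non-existent path.
    [ -n "$ws_path" ] && [ -d "$ws_path" ] || return 1
    printf '%s\n' "$ws_path"
}

# Usage:
#   WORKSPACE=$(ensure_workspace myWs 100) || exit 1
#   cd "$WORKSPACE" || exit 1
```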
Listing Workspaces
$ ws_list        # List all workspaces
$ ws_list -Rr    # Sort by remaining time, soonest first
$ ws_list -g     # Show group workspaces
Extending Workspaces
$ ws_extend myWs 100 # Extend by 100 days from now
Alternative: ws_allocate -x myWs 100
Each extension consumes one of your available extensions (100 total).
Releasing Workspaces
$ ws_release myWs
The workspace becomes inaccessible immediately and is permanently deleted at the next nightly expirer run. Do not rely on recovering a released workspace.
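Since a released workspace should be treated as gone, it can help to copy results out first and release only if the copy succeeded. The sketch below is one way to do that. archive_and_release is a hypothetical name, and the destination directory is a placeholder for whatever permanent storage your site provides.

```shell
# Hypothetical helper: copy the contents of a workspace to permanent storage
# and release the workspace only if the copy succeeded.
# dest is a placeholder; use a location allowed by your site policy.
archive_and_release() {
    ws_name=$1
    dest=$2
    ws_path=$(ws_find "$ws_name") || return 1
    [ -n "$ws_path" ] && [ -d "$ws_path" ] || return 1
    mkdir -p "$dest" || return 1
    # -R copies recursively, -p preserves timestamps and permissions;
    # the trailing "/." copies the directory contents, including dotfiles.
    cp -Rp "$ws_path/." "$dest/" || return 1
    ws_release "$ws_name"
}

# Usage (destination path is an example only):
#   archive_and_release myWs "$HOME/results/myWs"
```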
Restoring Workspaces
Recover workspaces that expired naturally (reached end of lifetime) within the 30-day grace period:
$ ws_restore -l                          # (1) List restorable workspaces
$ ws_allocate restored 100               # (2) Create target workspace
$ ws_restore username-myWs-0 restored    # (3) Restore
Important: Use the full name from ws_restore -l (with username and timestamp), not the short name. Released workspaces (via ws_release) can also be restored, but only until the next nightly expirer run — after that they are permanently deleted.
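Steps (2) and (3) can be combined once the full name is known. The following is a rough sketch only: restore_workspace is a hypothetical name, the full name must still be copied by hand from the ws_restore -l listing, and ws_restore may ask for interactive confirmation on some installations, so this is meant for use at a terminal rather than in unattended scripts.

```shell
# Hypothetical helper wrapping the two documented steps: create a target
# workspace, then restore the expired one into it.
# full_name must be copied from the output of `ws_restore -l`.
restore_workspace() {
    full_name=$1
    target=$2
    days=${3:-100}   # assumed default: the 100-day maximum from this page
    # The path printed by ws_allocate is not needed here, so discard it.
    ws_allocate "$target" "$days" >/dev/null || return 1
    ws_restore "$full_name" "$target"
}

# Usage:
#   ws_restore -l
#   restore_workspace username-myWs-0 restored
```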
Sharing Workspaces
Group workspace (recommended)
$ ws_allocate -g myWs 100                 # Group-readable (read-only for group)
$ ws_allocate -G projectgroup myWs 100    # Group-writable (recommended for teams)
Anyone in the group can use ws_list -g to see the workspace and extend it with ws_allocate -x -u owner myWs 100. Using -G also enables smooth handover when team members leave — see Workspace Handover.
Set default group in ~/.ws_user.conf:
groupname: projectgroup
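Before creating a group-writable workspace, it can be worth checking that you are actually a member of the group. The sketch below does this with the standard id command and then calls ws_allocate -G as shown above. allocate_group_workspace is a hypothetical name, and whether ws_allocate itself rejects non-members is not assumed here.

```shell
# Hypothetical helper: create a group-writable workspace, but only if the
# calling user is a member of the given group.
allocate_group_workspace() {
    ws_group=$1
    ws_name=$2
    ws_days=${3:-100}   # assumed default: the 100-day maximum from this page
    # id -Gn prints the names of all groups the current user belongs to.
    case " $(id -Gn) " in
        *" $ws_group "*) ;;
        *) echo "not a member of group '$ws_group'" >&2; return 1 ;;
    esac
    ws_allocate -G "$ws_group" "$ws_name" "$ws_days"
}

# Usage:
#   allocate_group_workspace projectgroup myWs 100
```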
If you didn't use -g/-G at creation, share read-only with ws_share:
$ ws_share share myWs alice bob    # Grant read access
$ ws_share list myWs               # Show who has access
$ ws_share unshare myWs alice      # Remove access
Advanced sharing: Sharing guide for ACL-based per-user permissions.