Workspaces/Advanced Features/Filesystems
Latest revision as of 17:32, 2 December 2025
Multiple Filesystem Locations
Most users don't need special filesystem options. On every cluster, ws_allocate without extra options creates the workspace on the default high-performance filesystem, which is suitable for standard I/O workloads.
Do I Need the -F Option?
For standard I/O (large files, sequential access):
- All clusters: Just use ws_allocate myWs 30 (no -F needed)
- The default filesystem handles standard workloads well
For special workloads (AI/ML, many small files, random I/O):
- NEMO2: The default Weka filesystem handles these workloads well, so no -F is needed
- bwUniCluster 3.0: Use -F ffuc to place the workspace on the flash filesystem (KIT/HoreKa users only)
- Other clusters: Use $TMPDIR or default workspace
Check Available Filesystems
$ ws_list -l # List available filesystems
If only one filesystem is listed, no further choice is needed: use ws_allocate without -F.
When -F Option is Available
| Works on cluster | bwUC 3.0 | BinAC2 | Helix | JUSTUS 2 | NEMO2 |
|---|---|---|---|---|---|
| -F option | ✓ | ✗ | ✗ | ✗ | ✗ |
Only bwUniCluster 3.0 offers multiple filesystems via the -F option.
Cluster-Specific Information
NEMO2
Default Weka filesystem (no -F needed):
- Excellent for all workloads - standard I/O, small files, random access
- Handles AI/ML training, compilation, and general workloads efficiently
- Just use: ws_allocate myWs 30
bwUniCluster 3.0
Default Lustre filesystem (no -F needed):
- Best for standard I/O: large files, sequential access
- Suitable for general-purpose workloads
- Use: ws_allocate myWs 30
Flash filesystem with -F ffuc:
- SSD-based storage for special workloads
- Shared between bwUniCluster 3.0 and HoreKa (KIT/HoreKa users only)
- Use for: AI/ML datasets, many small files, random I/O, compilation
- Use: ws_allocate -F ffuc myWs 30
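The flash allocation above can be wrapped in a small helper that also returns the workspace path for use in job scripts. This is a minimal sketch, not an official tool: the function name flash_workspace is hypothetical, and it assumes that ws_find accepts the same -F option as ws_allocate and prints the workspace path on stdout.

```shell
# Hypothetical helper (not part of the workspace tools).
# Usage: flash_workspace NAME DAYS
# Allocates a workspace on the ffuc flash filesystem and prints its path.
# Assumption: ws_find supports -F and prints the path on stdout.
flash_workspace() {
    fw_name=$1
    fw_days=$2
    # Allocate on the flash filesystem; stop if the allocation fails.
    ws_allocate -F ffuc "$fw_name" "$fw_days" >/dev/null || return 1
    # Look up the path on the same filesystem.
    ws_find -F ffuc "$fw_name"
}

# Example (on bwUniCluster 3.0, KIT/HoreKa users only):
#   WS_DIR=$(flash_workspace myWs 30) && cd "$WS_DIR"
```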
Other Clusters (BinAC2, Helix, JUSTUS 2)
- Single default filesystem (no -F option available)
- Good for all standard workloads
- For special workloads with many small files, consider using $TMPDIR
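One way to follow the $TMPDIR advice is to copy the dataset to node-local storage at the start of a job and work on the copy. The sketch below is only an illustration: the function name stage_to_tmpdir is hypothetical, and it assumes $TMPDIR points at node-local storage inside the job and falls back to /tmp when the variable is unset.

```shell
# Hypothetical helper (not part of the workspace tools).
# Usage: stage_to_tmpdir SRC_DIR
# Copies SRC_DIR (e.g. a directory inside a workspace) to $TMPDIR and
# prints the path of the local copy.
stage_to_tmpdir() {
    st_src=$1
    # Fall back to /tmp if TMPDIR is not set (assumption for this sketch).
    st_dest="${TMPDIR:-/tmp}/$(basename "$st_src")"
    mkdir -p "$st_dest" || return 1
    # "SRC/." copies the directory contents, including hidden files.
    cp -R "$st_src"/. "$st_dest"/ || return 1
    printf '%s\n' "$st_dest"
}

# Example inside a job script (paths are placeholders):
#   DATA=$(stage_to_tmpdir "$(ws_find myWs)/dataset")
#   ./my_program --input "$DATA"
```

Results written to $TMPDIR need to be copied back to a workspace before the job ends, since node-local data is typically not kept after the job.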
Simple Decision Guide
| Your Workload | NEMO2 | bwUniCluster 3.0 |
|---|---|---|
| Standard I/O (large files) | ws_allocate myWs 30 | ws_allocate myWs 30 |
| AI/ML training | ws_allocate myWs 30 | ws_allocate -F ffuc myWs 30 |
| Many small files | ws_allocate myWs 30 | ws_allocate -F ffuc myWs 30 |
| Random I/O | ws_allocate myWs 30 | ws_allocate -F ffuc myWs 30 |
| Compilation/builds | ws_allocate myWs 30 | ws_allocate -F ffuc myWs 30 |
| Single-node temporary | Use $TMPDIR, not workspaces | Use $TMPDIR, not workspaces |
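The decision table can also be written as a small lookup function, for example in a shared job-script library. This is a sketch under stated assumptions: the function name ws_allocate_cmd and the cluster and workload labels (nemo2, bwuc3, standard, ai-ml, small-files, random-io, build, single-node-temp) are made up for illustration. It only prints the recommended command and does not run it. Note that -F ffuc is available to KIT/HoreKa users only.

```shell
# Hypothetical helper (labels are illustrative, not official names).
# Usage: ws_allocate_cmd CLUSTER WORKLOAD NAME DAYS
# Prints the command suggested by the decision table; does not execute it.
ws_allocate_cmd() {
    wc_cluster=$1
    wc_workload=$2
    wc_name=$3
    wc_days=$4

    # Single-node temporary data never belongs in a workspace.
    if [ "$wc_workload" = "single-node-temp" ]; then
        # Single quotes keep $TMPDIR literal in the message.
        echo 'use $TMPDIR, not a workspace'
        return 0
    fi

    case "$wc_cluster:$wc_workload" in
        bwuc3:ai-ml|bwuc3:small-files|bwuc3:random-io|bwuc3:build)
            # Special workloads on bwUniCluster 3.0 go to the flash filesystem.
            echo "ws_allocate -F ffuc $wc_name $wc_days" ;;
        bwuc3:standard|nemo2:*)
            # NEMO2 (Weka) and standard I/O use the default filesystem.
            echo "ws_allocate $wc_name $wc_days" ;;
        *)
            echo "unknown cluster/workload: $wc_cluster/$wc_workload" >&2
            return 1 ;;
    esac
}

# Example:
#   ws_allocate_cmd bwuc3 small-files myWs 30
```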
Quick Reference by Data Type
| Data Type | Where to Store |
|---|---|
| Large files, standard I/O | Default workspace (no -F) on all clusters |
| AI/ML datasets | NEMO2: default workspace; bwUniCluster 3.0: -F ffuc |
| Many small files | NEMO2: default workspace; bwUniCluster 3.0: -F ffuc |
| Random I/O patterns | NEMO2: default workspace; bwUniCluster 3.0: -F ffuc |
| Single-node temporary | Always $TMPDIR, not workspaces |
| Multi-node shared data | Default workspace on all clusters |
| Compilation/builds | NEMO2: default workspace; bwUniCluster 3.0: -F ffuc or $TMPDIR |
For quota information, see Quotas & Limits.