Multiple Filesystem Locations
| Works on cluster | bwUC 3.0 | BinAC2 | Helix | JUSTUS 2 | NEMO2 |
|---|---|---|---|---|---|
| -F option (multiple filesystems) | ✓ | ✗ | ✗ | ✗ | ✗ |
Some clusters offer multiple filesystem locations for workspaces, each with different performance characteristics:
bwUniCluster 3.0:
- Default workspace filesystem (Lustre)
- Flash-based workspace filesystem (ffuc), for KIT/HoreKa users only:
  - Lower latency and better performance for small files
  - SSDs instead of hard disks
  - Shared between bwUniCluster 3.0 and HoreKa
Example: create a 60-day workspace on the flash filesystem:
$ ws_allocate -F ffuc myworkspace 60
Use ws_list -l or ws_find -l to see available filesystem locations on your cluster.
Choosing the Right Filesystem
Note: Most clusters have only a single default workspace filesystem, so the -F option is rarely needed. Run ws_list -l first to check whether multiple filesystems are available on your cluster.
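Where ws_list -l shows more than one filesystem, a typical session lists them and then allocates on the flash filesystem. A minimal sketch using the hpc-workspace commands from the example above; the guard only lets the snippet degrade gracefully on machines where those tools are not installed:

```shell
# Check for the workspace tools first (assumption: the hpc-workspace
# suite shown above; it is not present on ordinary workstations).
if command -v ws_allocate >/dev/null 2>&1; then
    ws_list -l                            # list available workspace filesystems
    ws_allocate -F ffuc myworkspace 60    # 60-day workspace on the flash filesystem
    status="allocated"
else
    status="tools-missing"                # e.g. running outside a cluster login node
fi
echo "$status"
```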
bwUniCluster 3.0 Filesystems
Default Lustre filesystem:
- Standard workspace location
- Best for large files and sequential I/O
- General-purpose storage
Flash filesystem (ffuc):
- SSD-based storage for KIT/HoreKa users
- Shared between bwUniCluster 3.0 and HoreKa
- Use for workloads with:
  - Many small files
  - Random I/O patterns
  - AI/ML training
  - Compilation and builds
- Balance load: use -F ffuc when appropriate to reduce load on default filesystem
General Guidelines
Flash-based filesystems (SSD/NVMe):
- Use for many small files
- Best for low-latency requirements
- Ideal for random I/O patterns
- Examples: Weka (NEMO2), ffuc (bwUniCluster 3.0)
Standard Lustre/parallel filesystems:
- Best for large files
- Optimized for sequential I/O patterns
- General-purpose workload support
Data Type Recommendations
Large sequential I/O:
- Use standard workspace filesystem
- Lustre: best for very large files
- Weka: excellent for both large and small files
Many small files or random access:
- Use flash-based workspace filesystem (Weka, ffuc)
- Or stage to $TMPDIR on compute nodes
Data read multiple times on single node:
- Copy to $TMPDIR at job start for best performance
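The staging pattern above can be sketched as a job-script fragment. The workspace path and file names are placeholders (a temporary directory stands in for a real workspace so the snippet is self-contained):

```shell
# Stage-in/stage-out sketch. A mktemp directory stands in for a real
# workspace path; input.dat is a placeholder file name.
WORKSPACE="$(mktemp -d)"
echo "input data" > "$WORKSPACE/input.dat"       # pretend this input already exists
SCRATCH="${TMPDIR:-/tmp}/job_scratch.$$"
mkdir -p "$SCRATCH"
cp "$WORKSPACE/input.dat" "$SCRATCH/"            # stage in once at job start
# ... the job now reads $SCRATCH/input.dat repeatedly from fast local disk ...
cp "$SCRATCH"/*.dat "$WORKSPACE/"                # stage results back at job end
rm -rf "$SCRATCH"                                # clean up node-local scratch
```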
Temporary data for single node:
- Always use $TMPDIR, not workspaces
Multi-node temporary data:
- Use workspaces ($TMPDIR is node-local and cannot be shared across nodes)
AI/ML training data:
- Use Weka (NEMO2) or flash filesystems for best performance
- Or stage to $TMPDIR for repeated access
Compilation/build directories:
- Use flash-based filesystems (Weka, ffuc)
- Or $TMPDIR for better performance
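For builds, the same staging idea applies: keep the build tree (many small files) on fast local storage and copy back only the result. A sketch assuming an out-of-tree build; the project path, build command, and artifact name are placeholders:

```shell
# Out-of-tree build sketch: the build tree lives in node-local $TMPDIR.
BUILD_DIR="${TMPDIR:-/tmp}/build.$$"
mkdir -p "$BUILD_DIR"
# e.g.: cmake -S "$HOME/myproject" -B "$BUILD_DIR" && cmake --build "$BUILD_DIR"
printf 'demo artifact\n' > "$BUILD_DIR/app"      # stands in for the built binary
cp "$BUILD_DIR/app" ./app                        # keep only the finished artifact
rm -rf "$BUILD_DIR"                              # discard the many small build files
```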
For more information about specific filesystems, see the Quotas & Limits page.