Helix/Hardware: Difference between revisions
S Richling (talk | contribs) No edit summary |
S Richling (talk | contribs) |
Revision as of 13:15, 1 November 2022
System Architecture
The bwForCluster Helix is a high-performance supercomputer with a high-speed interconnect. It is composed of login nodes, compute nodes, and parallel storage systems connected by fast data networks. It is connected to the Internet via BelWü, Baden-Württemberg's extended LAN.
Operating System and Software
- Operating system: RedHat
- Queuing system: Slurm
- Access to application software: Environment Modules
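As an illustration of how these three pieces fit together, the following minimal sketch builds the text of a single-node Slurm batch script that loads software through Environment Modules. The `#SBATCH` options used (`--nodes`, `--ntasks-per-node`, `--mem`, `--time`) are standard `sbatch` options; the module name `example/1.0` and the program `./my_program` are hypothetical placeholders, and any site-specific settings such as partitions are deliberately left out because this page does not describe them.

```python
def make_job_script(ntasks, mem_gb, walltime, module, command):
    """Return the text of a single-node Slurm batch script.

    All argument values are supplied by the caller; nothing here is
    checked against the cluster's real limits or module names.
    """
    lines = [
        "#!/bin/bash",
        "#SBATCH --nodes=1",
        f"#SBATCH --ntasks-per-node={ntasks}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={walltime}",
        "",
        # Environment Modules: make the application available in the job
        f"module load {module}",
        command,
    ]
    return "\n".join(lines) + "\n"


# Hypothetical example values: module name and program are placeholders.
script = make_job_script(
    ntasks=64,
    mem_gb=248,
    walltime="01:00:00",
    module="example/1.0",
    command="srun ./my_program",
)
print(script)
```

The generated text could be saved to a file and submitted with `sbatch`; the values for cores and memory should be chosen to match the node type being requested.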
Compute Nodes
AMD Nodes
Common features of all AMD nodes:
- Processors: 2x AMD EPYC 7513 (Milan)
- Processor Frequency: 2.6 GHz
- Number of Cores per Node: 64
- Local disk: None
Node Category | CPU Nodes | CPU Nodes | GPU Nodes | GPU Nodes | GPU Nodes |
---|---|---|---|---|---|
Node Type | cpu | fat | gpu4 | gpu4 | gpu8 |
Quantity | 248 | 12 | 29 | 26 | 2 |
Installed Working Memory (GB) | 256 | 2048 | 256 | 256 | 2048 |
Available Memory for Jobs (GB) | 248 | 2010 | 248 | 248 | 2010 |
Interconnect | 1x HDR100 | 1x HDR100 | 2x HDR100 | 2x HDR200 | 4x HDR200 |
Coprocessors | - | - | 4x Nvidia A40 (48 GB) | 4x Nvidia A100 (40 GB) | 8x Nvidia A100 (80 GB) |
Number of GPUs | - | - | 4 | 4 | 8 |
GPU Type | - | - | A40 | A100 | A100 |
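The figures in the table can be combined into cluster-wide totals and a memory-per-core estimate. The sketch below does this in Python, using only the quantities, memory sizes, and GPU counts listed above together with the 64 cores per node stated for all AMD nodes. The class and variable names are illustrative, and the split of the `gpu4` type into an A40 and an A100 variant follows the Coprocessors row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    name: str
    quantity: int
    installed_gb: int   # installed working memory per node
    available_gb: int   # memory available for jobs per node
    gpus: int           # GPUs per node


CORES_PER_NODE = 64  # common to all AMD nodes

# Values copied from the AMD node table.
NODE_TYPES = [
    NodeType("cpu", 248, 256, 248, 0),
    NodeType("fat", 12, 2048, 2010, 0),
    NodeType("gpu4 (A40)", 29, 256, 248, 4),
    NodeType("gpu4 (A100)", 26, 256, 248, 4),
    NodeType("gpu8 (A100)", 2, 2048, 2010, 8),
]

total_nodes = sum(n.quantity for n in NODE_TYPES)
total_cores = total_nodes * CORES_PER_NODE
total_gpus = sum(n.quantity * n.gpus for n in NODE_TYPES)
total_installed_gb = sum(n.quantity * n.installed_gb for n in NODE_TYPES)

print(f"AMD nodes: {total_nodes}, cores: {total_cores}, GPUs: {total_gpus}")
print(f"Installed memory: {total_installed_gb} GB")

# Memory a job can use per core if it spreads evenly over a full node.
for n in NODE_TYPES:
    per_core = n.available_gb / CORES_PER_NODE
    print(f"{n.name:12s} {per_core:6.2f} GB available per core")
```

The per-core value is only a rule of thumb for sizing memory requests; a job that uses fewer cores can still request more memory per core, up to the per-node limit in the table.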
Intel Nodes
Some Intel nodes (Skylake and Cascade Lake) from the predecessor system will be integrated. Details will follow.
Storage Architecture
There is one storage system providing a large parallel file system based on IBM Spectrum Scale for $HOME, for workspaces, and for temporary job data.
Network
The components of the cluster are connected via two independent networks: a management network (Ethernet and IPMI) and an InfiniBand fabric for MPI communication and storage access. The InfiniBand backbone is a fully non-blocking fabric with a data rate of 200 Gb/s. The compute nodes are connected at different data rates depending on the node configuration.
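To make the per-node data rates concrete, the following sketch converts the interconnect entries from the compute node table (for example `2x HDR100`) into a nominal aggregate link rate per node. It assumes that HDR100 denotes a 100 Gb/s link and HDR200 a 200 Gb/s link; these are nominal signalling rates, not measured throughput, and the function name is illustrative.

```python
import re

# Assumed nominal link rates in Gb/s (not measured throughput).
LINK_RATE_GBPS = {"HDR100": 100, "HDR200": 200}


def node_bandwidth_gbps(spec):
    """Return the nominal aggregate rate for a spec such as '2x HDR100'."""
    match = re.fullmatch(r"(\d+)x (HDR100|HDR200)", spec)
    if match is None:
        raise ValueError(f"unrecognized interconnect spec: {spec!r}")
    count, link = match.groups()
    return int(count) * LINK_RATE_GBPS[link]


# Interconnect entries as listed in the compute node table.
for spec in ["1x HDR100", "2x HDR100", "2x HDR200", "4x HDR200"]:
    print(f"{spec}: {node_bandwidth_gbps(spec)} Gb/s nominal")
```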