Helix/Hardware: Difference between revisions

From bwHPC Wiki
(39 intermediate revisions by the same user not shown)
Line 17: Line 17:
* Processor Frequency: 2.6 GHz
* Processor Frequency: 2.6 GHz
* Number of Cores per Node: 64
* Number of Cores per Node: 64
* Local disk space: None
* Local disk: None


{| class="wikitable"
{| class="wikitable" style="width:70%;"
|-
|-
! style="width:10%"|
! style="width:20%" |
! style="width:10%" colspan="2" style="text-align:center" | CPU Nodes
! style="width:40%" colspan="2" style="text-align:center" | CPU Nodes
! style="width:10%" colspan="3" style="text-align:center" | GPU Nodes
! style="width:40%" colspan="3" style="text-align:center" | GPU Nodes
|-
|-
!scope="column"| Node Type
!scope="column"| Node Type
Line 32: Line 32:
|-
|-
!scope="column"| Quantity
!scope="column"| Quantity
| xxx
| 332
| 12
| 15
| xxx
| 29
| 25
| 26
| 2
| 3
|-
|-
!scope="column" | Working Memory (GB)
!scope="column" | Installed Working Memory (GB)
| 256
| 256
| 2048
| 2048
Line 44: Line 44:
| 256
| 256
| 2048
| 2048
|-
!scope="column" | Available Memory for Jobs (GB)
| 236
| 2010
| 236
| 236
| 2010
|-
|-
!scope="column" | Interconnect
!scope="column" | Interconnect
Line 55: Line 62:
| -
| -
| -
| -
| 4x [https://www.nvidia.com/en-us/data-center/a40/ Nvidia A40]
| 4x [https://www.nvidia.com/en-us/data-center/a40/ Nvidia A40] (48 GB)
| 4x [https://www.nvidia.com/en-us/data-center/a100/ Nvidia A100]
| 4x [https://www.nvidia.com/en-us/data-center/a100/ Nvidia A100] (40 GB)
| 8x [https://www.nvidia.com/en-us/data-center/a100/ Nvidia A100]
| 8x [https://www.nvidia.com/en-us/data-center/a100/ Nvidia A100] (80 GB)
|-
|-
!scope="column" | Number of GPUs
!scope="column" | Number of GPUs
Line 77: Line 84:


Some Intel nodes (Skylake and Cascade Lake) from the predecessor system will be integrated. Details will follow.
Some Intel nodes (Skylake and Cascade Lake) from the predecessor system will be integrated. Details will follow.

<!-- Intel nodes table (draft)

Common features of all Intel nodes:
* Interconnect: 1x EDR

{| class="wikitable"
|-
! style="width:12%"|
! style="width:10%" colspan="2" style="text-align:center" |CPU
! style="width:20%" colspan="8" style="text-align:center" |GPU
|-
!scope="column"| Node Type
| colspan="1" style="text-align:center" | cpu-sky
| colspan="1" style="text-align:center" | cpu-cas
| colspan="4" style="text-align:center" | gpu-sky
| colspan="4" style="text-align:center" | gpu-cas
|-
!scope="column"| Architecture
| colspan="1" style="text-align:center" | Skylake
| colspan="1" style="text-align:center" | Cascade Lake
| colspan="4" style="text-align:center" | Skylake
| colspan="4" style="text-align:center" | Cascade Lake
|-
!scope="column"| Quantity
| 24
| 5
| 1
| 1
| 2
| 3
| 3
| 3
| 3
| 1
|-
!scope="column" | Processors
| 2 x Intel Xeon Gold 6130
| 2 x Intel Xeon Gold 6230
| 2 x Intel Xeon Gold 6130
| 2 x Intel Xeon Gold 6130
| 2 x Intel Xeon Gold 6130
| 2 x Intel Xeon Gold 6130
| 2 x Intel Xeon Gold 6230
| 2 x Intel Xeon Gold 6230
| 2 x Intel Xeon Gold 6240R
| 2 x Intel Xeon Gold 6240R
|-
!scope="column" | Processor Frequency (GHz)
| 2.1
| 2.2
| 2.1
| 2.1
| 2.1
| 2.1
| 2.1
| 2.1
| 2.4
| 2.4
|-
!scope="column" | Number of Cores
| 32
| 40
| 32
| 32
| 32
| 32
| 40
| 40
| 48
| 48
|-
!scope="column" | Working Memory (GB)
| 192
| 384
| 192
| 384
| 384
| 384
| 384
| 384
| 384
| 384
|-
!scope="column" | Local Disk (GB)
| 512 (SSD)
| 480 (SSD)
| 512 (SSD)
| 512 (SSD)
| 512 (SSD)
| 512 (SSD)
| 480 (SSD)
| 480 (SSD)
| 480 (SSD)
| 480 (SSD)
|-
!scope="column" | Coprocessors
| -
| -
| 4 x [https://www.nvidia.com/en-us/titan/titan-xp/ Nvidia Titan Xp (12 GB)]
| 4 x [https://www.nvidia.com/de-de/data-center/tesla-v100/ Nvidia Tesla V100 (16 GB)]
| 4 x [https://www.nvidia.com/de-de/geforce/products/10series/geforce-gtx-1080-ti/ Nvidia GeForce GTX 1080Ti (11 GB)]
| 4 x [https://www.nvidia.com/de-de/geforce/graphics-cards/rtx-2080-ti/ Nvidia GeForce RTX 2080Ti (11 GB)]
| 4 x [https://www.nvidia.com/de-de/data-center/tesla-v100/ Nvidia Tesla V100 (16 GB)]
| 4 x [https://www.nvidia.com/de-de/data-center/tesla-v100/ Nvidia Tesla V100s (32 GB)]
| 4 x [https://www.nvidia.com/de-de/geforce/graphics-cards/30-series/rtx-3090 Nvidia GeForce RTX 3090 (24 GB)]
| 4 x [https://www.nvidia.com/de-de/design-visualization/quadro/rtx-8000/ Nvidia Quadro RTX 8000 (48 GB)]

|-
! scope="column" | Number of GPUs
| -
| -
| 4
| 4
| 4
| 4
| 4
| 4
| 4
| 4
|-
! scope="column" | GPU Type
| -
| -
| TITAN
| V100
| GTX1080
| RTX2080
| V100
| V100S
| RTX3090
| RTX8000
|}
-->


== Storage Architecture ==
== Storage Architecture ==
Line 85: Line 226:


The components of the cluster are connected via two independent networks, a management network (Ethernet and IPMI) and an Infiniband fabric for MPI communication and storage access.
The components of the cluster are connected via two independent networks, a management network (Ethernet and IPMI) and an Infiniband fabric for MPI communication and storage access.
The Infiniband backbone is a fully non-blocking fabric with 200 GB/s data speed. The compute nodes are connected with different data speeds according to the node configuration.
The Infiniband backbone is a fully non-blocking fabric with 200 Gb/s data speed. The compute nodes are connected with different data speeds according to the node configuration.

Revision as of 13:22, 16 October 2023

System Architecture

The bwForCluster Helix is a high-performance supercomputer with a high-speed interconnect. It is composed of login nodes, compute nodes, and parallel storage systems connected by fast data networks. It is connected to the Internet via Baden-Württemberg's extended LAN, BelWü.

Operating System and Software

  • Operating system: RedHat
  • Queuing system: Slurm
  • Access to application software: Environment Modules

Compute Nodes

AMD Nodes

Common features of all AMD nodes:

  • Processors: 2 x AMD Milan EPYC 7513
  • Processor Frequency: 2.6 GHz
  • Number of Cores per Node: 64
  • Local disk: None

                                CPU Nodes             GPU Nodes
Node Type                       cpu        fat        gpu4                   gpu4                    gpu8
Quantity                        332        15         29                     26                      3
Installed Working Memory (GB)   256        2048       256                    256                     2048
Available Memory for Jobs (GB)  236        2010       236                    236                     2010
Interconnect                    1x HDR100  1x HDR100  2x HDR100              2x HDR200               4x HDR200
Coprocessors                    -          -          4x Nvidia A40 (48 GB)  4x Nvidia A100 (40 GB)  8x Nvidia A100 (80 GB)
Number of GPUs                  -          -          4                      4                       8
GPU Type                        -          -          A40                    A100                    A100
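
The per-node figures in the table can be turned into a few derived numbers that are handy when sizing jobs, such as the job memory available per core and the total core and GPU counts of the AMD partition. The following is a minimal Python sketch, not an official tool. The class name, the function name, and the labels "gpu4 (A40)", "gpu4 (A100)" and "gpu8 (A100)" are illustrative assumptions; the numbers are copied from the table.

```python
from dataclasses import dataclass

CORES_PER_NODE = 64  # common to all AMD nodes (see the feature list above)


@dataclass(frozen=True)
class NodeType:
    name: str           # illustrative label, not an official partition name
    quantity: int       # "Quantity" row
    job_memory_gb: int  # "Available Memory for Jobs (GB)" row
    gpus: int           # "Number of GPUs" row, 0 for CPU-only nodes


NODES = [
    NodeType("cpu", 332, 236, 0),
    NodeType("fat", 15, 2010, 0),
    NodeType("gpu4 (A40)", 29, 236, 4),
    NodeType("gpu4 (A100)", 26, 236, 4),
    NodeType("gpu8 (A100)", 3, 2010, 8),
]


def memory_per_core_gb(node: NodeType) -> float:
    """Job memory per core when all cores of a node are used."""
    return node.job_memory_gb / CORES_PER_NODE


total_nodes = sum(n.quantity for n in NODES)
total_cores = total_nodes * CORES_PER_NODE
total_gpus = sum(n.quantity * n.gpus for n in NODES)

if __name__ == "__main__":
    for node in NODES:
        print(f"{node.name:12s} {memory_per_core_gb(node):6.2f} GB per core")
    print(f"nodes={total_nodes} cores={total_cores} gpus={total_gpus}")
```

A job that uses all 64 cores of a standard cpu node can therefore count on roughly 3.7 GB per core, and on roughly 31.4 GB per core on a fat node.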

Intel Nodes

Some Intel nodes (Skylake and Cascade Lake) from the predecessor system will be integrated. Details will follow.


Storage Architecture

There is one storage system providing a large parallel file system based on IBM Spectrum Scale for $HOME, for workspaces, and for temporary job data.

Network

The components of the cluster are connected via two independent networks: a management network (Ethernet and IPMI) and an InfiniBand fabric for MPI communication and storage access. The InfiniBand backbone is a fully non-blocking fabric with a data rate of 200 Gb/s. The compute nodes are attached at different data rates depending on the node configuration (see the Interconnect row of the node table).
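
Because the data rate is given in gigabits per second (Gb/s), not gigabytes per second (GB/s), a short conversion helps when comparing it with file sizes or memory bandwidth. The Python sketch below assumes that HDR100 means 100 Gb/s per link and that "HDR200" in the node table means the full HDR rate of 200 Gb/s per link. It computes nominal link rates only; achievable throughput is lower because of protocol overhead. The dictionary keys and function names are illustrative.

```python
# Nominal per-link rates in Gb/s (assumption: "HDR200" denotes full-rate HDR).
LINK_RATE_GBIT_S = {"HDR100": 100, "HDR200": 200}

# (number of links, link type) per node type, from the Interconnect row.
NODE_LINKS = {
    "cpu": (1, "HDR100"),
    "fat": (1, "HDR100"),
    "gpu4 (A40)": (2, "HDR100"),
    "gpu4 (A100)": (2, "HDR200"),
    "gpu8 (A100)": (4, "HDR200"),
}


def gbit_to_gbyte(gbit_per_s: float) -> float:
    """Convert gigabits per second to gigabytes per second (8 bits per byte)."""
    return gbit_per_s / 8


def aggregate_gbit_s(node_type: str) -> int:
    """Nominal aggregate rate of all InfiniBand links of one node."""
    count, link = NODE_LINKS[node_type]
    return count * LINK_RATE_GBIT_S[link]


if __name__ == "__main__":
    print(f"backbone: 200 Gb/s = {gbit_to_gbyte(200)} GB/s")
    for name in NODE_LINKS:
        rate = aggregate_gbit_s(name)
        print(f"{name:12s} {rate:4d} Gb/s = {gbit_to_gbyte(rate):6.1f} GB/s")
```

Under these assumptions the 200 Gb/s backbone corresponds to a nominal 25 GB/s, and a standard cpu node with one HDR100 link to 12.5 GB/s.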