LUMI’s full system architecture revealed
We are excited to reveal the full specifications of LUMI now that AMD has announced its new generation of AMD Instinct™ MI250X GPUs.
– The AMD MI250X GPU is in a class of its own now and for a long time to come. The technical supremacy and performance per watt were the primary reasons why AMD’s MI250X GPUs were selected for LUMI, explains Pekka Manninen, Director of LUMI Leadership Computing Facility.
The full system architecture of LUMI is as follows:
- The LUMI system is supplied by Hewlett Packard Enterprise (HPE), based on an HPE Cray EX supercomputer.
- The GPU partition (LUMI-G) will consist of 2560 nodes, each with one 64-core AMD Trento CPU and four AMD MI250X GPUs.
- Each GPU node features four 200 Gbit/s network interconnect cards, i.e. has 800 Gbit/s injection bandwidth.
- Each MI250X GPU consists of two compute dies with 110 compute units each, and each compute unit has 64 stream processors, for a total of 14080 stream processors per GPU.
- The committed Linpack performance of LUMI-G is 375 Pflop/s.
- The MI250X GPU comes with a total of 128 GB of HBM2e memory offering over 3.2 TB/s of memory bandwidth.
- A single MI250X card is capable of delivering 42.2 Tflop/s in the HPL benchmark. More in-depth performance results for the card can be found on AMD’s website.
- In addition to the GPU partition, LUMI has a CPU-only partition (LUMI-C) whose nodes feature 64-core 3rd-generation AMD EPYC™ CPUs and between 256 GB and 1024 GB of memory. There are 1536 dual-socket CPU nodes in total. LUMI-C was #5 on the November 2021 Graph500 list and #76 on the November 2021 Top500 list.
- LUMI also has a partition with large memory nodes, with a total of 32 TB of memory in the partition.
- For visualization workloads LUMI has 64 Nvidia A40 GPUs.
- LUMI’s storage system will consist of three components. First, there will be an 8-petabyte all-flash Lustre system for short-term fast access. Second, there is a longer-term, more traditional 80-petabyte Lustre system based on mechanical hard drives. Third, for easy data sharing and project-lifetime storage, LUMI has 30 petabytes of Ceph-based storage.
- All the compute and storage partitions are connected by the Cray Slingshot interconnect, which runs at 200 Gbit/s.
- When completed, LUMI will take up almost 400 m² of space, which is about the size of two tennis courts. The weight of the system is nearly 150 000 kilograms (150 metric tons).
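Several of the figures in the list above follow from one another by simple arithmetic. The Python sketch below is not from the LUMI team. It recomputes the per-GPU stream processor count and the per-node injection bandwidth, and derives a few aggregates that the list does not state (total GPU count, total HBM2e capacity, and a naive sum of the per-card HPL figure). All constant and function names are illustrative assumptions. The naive sum ignores multi-node scaling effects, so it is not a performance prediction and is shown only next to the committed Linpack figure for context.

```python
# Figures quoted in the architecture list; all names here are illustrative.
GPU_NODES = 2560             # nodes in the GPU partition
GPUS_PER_NODE = 4            # MI250X GPUs per node
NICS_PER_NODE = 4            # network interconnect cards per GPU node
NIC_GBIT_S = 200             # bandwidth per card, Gbit/s
DIES_PER_GPU = 2             # compute dies per MI250X
CUS_PER_DIE = 110            # compute units per die
SP_PER_CU = 64               # stream processors per compute unit
HBM_GB_PER_GPU = 128         # HBM2e per MI250X, GB
HPL_TFLOPS_PER_GPU = 42.2    # quoted single-card HPL figure, Tflop/s
COMMITTED_HPL_PFLOPS = 375   # committed Linpack performance, Pflop/s


def derive_figures():
    """Recompute stated totals and derive aggregates from the quoted specs."""
    total_gpus = GPU_NODES * GPUS_PER_NODE
    return {
        # Stated in the list: 14080 stream processors per GPU.
        "stream_processors_per_gpu": DIES_PER_GPU * CUS_PER_DIE * SP_PER_CU,
        # Stated in the list: 800 Gbit/s injection bandwidth per node.
        "injection_gbit_s_per_node": NICS_PER_NODE * NIC_GBIT_S,
        # Derived (not stated in the list): GPUs across the partition.
        "total_gpus": total_gpus,
        # Derived: aggregate HBM2e in TB, using 1 TB = 1000 GB.
        "total_hbm_tb": total_gpus * HBM_GB_PER_GPU / 1000,
        # Derived: naive sum of per-card HPL figures, in Pflop/s.
        "summed_card_hpl_pflops": total_gpus * HPL_TFLOPS_PER_GPU / 1000,
    }


if __name__ == "__main__":
    for name, value in derive_figures().items():
        print(f"{name}: {value}")
    print(f"committed_hpl_pflops: {COMMITTED_HPL_PFLOPS}")
```

Running the script prints each derived value alongside the committed figure, which makes it easy to see which numbers in the list are independent inputs and which are products of the others.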
Schedule for the second phase deployment
Unfortunately, the second phase installations will start later than anticipated. The delays are due to issues in global microelectronics supply chains.
The second deployment phase will proceed as follows:
- March 2022: install the first 6 cabinets (140 Pflop/s)
- March-May 2022: build up the system in four batches of 6 cabinets each
- November 2022: pilot use phase
- December 2022: general availability