Computer clusters at the AEI
General purpose high performance computer clusters
The AEI has operated high performance computing (HPC) clusters since 2003. The first cluster, Peyote, was installed in 2003 and was ranked 395 in the Top 500 list. In 2005 the HPC cluster Belladonna was installed; it was superseded in 2007 by Damiana (rank 192 in the Top 500 list of 2007). In 2011 the Datura cluster, with 200 compute nodes based on Intel CPUs, was installed.
pic: AEI / Armin Okulla
pic: © K. Zilker (MPCDF)
In 2016, the HPC cluster Minerva was installed. Minerva is primarily used for numerical-relativity simulations of coalescing black holes and neutron stars in order to calculate the resulting gravitational-wave radiation. The cluster was ranked 463rd in the Top 500 list with 365.0 TFlop/s. It consists of 594 compute nodes (dual-socket, 8-core Intel Haswell E5-2630v3, 2.40 GHz) with a total of 9504 CPU cores and 4 GB of RAM per core. The cluster has two storage systems running BeeGFS with about 500 TB of disk space.
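The figures quoted for Minerva can be cross-checked with a few lines of arithmetic. The sketch below recomputes the core count and total memory from the node specification, and estimates a theoretical peak rate. The value of 16 double-precision floating-point operations per core per cycle is an assumption (typical for Haswell-generation CPUs with AVX2 and fused multiply-add) and is not stated in the text above; the function names are purely illustrative.

```python
# Cross-check of the Minerva figures quoted in the text.
NODES = 594
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 8
CLOCK_HZ = 2.40e9
RAM_PER_CORE_GB = 4

# Assumption (not from the source): 16 double-precision FLOPs
# per core per clock cycle on Haswell with AVX2 + FMA.
FLOPS_PER_CYCLE = 16


def total_cores() -> int:
    """Number of CPU cores in the whole cluster."""
    return NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET


def total_ram_gb() -> int:
    """Total main memory in GB, at 4 GB per core."""
    return total_cores() * RAM_PER_CORE_GB


def peak_tflops() -> float:
    """Theoretical peak in TFlop/s under the FLOPs-per-cycle assumption."""
    return total_cores() * CLOCK_HZ * FLOPS_PER_CYCLE / 1e12


if __name__ == "__main__":
    print(total_cores())            # 9504
    print(total_ram_gb())           # 38016
    print(round(peak_tflops(), 1))  # 365.0
```

Under this assumption the estimate agrees with the 365.0 TFlop/s quoted above, which suggests that the quoted figure is a theoretical peak rather than a measured benchmark result.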
In 2019, Sakura, an HPC cluster located at the Max Planck Computing and Data Facility (MPCDF), began operation. Numerical-relativity simulations of astrophysical events that generate both gravitational waves and electromagnetic radiation (e.g. mergers of neutron stars) are run on Sakura. The cluster has 11,600 CPU cores and is connected by a fast Omni-Path 100 network and 10 Gb Ethernet. It consists of head nodes with 10-core Intel Xeon Silver processors and 192 GB to 384 GB of main memory, as well as compute nodes with Intel Xeon Gold 6148 CPUs.
High throughput data analysis computer clusters
The Atlas Computing Cluster at the AEI in Hannover is the world's largest and most powerful resource dedicated to gravitational-wave searches and data analysis.
Atlas was officially launched in May 2008 with 1344 quad-core compute nodes running at 2.4 GHz CPU clock speed. One month later it was ranked number 58 on the June 2008 Top-500 list of the world’s fastest computers. At that time it was the sixth fastest computer in Germany.
Atlas was also the world’s fastest computer that used Ethernet as the networking interconnect. This is notable since Ethernet is a relatively inexpensive networking technology. The faster machines on the Top-500 list all used costlier interconnects such as InfiniBand or proprietary technologies. Worldwide, Atlas led in price/performance. In recognition of this, Atlas received an InfoWorld 100 award for being one of the 100 best IT solutions for 2008.
Currently, Atlas consists of more than 2,500 compute servers, with about 40,000 logical central processing unit (CPU) cores in total and about 2,000 graphics processing units (GPUs). Atlas can store 5.5 petabytes on hard drives and 15 petabytes on magnetic tape for data archiving. Its peak computing power is on the order of one PetaFLOP/s. A total of 15 kilometers of Ethernet cable connects the compute nodes. The total bandwidth is 20 Terabit/s.
In addition to the Atlas cluster in Hannover, the AEI operates a computer cluster in Potsdam that is available to scientists at the AEI and their collaborators worldwide. Following Merlin (2002-2008), Morgane (2007-2014), and Vulcan (2013-2019), Hypatia has provided about 9,000 processor cores (in 16-core AMD EPYC CPUs) since 2019. Hypatia is dedicated to the analysis of gravitational-wave data for discoveries about the properties of black holes, neutron stars and other potential gravitational-wave sources.
Hypatia is a general purpose gravitational-wave data analysis cluster. Within the LIGO-Virgo collaboration, it is used for devising and testing new methods, large-scale Monte Carlo simulations, waveform development and the study of systematic biases. As part of the LISA consortium, AEI scientists currently participate in developing the science case for LISA. Within the European Pulsar Timing Array collaboration, AEI researchers search for gravitational-wave signals from the population of supermassive black hole binaries in the nanohertz band.
Like Atlas at AEI Hannover, Hypatia has been designed for High Throughput Computing (HTC) and is best suited for running many, mostly independent, tasks in parallel. It is built from commodity parts and uses a Gigabit Ethernet network. Running processes can allocate up to 34 TB of memory in total, and the cluster has 400 TB of disk storage. HTCondor is used as the batch scheduler. In the larger context of the LIGO Scientific Collaboration (LSC), parts of Hypatia also serve as testbeds for software as well as for alternative processor architectures, storage concepts, and operating systems.
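To illustrate what "many, mostly independent, tasks" means in practice, here is a minimal sketch of a Monte Carlo workload split into self-contained tasks. It is not taken from any AEI software. The function names and the toy problem (estimating pi) are hypothetical, and a local thread pool merely stands in for the batch scheduler. On a cluster, each task would typically be a separate job that receives its index from the scheduler (in HTCondor submit files the `$(Process)` macro is commonly used for this).

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_task(task_index: int, n_samples: int = 100_000) -> float:
    """One independent task: a Monte Carlo estimate of pi.

    The task index doubles as the random seed, so every task is
    reproducible and needs no communication with any other task.
    """
    rng = random.Random(task_index)
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_samples


def run_all(n_tasks: int) -> list:
    """Run all tasks; the thread pool is a local stand-in for a scheduler."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(run_task, range(n_tasks)))


if __name__ == "__main__":
    estimates = run_all(8)
    # Results are combined only after all tasks have finished.
    print(sum(estimates) / len(estimates))
```

Because the tasks share no state, throughput scales with the number of available cores and a slow interconnect is rarely the bottleneck, which is why commodity Ethernet is adequate for this kind of workload.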