Calculable black holes

Cluster computers based on Intel Xeon architecture at the Albert Einstein Institute are now twice as fast

February 26, 2004

The cluster / technical data

A high-performance Linux compute cluster is being used. After the extension, the cluster consists of 128 computing nodes. 64 of them have 2 Intel XEON P4 processors each (2.66 GHz) and 2 GB of RAM. The 64 additional nodes also have 2 GB of RAM, but are equipped with 2 Intel XEON processors clocked at 3.06 GHz. Both types have a 120 GB hard drive as local storage space. The original four file servers, each with 1 TB of storage space, have been joined by another four. At the same time the existing ones were expanded, so that 8 x 1.5 TB is now available (1 terabyte = 1,000 gigabytes). For comparison: an ordinary PC has a hard drive capacity of 80-120 GB.
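The totals implied by these figures can be worked out in a few lines. The following is only an illustrative sketch: the variable and function names are made up for this example, and the numbers are taken directly from the paragraph above, using its convention of 1 terabyte = 1,000 gigabytes.

```python
# Totals implied by the figures in the text (names are illustrative only).
GB_PER_TB = 1000  # the article's convention: 1 terabyte = 1,000 gigabytes

COMPUTE_NODES = 64 + 64        # 2.66 GHz nodes plus 3.06 GHz nodes
RAM_PER_NODE_GB = 2            # identical for both node types
LOCAL_DISK_PER_NODE_GB = 120   # identical for both node types


def cluster_totals():
    """Return aggregate memory and storage figures for the extended cluster."""
    return {
        "compute_nodes": COMPUTE_NODES,
        "total_ram_gb": COMPUTE_NODES * RAM_PER_NODE_GB,
        "total_local_disk_tb": COMPUTE_NODES * LOCAL_DISK_PER_NODE_GB / GB_PER_TB,
        "file_server_tb_before": 4 * 1.0,  # four servers with 1 TB each
        "file_server_tb_after": 8 * 1.5,   # eight servers with 1.5 TB each
    }


if __name__ == "__main__":
    for name, value in cluster_totals().items():
        print(f"{name}: {value}")
```

On these figures the file server capacity triples, from 4 TB to 12 TB, while the local disks of the computing nodes add up to roughly 15 TB.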

The core of the high-performance cluster is the network, and in particular the switch that handles inter-process communication. This switch was supplied by the Force10Networks company. Special importance is placed on low latency and delay-free data transmission, both of which are guaranteed by GigaBit Ethernet technology.

Although today Myrinet, a high-performance switching technology, often plays an important role, the choice was made in favour of GigaBit, because it is more or less the standard and therefore also promises favourable extension possibilities in the future and was found to have an optimal price/performance ratio.

Because typical computing runs take several days or even weeks, the runs are administered by a batch system. The users employ management nodes to communicate with the cluster. It is on these nodes that programmes are compiled and the results are visualised. An extremely important part in all computational tasks of scientists at AEI is played by the CACTUS Code, developed by AEI. It is a flexible collection of tools and makes it easy for all scientists involved to formulate problems in a computer-friendly way and to have calculations carried out.

Technical data

64 PC nodes with 2 processors @ 2.66 GHz/533 FSB (front side bus)

64 PC nodes with 2 processors @ 3.06 GHz/533 FSB (front side bus). Processor type: Intel XEON with hyper-threading technology.

Per node:

2 GByte RAM memory

120 GB storage capacity

3 network cards

8 storage nodes, each with 1.5 TB storage capacity

2 head nodes (also called access and management nodes) 

Storage and head nodes are similar to the computing nodes, but have 4 GB of RAM and do not need an interprocess network. The system board and power supply units are designed redundantly.

The operating system Linux with RedHat distribution is installed on all computers.


Each computing node has three network cards for three specific networks. The most important is the interconnect network, which links the computing nodes at 1,000 Mbit/s (1 Gbit/s) through a very powerful switch. A Force10Networks switch is used (for the switch specifications see below). It has a back plane (bus circuit board) capacity of 600 Gbit/s.

The second network, which is also very important, is used to transfer the results of individual nodes to the so-called storage nodes. Because of the enormous data output of the computer nodes, the best approach is to distribute the load over several nodes. For example, the output of 16 nodes is written to one storage node. The network uses an HP ProCurve 4108gl switch. It has a back plane of 36 Gbits. This is sufficient to deal with a load of 4 Mbytes from each [!] computer node at the same time.
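The load on the storage network can be estimated from the figures above. This is a rough sketch under two assumptions that the text does not state explicitly: that the "4 Mbytes" per node is a per-second rate, and that the "36 Gbits" back plane figure is in Gbit/s. All names are illustrative.

```python
# Rough load estimate for the storage network (names are illustrative only).
COMPUTE_NODES = 128
NODES_PER_STORAGE_NODE = 16  # "the output of 16 nodes is written to one storage node"
PER_NODE_MBYTE_S = 4         # assumption: the 4 Mbytes are meant per second
BACKPLANE_GBIT_S = 36        # assumption: the 36 Gbits are meant per second


def storage_network_load():
    """Return (storage nodes needed, MB/s per storage node, total Gbit/s)."""
    storage_nodes = COMPUTE_NODES // NODES_PER_STORAGE_NODE
    per_storage_mbyte_s = NODES_PER_STORAGE_NODE * PER_NODE_MBYTE_S
    total_mbyte_s = COMPUTE_NODES * PER_NODE_MBYTE_S
    total_gbit_s = total_mbyte_s * 8 / 1000  # 8 bits per byte, 1,000 Mbit per Gbit
    return storage_nodes, per_storage_mbyte_s, total_gbit_s


if __name__ == "__main__":
    nodes, per_storage, total = storage_network_load()
    print(f"storage nodes needed: {nodes}")
    print(f"load per storage node: {per_storage} MB/s")
    print(f"total load: {total:.3f} Gbit/s of {BACKPLANE_GBIT_S} Gbit/s")
```

Under these assumptions, 16 computing nodes per storage node gives exactly the 8 storage nodes listed in the technical data, and the combined load of about 4 Gbit/s stays well below the back plane capacity.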

The third network ensures that all components of the cluster can be operated. For this purpose a switch from the HP company has been used. In order to keep the cable length as short as possible, two further switches are used in this network. The manufacturer of these is the 3Com company.

Cooling of the cluster 

Because of the small space required for the equipment, SlashTwo housing was chosen. This packed form requires special consideration of air flows in the housing, since the processors give off an enormous amount of heat that has to be transported away as quickly as possible. The temperature of the ambient air should not exceed 20°C. It is necessary to ensure that an air volume of 4 x 1400 m³ is available and can be recirculated. The existing air conditioning system can handle these values and has an output of approx. 50 kW. The existing ceiling units are available as a reserve for especially sunny days in the summer months and provide an additional 24 kW.

Power supply                                                                                                               

The cluster is supplied by 6 x 25 A lines. A UPS (uninterruptible power supply) ensures a steady power supply to the storage and head nodes for a period of 20 minutes. Special software then ensures that these computers automatically shut down and are turned off.

Further specifications


Per rack (19" cabinet):

Weight: 400 kg with 16 SlashTwo housings

240 kg for the network cabinet                                                                                                         

250 kg for the rack with the storage nodes and head nodes, including the 6 TB storage units


Force10Networks E600                                                                                                                         

Weight: 110 kg                                                                                                                         

Power consumption: 2800 W                                                                                                               

Waste heat: 1400W - 3500W

Network specifications                                                                                                                         

Back plane capacity: 600 Gbits

Special software                                                                                                                            

Although the cluster can be regarded as one unit, it must nonetheless be possible to address the individual components separately by software.

In this regard special software, so-called management software, simplifies matters enormously. For this purpose the Megware company has developed cluster management software by the name of Clustware.

Special features of the Cluster

Peak performance of the Cluster 


1. Cluster half:                                                                                                                                      

128 x 2 x 2.66 GHz = 680 GFlops (128 CPUs, 2 floating point units per CPU, 2.66 GHz per unit)

2. Cluster half:                                                                                                                               

128 x 2 x 3.06 GHz = 783 GFlops (128 CPUs, 2 floating point units per CPU, 3.06 GHz per unit)

Total: 1.46 TFlops                                                                                                                         

The true values will be reflected in the benchmarks.                                                                       

A single PC of the kind used in the cluster (2 CPUs), which is also available as a desktop workstation for individual scientists at the AEI, has a performance level of about 10 GFlops.
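The peak figures quoted above follow from multiplying the number of CPUs, the floating point units per CPU and the clock rate. The short sketch below reproduces that arithmetic. It assumes, as the formula in the text implies, one floating point operation per unit per clock cycle; the function name is made up for this example.

```python
def peak_gflops(cpus: int, fpus_per_cpu: int, clock_ghz: float) -> float:
    """Theoretical peak in GFlops, assuming one operation per unit per cycle."""
    return cpus * fpus_per_cpu * clock_ghz


first_half = peak_gflops(128, 2, 2.66)   # 64 nodes x 2 CPUs at 2.66 GHz
second_half = peak_gflops(128, 2, 3.06)  # 64 nodes x 2 CPUs at 3.06 GHz
total_tflops = (first_half + second_half) / 1000
single_node = peak_gflops(2, 2, 2.66)    # one dual-CPU node from the first half

if __name__ == "__main__":
    print(f"first half:  {first_half:.2f} GFlops")
    print(f"second half: {second_half:.2f} GFlops")
    print(f"total:       {total_tflops:.2f} TFlops")
    print(f"single node: {single_node:.2f} GFlops")
```

The results agree with the rounded values in the text: roughly 681 and 783 GFlops for the two halves, 1.46 TFlops in total, and a little over 10 GFlops for a single dual-CPU node. These are theoretical upper bounds, and measured benchmark values will be lower.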

The communication network is based on Gigabit Ethernet. This technology was chosen on the assumption that the development of Ethernet would continue.

The interprocess switch is already designed for 10 Gigabit Ethernet.
