Japan’s Arm-Powered Fugaku Supercomputer Wins TOP500 Crown

The 55th edition of the TOP500 saw some significant additions to the list, spearheaded by a new number one system from Japan. The latest rankings also reflect a steady growth in aggregate performance and power efficiency.

The new top system, Fugaku, turned in a High Performance Linpack (HPL) result of 415.5 petaflops, besting the now second-place Summit system by a factor of 2.8. Fugaku is powered by Fujitsu’s 48-core A64FX SoC, making it the first number one system on the list to be powered by Arm processors. In single or further reduced precision, which are often used in machine learning and AI applications, Fugaku’s peak performance is over 1,000 petaflops (1 exaflops). The new system is installed at the RIKEN Center for Computational Science (R-CCS) in Kobe, Japan.

Number two on the list is Summit, an IBM-built supercomputer that delivers 148.8 petaflops on HPL. The system has 4,356 nodes, each equipped with two 22-core Power9 CPUs and six NVIDIA Tesla V100 GPUs. The nodes are connected with a Mellanox dual-rail EDR InfiniBand network. Summit is running at Oak Ridge National Laboratory (ORNL) in Tennessee and remains the fastest supercomputer in the US.

At number three is Sierra, a system at the Lawrence Livermore National Laboratory (LLNL) in California achieving 94.6 petaflops on HPL. Its architecture is very similar to Summit’s, with two Power9 CPUs and four NVIDIA Tesla V100 GPUs in each of its 4,320 nodes. Sierra employs the same Mellanox EDR InfiniBand as the system interconnect.

Sunway TaihuLight, a system developed by China’s National Research Center of Parallel Computer Engineering & Technology (NRCPC), drops to number four on the list. The system is powered entirely by Sunway 260-core SW26010 processors. Its HPL mark of 93 petaflops has remained unchanged since it was installed at the National Supercomputing Center in Wuxi, China in June 2016.

At number five is Tianhe-2A (Milky Way-2A), a system developed by China’s National University of Defense Technology (NUDT). Its HPL performance of 61.4 petaflops is the result of a hybrid architecture employing Intel Xeon CPUs and custom-built Matrix-2000 coprocessors. It is deployed at the National Supercomputer Center in Guangzhou, China.

A new system on the list, HPC5, captured the number six spot, turning in an HPL performance of 35.5 petaflops. HPC5 is a PowerEdge system built by Dell and installed by the Italian energy firm Eni S.p.A., making it the fastest supercomputer in Europe. It is powered by Intel Xeon Gold processors and NVIDIA Tesla V100 GPUs and uses Mellanox HDR InfiniBand as the system network.

Another new system, Selene, is in the number seven spot with an HPL mark of 27.58 petaflops. It is a DGX SuperPOD, powered by NVIDIA’s new “Ampere” A100 GPUs and AMD’s EPYC “Rome” CPUs. Selene is installed at NVIDIA in the US. It too uses Mellanox HDR InfiniBand as the system network.

Frontera, a Dell C6420 system installed at the Texas Advanced Computing Center (TACC) in the US, is ranked eighth on the list. Its 23.5 HPL petaflops is achieved with 448,448 Intel Xeon cores.

The second Italian system in the top 10 is Marconi-100, which is installed at the CINECA research center. It is powered by IBM Power9 processors and NVIDIA V100 GPUs, employing dual-rail Mellanox EDR InfiniBand as the system network. Marconi-100’s 21.6 petaflops earned it the number nine spot on the list.

Rounding out the top 10 is Piz Daint at 19.6 petaflops, a Cray XC50 system installed at the Swiss National Supercomputing Centre (CSCS) in Lugano, Switzerland. It is equipped with Intel Xeon processors and NVIDIA P100 GPUs.

General highlights
Aggregate list performance is now 2.23 exaflops, up from 1.65 exaflops six months ago. The majority of that increase is the result of the new number one Fugaku supercomputer. The new entry point on the list (system number 500) is 1.24 petaflops, only a slight increase from the previous list. Overall the number of new systems in the list is only 51, a record low since the beginning of the TOP500 in 1993.
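
The claim that Fugaku accounts for most of the increase can be checked with a quick calculation. The following is a minimal sketch in Python using only the rounded figures quoted in this article; the function and variable names are illustrative, and the result is approximate because the aggregate change is a net figure that also reflects systems entering and leaving the list.

```python
# Figures as quoted in the article (rounded).
AGGREGATE_NOW_EF = 2.23    # aggregate list performance, exaflops
AGGREGATE_PREV_EF = 1.65   # aggregate six months earlier, exaflops
FUGAKU_HPL_PF = 415.5      # Fugaku's HPL result, petaflops


def share_of_increase(now_ef, prev_ef, system_pf):
    """Return one system's HPL result as a fraction of the net aggregate increase."""
    increase_pf = (now_ef - prev_ef) * 1000.0  # exaflops -> petaflops
    return system_pf / increase_pf


increase_pf = (AGGREGATE_NOW_EF - AGGREGATE_PREV_EF) * 1000.0
share = share_of_increase(AGGREGATE_NOW_EF, AGGREGATE_PREV_EF, FUGAKU_HPL_PF)

print(f"Net increase: {increase_pf:.0f} petaflops")
print(f"Fugaku's share of the increase: {share:.1%}")
```

With these inputs, Fugaku's 415.5 petaflops works out to roughly 72 percent of the 580-petaflops net increase.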

China continues to dominate the TOP500 with regard to system count, claiming 226 supercomputers on the list. The US is number two with 114 systems; Japan is third with 30; France has 18; and Germany claims 16. Despite coming in second on system count, the US continues to edge out China in aggregate list performance with 644 petaflops to China’s 565 petaflops. Japan, with its significantly smaller system count, delivers 530 petaflops.
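
The gap between system count and aggregate performance is easier to see as an average per system. The sketch below, in Python, uses only the counts and petaflops totals quoted above; the names are illustrative, and the averages hide the fact that a single large system (Fugaku, in Japan's case) can dominate a country's total.

```python
# (system count, aggregate HPL petaflops) as quoted in the article.
countries = {
    "China": (226, 565.0),
    "US": (114, 644.0),
    "Japan": (30, 530.0),
}


def average_pf_per_system(count, total_pf):
    """Return the mean HPL petaflops per listed system."""
    return total_pf / count


averages = {
    name: average_pf_per_system(count, total_pf)
    for name, (count, total_pf) in countries.items()
}

# Print from highest to lowest average.
for name, avg in sorted(averages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {avg:.1f} petaflops per system")
```

On these figures, Japan averages about 17.7 petaflops per system, the US about 5.6, and China 2.5.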

Technology trends
A total of 144 systems on the list are using accelerators or coprocessors, which is nearly the same as the 145 reported six months ago. As has been the case in the past, the majority of the systems equipped with accelerators or coprocessors (135) are using NVIDIA GPUs.

The x86 continues to be the dominant processor architecture, being present in 481 of the 500 systems. Intel claims 469 of these, with AMD installed in 11 and Hygon in the remaining one. Arm processors are present in just four TOP500 systems, three of which employ the new Fujitsu A64FX processor, with the remaining one powered by Marvell’s ThunderX2 processor.

The breakdown of system interconnect share is largely unchanged from six months ago. Ethernet is used in 263 systems, InfiniBand is used in 150, and the remainder employ custom or proprietary networks. Despite Ethernet’s dominance in sheer numbers, those systems account for 471 petaflops, while InfiniBand-based systems provide 803 petaflops. Due to their use in some of the list’s most powerful supercomputers, systems with custom and proprietary interconnects together represent 790 petaflops.

Vendor highlights
Chinese manufacturers dominate the list in the number of installations, with Lenovo (180), Sugon (68) and Inspur (64) accounting for 312 of the 500 systems. HPE claims 37 systems, while Cray/HPE has 35 systems. Fujitsu is represented by just 13 systems, but thanks to its number one Fugaku supercomputer, the company leads the list in aggregate performance with 478 petaflops. Lenovo, with 180 systems, comes in second in performance with 355 petaflops.

Green500 results
The most energy-efficient system on the Green500 is the MN-3, based on a new server from Preferred Networks. It achieved a record 21.1 gigaflops/watt during its 1.62 petaflops performance run. The system derives its superior power efficiency from the MN-Core chip, an accelerator optimized for matrix arithmetic. It is ranked number 395 in the TOP500 list.
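
The two MN-3 figures together imply an average power draw for the benchmark run. The following Python sketch divides the quoted performance by the quoted efficiency; the function name is illustrative, and the result is only an estimate because both inputs are rounded.

```python
def implied_power_kw(hpl_petaflops, gigaflops_per_watt):
    """Estimate average power in kilowatts from performance and efficiency."""
    gigaflops = hpl_petaflops * 1_000_000  # 1 petaflops = 10^6 gigaflops
    watts = gigaflops / gigaflops_per_watt
    return watts / 1000.0


# MN-3 figures as quoted in the article.
mn3_kw = implied_power_kw(1.62, 21.1)
print(f"MN-3 implied average power: {mn3_kw:.1f} kW")
```

This puts MN-3 at roughly 77 kilowatts during its run.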

In second position is the new NVIDIA Selene supercomputer, a DGX A100 SuperPOD powered by the new A100 GPUs. It occupies position seven on the TOP500.

In third position is the NA-1 system, a PEZY Computing/Exascaler system installed at NA Simulation in Japan. It achieved 18.4 gigaflops/watt and is at position 470 on the TOP500.

The number nine system on the Green500 is the top-performing Fugaku supercomputer, which delivered 14.67 gigaflops/watt. It is just behind Summit, which achieved 14.72 gigaflops/watt.

HPCG results
The TOP500 list has incorporated the High-Performance Conjugate Gradient (HPCG) benchmark results, which provide an alternative metric for assessing supercomputer performance and are meant to complement the HPL measurement.

The number one TOP500 supercomputer, Fugaku, is also now the leader on the HPCG benchmark with a record 13.4 HPCG-petaflops. The two US Department of Energy systems, Summit at ORNL and Sierra at LLNL, are now second and third, respectively, on the HPCG benchmark. Summit achieved 2.93 HPCG-petaflops and Sierra 1.80 HPCG-petaflops. All the remaining systems achieved less than one HPCG-petaflops.

The first version of what became today’s TOP500 list started as an exercise for a small conference in Germany in June 1993. Out of curiosity, the authors decided to revisit the list in November 1993 to see how things had changed. About that time, they realized they might be onto something and decided to continue compiling the list, which is now a much-anticipated, much-watched and much-debated twice-yearly event.

The TOP500 list is compiled by Erich Strohmaier and Horst Simon of Lawrence Berkeley National Laboratory; Jack Dongarra of the University of Tennessee, Knoxville; and Martin Meuer of ISC Group, Germany.

 
