There has been some news recently about the Sunway TaihuLight supercomputer, which now tops the list of the 500 fastest supercomputers with 93 PFLOPS achieved on Linpack, and comprises 40,960 Sunway SW26010 260-core “ShenWei” processors designed in China. Another interesting development is that ARMv8 processors are also slowly coming to supercomputers, starting with the TianHe-2 supercomputer, which currently uses Intel Xeon & Xeon Phi processors and is second in the list. According to a report on Vrworld, the US government decided to block US companies’ sales (i.e. Intel and AMD) to China as they were not at the top anymore, and also blocked Chinese investments into Intel and AMD, so the Chinese government decided to do it on its own, and is currently adding Phytium Mars 64-core 64-bit ARM processors to expand TianHe-2 processing power. Once the upgrade is complete, TianHe-2 should have 32,000 Xeons (as currently), 32,000 ShenWei processors, and 96,000 Phytium accelerator cards delivering up to 300 PFLOPS.
Another report on The Register explains that the next generation of the K-Computer, which currently uses Fujitsu SPARC64 processors, will instead feature Fujitsu ARMv8 processors in the Post-K supercomputer in 2020, delivering up to 1000 PFLOPS (or 1 ExaFLOPS). Details are sparse right now, but we do know Fujitsu “has optimized the processor’s design to accelerate math, and squeeze the most of the die caches, hardware prefetcher and its Tofu interconnect”.
More details will likely be offered during the “Towards Extreme-Scale Weather/Climate Simulation: The Post K Supercomputer & Our Challenges” presentation at ISC 2016 in Frankfurt, Germany later today.
Thanks to Sanders and Nicolas.
Jean-Luc started CNX Software in 2010 as a part-time endeavor, before quitting his job as a software engineering manager, and starting to write daily news, and reviews full time later in 2011.
5 Replies to “ARMv8 64-bit Processors To Replace Intel Xeon and SPARC64 Processors in Some Supercomputers”
So … 1 Exa-FLOPS with ARMv8 cores … how many (current) cores do you need?
http://www.anandtech.com/show/8718/the-samsung-galaxy-note-4-exynos-review/4 says about 0.5 GFLOP per A53-core.
So: 1E18 / 0.5E9 = 2E9 = 2.000.000.000 cores … ? Ouch.
Or are they quoting GPU GFLOPS?
Aside from the bit where they’re not using current cores, you’re comparing hypothetical specs (target on-paper supercomputer performance) vs measured performance of the second-weakest aarch64 core (A35 being the weakest of all). But if you’re really curious, in non-memory-bound scenarios A53 does north of 2 flops/clock for fp32.
Please have a look at several Fujitsu presentations from the HotChips conference. Basically, their SPARC64 has an enhanced ISA with instructions for computation in SIMD/vector mode. Although ARMv8 provides NEON, I would guess Fujitsu will still use their vector coprocessors to reach 1 EFLOPS.
The supercomputer at the top of the list currently has over 10 million cores, and as others have mentioned, the Fujitsu ARMv8 implementation is likely to be several times faster than a Cortex A53. Cortex A72 is already roughly 3 times faster. So the total number of cores should be significantly lower than the 2 billion you’ve found.
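For anyone who wants to redo the back-of-the-envelope numbers from this thread, here is a minimal Python sketch. It only uses figures quoted above (0.5 GFLOPS per Cortex-A53 core, the "roughly 3 times faster" ratio for Cortex-A72, and TaihuLight's 93 PFLOPS across 40,960 processors with 260 cores each). The helper name `cores_needed` is made up for illustration, and the result is a naive division that ignores vector units, clock speed, interconnect and memory limits, so treat it as a rough order-of-magnitude check rather than a prediction of the Post-K design.

```python
import math

EXAFLOPS = 1e18  # 1000 PFLOPS, the Post-K target quoted in the article


def cores_needed(target_flops, flops_per_core):
    """Whole number of cores needed to reach target_flops (naive division)."""
    return math.ceil(target_flops / flops_per_core)


# Cortex-A53 figure from the comment above: about 0.5 GFLOPS per core
a53_cores = cores_needed(EXAFLOPS, 0.5e9)

# Same estimate if each core is roughly 3x faster (the A72 ratio mentioned above)
a72_cores = cores_needed(EXAFLOPS, 3 * 0.5e9)

# TaihuLight as a reference point: 40,960 processors x 260 cores, 93 PFLOPS on Linpack
taihulight_cores = 40_960 * 260
taihulight_flops_per_core = 93e15 / taihulight_cores

# Cores needed for 1 EFLOPS at TaihuLight's achieved per-core Linpack rate
taihulight_like_cores = cores_needed(EXAFLOPS, taihulight_flops_per_core)

print(f"A53-class cores:        {a53_cores:,}")
print(f"A72-class cores (3x):   {a72_cores:,}")
print(f"TaihuLight cores:       {taihulight_cores:,}")
print(f"TaihuLight GFLOPS/core: {taihulight_flops_per_core / 1e9:.2f}")
print(f"TaihuLight-class cores: {taihulight_like_cores:,}")
```

The TaihuLight line matches the "over 10 million cores" mentioned above, and its achieved per-core rate is well above the A53 figure, which is why the 2 billion estimate is likely far too pessimistic.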
You can also read The Linley Group report “X-Gene 3 Challenges Xeon E5”. ARM solutions are now entering the Xeon E5 category (in per-thread performance and overall silicon performance). X-Gene 3 should be a major step forward, and hopefully X-Gene 3 XL (64-core) will push it further. There are also Cavium ThunderX2 and other chips. All of these would be targeted at cloud or data analytics solutions. China had announced a couple of ARMv8 HPC SoCs, but there is no news on whether those have been taped out, fabbed, or are already running. Within 1-2 years we will start seeing competitive ARM products at preferred performance levels.
In some regions it will be easier to build and/or buy ARM-based server products.