Applied Micro X-Gene (64-bit ARM) vs Intel Xeon (64-bit x86) Performance and Power Usage

A group of researchers at CERN has evaluated the Applied Micro X-Gene 1 64-bit ARM XC-1 development board against Intel Xeon E5-2650 and Xeon Phi SE10/7120 systems. One of them, David Abdurachmanov, presented their findings at the ACAT ’14 conference (Advanced Computing and Analysis Techniques), listing some of the issues they ran into while porting their software to 64-bit ARM and comparing the performance efficiency of the three systems for data processing of High Energy Physics (HEP) experiments like those at the Large Hadron Collider (LHC). Performance-per-watt is important there, as the computing systems may scale to several hundred thousand cores.

The Intel Xeon Phi platform, based on the Many Integrated Core (MIC) architecture, was launched for the HPC market and, contrary to the table above, features 61 physical cores. The Applied Micro X-Gene 1 (40 nm process) was used instead of the X-Gene 2, built on a 28 nm process, which was not available at the time. The ARM platform ran Fedora 19, whereas the Intel processor used Scientific Linux CERN 6.5.

The researchers ran CERN’s CMSSW applications for testing. Let’s jump to the results.

[Chart: APM X-Gene 1 vs Intel Xeon]

As expected, the Intel Xeon processor and the Xeon Phi coprocessor both deliver more performance than the X-Gene 1 ARM SoC.

[Chart: X-Gene vs Intel Xeon and Xeon Phi, performance per watt]

However, when it comes to performance-per-watt, the APM X-Gene 1 is clearly ahead of the Intel Xeon E5-2650, and there’s no comparison against the Xeon Phi systems.
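To make the metric concrete, here is a minimal Python sketch of how performance-per-watt is computed: throughput divided by power draw. The function name `perf_per_watt`, the system labels, and all numbers are hypothetical placeholders chosen for illustration; they are not measurements from the report.

```python
def perf_per_watt(events_per_second: float, watts: float) -> float:
    """Return throughput per watt, in events per second per watt."""
    if watts <= 0:
        raise ValueError("power draw must be positive")
    return events_per_second / watts


# Placeholder (throughput in events/s, power in W) pairs.
# These are NOT figures from the CERN report.
systems = {
    "system_a": (1.0, 40.0),   # slower, but draws little power
    "system_b": (3.0, 150.0),  # faster, but draws much more power
}

efficiency = {
    name: perf_per_watt(eps, watts) for name, (eps, watts) in systems.items()
}
best = max(efficiency, key=efficiency.get)

for name, value in efficiency.items():
    print(f"{name}: {value:.4f} events/s/W")
print("most efficient:", best)
```

With these placeholder values the slower system comes out ahead on efficiency, which is the kind of result the chart describes: lower absolute performance can still mean more work per watt.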

The conclusion of the report reads as follows:

We have built the software used by the CMS experiment at CERN, as well as portions of the OSG software stack, for ARMv8 64-bit. It has been made available in the official CMS software package repository and via the CVMFS distributed file system used by Grid sites.

Our initial validation has demonstrated that APM X-Gene 1 Server-on-Chip ARMv8 64-bit solution is a relevant and potentially interesting platform for heterogeneous high-density computing. In the absence of platform specific optimizations in the ARMv8 64-bit GCC compiler used, APM X-Gene 1 shows excellent promise that the APM X-Gene hardware will be a valid competitor to Intel Xeon in term of power efficiency as the software evolves. However, Intel Xeon Phi is a completely different category of product. As APM X-Gene 2 is being sampled right now, built on the TSMC 28nm process, we look forward to extending our work to include it into our comparison.

You can read the full report “Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi” for details.



What about the purchase cost/event capacity – does that favour ARM too? How similar are these tasks to say web-serving?

(I don’t dare to even download something titled “Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi” 😉)


Do Xeon Phi SE10/7120 systems support 244 threads or is that a typo?

8 x 4 (eight times four) is not 244, it is only 32


Just looking at the yellow (ARM) and the red (Xeon E5) lines, I can say that 1.5/30 is less than 4/60. So is the ARM so efficient? Then, you’re just counting the CPU; plus, if you need 2 ARM machines for one Intel you’re even less efficient. Rather strange conclusions coming from CERN? Or do I need to go back to school?




I think they refer to the potential efficiency benefit from the now sampling 28nm cores.



Angel Genchev

I need a real world “performance per energy” comparison. I mean (total_work_done)/(total_Energy_consumed).
Example: GZIP given, say, a 100MB XML file. Total energy used for this work was, say, 0.5Wh with this CPU, 0.3Wh for that. Because it’s the energy, not the power, that we pay for.
Also strain factors like the I/O subsystem should be eliminated, e.g. use a RAM disk.
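
The metric proposed in this comment can be sketched in a few lines of Python. This is an illustration only: the helper names `energy_wh` and `work_per_energy` are hypothetical, the power readings are made up, and the 100 MB, 0.5 Wh and 0.3 Wh figures are the commenter’s own "say" values rather than measurements.

```python
def energy_wh(power_samples_w, interval_s):
    """Approximate energy in watt-hours from evenly spaced power readings.

    Uses the trapezoidal rule: each interval contributes the average of its
    two endpoint readings (W) multiplied by the interval length (s).
    """
    if len(power_samples_w) < 2:
        raise ValueError("need at least two power readings")
    joules = sum(
        (a + b) / 2 * interval_s
        for a, b in zip(power_samples_w, power_samples_w[1:])
    )
    return joules / 3600.0  # 1 Wh = 3600 J


def work_per_energy(work_done, energy_used_wh):
    """Return work done per watt-hour (e.g. MB compressed per Wh)."""
    if energy_used_wh <= 0:
        raise ValueError("energy must be positive")
    return work_done / energy_used_wh


# The commenter's illustrative figures: the same 100 MB job on two CPUs.
cpu_a = work_per_energy(100, 0.5)  # MB per Wh on the first CPU
cpu_b = work_per_energy(100, 0.3)  # MB per Wh on the second CPU
print(f"CPU A: {cpu_a:.1f} MB/Wh, CPU B: {cpu_b:.1f} MB/Wh")

# Hypothetical meter readings: a steady 60 W sampled every 30 s for one minute.
print(f"{energy_wh([60.0, 60.0, 60.0], 30.0):.2f} Wh")
```

Dividing work by energy rather than by instantaneous power means a CPU that draws more power but finishes sooner is not penalised, which is the point the comment makes.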


[…] year, the CERN published a paper comparing Applied Micro X-Gene (64-bit ARM) vs Intel Xeon (64-bit x86) Performance and Power Usage, and they’ve now added IBM Power 8 and Intel Atom Avoton C2750 processor to the mix in a new […]