
HiSilicon D02 Server Board Supports up to 64 ARM Cortex A57 Cores

February 16th, 2015

HiSilicon has showcased their latest server SoC and board at Linaro Connect Hong Kong 2015. The board takes up to two processors, each with up to 32 Cortex A57 cores @ 2.1GHz, and offers 8 DDR3 DIMM slots (up to 128 GB RAM), 12 SATA ports, 4 PCIe slots, and 10GbE / GbE ports.

D02 board specifications:

  • SoC – HiSilicon PhosphorV660 Hip05 with 16 to 32 ARM Cortex-A57 cores @ up to 2.1GHz and 1MB L2 cache/cluster, 32MB L3 cache
  • System Memory – 2x memory channels, 4x DDR3 DIMM slots per processor
  • Storage
    • 12x SAS 3.0 ports @ 12 Gbps (8 for the first processor, 4 for the second). SAS ports are compatible with SATA drives. You may want to read the SAS vs SATA post for more details about SAS.
    • 2x SPI Flash 158Mb BIOS/UEFI
    • 1Gb NorFlash
  • Connectivity – 2x 10/100/1000 Mbit/s Gigabit Ethernet ports, 2x xGE SFP+ ports (10Gb/s)
  • Expansion – 2x 8x PCI Express interfaces per processor (4 in total)
  • USB – 1x USB 2.0 host port
  • Debugging – 1x UART interface, 1x ARM Tracer connector, 1x JTAG interface
  • Misc – RTC battery
  • Power – ATX power supply
  • Dimensions – 305 x xyz mm (SSI-EEB/E-ATX Compatible). xyz = 330, 257, 272, 264, or 267 (Not sure yet)

The board can run Ubuntu, Debian, OpenSUSE, or Fedora. The company has released a hacking manual for the D02 board, where you can find more details and learn how to build the kernel and hack around with Grub and UEFI, among other things.

For example, provided you’ve already installed the right development tools, including an AArch64 toolchain, you should be able to build the kernel for the board as follows:
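A minimal sketch of a generic arm64 cross-build, not the exact commands from the D02 hacking manual. It assumes the kernel sources are already checked out in ./linux (a hypothetical path) and that a toolchain with the aarch64-linux-gnu- prefix is on the PATH. The board-specific repository, branch, configuration and device tree are not shown here and should be taken from the manual. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Generic arm64 kernel cross-build sketch (assumption: not the D02 manual's
# exact procedure). KERNEL_DIR and the toolchain prefix are hypothetical
# defaults; override them through the environment.
set -eu

KERNEL_DIR="${KERNEL_DIR:-./linux}"                   # assumed source location
CROSS_COMPILE="${CROSS_COMPILE:-aarch64-linux-gnu-}"  # assumed toolchain prefix
JOBS="${JOBS:-4}"                                     # parallel make jobs
DRY_RUN="${DRY_RUN:-1}"                               # 1 = print only, 0 = execute

# Print a command, and execute it only when DRY_RUN is not 1.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

build_kernel() {
    # Generate the default arm64 configuration.
    run make -C "$KERNEL_DIR" ARCH=arm64 CROSS_COMPILE="$CROSS_COMPILE" defconfig
    # Build the uncompressed kernel image and the device tree blobs.
    run make -C "$KERNEL_DIR" ARCH=arm64 CROSS_COMPILE="$CROSS_COMPILE" -j"$JOBS" Image dtbs
}

build_kernel
```

Running the script with DRY_RUN=0 executes the two make invocations instead of only printing them.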

Binary files can also be downloaded directly from https://github.com/hisilicon/d02_binary.

Charbax filmed a demo of the board running Ubuntu, Linaro LAVA server, and LXC (Linux Containers). The board currently comes with a Hip05 SoC with 16 Cortex A57 cores, but in a couple of months the version with 32 cores will come out, and Linaro engineers working on ARM64 servers should get their hands on several boards.

Via ARMdevices.net

  1. kcg
    February 16th, 2015 at 15:44 | #1

    Sigh to see 10GigE, this usually makes prices much higher… Otherwise very nice board. Something like that may perhaps even replace an x64 devel workstation, with some enthusiasm to tolerate the lower performance of such a solution. 🙂

  2. anon
    February 16th, 2015 at 18:27 | #2

    If those really are stock Cortex-A57 cores, those chips will perform way faster than those ThunderX processors that are using crippled 2-issue pipelines, the stock A57 has 3-issue, and of course those X-Gene processors have 4-issue pipelines… Sure the Nvidia K1 Denver has 7-issue pipeline…

    Stock Cortex-A57 has faster performance (IPC) than AMD Steamroller (yes, the notorious 1st gen) and little slower than Haswell-E (with those ring busses adding latency over smaller socket Haswell)… Of course all depends on the internal interconnects implemented, and the memory bandwidth, and those tend to lack on ARM architecture.

    The quick analysis says that the ThunderX is best as network processor (doing simple things like OSI layer processing), X-Gene/K1 for HPC, and basic A57 (like the AMD Opteron A1100) somewhere in between those extremes.

  3. kcg
    February 17th, 2015 at 05:04 | #3

    anon :
    If those really are stock Cortex-A57 cores, those chips will perform way faster than those ThunderX processors that are using crippled 2-issue pipelines, the stock A57 has 3-issue, and of course those X-Gene processors have 4-issue pipelines… Sure the Nvidia K1 Denver has 7-issue pipeline…

    You are comparing apples and oranges here. ThunderX is 2-issue, but it is also an in-order design, which makes it similar to Cortex-A8. A57 is similar to A15 (except more regs), that is, a 3-issue out-of-order design. As for X-Gene, I’m not sure if it is out-of-order or in-order, but I hope it is OoO. And finally Denver is 7-issue, but that is 7 instructions of its internal ISA. ARMv8 instructions need to be translated to this ISA, which also costs some performance, so I would guess Denver is somewhere around 3-4-issue compared to native ARMv8.

  4. anon
    February 17th, 2015 at 08:42 | #4

    Yeah, as I found conflicting info on whether ThunderX is an OoO or non-OoO design, I just used “crippled” as a catch-all nasty word. 🙂

    X-Gene is an OoO design, and really looking forward for its 28nm X-Gene 2 variation coming out in 3Q2015.

    The added latency on modern microcode generation is nowadays pretty small, and I bet Nvidia has pretty close to wirespeed implementation on Denver (as the core is quite a bit larger than bog-standard 3-issue one)… Have not personally seen any infos on per instruction clock cycle tables though… That would be interesting thing to see, comparing different implementations of different Cortex-A cores.

  5. kcg
    February 17th, 2015 at 16:03 | #5

    @anon Honestly, X-Gene 2 will be really interesting from perf/watt point of view, that means for server deployment. I’m afraid it’ll not bring much of performance increase.
    W.r.t. Denver, I’m not sure about efficiency since this is not just microcode translation but about whole software translation layer running on top of native ISA. It’s very similar to Transmeta’s Crusoe design and their codemorphing software. Just look at http://www.anandtech.com/show/8670/google-nexus-9-preliminary-findings/2 to see some of the benchmarks.
    Generally speaking, the fastest ARMv8 core is probably in Apple’s A6. This is a 6-issue OoO design IIRC. Pity that Apple does not sell boards with it…

  6. anon
    February 17th, 2015 at 22:29 | #6

    Oh, I had not looked that deeply into K1 (as the thing with it was the Nvidia GPU from a software point of view). Too bad that Transmeta failed; it was an interesting option for future architectures and I was rooting for them to become the new kid on the block. I even once had an Efficeon box from Sharp that I got used just to give it a try, but it was too slow compared to the other players in the x86 market, especially AMD64.

    And the 40nm to 28nm change on X-Gene is pretty much only to improve perf/watt, but it is still interesting for HPC stuff (I’m running mainly Gentoo boxes, so I need the compilation grunt) as the AArch64 market is still not too crowded.

    Yeah, Apple’s in-house team (ex P.A. Semi people?) did an excellent job with their design.

  7. kcg
    February 18th, 2015 at 00:30 | #7

    @anon Heck, I also think that those are P.A. Semi people behind the Apple’s Ax designs. For Apple this was really excellent purchase. Otherwise X-Gene would be nice to have, but let’s see if prices go a little bit down still…

  8. tl
    February 23rd, 2015 at 09:01 | #8

    @anon when you are comparing processors, you should compare from a whole-processor perspective instead of comparing core to core. Eventually, it is about how much performance you can get for a given power consumption and cost.

    If you really want to look at core to core comparison, nobody is going to be able to compete with Intel. Intel Xeon single core performance is 3-4x vs Cortex A57.

  9. tl
    February 23rd, 2015 at 09:03 | #9

    Didn’t see much traction in the market for X-Gene yet.

  10. anon
    February 23rd, 2015 at 10:05 | #10

    Yeah, sure my Ivy Bridge Xeon beats these per core, but overall perf/watt goes to ARM camp, even my Cortex-A9/A15 cores already beat them on compilation/time/watt (Gentoo user).

    And I would personally love to be able to throw out my last x86 systems in favor of a new (smaller, underdog) company with something that would be faster than Intel systems. X-Gene 2+ might well be a nice step towards that; a 16-core X-Gene 2 should be pretty close to at least the small socket (115x) Intel systems.

    And I would not mind giving more money to that underdog per performance, just like, if I had the money, I would rather buy a Tesla than a car from any other company, just because they are a new company compared to the other well established ones… Give an extra $50k just for the principle of giving to a company that does more R&D, instead of countless “facelifts”.

  11. February 8th, 2016 at 17:28 | #11

    A HiSilicon D03 server board is also being worked on -> http://open-estuary.org/d03/
    No details for now, except it will have more CPU power (Cortex A72?) and I/O bandwidth.
