Qualcomm Provides Details about 64-bit ARM Falkor CPU Cores used in Centriq 2400 Server-on-Chip

Qualcomm officially announced they had started sampling the Centriq 2400 SoC, with 48 ARMv8 cores built on a 10nm process for datacenter and cloud workloads, but at the time the company did not provide many details about the solution or the customizations made to the CPU cores.

Qualcomm has now announced that Falkor is the custom CPU design in the Centriq 2400 SoC. The key features listed by the company include:

  • Fully custom core design – Designed specifically for the cloud datacenter server market, with a 64-bit only micro-architecture based on ARMv8 (AArch64).
  • Scalable building block – The Falkor core duplex includes two custom Falkor CPUs, a shared L2 cache and a shared bus interface to the Qualcomm System Bus (QSB) ring interconnect.
  • Designed for performance, optimized for power
    • 4-issue, 8-dispatch heterogeneous pipeline designed to optimize performance per unit of power, with variable length pipelines that are tuned per function to maximize throughput and minimize idle hardware.
    • Power management techniques: independent p-state control for each of the CPUs and L2, with entry to and exit from low-power states controlled by hardware state machines, and hardware state retention for power-collapsed sleep states with ultra-fast recovery.
  • Performance under memory-intensive workloads – Falkor is designed to fulfill the demand for larger instruction footprints using an innovative split instruction cache comprised of a single-cycle, low-power 24KB L0 I-cache complementing its 64KB L1 I-cache. The core also supports a 32KB L1 D-cache with a 3-cycle load-use latency. The L1 D-cache is augmented by a sophisticated multi-level hardware prefetch engine that dynamically adapts to system conditions.
  • Datacenter features
    • ARM Exception Levels (EL0-EL3) and TrustZone secure execution environment.
    • ARMv8 instruction extensions to accelerate cryptographic transform and secure hash operations such as AES, SHA1, and SHA2-256.
    • RAS mechanisms needed to keep a datacenter running, such as fault isolation, reporting, and handling techniques.
  • System on a chip – The 48 Falkor CPUs are brought together in a fully-integrated SoC with a high-bandwidth, low-latency ring interconnect, a large L3 cache and multiple memory controllers. It also includes an on-die hardware-based immutable root of trust that authenticates firmware before the first line of firmware is ever executed.
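
As a quick sanity check on the numbers above, the following Python sketch works out how many Falkor duplexes a 48-core part contains and the aggregate first-level cache capacity. It assumes the L0/L1 caches are private to each core, which the feature list implies but does not spell out. The class and function names are made up for illustration, and L2/L3 sizes are left out because the feature list does not give them.

```python
from dataclasses import dataclass

# Figures taken from the feature list above.
CORES_PER_SOC = 48      # "The 48 Falkor CPUs"
CORES_PER_DUPLEX = 2    # "two custom Falkor CPUs" per duplex


@dataclass(frozen=True)
class FalkorCoreCaches:
    """Per-core cache sizes in KB (assumed private to each core)."""
    l0_icache_kb: int = 24
    l1_icache_kb: int = 64
    l1_dcache_kb: int = 32


def duplex_count(cores: int = CORES_PER_SOC,
                 per_duplex: int = CORES_PER_DUPLEX) -> int:
    """Number of duplexes (each with one shared L2) needed for `cores` CPUs."""
    if cores % per_duplex != 0:
        raise ValueError("core count must be a multiple of the duplex size")
    return cores // per_duplex


def aggregate_cache_kb(caches: FalkorCoreCaches = FalkorCoreCaches(),
                       cores: int = CORES_PER_SOC) -> dict:
    """Total first-level cache capacity across all cores, in KB."""
    return {
        "l0_icache_kb": caches.l0_icache_kb * cores,
        "l1_icache_kb": caches.l1_icache_kb * cores,
        "l1_dcache_kb": caches.l1_dcache_kb * cores,
    }


if __name__ == "__main__":
    print("duplexes:", duplex_count())
    for name, total in aggregate_cache_kb().items():
        print(f"{name}: {total} KB")
```

Under those assumptions, the SoC would be built from 24 duplexes, so 24 shared L2 caches sit on the ring interconnect rather than 48 individual cores.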

The Centriq 2400 SoC is scheduled to start shipping later this year. You’ll find an in-depth overview of the Falkor micro-architecture, and more slides, on AnandTech.

blu

“Segmented ring bus” sounds like a latency optimisation over a classic ring bus. I’d love to see more details on the subject, though. I mean, a torus bus is a ‘segmented ring bus’ in a way.

Philipp

That’s a machine! I’m really interested in the price. I guess it’s around 200 Dollar?!

willy

@Philipp
I think you can reasonably add a “1” in front of your guessed price 🙂

Nobody of Import

@willy

Only a “1”?

nobe

no info yet on its power consumption…

it also seems qualcomm will support windows server with this soc

tkaiser

nobe :
it also seems qualcomm will support windows server with this soc

Any sources or information on this?

nobe

@tkaiser
the anandtech link provided by cnxsoft

see last page, called “closing thoughts” : Qualcomm is going to be supporting Windows Server on Centriq 2400-series SoCs

blu

@tkaiser
MS/QCOMM talks of cloud partnership predate their talks of notebook partnership.