Qualcomm Provides Details about 64-bit ARM Falkor CPU Cores used in Centriq 2400 Server-on-Chip

Qualcomm officially announced that it had started sampling the Centriq 2400 SoC, a 10nm part with 48 ARMv8 cores aimed at datacenter and cloud workloads, but at the time the company did not provide many details about the solution or the customizations made to the CPU cores.

Qualcomm has now announced that Falkor is the custom CPU design used in the Centriq 2400 SoC. The key features listed by the company include:

  • Fully custom core design – Designed specifically for the cloud datacenter server market, with a 64-bit only micro-architecture based on ARMv8 (AArch64).
  • Scalable building block – The Falkor core duplex includes two custom Falkor CPUs, a shared L2 cache and a shared bus interface to the Qualcomm System Bus (QSB) ring interconnect.
  • Designed for performance, optimized for power
    • 4-issue, 8-dispatch heterogeneous pipeline designed to optimize performance per unit of power, with variable length pipelines that are tuned per function to maximize throughput and minimize idle hardware.
    • Power management techniques: independent p-state control for each of the CPUs and L2, with entry to and exit from low-power states controlled by hardware state machines, and hardware state retention for power-collapsed sleep states with ultra-fast recovery.
  • Performance under memory-intensive workloads – Falkor is designed to fulfill the demand for larger instruction footprints using an innovative split instruction cache composed of a single-cycle, low-power 24KB L0 I-cache complementing its 64KB L1 I-cache. The core also supports a 32KB L1 D-cache with a 3-cycle load-use latency. The L1 D-cache is augmented by a sophisticated multi-level hardware prefetch engine that dynamically adapts to system conditions.
  • Datacenter features
    • ARM Exception Levels (EL0-EL3) and TrustZone secure execution environment.
    • ARMv8 instruction extensions to accelerate cryptographic transform and secure hash operations such as AES, SHA1, and SHA2-256.
    • RAS mechanisms needed to keep a datacenter running, such as fault isolation, reporting, and handling techniques.
  • System on a chip – The 48 Falkor CPUs are brought together in a fully-integrated SoC with a high-bandwidth, low-latency ring interconnect, a large L3 cache and multiple memory controllers. It also includes an on-die hardware-based immutable root of trust that authenticates firmware before the first line of firmware is ever executed.
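
For readers who want to check whether an ARMv8 server exposes the AES, SHA1 and SHA2 instruction extensions mentioned in the datacenter features above, here is a minimal Python sketch. It assumes a Linux arm64 system, where the kernel lists such extensions as the flags "aes", "sha1" and "sha2" on the "Features" line of /proc/cpuinfo. The helper names parse_features and crypto_support are made up for illustration, and nothing here is specific to Falkor or taken from Qualcomm's material.

```python
# Flag names as printed by the Linux arm64 kernel on the "Features" line
# of /proc/cpuinfo. Helper names below are hypothetical.
CRYPTO_FLAGS = ("aes", "sha1", "sha2")


def parse_features(cpuinfo_text):
    """Return the set of flags found on the first 'Features' line, if any."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            return set(value.split())
    return set()


def crypto_support(cpuinfo_text):
    """Map each crypto flag to True/False depending on whether it is listed."""
    features = parse_features(cpuinfo_text)
    return {flag: flag in features for flag in CRYPTO_FLAGS}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(crypto_support(f.read()))
    except OSError:
        print("/proc/cpuinfo is not available on this system")
```

On non-ARM systems the "Features" line is absent (x86 uses "flags" instead), so every entry simply comes back False. A listed flag only means the CPU advertises the instructions; whether a given crypto library actually uses them depends on how that library was built.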

The Centriq 2400 SoC is scheduled to start shipping later this year. You’ll find an in-depth overview of the Falkor micro-architecture, along with more slides, on AnandTech.


8 Replies to “Qualcomm Provides Details about 64-bit ARM Falkor CPU Cores used in Centriq 2400 Server-on-Chip”

  1. “Segmented ring bus” sounds like a latency optimisation over a classic ring bus. I’d love to see more details on the subject, though. I mean, a torus bus is a ‘segmented ring bus’ in a way.

  2. @tkaiser
    See the AnandTech link provided by cnxsoft.

    The last page, called “Closing Thoughts”, says Qualcomm is going to be supporting Windows Server on Centriq 2400-series SoCs.
