Qualcomm previously announced that it had started sampling its Centriq 2400 SoC, built on a 10nm process with 48 ARMv8 cores for datacenter and cloud workloads, but at the time the company did not provide many details about the solution or the customizations made to the CPU cores.
- Fully custom core design – Designed specifically for the cloud datacenter server market, with a 64-bit only micro-architecture based on ARMv8 (AArch64).
- Scalable building block – The Falkor core duplex includes two custom Falkor CPUs, a shared L2 cache and a shared bus interface to the Qualcomm System Bus (QSB) ring interconnect.
- Designed for performance, optimized for power
  - 4-issue, 8-dispatch heterogeneous pipeline designed to optimize performance per unit of power, with variable length pipelines that are tuned per function to maximize throughput and minimize idle hardware.
  - Power management techniques: independent p-state control for each of the CPUs and L2, with entry to and exit from low-power states controlled by hardware state machines, and hardware state retention for power-collapsed sleep states with ultra-fast recovery.
- Performance under memory-intensive workloads – Falkor is designed to meet the demand for larger instruction footprints using an innovative split instruction cache composed of a single-cycle, low-power 24KB L0 I-cache complementing its 64KB L1 I-cache. The core also supports a 32KB L1 D-cache with a 3-cycle load-use latency. The L1 D-cache is augmented by a sophisticated multi-level hardware prefetch engine that dynamically adapts to system conditions.
- Datacenter features
  - ARM Execution Levels (EL0-EL3) and TrustZone secure execution environment.
  - ARMv8 instruction extensions to accelerate cryptographic transform and secure hash operations such as AES, SHA1, and SHA2-256.
  - RAS mechanisms needed to keep a datacenter running, such as fault isolation, reporting, and handling techniques.
- System on a chip – The 48 Falkor CPUs are brought together in a fully-integrated SoC with a high-bandwidth, low-latency ring interconnect, a large L3 cache and multiple memory controllers. It also includes an on-die, hardware-based immutable root of trust that authenticates firmware before the first line of firmware is ever executed.
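The AES and SHA instruction extensions listed above are the kind of feature software usually probes for before enabling an accelerated code path. On AArch64 Linux, the kernel reports them as flags such as `aes`, `sha1` and `sha2` on the `Features` line of `/proc/cpuinfo`. Below is a minimal sketch of such a check in Python. The helper names and the sample line are illustrative assumptions, not output captured from a Centriq 2400 system.

```python
# Flags the arm64 Linux kernel uses for the ARMv8 crypto extensions.
REQUIRED = {"aes", "sha1", "sha2"}

# Made-up sample in the style of an arm64 /proc/cpuinfo entry (assumption,
# not taken from real Centriq 2400 hardware).
SAMPLE = "processor\t: 0\nFeatures\t: fp asimd evtstrm aes pmull sha1 sha2 crc32\n"


def parse_features(cpuinfo_text):
    """Return the set of flags found on the first 'Features' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()  # no 'Features' line, e.g. on a non-ARM machine


def missing_crypto(cpuinfo_text, required=REQUIRED):
    """Return the required crypto flags that are absent, sorted by name."""
    return sorted(required - parse_features(cpuinfo_text))


print(missing_crypto(SAMPLE))
```

On a real machine, the same function can be fed the contents of `/proc/cpuinfo` instead of `SAMPLE`; an empty list means all three flags are present.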
Jean-Luc started CNX Software in 2010 as a part-time endeavor, before quitting his job as a software engineering manager and starting to write daily news and reviews full time later in 2011.
“Segmented ring bus” sounds like a latency optimisation over a classic ring bus. I’d love to see more details on the subject, though. I mean, a torus bus is a ‘segmented ring bus’ in a way.
That’s a machine! I’m really interested in the price. I guess it’s around 200 dollars?!
@Philipp
I think you can reasonably add a “1” in front of your guessed price 🙂
@willy
Only a “1”?
no info yet on its power consumption…
it also seems qualcomm will support windows server with this soc
Any sources or information on this?
@tkaiser
the anandtech link provided by cnxsoft
see last page, called “closing thoughts” : Qualcomm is going to be supporting Windows Server on Centriq 2400-series SoCs
@tkaiser
MS/QCOMM talks of cloud partnership predate their talks of notebook partnership.