We already knew Qualcomm had given up on its Centriq processor in mid-June 2018, but earlier this year it seemed the solution had found a new life with the HuaXinTong StarDragon 4800 server SoC, born out of a joint venture between Qualcomm and the Guizhou provincial government. The processor was allegedly a customized version of the original Centriq 2460 48-core Arm SoC. But recent reports say that 10 employees from HuaXinTong Semiconductor (aka HXT) have claimed the joint venture is closing down, with executives telling an internal meeting that the venture would shut down by April 30. HXT representatives declined to comment on the rumor. The Arm server market is really brutal. Qualcomm and the Guizhou government had invested a combined $570 million in HXT as of August 2018, according to the company’s filings. Broadcom and AMD gave up on Arm server chips a little while ago. AFAIK, this […]
Huaxintong StarDragon 4800 Server SoC is Based on Qualcomm Centriq 2400 Processor
Qualcomm started shipping samples of its Arm-based Centriq 2400 server processors in 2016, before launching mass production the next year with three parts, including the Qualcomm Centriq 2460 48-core processor. Development seemed to go along nicely until Qualcomm allegedly decided to exit the server market in the middle of last year. The story got confusing when GIGABYTE still decided to launch its H221-Q20 server powered by the Qualcomm Centriq 2400 processor last November, which would not make sense if Qualcomm were really exiting the server market and parts were to become unavailable after a short while. But today, as I read the slides of GIGABYTE's presentation at the HPC Asia workshop that took place on January 14-16, I realized the Centriq 2460 is still alive but has just changed owners… So the GIGABYTE H221-Q20 server is compatible with StarDragon 4800… What is that? StarDragon sounds familiar, a bit like Qualcomm Snapdragon. It turns out […]
Optimizing JPEG Transformations on Qualcomm Centriq Arm Servers with NEON Instructions
Arm servers are already deployed in some datacenters, but they are pretty new compared to their Intel counterparts, so at this stage software may not always be as well optimized on Arm as on Intel. Vlad Krasnov, who works for Cloudflare, found one of those unoptimized cases when testing jpegtran, a utility performing lossless transformation of JPEG files, on one of their Xeon Silver 4116 servers:
    vlad@xeon:~$ time ./jpegtran -outfile /dev/null -progressive -optimise -copy none test.jpg

    real 0m2.305s
    user 0m2.059s
    sys 0m0.252s
and comparing it to one based on Qualcomm Centriq 2400 Arm SoC:
    vlad@arm:~$ time ./jpegtran -outfile /dev/null -progressive -optimise -copy none test.jpg

    real 0m8.654s
    user 0m8.433s
    sys 0m0.225s
Nearly four times slower on a single core. Not so good, as the company aims for at least 50% of the per-core performance, since the Arm processor has double the number of cores. Vlad had previously optimized the code for the Intel processor using SSE instructions, so he decided to look into optimizing the Arm code with NEON instructions instead. The first step was to check which functions may slow down the […]
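The gap between the two runs can be checked with a quick calculation. The following is a minimal sketch that only uses the `real` timings quoted above; the helper name `parse_time` and the variable names are illustrative and are not part of jpegtran or Cloudflare's tooling.

```python
def parse_time(value: str) -> float:
    """Convert a `time` field such as '0m2.305s' into seconds."""
    minutes, seconds = value.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# Wall-clock ("real") timings copied from the two runs above.
xeon_real = parse_time("0m2.305s")
arm_real = parse_time("0m8.654s")

# How many times slower the Arm run is, and its relative single-core performance.
slowdown = arm_real / xeon_real
relative_perf = xeon_real / arm_real

# Run time the Arm core would need to reach the 50% per-core target.
target_real = xeon_real / 0.5

print(f"Arm slowdown: {slowdown:.2f}x")
print(f"Arm single-core performance relative to Xeon: {relative_perf:.0%}")
print(f"Arm run time needed for the 50% target: {target_real:.2f}s")
```

Under these numbers, the Arm run would have to drop from about 8.65 seconds to about 4.61 seconds to meet the 50% goal.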
Red Hat Enterprise Linux 7.4 Now Fully Supports Arm servers
When hardware vendors announced Arm-based servers, they also claimed support for operating systems such as Ubuntu 16.04 LTS and Red Hat Enterprise Linux, so I assumed software support was more or less where it needed to be with regard to Arm servers. But apparently that was not quite the case, as Red Hat only announced full support for Arm servers in Red Hat Enterprise Linux for ARM a few days ago. It all started with the SBSA (Server Base System Architecture) specification in 2014, which aimed to provide a single operating platform that works across all 64-bit ARMv8 server SoCs that comply with the specification. Red Hat then released a developer preview of the OS for silicon and OEM vendors in 2015, and earlier this week, the company released Red Hat Enterprise Linux 7.4 for Arm, the first commercial release for this architecture. RHEL 7.4 for Arm comes with […]
Qualcomm Centriq 2400 ARM SoC Launched for Datacenters, Benchmarked against Intel Xeon SoCs
Qualcomm Centriq 2400 ARM Server-on-Chip has been four years in the making. The company announced sampling in Q4 2016 using 10nm FinFET process technology, with the SoC featuring up to 48 Qualcomm Falkor ARMv8 CPU cores optimized for datacenter workloads. More recently, Qualcomm provided a few more details about the Falkor core, fully customized with a 64-bit only micro-architecture based on ARMv8 / AArch64. Finally, the SoC has formally launched, with the company announcing commercial shipments of Centriq 2400 SoCs.
Qualcomm Centriq 2400 key features and specifications:
CPU – Up to 48 physical ARMv8 compliant 64-bit only Falkor cores @ 2.2 GHz (base frequency) / 2.6 GHz (peak frequency)
Cache – 64 KB L1 instruction cache with 24 KB single-cycle L0 cache, 512 KB L2 cache per duplex; 60 MB unified L3 cache; Cache QoS
Memory – 6 channels of DDR4 2667 MT/s for up to 768 […]
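As a rough sanity check on the listed specifications, the theoretical peak memory bandwidth and the total L2 capacity can be estimated with simple arithmetic. This is a back-of-the-envelope sketch, not a vendor figure: it assumes a standard 64-bit (8-byte) DDR4 channel and a fully populated 48-core part with two Falkor cores per duplex, and the constant names are illustrative.

```python
# Figures taken from the specification list above.
CHANNELS = 6
TRANSFER_RATE_MT_S = 2667      # mega-transfers per second, per channel
CORES = 48
L2_PER_DUPLEX_KB = 512

# Assumptions (not stated in the list above).
BYTES_PER_TRANSFER = 8         # assumption: 64-bit wide DDR4 channel
CORES_PER_DUPLEX = 2           # assumption: each duplex pairs two Falkor cores

# Theoretical peak bandwidth: channels x transfers/s x bytes per transfer.
peak_bandwidth_gb_s = CHANNELS * TRANSFER_RATE_MT_S * BYTES_PER_TRANSFER / 1000

# Total L2: one 512 KB slice per duplex.
duplexes = CORES // CORES_PER_DUPLEX
total_l2_mb = duplexes * L2_PER_DUPLEX_KB / 1024

print(f"Theoretical peak memory bandwidth: {peak_bandwidth_gb_s:.1f} GB/s")
print(f"Duplexes: {duplexes}, total L2 cache: {total_l2_mb:.0f} MB")
```

These are upper bounds; sustained bandwidth in real workloads will be lower.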
Qualcomm Provides Details about 64-bit ARM Falkor CPU Cores used in Centriq 2400 Server-on-Chip
Qualcomm officially announced it had started sampling the Centriq 2400 SoC with 48 ARMv8 cores for datacenter and cloud workloads using a 10nm process, but at the time the company did not provide many details about the solution or the customizations made to the CPU cores. Qualcomm has now announced that Falkor is the custom CPU design in the Centriq 2400 SoC, with the key features listed by the company including:
Fully custom core design – Designed specifically for the cloud datacenter server market, with a 64-bit only micro-architecture based on ARMv8 (AArch64).
Scalable building block – The Falkor core duplex includes two custom Falkor CPUs, a shared L2 cache and a shared bus interface to the Qualcomm System Bus (QSB) ring interconnect.
Designed for performance, optimized for power – 4-issue, 8-dispatch heterogeneous pipeline designed to optimize performance per unit of power, with variable length pipelines that are tuned per function to maximize […]
Linux 4.9 Release – Main Changes, ARM and MIPS Architectures
Linus Torvalds released Linux 4.9 on Sunday: So Linux 4.9 is out, and the merge window for 4.10 is thus open. With the extra week for 4.9, the timing for the merge window is obviously a bit awkward, and it technically closes in two weeks on Christmas Day. But that is a pure technicality, because I will certainly stop pulling on the 23rd at the latest, and if I get roped into Xmas food prep, even that date might be questionable. I could extend the merge window rather than cut it short, but I’m not going to. I suspect we all want a nice calm winter break, so if your stuff isn’t ready to be merged early, the solution is to just not merge it yet at all, and wait for 4.11. Just so you all know (I already bcc’d the main merge window suspects in a separate mailing last […]
Qualcomm Starts Sampling of Qualcomm Centriq 2400 ARM Server SoC with Up to 48 ARMv8 Cores
Qualcomm has announced commercial sampling of the Qualcomm Centriq 2400 series server SoC, built with 10nm FinFET process technology and featuring up to 48 Qualcomm Falkor custom ARMv8 CPU cores “highly optimized to both high performance and power efficiency, and designed to tackle the most common datacenter workloads”. Qualcomm Datacenter Technologies demonstrated the new processor in a live demo showing Apache, Spark, Java, and Hadoop on Linux running on an SBSA-compliant server powered by a Qualcomm Centriq 2400 processor, but the company did not provide any further technical details or preliminary benchmark results for the solution. The Qualcomm Centriq 2400 processor series is now sampling to select customers and is expected to be commercially available in H2 2017. That’s about all we know from the press release. However, Linaro has been working on a Qualcomm Technologies QDF2432-based board for several months, with support for Debian 8.x ‘Jessie’ and CentOS 7 operating […]