SiFive Performance P550 is the fastest 64-bit RISC-V processor so far

SiFive has announced two RISC-V “Performance” cores: the Performance P550, which should be the fastest 64-bit RISC-V processor so far with a SPECint2006 score of 8.65/GHz, and the Performance P270, a Linux-capable processor with full support for the RISC-V vector extension v1.0 rc.

SiFive Performance P550

SiFive Performance P550 fastest RISC-V processor
Image source: LinuxGizmos

P550 highlights:

  • 13 stage, 3-issue high-performance out-of-order pipeline
  • Supports multicore coherence with up to 4 cores in a core complex
  • Private 32KB+32KB L1 cache and a private 256KB L2 cache per core
  • Up to 4MB L3 cache in a four-core cluster
  • SPECint 2006 – 8.65/GHz
  • 2.4 GHz in 7nm with a footprint of less than 0.25 mm²
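
As a rough illustration of what the per-GHz figure implies, the sketch below multiplies it by the quoted 7nm clock. It assumes the score scales perfectly linearly with frequency, which real silicon rarely does because memory latency does not shrink with the clock, so the result is an optimistic estimate and not a number published by SiFive. The function name is made up for this example.

```python
def estimated_specint2006(score_per_ghz: float, clock_ghz: float) -> float:
    """Estimate a SPECint2006 score, assuming perfect linear scaling with clock."""
    return score_per_ghz * clock_ghz


# 8.65/GHz and 2.4 GHz are the figures from the highlights list above.
p550 = estimated_specint2006(8.65, 2.4)
print(f"P550 at 2.4 GHz: ~{p550:.2f} SPECint2006 (estimate)")
```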

Performance P550 vs Cortex-A75

SiFive compares the Performance P550 core to Arm’s Cortex-A75, claiming higher performance in the SPECint2006 and SPECfp2006 integer and floating-point benchmarks, all in a much smaller area, which would enable a quad-core P550 cluster in about the same footprint as a single Cortex-A75 core.

There have been rumors in recent days about Intel’s plan to acquire SiFive, and while the acquisition has yet to be confirmed, Intel says it will use the Performance P550 core in an upcoming 7nm Intel Horse Creek platform. Amber Huffman, Intel Fellow and CTO of the IP engineering group at Intel, says:

We are pleased to be a lead development partner with SiFive to showcase to mutual customers the impressive performance of their P550 on our 7nm Horse Creek platform. By combining Intel’s leading-edge interface IP such as DDR and PCIe with SiFive’s highest performance processor, Horse Creek will provide a valuable and expandable development vehicle for cutting-edge RISC-V applications.

It’s unclear whether Horse Creek will mix x86 cores with the RISC-V core in a hybrid processor like the Lakefield platform, or be a pure RISC-V processor. Anandtech has more speculation about the “development platform”.

SiFive Performance P270

P270 highlights:

  • 8-Stage, dual-issue, in-order processor
  • 256-bit Vector Unit enabling compute capabilities with full support for the RISC-V Vector Extension v 1.0RC
  • Supports multicore coherence with up to 4 cores in a core complex
  • Private 32KB+32KB L1 cache and a private 256KB L2 cache per core
  • SPECint 2006 – 4.6/GHz

Just like the previously announced SiFive Intelligence X280, the Performance P270 core supports the translation of legacy SIMD/NEON code to SiFive vector assembly using the SiFive Recode utility.

SiFive Performance, Intelligence, and Essential families

Patrick Little joined SiFive as President & CEO last year, and one of the things he did was to split SiFive’s products into three families:

  • Essential – Legacy SiFive embedded RISC-V cores like the U-Series, S-Series, and E-Series
  • Intelligence – RISC-V cores optimized for low-power AI & ML acceleration
  • Performance – Highest performance RISC-V cores designed for networking,
    edge compute, autonomous machines, 5G base stations, virtual/augmented reality

The product page for the new SiFive Performance cores only has limited information for now, and the company will host a webinar on July 14 to provide more details.

22 Replies to “SiFive Performance P550 is the fastest 64-bit RISC-V processor so far”

  1. It would not make any sense to create a “hybrid” x86/RISCV processor. The OS would require heterogeneous CPU support which does not exist in Linux or Windows (to my knowledge). Lakefield is x86/x86 just like ARM big.LITTLE is arm/arm.

    1. I doubt if it would be used as an application processor. More likely it will be part of some controller or low-power subsystem.

    2. AmigaOS (+PowerUP/WarpOS) managed to do this yonks ago. Mainly 68K OS communicating with tasks running on lightweight kernel on a PPC and it looked transparent to the user.

      You don’t need to have the kernel running on both ISAs. You need to make two kernels running on the same machine look like one instead of like two machines connected via a fast network connection like Intel’s Xeon Phi did.

      1. Running a variant ISA on select cores might have possibilities even for processors of the same architecture.

        To run legacy code while the main OS is 64 bit. e.g. NTVDM doesn’t work on 64 bit Windows because of long mode; 16 bit programs from Windows 3.0 could run on a 32 bit core.

        Bi-endian systems could alternate. e.g. ppc64 on one core handballing to ppc64le on another to run a particular algorithm.

        (Yes, I know there’s virtualization and other techniques.)

    3. I’m sure they could add support for mixed architectures if they wanted to.

      x86 + ARM (high performance, not Cortex-M) on the same system would be interesting, but it might be better to just emulate one or the other like Apple does with Rosetta 2.

        1. As the industry continues to node shrink, it becomes trivial to just throw extra junk on a chip. That’s why we’re about to have 24-core consumer desktop CPUs with integrated graphics.

    4. Yet.

      But with today’s Android on Windows 11 announcement, Intel are promoting multi-architecture XPU technologies. I’d assume that’s just a software compiler from ARM to x86.

      But suppose risc-v gains some traction as an Android platform. Offloading risc-v native code to dedicated risc-v cores when your main x86 CPU is running Windows 11 is definitely on the cards.

  2. I hope SiFive is acquired by Intel and they don’t screw it up.
    RISCV cores + Intel’s IPs that have good mainline support is a winner vs ARM’s cores + a bunch of different junk IPs from whatever vendor was cheapest that day that we see time and time again.

    1. RISCV makes a lot of sense when you are replacing a licensed design like Cortex-M0 with an in-house royalty free implementation. These scenarios are low cost / low complexity with high volume.

      It makes less sense when you are attempting to replace high cost / high complexity implementations that are low volume. You just replace one license fee (ARM) with another (SiFive). While the latter is presumably cheaper, it is not free as everyone seems to perceive whenever RISCV is mentioned. It also precludes having a high core count design as each core is adding to the license fee charged. With the additional development costs in using an immature and unproven technology (RISC-V Vector Extension v 1.0RC), a company may be better off using the equivalent four year old ARM core license.

      There is no reason that Intel would require a purchase of SiFive. Intel is certainly capable of creating their own RISCV core and licensing it with additional IP (DRAM+PCIe).

      RISCV is a standard. While many think of it as “the Linux of hardware”, it’s closer to “the POSIX of hardware”. SiFive creates a licensed implementation of the standard. This is similar to how Sun created Solaris. Intel purchasing SiFive would likely end the same as when Oracle purchased Solaris.

      1. >There is no reason that Intel would require a purchase of SiFive.
        >Intel is certainly capable of creating their own RISCV core

        Sure. But that means throwing money at it and waiting for the result.
        Buying SiFive would instantly give them the IP they want, all of the people they need to keep developing, and they can have it right now (as long as there isn’t a ton of governments getting involved).

  3. So … for certain specs, faster than Arm’s Cortex-A75 … which was introduced in 2017. Sounds OK-ish.

    1. That’s a big leap from previous garbage RISC-V cores, to significantly outperform a Raspberry Pi 4 and many other phone and SBC SoCs.

      But the real winner is performance per area. If it’s 43% the die area of a Cortex-A75, you can have 8-core P550 in less die area (cheaper) than 4-core Cortex-A75, and 2.28 to 2.62 times the multi-threaded performance from the doubled core count and better performance per core.

      Did they say anything about power efficiency vs. the Cortex-A75?

    2. Note that nothing was said about multi-core performance. Efficiently using 4 cores in multi-threaded workloads adds yet another dimension which usually requires long experience, and I doubt that RISCV has reached that point yet. That doesn’t mean it’s not interesting of course, just that *some* workloads might not shine as well as the graphs above, especially compared to A75 which does have LSE atomics.

  4. “all a much smaller area which would enable a quad-core P550 cluster on the same footprint as a single Cortex-A75 core.”

    That is not shown by the chart. They get to 3x the performance per mm^2 by assuming 1.31x the performance in 43% the area. So 1.31 / 0.43 = 3.05x

    But putting a quad-core in the same die area would need 25% on that chart, not 43%. 1 / 0.43 = 2.33. They can put a dual-core in a 14% smaller area than a single Cortex-A75.
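
The arithmetic in the comment above can be checked with a few lines of Python. This is only a sketch built on the two chart figures quoted there (1.31x the performance in 43% of the area, both SiFive estimates relative to a single Cortex-A75 core); the variable names are invented for the example.

```python
# Figures quoted from the chart, relative to one Cortex-A75 core (estimates).
PERF_RATIO = 1.31  # P550 performance vs Cortex-A75
AREA_RATIO = 0.43  # P550 area vs Cortex-A75

perf_per_area = PERF_RATIO / AREA_RATIO   # performance-per-area advantage
cores_per_a75_area = 1 / AREA_RATIO       # P550 cores fitting in one A75 footprint
dual_core_area = 2 * AREA_RATIO           # dual-core P550 area vs one A75
quad_core_area = 4 * AREA_RATIO           # quad-core P550 area vs one A75
area_for_quad_parity = 1 / 4              # per-core area needed for 4 cores in one A75

print(f"Performance per area: {perf_per_area:.2f}x")
print(f"P550 cores per A75 footprint: {cores_per_a75_area:.2f}")
print(f"Dual-core P550 area: {dual_core_area:.0%} of one A75")
print(f"Quad-core P550 area: {quad_core_area:.0%} of one A75")
print(f"Per-core area needed for quad-core parity: {area_for_quad_parity:.0%}")
```

On these numbers (cores only, ignoring any shared L3 cache), a dual-core P550 takes 86% of a Cortex-A75's area, while a quad-core would take 172%.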

    1. The chart is made up. The footnote states that it is an estimate. Additionally, notice that it does not start at zero making it a non-linear representation. The conclusion is that it was created by the marketing department, not the engineering department.

      1. Of course it’s an estimate, the core isn’t out yet. As long as it ends up being close to being accurate, that’s no problem.

        Non-linear representation also doesn’t matter. It’s misleading to the eyes, but I just used the numbers instead.

        My problem is that I don’t know where Jean-Luc is getting the idea you can fit 4x P550 in the area of 1x Cortex-A75. Unless I’m overlooking something, that is MORE OPTIMISTIC than SiFive’s marketing claims. Maybe he was tricked by the bar graph? Or it has something to do with the L3 cache.

        1. That comes from the press release:

          Performance P550 scales up to four-core complex configurations that use a similar amount of area as a single Arm Cortex-A75 while delivering a significant performance-per-area advantage.

          1. Thanks. I don’t get the 43% on the chart then. Maybe the quad-core has half the L3 cache in the comparison.

  5. > P550: private 32KB+32KB L1 cache and a private 256KB L2 cache per core, up to 4MB L3 cache in a four-core cluster

    With announcement of the HiFive 550 Pro board they’re now talking about 128KB L2 cache per core and 2MB shared L3 cache 🙂
