Sipeed NanoCluster palm-sized cluster board takes up to 7 system-on-modules

Sipeed NanoCluster is a palm-sized cluster board with seven slots for Raspberry Pi CM4/CM5, Sipeed LM3H (Allwinner H618), and/or Sipeed M4N (AXera AX650N AI SoC) system-on-modules; other compatible SoMs may also work.

The board handles inter-module communication through an 8-port RISC-V-based Gigabit switch, and supports up to 60W USB-C PD or PoE (optional) power. The NanoCluster also offers independent UART and power control for each module, and is suitable as an entry-level/educational platform for HomeLab users working with distributed computing, Kubernetes, Docker, and edge computing.

Mini Cluster Raspberry Pi CM4
Sipeed NanoCluster specifications:

  • Supported SOMs
    • Raspberry Pi CM4 – Broadcom BCM2711 quad-core Cortex-A72 SoC @ 1.5 GHz with VideoCore VI GPU, 1GB to 8GB RAM, up to 64GB eMMC flash (optional)
    • Raspberry Pi CM5 – Broadcom BCM2712 quad-core Cortex-A76 SoC @ 2.4 GHz with VideoCore VII GPU, 1GB to 16GB RAM, up to 64GB eMMC flash (optional)
    • Sipeed LM3H – Allwinner H618 quad-core Cortex-A53 SoC with Arm Mali-G31 MP2 GPU, 2GB or 4GB RAM, 32GB eMMC flash (optional)
    • Sipeed M4N – AXera AX650N octa-core Arm Cortex-A55 @ 1.7 GHz with 18 TOPS NPU (no GPU), 8GB RAM, 32GB eMMC flash
    • Other SoMs compatible with CM4/CM5 might also work, but have not been tested (by Sipeed)
  • SoM slots – 7x dual M.2 M-Key vertical slots (Raspberry Pi Compute Module 4/5 supported through an adapter board)
  • Storage – M.2 PCIe socket for M.2 SSD on the adapter board for CM4, CM5, and M4N modules
  • Video Output – HDMI port connected to slot 1
  • Networking
    • Gigabit Ethernet RJ45 port
    • Integrated JL6108 8-port RISC-V Gigabit Ethernet switch to interface with the modules (10/100Mbps Ethernet for LM3H, Gigabit Ethernet for other SoMs)
  • USB – 1x USB-A host port, 1x USB-A OTG port connected to slot 1
  • Serial – 7x independent UARTs for debugging and control, optional quad-serial USB module available
  • Misc
    • 60mm 2-pin cooling fan
    • 7x SYS LED indicators for node status
    • Boot and Reset buttons for the Slot 1 module
  • Power Supply
    • Up to 20V USB-C PD (60W max)
    • Optional 60W PoE module
    • 2x 5V/8A DC-DC converters
    • Power Management – Slot 1 centrally manages other slots and switches power through an I/O expansion chip
    • Power Consumption
      • Board only – 3.6 W
      • CM4 module – 3W (idle), 4.5W (full load), 4.6W (peak)
      • CM5 module – 4W (idle), 7.6W (full load), 8W (peak)
      • LM3H module – 1.2W (idle), 2.6W (full load), 3.7W (peak)
      • M4N module – 3W (idle), 8.3W (full load), 9W (peak)
  • Dimensions
    • PCBA – 88x57mm
    • Fully assembled with SoMs and fan – About 100x60x60mm

Sipeed NanoCluster specifications
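Taking the board's own figures, a quick back-of-the-envelope check shows how a fully populated board compares against the 60W power budget. A minimal sketch in Python, using the peak per-module numbers from the specifications above (the 60W limit is the stated USB-C PD/PoE maximum):

```python
# Back-of-the-envelope power budget for a fully populated NanoCluster.
# Peak per-module figures are taken from the specifications above.
BOARD_W = 3.6                                       # board-only consumption
PEAK_W = {"CM4": 4.6, "CM5": 8.0, "LM3H": 3.7, "M4N": 9.0}
BUDGET_W = 60.0                                     # USB-C PD / PoE maximum

for som, peak in PEAK_W.items():
    total = BOARD_W + 7 * peak
    print(f"7x {som}: {total:.1f} W total, {BUDGET_W - total:+.1f} W headroom")
```

Seven CM5s leave almost no headroom (and none for SSDs or the fan), and seven M4N modules exceed the budget outright, which lines up with Sipeed only bundling four M4N modules.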

NanoCluster Adapter Boards
Two types of adapter boards

Sipeed provides instructions to get started with the NanoCluster and with applications such as K3s (a lightweight Kubernetes distribution), distcc for distributed compilation, and an automated deployment playbook written with Ansible (Nomad PlayBook).
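Before deploying K3s or running Ansible playbooks against the cluster, you need an inventory of the nodes. The sketch below generates an Ansible-style inventory for the seven slots; the `node1`..`node7` hostnames and the 192.168.1.x addresses are illustrative assumptions, not defaults of the board:

```python
# Generate an Ansible-style inventory covering the NanoCluster's seven slots.
# Hostnames and the 192.168.1.101+ addresses are illustrative assumptions;
# substitute the addresses your DHCP server assigns to each module.
def make_inventory(base_ip: str = "192.168.1.", first_host: int = 101,
                   slots: int = 7) -> str:
    lines = ["[nanocluster]"]
    for slot in range(1, slots + 1):
        lines.append(f"node{slot} ansible_host={base_ip}{first_host + slot - 1}")
    return "\n".join(lines)

print(make_inventory())
```

The same node list can then be fed to Ansible, distcc's host list, or a K3s install script looping over the agents.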

Jeff Geerling tested an early prototype of the NanoCluster a couple of months ago with Raspberry Pi CM5 modules. While there are seven slots, it is recommended to only use four to five CM5s due to power and cooling constraints, especially when also connecting an M.2 SSD on the adapter for each module. When he tried a stress test with six Raspberry Pi CM5 modules without SSDs, he lost connectivity and also experienced thermal throttling, since there was not enough space for cooling. There were no issues when running a K3s cluster, however. The CM4 and LM3H modules consume less power, so it might be possible to use all seven, bearing in mind that throttling may still occur.

Enclosure for Sipeed NanoCluster board
One of the optional enclosures for the NanoCluster board (there’s also another white and red case with a different design)

It’s mostly useful for education and experimentation, as it certainly won’t compete against high-performance computing solutions. It does, however, compete against other Arm cluster boards like the Turing Pi 2.5 or DeskPi Super6C.

Sipeed sells the bare NanoCluster board for $49 on AliExpress, which appears to include the red and white case. You’ll also find bundles with four M4N modules ($699), seven LM3H modules ($299), and seven CM4/CM5 module adapters ($99), as well as accessories such as a power supply and a PoE module. Alternatively, you can purchase the board through Amazon for $139.99 (CM4/CM5 adapter kit only) or through the “pre-order” link on the product page.




10 Replies to “Sipeed NanoCluster palm-sized cluster board takes up to 7 system-on-modules”

  1. It clearly only targets educational purposes, since there’s nothing to eliminate heat beyond that fan, which will blow across the PCBs. The chips will reach their maximum temperature in a matter of seconds and throttle. Just think: this thing packs 60W of heat into this small volume! But that can be nice to teach the realities of high-density computing, which is what compute clusters are about. Hot spots here are as real as in real life!

    1. Jeff Geerling already reviewed it. Per Sipeed, it supports a maximum of 4 power-hungry SoMs, because if you populate all the slots, they’re in physical contact and there’s no airflow at all. Even with 4, Jeff had throttling problems, and they weren’t even all high-power boards.

      1. 7x CM4 is OK, as that’s only about 35W; 5x CM5 with small heatsinks is OK too. Jeff didn’t install any heatsinks, which lowers heat dissipation efficiency.

        1. Thanks, I have mine and was just trying to figure out how many more CM5s I could get away with.

          May experiment with some cooling ideas.

          Thanks!

    2. To be honest, what kind of workload would you run on small SBCs that would max out all CPUs all the time? In a scenario like that, you’re in the wrong place with small SBCs anyway. It doesn’t matter if the RPi is perfectly cooled or not. It’s too weak and definitely out of place. A smaller x86 would certainly make more sense than 7x cm4/5 (if possible).

      I already have a NanoCluster at home and have gained a little experience with it. With Kubernetes or similar, you don’t have a constant load on the CM modules all the time, and it can easily handle the short peaks. Even with the standard fan, it only has to generate “moving air” when idle and doesn’t have to work very hard. You don’t notice short peaks, and even with a slightly higher continuous load, the fan can still cool well in a closed design (Jeff Geerling didn’t have a case in his test, so the airflow is really poor and maybe a lot worse).

      I created a small tool to read the temperatures from all SBCs and slow down the fan when it is not needed; with that, it runs at a very low fan speed when idle: https://github.com/meteyou/Sipeed-NanoCluster-Server

      Because the fan was very loud, I changed it to a Noctua A6x25: https://www.printables.com/model/1376174-sipeed-nanocluster-case-noctua-edition
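The core of such a fan controller is just a mapping from the hottest node temperature to a duty cycle. A minimal sketch of that idea; the 45/70 °C thresholds and the 30–100% duty range are assumptions for illustration, not values from the linked project:

```python
# Sketch of a temperature-to-fan-duty curve as a cluster fan controller
# might implement it. Thresholds and duty range are illustrative assumptions.
def fan_duty(temps_c, low=45.0, high=70.0, min_duty=30, max_duty=100):
    """Map the hottest node temperature to a fan duty cycle in percent."""
    hottest = max(temps_c)
    if hottest <= low:
        return min_duty          # all nodes cool: keep the fan quiet
    if hottest >= high:
        return max_duty          # near throttling: full speed
    frac = (hottest - low) / (high - low)
    return round(min_duty + frac * (max_duty - min_duty))

print(fan_duty([41.2, 39.8, 44.0]))  # → 30
```

A real controller would read the per-node temperatures over SSH or an agent and write the result to the fan’s PWM interface in a loop.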

      1. I built a few build farms in the past based on small SBCs that were less powerful than these (RK3288, then RK3399), and there are definitely some valid use cases for this; they served me for several years at home and at the office.

        One aspect that is often forgotten about such clusters is the total memory bandwidth, which can quickly surpass what is available in a single PC. Here, even with the laughable 32-bit RAM bus of the CM5, with 7 boards that becomes a theoretical 120GB/s of total bandwidth that no low-power PC can reach. This matters for some workloads such as compilation. Do that with a more serious chip (e.g. RK3588 in LPDDR5 mode) and you can reach 215GB/s total.

        But in any case, the thing needs to be properly cooled regardless of the boards, and here it looks particularly difficult, or even impossible.
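The ~120GB/s aggregate figure above can be sanity-checked arithmetically, assuming the CM5 pairs LPDDR4X-4267 with its 32-bit bus (an assumption consistent with published Raspberry Pi 5 memory specs):

```python
# Sanity check of the "~120 GB/s aggregate" claim: theoretical peak only,
# assuming LPDDR4X-4267 on the CM5's 32-bit memory bus.
transfers_per_s = 4267e6                        # LPDDR4X-4267: 4267 MT/s
bus_bytes = 32 // 8                             # 32-bit bus -> 4 B/transfer
per_board = transfers_per_s * bus_bytes / 1e9   # GB/s per board
total = 7 * per_board                           # 7 fully populated slots
print(f"{per_board:.1f} GB/s per board, {total:.1f} GB/s across 7 boards")
```

That yields roughly 17 GB/s per board and just under 120 GB/s in aggregate, which only helps for workloads (like distributed compilation) where each node works on its own data rather than shuffling it over the 1 Gbit/s links.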

        1. Can you please describe a bit how you can use the full memory bandwidth in the cluster? Is the 1 Gbit network between the nodes enough for that?

          At 24/7 full load, I agree with you. The cooling may not be sufficient. However, at an average load of up to 50% and then occasional peaks of 100% for seconds (or even a minute), I don’t see any problems with the cooling at present.

