M5Stack LLM-8850 card – An M.2 M-Key AI accelerator module based on Axera AX8850 24 TOPS SoC

The M5Stack LLM-8850 card is an M.2 M-Key 2242 AI acceleration module powered by an Axera AX8850 SoC delivering 24 TOPS (INT8) of performance. It is suitable for host devices such as the Raspberry Pi 5, Rockchip RK3588 SBCs, and even x86 machines like mini PCs with a spare M.2 Key-M socket.

The card ships with 8GB of RAM and a 32Mbit SPI NOR flash, and supports H.265/H.264 video encoding at up to 8Kp30 and decoding at up to 8Kp60, or up to 16 parallel 1080p decode channels. It is also equipped with an active cooling system to maintain stable temperatures and prevent thermal throttling inside enclosures.
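The 16-channel figure lines up with simple pixel arithmetic: an 8K frame holds exactly 16 times the pixels of a 1080p frame, so the decoder's 8K throughput can be carved into 16 parallel 1080p streams (assuming the hardware splits its pixel budget evenly, which the datasheet does not spell out):

```python
# An 8K (7680x4320) frame contains exactly 16x the pixels of a
# 1080p (1920x1080) frame, which is where the "16 parallel 1080p
# channels" decode figure comes from.
pixels_8k = 7680 * 4320
pixels_1080p = 1920 * 1080
print(pixels_8k // pixels_1080p)  # → 16
```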

M5Stack LLM-8850 card specifications:

  • SoC – Axera AX8850
    • CPU – Octa-core Cortex‑A55 processor at 1.7 GHz
    • NPU – 24 TOPS @ INT8
    • VPU
      • Video Encoder – 8K @ 30 fps H.264/H.265 encoding, supports scaling / cropping
      • Video Decoder – 8K @ 60 fps H.264/H.265 decoding, supports 16 channels 1080p parallel decoding, supports scaling / cropping
  • Memory – 8GB 64‑bit LPDDR4x @ 4266 Mbps
  • Storage – 32Mbit QSPI NOR Flash (for Bootloader only)
  • Host Interface – PCIe 2.0 x2 via M.2 Key-M edge connector
  • Cooling – Micro turbo fan + integrated aluminum alloy CNC heatsink
  • Power Supply – 3.3V via edge connector
  • Power Consumption – Up to 7 Watts
  • Dimensions – 42.6 x 24.0 x 9.7mm
  • Weight – 14.7 grams
  • Temperature Range
    • Operating – 0 to 60°C
Full-load temperature (at room ambient) – 70°C

LLM-8850 card description

The card works with Ubuntu 20.04, 22.04, and 24.04, as well as Debian 12, but software support is not available for other operating systems such as Windows, macOS, or even Windows WSL. That’s because it relies on the AXCL driver stack and its axcl-smi utility, which are only available for Linux.

Once the driver is installed on your Raspberry Pi 5 or another Linux SBC or mini PC, you can download various demos listed in the wiki to get started. A fairly long list of models is provided:

  • Vision – YOLO11, YOLO-World-V2, YOLOv7-Face, Depth-Anything-V2, MixFormer-V2, Real-ESRGAN, SuperResolution, RIFE
  • Large Language – Qwen3-0.6B, Qwen3-1.7B, Qwen2.5-0.5B-Instruct, Qwen2.5-1.5B-Instruct, DeepSeek-R1-Distill-Qwen-1.5B, MiniCPM4-0.5B
  • Multimodal – InternVL3-1B, Qwen2.5-VL-3B-Instruct, SmolVLM2-500M-Video-Instruct, LibCLIP
  • Audio – Whisper, MeloTTS, SenseVoice, CosyVoice2, 3D-Speaker-MT
  • Generative – lcm-lora-sdv1-5, SD1.5-LLM8850, LivePortrait

LLM-8850 card connected to a Raspberry Pi 5 through the official M.2 HAT+ M Key adapter

M5Stack did not provide benchmark results or compare the Axera AX8850 SoC against other AI accelerators, although the wiki reveals some numbers, such as 12.88 tokens/s for Qwen3-0.6B with w8a16 quantization. Going by the admittedly imperfect TOPS figure, it looks like a potential competitor to the 26 TOPS Hailo-8 AI accelerator. However, the latter is optimized for computer vision applications, so for workloads like LLMs the Axera AX8850 should have a clear edge.
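The 12.88 tokens/s figure can be sanity-checked against the card's memory bandwidth. A rough back-of-the-envelope sketch, assuming decode speed is bound by streaming the w8 weights once per generated token and ignoring KV cache, activations, and PCIe transfer overhead:

```python
# Bandwidth-bound decode ceiling for Qwen3-0.6B (w8a16) on the
# LLM-8850's 64-bit LPDDR4x-4266 memory. All figures are rough
# estimates, not vendor numbers.

transfer_rate = 4266e6           # LPDDR4x-4266: transfers per second
bus_width_bytes = 64 // 8        # 64-bit bus = 8 bytes per transfer
bandwidth = transfer_rate * bus_width_bytes   # ~34.1 GB/s peak

params = 0.6e9                   # Qwen3-0.6B parameter count
bytes_per_param = 1              # w8 quantization: 1 byte per weight

# Each generated token requires reading the full weight set at least once.
ceiling_tps = bandwidth / (params * bytes_per_param)

print(f"Peak DRAM bandwidth: {bandwidth / 1e9:.1f} GB/s")        # ~34.1
print(f"Bandwidth-bound ceiling: {ceiling_tps:.1f} tokens/s")    # ~56.9
print(f"Measured 12.88 tokens/s = {12.88 / ceiling_tps:.0%} of ceiling")
```

The measured rate landing well below the theoretical ceiling is expected, since real decoding also pays for attention over the KV cache, activation traffic, and less-than-peak DRAM efficiency.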

The price of the LLM-8850 card is in the same range as, and even a bit cheaper than, a product such as the Raspberry Pi AI HAT+ 26 TOPS ($110), and much more affordable than the ~$200 Hailo-8 M.2 card. M5Stack sells the Axera AX8850 M.2 module for $99 on AliExpress and its own online store.





10 Replies to “M5Stack LLM-8850 card – An M.2 M-Key AI accelerator module based on Axera AX8850 24 TOPS SoC”

    1. Actually, it looks good with regard to heat dissipation.
      Also, this is not just to run “GPTs” but also classic AI; nice to have competition to Google Coral and Hailo.

  1. Thank you for this article. It means there is hope in the Rockchip RK3588 world. I could try DeepSeek with

    1. How’s that related to RK3588? The problem with NPUs is always the same: the multiplicity of ecosystems and incompatible frameworks, resulting in each vendor trying to support one or a few, which are never the ones the user would like to use. That’s why they all end up showing a demo of their stuff doing something really limited. Some even fork existing tools to add support for their hardware there, and their fork becomes outdated in a few weeks since these environments evolve very quickly.

      What’s really needed is a standard API to communicate with these devices, one that would allow implementing regular drivers that expose a /dev entry which ioctl/mmap/read/write, etc., would be enough to fully control, so that the various environments could simply implement generic support for them. If graphics cards had been handled the way NPUs are handled, we would still be using teletype printers, but the printer would have a screen exhibiting the vendor’s logo rotating in 3D to prove they can do more… And if storage devices had been done that way, we’d never have been able to exchange data between machines using USB thumb drives, and M.2 storage would never have succeeded.

      So it’s up to NPU vendors and application developers now to sit around the same table and define a usable low-level API that they agree to follow. Ideally, just like with storage/graphics/USB, there would be common generic support so that plugging in a device just works with the generic driver. Right now it remains easier to use a GPU for AI because as long as CUDA/OpenCL/Vulkan is supported, you have a chance to get it to work.
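      The generic /dev interface described above could look something like the following sketch. To be clear, no such standard exists today; every name, opcode, and field layout here is hypothetical:

```python
# Purely illustrative sketch of a vendor-neutral NPU request format, in the
# spirit of how block or DRM devices expose generic ioctls. Every name and
# layout is hypothetical -- no such standard exists.
import struct

# Fixed little-endian header any vendor driver could parse:
# opcode (u32), flags (u32), buffer address (u64), buffer length (u64).
REQ_HEADER = struct.Struct("<IIQQ")

OP_LOAD_MODEL = 1
OP_RUN_INFERENCE = 2

def make_request(opcode: int, flags: int, buf_addr: int, buf_len: int) -> bytes:
    """Pack a request the way a generic /dev/npu0 ioctl might expect it."""
    return REQ_HEADER.pack(opcode, flags, buf_addr, buf_len)

req = make_request(OP_RUN_INFERENCE, 0, 0x1000, 4096)
print(len(req))  # → 24 (fixed-size header every driver would understand)
```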

      1. To be fair, graphics hardware was around for much longer before standardized APIs became available than NPUs have existed at all.

        I think it will happen sooner or later.

      2. My apologies, but you didn’t get it. AI booster modules for SBCs are usually fitted for the Raspberry Pi. I mainly use the Orange Pi 5 Plus and Radxa Rock 5B+. For once, there is an AI booster not only for vision, unlike the Google dual Edge TPU, Hailo, or Metis AI accelerators. With the RK3588, we could use this AI booster module for something else than vision. And that is pretty good news. And… it can be added in the Orange Pi 5 Plus case or the Radxa Rock 5B+.

        1. Sorry, but indeed I still don’t get it. You’re speaking about physical compatibility. OK, it’s M.2 and can fit in any board having an M.2 slot like the Rock 5B and Orange Pi 5. But there’s still the software problem. Usually you end up with a binary blob which is already outdated at the time of release and limited to only a few use cases. This one even lists a few tiny LLMs (up to 1.5B parameters, i.e. only suitable for basic summarizing), so they’re probably providing their forked version of llama.cpp with hard-coded support for their driver that will never receive any update. Yes, it might be sufficient to perform video processing for pattern detection in video surveillance. But that’s also what the NPU in the RK3588 used to promise, which is why I’m not seeing how this device with comparable specs changes that.

          And quite frankly, being limited to 1.5B when you have 8GB of RAM is a shame. That’s enough to run a chat with a 13-14B model quantized at Q4, or a 7B at Q8. In any case, as soon as you’re running inference, the token generation rate will be limited by the DRAM bandwidth, which here is exactly the same as the Rock 5B’s (not even the 5B+’s), i.e. LPDDR4X-4266 on a 64-bit bus, and that’s not counting the model loading time from the host.

          Thus I’m still wondering what applications concretely benefit from such a device on an RK3588 which already has the same thing inside with similar memory BW limitations.

