Axelera Metis M.2 Max Edge AI module doubles LLM and VLM processing speed

Axelera AI’s Metis M.2 Max is an M.2 module based on an upgraded Metis AI Processing Unit (AIPU) that delivers twice the memory bandwidth of the current Metis M.2 module, targeting compute-intensive Edge AI inference applications such as large language models (LLMs) and vision language models (VLMs).

The new Metis M.2 Max also offers a slimmer profile, advanced thermal management features, and additional security capabilities. It is equipped with up to 16 GB of memory, and versions for both a standard operating temperature range (-20°C to +70°C) and an extended operating temperature range (-40°C to +85°C) will be offered. These enhancements make Metis M.2 Max ideal for applications in industrial manufacturing, retail, security, healthcare, and public safety.

Metis M.2 Max

Axelera AI Metis M.2 Max specifications and host requirements:

  • Accelerator – Metis AIPU
  • System Memory – 1GB, 4GB, 8GB, or 16GB
  • Host Interface – M.2 2280 M-key edge connector with PCIe Gen 3.0 x4 (see the throughput estimate after the list)
  • Compatibility – Intel Core processors, AMD Ryzen processors, Arm64 (aarch64) based processors
  • Security – Firmware integrity via secure boot and secure upgrade features built on a hardware Root-of-Trust
  • Misc
    • An onboard power probe that can be used to automatically adjust performance to specific settings for power- and thermal-constrained deployments.
    • Optional low-profile heatsink for cooling (reduces the card’s height by 27% compared to the current M.2 card)
  • Power – Compliant with PCI-SIG’s M.2 Specification revision 4.0 (11.55 W average power, 23.1 W peak power).
  • Temperature Range
    • Standard – -20°C to +70°C
    • Extended – -40°C to +85°C
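
For context on the host interface listed above, here is a rough, back-of-the-envelope estimate of the PCIe Gen 3.0 x4 link throughput per direction. These are standard PCIe figures, not Axelera-specific numbers.

# PCIe Gen 3.0 x4 raw link throughput per direction, before packet/protocol
# overhead. Standard PCIe arithmetic, not an Axelera specification.
lanes = 4
gen3_gt_per_s = 8.0      # 8 GT/s per lane for PCIe Gen 3.0
encoding = 128 / 130     # 128b/130b line encoding

link_gbytes_per_s = lanes * gen3_gt_per_s * encoding / 8  # bits -> bytes
print(f"~{link_gbytes_per_s:.2f} GB/s per direction")     # ~3.94 GB/s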

Axelera Metis M.2 Max (top and bottom views)

Like the previous Metis M.2 module, the Max variant is supported by the Voyager SDK. Axelera AI says native Linux support is tested on Ubuntu 22.04, while a Docker guide (sign-in required) is available for other Linux distributions.

The company provides additional information about performance in the press release:

M.2 Max delivers a 33% performance uplift in convolutional neural networks (CNNs) and double the token/second for LLMs and VLMs, all while staying within a typical average power range of 6.5W.

Interestingly, they’ve not included any information about TOPS this time around, or even clear benchmark results. For reference, the Metis AIPU can deliver up to 214 TOPS, albeit likely not in the M.2 form factor.
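
As a rough intuition for the “double the token/second” claim, single-stream LLM decoding has to stream most of the model weights from memory for every generated token, so throughput is usually limited by memory bandwidth rather than TOPS. The sketch below illustrates this; the bandwidth and model-size numbers are illustrative assumptions, not Axelera figures.

# Why doubling memory bandwidth roughly doubles LLM decode throughput.
# All numbers below are illustrative assumptions, not Axelera specifications.

def decode_tokens_per_second(params_billion, bytes_per_param, bandwidth_gb_s):
    """Upper-bound tokens/s when decoding is memory-bandwidth-bound."""
    weight_bytes = params_billion * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / weight_bytes

# Hypothetical 3B-parameter model quantized to ~1 byte per parameter (INT8)
for bw in (50, 100):  # hypothetical baseline vs. doubled bandwidth, in GB/s
    print(f"{bw} GB/s -> ~{decode_tokens_per_second(3, 1.0, bw):.0f} tokens/s ceiling")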

First-generation Metis M.2 module with heatsink. The new Metis M.2 Max will ship with a thinner heatsink (no picture for now)

The Metis M.2 Max will start shipping in Q4 2025. It is listed on the Axelera webstore without price information for now, but for reference, the existing Metis M.2 card sells for €229.95 without a cooling solution or €241.95 with the active cooling solution shown above. The new model should be more expensive, possibly in the €300 to €400 range.

Thanks to TLS for the tip.

3 Replies to “Axelera Metis M.2 Max Edge AI module doubles LLM and VLM processing speed”

  1. They claim that it doubles the memory bandwidth compared to their previous model, but given that they don’t provide that metric for either model, it remains a bit opaque. At first glance they seem to have adopted 128-bit memory, so we could possibly hope for DDR5-6400, i.e. 100GB/s, or 1/3 of a 4060Ti for half the price and 1/25 the power draw. At least it indicates that they’ve understood the criticality of this metric, and we may even hope that it will double in the next model.

  2. I hope they add a 32 GB or 48 GB model as well, especially since all the LLM models I want to use are the larger ones. Plus, if you put two or more of these M.2 accelerator devices together, you can split the models up to run on multiple devices. But with only 16 GB versions, you would have to buy 6 of these to accommodate 96 GB in total. That’s too many M.2 devices. Whereas putting three 32 GB M.2 devices together in one machine is not only easier, but the power consumption is much lower, and the bus only needs to communicate with 3 devices instead of 6.
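
As a quick sanity check of the bandwidth figure in the first comment above, the arithmetic works out as follows; note that the 128-bit bus width and DDR5-6400 data rate are the commenter’s assumptions, not confirmed specifications.

# Peak memory bandwidth for a 128-bit bus at DDR5-6400 speeds (commenter's
# assumptions, not confirmed Axelera specifications).
bus_width_bits = 128
transfers_per_s = 6400e6  # DDR5-6400 = 6400 MT/s

bandwidth_gb_s = bus_width_bits / 8 * transfers_per_s / 1e9
print(f"~{bandwidth_gb_s:.1f} GB/s")  # ~102.4 GB/s, close to the quoted 100 GB/s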
