Axelera AI’s Metis M.2 Max is an M.2 module based on an upgraded Metis AI processing unit (AIPU) that delivers twice the memory bandwidth of the current Metis M.2 module, targeting compute-intensive edge AI inference applications such as large language models (LLMs) and vision language models (VLMs).
The new Metis M.2 Max also offers a slimmer profile, advanced thermal management features, and additional security capabilities. It is equipped with up to 16 GB of memory, and versions for both a standard operating temperature range (-20°C to +70°C) and an extended operating temperature range (-40°C to +85°C) will be offered. These enhancements make Metis M.2 Max ideal for applications in industrial manufacturing, retail, security, healthcare, and public safety.
Axelera AI Metis M.2 Max specifications and host requirements:
- Accelerator – Metis AIPU
- System Memory – 1GB, 4GB, 8GB, or 16GB
- Host Interface – M.2 2280 M-key edge connector with PCIe Gen. 3.0 x4
- Compatibility – Intel Core processors, AMD Ryzen processors, and Arm64 (aarch64) processors
- Security – Firmware integrity via secure boot and secure upgrade features built on a hardware Root-of-Trust
- Misc
- An onboard power probe that can be used to automatically adjust performance to specific settings for power- and thermal-constrained deployments.
- Optional low-profile heatsink for cooling (reduces the height of the card by 27% over the current M.2 card)
- Power – Compliant with PCI-SIG’s M.2 Specification revision 4.0 (11.55 W average power, 23.1 W peak power).
- Temperature Range
- Standard – -20°C to +70°C
- Extended – -40°C to +85°C
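As a back-of-the-envelope check on the specs above, the PCIe Gen 3.0 x4 host interface sets an upper bound on host-to-module transfer speed. A minimal sketch of the arithmetic (8 GT/s per lane and 128b/130b encoding are standard PCIe Gen 3 figures, not numbers Axelera publishes for this module):

```python
# Theoretical peak bandwidth of a PCIe Gen 3.0 x4 link.
# PCIe Gen 3 runs at 8 GT/s per lane with 128b/130b line encoding.
GT_PER_S = 8.0          # giga-transfers per second, per lane
ENCODING = 128 / 130    # 128b/130b encoding efficiency
LANES = 4

per_lane_gbit_s = GT_PER_S * ENCODING         # effective Gbit/s per lane
total_gbyte_s = per_lane_gbit_s * LANES / 8   # GB/s across all four lanes

print(f"~{total_gbyte_s:.2f} GB/s")  # ≈ 3.94 GB/s theoretical peak
```

Real-world throughput will be somewhat lower once protocol overhead is accounted for, but it gives a sense of how quickly models and frames can be streamed to the module.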
Like the previous Metis M.2 module, the Max variant is supported by the Voyager SDK. Axelera AI says native Linux support is tested on Ubuntu 22.04, while a Docker guide (sign-in required) is available for other Linux distributions.
The company provides additional information about performance in the press release:
M.2 Max delivers a 33% performance uplift in convolutional neural networks (CNNs) and double the token/second for LLMs and VLMs, all while staying within a typical average power range of 6.5W.
Interestingly, the company has not included any TOPS figures this time around, or even clear benchmark results. For reference, the Metis AIPU can deliver up to 214 TOPS, albeit likely not in the M.2 form factor.

The Metis M.2 Max will start shipping in Q4 2025. It is listed on the Axelera webstore without price information for now, but for reference, the existing Metis M.2 card sells for €229,95 without a cooling solution or €241,95 with an active cooling solution. The new model should be more expensive, likely in the €300 to €400 range.
Thanks to TLS for the tip.

Jean-Luc started CNX Software in 2010 as a part-time endeavor, before quitting his job as a software engineering manager, and starting to write daily news, and reviews full time later in 2011.
They claim that it doubles the memory bandwidth compared to their previous model, but given that they don’t provide that metric for either model, it remains a bit opaque. At first glance they seem to have adopted 128-bit memory, so we could possibly hope for DDR5-6400, i.e. 100GB/s, or 1/3 of a 4060Ti for half the price and 1/25 the power draw. At least it indicates that they’ve understood the criticality of this metric, and we may even hope that it doubles again in the next model.
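The commenter’s 100GB/s figure can be checked with simple arithmetic. Note that both the 128-bit bus width and the DDR5-6400 speed are the commenter’s assumptions, not confirmed Axelera specifications:

```python
# Peak DRAM bandwidth = transfer rate x bus width.
# Assumptions (from the comment above, not confirmed specs): 128-bit bus, DDR5-6400.
transfers_per_s = 6400e6     # DDR5-6400 means 6400 MT/s
bus_width_bytes = 128 / 8    # a 128-bit bus moves 16 bytes per transfer

bandwidth_gb_s = transfers_per_s * bus_width_bytes / 1e9
print(f"{bandwidth_gb_s:.1f} GB/s")  # 102.4 GB/s, matching the ~100GB/s estimate
```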
I hope they add a 32 GB or 48 GB model as well, especially since all the LLM models I want to use are the larger ones. Plus, if you put two or more of these M.2 accelerator devices together, you can split the models up to run on multiple devices. But with only 16 GB versions, you would have to buy 6 of these to accommodate 96 GB in total. That’s too many M.2 devices. Whereas putting three 32 GB M.2 devices together on one machine is not only easier, but the power consumption is much lower, and the bus only needs to communicate with 3 devices instead of 6.
Exactly. There are not many usable models that fit in 16GB.