$150 Axelera M.2 AI accelerator module claims to deliver up to 214 TOPS

The Axelera M.2 AI accelerator module is said to deliver up to 214 TOPS of AI inference and up to 3,200 FPS with ResNet-50 in a compact M.2 2280 form factor.

Few details are available at this time, but the module is based on the company’s Metis AIPU (AI Processing Unit), which uses in-memory computing based on arrays of SRAM memory devices to “store a matrix and perform matrix-vector multiplications ‘in-place’ without intermediate movement of data”. This technology is said to “radically” increase the number of operations per computer cycle without suffering from issues such as noise or lower accuracy.

Axelera M.2 AI accelerator

The Metis AI platform delivers 50+ TOPS per core (RISC-V-controlled dataflow engine), offers FP32-equivalent accuracy, and has an energy efficiency of 15 TOPS/W. The last point is impressive, but that means 214 TOPS won’t be reachable with the module shown above, since the M.2 form factor is designed to handle up to around 8W with proper cooling (thick heatsink). So the best we could hope for would be 120 TOPS (8W × 15 TOPS/W). That would still be a serious jump compared to the Hailo 8 M.2 card with up to 26 TOPS, and even more when compared to the popular Coral Edge or Myriad X accelerators.

TOPS efficiency vs utilization
Axelera Metis efficiency improves with higher utilization and tops out at 15 TOPS/W at 100%

For reference, the NVIDIA Jetson AGX Orin module (32GB) delivers up to 200 TOPS @ 50W, or 4 TOPS/W energy efficiency. So Axelera is making bold claims, especially since the M.2 module is expected to sell for $149 / 149 Euros. Those claims would have to be confirmed with third-party testing, since comparing public AI benchmarks is close to impossible. We’ll find out once the module becomes available.
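
The back-of-the-envelope numbers above can be reproduced with a short Python sketch. It only uses figures quoted in this article (15 TOPS/W, a roughly 8W M.2 power budget, the 214 TOPS claim, and 200 TOPS @ 50W for the Jetson AGX Orin 32GB). The function names are my own, and the sketch assumes the peak efficiency holds across the whole power range, which is probably optimistic.

```python
def tops_at_power(efficiency_tops_per_w: float, power_w: float) -> float:
    """Throughput reachable within a power budget at a given efficiency."""
    return efficiency_tops_per_w * power_w


def power_for_tops(target_tops: float, efficiency_tops_per_w: float) -> float:
    """Power needed to reach a target throughput at a given efficiency."""
    return target_tops / efficiency_tops_per_w


METIS_EFFICIENCY = 15.0  # TOPS/W, vendor claim at 100% utilization
M2_POWER_BUDGET = 8.0    # W, rough limit for an M.2 module with a thick heatsink

# Best case for the M.2 module if it is limited to about 8W
m2_ceiling = tops_at_power(METIS_EFFICIENCY, M2_POWER_BUDGET)

# Power the module would need to actually hit the 214 TOPS claim
power_needed = power_for_tops(214, METIS_EFFICIENCY)

# Jetson AGX Orin 32GB efficiency from the figures quoted above
orin_efficiency = 200 / 50

print(f"M.2 ceiling at 8 W: {m2_ceiling:.0f} TOPS")
print(f"Power needed for 214 TOPS: {power_needed:.1f} W")
print(f"Jetson AGX Orin 32GB: {orin_efficiency:.0f} TOPS/W")
```

In other words, reaching 214 TOPS at 15 TOPS/W would take a bit over 14W, well above what an M.2 2280 module is normally expected to dissipate.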

Metis In Memory Computing
Metis In-memory computing architecture

Axelera Metis is programmable with the Voyager SDK, which is currently optimized for the development of computer vision applications for the Edge. It abstracts the internal workings of the Metis AIPU and can also quantize and compile models trained with PyTorch and TensorFlow in order to simplify the work of developers. The company also provides access to the Axelera Model Zoo, accessible on the Web and via cloud APIs, with pipelines for common use cases such as image classification, object detection, segmentation, key point detection, and face recognition.
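
To give an idea of what “quantizing” a model involves, here is a minimal, generic sketch of symmetric per-tensor INT8 quantization in plain Python. It is not the Voyager SDK’s API or algorithm, which are not described in the information available so far; the function names and sample weights are made up for illustration.

```python
def quantize_int8(weights):
    """Map floats to signed 8-bit integers using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    # Round to the nearest integer step and clamp to the INT8 range used here
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integers and the scale."""
    return [v * scale for v in quantized]


weights = [0.42, -1.27, 0.05, 0.90]  # made-up FP32 weights, illustration only
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Worst-case rounding error is at most half a quantization step
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("INT8 values:", q)
print("Scale:", scale)
print("Max reconstruction error:", max_error)
```

Real toolchains typically go further (per-channel scales, calibration data, and so on), which is where claims like “FP32 equivalent accuracy” would need to be verified.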

Axelera PCIe card AI accelerator

The company will also be offering a PCIe card with four Metis AIPUs for up to 856 TOPS of AI inference, capable of up to 12,800 fps with ResNet-50 or 38,884 fps with MobileNet V2-1.0. The PCIe card will sell for $499/499 Euros.
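
As a quick sanity check, the sketch below uses only the figures quoted in this article and assumes the card’s numbers simply scale linearly from a single Metis AIPU (the company does not say so explicitly). Variable names are illustrative.

```python
AIPUS_PER_CARD = 4

# Single-AIPU M.2 module figures quoted in the article
M2_TOPS, M2_RESNET50_FPS, M2_PRICE_USD = 214, 3200, 149

# PCIe card figures quoted in the article
CARD_TOPS, CARD_RESNET50_FPS, CARD_PRICE_USD = 856, 12_800, 499
CARD_MOBILENET_FPS = 38_884

# Does the card match four times the M.2 module?
tops_scale_linearly = AIPUS_PER_CARD * M2_TOPS == CARD_TOPS
fps_scale_linearly = AIPUS_PER_CARD * M2_RESNET50_FPS == CARD_RESNET50_FPS

# Implied MobileNet V2 throughput per AIPU, and price per claimed TOPS
mobilenet_fps_per_aipu = CARD_MOBILENET_FPS / AIPUS_PER_CARD
usd_per_tops_m2 = M2_PRICE_USD / M2_TOPS
usd_per_tops_card = CARD_PRICE_USD / CARD_TOPS

print("TOPS scale linearly:", tops_scale_linearly)
print("ResNet-50 fps scale linearly:", fps_scale_linearly)
print(f"MobileNet V2 fps per AIPU: {mobilenet_fps_per_aipu:.0f}")
print(f"M.2: ${usd_per_tops_m2:.2f}/TOPS, PCIe card: ${usd_per_tops_card:.2f}/TOPS")
```

The card’s headline numbers are exactly four times the M.2 module’s, so they look like theoretical peaks rather than measured results.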

Axelera AI gateway

Finally, an AI gateway with the Metis M.2 card will be able to handle up to 8 cameras for deployments in robotics and retail, and we’re told it will offer performance in excess of 100 TOPS, which looks more feasible than the 200+ TOPS claim. The design of the enclosure makes me think it was designed by AAEON, but I may be wrong here.

The Axelera M.2 module, PCIe card, and AI gateway will all be available for order in Q1 2023. More details may be found on the company’s website, where you can also register your interest.

Via Naveen PS on LinkedIn.


14 Replies to “$150 Axelera M.2 AI accelerator module claims to deliver up to 214 TOPS”

  1. M.2 … so can you use this within an M.2 slot in your laptop/computer, which you normally use for an SSD/NVMe? Or an M.2 slot meant for WiFi?

    How does that work?

    1. Yes, it’s a PCIe interface, and it looks to be a Key M module, so it should work in the M.2 socket for NVMe SSDs.

      It’s the same for the Google Coral M.2 module, but they offer it with various keys namely B+M, A+E, or E, so it can work in more M.2 sockets with PCIe.

  2. Can it run inference on models larger than ResNet-50 and process images larger than 224×224 pixels? If the answer is no, I am not sure I would be very excited about 38,884 fps.

        1. I don’t have AltGr on my keyboard. After searching a bit, I can type € with Shift+Ctrl then U20AC (Unicode for the Euro sign). Not the most convenient. I could also use the special character table in WordPress, but I’ll probably keep writing Euros.

          1. And really we don’t care, only sales people and online shops use the euro symbol. Everyone else writes EUR or Euro(s), precisely because it’s not on every keyboard.

    1. @Vladimir asked: “You wrote: ‘$499/499 Euros’. ‘$’ is a dollar sign; what is the ‘Euro’ sign?”

      Here’s a brief list of currency symbols that usually work (copy/paste) in simple text editors – but I dunno about here in the comment editor. Let’s try it…

      https://en.wikipedia.org/wiki/Currency_symbol
      ¤ = Generic Currency Sign
      $ = Dollar, USD
      ¢ = Cent
      ₿ = Bitcoin, BTC
      € = Euro, EUR
      £ = British Pound, GBP
      ₹ = Indian Rupee, INR
      ₩ = North Korean Won, KPW; South Korean Won, KRW
      ₱ = Philippine Peso Sign, PHP
      ₽ = Russian Ruble, RUB
      ฿ = Thai Baht, THB
      ¥ = Japanese Yen, JPY; Chinese Renminbi Yuan, RMB, CNY

  3. I bet it’s one of those “comparing our INT4/INT2/INT1 to a competitor’s INT8 performance” cases. Literally the only instance I saw of directly comparing same-model execution at the same accuracy, metric by metric, was Greenwaves’ GAP9 prototype crushing the STM32H7, and that is still not GA yet.

      1. General Availability. Right now, only selected partners can even request to buy GAP9s; hope that changes soon.

  4. I presume that’s because it is in-memory computing based on arrays of SRAM memory devices, which are basically memory modules with some basic high-speed instructions built in.
    That means the rated speed, at least, is for in-memory inference, and the model size, at least at that speed, is limited by the amount of memory on the card.

  5. I had a short video call with Axelera yesterday. Some takeaways

    1. A large part of the high efficiency is because of having memory-on-chip, but also because they optimized the chip, and did an early tapeout to work on optimizing power consumption.
    2. While they are only working with select partners right now to make sure the technology delivers the expected results, development kits will become more broadly available later this year.
    3. Pricing for the M.2 ($149) is a sample price, and not volume pricing.
    4. They told me they’d provide additional benchmarks and information like image size soon.
