Hailo-10 M.2 Key-M module brings Generative AI to the edge with up to 40 TOPS of performance

Hailo-10 is a new M.2 Key-M module that brings generative AI capabilities to the edge with up to 40 TOPS of performance at low power. For now, it targets AI PCs and only supports the Windows 11 operating system on x86 or Aarch64 hosts.

Hailo claims the Hailo-10 is faster and more energy efficient than integrated neural processing unit (NPU) solutions found in Intel SoCs and delivers at least twice the performance at half the power of Intel’s Core Ultra “AI Boost” NPU.

Hailo-10 M.2 module generative AI for the edge

Hailo-10 module specifications:

  • AI accelerator – Hailo-10H
  • System Memory – 8GB LPDDR4 on module
  • Host interface – 4-lane PCIe Gen 3
  • Power consumption – Less than 3.5W (typical) for the chip
  • Form factor – M.2 Key M 2242 / 2280
  • Supported AI frameworks – TensorFlow, TensorFlow Lite, Keras, PyTorch & ONNX
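To put the 4-lane PCIe Gen 3 host interface in perspective, here is a rough back-of-the-envelope sketch of the theoretical link bandwidth. It only relies on the standard PCIe 3.0 figures (8 GT/s per lane with 128b/130b encoding), the function name is ours, and real-world throughput will be lower once protocol overhead is taken into account.

```python
# Theoretical one-direction PCIe Gen 3 bandwidth, before packet/protocol overhead.
GEN3_TRANSFER_RATE_GT_S = 8.0          # giga-transfers per second per lane (PCIe 3.0)
GEN3_ENCODING_EFFICIENCY = 128 / 130   # 128b/130b line encoding


def pcie_gen3_bandwidth_gb_s(lanes: int) -> float:
    """Return the theoretical bandwidth in GB/s for a Gen 3 link with `lanes` lanes."""
    bits_per_second = GEN3_TRANSFER_RATE_GT_S * 1e9 * GEN3_ENCODING_EFFICIENCY * lanes
    return bits_per_second / 8 / 1e9


if __name__ == "__main__":
    # The Hailo-10 module uses a 4-lane link according to the specifications above.
    print(f"PCIe Gen 3 x4: {pcie_gen3_bandwidth_gb_s(4):.2f} GB/s")
```

That works out to a little under 4 GB/s in each direction, which is an upper bound for moving data between the host and the module.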

The Hailo-10 can run Llama2-7B at up to 10 tokens per second (TPS) at under 5W of power, and it can generate one image from text in under 5 seconds using Stable Diffusion 2.1 within the same power envelope. Like many other technologies, generative AI is moving from the cloud to the device itself, enabling lower latency and offline support.
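The figures quoted above can be turned into rough energy and latency estimates. The short sketch below treats the vendor numbers (10 tokens per second, 5W, 5 seconds per image) as bounds and assumes, for illustration only, that peak throughput and the power ceiling occur together. The 200-token reply length is a hypothetical value, not something Hailo has published.

```python
# Vendor figures quoted in the article, treated as bounds rather than measurements.
TOKENS_PER_SECOND = 10.0   # "up to 10 tokens per second" with Llama2-7B
POWER_WATTS = 5.0          # "under 5W"
IMAGE_SECONDS = 5.0        # "under 5 seconds" per Stable Diffusion 2.1 image

# Hypothetical reply length, chosen only to illustrate the calculation.
REPLY_TOKENS = 200


def energy_per_token_joules(power_w: float, tokens_per_s: float) -> float:
    """Energy spent per generated token (J) at a given power and throughput."""
    return power_w / tokens_per_s


def generation_time_seconds(num_tokens: int, tokens_per_s: float) -> float:
    """Time needed to generate `num_tokens` tokens at a constant throughput."""
    return num_tokens / tokens_per_s


def energy_joules(power_w: float, seconds: float) -> float:
    """Energy (J) consumed when drawing `power_w` watts for `seconds` seconds."""
    return power_w * seconds


if __name__ == "__main__":
    per_token = energy_per_token_joules(POWER_WATTS, TOKENS_PER_SECOND)
    reply_time = generation_time_seconds(REPLY_TOKENS, TOKENS_PER_SECOND)
    print(f"Energy per token: {per_token:.2f} J")
    print(f"{REPLY_TOKENS}-token reply: {reply_time:.0f} s, "
          f"{energy_joules(POWER_WATTS, reply_time):.0f} J")
    print(f"One image: up to {energy_joules(POWER_WATTS, IMAGE_SECONDS):.0f} J")
```

Under those assumptions, each token costs about half a joule, and a single generated image stays within 25 joules.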

 

Generative AI vs discriminative AI
Evolution of edge AI from discriminative AI to generative AI – Source: Hailo’s blog

The Hailo-10 is supported by the same AI software suite as its predecessors (Hailo-8 and Hailo-15), with a dataflow compiler, a model zoo with TensorFlow and ONNX model formats, Hailo TAPPAS pre-trained AI applications, the HailoRT runtime software for the host processor, and the Hailo-10H firmware.
Hailo AI software suite

The Hailo-10H-based M.2 modules can be plugged into existing PCs and edge devices with a spare M.2 PCIe socket to add generative AI capabilities. The company says that the Hailo-10 AI accelerator modules will initially target PCs and automotive infotainment systems to power on-device chatbots, copilots, personal assistants, and speech-operated operating systems. It’s the second generative AI chip for the edge we’ve covered on CNX Software, after the Ambarella N1 SoC, which combines 16 Arm Cortex-A78AE cores and an AI accelerator into a single chip and was unveiled in January 2024.

The company says it will begin shipping samples of the Hailo-10 GenAI accelerator in Q2 of 2024. The previous Hailo-8 AI accelerator found its way into many systems from various embedded PC vendors, but the Hailo-10 will also be suitable for consumer devices. It may take a while before it becomes more broadly available. For instance, the Hailo-15 was first introduced in March 2023, but the first commercial device, the SolidRun Hailo-15 SoM, was just announced a few days ago. Additional information, including a product brief, may be found on the product page.


6 Replies to “Hailo-10 M.2 Key-M module brings Generative AI to the edge with up to 40 TOPS of performance”

  1. I would actually like to see a chip that can have many TOPS that just improves old 240p video to at least 1080p or 4k, can make all sorts of grudgy video and compression artifacts to look pristine that can just hook up to an extra nvme slot.
