Hailo-10 M.2 Key-M module brings Generative AI to the edge with up to 40 TOPS of performance

Hailo-10 is a new M.2 Key-M module that brings Generative AI capabilities to the edge with up to 40 TOPS of performance at low power. It targets AI PCs and, at this time, only supports the Windows 11 operating system on x86 or Aarch64 hosts.

Hailo claims the Hailo-10 is faster and more energy efficient than integrated neural processing unit (NPU) solutions found in Intel SoCs and delivers at least twice the performance at half the power of Intel’s Core Ultra “AI Boost” NPU.

Hailo-10 M.2 module generative AI for the edge

Hailo-10 module specifications:

  • AI accelerator – Hailo-10H
  • System Memory – 8GB LPDDR4 on module
  • Host interface – 4-lane PCIe Gen 3
  • Power consumption – Less than 3.5W (typical) for the chip
  • Form factor – M.2 Key M 2242 / 2280
  • Supported AI frameworks – TensorFlow, TensorFlow Lite, Keras, PyTorch & ONNX
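
For a rough sense of scale, the listed numbers can be turned into two derived figures: compute efficiency and host link bandwidth. The Python sketch below is back-of-the-envelope arithmetic, not Hailo data. It assumes the peak 40 TOPS figure can be paired with the 3.5W typical chip power, which the specifications do not state, and it uses the standard PCIe Gen 3 parameters (8 GT/s per lane, 128b/130b encoding) to estimate the theoretical link bandwidth before protocol overhead.

```python
# Back-of-the-envelope figures derived from the specifications listed above.
# Assumption: peak TOPS and typical chip power are treated as if they occur together.
PEAK_TOPS = 40.0
TYPICAL_CHIP_POWER_W = 3.5

# Standard PCIe Gen 3 parameters: 8 GT/s per lane with 128b/130b line encoding.
PCIE_GEN3_GT_PER_S = 8.0
PCIE_GEN3_ENCODING = 128 / 130
PCIE_LANES = 4


def tops_per_watt(tops: float, watts: float) -> float:
    """Compute efficiency in TOPS per watt."""
    return tops / watts


def pcie_bandwidth_gb_per_s(gt_per_s: float, encoding: float, lanes: int) -> float:
    """Theoretical one-direction link bandwidth in GB/s, before protocol overhead."""
    gbit_per_lane = gt_per_s * encoding  # usable Gbit/s per lane
    return gbit_per_lane / 8 * lanes     # bits to bytes, then all lanes


efficiency = tops_per_watt(PEAK_TOPS, TYPICAL_CHIP_POWER_W)
bandwidth = pcie_bandwidth_gb_per_s(PCIE_GEN3_GT_PER_S, PCIE_GEN3_ENCODING, PCIE_LANES)

print(f"Efficiency: about {efficiency:.1f} TOPS/W")
print(f"PCIe Gen 3 x4: about {bandwidth:.2f} GB/s per direction")
```

Under these assumptions the module lands at roughly 11 TOPS/W, with a little under 4 GB/s available between the host and the accelerator in each direction.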

The Hailo-10 can run Llama2-7B at up to 10 tokens per second (TPS) at under 5W of power, and it can generate one image from text in under 5 seconds using Stable Diffusion 2.1 in the same power envelope. Like many other technologies, generative AI is moving from the cloud to the device itself, enabling lower latency and offline support.
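
The throughput and power figures above translate into energy per unit of work. The short Python sketch below does that arithmetic. It treats the quoted limits (5W, 10 TPS, 5 seconds) as if they were reached at the same time, which is an assumption made here for illustration, so the results are rough bounds rather than measured values.

```python
# Energy estimates from the quoted figures. Assumption: the 5 W ceiling,
# the 10 tokens/s peak, and the 5 s image time all apply simultaneously.
POWER_LIMIT_W = 5.0
LLAMA2_7B_TOKENS_PER_S = 10.0
STABLE_DIFFUSION_SECONDS = 5.0


def joules_per_token(power_w: float, tokens_per_s: float) -> float:
    """Energy per generated token: watts divided by tokens per second."""
    return power_w / tokens_per_s


def joules_per_image(power_w: float, seconds: float) -> float:
    """Energy per generated image: watts multiplied by generation time."""
    return power_w * seconds


token_energy = joules_per_token(POWER_LIMIT_W, LLAMA2_7B_TOKENS_PER_S)
image_energy = joules_per_image(POWER_LIMIT_W, STABLE_DIFFUSION_SECONDS)

print(f"Llama2-7B: under {token_energy:.1f} J per token at peak speed")
print(f"Stable Diffusion 2.1: under {image_energy:.0f} J per image")
```

In other words, the claims work out to less than half a joule per token at peak speed and less than 25 joules per generated image.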

Generative AI vs discriminative AI
Evolution of edge AI from discriminative AI to generative AI – Source: Hailo’s blog

The Hailo-10 is supported by the same AI software suite as its predecessors (Hailo-8 and Hailo-15), with a dataflow compiler, a model zoo with TensorFlow and ONNX model formats, Hailo TAPPAS pre-trained AI applications, HailoRT runtime software for the host processor, and the Hailo-10H firmware.
Hailo AI software suite

The Hailo-10H-based M.2 modules can be plugged into existing PCs and edge devices with a spare M.2 PCIe socket to add generative AI capabilities. The company says that the Hailo-10 AI accelerator modules will initially target PCs and automotive infotainment systems to power on-device chatbots, copilots, personal assistants, and speech-operated operating systems. It’s the second generative AI chip for the edge we’ve covered on CNX Software, since the Ambarella N1 SoC, which combines 16 Arm Cortex-A78AE cores and an AI accelerator into a single chip, was unveiled in January 2024.

The company says it will begin shipping samples of the Hailo-10 GenAI accelerator in Q2 of 2024. The previous Hailo-8 AI accelerator found its way into many systems from various embedded PC vendors, but the Hailo-10 will also be suitable for consumer devices. It may take a while before it becomes more broadly available. For instance, the Hailo-15 was first introduced in March 2023, but the first commercial device, the SolidRun Hailo-15 SoM, was only announced a few days ago. Additional information, including a product brief, may be found on the product page.

6 Comments
Bernstein
28 days ago

Why bother when all upcoming cpus have this builtin?

Ani
28 days ago

Why bother making video cards when almost all cpus have GPU builtin?

japori
28 days ago

for all the already deployed and in use machines without npu coprocessor

Meco
28 days ago

I would love to try it out with Windows on Rockchip EDK2-RK3588. https://github.com/HeyMeco/Rockchip-pcie-devices could use a NPU category

John Smith
28 days ago

I would actually like to see a chip that can have many TOPS that just improves old 240p video to at least 1080p or 4k, can make all sorts of grudgy video and compression artifacts to look pristine that can just hook up to an extra nvme slot.

Tom Cubie
26 days ago

@John Smith

Radxa will provide a micro edge device for your application. Upscale old videos to 4K locally. Stay tuned.
