AMD Kria K24 Zynq Ultrascale+ system-on-module targets motor control and DSP applications

The AMD Kria K24 System-on-Module (SOM) with a custom-built Zynq UltraScale+ MPSoC and the KD240 Drives Starter Kit are designed for the development of cost-sensitive industrial and commercial edge applications.

The new Kria K24 is about half the size of a credit card and uses half the power of the larger, but connector-compatible, Kria K26 SOM that was introduced in 2021 for computer vision applications when the company was still known as Xilinx. That means existing K26 carrier boards can be reused with the Kria K24 SOM without modifying the PCB, but note there’s only one 240-pin connector on the new module, plus an extra 40-pin connector.

AMD Kria K24

AMD Kria K24 specifications:

  • MPSoC – Custom-built Zynq UltraScale+ XCK24
    • Quad-core Arm Cortex-A53 processor up to 1.3 GHz
    • Dual-core Arm Cortex-R5F real-time processor up to 533 MHz
    • Mali-400 MP2 GPU up to 600 MHz
    • FPGA fabric with 154K logic cells
    • AMD Deep Learning Processor – B2304 DPU with 852 GOPS
    • 9.4 Mbit on-chip SRAM (distributed RAM & block RAM)
  • System Memory – 2GB 32-bit LPDDR4 @ 1066 Mbps with ECC on industrial-grade module
  • Storage – TBD
  • 1x 240-pin board-to-board connector and 1x 40-pin board-to-board connector
    • Networking – Up to 4x 1 Gbps Ethernet (2x PS GEM, 2x PL GEM)
    • Motor Drive & Control – Three-phase inverters, Quadrature encoder, Brake control, Torque sensor interface
    • USB – 2x USB 2.0 / 3.0
    • Other – CAN, RS-485, GPIOs, etc…
  • Security – IEC 62443 standard with hardware root of trust (RSA, AES, and SHA); discrete TPM 2.0 device
  • Dimensions (with heat spreader) – 60 x 42 x 11mm
  • Two versions
    • Commercial – 0 to 85°C temperature range, 2-year warranty, 5-year operating lifetime, 10-year availability
    • Industrial – –40 to 100°C temperature range, ECC memory support, 3-year warranty, 10-year operating lifetime, 10-year availability


Kria K24 latency and power efficiency

Two things make the K24 SOM especially suitable for motor control and DSP applications: its low latency (120 ns), which AMD claims is half that of the Texas Instruments AM64x processors, and its high efficiency for DSP processing, with the module consuming only 2.5W of power compared to GPU-based solutions from NVIDIA such as the Jetson TX2 and Jetson Nano.

The latency comparison is based on results reported by TI for a full control loop implementation on a Texas Instruments TMDS64EVM board using a Texas Instruments benchmark, against a full control loop implementation using a Field Oriented Control (FOC) algorithm designed by Qdesys running on the KD240 starter kit. AMD further claims the latency advantage grows to up to 7x as the number of motor axes increases. For the power comparison, the idle/default power draw of the Kria K24 SOM was measured with the xmutil platform utility while loading a bitstream to the SOM, and AMD used the default/idle power figures published by NVIDIA for the Jetson Nano (10W) and the Jetson TX2 (15W).
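For readers unfamiliar with Field Oriented Control (FOC), the current loop whose latency is being compared can be sketched in a few lines of Python. This is only a textbook illustration, not the Qdesys or TI implementation: on the KD240 the equivalent logic would run in FPGA fabric, and every name here (`clarke`, `park`, `PI`, `foc_current_step`) and every gain value is a hypothetical choice made for the example.

```python
import math


def clarke(ia, ib, ic):
    """Amplitude-invariant Clarke transform: phase currents -> (alpha, beta)."""
    alpha = (2.0 * ia - ib - ic) / 3.0
    beta = (ib - ic) / math.sqrt(3.0)
    return alpha, beta


def park(alpha, beta, theta):
    """Rotate the stationary (alpha, beta) frame into the rotor (d, q) frame."""
    c, s = math.cos(theta), math.sin(theta)
    return alpha * c + beta * s, -alpha * s + beta * c


def inverse_park(d, q, theta):
    """Rotate rotor-frame (d, q) values back into the stationary frame."""
    c, s = math.cos(theta), math.sin(theta)
    return d * c - q * s, d * s + q * c


class PI:
    """Discrete PI controller with clamped integrator and output (hypothetical helper)."""

    def __init__(self, kp, ki, dt, limit):
        self.kp, self.ki, self.dt, self.limit = kp, ki, dt, limit
        self.integral = 0.0

    def update(self, error):
        # Accumulate the integral term and clamp it to avoid wind-up.
        self.integral += self.ki * error * self.dt
        self.integral = max(-self.limit, min(self.limit, self.integral))
        out = self.kp * error + self.integral
        return max(-self.limit, min(self.limit, out))


def foc_current_step(ia, ib, ic, theta, id_ref, iq_ref, pi_d, pi_q):
    """One current-loop iteration: returns the (v_alpha, v_beta) voltage command."""
    alpha, beta = clarke(ia, ib, ic)
    d, q = park(alpha, beta, theta)
    vd = pi_d.update(id_ref - d)   # flux-producing axis
    vq = pi_q.update(iq_ref - q)   # torque-producing axis
    return inverse_park(vd, vq, theta)
```

In a real drive, the returned voltage pair would feed a PWM stage driving the three-phase inverter, and the time from sampling the phase currents to updating the PWM is the kind of loop latency the comparison above refers to.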

The Kria K24 SoM runs Yocto-based PetaLinux or Ubuntu Server 22.04 Linux distributions and supports pre-built HW acceleration with Vitis motor control libraries. Additional apps can be installed from the Kria App Store, and Python and the MATLAB Simulink environment can also be used to program the module and KD240 Drives Starter Kit shown below. More details about the software can be found in the wiki.


Kria KD240 Drives Starter Kit

KD240 Drives Starter Kit specifications:

  • SoM – Kria K24 SOM as described above fitted with a heatsink
  • Storage – 512 Mbit QSPI flash, MicroSD card slot (both bootable)
  • Networking
    • 2x PL Gigabit Ethernet RJ45 ports with time-sensitive networking (TSN) and EtherCAT support
    • 1x PS Gigabit Ethernet RJ45 port
  • USB – 2x USB 3.0 Type-A ports
  • Serial – CAN Bus and RS485 terminal blocks
  • Motor control
    • Torque sensor connector
    • 3-phase motor connector
    • Brake control connector
    • DC Link connector
    • Single-ended QEI (Quadrature Encoder Interface) connector
    • Differential QEI connector
  • Expansion
    • 12-pin PMOD connector
    • 1-wire interface
  • Debugging – JTAG connector, micro USB port for JTAG/serial
  • Misc – Fan connector
  • Power Supply – DC jack
  • Dimensions – 124 x 142 x 37 mm
  • Weight – 237 grams

AMD motor control and DSP development kit block diagram

Target applications include electric motor systems, robotics for factory automation, power generation, public transportation such as elevators and trains, surgical robotics, medical equipment (e.g. MRI beds), and EV charging stations.

The K24 SOM and KD240 Drives Starter Kit are available to order now. The K24 commercial version is shipping today, while the industrial version should start shipping in Q4. The Kria KD240 Drives Starter Kit can be purchased for $399 directly on the AMD online shop or via distributors, and a $199 motor accessories pack may be a useful addition for quick evaluation. Further details may be found on the product page and the press release.


7 Replies to “AMD Kria K24 Zynq Ultrascale+ system-on-module targets motor control and DSP applications”

  1. 399usd for the starter kit? I thought it would be well over 1k.
    Of course an FPGA solution is going to require less power and possibly have lower latency (did they use PRUs for the FOC in the TI example?) than a Linux/RTOS implementation. The question is if it’s worth the hassle of implementing using HDL instead of cpp – or are they using one of those translation layers?

    1. It is pretty expensive actually. The K24 SOM cost as much as the KV260 dev board, and the KD240 dev board is the most expensive of them all.

      I guess KV260 and KR260 are subsidized and sold below cost, but still…

      I bought my KV260 for $250, and I believe it was launched at $200.

      The question is if it’s worth the hassle of implementing using HDL instead of cpp

      You can actually use HLS for Xilinx FPGAs, which is a lot easier than HDL. You write a lot in conventional C/Cpp (though ofc, there are still a lot of parts that have nothing to do with cpp).

      There are also some tools with Matlab integration, so you design the control loop there and then export it to HDL.

      I don’t know what they have used though.

  2. I don’t understand their “lower latency” argument here in the context of motor control, it smells like fishy marketing. the 150ns difference is more or less a power MOSFET rise or fall time, and in any case way less than a cycle time on a stepper motor (unless they’re planning to operate motors at millions of RPM of course). And even if they had hardware capable of reacting to this, the time saved corresponds to a distance of 4 micrometers for a vehicle driving at 100 km/h. For sure a self-driving vehicle doesn’t know where it exactly is with a 4 microns precision! And the latency of image capture and processing, that’s used to take some decisions (e.g. braking) is 5 orders of magnitude higher.

    1. Well, yeah, faster loops aren’t always better with mechanical systems. Back in the 1990’s, I used an Adept robot with 6 axes of smooth, fast motion — all running on a 500 Hz control loop on a Motorola 68040 🙂

      Of course, there was probably hand-coded assembly code involved, but still, quite impressive.

    2. I would assume it’s not for vehicular applications. Notice however, that this is probably some pretty basic Control loop.

      If your application requires, say, DSP processing at MSamples rates (i.e. many millions of samples) for some 256-tap filter, then well, you will have no way but to use an FPGA. CPUs would likely take over 1 us per sample, while in FPGAs, it could be pipelined for much better results.

  3. Odd design. Yes, it would be wonderful to have a Zynq module+board w/ all the industrial IO, where enough of the fabric I/O is broken out (and isolated) such that someone could write all the microstep-generators, pulse-generators, encoder resolvers, etc in the fabric, to have them work without any CPU time,
    then also add a RISC-V or Microblaze MCU to do realtime control,
    and then tie it all to the A-core to run Klipper or LinuxCNC or something else.
    (assuming an Ubuntu port is done)

    That would be a GREAT design to have, especially if set up for 4 axes, with expandability to more. There are even some active projects on YouTube talking about doing EXACTLY such a design.
    Xilinx could make “one board to rule them all” for the entire CNC/3DP space.

    Oh well …
