AMD Kria K24 Zynq Ultrascale+ system-on-module targets motor control and DSP applications

The AMD Kria K24 System-on-Module (SOM) with a custom-built Zynq UltraScale+ MPSoC and the KD240 Drives Starter Kit are designed for the development of cost-sensitive industrial and commercial edge applications.

The new Kria K24 is about half the size of a credit card and uses half the power of the larger, but connector-compatible, Kria K26 SOM that was introduced in 2021 for computer vision applications when the company was still known as Xilinx. That means existing K26 carrier boards can be reused with the Kria K24 SOM without modifying the PCB, but note there’s only one 240-pin connector on the new module, plus an extra 40-pin connector.

AMD Kria K24 specifications:

MPSoC – Custom-built Zynq UltraScale+ XCK24
- Quad-core Arm Cortex-A53 processor up to 1.3 GHz
- Dual-core Arm Cortex-R5F real-time processor up to 533 MHz
- Mali-400 MP2 GPU up to 600 MHz
- FPGA fabric with 154K logic cells
- AMD Deep Learning Processor – B2304 DPU with 852 GOPS
- 9.4 Mbit on-chip SRAM (distributed RAM & block RAM)
System Memory – 2GB 32-bit LPDDR4 @ 1066 Mbps with ECC on industrial-grade module
Storage – TBD
1x 240-pin board-to-board connector and 1x 40-pin board-to-board connector
- Networking – Up to 4x 1 Gbps Ethernet (2x PS GEM, 2x PL GEM)
- Motor Drive & Control – Three-phase inverters, Quadrature encoder, Brake control, Torque sensor interface
- USB – 2x USB 2.0 / 3.0
- Other – CAN, RS-485, GPIOs, etc…
Security – IEC 62443 standard with hardware root of trust (RSA, AES, and SHA); discrete TPM 2.0 device
Dimensions (with heat spreader) – 60 x 42 x 11mm
Two versions
- Commercial – 0 to 85°C temperature range, 2-year warranty, 5-year operating lifetime, 10-year availability
- Industrial – –40 to 100°C temperature range, ECC memory support, 3-year warranty, 10-year operating lifetime, 10-year availability

Kria K24 latency and power efficiency

The K24 SOM is positioned for motor control because of its low latency (120 ns), which AMD claims is half that of the Texas Instruments AM64x processors, and for DSP applications because of its power efficiency: AMD says it consumes only 2.5W, compared to higher figures for GPU-based solutions from NVIDIA such as the Jetson TX2 and Jetson Nano.

The latency comparison is based on results reported by TI for a full control loop implementation running a Texas Instruments benchmark on a Texas Instruments TMDS64EVM board, versus a full control loop implementation using a Field Oriented Control algorithm designed by Qdesys running on the KD240 starter kit. AMD further claims the latency advantage improves up to 7x as the number of motor axes increases. For the power comparison, the idle/default power draw of the Kria K24 SOM was measured with the xmutil platform utility while loading a bitstream to the SOM, and AMD took the default/idle power numbers published by NVIDIA for the Jetson Nano (10W) and the Jetson TX2 (15W).
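
For readers unfamiliar with Field Oriented Control, the following is a minimal Python sketch of what one iteration of a textbook FOC current loop computes: a Clarke transform, a Park transform, two PI controllers, and an inverse Park transform. It is not the Qdesys implementation benchmarked by AMD, which presumably runs in the FPGA fabric. The function names, the `PI` class, and the gain values are all hypothetical and chosen only for illustration.

```python
import math


def clarke(i_a, i_b, i_c):
    """Amplitude-invariant Clarke transform: phase currents -> (alpha, beta)."""
    i_alpha = (2.0 * i_a - i_b - i_c) / 3.0
    i_beta = (i_b - i_c) / math.sqrt(3.0)
    return i_alpha, i_beta


def park(i_alpha, i_beta, theta):
    """Rotate stationary-frame values into the rotor (d, q) frame."""
    c, s = math.cos(theta), math.sin(theta)
    return c * i_alpha + s * i_beta, -s * i_alpha + c * i_beta


def inverse_park(v_d, v_q, theta):
    """Rotate rotor-frame voltages back into the stationary frame."""
    c, s = math.cos(theta), math.sin(theta)
    return c * v_d - s * v_q, s * v_d + c * v_q


class PI:
    """Basic proportional-integral controller (no anti-windup, for brevity)."""

    def __init__(self, kp, ki, dt):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.integral = 0.0

    def step(self, error):
        self.integral += error * self.dt
        return self.kp * error + self.ki * self.integral


def foc_step(i_abc, theta, i_d_ref, i_q_ref, pi_d, pi_q):
    """One current-loop iteration: measured currents in, voltage command out."""
    i_alpha, i_beta = clarke(*i_abc)
    i_d, i_q = park(i_alpha, i_beta, theta)
    v_d = pi_d.step(i_d_ref - i_d)
    v_q = pi_q.step(i_q_ref - i_q)
    return inverse_park(v_d, v_q, theta)


if __name__ == "__main__":
    # Hypothetical gains and a 10 kHz loop period, for illustration only.
    dt = 1e-4
    pi_d, pi_q = PI(0.5, 100.0, dt), PI(0.5, 100.0, dt)
    theta = 0.7  # rotor electrical angle in radians
    i_abc = (
        math.cos(theta),
        math.cos(theta - 2.0 * math.pi / 3.0),
        math.cos(theta + 2.0 * math.pi / 3.0),
    )
    print(foc_step(i_abc, theta, 0.0, 1.0, pi_d, pi_q))
```

The latency figures quoted above would correspond to the time needed to get from sampled currents to an updated voltage command in a loop of this kind; in a real drive the output would then go through a PWM or space-vector modulation stage, which is omitted here.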

The Kria K24 SOM runs Yocto-based PetaLinux or Ubuntu Server 22.04 Linux distributions and supports pre-built HW acceleration with Vitis motor control libraries. Additional apps can be installed from the Kria App Store, and Python and the MATLAB Simulink environment can also be used to program the module and the KD240 Drives Starter Kit. More details about the software can be found in the wiki.

KD240 Drives Starter Kit specifications:

SoM – Kria K24 SOM as described above fitted with a heatsink
Storage – 512 Mbit QSPI flash, MicroSD card slot (both bootable)
Networking
- 2x PL Gigabit Ethernet RJ45 port with time-sensitive networking (TSN) and EtherCAT support
- 1x PS Gigabit Ethernet RJ45 port
USB – 2x USB 3.0 Type-A ports
Serial – CAN Bus and RS485 terminal blocks
Motor control
- Torque sensor connector
- 3-phase motor connector
- Brake control connector
- DC Link connector
- Single-ended QEI (Quadrature Encoder Interface) connector
- Differential QEI connector
Expansion
- 12-pin PMOD connector
- 1-wire interface
Debugging – JTAG connector, micro USB port for JTAG/serial
Misc – Fan connector
Power Supply – DC jack
Dimensions – 124 x 142 x 37 mm
Weight – 237 grams

Target applications include electric motor systems, robotics for factory automation, power generation, public transportation such as elevators and trains, surgical robotics, medical equipment (e.g. MRI beds), and EV charging stations.

The K24 SOM and KD240 Drives Starter Kit are available to order now, but while the K24 commercial version is shipping today, the industrial version should start shipping in Q4. The Kria KD240 Drives Starter Kit can be purchased for $399 directly from the AMD online shop or via distributors, and a $199 motor accessories pack may be a useful addition for quick evaluation. Further details may be found on the product page and the press release.

Jean-Luc Aufranc (CNXSoft)

Jean-Luc started CNX Software in 2010 as a part-time endeavor, before quitting his job as a software engineering manager, and starting to write daily news, and reviews full time later in 2011.

7 Replies to “AMD Kria K24 Zynq Ultrascale+ system-on-module targets motor control and DSP applications”

and.elf says:

September 22, 2023 at 09:55

399usd for the starter kit? I thought it would be well over 1k.
Of course an FPGA solution is going to require less power and possibly have lower latency (did they use PRUs for the FOC in the TI example?) than a Linux/RTOS implementation. The question is if it’s worth the hassle of implementing using HDL instead of cpp – or are they using one of those translation layers?

1. persondb says:
  
  September 25, 2023 at 02:14
  
  It is pretty expensive actually. The K24 SOM cost as much as the KV260 dev board, and the KD240 dev board is the most expensive of them all.
  
  I guess KV260 and KR260 are subsidized and below cost but still…
  
  I bought my KV260 for $250, and I believe it was launched at $200.
  
  The question is if it’s worth the hassle of implementing using HDL instead of cpp
  
  You can actually use HLS for Xilinx FPGAs which is a lot easier than HDL. You write a lot in conventional C/Cpp (though ofc, there are still a lot of parts that have nothing to do with cpp).
  
  There are also some tools with Matlab integration, so you design the control loop there and then export it to HDL.
  
  I don’t know what they have used though.
  
Willy says:

September 22, 2023 at 10:45

I don’t understand their “lower latency” argument here in the context of motor control, it smells like fishy marketing. The 150ns difference is more or less a power MOSFET rise or fall time, and in any case way less than a cycle time on a stepper motor (unless they’re planning to operate motors at millions of RPM of course). And even if they had hardware capable of reacting to this, the time saved corresponds to a distance of 4 micrometers for a vehicle driving at 100 km/h. For sure a self-driving vehicle doesn’t know exactly where it is with 4-micron precision! And the latency of image capture and processing, that’s used to take some decisions (e.g. braking) is 5 orders of magnitude higher.

1. TonyT says:
  
  September 23, 2023 at 04:12
  
  Well, yeah, faster loops aren’t always better with mechanical systems. Back in the 1990’s, I used an Adept robot with 6 axes of smooth, fast motion — all running on a 500 Hz control loop on a Motorola 68040 🙂
  
  Of course, there was probably hand-coded assembly code involved, but still, quite impressive.
  
2. persondb says:
  
  September 25, 2023 at 02:05
  
  I would assume it’s not for vehicular applications. Notice however, that this is probably some pretty basic Control loop.
  
  If your application requires, say, a DSP application with MSamples (i.e. many millions of samples) for some 256-tap filter then well, you will have no choice but to use an FPGA. CPUs would likely take over 1 us per sample, while in FPGAs, it could be pipelined for much better results.
  
TonyT says:

September 23, 2023 at 04:14

BTW, Digikey lists the SM-K24-XCL2GC SOM at $307.50, and the SM-K24-XCL2GI at $430.50, with none in stock.

Andrew says:

September 29, 2023 at 02:18

Odd design. Yes, it would be wonderful to have a Zynq module+board w/ all the industrial IO, where enough of the fabric I/O is broken out (and isolated) such that someone could write all the microstep-generators, pulse-generators, encoder resolvers, etc in the fabric, to have them work without any CPU time,
then also add a RISC-V or Microblaze MCU to do realtime control,
and then tie it all to the A-core to run Klipper or LinuxCNC or something else.
(assuming an Ubuntu port is done)

That would be a GREAT design to have, especially if set up for 4 axes, with expandability to more. There are even some active projects on YouTube talking about doing EXACTLY such a design.
Xilinx could make “one board to rule them all” for the entire CNC/3DP space.

Oh well …
