VDPU381 and VDPU383 video decoders are found in Rockchip RK3588 and RK3576 SoCs and variants like the RK3588S and RK3576J. So far, we had to rely on the Rockchip BSP to support hardware video decoding, but Collabora has just announced upstream/mainline Linux support for H.264 (AVC) and H.265 (HEVC) video decoding for RK3588 and RK3576 SoCs.
Highlights of H.265/H.264 video decoder implementation on mainline Linux:
- A 17-patch series adding decoder support, in addition to dt-bindings and device tree nodes
- New V4L2 HEVC UAPI controls for explicit short-term and long-term RPS (Reference Picture Set) handling
- Fixing a non-obvious IOMMU restore issue caused by decoder-embedded IOMMU resets
- Struct-based register programming model to enforce completeness, ordering, and future multi-core readiness

The new V4L2 UAPI controls for HEVC long-term and short-term Reference Picture Sets (RPS) are required for the VDPU381 (RK3588) and VDPU383 (RK3576) video decoders, unlike some other decoders (e.g., VeriSilicon) that can ignore them. So an API was needed for userspace to pass fully described short-term and long-term RPS tables to the kernel. The company also added support in the Virtual Stateless Decoder (visl) driver, which shows ftraces with all control parameters. The V4L2 UAPI controls were implemented in GStreamer 1.28 (merged) and FFmpeg (preliminary). The new API also enables compatibility with Vulkan Video Decode.
The IOMMU restore issue is an interesting one. The IOMMU core is embedded into the Rockchip decoders, so when the decoder is reset, the internal IOMMU is also reset, clearing all previous address mappings. However, the kernel still considers the IOMMU mappings valid after a decoder reset. A patch fixes this issue by explicitly restoring cached IOMMU mappings after a decoder reset. This also impacts other IP blocks in Rockchip SoCs, like the RGA 2D graphics accelerator.
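To make the failure mode concrete, here is a minimal userspace sketch in C of the idea described above: the kernel keeps a cached copy of the mappings, a decoder reset wipes the hardware copy, and a restore step replays the cache. It is only a model under stated assumptions. Every name in it (`demo_iommu`, `demo_decoder_reset()`, `demo_iommu_restore()`, and so on) is hypothetical and does not come from the actual patch or from the kernel IOMMU API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_MAPPINGS 8

/* Hypothetical mapping entry: one I/O virtual address to one physical address. */
struct demo_mapping {
    uint64_t iova;
    uint64_t phys;
    bool valid;
};

/* Hypothetical model: "cached" is what the kernel believes is mapped,
 * "hw" is what the decoder-embedded IOMMU actually holds. */
struct demo_iommu {
    struct demo_mapping cached[DEMO_MAX_MAPPINGS];
    struct demo_mapping hw[DEMO_MAX_MAPPINGS];
};

/* Record a mapping in both the software cache and the (simulated) hardware. */
int demo_iommu_map(struct demo_iommu *mmu, uint64_t iova, uint64_t phys)
{
    assert(mmu != NULL);
    for (size_t i = 0; i < DEMO_MAX_MAPPINGS; i++) {
        if (!mmu->cached[i].valid) {
            mmu->cached[i] = (struct demo_mapping){ iova, phys, true };
            mmu->hw[i] = mmu->cached[i];
            return 0;
        }
    }
    return -1; /* table full */
}

/* A decoder reset also resets the embedded IOMMU: hardware state is lost,
 * but the software cache is untouched and still looks valid. */
void demo_decoder_reset(struct demo_iommu *mmu)
{
    assert(mmu != NULL);
    memset(mmu->hw, 0, sizeof(mmu->hw));
}

/* The fix in this model: replay the cached mappings into the hardware. */
void demo_iommu_restore(struct demo_iommu *mmu)
{
    assert(mmu != NULL);
    memcpy(mmu->hw, mmu->cached, sizeof(mmu->hw));
}

/* Translate through the hardware table only, as a DMA access would. */
bool demo_iommu_translate(const struct demo_iommu *mmu, uint64_t iova,
                          uint64_t *phys)
{
    assert(mmu != NULL && phys != NULL);
    for (size_t i = 0; i < DEMO_MAX_MAPPINGS; i++) {
        if (mmu->hw[i].valid && mmu->hw[i].iova == iova) {
            *phys = mmu->hw[i].phys;
            return true;
        }
    }
    return false;
}
```

In this model, a translation that succeeds before `demo_decoder_reset()` fails right after it, even though the cache still lists the mapping, and succeeds again once `demo_iommu_restore()` has run. That mismatch between the two tables is the bug the article describes.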
Rockchip decoder registers have some default values listed in the datasheet. However, the hardware may end up in an inconsistent state if the kernel code skips a write to a register, even when the value is the default one, so it is safer to write all registers every time. Collabora engineers also realized that write order matters, and that writing values in the wrong sequence can break the decoder. For these two reasons, they decided to rely on a C struct for register programming instead of ad-hoc writel() calls or regmap. See the commit for details.
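The struct-based approach can be illustrated with a small userspace sketch in C. The register layout, field names, and the `demo_writel()` helper below are all hypothetical stand-ins chosen for illustration, not the actual VDPU381/VDPU383 register map or the driver's code. The point is only that a struct whose fields follow the required write order, flushed in one loop, writes every register (defaults included) in a fixed sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register block: the field order is the write order. */
struct demo_dec_regs {
    uint32_t ctrl;        /* offset 0x00 */
    uint32_t pic_size;    /* offset 0x04: height in bits 31:16, width in 15:0 */
    uint32_t stream_addr; /* offset 0x08 */
    uint32_t start;       /* offset 0x0c: written last to kick off decoding */
};

#define DEMO_NUM_REGS (sizeof(struct demo_dec_regs) / sizeof(uint32_t))

/* Stand-ins for the MMIO window and a log of which index was written when. */
static uint32_t demo_mmio[DEMO_NUM_REGS];
static size_t demo_write_log[DEMO_NUM_REGS];
static size_t demo_write_count;

/* Simulated register write; a real driver would use writel() here. */
void demo_writel(uint32_t value, size_t index)
{
    assert(index < DEMO_NUM_REGS && demo_write_count < DEMO_NUM_REGS);
    demo_mmio[index] = value;
    demo_write_log[demo_write_count++] = index;
}

/* Build the full register image in memory. Starting from a zeroed struct
 * means every field has a defined value, even ones the caller never set. */
struct demo_dec_regs demo_build_regs(uint32_t width, uint32_t height,
                                     uint32_t stream_addr)
{
    struct demo_dec_regs regs;

    memset(&regs, 0, sizeof(regs));
    regs.pic_size = (height << 16) | (width & 0xffffu);
    regs.stream_addr = stream_addr;
    regs.start = 1;
    return regs;
}

/* Flush the whole struct: every register is written, in field order. */
void demo_program_regs(const struct demo_dec_regs *regs)
{
    uint32_t words[DEMO_NUM_REGS];

    assert(regs != NULL);
    memcpy(words, regs, sizeof(words));
    demo_write_count = 0;
    for (size_t i = 0; i < DEMO_NUM_REGS; i++)
        demo_writel(words[i], i);
}
```

Compared with scattered individual writes, this makes it hard to forget a register or to reorder writes by accident, since both completeness and ordering are properties of the struct definition rather than of each call site.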
Support has been merged in Linux 7.0 (See comments section). This type of news is both great and frustrating, because we can get a relatively recent and powerful processor with a vendor BSP, or we can get a 4 to 5-year-old processor with mainline Linux, but we can’t seem to get a new processor with mainline Linux.
Going forward, Collabora will work on multi-core support on RK3588 since it has two VDPU381 decoder cores, AV1 support on RK3576, and VP9 codec support on RK3588, and will also add support for the VDPU346 decoder, a variant of VDPU381 found on RK356X SoCs (RK3562/RK3566/RK3568). A more detailed summary of the changes can be found on the Collabora website.

Jean-Luc started CNX Software in 2010 as a part-time endeavor, before quitting his job as a software engineering manager, and starting to write daily news, and reviews full time later in 2011.



