Linux hardware video encoding on Amlogic A311D2 processor

I’ve spent a bit more time with Ubuntu 22.04 on the Khadas VIM4 Amlogic A311D2 SBC, and while the performance is generally good, features like 3D graphics acceleration and hardware video decoding are missing. But I was pleased to see a Linux hardware video encoding section in the Wiki, as it’s not something we often see supported early on. So I’ve given it a try…

First, we need to make a video in the NV12 pixel format that’s commonly output by cameras. I downloaded a 45-second 1080p H.264 sample video from Linaro, and converted it with ffmpeg:
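A conversion along these lines can be sketched as follows. This is a minimal sketch under assumptions, not necessarily the exact command used: the file names are placeholders, and the command is built in Python only so that it can be printed or handed to subprocess. The -pix_fmt nv12 and -f rawvideo options are standard ffmpeg options for writing headerless NV12 frames.

```python
import shlex


def nv12_convert_cmd(src, dst):
    """Build an ffmpeg command that decodes src and writes raw NV12 frames to dst."""
    return ["ffmpeg", "-i", src, "-f", "rawvideo", "-pix_fmt", "nv12", dst]


# Placeholder file names (assumptions, not taken from the article).
cmd = nv12_convert_cmd("sample_1080p.mp4", "sample_1080p.nv12")
print(shlex.join(cmd))
# To actually run the conversion: subprocess.run(cmd, check=True)
```

The raw output has no container, so the resolution and frame rate have to be passed separately to whatever reads the file.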

I did this on my laptop. As a raw video, it’s pretty big with 3.3GB of storage used for a 45-second video:

Now let’s try to encode the video to H.264 on the Khadas VIM4 board using the aml_enc_test hardware video encoding sample:

The output explains the parameters used. There are some error messages, but the video can be played back with ffplay on my computer without issues.

Amlogic A311D2 H.264 video encoding sample

We can also see that encoding took 26 seconds, which is faster than real-time since the video is 45 seconds long.

Let’s try the same with H.265 encoding:

Surprisingly, H.265 video encoding is quite a bit faster than H.264 video encoding. Let’s try H.264 encoding again:

Ah. It’s now taking less than 9 seconds. The first run is slow because the data is read from the eMMC flash, but since the file is only 3.3GB, it fits into the cache, so the second time there’s no bottleneck from storage.
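To put the two H.264 runs side by side, the speed relative to real time is simply the video duration divided by the encoding time. The short sketch below uses only the figures quoted above (a 45-second video, 26 seconds for the first run, under 9 seconds for the second); the function name is just for illustration.

```python
def realtime_factor(video_seconds, encode_seconds):
    """Return how many times faster than real time the encode ran."""
    return video_seconds / encode_seconds


first_run = realtime_factor(45, 26)   # data read from eMMC flash
second_run = realtime_factor(45, 9)   # lower bound, since the run took under 9 s
print(f"first run: {first_run:.2f}x, second run: at least {second_run:.1f}x")
```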

Amlogic A311D2 H.265 hardware video encoding sample

Nevertheless, the dump.h265 file also played fine on my computer, so the conversion was successful.

Amlogic A311D2 specifications say “H.265 & H.264 at 4Kp50” video encoding is supported. So let’s create a 45-second 4Kp50 video and convert it to NV12 YUV format. Oops, the size of the raw video is 27GB, and it won’t fit into the board’s eMMC flash… Let’s cut that to 30 seconds (about 18GB)…
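The raw file sizes follow directly from the NV12 layout, which stores 12 bits (1.5 bytes) per pixel. Here is a small sketch of the arithmetic, assuming "4K" means 3840x2160; the function names are only for illustration.

```python
def nv12_frame_bytes(width, height):
    """Size of one NV12 frame: a full-size Y plane plus an interleaved UV plane half that size."""
    return width * height * 3 // 2


def raw_size_bytes(width, height, fps, seconds):
    """Size of an uncompressed NV12 clip."""
    return nv12_frame_bytes(width, height) * fps * seconds


for seconds in (45, 30):
    size = raw_size_bytes(3840, 2160, 50, seconds)
    print(f"{seconds} s of 4Kp50: {size / 1e9:.1f} GB ({size / 2**30:.1f} GiB)")
```

The results land close to the 27GB and 18GB figures above, with the small differences depending on whether decimal or binary gigabytes are used.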

Now we can encode the video to H.264:

Two minutes to encode a 30-second video! That does not cut it, so let’s run the sample again:

It’s even slower… I really think the storage is the bottleneck here because the required read speed for that file would be over 600 MB/s for real-time encoding. The system would typically encode video from the camera stream, not from the eMMC flash. I should have run iozone before:

The sequential read speed is about 178MB/s. I have a MINIX USB Hub with a 480GB SSD that I had tested at 400MB/s. Not quite what we need, but we should see an improvement.
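The numbers can be checked with a quick calculation. The sketch below assumes 3840x2160 NV12 frames at 50 fps and uses the two read speeds mentioned above (178MB/s for the eMMC flash, 400MB/s for the SSD); it ignores any encoder overhead, so the times are lower bounds set by storage alone.

```python
FRAME_BYTES = 3840 * 2160 * 3 // 2   # one 4K NV12 frame (1.5 bytes per pixel)
FPS = 50

required_rate = FRAME_BYTES * FPS    # bytes per second needed for real-time input
file_bytes = required_rate * 30      # the 30-second raw clip


def min_read_seconds(size_bytes, mb_per_s):
    """Shortest possible time to read size_bytes at a given sequential speed."""
    return size_bytes / (mb_per_s * 1e6)


print(f"required: {required_rate / 1e6:.0f} MB/s")
for name, speed in (("eMMC", 178), ("USB SSD", 400)):
    seconds = min_read_seconds(file_bytes, speed)
    max_fps = speed * 1e6 / FRAME_BYTES
    print(f"{name}: at least {seconds:.0f} s to read, about {max_fps:.0f} fps at best")
```

At 178MB/s, just reading the 30-second file takes well over a minute and a half, which is consistent with the two-minute encode being storage-bound.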


Sadly, the drive was not mounted, and was not even recognized by tools like fdisk and GParted. When double-checking Khadas VIM4 specifications, I realized the USB Type-C port was a USB 2.0 OTG interface that should recognize the drive, but only supports 480 Mbps, so it’s a lost cause anyway… The only way to achieve over 600MB/s would be to use a USB 3.0 NVMe SSD, but I don’t have any.
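To see how far off USB 2.0 is, here is the conversion from the 480 Mbps signalling rate to megabytes per second, compared with what 4Kp50 NV12 needs. It is a rough sketch that assumes 3840x2160 frames and ignores USB protocol overhead, so real throughput would be lower still.

```python
def mbps_to_mb_per_s(mbps):
    """Convert megabits per second to megabytes per second."""
    return mbps / 8


usb2_mb_per_s = mbps_to_mb_per_s(480)
needed_mb_per_s = 3840 * 2160 * 3 // 2 * 50 / 1e6   # 4Kp50 NV12 input rate

print(f"USB 2.0 ceiling: {usb2_mb_per_s:.0f} MB/s")
print(f"4Kp50 NV12 needs: {needed_mb_per_s:.0f} MB/s")
print(f"shortfall: about {needed_mb_per_s / usb2_mb_per_s:.1f}x")
```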

So instead, I’ll make a 5-second 4Kp50 video that’s about 2.9GB in size.

First run using H.265:

Second run:

One last try with H.264:

Not quite real-time, but it’s getting closer, and that means 4Kp30 should be feasible. That’s the result with a 5-second 4Kp30 NV12 video encoded with H.264:

Less than four seconds. So real-time 4Kp30 H.264 hardware video encoding is definitely working on Amlogic A311D2 processor.

Amlogic A311D2 4Kp30 hardware video encoding

It’s playing fine on my PC too.

It’s also possible to encode NV12 YUV images into JPEG, but it won’t work with the khadas user:

But no problem with sudo:

Probably just a simple permission issue. It performed the task in 44ms, and I can open dump.jpg (a screenshot) without issues.

Amlogic A311D2 JPEG hardware encoding

If I use ffmpeg to convert the NV12 file to jpeg, presumably with software encoding, it takes just under 200ms:

aml_enc_test and jpeg_enc_test are nice little utilities to test hardware video/image encoding in Linux on Amlogic A311D2, but the source code would be needed to integrate this into an application. It does not appear to be public at this time, so I’d assume it’s part of the Amlogic SDK. I’ll ask Khadas for the source code, or the method to get it.

9 Replies to “Linux hardware video encoding on Amlogic A311D2 processor”

  1. > the second time there’s no bottleneck from storage.

    That’s why I always run sbc-bench -m (monitoring mode) in parallel with such tests, to see which kind of task the system is spending time on (your 1st run surely showed an awful lot of %iowait).

    When switching to the performance governor on all CPU clusters, running iostat 5 instead consumes fewer resources. Though an unadjusted cpufreq governor might be more interesting -> maybe low(er) CPU clockspeeds due to VPU busy and similar…

  2. I’m not even surprised that one can’t use any of the well-known frameworks for utilizing hw-encoding/-decoding, whether it is OMX, VA-API, V4L2 or similar and instead would have to write device-specific software using Amlogic’s SDK.

    Want to use some popular, already-existing open-source software? Nope, gotta fork it and (try to!) modify it to work with Amlogic’s libraries!

  3. BTW: if revisiting this topic it would be interesting to check SoC thermals while testing since Amlogic’s BSP kernel exposes half a dozen thermal sensors for A311D2: find /sys -name "*thermal"

    With lm-sensors package installed this will work too ofc: while true ; do sensors; sleep 10; done

    But exploring /sys a bit might be worth the effort (clockspeeds/governors of memory, gpu, vpu and such things)

  4. Would it help to run these tests with the media files on a ramdisk/tmpfs, forcing them to be in RAM regardless of any caching?

    1. Nope for the following simple reasons:

      • in passive benchmarking mode you need to repeat each test at least 3 times (since passive benchmarking means you’ve no idea what you’re actually doing)
      • you always need to monitor the benchmark environment
      • Linux filesystem caches/buffers have worked fine for over a decade so the 2nd run will show the problem

  5. The first time I tried doing YUV encoding for a video codec I was using a PowerMac G5 and a brand new FireWire 800 external disk drive. I was taking an 854×480 NTSC MPEG2 stream from a DVD, and writing it to the hard drive at the same time I was reading the YUV (YUV4MPEG) file back in with the encoder software and writing the encoded file to the same external HDD. The drive had a physical failure in under 2 hours and was forevermore unusable.

    I switched to two different approaches.
    1) mkfifo will make a file that will take input from a process, and block until another process picks up that data to use it. This is great for running ffmpeg for the transcode to the raw format, but not actually using any space as the encoder picks it up.
    2) *NIX loves streams. You can have ffmpeg do the transcode to nv12 and pipe the output of that into your encoder. If your encoder can’t read data from stdin directly, you can point it at /dev/stdin.

    All of this is at the cost of some CPU and I/O that is handling the transcode to nv12 (or yuv4mpeg, or other) at runtime.

  6. I’ll pay $199 only if I get the encoding and decoding source code, otherwise I’ll keep my RK3568, which has reasonable encoding performance.

  7. GOP should be something like 250 or 300 for better compression. It sets how often you want your key frames (frames encoded without reference to other frames).
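The named-pipe approach suggested in the comments above can be sketched in Python. This is a minimal illustration, assuming a POSIX system: a producer thread stands in for ffmpeg writing raw frames, the consumer stands in for the encoder, and the tiny 16x16 frame size and the file name are made up for the demo. Whether aml_enc_test accepts a FIFO as its input file is not something the article confirms.

```python
import os
import tempfile
import threading

FRAME_BYTES = 16 * 16 * 3 // 2   # tiny demo NV12 frame (hypothetical size)


def produce(path, frames):
    """Stand-in for ffmpeg: write raw frames into the FIFO."""
    with open(path, "wb") as fifo:          # blocks until a reader opens the FIFO
        for i in range(frames):
            fifo.write(bytes([i % 256]) * FRAME_BYTES)


def consume(path):
    """Stand-in for the encoder: read whole frames until the writer closes."""
    count = 0
    with open(path, "rb") as fifo:
        while True:
            chunk = fifo.read(FRAME_BYTES)
            if len(chunk) < FRAME_BYTES:    # EOF (or a truncated last frame)
                break
            count += 1
    return count


def stream_through_fifo(frames):
    """Pass frames from producer to consumer without storing them on disk."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "frames.nv12")
        os.mkfifo(path)                     # creates the named pipe
        writer = threading.Thread(target=produce, args=(path, frames))
        writer.start()
        received = consume(path)
        writer.join()
        return received


print(stream_through_fifo(25))
```

The FIFO only ever holds a small kernel buffer, so the multi-gigabyte raw file never touches storage; the trade-off is that the conversion runs at the same time as the encode and competes for CPU.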
