Practical Applications and Benchmarks of GPU Computing via RenderScript and OpenCL with ARM Mali-T6XX GPU
Since the announcement of ARM Mali-T604 in 2010, ARM has explained that GPGPU (General Purpose computing on GPU), aka GPU Compute, would be one of the key features of their new Mali graphics processor, and the company now expects GPGPU to become mainstream in embedded and mobile devices in 2014 and beyond. I’ve just come across a presentation by Roberto Mijat, technical marketing manager at ARM, entitled “Unleashing the benefits of GPU Computing with ARM Mali”, which shows practical applications and use cases where RenderScript or OpenCL can deliver massive performance improvements, at much lower power consumption, over the same parallel tasks processed by the CPU only. Let’s have a look at some of the most interesting slides.
GPU Compute for H.265 / HEVC
HEVC, aka H.265, is the next-generation codec, delivering the same quality as H.264 at about half the bitrate. The problem is that most SoCs today don’t have VPUs supporting this new standard, and CPUs are not quite powerful enough for 1080p decoding. Software decoding on the CPU also requires a lot of energy, and quickly drains the battery.
Luckily, many of the tasks in HEVC decoding involve parallel data processing, and these can be partially offloaded from the CPU to the newer GPUs supporting OpenCL or RenderScript. Several companies, including Ittiam, have developed HEVC implementations leveraging the GPU in ARM SoCs with very good results.
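To illustrate what "parallel data processing" means here, the sketch below mimics the OpenCL execution model in plain Python: a kernel is written for a single work-item, and the host runs it across an index space. The reconstruction step it uses (prediction plus residual, clipped to 8 bits) is a standard part of block-based video decoding, but the names `clip_add_kernel` and `run_kernel` are hypothetical, and this is neither Ittiam's code nor real OpenCL API usage. On a GPU the work-items would run concurrently rather than in a loop.

```python
def clip_add_kernel(gid, prediction, residual, output):
    """Work for ONE work-item: reconstruct a single 8-bit sample."""
    value = prediction[gid] + residual[gid]
    output[gid] = max(0, min(255, value))  # clip to the valid 8-bit range


def run_kernel(kernel, global_size, *args):
    """Hypothetical stand-in for a GPU dispatch: call the kernel once per index.

    No iteration reads what another iteration writes, which is what makes
    this kind of task a good candidate for offloading to a GPU.
    """
    for gid in range(global_size):
        kernel(gid, *args)


prediction = [100, 250, 3, 128]   # predicted samples (made-up values)
residual = [10, 20, -5, 0]        # decoded residuals (made-up values)
reconstructed = [0] * len(prediction)

run_kernel(clip_add_kernel, len(prediction), prediction, residual, reconstructed)
print(reconstructed)  # [110, 255, 0, 128]
```

Because each output sample depends only on its own inputs, the loop in `run_kernel` could be replaced by thousands of GPU threads without changing the result.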
GPU Compute for Image and Video Processing
Nvidia has already touted the GPU compute capabilities of the Tegra 4 for computational photography, and in the ARM slides, we can see improvements of up to an order of magnitude over CPU processing.
High Dynamic Range (HDR) imaging is a technique that combines two shots (one exposed for the foreground, one for the background) to generate a better image. This is computationally intensive, and GPU compute (OpenCL) can provide a speed-up of about 16x over a CPU-only implementation on an Arndale board with Mali-T604 GPU.
Other image processing algorithms are also greatly sped up, by between 3.5x and 15.7x, as shown in the table on the right. This time the tests were performed on a Nexus 10 tablet (Exynos 5250 with Mali-T604) in Android using RenderScript, with software implemented by MulticoreWare.
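As a rough illustration of why HDR merging suits the GPU, here is a minimal sketch of a two-shot merge, assuming two differently exposed 8-bit greyscale images stored as flat lists. The weighting scheme (favouring mid-grey samples) and the function names are assumptions made for this example, not the algorithm benchmarked in ARM's slides.

```python
def well_exposed_weight(value):
    """Weight is highest for mid-grey samples and near zero at 0 or 255."""
    return 1.0 - abs(value - 127.5) / 127.5 + 1e-6  # epsilon avoids 0/0


def fuse_pixel(short_exposure, long_exposure):
    """Blend one pixel from two shots, trusting the better-exposed sample more."""
    w_s = well_exposed_weight(short_exposure)
    w_l = well_exposed_weight(long_exposure)
    return round((w_s * short_exposure + w_l * long_exposure) / (w_s + w_l))


def fuse_images(short_img, long_img):
    """Every output pixel is independent: on a GPU, one work-item per pixel."""
    return [fuse_pixel(s, l) for s, l in zip(short_img, long_img)]


# Made-up sample data: the first pixel is crushed to black in the short
# exposure, the last one is blown out in the long exposure.
short_img = [0, 100, 140]
long_img = [128, 100, 255]
print(fuse_images(short_img, long_img))
```

Each output value depends only on the two input samples at the same position, so the work scales with the number of pixels and can be spread across all GPU cores.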
GPGPU can also be used for super-resolution techniques, which aim to increase the resolution of imaging systems, as well as for video pre- and post-processing, leading to performance improvements of at least 3x, with power consumption reduced by up to 80%.
GPU Compute for Computer Vision
If you’re a developer and have an application that may leverage the GPU compute capabilities of the newer Mali GPUs, you may want to have a look at Mali OpenCL SDK (Linux) and/or visit Android Developer’s RenderScript page.