Most Embedded GPUs Do NOT Support Hardware Video Decoding Acceleration. The VPU Does.

December 10th, 2013

Many people seem to be confused about the actual function of the GPUs used in embedded (ARM / MIPS) SoCs. I often read comments similar to “with lima drivers we should get video decoding in XBMC soon”, and I’ve just received an email reading “My main task is to build a full hd media player based on ffmpeg with hardware decoding acceleration for linux. Is it possible with mali400mp4?”. So I’ve decided to write a short post to make things a bit clearer. Contrary to GPUs in the PC world, embedded GPUs usually only take care of 3D, and sometimes 2D, graphics, and leave video encoding and/or decoding to a separate block called the Video Processing Unit (VPU). There’s at least one exception: the Broadcom VideoCore IV GPU found in the processor used in the Raspberry Pi apparently takes care of 2D & 3D graphics as well as hardware video decoding & encoding, but this is not the norm.

Let’s take an example with Freescale i.MX6 Quad SoC.

Freescale i.MX6 Quad Block Diagram

In the multimedia section in the middle of the block diagram above you’ll see hardware graphics accelerators, and video codecs:

  • 3D via Vivante GC2000 GPU
  • 2D via Vivante GC320 GPU
  • Vector Graphics (OpenVG 1.1) via Vivante GC355 GPU
  • 1080p30 Enc/Dec via a Video Processing Unit (VPU)

The Freescale SoC uses one GPU for 3D, two separate GPUs for 2D composition and vector graphics, and a VPU to handle video in hardware. This means the Vivante GC2000, for example, has nothing to do with hardware video decoding.

Let’s give another short example. The AllWinner A20 features a Mali-400 (MP2) GPU with 3D graphics and OpenVG support, a separate 2D engine, and a CedarX VPU for hardware video processing. So please, don’t come to ask me if it is possible to use Mali-400 hardware video decoder in Linux. :)
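
To make the split concrete, here is a small illustrative sketch in Python that records the blocks named in the two examples above and looks up which one handles a given task. The dictionary layout, the task labels ("3d", "2d", "vector", "video") and the `block_for` helper are assumptions made for illustration only; they are not part of any vendor API or driver.

```python
# Illustrative only: block names are taken from the two examples above.
# The task labels and the lookup helper are hypothetical.
SOC_BLOCKS = {
    "Freescale i.MX6 Quad": {
        "3d": "Vivante GC2000 GPU",
        "2d": "Vivante GC320 GPU",
        "vector": "Vivante GC355 GPU",
        "video": "VPU",
    },
    "AllWinner A20": {
        "3d": "Mali-400 MP2 GPU",
        "vector": "Mali-400 MP2 GPU",  # OpenVG support is on the Mali-400
        "2d": "2D engine",
        "video": "CedarX VPU",
    },
}


def block_for(soc: str, task: str) -> str:
    """Return the name of the hardware block that handles `task` on `soc`."""
    return SOC_BLOCKS[soc][task]


if __name__ == "__main__":
    for soc in SOC_BLOCKS:
        print(f"{soc}: video is handled by the {block_for(soc, 'video')}")
```

In both cases the lookup for "video" returns the VPU, never the GPU, which is the whole point of this post.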

Where it gets a little confusing is that some GPU capabilities can be used to decode video codecs that are not supported by the Video Processing Unit. For example, the Raspberry Pi guys used some features of the VideoCore IV GPU, but not the hardware codecs, to implement VP6, VP8 and MJPEG decoding in standard resolution. More recent GPUs come with Renderscript and OpenCL support, which allows 1080p HEVC (H.265) video decoding using the CPU and GPU together. That’s called GPU compute, and although it works, it won’t be as power efficient as hardware video decoding in the VPU.
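
The choice between the different decoding paths can be sketched as a small Python function. This is a rough illustration under stated assumptions: the codec sets are loosely modelled on the Raspberry Pi case (H.264, MPEG-2 and VC-1 in hardware; VP6, VP8 and MJPEG with GPU help), and `pick_decode_path` is a made-up helper, not a function from any real player or driver.

```python
# Hypothetical codec sets, loosely modelled on the Raspberry Pi example.
VPU_CODECS = {"h264", "mpeg2", "vc1"}          # fixed-function hardware decoding
GPU_ASSISTED_CODECS = {"vp6", "vp8", "mjpeg"}  # software decoding helped by the GPU


def pick_decode_path(codec: str) -> str:
    """Return which path would decode `codec`: 'vpu', 'gpu-assisted' or 'cpu'."""
    codec = codec.lower()
    if codec in VPU_CODECS:
        return "vpu"            # most power efficient
    if codec in GPU_ASSISTED_CODECS:
        return "gpu-assisted"   # works, but limited resolution and less efficient
    return "cpu"                # plain software decoding


if __name__ == "__main__":
    for name in ("H264", "VP8", "Theora"):
        print(name, "->", pick_decode_path(name))
```

A real player would also have to check resolution and bitrate limits for each path, which this sketch leaves out.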

  1. adem
    December 11th, 2013 at 05:13 | #1

    I am sorry, I am not the sharpest tool in the shed, but what does this all mean? What is your message? I did not quite understand. Also, the Pi does not have hardware acceleration except when playing multimedia via XBMC or omxplayer or something. And one more question: what part of the GPU is used for hardware acceleration? Thanks.

  2. December 11th, 2013 at 09:18 | #2

    @adem
    I sort of hope my post was clear.

    Normally in an ARM SoC you’ve got one or more GPUs to handle 2D and 3D graphics, and a VPU to handle hardware video decoding.

    The Raspberry Pi uses the Broadcom BCM2835 with a VideoCore IV GPU. This GPU comes with blocks for 2D & 3D graphics as well as hardware video acceleration (there’s a VPU inside the VideoCore IV).

    Now the VideoCore IV GPU supports hardware decoding for H.264, MPEG-2, VC-1 and a few other codecs. In order to support more codecs, the Raspberry Pi team used the vector graphics engine (2D graphics) to accelerate VP6, VP8, and MJPEG decoding. This is software decoding, but with the help of the GPU, and the maximum resolution for smooth video is 480p. omxplayer is a command line tool used to decode videos on the R-Pi.

  3. notzed
    December 13th, 2013 at 04:33 | #3

    I wonder where this misunderstanding started, since it’s so widespread? Maybe it’s just the connection with Intel architecture, where the cpu just does cpu stuff, the other chips are just glue, and a graphics card is now the only commercially viable place to put extra functional units like gpu, vpu, physics, or sound dsps.

    SoCs have so many functional units these days that even the distinction between ‘gpu’, ‘cpu’ and ‘vpu’ starts to blur, as you demonstrate with the 3 separate parts related to graphics synthesis above – multiple blocks might be involved. And the graphics output stage will likely have other video related functions, like scalable rgb or ycrcb overlays for everything from a pointer sprite to multiple video windows.

    And as you allude to with the rpi, even ‘hardware functional units’ are becoming programmable, further blurring the line between hardware and software, even if the general programmer doesn’t have access to them. Even opengles can be used to accelerate certain functions like scaling, colourspace conversion or transforms where a given vpu lacks direct support for an algorithm.

    It will be interesting to see if HSA manages to open up some of these specialised programmable resources to other programmers which would just further blur the distinction between all these blocks and turn them into more integrated ‘systems’.

  4. December 13th, 2013 at 09:28 | #4

    I had to look up HSA (Heterogeneous Systems Architecture Foundation) – http://hsafoundation.com/

  5. May 20th, 2014 at 00:02 | #5

    Instead of the “So please, don’t come to ask me if it is possible to use Mali-400 hardware video decoder in Linux.” statement, it would be more helpful to clarify why one should not ask that question at all.
