Faster JPEG decoding on ARM with libjpeg-turbo and NEON Instructions


libjpeg-turbo is based on libjpeg, but uses SIMD instructions (MMX, SSE2, etc.) to accelerate JPEG compression and decompression on x86 targets. On such systems, libjpeg-turbo is generally 2-4x as fast as the original libjpeg on the same hardware.

ARM does not support MMX or SSE2 instructions, but it has its own SIMD instructions, processed by the NEON engine on ARM Cortex-A5, A8, A9 and A15 cores. ARM claims that “NEON technology can accelerate multimedia and signal processing algorithms such as video encode/decode, 2D/3D graphics, gaming, audio and speech processing, image processing, telephony, and sound synthesis by at least 3x the performance of ARMv5 and at least 2x the performance of ARMv6 SIMD.”

Linaro worked on libjpeg-turbo and added NEON support to it.

The code is available on launchpad at https://code.launchpad.net/~tom-gall/linaro/libjpeg-turbo

Linaro has also provided benchmark results for libjpeg-turbo, decoding a 12 Mpixel image on TI OMAP4 (Pandaboard) with the command:

djpeg 12mp.jpeg > /dev/null

Non-optimized libjpeg-turbo (5 runs): 2078 ms (average)
Linaro’s optimized libjpeg-turbo (5 runs): 1676 ms (average)

That represents an improvement of almost 20% between the non-optimized libjpeg-turbo library and the NEON-optimized version from Linaro.

For further information about Linaro’s libjpeg-turbo optimization, see the Optimize JPEG Decoding for ARM page. If you are interested in optimizing your own code with NEON instructions, you can read Optimizing Code for ARM Cortex-A8 with NEON SIMD, check the list of NEON C functions available when NEON is enabled (CFLAGS += -mfpu=neon), and/or read the ARM NEON™ Instruction Set and Why You Should Care slides presented at ELC 2011 by Mike Anderson, Chief Scientist at PTR Group.
