ARM NEON Tutorial in C and Assembler

The Advanced SIMD extension (also known as NEON, or “MPE” for Media Processing Engine) is a combined 64- and 128-bit single instruction multiple data (SIMD) instruction set that provides standardized acceleration for media and signal processing applications, similar to the MMX, SSE and 3DNow! extensions found in x86 processors.

Doulos has a video tutorial showing how to exploit NEON instructions in assembler and how to modify your C code so the compiler can vectorize it. It also provides the gcc compile options that enable NEON during the build.

Abstract:
With the v7-A architecture, ARM has introduced a powerful SIMD implementation called NEON™. NEON is a coprocessor which comes with its own instruction set for vector operations. While NEON instructions could be hand coded in assembler language, ideally we want our compiler to generate them for us. Automatic analysis whether an iterative algorithm can be mapped to parallel vector operations is not trivial not the least because the C language is lacking constructs necessary to support this. This paper explains how the RealView compiler tools (RVCT) and other modern compilers use a blend of sophisticated analysis techniques and language extensions to fulfill their job.

You can download the white paper at http://www.doulos.com/knowhow/arm/using_your_c_compiler_to_exploit_neon/Resources/index.php (registration required).
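
One concrete case of the C language "lacking constructs" is pointer aliasing: if the compiler cannot prove that the output array does not overlap the inputs, it may be unable to process several elements per iteration. The sketch below is my own illustration, not code from the white paper. The function name add_arrays is hypothetical, and it relies on the C99 restrict qualifier to give the compiler that guarantee.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the white paper): element-wise sum.
 * 'restrict' is a C99 promise that dst, a and b never overlap, which
 * leaves the compiler free to load and store several elements at once. */
void add_arrays(int32_t *restrict dst, const int32_t *restrict a,
                const int32_t *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = a[i] + b[i];   /* independent iterations: a SIMD candidate */
    }
}
```

Whether a given compiler actually turns this loop into NEON code depends on its version, optimization level and target options. The qualifier only removes one obstacle.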

Here’s how to enable NEON instructions (with auto-vectorization) with the ARM gcc cross-compiler:

arm-none-gnueabi-gcc -mfpu=neon -ftree-vectorize -c sample.c

and with the armcc compiler:

armcc --cpu=Cortex-A9 -O3 -Otime --vectorize --remarks -c fir_neon.c

I recommend you watch the 17-minute video tutorial, as it uses a FIR filter to explain how to modify your C code to take advantage of NEON instructions.
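
For readers who have not seen the video yet, here is a minimal sketch of what a vectorizer-friendly FIR filter in C might look like. It is not the code from the Doulos tutorial. The function name fir_filter, the parameter names and the 16-bit sample format are all assumptions made for illustration. The idea is simply to keep the loops simple, use integer data, and mark the pointers restrict so the compiler knows the arrays do not overlap.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical FIR sketch (not the tutorial's code):
 *   out[i] = sum over k of coeff[k] * in[i + k]
 * 'in' must hold at least n_out + n_taps - 1 samples.
 * 'restrict' tells the compiler the three arrays do not overlap. */
void fir_filter(int32_t *restrict out, const int16_t *restrict in,
                const int16_t *restrict coeff, size_t n_out, size_t n_taps)
{
    for (size_t i = 0; i < n_out; i++) {
        int32_t acc = 0;                      /* 32-bit accumulator */
        for (size_t k = 0; k < n_taps; k++) {
            /* 16-bit x 16-bit multiply, accumulated in 32 bits */
            acc += (int32_t)in[i + k] * coeff[k];
        }
        out[i] = acc;
    }
}
```

For example, filtering the samples 1, 2, 3, 4, 5, 6 with three taps all equal to 1 gives the four outputs 6, 9, 12, 15. If your compiler supports vectorization remarks (such as the armcc --remarks option above), they can help confirm whether the inner loop was actually vectorized.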
