Intel recently launched the Movidius Neural Compute Stick (MvNCS) for low-power, USB-based deep learning applications such as object recognition, and after some initial confusion, we could confirm the stick can also be used on ARM-based platforms such as the Raspberry Pi 3. Kochi Nakamura, who wrote the code for GPU-accelerated object recognition on the Raspberry Pi 3 board, got hold of a sample in order to compare the performance of GPU and MvNCS acceleration.
Here is the description from the YouTube video:
Comparison of deep learning inference acceleration by Movidius’ Neural Compute Stick (MvNCS) and by Idein’s software which uses Raspberry Pi’s GPU (VideoCore IV) without any extra computing resources.
Movidius’ demo runs GoogLeNet with 16-bit floating point precision. Average inference time is 108ms.
We used MvNC SDK 1.07.07 and their official demo script without any changes. (ncapi/py_examples/stream_infer/stream_infer.py)
It seems something is wrong with the inference results.
We recompiled the graph file with the -s12 option to use 12 SHAVE vector processors simultaneously.
Idein’s demo also runs GoogLeNet with 32-bit floating point precision. Average inference time is 320ms.
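To put the two averages side by side, here is a minimal Python sketch that converts the quoted latencies into a speed ratio and an upper bound on frame rate. The function names are made up for illustration, and the frame rate assumes inferences run back to back with no capture or display overhead, which the video does not confirm.

```python
# Average inference times quoted in the video description, in milliseconds.
MVNCS_MS = 108.0  # Movidius Neural Compute Stick, GoogLeNet, FP16
GPU_MS = 320.0    # Idein's VideoCore IV software, GoogLeNet, FP32


def frames_per_second(latency_ms: float) -> float:
    """Upper bound on throughput, assuming back-to-back inferences only."""
    return 1000.0 / latency_ms


def speedup(slow_ms: float, fast_ms: float) -> float:
    """How many times faster the lower-latency setup is."""
    return slow_ms / fast_ms


if __name__ == "__main__":
    print(f"MvNCS: at most {frames_per_second(MVNCS_MS):.2f} inferences/s")
    print(f"GPU:   at most {frames_per_second(GPU_MS):.2f} inferences/s")
    print(f"MvNCS is about {speedup(GPU_MS, MVNCS_MS):.2f}x faster")
```

By these numbers the stick is roughly three times faster than the GPU implementation, keeping in mind that the two demos do not use the same numeric precision.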
It’s interesting to note that the GPU demo used 32-bit floating point precision, against 16-bit on the Neural Compute Stick, although it’s unclear to me how that may affect the performance of such algorithms. Intel recommends a USB 3.0 interface for the MvNCS, while the Raspberry Pi 3 only comes with USB 2.0 ports that share bandwidth between the USB webcam and the MvNCS. So it’s possible that an ARM board with a USB 3.0 interface for the stick and a separate USB interface for the webcam could perform better. Has anybody tested it? A USB 3.0 interface and hub would also make it possible to cascade several Neural Compute Sticks.
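As a rough back-of-the-envelope check on both points, the sketch below uses only Python's standard `struct` module to show what 16-bit floats cost in accuracy and save in size, then estimates how long one input image would take to cross the USB link. Several things here are my assumptions rather than facts from the video: a 224x224x3 input (the usual GoogLeNet input size), the image being sent to the stick as FP16, and the nominal signaling rates of 480 Mbit/s for USB 2.0 and 5 Gbit/s for USB 3.0, which real transfers never reach. The helper names are hypothetical.

```python
import struct


def tensor_bytes(width: int, height: int, channels: int, fmt: str) -> int:
    """Size of a dense image tensor for a given struct float format."""
    return width * height * channels * struct.calcsize(fmt)


def transfer_ms(num_bytes: int, link_mbit_per_s: float) -> float:
    """Ideal transfer time at the nominal link rate (no protocol overhead)."""
    return num_bytes * 8 / (link_mbit_per_s * 1e6) * 1000.0


def roundtrip_half(x: float) -> float:
    """Store a value as IEEE 754 half precision and read it back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


if __name__ == "__main__":
    # Assumed GoogLeNet input: 224 x 224 pixels, 3 channels.
    fp16 = tensor_bytes(224, 224, 3, "<e")  # "<e" is a 2-byte half float
    fp32 = tensor_bytes(224, 224, 3, "<f")  # "<f" is a 4-byte single float
    print(f"FP16 input: {fp16} bytes, FP32 input: {fp32} bytes")

    # Nominal signaling rates, assumed for illustration only.
    print(f"USB 2.0: {transfer_ms(fp16, 480):.2f} ms per FP16 frame")
    print(f"USB 3.0: {transfer_ms(fp16, 5000):.2f} ms per FP16 frame")

    # Precision cost of half floats: 0.1 is stored only approximately.
    print(f"0.1 as FP16 reads back as {roundtrip_half(0.1)}")
```

Under these assumptions, one FP16 frame takes about 5 ms at the nominal USB 2.0 rate, against the 108 ms average inference time. This only bounds the raw payload time: it does not model protocol overhead, the webcam traffic sharing the bus, or how the SDK schedules transfers, so an actual test on a USB 3.0 board is still needed to settle the question.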
Jean-Luc started CNX Software in 2010 as a part-time endeavor, before quitting his job as a software engineering manager and starting to write daily news and reviews full time later in 2011.