Download a free trial of the SoftNeuro Deep Learning SDK for Intel and Arm targets (Sponsored)

Jetson Xavier Tensorflow Lite vs SoftNeuro

Morpho, a global research & development company established in Japan in 2004 and specializing in imaging technology, is now offering a free trial of the SoftNeuro deep learning SDK, which runs on Intel processors with AVX2 SIMD extensions and on 64-bit Arm targets, while also leveraging OpenCL and/or CUDA. Some of the advantages of SoftNeuro are that the framework is easy to use even for those without any knowledge of deep learning, it is fast thanks to the separation of the layers and their execution patterns, and it is cross-platform, running on several different hardware platforms and operating systems. SoftNeuro relies on its own storage format (DNN format) to deliver the above advantages, but you can still use models trained with any mainstream deep learning framework. TensorFlow and Keras models can be directly converted to the DNN format, while models from other frameworks can be converted first to the ONNX format and then to the […]
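The conversion routes mentioned in the excerpt can be summarized in a small sketch. The function name `conversion_path` and the lowercase format labels are hypothetical and are not part of the SoftNeuro SDK; the code only encodes the rule stated above (TensorFlow and Keras convert directly, everything else goes through ONNX first).

```python
from typing import List


def conversion_path(framework: str) -> List[str]:
    """Return the chain of formats from a source framework to the DNN format.

    Hypothetical helper for illustration only; it does not call any SoftNeuro tool.
    """
    # Frameworks the excerpt says can be converted directly to the DNN format.
    direct = {"tensorflow", "keras"}
    name = framework.strip().lower()
    if name in direct:
        return [name, "dnn"]
    # Any other framework is first exported to ONNX, then converted to DNN.
    return [name, "onnx", "dnn"]


if __name__ == "__main__":
    for fw in ("Keras", "PyTorch"):
        print(fw, "->", " -> ".join(conversion_path(fw)))
```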

Imagination IMG B-Series GPU family scales from IoT to the datacenter

Imagination IMG B-Series GPU: BXT MC4

Last year, Imagination Technologies unveiled the IMG A-Series GPU family scaling from low-power IoT to mobile and high-performance server applications with up to 2.5 times the performance of the earlier PowerVR 9-series GPUs, as well as eight times faster AI processing and 60% less power under similar conditions. While I’m not aware of any SoCs announced with the new IMG A-Series GPU yet, the company has already announced the next-gen IMG B-Series GPU family with up to 4 times the multi-core performance thanks to decentralized multi-core technology, 30% lower power consumption, and 2.5 times the fill rate. The company offers four types of IMG B-Series GPU, each optimized for specific applications: IMG BXE for high-resolution displays – from 1 up to 16 pixels per clock (PPC), BXE scales from 720p to 8K for UI rendering and entry-level gaming; IMG BXM, designed for mid-range mobile gaming and complex UI solutions for DTV […]
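To get a feel for what "pixels per clock" means across the 720p to 8K range, here is a rough back-of-the-envelope sketch. The 500 MHz clock and the 60 fps refresh rate are assumptions chosen for illustration, not Imagination figures, and the calculation ignores overdraw and multi-layer composition, so real requirements would be higher.

```python
import math


def pixels_per_second(width: int, height: int, fps: int) -> int:
    """Pixels that must be written per second to fill the screen once per frame."""
    return width * height * fps


def min_pixels_per_clock(width: int, height: int, fps: int, clock_hz: float) -> int:
    """Smallest whole PPC value that covers one full-screen pass per frame."""
    return math.ceil(pixels_per_second(width, height, fps) / clock_hz)


# Assumption for illustration only: a 500 MHz GPU clock and a 60 fps target.
ASSUMED_CLOCK_HZ = 500_000_000
ASSUMED_FPS = 60

if __name__ == "__main__":
    for label, (w, h) in {"720p": (1280, 720), "8K": (7680, 4320)}.items():
        rate = pixels_per_second(w, h, ASSUMED_FPS)
        ppc = min_pixels_per_clock(w, h, ASSUMED_FPS, ASSUMED_CLOCK_HZ)
        print(f"{label}: {rate} pixels/s, at least {ppc} PPC for a single pass")
```

Under these assumptions, a single full-screen pass at 8K already needs several pixels per clock, while 720p fits within one.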

Collabora & Microsoft to Bring OpenCL 1.2 and OpenGL 3.3 to DirectX 12 enabled Windows Devices

OpenCL DirectX Translation Layer

Collabora has been working on open-source graphics projects for a while, including the Panfrost open-source drivers for Arm Midgard and Bifrost GPUs, which got experimental OpenGL ES 3.0 support earlier this year. But the company has also been working with Microsoft to provide an OpenCL 1.2 & OpenGL 3.3 translation layer for Windows devices compatible with DirectX 12. Their solution relies on the Mesa 3D open-source OpenCL and OpenGL implementations, with three main components: (1) an OpenCL compiler using LLVM and the SPIRV-LLVM-Translator to generate SPIR-V representations of OpenCL kernels; the data goes through a SPIR-V to NIR translator (NIR is Mesa’s internal representation for GPU shaders), and finally to NIR-to-DXIL, which generates a DXIL compute shader and metadata understood by DirectX 12 (D3D12); (2) a custom OpenCL runtime to do a direct translation to DirectX 12 (not based on the Mesa Clover implementation); (3) a Gallium driver that builds and executes command-buffers on the […]
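The compiler path described in the excerpt (OpenCL kernel to SPIR-V, then NIR, then DXIL) can be pictured as a chain of stages where each stage consumes the previous stage's output. The sketch below is only a toy model of that ordering: the `PIPELINE` table and the `trace` function are hypothetical names, and no real compilation takes place.

```python
# Stage names follow the excerpt; the tuple layout (stage, input, output)
# is a hypothetical illustration, not Mesa's or Microsoft's actual API.
PIPELINE = [
    ("LLVM + SPIRV-LLVM-Translator", "OpenCL C", "SPIR-V"),
    ("SPIR-V to NIR translator", "SPIR-V", "NIR"),
    ("NIR-to-DXIL", "NIR", "DXIL"),
]


def trace(source_format, pipeline=PIPELINE):
    """Walk the stages in order and return every format the kernel passes through."""
    current = source_format
    formats = [current]
    for stage, consumes, produces in pipeline:
        if consumes != current:
            # A stage can only run on the representation it understands.
            raise ValueError(f"{stage} expects {consumes}, got {current}")
        current = produces
        formats.append(current)
    return formats


if __name__ == "__main__":
    print(" -> ".join(trace("OpenCL C")))
```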

$118 BeagleBone-AI SBC is Made for AI Edge Applications

BeagleBone-AI

The BeagleBoard.org Foundation introduced the BeagleBone-AI SBC at Embedded World 2019 last February. The board is specifically designed for artificial intelligence workloads at the edge thanks to the Texas Instruments AM5729 dual-core Cortex-A15 processor, which embeds a dual-core C66x DSP and 4 EVE (Embedded Vision Engine) cores. The BeagleBone Black compatible board was not available at the time, but the Foundation has now formally launched the board, and you can buy BeagleBone-AI for $118 and up with heatsink and antenna on sites such as Mouser, OKdo, or Newark. BeagleBone-AI full specifications have now been published: SoC – TI Sitara AM5729 with dual-core Cortex-A15 processor @ 1.5 GHz; 2x dual-core PRUs; 2x Cortex-M4 real-time cores; dual-core C66x VLIW DSP; 4x EVEs; 2.5MB of on-chip L3 RAM; VA-HD subsystem with support for 4K at 15fps H.264 encode/decode and other codecs at 1080p60; Vivante GC320 2D graphics accelerator; dual-core PowerVR SGX544 3D GPU. System […]

How to Get Started with OpenCL on ODROID-XU4 Board (with Arm Mali-T628MP6 GPU)

ODROID-XU4-OpenCL-Convolution

Last week, I reviewed Ubuntu 18.04 on the ODROID-XU4 board, testing most of the advertised features. However, I skipped one of the features listed in the Changelog: “GPU hardware acceleration via OpenGL ES 3.1 and OpenCL 1.2 drivers for Mali T628MP6 GPU”. While I tested OpenGL ES with tools like glmark2-es2 and es2gears, as well as WebGL demos in Chromium, I did not test OpenCL, since I’m not that familiar with it, except that it’s used for GPGPU (General Purpose GPU) to accelerate tasks like image/audio processing. That was a good excuse to learn a bit more, try it out on the board, and write a short guide to get started with OpenCL on hardware with an Arm Mali GPU. The purpose of this tutorial is to show how to run an OpenCL sample and an OpenCL utility, and I won’t go into the nitty-gritty of OpenCL code. If you want to learn more […]

Arm Releases Android / Linux Vulkan User Space Drivers for Mali GPUs (HiKey 960, Firefly-RK3288 Boards)

A little while ago, I wrote about Imagination’s PowerVR CLDNN Neural Network SDK and Image for Acer Chromebook R13, and some people looked into the Arch Linux Arm image and were pleasantly surprised to find Vulkan drivers, as it was the first Arm platform to support Vulkan in Linux. It looks like more Arm hardware now supports Vulkan in Linux, as Arm has released binary user-space components for GNU/Linux and Android for development platforms featuring the Arm Mali Midgard GPU family, and – provided the GPU can handle it – supporting the following APIs: OpenGL ES 1.1 / 2.0 / 3.0 / 3.1 / 3.2, OpenCL 1.1 / 1.2 / 2.0, Vulkan 1.0, and RenderScript. The Mali-G71 GPU is supported by Android 8.0 and Linux (fbdev) ARM64 drivers for the HiKey 960 board, and Mali-T760 should be supported by Linux drivers (fbdev / wayland / X11) for the Firefly-RK3288 board. HiKey […]

ODROID-HC2 Linux NAS System for 3.5″ Hard Drives Launched for $54

We knew it was coming, and Hardkernel has now launched an updated version of the ODROID-HC1, called ODROID-HC2, based on the same Samsung Exynos 5422 board but supporting 3.5″ hard drives instead. The device can now be purchased for $54 plus shipping, but you may also consider adding some accessories like a 12V/2A power supply and the top cover for the enclosure. [Update: Also listed on Ameridroid now] ODROID-HC2 specifications: SoC – Samsung Exynos 5422 octa-core processor with 4x ARM Cortex-A15 @ 2.0 GHz, 4x ARM Cortex-A7 @ 1.4GHz, and Mali-T628 MP6 GPU supporting OpenGL ES 3.0 / 2.0 / 1.1 and OpenCL 1.1 Full profile. System Memory – 2GB LPDDR3 RAM PoP @ 750 MHz. Storage – UHS-1 micro SD slot up to 128GB; SATA interface via JMicron JMS578 USB 3.0 to SATA bridge chipset; case supports 2.5″ or 3.5″ drives up to 27mm thick. Network Connectivity – 10/100/1000Mbps Ethernet (via […]

First OpenCL Encounters on Cortex-A72: Some Benchmarking

This is a guest post by blu about his experience with OpenCL on the MacchiatoBin board with a quad-core Cortex-A72 processor and on an Intel-based MacBook. He previously contributed several technical articles such as How ARM Nerfed NEON Permute Instructions in ARMv8 or OpenGL ES development on Ubuntu Touch. Qualcomm launched their long-awaited server ARM chip the other day, and we started getting the first benchmarks. Incidentally, I too managed to get some OpenCL ray-tracing code running on an ARM Cortex-A72 machine that same day (thanks to pocl – an LLVM-based open-source OCL multi-platform implementation), so my benchmarking curiosity got the better of me. The code in question is an OCL (half-finished) port of a graphics demo from 2014. Some remarks on what it does: for each frame, a single thread builds a sparse voxel octree from a dynamic voxel scene; the octree, along with the current camera settings, is passed to an […]
