How to Get Started with OpenCL on ODROID-XU4 Board (with Arm Mali-T628MP6 GPU)

Last week, I reviewed Ubuntu 18.04 on the ODROID-XU4 board, testing most of the advertised features. However, I skimmed over one item listed in the Changelog: “GPU hardware acceleration via OpenGL ES 3.1 and OpenCL 1.2 drivers for Mali T628MP6 GPU”. While I tested OpenGL ES with tools like glmark2-es2 and es2gears, as well as WebGL demos in Chromium, I did not test OpenCL, since I’m not that familiar with it beyond knowing it’s used for GPGPU (General Purpose GPU) computing to accelerate tasks like image/audio processing. That was a good excuse to learn a bit more, try it out on the board, and write a short guide to get started with OpenCL on hardware with an Arm Mali GPU. The purpose of this tutorial is to show how to run an OpenCL sample and an OpenCL utility, and I won’t go into the nitty gritty of OpenCL code. If you want to learn more about OpenCL coding on Arm, one way would be to …

Support CNX Software – Donate via PayPal or become a Patron on Patreon

Arm Releases Android / Linux Vulkan User Space Drivers for Mali GPUs (HiKey 960, Firefly-RK3288 Boards)

A little while ago, I wrote about Imagination’s PowerVR CLDNN Neural Network SDK and Image for Acer Chromebook R13, and some people looked into the Arch Linux Arm image and were pleasantly surprised to find Vulkan drivers, as it was the first Arm platform to support Vulkan in Linux. It looks like there is now more Arm hardware supporting Vulkan drivers in Linux, as Arm has released binary user-space components for GNU/Linux and Android for development platforms featuring the Arm Mali Midgard GPU family, and – provided the GPU can handle it – supporting the following APIs: OpenGL ES 1.1 / 2.0 / 3.0 / 3.1 / 3.2, OpenCL 1.1 / 1.2 / 2.0, Vulkan 1.0, and RenderScript. Mali-G71 GPU is supported by Android 8.0 and Linux (fbdev) ARM64 drivers for the Hikey 960 board, and Mali-T760 should be supported by Linux drivers (fbdev / wayland / X11) for the Firefly-RK3288 board. Hikey 960 and Firefly-RK3288 drivers don’t have specific files about Vulkan, …

ODROID-HC2 Linux NAS System for 3.5″ Hard Drives Launched for $54

We knew it was coming, and Hardkernel has now launched an updated version of the ODROID-HC1, called ODROID-HC2, based on the same Samsung Exynos 5422 board, but instead supporting 3.5″ hard drives. The device can now be purchased for $54 plus shipping, but you may also consider adding some accessories like a 12V/2A power supply, and the top cover for the enclosure. [Update: Also listed on Ameridroid now]

ODROID-HC2 specifications:
SoC – Samsung Exynos 5422 octa-core processor with 4x ARM Cortex-A15 @ 2.0 GHz, 4x ARM Cortex-A7 @ 1.4GHz, and Mali-T628 MP6 GPU supporting OpenGL ES 3.0 / 2.0 / 1.1 and OpenCL 1.1 Full profile
System Memory – 2GB LPDDR3 RAM PoP @ 750 MHz
Storage – UHS-1 micro SD slot up to 128GB; SATA interface via JMicron JMS578 USB 3.0 to SATA bridge chipset; case supports 2.5″ or 3.5″ drives up to 27mm thick
Network Connectivity – 10/100/1000Mbps Ethernet (via USB 3.0)
USB – 1x USB 2.0 port
Debugging – …

First OpenCL Encounters on Cortex-A72: Some Benchmarking

This is a guest post by blu about his experience with OpenCL on a MacchiatoBin board with a quad core Cortex A72 processor and an Intel based MacBook. He previously contributed several technical articles such as How ARM Nerfed NEON Permute Instructions in ARMv8 or OpenGL ES development on Ubuntu Touch. Qualcomm launched their long-awaited server ARM chip the other day, and we started getting the first benchmarks. Incidentally, I too managed to get some OpenCL ray-tracing code running on an ARM Cortex-A72 machine that same day (thanks to pocl – an LLVM-based open-source OCL multi-platform implementation), so my benchmarking curiosity got the better of me. The code in question is an OCL (half-finished) port of a graphics demo from 2014. Some remarks on what it does: For each frame: a single thread builds a sparse voxel octree from a dynamic voxel scene; the octree, along with current camera settings, is passed to an OCL kernel via double buffering; kernel computes a screen-space map …

Imagination PowerVR “Furian” Series8XT GT8525 GPU Targets High-end Smartphones, Virtual Reality and Automotive Products

Imagination Technologies has unveiled their first GPU based on the PowerVR Furian architecture: the Series8XT GT8525 GPU, equipped with two clusters and designed for SoCs going into products such as high-end smartphones and tablets, mid-range dedicated VR and AR devices, and mid- to high-end automotive infotainment and ADAS systems. The Furian architecture is said to allow for improvements in performance density, GPU efficiency, and system efficiency, features a new 32-wide ALU cluster design, and can be manufactured using sub-14nm processes (e.g. 7nm once available). PowerVR GT8525 GPU supports compute APIs such as OpenCL 2.0, Vulkan 1.0 and OpenVX 1.1. Compared to the previous Series7XT GPU family, Series8XT GT8525 GPU delivers 80% higher fps in Trex benchmark, an extra 50% fps in GFXbench Manhattan benchmark, 50% higher fps in Antutu, doubles the fillrate throughput for GUI, and increases GFLOPs for compute applications by over 50%. GT8525 GPU is available for licensing now, and has already been delivered to lead customers. More …

Open Source ARM Compute Library Released with NEON and OpenCL Accelerated Functions for Computer Vision, Machine Learning

GPU compute promises to deliver much better performance compared to CPU compute for applications such as computer vision and machine learning, but the problem is that many developers may not have the right skills or time to leverage APIs such as OpenCL. So ARM decided to write their own ARM Compute library and has now released it under an MIT license.

The functions found in the library include:
Basic arithmetic, mathematical, and binary operator functions
Color manipulation (conversion, channel extraction, and more)
Convolution filters (Sobel, Gaussian, and more)
Canny Edge, Harris corners, optical flow, and more
Pyramids (such as Laplacians)
HOG (Histogram of Oriented Gradients)
SVM (Support Vector Machines)
H/SGEMM (Half and Single precision General Matrix Multiply)
Convolutional Neural Networks building blocks (Activation, Convolution, Fully connected, Locally connected, Normalization, Pooling, Soft-max)

The library works on Linux, Android or bare metal on armv7a (32bit) or arm64-v8a (64bit) architecture, and makes use of NEON, OpenCL, or NEON + OpenCL. You’ll need an …

ARM Introduces Bifrost Mali-G51 GPU, and Mali-V61 4K H.265 & VP9 Video Processing Unit

Back in May of this year, ARM unveiled Mali-G71 GPU for premium devices, the company’s first GPU based on the Bifrost architecture. The company has now introduced the second Bifrost GPU with Mali-G51 targeting augmented & virtual reality and higher resolution screens to be found in mainstream devices in 2018, as well as Mali-V61 VPU with 4K H.265 & VP9 video decode and encode capabilities, previously known under the codename “Egil”.

Mali-G51 GPU

ARM Mali-G51 will be 60% more energy efficient, and have 60% more performance density compared to Mali-T830 GPU, making the new GPU the most efficient ARM GPU to date. It will also be 30% smaller, and support 1080p to 4K displays. Under the hood, Mali-G51 includes an updated Bifrost low level instruction set and a dual-pixel shader core per GPU core to deliver twice the texel and pixel rates, features the latest ARM Frame Buffer Compression (AFBC) 1.2, and supports Vulkan, OpenGL ES 3.2, and OpenCL …

PowerVR GT7200 Plus and GT7400 Plus GPUs Support OpenCL 2.0, Better Computer Vision Features

Imagination Technologies introduced the PowerVR Series7XT GPU family with up to 512 cores at the end of 2014, and at CES 2016, they’ve announced the Series7XT Plus family with GT7200 Plus and GT7400 Plus GPUs. These share many of the features of the Series7XT family, and add OpenCL 2.0 API support as well as improvements for computer vision: a new Image Processing Data Master, and support for 8-bit and 16-bit integer data paths instead of just 32-bit in the previous generation, leading for example to up to 4 times more performance for applications, e.g. deep learning, leveraging the OpenVX computer vision API. GT7200 Plus GPU features 64 ALU cores in two clusters, and GT7400 Plus 128 ALU cores in a quad-cluster configuration. Besides OpenCL 2.0 and improvements for computer vision, they still support OpenGL ES 3.2, Vulkan, hardware virtualization, advanced security, and more. The company has also made some microarchitectural enhancements to improve performance and reduce power consumption: Support for the latest bus interface …
