Archive

Posts Tagged ‘machine learning’

ARM Cortex-A75 & Cortex-A55 Cores, and Mali-G72 GPU Details Revealed

May 27th, 2017

We’ve already seen that ARM Cortex A75 cores were coming thanks to a leak showing that the Snapdragon 845 SoC will feature custom Cortex A75 cores, but we did not have many details. Since we live in a world where “to leak is glorious”, some slides were originally leaked through VideoCardz (the post has since been deleted), but Liliputing & TheAndroidSoul saved some of the slides before deletion, so let’s see what we’ve got here.

ARM Cortex A75

So ARM Cortex-A75 will be about 20% faster than Cortex A73 for single-thread operation, the latter being itself already 30% faster than Cortex A72. It will also be the first DynamIQ capable processor, together with Cortex A55, with both cores potentially used in a big.LITTLE configuration.

Cortex A75 is only better in terms of peak performance, and remains the same as Cortex-A73 for sustained performance.

The chart above does not start at zero, so it appears as though there are massive performance increases, but look at the numbers and we can see a 1.34x higher score with GeekBench, and 1.48x with Octane 2.0. Other benchmarks also show higher scores, but only between 1.16 and 1.33 times higher.

Cortex A75 cores will be manufactured using 10nm process technology, and clocked at up to 3.0 GHz. While (peak) performance will be higher than Cortex A73, efficiency will remain the same.

ARM Cortex A55

ARM Cortex A55 is the successor of Cortex-A53, with about twice the performance, and support for up to eight cores in a single cluster. There are octa-core (and even 24-core) ARM Cortex A53 processors, but they rely on multiple 4-core clusters.

Power efficiency is 15% better too, and ARM claims the core is 10x more configurable, probably because of DynamIQ & 8-core cluster support.

If we have a closer look at the benchmarks released by the company, we can see the 2x performance increase is only valid for the LMBench memcpy memory benchmark, with other benchmarks from GeekBench v4 to SPECINT2006 showing 1.14x to 1.38x better performance. So integer performance appears to be only slightly better, floating point performance gets close to 40% better, and the most noticeable improvement is in memory bandwidth.

ARM Mali-G72 GPU

Mali-G72 will offer a 1.4x performance improvement over 2017 devices, which must mean Mali-G71…, and will allow for machine learning directly on the device instead of having to rely on the cloud, as well as better games and an improved mobile VR experience.

The new GPU is also 25% more efficient, and supports up to 32 shader cores. GEMM (general matrix multiplication) – used for example in machine learning algorithms – is improved by 17% over Cortex A73.
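
For readers unfamiliar with the term, GEMM is simply the generalized matrix multiply C = alpha*A*B + beta*C that sits at the heart of fully connected and convolutional layers. The C++ loop below is just a plain reference sketch of the operation itself, to show what is being accelerated; it is of course not ARM’s optimized GPU implementation:

    #include <cstddef>
    #include <vector>

    // Reference single-precision GEMM: C = alpha * A * B + beta * C
    // A is m x k, B is k x n, C is m x n, all stored row-major.
    void sgemm(std::size_t m, std::size_t n, std::size_t k,
               float alpha, const std::vector<float> &A, const std::vector<float> &B,
               float beta, std::vector<float> &C)
    {
        for (std::size_t i = 0; i < m; ++i) {
            for (std::size_t j = 0; j < n; ++j) {
                float acc = 0.0f;
                for (std::size_t p = 0; p < k; ++p) {
                    acc += A[i * k + p] * B[p * n + j];
                }
                C[i * n + j] = alpha * acc + beta * C[i * n + j];
            }
        }
    }

Optimized GPU implementations tile and vectorize this loop nest to keep data close to the shader cores, which is presumably where gains like the quoted 17% come from.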

Based on the information we’ve got from the Qualcomm Snapdragon 845 leak, devices based on ARM Cortex A75/A55 processors and Mali-G72 GPU should start selling in Q1 2018. We may learn a few more details on Monday, once the embargo is lifted.

Google Releases Android O Developer Preview 2, Announces Android Go for Low-End Devices, TensorFlow Lite

May 18th, 2017

After the first Android O developer preview was released in March, Google has just released the second developer preview during Google I/O 2017, which on top of features like PiP (picture-in-picture), notification channels, autofill, and others found in the first preview, adds notification dots, a new Android TV home screen, smart text selection, and soon TensorFlow Lite. Google also introduced the Android Go project, optimized for devices with 512MB to 1GB of RAM.

Notification dots (aka Notification Badges) are small dots that show on the top right of app icons – in supported launchers – when a notification is available. You can then long press the icon to check out the notifications for the app, and dismiss or act on them. The feature can be disabled in the settings.

Android TV “O” also gets a new launcher that allegedly “makes it easy to find, preview, and watch content provided by apps”. The launcher is customizable as users can control the channels that appear on the homescreen. Developers will be able to create channels using the new TvProvider support library APIs.

I found text selection in Android to be awkward and frustrating most of the time, but Android O brings improvements on that front with “Smart Text Selection”, which leverages on-device machine learning during copy/paste to let Android recognize entities like addresses, URLs, telephone numbers, and email addresses.

TensorFlow is an open source machine learning library that, for example, enables image recognition. Android O will now support TensorFlow Lite, specifically designed to be fast and lightweight for embedded use cases. The company is also working on a new Neural Network API to accelerate computation, and both are planned for release in a future maintenance update of Android O later this year.

Finally, the Android Go project targets devices with 1GB or less of memory, and includes optimizations to the operating system itself, as well as optimizations to apps such as YouTube, Chrome, and Gboard to make them use less memory, storage space, and mobile data. The Play Store will also highlight apps with low resource requirements on such devices, but still provide access to the full catalog. “Android Go” will ship in 2018 for all Android devices with 1GB or less of memory.

You can test Android O developer preview 2 by joining the Android O beta program if you own a Nexus 5X, 6P, Nexus Player, Pixel, Pixel XL, or Pixel C device.

Open Source ARM Compute Library Released with NEON and OpenCL Accelerated Functions for Computer Vision, Machine Learning

April 3rd, 2017

GPU compute promises to deliver much better performance than CPU compute for applications such as computer vision and machine learning, but the problem is that many developers may not have the right skills or time to leverage APIs such as OpenCL. So ARM decided to write their own ARM Compute Library, and has now released it under an MIT license.

The functions found in the library include:

  • Basic arithmetic, mathematical, and binary operator functions
  • Color manipulation (conversion, channel extraction, and more)
  • Convolution filters (Sobel, Gaussian, and more)
  • Canny Edge, Harris corners, optical flow, and more
  • Pyramids (such as Laplacians)
  • HOG (Histogram of Oriented Gradients)
  • SVM (Support Vector Machines)
  • H/SGEMM (Half and Single precision General Matrix Multiply)
  • Convolutional Neural Networks building blocks (Activation, Convolution, Fully connected, Locally connected, Normalization, Pooling, Soft-max)

The library works on Linux, Android, or bare metal on armv7a (32-bit) or arm64-v8a (64-bit) architectures, and makes use of NEON, OpenCL, or NEON + OpenCL. You’ll need an OpenCL capable GPU, so Mali-4xx GPUs won’t be fully supported, and an SoC with a Mali-T6xx, T7xx, T8xx, or G71 GPU is required to make full use of the library, except for the NEON-only functions.
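
To give a rough idea of how the library is used in practice, here is a minimal NEON-path sketch loosely based on the examples shipped in the public ComputeLibrary source tree; the headers, class names, and configure() signatures shown are taken from that repository but may differ slightly between releases, so treat this as illustrative rather than authoritative:

    #include "arm_compute/core/Types.h"
    #include "arm_compute/runtime/NEON/NEFunctions.h"
    #include "arm_compute/runtime/Tensor.h"

    using namespace arm_compute;

    int main()
    {
        Tensor src, dst;

        // Describe a 640x480 single-channel 8-bit image for input and output.
        src.allocator()->init(TensorInfo(640U, 480U, Format::U8));
        dst.allocator()->init(TensorInfo(640U, 480U, Format::U8));

        // Configure a NEON-accelerated 5x5 Gaussian filter
        // (one of the convolution filters listed above).
        NEGaussian5x5 gauss;
        gauss.configure(&src, &dst, BorderMode::UNDEFINED);

        // Buffers are only allocated after the function is configured.
        src.allocator()->allocate();
        dst.allocator()->allocate();

        // ... fill src with image data here ...

        gauss.run(); // executes on the CPU using NEON

        return 0;
    }

The OpenCL path follows the same configure()/run() pattern through the CL-prefixed classes, with an additional CLScheduler initialization step before the first kernel is configured.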

In order to showcase their new library, ARM compared its performance to the OpenCV library on a Huawei Mate 9 smartphone with a HiSilicon Kirin 960 processor featuring an ARM Mali G71MP8 GPU.

ARM Compute Library vs OpenCV, single-threaded, CPU (NEON)

Even with some NEON acceleration in OpenCV, convolutions and SGEMM functions are around 15 times faster with the ARM Compute Library. Note that ARM selected a hardware platform with one of their best GPUs, so while the library should still be faster on other OpenCL capable ARM GPUs, the difference will be smaller, but likely still significant, i.e. several times faster.

ARM Compute Library vs OpenCV, single-threaded, CPU (NEON)

The performance boost for the other functions is not quite as impressive, but the compute library is still 2x to 4x faster than OpenCV.

While the open source release happened just about three weeks ago, the ARM Compute Library had already been utilized by several embedded, consumer and mobile silicon vendors and OEMs before it was open sourced, for applications such as 360-degree camera panoramic stitching, computational camera, virtual and augmented reality, segmentation of images, feature detection and extraction, image processing, tracking, stereo and depth calculation, and several machine learning based algorithms.