Archive

Posts Tagged ‘tensorflow’

Qualcomm Snapdragon 845 Octa Core Kryo 385 SoC to Power Premium Smartphones, XR Headsets, Windows Laptops

December 7th, 2017 9 comments

Qualcomm's Snapdragon 845 processor had been expected since May 2017 with four custom Cortex A75 cores, four Cortex A53 cores, an Adreno 630 GPU, and an X20 LTE modem, with the launch planned for Q1 2018. At least, that's what the leaks said.

Qualcomm has now formally launched the Snapdragon 845 Mobile Platform, and the rumors were mostly right, as the octa-core processor comes with four Kryo 385 Gold cores (custom Cortex A75), four Kryo 385 Silver cores (custom Cortex A55) leveraging DynamIQ technology, an Adreno 630 “Visual Processing System”, and a Snapdragon X20 modem supporting LTE Cat 18/13.

The processor is said to use more advanced artificial intelligence (AI) allowing what the company calls “extended reality (XR)” applications, and will soon be found in flagship smartphones, XR headsets, mobile PCs, and more.

Qualcomm Snapdragon 845 (SDM845) specifications:

  • Processor
    • 4x Kryo 385 Gold performance cores @ up to 2.80 GHz (custom ARM Cortex A75 cores)
    • 4x Kryo 385 Silver efficiency cores @ up to 1.80 GHz (custom ARM Cortex A55 cores)
    • DynamIQ technology
  • GPU (Visual Processing Subsystem) – Adreno 630 supporting OpenGL ES 3.2, OpenCL 2.0, Vulkan 1.x, DxNext
  • DSP
    • Hexagon 685 with 3rd Gen Vector Extensions, Qualcomm All-Ways Aware Sensor Hub.
    • Supports Snapdragon Neural Processing Engine (NPE) SDK, Caffe, Caffe2, and Tensorflow
  • Memory I/F – LPDDR4x, 4×16 bit up to 1866MHz, 8GB RAM
  • Storage I/F – TBD (Likely UFS 2.1, but maybe UFS 3.0?)
  • Display
    • Up to 4K Ultra HD, 60 FPS, or dual 2400×2400 @ 120 FPS (VR); 10-bit color depth
    • DisplayPort and USB Type-C support
  • Audio
    • Qualcomm Aqstic audio codec and speaker amplifier
    • Qualcomm aptX audio playback with support for aptX Classic and HD
    • Native DSD support, PCM up to 384kHz/32bit
  • Camera
    • Spectra 280 ISP with dual 14-bit ISPs
    • Up to 16 MP dual camera, up to 32 MP single camera
    • Support for 16MP image sensor operating up to 60 frames per second
    • Hybrid Autofocus, Zero Shutter Lag, Multi-frame Noise Reduction (MFNR)
    • Video Capture – Up to 4K @ 60fps HDR (H.265), up to 720p @ 480fps (slow motion)
  • Connectivity
    • Cellular Modem – Snapdragon X20 with peak download speed: 1.2 Gbps (LTE Cat 18), peak upload speed: 150 Mbps (LTE Cat 13)
    • Qualcomm Wi-Fi 802.11ad Multi-gigabit, integrated 802.11ac 2×2 with MU-MIMO, 2.4 GHz, 5 GHz and 60 GHz
    • Qualcomm TrueWireless Bluetooth 5
  • Location – Support for 6 satellite systems: GPS, GLONASS, Beidou, Galileo, QZSS, SBAS; low power geofencing and tracking, sensor-assisted navigation
  • Security – Qualcomm Secure Processing Unit (SPU), Qualcomm Processor Security, Qualcomm Mobile Security, Qualcomm Content Protection
  • Charging – Qualcomm Quick Charge 4/4+ technology
  • Process – 10nm LPP

The company will provide support for Android and Windows operating systems. eXtended Reality (XR) is enabled with features such as room-scale 6DoF with simultaneous localization and mapping (SLAM), advanced visual inertial odometry (VIO), and Adreno Foveation. Maybe I don’t follow the phone market closely enough, but I can’t remember seeing visual inertial odometry implemented in any other phones, and Adreno Foveation is not quite self-explanatory, so the company explains it combines graphics rendering with eye tracking, directing the most graphics resources to where you’re physically looking, while using fewer resources to render other areas. This improves the experience and performance, and lowers power consumption.

 


Compared to Snapdragon 835, the new processor is said to be around 25 to 30% faster, the Spectra camera and Adreno graphics architectures are claimed to boost power efficiency by up to 30 percent, and the LTE modem is a bit faster (1.2 Gbps/150 Mbps vs 1.0 Gbps/150 Mbps). Quick Charge 4+ technology should deliver up to a 50 percent charge in 15 minutes. Earlier this year, when the SD835 was officially launched, there was virtually no mention of artificial intelligence support in mobile APs, but now an NNA (Neural Network Accelerator) or NPE (Neural Processing Engine) is part of most high-end mobile processors, which in the SD845 appears to be handled through the Hexagon 685 DSP. High Dynamic Range (HDR) for video playback and capture is also a novelty in the new Snapdragon processor.

One of the first devices powered by Snapdragon 845 will be the Xiaomi Mi 7 smartphone, which according to leaks will come with a 6.1″ display, up to 8GB RAM, a dual camera, 3D facial recognition, and more. Further details about the phone are expected at Mobile World Congress 2018. Considering the first Windows 10 laptops based on the Snapdragon 835 processor are expected in H1 2018, we may have to wait until the second half of the year for the launch of Snapdragon 845 mobile PCs.

More details may be found on Qualcomm Snapdragon 845 mobile platform product page.

AWS DeepLens is a $249 Deep Learning Video Camera for Developers

November 30th, 2017 4 comments

Amazon Web Services (AWS) has launched DeepLens, the “world’s first deep learning enabled video camera for developers”. Powered by an Intel Atom X5 processor with 8GB RAM, and featuring a 4MP (1080p) camera, the fully programmable system runs Ubuntu 16.04, and is designed to expand developers’ deep learning skills, with Amazon providing tutorials, code, and pre-trained models.


AWS Deeplens specifications:

  • Camera – 4MP (1080p) camera using MJPEG, H.264 encoding
  • Video Output – micro HDMI port
  • Audio – 3.5mm audio jack, and HDMI audio
  • Connectivity – Dual band WiFi
  • USB – 2x USB 2.0 ports
  • Misc – Power button; camera, WiFi and power status LEDs; reset pinhole
  • Power Supply – TBD
  • Dimensions – 168 x 94 x 47 mm
  • Weight – 296.5 grams

The camera can not only run inference locally, but deep learning models can also be trained using Amazon's infrastructure. Performance-wise, the camera can infer 14 images per second on AlexNet, and 5 images per second on ResNet-50 for a batch size of 1.

Six project samples are currently available, including object detection, hot dog not hot dog, cat and dog, activity detection, and face detection. Read that blog post to see how to get started.

But if you want to make your own project, a typical workflow would be as follows:

  • Train a deep learning model using Amazon SageMaker
  • Optimize the trained model to run on the AWS DeepLens edge device
  • Develop an AWS Lambda function to load the model and use it to run inference on the video stream (see the sketch after this list)
  • Deploy the AWS Lambda function to the AWS DeepLens device using AWS Greengrass
  • Wire the edge AWS Lambda function to the cloud to send commands and receive inference output
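
In practice, the inference Lambda from the workflow above usually boils down to a loop that grabs camera frames, runs them through the optimized model, and publishes the results over MQTT. The sketch below is only an illustration under assumptions: the awscam helper module with its Model/getLastFrame/doInference calls, the model path, and the MQTT topic name are assumed for the example rather than taken from the official DeepLens sample code.

```python
# Illustrative only: awscam and its Model/getLastFrame/doInference API, the
# model path, and the topic name are assumptions, not the official sample.
import greengrasssdk  # AWS Greengrass core SDK available on the device
import awscam         # DeepLens device helper library (assumed API)

iot_client = greengrasssdk.client('iot-data')

def greengrass_infinite_infer_run():
    # Load the model that was optimized for the DeepLens edge device
    model = awscam.Model('/opt/awscam/artifacts/model.xml', {'GPU': 1})
    while True:
        ret, frame = awscam.getLastFrame()   # latest frame from the 4MP camera
        if not ret:
            continue
        results = model.doInference(frame)   # run the deep learning model
        # Send the raw inference output to the cloud over MQTT
        iot_client.publish(topic='deeplens/inference', payload=str(results))

# Greengrass long-lived Lambdas do their work outside the handler
greengrass_infinite_infer_run()

def lambda_handler(event, context):
    # The handler itself is unused for a long-lived inference Lambda
    return
```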

These steps are explained in detail on the Amazon blog.


Intel also published a press release explaining how they are involved in the project:

DeepLens uses Intel-optimized deep learning software tools and libraries (including the Intel Compute Library for Deep Neural Networks, Intel clDNN) to run real-time computer vision models directly on the device for reduced cost and real-time responsiveness.

Developers can start designing and creating AI and machine learning products in a matter of minutes using the preconfigured frameworks already on the device. Apache MXNet is supported today, and TensorFlow and Caffe2 will be supported in 2018’s first quarter.

AWS DeepLens can be pre-ordered today for $249 by US customers only (or those using a forwarding service) with shipping expected on April 14, 2018. Visit the product page on AWS for more details.

Google Releases Tensorflow Lite Developer Preview for Android & iOS

November 17th, 2017 No comments

Google mentioned TensorFlow Lite at Google I/O 2017 last May, an implementation of the TensorFlow open source machine learning library specifically optimized for embedded use cases. The company said support was coming to Android Oreo, but it was not possible to evaluate the solution at the time.

The company has now released a developer preview of TensorFlow Lite for mobile and embedded devices with a lightweight cross-platform runtime that runs on Android and iOS for now.

TensorFlow Lite Architecture

TensorFlow Lite supports the Android Neural Networks API to take advantage of machine learning accelerators when available, but falls back to CPU execution otherwise.

The architecture diagram above shows three components for TensorFlow Lite:

  • TensorFlow Model – A trained TensorFlow model saved on disk.
  • TensorFlow Lite Converter – A program that converts the model to the TensorFlow Lite file format.
  • TensorFlow Lite Model File – A model file format based on FlatBuffers, that has been optimized for maximum speed and minimum size.

The model file is then used within a mobile app through a C++ or Java (Android only) API, and an interpreter that can optionally use the Neural Networks API.
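
The convert-then-interpret flow can be sketched in Python. Note this relies on the tf.lite API found in later TensorFlow releases rather than the converter tool shipped with the developer preview, and the model directory, file names, and dummy input are placeholders:

```python
# Sketch only: uses the tf.lite Python API from later TensorFlow releases;
# the 2017 developer preview shipped a standalone converter instead.
import numpy as np
import tensorflow as tf

# 1. Convert a trained model (SavedModel on disk) to the .tflite file format
converter = tf.lite.TFLiteConverter.from_saved_model("mobilenet_saved_model")
tflite_model = converter.convert()
with open("mobilenet.tflite", "wb") as f:
    f.write(tflite_model)

# 2. Run the converted model with the interpreter (on Android you would use
#    the Java or C++ API instead, optionally backed by the NN API)
interpreter = tf.lite.Interpreter(model_path="mobilenet.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Dummy input with the expected shape and dtype, just to exercise the model
image = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], image)
interpreter.invoke()
scores = interpreter.get_tensor(output_details[0]["index"])
print("Top class index:", int(np.argmax(scores)))
```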

TensorFlow Lite currently supports three models: MobileNet (a class of vision models able to identify objects across 1,000 different classes), Inception v3 (an image recognition model with higher accuracy but larger size), and Smart Reply (an on-device conversational model for one-touch replies to chat messages).

The preview release is available on GitHub, where you’ll also find a demo app that can be tried with a pre-built binary, but it’s probably more fun/useful to build it from source in Android Studio instead, and change the code to experiment and learn. You can also build the complete framework and demo app from source by cloning the repo. TensorFlow Lite may also be coming to Linux soon, as one of the comments in the announcement mentions that “it should be pretty easy to build TensorFlow Lite on Raspberry PI. We plan to make sure this path works seamlessly soon“. While most of the documentation can be found on GitHub, some more info may be available on the TensorFlow Lite page.

Google Pixel Visual Core is a Custom Designed Co-Processor for Smartphone Cameras

October 18th, 2017 1 comment

Google unveiled their latest Pixel 2 & Pixel 2 XL premium smartphones powered by the Snapdragon 835 SoC earlier this month, and while they are expected to go on sale tomorrow, reviewers have already got their hands on samples, and one of the key features is the camera, which takes really good photos and videos as reported here and there.

You’d think the ISP and DSP inside the Snapdragon 835 SoC would handle any sort of processing required to take photos, but apparently that was not enough, as Google decided to design their own custom co-processor, called Pixel Visual Core, and integrate it into the Pixel 2 phones.

The co-processor features a Cortex A53 core, an LPDDR4 memory interface, a PCIe interface, and a MIPI CSI interface, as well as an image processing unit (IPU) I/O block with 8 IPU cores. Google explains the IPU block will allow 3rd party applications to leverage features like low-latency HDR+ photography, where the camera takes photos with different exposures in quick succession, and combines them to produce the best possible photo.

Each IPU core includes 512 arithmetic logic units (ALUs), and the IPU delivers more than 3 TOPS (trillion operations per second) on a mobile power budget. Pixel Visual Core allows HDR+ to run 5x faster using a tenth of the energy required to run the algorithm on the application processor (AP). Programming is done using domain-specific languages: Halide for image processing and TensorFlow for machine learning, with a Google-made compiler optimizing the code for the hardware.

Pixel Visual Core will be accessible as a developer option in the developer preview of Android Oreo 8.1 (MR1), before being enabled for any apps using the Android Camera API.

Intel Introduces Movidius Myriad X Vision Processing Unit with Dedicated Neural Compute Engine

August 29th, 2017 No comments

Intel has just announced the third generation of Movidius Vision Processing Units (VPU) with the Myriad X VPU, which the company claims is the world’s first SoC shipping with a dedicated Neural Compute Engine for accelerating deep learning inference at the edge, giving devices the ability to see, understand, and react to their environments in real time.

Movidius Myriad X VPU key features:

  • Neural Compute Engine – Dedicated on-chip accelerator for deep neural networks delivering over 1 trillion operations per second of DNN inferencing performance (based on peak floating-point computational throughput).
  • 16x programmable 128-bit VLIW Vector Processors (SHAVE cores) optimized for computer vision workloads.
  • 16x configurable MIPI Lanes – Connect up to 8 HD resolution RGB cameras for up to 700 million pixels per second of image signal processing throughput.
  • 20x vision hardware accelerators to perform tasks such as optical flow and stereo depth.
  • On-chip Memory – 2.5 MB homogeneous memory with up to 450 GB per second of internal bandwidth
  • Interfaces – PCIe Gen 3, USB 3.1
  • Packages
    • MA2085: No memory in-package; interfaces to external memory
    • MA2485: 4 Gbit LPDDR4 memory in-package

The hardware accelerators allow work to be offloaded from the Neural Compute Engine; for example, the stereo depth accelerator can simultaneously process 6 camera inputs (3 stereo pairs), each running at 720p resolution and a 60 Hz frame rate. The slide below also indicates that Myriad X has 10x higher DNN performance compared to the Myriad 2 VPU found in the Movidius Neural Compute Stick.


The VPU ships with an SDK that contains software development frameworks, tools, drivers, and libraries to implement artificial intelligence applications, such as a specialized “FLIC framework with a plug-in approach to developing application pipelines including image processing, computer vision, and deep learning”, and a neural network compiler to port neural networks from Caffe, TensorFlow, and others.

Myriad SDK Architecture

More details can be found on Movidius’ MyriadX product page.

Google Releases Android O Developer Preview 2, Announces Android Go for Low-End Devices, TensorFlow Lite

May 18th, 2017 2 comments

After the first Android O developer preview released in March, Google has just released the second developer preview during Google I/O 2017, which on top of features like PiP (picture-in-picture), notification channels, autofill, and others found in the first preview, adds notification dots, a new Android TV home screen, smart text selection, and soon TensorFlow Lite. Google also introduced the Android Go project, optimized for devices with 512MB to 1GB of RAM.

Notification dots (aka notification badges) are small dots that show on the top right of app icons – in supported launchers – when a notification is available. You can then long press the icon to check the app’s notifications, and dismiss or act on them. The feature can be disabled in the settings.

Android TV “O” also gets a new launcher that allegedly “makes it easy to find, preview, and watch content provided by apps”. The launcher is customizable as users can control the channels that appear on the homescreen. Developers will be able to create channels using the new TvProvider support library APIs.

I have found text selection in Android to be awkward and frustrating most of the time, but Android O brings improvements on that front with “Smart Text Selection”, which leverages on-device machine learning to let Android recognize entities like addresses, URLs, telephone numbers, and email addresses when copying and pasting.

TensorFlow is an open source machine learning library that, for example, enables image recognition. Android O will now support TensorFlow Lite, specifically designed to be fast and lightweight for embedded use cases. The company is also working on a new Neural Network API to accelerate computation, and both are planned for release in a future maintenance update of Android O later this year.

Finally, the Android Go project targets devices with 1GB or less of memory, and includes optimizations to the operating system itself, as well as to apps such as YouTube, Chrome, and Gboard to make them use less memory, storage space, and mobile data. The Play Store will also highlight apps with low resource requirements on such devices, but still provide access to the full catalog. “Android Go” will ship in 2018 for all Android devices with 1GB or less of memory.

You can test Android O developer preview 2 by joining the Android O beta program if you own a Nexus 5X, 6P, Nexus Player, Pixel, Pixel XL, or Pixel C device.

GPU Accelerated Object Recognition on Raspberry Pi 3 & Raspberry Pi Zero

April 30th, 2017 6 comments

You’ve probably already seen one or more object recognition demos, where a system equipped with a camera detects the type of object using deep learning algorithms, either locally or in the cloud. It’s for example used in autonomous cars to detect pedestrians, pets, other cars, and so on. Kochi Nakamura and his team have developed software based on the GoogLeNet deep neural network with a 1000-class image classification model, running on Raspberry Pi Zero and Raspberry Pi 3 and leveraging the VideoCore IV GPU found in the Broadcom BCM283x processor in order to detect objects faster than with the CPU, more precisely about 3 times faster than using the four Cortex A53 cores in the RPi 3.

They just connected a battery, a display, and the official Raspberry Pi camera to the Raspberry Pi boards to be able to recognize various objects and animals.

The first demo is with Raspberry Pi Zero.

The second demo runs on the Raspberry Pi 3 board with a better display.

Source code? Not yet, but he is thinking about it, and when/if it is released, it will probably be found on his GitHub account, where there is already a py-videocore Python library for GPGPU on the Raspberry Pi, which was very likely used in the demos above. They may also have used the TensorFlow image recognition tutorials as a starting point, and/or instructions to install TensorFlow on the Raspberry Pi.
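
For reference, the CPU-side classification flow from the TensorFlow image recognition tutorial looks roughly like the sketch below (TensorFlow 1.x API); the GPU-accelerated VideoCore implementation used in the demos is not public, and the graph file and tensor names here are placeholders:

```python
# Rough sketch of CPU-side classification in the style of the TensorFlow
# image recognition tutorial (TensorFlow 1.x API). The frozen graph file
# and tensor names are placeholders that depend on the exported model.
import numpy as np
import tensorflow as tf

GRAPH_PB = "inception_frozen.pb"        # frozen classification graph
INPUT_TENSOR = "DecodeJpeg/contents:0"  # depends on the exported graph
OUTPUT_TENSOR = "softmax:0"

# Load the frozen GraphDef from disk and import it into a new graph
with tf.gfile.GFile(GRAPH_PB, "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name="")

# Feed a JPEG image and read back the class scores
with tf.Session(graph=graph) as sess:
    with tf.gfile.GFile("test.jpg", "rb") as img:
        scores = sess.run(OUTPUT_TENSOR, {INPUT_TENSOR: img.read()})
    print("Top-5 class indices:", np.argsort(scores[0])[-5:][::-1])
```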

If you are interested in deep learning, there’s a good list of resources with links to research papers, software frameworks & applications, tutorials, etc. on GitHub.