Archive

Posts Tagged ‘machine learning’

Qualcomm Developer’s Guide to Artificial Intelligence (AI)

December 21st, 2017 3 comments

Artificial intelligence (AI) comes with many terms like ML (Machine Learning), DL (Deep Learning), CNN (Convolutional Neural Network), ANN (Artificial Neural Network), etc., and is currently made possible via frameworks such as TensorFlow, Caffe2, or ONNX (Open Neural Network Exchange).

If you have not looked into the details, all those terms may be confusing, so Qualcomm Developer Network has released a 9-page eBook entitled “A Developer’s Guide to Artificial Intelligence (AI)” that gives an overview of all the terms, what they mean, and how they differ.

For example, they explain that a key difference between Machine Learning and Deep Learning is that with ML, the input features of the CNN are determined by humans, while DL requires less human intervention. The book also explains how AI is moving to the edge / on-device for lower latency and better reliability, instead of relying on the cloud.
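
To make that distinction concrete, here is a minimal sketch in Python contrasting classical ML, where a human picks the input features, with a deep learning style model that learns from raw pixels. The load_images() helper is hypothetical, and a small multi-layer perceptron stands in for a real CNN; everything else is standard NumPy / scikit-learn.

```python
# A sketch of the ML vs DL difference described above. "load_images()" is a
# hypothetical helper returning labeled RGB images; everything else is
# standard NumPy / scikit-learn.
import numpy as np
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

def color_histogram(image, bins=16):
    """Hand-crafted feature chosen by a human: per-channel color histograms."""
    hist = [np.histogram(image[..., c], bins=bins, range=(0, 255))[0]
            for c in range(3)]
    return np.concatenate(hist).astype(float)

images, labels = load_images()  # hypothetical: (N, H, W, 3) uint8 array and N labels

# Classical machine learning: human-designed features feed a classifier
features = np.array([color_histogram(img) for img in images])
ml_model = SVC().fit(features, labels)

# Deep learning, approximated here by a multi-layer perceptron on raw pixels:
# the network learns its own internal features from the data
raw_pixels = images.reshape(len(images), -1) / 255.0
dl_model = MLPClassifier(hidden_layer_sizes=(256, 128), max_iter=50).fit(raw_pixels, labels)
```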


It also quickly goes through the workflow using the Snapdragon NPE SDK, with a total of 4 steps: 3 done on your build machine (training, conversion to DLC (Deep Learning Container) format, and addition of the NPE runtime to the app), followed by the final step of loading and running the model on the target device.
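
For illustration, the conversion step could be scripted as below. snpe-tensorflow-to-dlc is the converter shipped with the Snapdragon NPE SDK, but the exact flag names vary between SDK releases, so treat them as assumptions and check the tool’s --help output before using them.

```python
# Hedged sketch of step 2 of the workflow (trained TensorFlow model -> DLC file).
# Flag names are assumptions based on SNPE 1.x documentation and may differ
# in your SDK release.
import subprocess

subprocess.run(
    [
        "snpe-tensorflow-to-dlc",
        "--graph", "frozen_model.pb",           # trained, frozen TensorFlow graph
        "--input_dim", "input", "1,224,224,3",  # input tensor name and shape
        "--out_node", "output",                 # output node name
        "--dlc", "model.dlc",                   # resulting Deep Learning Container
    ],
    check=True,
)
```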

AWS DeepLens is a $249 Deep Learning Video Camera for Developers

November 30th, 2017 4 comments

Amazon Web Services (AWS) has launched DeepLens, the “world’s first deep learning enabled video camera for developers”. Powered by an Intel Atom X5 processor with 8GB RAM, and featuring a 4MP (1080p) camera, the fully programmable system runs Ubuntu 16.04, and is designed to expand the deep learning skills of developers, with Amazon providing tutorials, code, and pre-trained models.


AWS DeepLens specifications:

  • Camera – 4MP (1080p) camera using MJPEG, H.264 encoding
  • Video Output – micro HDMI port
  • Audio – 3.5mm audio jack, and HDMI audio
  • Connectivity – Dual band WiFi
  • USB – 2x USB 2.0 ports
  • Misc – Power button; camera, WiFi and power status LEDs; reset pinhole
  • Power Supply – TBD
  • Dimensions – 168 x 94 x 47 mm
  • Weight – 296.5 grams

The camera can not only do inference, but also train deep learning models using Amazon infrastructure. Performance-wise, the camera can infer 14 images/second on AlexNet, and 5 images/second on ResNet-50 for a batch size of 1.

Six sample projects are currently available, including object detection, hot dog not hot dog, cat and dog, activity detection, and face detection. Read the blog post to see how to get started.

But if you want to make your own project, a typical workflow would be as follows:

  • Train a deep learning model using Amazon SageMaker
  • Optimize the trained model to run on the AWS DeepLens edge device
  • Develop an AWS Lambda function to load the model and use it to run inference on the video stream
  • Deploy the AWS Lambda function to the AWS DeepLens device using AWS Greengrass
  • Wire the edge AWS Lambda function to the cloud to send commands and receive inference output

These steps are explained in detail on the Amazon blog.
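
To give an idea of what the Lambda function part looks like, here is a minimal, hedged sketch of an inference loop using the awscam module found on the device; the model path, the “ssd” model type, and the overall structure are placeholders that should be checked against Amazon’s sample projects.

```python
# Hedged sketch of an inference loop on AWS DeepLens using the on-device
# awscam module. The model path and "ssd" model type are placeholders taken
# from Amazon's object detection sample; adapt them to your own project.
import awscam

MODEL_PATH = "/opt/awscam/artifacts/model.xml"  # placeholder path to the optimized model

def lambda_handler(event, context):
    model = awscam.Model(MODEL_PATH, {"GPU": 1})  # load the optimized model
    while True:
        ret, frame = awscam.getLastFrame()        # latest frame from the 4MP camera
        if not ret:
            continue
        raw = model.doInference(frame)            # run the model on the frame
        results = model.parseResult("ssd", raw)   # decode the detections
        # ...publish `results` to the cloud via AWS IoT / Greengrass here...
```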


Intel also published a press release explaining how they are involved in the project:

DeepLens uses Intel-optimized deep learning software tools and libraries (including the Intel Compute Library for Deep Neural Networks, Intel clDNN) to run real-time computer vision models directly on the device for reduced cost and real-time responsiveness.

Developers can start designing and creating AI and machine learning products in a matter of minutes using the preconfigured frameworks already on the device. Apache MXNet is supported today, and TensorFlow and Caffe2 will be supported in the first quarter of 2018.

AWS DeepLens can be pre-ordered today for $249 by US customers only (or those using a forwarding service) with shipping expected on April 14, 2018. Visit the product page on AWS for more details.

Bolt IoT Platform Combines ESP8266, Mobile Apps, Cloud, and Machine Learning (Crowdfunding)

November 22nd, 2017 4 comments

There is plenty of hardware to implement IoT projects now, but in many cases full integration to get data from sensors to the cloud requires going through a long list of instructions. Bolt IoT, an India- and US-based startup, has taken up the task of simplifying IoT projects with their IoT platform comprised of the ESP8266-based Bolt WiFi module, a cloud service with machine learning capabilities, and mobile apps for Android and iOS.

Bolt IoT module hardware specifications:

  • Wireless Module – A.I Thinker ESP12 module based on ESP8266 WiSoC
  • Connectivity – 802.11 b/g/n WiFi secured by WPA2
  • USB – 1x micro USB for power and programming
  • Expansion – 4-pin female header and 7-pin female header with 5 digital I/Os, 1x analog I/O, and UART
  • Misc – Cloud connection LED

The hardware is not the most interesting part of Bolt IoT, since it offers similar functionality to other ESP8266 boards. But what may make the project worthwhile is built-in support for the company’s cloud service (lifetime access for backers) that simplifies node and data management, as well as the Bolt IoT mobile app to control the board with your smartphone (Android or iOS).

Some other notable features of the Bolt IoT cloud platform include:

  • Remote configuration of the pins on Bolt WiFi module from the dashboard
  • Built-in code editor, and code deployment to all your Bolt based IoT devices with a single click.
  • Data Visualization
  • Machine learning for future data prediction and anomaly detection with just a few clicks.
  • Notifications over SMS and E-Mail.
  • Integration with systems like IFTTT and Zapier
  • Integration with smart home devices like Alexa and Google Home

The whole ecosystem supposedly allows developers to work 10 times faster, and use 80% less code than other methods. The company will also provide an API that lets you manage notifications, select third party visualization tools, and control devices from your own app.
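
As an example of what using that API could look like, here is a hedged Python sketch reading an analog pin through the Bolt Cloud REST API; the endpoint path and parameter names are assumptions based on the company’s documentation, and the API key and device name are placeholders you would get from your own Bolt Cloud account.

```python
# Hedged sketch of reading a sensor through the Bolt Cloud REST API.
# The endpoint and parameter names are assumptions based on Bolt's docs;
# API_KEY and DEVICE_NAME are placeholders from the Bolt Cloud dashboard.
import requests

API_KEY = "your-api-key"
DEVICE_NAME = "BOLTXXXXXXX"

response = requests.get(
    f"https://cloud.boltiot.com/remote/{API_KEY}/analogRead",
    params={"pin": "A0", "deviceName": DEVICE_NAME},
)
print(response.json())  # e.g. {"success": 1, "value": "512"}
```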

The company launched their platform on Kickstarter at the beginning of November, and they’ve now surpassed their $10,000 funding target, having raised close to $30,000 from about 700 backers. The Bolt IoT module with lifetime access to Bolt Cloud requires a $12 pledge, but they also have kits with an Arduino baseboard and sensors, ranging from a $37 Starter Kit to the $650 Legendary Kit with multiple Bolt boards and a very long list of modules. For some reason that I may have missed, all kits also include $10 of credit with the DigitalOcean VPS provider. Bolt Cloud will be free to all backers for life, but after the KS campaign Bolt IoT will charge a fee for commercial projects, and potentially for hobbyist projects too. Shipping adds $5 to $100 depending on the selected reward, and delivery is scheduled for February 2018.

JeVois-A33 Linux Computer Vision Camera Review – Part 2: Setup, Guided Tour, Documentation & Customization

November 22nd, 2017 4 comments

Computer Vision, Artificial Intelligence, Machine Learning, etc. are all terms we hear frequently these days. The JeVois-A33 smart machine vision camera powered by an Allwinner A33 quad-core processor was launched last year on Indiegogo to bring such capabilities to a low-power, small form factor device, for example for use in robotics projects.

The company has improved the software since the launch of the project, and has now sent me their tiny Linux camera developer kit for review. I already checked out the hardware and accessories in the first post. I’ve now had time to test the camera, and I’ll explain how to set it up, test some of the key features via the provided guided tour, and show how it’s possible to customize the camera to your needs with one example.

Getting Started with JeVois-A33

In theory, you could just get started by inserting the micro SD card provided with the camera, connecting it to your computer via the USB cable, and following the other instructions on the website. But to make sure you have the latest features and bug fixes, you’d better download the latest firmware (jevois-image-latest-8G.zip), and flash it to the micro SD card with the multi-platform Etcher tool.

You could also use your own micro SD card, as long as it has 8GB or more capacity. Once this is done, insert the micro SD card into the camera with the fan of the camera and the golden contacts of the micro SD card both facing upwards. Connect the camera to your computer with the provided mini USB to USB cable. I also added a USB power meter to monitor the power consumption for the different use cases, and a USB serial cable to check the output from the console. At least that was the plan, but I got no lights from the camera, and the voltage was reported to be only 4V. Then I read the guide a little more carefully, and found out I had to use a USB 3.0 port, or two USB 2.0 ports, for power.

Once I switched to using two USB 2.0 ports from a powered USB 2.0 hub, I could see output from the serial console…

and both green and orange/red LEDs were lit. The instructions to use the JeVois camera are mostly OS agnostic, except for the video capture software. If you are using Windows you can use the free OBS Studio or AMCap programs, and on Mac, select either Photo Booth or OBS Studio. I’m an Ubuntu user, so instead I installed guvcview:

and ran it using the 640×360 resolution and YUYV format as instructed in the getting started guide:

But then I got no output at all in the app:

The last line above would repeat in a loop. The kernel log (dmesg) also reported a crash linked to guvcview:

Another person had the same problem a few months ago, and it was suggested it may be a USB problem. So I connected the camera directly to two of the USB ports on my tower, and it worked…


The important part of the settings is in the Video Controls tab, where we can change the resolution and frame rate to switch between camera modes, as we’ll see later on.

But since my tower is under the desk, the USB cable is a bit too short, and the program crashed with the same error message a few minutes later. So I went with my Ubuntu 16.04 laptop instead. Powering the camera via the USB 3.0 port worked until I started the deep learning modes, where the camera would stop, causing guvcview to gray out. Finally, I connected the camera to both my USB 3.0 port and the power bank part of the kit, and the system was then much more stable.


I contacted the company about the issues I had, but they replied that this problem has not often been reported:

… we have only received very few reports like that but we were able to confirm here using front panel ports on one machine. On my desktop I have a hub too, but usb3 and rated for fast charging (60W power supply for 7+2 ports) and it works ok with jevois. A single usb3 port on my mac laptop is also ok.

So maybe it’s just me with all my cheap devices and accessories…

So three main points to get started:

  1. Update the firmware
  2. Install the camera software
  3. Check power in case of issues / crashes (Both LEDs should be on if the camera is working)

JeVois-A33 Guided Tour

Now that we have the camera running, we can try the different features, and the best way to do so is to download the JeVois Guided Tour (PDF), which will give you an overview of the camera and how it works, as well as examples.


As shown above, the PDF includes information for each module with the name, a link to the documentation, an introduction, an explanation of the display, and on the top right the resolution/framerate that can be used to launch a given module. On the following pages, there are example pictures that you can point the camera to.
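
Since a module is selected simply by the resolution and frame rate the host application requests, any V4L2-capable program can be used instead of guvcview. Below is a small Python/OpenCV sketch of that idea; the device index and the 640×360 @ 30 fps mode are assumptions, so use the mode listed in the guided tour for the module you want to launch.

```python
# Minimal sketch: request a given resolution/frame rate from the JeVois camera
# with OpenCV, which is how the host picks the vision module that runs.
# Device index 0 and the 640x360 @ 30 fps mode are assumptions.
import cv2

cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 360)
cap.set(cv2.CAP_PROP_FPS, 30)

while True:
    ok, frame = cap.read()  # frames already contain the module's overlays
    if not ok:
        break
    cv2.imshow("JeVois", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```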

Some of the modules include:

  • Visual attention – finding interesting things
  • Face and handwritten digit recognition
  • QR-codes and other tags
  • Road detection
  • Object matching
  • Object recognition with deep neural networks
  • Color-based object tracking
  • Moving object detection
  • Record video to the microSD card inside JeVois
  • Motion flow detection
  • Eye tracking
  • and more…

You could print the guide with a color printer, but the easiest way is probably to use two screens, one with the PDF guide open, and the other running the camera application (guvcview, OBS Studio…). I’ve gone through some of the examples in the guided tour in the video below, with the PDF shown on a TV box, and the camera application output shown in a corner of the laptop screen.

That’s a lot of fun, and everything works pretty well most of the time. Some of the tests are quite demanding for such a low-power device; for example, the Darknet-based “Deep neural scene analysis” using 1280×480 @ 15 fps with the ability to recognize multiple object types would only refresh the results every 2.7 seconds or so.

Documentation & Customization of Salient SURF Module

If you’ve gone through the guided tour, you should now have a good understanding of what the camera is capable of. So now, let’s take one of the modules, and try to adjust it to our needs. I picked the SaliencySURF module, with the documentation available here, for this section of the review. Introduction for the module:

Trained by default on blue iLab logo, point JeVois to it and adjust distance so it fits in an attention box.
Can easily add training images by just copying them to microSD card.
Can tune number and size of salient regions, can save regions to microSD to create a training set

So let’s take a few other images (Tux logos), copy them to the micro SD card in the camera, and tune some of the settings.

Ideally the camera should also be detected as a storage device, so that we can easily copy files and edit parameters, and on my computer it was shown as a UVC camera, a USB ACM device, and a USB storage device when I connected it:

But for some reason, I could not see the /dev/sdb storage after that:

[Update: We can use the jevois-usbsd script to access the camera storage from the host computer / board:

]

So instead I had to take out the micro SD card from the camera, and copy the files to the /modules/JeVois/SaliencySURF/images/ directory in the JEVOIS partition.

The module will process those photos when we start it, and return the name of the file when the object is detected.

We can go back to the SaliencySURF directory to edit the params.cfg file, and change some parameters to determine how strict a match should be, taking into account that stricter matching may mean the object is not detected, while looser matching may produce false positives. But this is where it gets a little more complicated, as we’ll see from a subset of the list of parameters.


I cannot understand what half of the parameters are supposed to do. That’s where you can click on the SaliencySURF / Saliency links to access the base documentation, find out how the module works, learn more about each parameter, and easily access the source code for the functions used by the module. That type of documentation is available for all modules used in the JeVois C++ framework, and it’s a very good learning tool for people wanting to know more about computer vision. You’ll have to be familiar with C++ to understand the code and what it really does, besides learning the jargon and acronyms specific to computer vision or machine learning.

By default, the params.cfg file includes just two lines:

Those are the parameters for the ObjectMatcher module, with goodpts corresponding to the range of good matches considered, and distthresh being the maximum distance for a match to be considered good.

I’ve set looser settings in params.cfg:

I saved the file, put the micro SD card back into the camera, and launched guvcview with the 320×288 @ 30 fps resolution/framerate to enter SaliencySURF mode.


Oops, it’s seeing Tux logos everywhere, even when there are none whatsoever, so our settings are clearly too loose. So I went back to the default settings, but the result was still similar, and since the distance was shown to be 0.30 in my first attempt, I reduced distthresh to 0.2. False positives are now mostly gone, except for very short periods of time, and it’s now detecting the CNX Tux logo accurately. Note that the green square is for object detection, and the white squares are for saliency zones.

However, it struggles to detect my third Tux logo repeatedly, often falling back to the CNX Tux logo.

But as you can see from the green square, the detection was done on the left flap of the penguin. That’s because SaliencySURF detection is done in a fixed size zone (64×64 pixels by default), so camera distance, or the size of the zone, matters. You can change the size of the salient regions with the SaliencySURF rsiz parameter, which defines the height and width of the square in pixels. When I did the test, I first tried to detect the logo from the list of Tux images returned by a DuckDuckGo search, but it was too small and blurry. After switching to a bigger photo, the cable was too short to cover the logo, so instead I copied it to GIMP and resized it so that it could fit in the 64×64 square while using the camera, and in this case detection worked reasonably well.

The more you use the camera, the better you’ll be at understanding how it works, and leveraging its capabilities.

Final Words

The JeVois-A33 camera is an inexpensive way to get started with computer vision and deep learning, with excellent documentation, and if you put in the effort, you’ll even understand how it works at the source code level. It’s also fun to use, with many different modules to try. I have not tried it in this review due to time limitations, but you could also connect the camera to an Arduino board controlling a robot (cat chasing bot anyone?) via the serial interface.

The main challenges you may face while getting started are:

  1. Potential crashes due to power issues, but that’s solvable, and a power issues troubleshooting guide has even been published
  2. For robotics projects, you have to keep in mind there will be some lag for some modules, for example from 500ms (single object) to 3 seconds (YOLO test with multiple object types) for deep learning algorithms. Other modules, such as ArUco marker detection, are close to real-time performance however.

Bear in mind that all processing is done by the Allwinner A33 CPU cores, as the Mali-400MP GPU is not suitable for GPGPU. As more affordable SoCs with OpenCL/Vulkan capable GPUs (e.g. Mali-T720) are launched, in some cases even with an NNA (Neural Network Accelerator), we’ll be able to get similar low power smart cameras, but with much better computer vision performance.

JeVois-A33 can be purchased for $49, but to avoid wasting time with power issues, and to give yourself more options, I’d recommend going with the JeVois-A33 Developer/Robotics Kit reviewed here, going for $99.99 on Amazon, RobotShop, or the JeVois Store.

Google Releases Tensorflow Lite Developer Preview for Android & iOS

November 17th, 2017 No comments

Google mentioned TensorFlow Lite at Google I/O 2017 last May, an implementation of the TensorFlow open source machine learning library specifically optimized for embedded use cases. The company said support was coming to Android Oreo, but it was not possible to evaluate the solution at the time.

The company has now released a developer preview of TensorFlow Lite for mobile and embedded devices, with a lightweight cross-platform runtime that runs on Android and iOS for now.

TensorFlow Lite Architecture

TensorFlow Lite supports the Android Neural Networks API to take advantage of machine learning accelerators when available, but falls back to CPU execution otherwise.

The architecture diagram above shows three components for TensorFlow Lite:

  • TensorFlow Model – A trained TensorFlow model saved on disk.
  • TensorFlow Lite Converter – A program that converts the model to the TensorFlow Lite file format.
  • TensorFlow Lite Model File – A model file format based on FlatBuffers, that has been optimized for maximum speed and minimum size.

The model file is then deployed within a mobile app using a C++ or Java (Android only) API, and an interpreter, optionally using the Neural Networks API.
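
As a quick illustration of that flow, the sketch below converts a saved TensorFlow model and runs it with the TensorFlow Lite Python interpreter. Note that this Python API comes from TensorFlow releases newer than the developer preview covered here, and the paths and dummy input are placeholders.

```python
# Sketch of the TensorFlow -> TensorFlow Lite flow using the Python API from
# later TensorFlow releases (the 2017 developer preview used different tooling).
# "saved_model_dir" and the zero-valued input are placeholders.
import numpy as np
import tensorflow as tf

# 1. Convert a trained model to the FlatBuffers-based .tflite format
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
with open("model.tflite", "wb") as f:
    f.write(converter.convert())

# 2. Load and run the converted model with the lightweight interpreter
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

dummy_input = np.zeros(input_details[0]["shape"], dtype=np.float32)
interpreter.set_tensor(input_details[0]["index"], dummy_input)
interpreter.invoke()
print(interpreter.get_tensor(output_details[0]["index"]))
```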

TensorFlow Lite currently supports three models: MobileNet (a class of vision models able to identify across 1000 different object classes), Inception v3 (an image recognition model with higher accuracy but a larger size), and Smart Reply (an on-device conversational model for one-touch replies to chat messages).

The preview release is available on Github, where you’ll also find a demo app that can be tried with a pre-built binary, but it’s probably more fun/useful to instead build it from source in Android Studio and change the code to experiment and learn. You can also build the complete framework and demo app from source by cloning the repo. TensorFlow Lite may also be coming to Linux soon, as one of the comments in the announcement mentions that “it should be pretty easy to build TensorFlow Lite on Raspberry PI. We plan to make sure this path works seamlessly soon“. While most of the documentation can be found on Github, some more info may be available on the TensorFlow Lite page.

Google Pixel Visual Core is a Custom Designed Co-Processor for Smartphone Cameras

October 18th, 2017 1 comment

Google unveiled their latest Pixel 2 & Pixel 2 XL premium smartphones powered by the Snapdragon 835 SoC earlier this month, and while they are expected to go on sale tomorrow, reviewers have got their hands on samples, and one of the key features is the camera, which takes really good photos and videos as reported here and there.

You’d think the ISP and DSP inside the Snapdragon 835 SoC would handle any sort of processing required to take photos. But apparently that was not enough, as Google decided to design their own custom co-processor, called Pixel Visual Core, and integrate it into the Pixel 2 phones.

The co-processor features a Cortex-A53 core, an LPDDR4 memory interface, a PCIe interface, and a MIPI CSI interface, as well as an image processing unit (IPU) IO block with 8 IPU cores. Google explains the IPU block will allow 3rd party applications to leverage features like low latency HDR+ photography, where the camera takes photos with different exposures very quickly, and “juxtaposes” them to provide the best possible photo.

Each IPU core includes 512 arithmetic logic units (ALUs), and the IPU delivers more than 3 TOPS (trillion operations per second) on a mobile power budget. Pixel Visual Core allows HDR+ to run 5x faster using a tenth of the energy required by running the algorithm on the application processor (AP). Programming is done using domain-specific languages: Halide for image processing and TensorFlow for machine learning, and a Google-made compiler optimizes the code for the hardware.

Pixel Visual Core will be accessible as a developer option in the developer preview of Android Oreo 8.1 (MR1), before being enabled for any apps using the Android Camera API.

NVIDIA DRIVE PX Pegasus Platform is Designed for Fully Autonomous Vehicles

October 11th, 2017 1 comment

Many companies are now involved in the quest to develop self-driving cars, getting there step by step, with 6 levels of autonomous driving defined as follows (based on info from Wikipedia):

  • Level 0 – Automated system issues warnings but has no vehicle control.
  • Level 1 (”hands on”) – Driver and automated system share control of the vehicle. Examples include Adaptive Cruise Control (ACC), Parking Assistance, and Lane Keeping Assistance (LKA) Type II.
  • Level 2 (”hands off”) – The automated system takes full control of the vehicle (accelerating, braking, and steering), but the driver is still expected to monitor the driving, and be prepared to immediately intervene at any time. You’ll actually have your hands on the steering wheel, just in case…
  • Level 3 (”eyes off”) – The driver can safely turn their attention away from the driving tasks, e.g. the driver can text or watch a movie. The system may ask the driver to take over in some situations specified by the manufacturer such as traffic jams. So no sleeping while driving 🙂 . The Audi A8 Luxury Sedan was the first commercial car to claim to be able to do level 3 self driving.
  • Level 4 (”mind off”) – Similar to level 3, but no driver attention is ever required. You could sleep while the car is driving, or even send the car somewhere without you being in the driver’s seat. There’s a limitation at this level, as self-driving mode is limited to certain areas or special circumstances. Outside of these areas or circumstances, the vehicle must be able to safely park itself if the driver does not retake control.
  • Level 5 (”steering wheel optional”) – Fully autonomous car with no human intervention required, no other limitations

So the goal is obviously to reach level 5, which would allow robotaxis, or safely drive you home whatever your blood alcohol or THC levels. This however requires lots of redundant (for safety) computing power, and current autonomous vehicle prototypes have a trunk full of computing equipment.

NVIDIA has condensed the AI processing power required for level 5 autonomous driving into the DRIVE PX Pegasus AI computer, which is roughly the size of a license plate, and capable of handling inputs from high-resolution 360-degree surround cameras and lidars, localizing the vehicle within centimeter accuracy, tracking vehicles and people around the car, and planning a safe and comfortable path to the destination.

The computer comes with four AI processors said to deliver 320 TOPS (trillion operations per second) of computing power, ten times faster than NVIDIA DRIVE PX 2, or about the performance of a 100-server data center according to Jensen Huang, NVIDIA founder and CEO. Specifically, the board combines two NVIDIA Xavier SoCs and two “next generation” GPUs with hardware accelerated deep learning and computer vision algorithms. Pegasus is designed for ASIL D certification with automotive inputs/outputs, including CAN bus, FlexRay, 16 dedicated high-speed sensor inputs for camera, radar, lidar and ultrasonics, plus multiple 10Gbit Ethernet ports.

Machine learning works in two steps, with training done on the most powerful hardware you can find, and inferencing done on cheaper hardware. For autonomous driving, data scientists train their deep neural networks on the NVIDIA DGX-1 AI supercomputer, being able for example to simulate driving 300,000 miles in five hours by harnessing 8 NVIDIA DGX systems. Once training is completed, the models can be updated over the air to NVIDIA DRIVE PX platforms, where inferencing takes place. The process can be repeated regularly so that the system is always up to date.

NVIDIA DRIVE PX Pegasus will be available to NVIDIA automotive partners in H2 2018, together with the NVIDIA DRIVE IX (intelligent experience) SDK, meaning level 5 autonomous cars, taxis, and trucks based on the solution could become available in a few years.

Google’s Teachable Machine is a Simple and Fun Way to Understand How Machine Learning Works

October 9th, 2017 4 comments

Artificial intelligence, machine learning, deep learning, neural networks… are all words we hear more and more today, as machines gain the ability to recognize objects, answer voice requests / commands, and so on. But many people may not know the basics of how machine learning works at all, and with that in mind, Google launched the Teachable Machine website to let people experiment with and understand the basics behind machine learning without having to install an SDK or even code.

So I quickly tried it with Google Chrome, as it did not seem to work with Mozilla Firefox. It’s best to have audio on, as a voice explains how to use it.

Basically you connect your webcam, authorize Chrome to use it, and you should see the image in the input section on the left. After that, you can train the machine in the learning section in the middle with three different classes. You’ll be asked to wave your hand and keep pressing the “Train Green” button until you have at least 100 examples. At this stage, the machine will always detect the green class, since that’s all it knows. Then you can train the purple class by staying still, again making sure you have at least 100 examples before you release the button. Now the machine should be able to detect whether you stay still or move, with a varying percentage of confidence. The output section will just show some animated GIFs, play sounds, or say words depending on what it detects. It can learn actions (staying still, waving hands, clapping hands) and object detection. My webcam is pretty bad, but if you have a good image, you should also be able to detect feelings like happiness, sadness, anger, anxiousness, etc… Give it a try, it’s fun.
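
Under the hood, this amounts to a nearest-neighbor classifier over the example frames you record for each class (the real site compares features from a pretrained network rather than raw pixels). The Python sketch below reproduces the idea with scikit-learn; get_webcam_frame() is a hypothetical helper returning a small grayscale frame as a NumPy array.

```python
# Sketch of the Teachable Machine idea: record ~100 example frames per class,
# then classify new frames by their nearest recorded neighbors.
# get_webcam_frame() is a hypothetical helper returning a small grayscale
# frame; raw pixels stand in for the learned features the real site uses.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def record_examples(label, count=100):
    """Capture `count` frames while the user performs the action for `label`."""
    return [(get_webcam_frame().ravel(), label) for _ in range(count)]

examples = record_examples("green") + record_examples("purple")  # wave hand, then stay still
X = np.array([frame for frame, _ in examples])
y = [label for _, label in examples]

classifier = KNeighborsClassifier(n_neighbors=5).fit(X, y)

# Classify the current frame and report a confidence per class, as on the site
current = get_webcam_frame().ravel()
probabilities = classifier.predict_proba([current])[0]
print(dict(zip(classifier.classes_, probabilities)))
```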

The Teachable Machine has been built with a new open source hardware-accelerated JavaScript library called deeplearn.js, and Google released the source code for the website too.