Archive

Posts Tagged ‘how-to’

NanoPi NEO NAS Kit Review – Assembly, OpenMediaVault Installation & Setup, and Benchmarks

June 18th, 2017 55 comments

NAS Dock v1.2 for NanoPi NEO / NEO 2 is, as the name implies, a complete mini NAS kit for a 2.5″ drive based on the NanoPi NEO or NEO 2 board. The NEO 2 board is strongly recommended, since it's not much more expensive, and should deliver much better results thanks to its Gigabit Ethernet interface. I've received two of these kits together with several other boards & accessories from FriendlyELEC, and today I'll show how to assemble the kit, configure OpenMediaVault, and run some benchmarks.

NAS Kit V1.2 Assembly with NanoPi NEO 2 Board

The only extra tool you’ll need is a screwdriver, and potentially a soldering iron as we’ll see further below.
The metal box is stuffed with accessories, so the first thing to do is to open one or two sides to take out the contents. We have the mainboard, a NanoPi NEO back plate, a NanoPi NEO 2 back plate, a heatsink and thermal pad, and a set of five screws to fasten the hard drive, which means there's one spare screw. FriendlyELEC always adds extra screws, and I find it's a nice touch, as it can be a real pain if you happen to lose one.


Let's have a closer look at the "1-bay NAS Dock v1.2 for NanoPi NEO/NEO2" board. We have a UAS capable USB 3.0 to SATA bridge chip between the two headers for the NanoPi NEO board (note that the connection will be limited to USB 2.0 speeds, since the board only supports that), an LED, a USB 2.0 host port for a printer, WiFi dongle, or webcam, the power switch, the power jack, a 3-pin serial header, an I2C connector for Grove modules, and of course the SATA connector.


There’s not much on the other side of the board, except a CR2032 battery slot for the RTC.

Before going further, you'll need to go to the Wiki, and get the latest OpenMediaVault firmware, in my case nanopi-neo2_debian-nas-jessie_4.11.2_20170531.img.zip, which I then flashed to a micro SD card with the Etcher program.

Once this is done, install the heatsink and thermal pad on your NanoPi NEO 2 board, and insert the micro SD card into the board.

Notice that I also soldered the headers. While it would be obvious to people who have looked at the pinout diagram, I've read that some people connected the board using only the (pre-soldered) 4-pin header, as they may have believed it was a USB header, but it's actually the serial console, and obviously the hard drive was not detected. If you don't feel like soldering the headers to the board yourself, make sure you tick the option "with pin headers soldered" when ordering. It just costs $1 extra.

Now we can insert our board into the "1-bay NAS Dock" board, install the hard drive, and optionally an I2C module. I connected an I2C OLED display in the picture below for illustration, but actually using the display would require cutting out the case. Some people may want to connect an I2C temperature sensor instead.


I used four screws to fasten the hard drive on the other side of the board, and installed a CR2032 battery for the real-time clock.


Finally, you'll need a 12V power supply capable of at least 1A, but I could not find any (safe) spare ones, so I used a Maxoak K2 power bank instead, since it can output 12V @ 2.5A max.


OpenMediaVault Setup on NanoPi NEO 2 Board

So I connected everything, and applied power, but the board would not boot, with the Ethernet Link LED blinking in a regular fashion, meaning something was very wrong. So I took out the board, connected a serial debug board, and opened the console with minicom using 115200 8N1, with a command like the one below (assuming the USB to TTL adapter shows up as /dev/ttyUSB0):
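    minicom -D /dev/ttyUSB0 -b 115200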

The boot was just stuck there. I re-inserted the micro SD card in my PC, and I could see both the boot and rootfs partitions, so everything looked good.
Then I powered the NanoPi NEO 2 board with a 5V/2A power supply only, and the boot succeeded.

Then I went back to the 12V power input on the NAS Kit with the power bank, and the boot succeeded. Very strange. It turned out the board would not boot most of the time, but the symptom was not reproducible 100% of the time. This kind of random behavior is usually a timing or signal integrity issue, so I thought the micro SD card might not play well with the board, or the power bank output might not be so clean. I first flashed another micro SD card, but got the same results. I then used another 12V/5A power supply, and it did not really help either. Finally, I used another NanoPi NEO 2 board, and that one appears to be stable.

You can find the board using FriendlyELEC.local if Bonjour services are running on your computer:
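For example:

    ping FriendlyELEC.local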

Alternatively, you could check out the IP address in other ways. In my case, I just typed friendlyelec.local in Firefox to access the web interface. The default username and password are admin and openmediavault.


After login, you can access the dashboard showing system information, and which services are running. You may want to disable the services you don’t need.


You can go to Storage->Physical Disks to check if your hard drive has been detected. No problem for me here with a 931.51 GiB drive detected.


You may then want to set up a fixed IP address. There are various ways to do this, but I went to Network->Interfaces and set eth0 to a fixed IP address. You'll be asked to apply the changes once it's done.


I also changed the hostname to CNX-NEO2-NAS in the General tab.

After that, I decided to address some security issues, first by changing the administrator password in General Settings->Web Administrator Password.

I then went to Access Rights Management->User to find out there were two pre-configured users: pi and fa. I deleted the fa user, changed the pi user's password, and added it to the ssh group. It's probably even better to just delete both users, and create your own.

The root user is not shown, but you'll want to login as root through ssh first and change the password, as the default password is fa. Once this is done, you'll have better security, and your system should not be easily accessible via basic "hacks". For more security, you'll still want to install an SSL certificate. A self-signed one should do if you plan to use it only on the local network, but you may also consider a free Let's Encrypt certificate instead.
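For example, assuming the default hostname resolves as above:

    ssh root@friendlyelec.local
    passwd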

We can now take care of the hard drive. I went to Storage->File Systems, and clicked on +Create file system, which lets you choose between BTRFS, EXT3, EXT4, XFS, and JFS. I've gone with EXT4 first.


After a few minutes, your drive should be formatted, so we can configure network shares. I want to use SAMBA and SFTP to transfer files for the purpose of this review, so I went to Access Rights Management->Shared Folders to add a new share called HDD for the root of the hard drive. You may want to add multiple shares if you plan to split videos, documents, music, and so on.


I clicked Save, and selected ACL to add permissions to pi and admin users. You can add whatever users you plan to use to access the share.


That shared folder can now be assigned to the services you plan to use. SFTP is enabled by default when SSH is running, so I created a SAMBA/CIFS share by going to Services->SMB/CIFS->Shares to add the share.


Browsing the network with Nautilus would show both cnx-neo2-nas – SMB/CIFS and cnx-neo2-nas – SSH (SFTP) shares.

Configuration is now complete. I have not found a clean way to power off the system, so I normally open a terminal session via ssh and run the shutdown now command. A software button to turn off the NAS would have been a nice feature on the kit.

I also often encountered the error "Software Failure. Press left mouse button to continue. Session not authenticated." because the session timeout is set to 5 minutes. If you prefer a longer timeout, you can change it in General Settings->Web Administration.

In case you want to use the RTC, you may first want to set the timezone:
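On this Debian-based image, something like this should do it (a hedged example):

    sudo dpkg-reconfigure tzdata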

Check the date is correct, write it to the hardware clock, and read it back to confirm. With the standard util-linux hwclock tool, that should look something like this:
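    date
    sudo hwclock -w
    sudo hwclock -r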

You can test it by rebooting the board without the Ethernet cable connected, and checking the date is still correct.

Perfect! You'd just have to make sure the system time is restored from the hardware clock automatically at boot time (i.e. that hwclock -s, or equivalent, is run) when the RTC is set. It would be good if FriendlyELEC updated their image to do that out of the box.

NAS Dock V1.2 + NanoPi NEO 2 Benchmarks

Since I can now copy files and folders over SAMBA and SFTP, we can start running some benchmarks to evaluate performance. I'll use EXT-4, BTRFS, and XFS file systems on the hard drive, and run iozone to specifically test storage performance, followed by copying large and small files over SAMBA and SFTP to test real-life NAS performance. For the large file copy, I'll use a folder with 7 large files totaling 6.5 GB, and for small files, I've done a fresh checkout of the Linux kernel on my computer:
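Something like this – the exact mirror does not matter:

    git clone https://github.com/torvalds/linux.git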

and removed the symlinks, since they may cause issues during the copy, as well as the .git directory, which contains a huge 1.8GB file:
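For example:

    find linux -type l -delete
    rm -rf linux/.git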

The end result is a directory with 64,013 files totaling 748.6 MB.

Iozone results
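For reference, each result below came from an invocation along these lines – the exact flags are my assumption, based on common iozone usage for this kind of test:

    iozone -e -I -a -s 100M -r 4k -r 16k -r 512k -r 1024k -r 16384k -i 0 -i 1 -i 2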

EXT-4:

BTRFS:

XFS:

I've taken the results at 16384kB reclen for read, write, random read, and random write to draw a chart, since most people are likely going to store large files on their NAS. The smaller reclen results could be interesting if you plan to handle smaller files.

All three file systems have a very good read speed of around 40 MB/s, but BTRFS write appears to be the fastest of the three, with EXT-4 being the weakest at around 25 MB/s. But for some reason, those results turn out to be useless in practice, as we'll see below. Finding the exact reason would probably require studying and profiling iozone and the kernel source code, which is outside the scope of this review.

File copy over SAMBA and SFTP

Results for large files in minutes and seconds.

File Copy – Large Files    SMB Write    SMB Read    SFTP Write    SFTP Read
EXT4                       02:49.00     02:40.00    03:54.00      04:15.00
BTRFS                      03:20.00     02:40.00    03:48.00      04:32.00
XFS                        02:45.00     02:38.00    03:36.00      04:23.00

Chart converted to MB/s.

Read and Write Speeds in MB/s

First, we can see very good read performance from the NAS (NAS to my PC), with 41 to 42 MB/s, close to the theoretical limit of a USB 2.0 connection. Write speed is a little different, as the files were transferred more slowly with BTRFS, and at around 40 MB/s with EXT-4 and XFS. Since SFTP is encrypted, transfer speeds are lower, and roughly the same for all three file systems. Overall, the file system you choose does not really impact performance with large files.

Results for small files in minutes and seconds.

File Copy – Small Files    SMB Write    SMB Read    SFTP Write    SFTP Read
EXT4                       15:26.00     18:34.00    09:02.00      12:48.00
BTRFS                      18:48.00     18:02.00    10:30.00      11:30.00
XFS                        17:33.00     18:22.00    09:18.00      12:35.00

Chart converted to MB/s.

Transferring a large number of small files over SAMBA is really slow, and barely faster over SFTP. Again, there aren't any significant differences between file systems here. If you are going to transfer a large number of small files over the network, you may want to either compress the files before the transfer, or compress them on the fly using the command line:
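A minimal sketch of the on-the-fly method, assuming the pi user and that the shared folder is mounted at /media/hdd on the NAS – adjust names and paths to your setup:

    time (tar cf - linux | ssh pi@cnx-neo2-nas "tar xf - -C /media/hdd")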

It took just 1 minute and 49 seconds to transfer all 64,013 files that way, or over five times faster than an SFTP write to XFS, at an effective rate of around 6.86 MB/s. So knowing your tools may matter as much as having the right hardware.

I was going to run a last part after enabling the optimizations provided by tkaiser, but it turns out FriendlyELEC has already done that in their firmware image.

If you want to reproduce the setup above, you'll need to purchase NAS Kit v1.2 for $12.99, and a NanoPi NEO 2 with soldered headers for $15.99. If you don't have a 2.5″ hard drive, you'll need to add that too, as well as a 12V power supply, which you could purchase locally, or on FriendlyELEC's website for under $10. All in all, that's cheaper than a similar kit with a Raspberry Pi 3 board, and you'll get close to four times the SAMBA performance for large files, since the RPi 3 is limited to 10 to 12 MB/s due to its Fast Ethernet connection.

Micro SD Cards for Development Boards – Classes, Tools, Benchmarks, Reliability, and Tips & Tricks

June 13th, 2017 38 comments

When people plan to use a development board for their project, they mainly focus on the requirements of the development board itself, as well as software support. But selecting the right accessories may be just as crucial for good performance and stability. For example, selecting a proper power supply is important, as the board may freeze or randomly reboot if it is not fed at the right voltage. Part of this is also selecting a proper micro USB cable, as you'll want a cable with minimal resistance, which can be achieved through shorter cables and/or a lower AWG value. Another important item that can impact stability and performance is the micro SD card used to run the operating system on the development board.

Understanding SD Card Performance Metrics & Classes

Until a few years ago, (micro) SD cards were primarily used to store data such as photos, videos, and music. In those use cases, you have large files that benefit from high sequential read and write speeds. That's why the SD Association created different classes to specify a minimum write speed: Speed Class, UHS Speed Class, and more recently Video Speed Class.

That's useful, as you won't need the same write speeds to copy music files as to record videos with a 4K camera, and cards with a lower class are normally cheaper. Note that most (all?) low cost development boards do not support the UHS-II or UHS-III interfaces, so such cards may work, but you won't be able to reach their maximum performance. Micro SD card controllers are also often connected via SDIO, which limits sequential performance to around 23 MB/s.

However, those classes are only marginally useful for micro SD cards used in development boards, since operating systems generate a lot of small read and write operations, for example for databases and web caches, rather than handling large files. That's why storage devices may also include read and write IOPS (I/O operations per second) numbers, which are more useful for this type of use case. The table below about Samsung eMMC and UFS flash chips shows some of those numbers.


The only problem is that until very recently, there was no way to know minimum R/W IOPS numbers for SD cards, since they were simply not made available by the manufacturers, so instead people were recommending Class 10 micro SD cards, which may or may not be best suited for use in development boards. However, since the introduction of Adoptable Storage in Android, where micro SD cards can be used for apps instead of just data, IOPS values have become more important, and the SD Association has now introduced the A1 and A2 application performance classes for SD cards, specifying minimum R/W IOPS values together with a 10MB/s minimum sustained sequential write speed.

That's great! Problem solved then? Not quite, as manufacturers have only started to use the application performance class logos on their higher-end SD cards with high capacities like 256GB, which usually cost several times the price of your typical development board. So we need to rely on the community to test R/W IOPS values, or random read and write speeds.

Software Tools for SD Cards

There are several programs that can test performance. hdparm and dd are popular, but the former only tests sequential speed for a very short time, and the latter also only tests sequential speed, and many people may not include the time it takes to flush the cache to the actual card (e.g. with sync), leading to potentially misleading results that are mostly irrelevant to our use case.

So instead, there are specific tools to test both sequential speeds and random I/O performance. Bonnie++ is one of them, but recently iozone has become the reference for disk I/O testing, and that's the one used in the benchmarks below.

Many tutorials recommend flashing firmware using Win32DiskImager in Windows, or dd in Linux. However, while I have not experienced the issue myself, I've been told the latter may not always detect errors while flashing, and a new tool is now recommended: Etcher. It works in Windows, Linux, or Mac OS, using a GUI or the command line, and will verify the SD card after flashing, making sure nothing has gone wrong during the process.

SD Cards Benchmarks

With that in mind, we need micro SD cards with good random R/W performance, which we can test with iozone3, and for many applications 8 to 32 GB capacities are enough. We were planning to run tests ourselves, but Andreas Spiess has already published a video – embedded further below – leveraging the pidramble work, with benchmarks using several micro SD cards on a Raspberry Pi 3 board. You'll also find more results and discussion on the Armbian forums. Linaro also made a survey of micro SD cards a while ago.

Random Read & Write Speeds in MB/s – Cards Sorted by Random Write Speed

Here are the main takeaways from the tests:

  • Larger cards are usually faster, but not always (see Samsung EVO+ 64GB results).
  • Benchmarks vary a lot between runs for a given card, with 10 to 20% variations.
  • Most low cost micro SD cards / clones have really bad random write performance.
  • Samsung and SanDisk cards get the best random I/O performance.
  • It's not always necessary to buy the most expensive cards for usable performance.

The conclusion is that for the best performance you should select cards like Samsung EVO+ 32GB or Samsung Pro+ 32GB, selling respectively for around $18 and $25 on Amazon US. A cheaper card that should still perform decently is the SanDisk Extreme Pro 8GB, sold for $8.

The Problem of Fake Micro SD Cards

It looks like we have now found a decent list of micro SD cards to use with our development boards. But there's still a problem, as some cards are counterfeit, even on sites like Amazon, since manufacturers of fake cards may insert them into the supply chain. One way to double check whether you have an original is to look at the CID (Card Identification) number. In Linux, you can do so as follows:
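Assuming the card shows up as mmcblk0 (adjust the device name if needed):

    cat /sys/block/mmcblk0/device/cid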

The first two digits represent the manufacturer ID. There's no public list, but "1b" is for Samsung. You can decode the full CID here.

If you've bought a Samsung SD card, and the manufacturer ID is different from 0x1b, then you almost certainly have a fake Samsung SD card. However, if the number is 0x1b, you still can't be 100% sure you have a genuine SD card, as the clones may have also cloned that part.

Sadly, there's no other easy way to check if a card is real or fake. Checking that the performance is similar to other reports online may help confirm you have a proper card. This 2010 article by Bunnie has more details about fake SD cards.

Longevity and Reliability of micro SD Cards

I've been using micro SD cards for about 5 years, both in development boards and my phone, and I have to say I've had to throw away many of them due to I/O errors after a while. I've also found my card readers only last a few months. So far, I've mostly bought cheaper ones, so that could be an explanation. Sadly, manufacturers do not provide any MTBF (Mean Time Between Failures) figures for SD cards, so there's no data like there is for hard drives and SSDs.

However, there are ways to limit wear and tear on micro SD cards by limiting writes, either by making the rootfs read-only – which is fine for embedded projects, less so if you plan on using your board as a computer – or by using log2ram, extending the write buffer commit interval, etc… More discussion about SD card lifespan can be found on Pete Scargill's blog.
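As a hedged example of the commit interval trick, the root entry in /etc/fstab could look like the line below; 600 seconds is an arbitrary value, and a longer interval means more data can be lost on a power failure:

    /dev/mmcblk0p1  /  ext4  defaults,noatime,commit=600  0  1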

Some people will also encounter corrupted flash that won't boot anymore after a power failure or a crash, because the file system has been corrupted. This is not directly related to the SD card itself, and it's more likely to occur if you have a poor power supply. Reducing the probability of corruption involves the same steps as previously mentioned to limit wear and tear, with a read-only file system and so on.

Another solution to avoid most of the issues mentioned above is to use boards with eMMC flash, but of course going that route would mean avoiding Raspberry Pi boards, barring the RPi Compute Module. Nevertheless, a combination of a carefully selected power supply, USB cable, and micro SD card should still give you a good experience, and avoid many potential headaches, with your board.

Thanks to Karl and Thomas for tips and insights.

Getting Started with OpenCV for Tegra on NVIDIA Tegra K1, CPU vs GPU Computer Vision Comparison

May 24th, 2017 No comments

This is a guest post by Leonardo Graboski Veiga, Field Application Engineer, Toradex Brasil

Introduction

Computer vision (CV) is everywhere – from cars to surveillance and production lines, the need for efficient, low-power yet powerful embedded systems is nowadays one of the bleeding edge scenarios of technology development.

Since this is a very computationally intensive task, running computer vision algorithms in an embedded system CPU might not be enough for some applications. Developers and scientists have noticed that the use of dedicated hardware, such as co-processors and GPUs – the latter traditionally employed for graphics rendering – can greatly improve CV algorithms performance.

In the embedded scenario, things usually are not as simple as they look. Embedded GPUs tend to be different from desktop GPUs, thus requiring many workarounds to get extra performance from them. A good example of a drawback of embedded GPUs is that they are hardly supported by OpenCV – the de facto standard library for computer vision – thus requiring a big effort from the developer to achieve some performance gains.

The silicon manufacturers are paying attention to the growing need for graphics and CV-oriented embedded systems, and powerful processors are being released. This is the case with the NVIDIA Tegra K1, which has a built-in GPU using the NVIDIA Kepler architecture, with 192 cores and a processing power of 325 GFLOPS. In addition, this is one of the very few embedded GPUs in the market that supports CUDA, a parallel computing platform from NVIDIA. The good news is that OpenCV also supports CUDA.

And this is why Toradex has decided to develop a System on Module (aka Computer on Module) – the Apalis TK1 – using this processor. In it, the K1 SoC's quad-core ARM Cortex-A15 CPU runs at up to 2.2GHz, interfaced with 2GB of DDR3L RAM and a 16GB 8-bit eMMC. The full specification of the CoM can be found here.

The purpose of this article is to install the NVIDIA JetPack on the Apalis TK1 System on Module, thus also installing OpenCV for Tegra, and trying to assess how much effort is required to code some simple CV application accelerated by CUDA. The public OpenCV is also tested using the same examples, to determine if it is a viable alternative to the closed-source version from NVIDIA.

Hardware

The hardware employed in this article consists of the Apalis TK1 System on Module and the Apalis Evaluation Board. The main features of the Apalis TK1 have been presented in the introduction, and regarding the Apalis Evaluation Board, we will use the DVI output to connect to a display and the USB ports to interface a USB camera and a keyboard. The Apalis TK1 is presented in figure 1 and the Apalis Evaluation Board in figure 2:

Figure 1 – Apalis TK1

Figure 2 – Apalis Evaluation Board

System Setup

NVIDIA already provides an SDK package – the NVIDIA JetPack – that comes with all tools that are supported for the TK1 architecture. It is an easy way to start developing applications with OpenCV for Tegra support. JetPack also provides many source code samples for CUDA, VisionWorks, and GameWorks. It also installs the NVIDIA Nsight, an IDE that is based on Eclipse and can be useful for debugging CPU and GPU applications.

OpenCV for Tegra is based on version 2.4.13 of the public OpenCV source code. It is closed-source but free to use and benefits from NEON and multicore optimizations that are not present in the open-source version; on the other hand, the non-free libraries are not included. If you want or need the open-source version, you can find more information on how to build OpenCV with CUDA support here – these instructions were followed and the public OpenCV 2.4.13 was also tested during this article’s development.

Toradex provides an article on its developer website with concise information describing how to install JetPack on the Apalis TK1.

Regarding hardware, it is recommended that you have a USB webcam connected to the Apalis Evaluation Board, because the samples tested in this article often need a video source as input.

OpenCV for Tegra

After you have finished installing the NVIDIA JetPack, OpenCV for Tegra will already be installed on the system, as well as the toolchain required for compilation on the target. You must have access to the serial terminal by means of a USB to RS-232 adapter, or an SSH connection.

If you want to run Python code, an additional step on the target is required:
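A hedged guess at this step, assuming Debian-style packaging on the target, is installing the NumPy dependency used by the Python bindings:

    sudo apt-get install python-numpy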

The easiest way to check that everything works as expected is to compile and run some samples from the public OpenCV repository, since it already has the CMake configuration files, as well as some source code for applications that make use of CUDA:
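A sketch of those steps – assuming OpenCV 2.4.13 is checked out from the public repository, and the samples are enabled at configure time:

    git clone https://github.com/opencv/opencv.git
    cd opencv
    git checkout 2.4.13
    mkdir build && cd build
    cmake -DBUILD_EXAMPLES=ON -DWITH_CUDA=ON ..
    make -j4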

We can begin testing a Python sample, for instance, the edge detector. The running application is displayed in figure 3.
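Assuming the repository layout above, the sample can be launched from the opencv directory with something like:

    python samples/python2/edge.py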

Figure 3 – running Python edge detector sample

After the samples are compiled, you can try some of them. A nice one to try is the "background/foreground segmentation" sample, since it is available both with and without GPU support. You can run the two versions with the commands below, and see the results in figures 4 and 5.
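From the build directory, the commands should look roughly like this – the exact executable names may differ depending on the OpenCV version and build options:

    ./bin/cpp-example-bgfg_segm
    ./bin/gpu-example-bgfg_segm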

Figure 4 – running bgfg_segm CPU sample

Figure 5 – running bgfg_segm GPU sample

By running both samples it is possible to subjectively notice the performance difference. The CPU version has more delay.

Playing Around

After having things set up, the question comes: how easy is it to port some application from CPU to GPU, or even start developing with GPU support? It was decided to play around a little with the Sobel application that is well described in the Sobel Derivatives tutorial.

The purpose is to check if it’s possible to benefit from CUDA out-of-the-box, therefore only the function getTickCount from OpenCV is employed to measure the execution time of the main loop of the Sobel implementations. You can use the NVIDIA Nsight for advanced remote debugging and profiling.

The Code

The first code is run completely on the CPU, while in the first attempt to port to the GPU (the second code, which will be called CPU-GPU), the goal is to find functions analogous to the CPU ones, but with GPU optimization. In the last attempt to port, some improvements are made, such as creating filter engines, which reduces buffer allocation, and finding a way to replace the CPU function convertScaleAbs with GPU accelerated functions.

A diagram describing the loop for the three examples is provided in figure 6.

Figure 6 – CPU / CPU-GPU / GPU main loop for Sobel implementations

The main loop for the three applications tested is presented below. You can find the full source code for them on GitHub. A simplified sketch of the kind of CPU-to-GPU mapping involved is shown after the list:

  • CPU only code:
  • CPU-GPU code:
  • GPU code:
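This is a minimal, hedged sketch of what such a port looks like with the OpenCV 2.4 gpu module – it is not the author's actual code (see the GitHub repository for that), and the exact depth/type support of some GPU filters may require adjustments:

    #include <opencv2/opencv.hpp>
    #include <opencv2/gpu/gpu.hpp>

    // CPU version: grayscale -> blur -> Sobel x/y -> abs -> blend
    void sobelCpu(const cv::Mat& src, cv::Mat& dst) {
        cv::Mat gray, blurred, gx, gy, agx, agy;
        cv::cvtColor(src, gray, CV_BGR2GRAY);
        cv::GaussianBlur(gray, blurred, cv::Size(3, 3), 0);
        cv::Sobel(blurred, gx, CV_16S, 1, 0);
        cv::Sobel(blurred, gy, CV_16S, 0, 1);
        cv::convertScaleAbs(gx, agx);                 // no direct gpu:: equivalent
        cv::convertScaleAbs(gy, agy);
        cv::addWeighted(agx, 0.5, agy, 0.5, 0, dst);
    }

    // GPU version: the same pipeline with cv::gpu:: analogues; note the explicit
    // upload/download steps, which are part of the transfer cost discussed below
    void sobelGpu(const cv::Mat& src, cv::Mat& dst) {
        cv::gpu::GpuMat d_src(src), d_gray, d_blurred, d_gx, d_gy, d_agx, d_agy, d_dst;
        cv::gpu::cvtColor(d_src, d_gray, CV_BGR2GRAY);
        cv::gpu::GaussianBlur(d_gray, d_blurred, cv::Size(3, 3), 0);
        cv::gpu::Sobel(d_blurred, d_gx, CV_16S, 1, 0);
        cv::gpu::Sobel(d_blurred, d_gy, CV_16S, 0, 1);
        cv::gpu::abs(d_gx, d_gx);                     // convertScaleAbs replaced by...
        cv::gpu::abs(d_gy, d_gy);
        d_gx.convertTo(d_agx, CV_8U);                 // ...abs() plus convertTo()
        d_gy.convertTo(d_agy, CV_8U);
        cv::gpu::addWeighted(d_agx, 0.5, d_agy, 0.5, 0, d_dst);
        d_dst.download(dst);                          // copy the result back to the host
    }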

The Tests

  • Each of the three examples is executed using a random picture in JPEG format as input.
  • The input picture dimensions in pixels that were tested are: 3483×2642, 2122×1415, 845×450, and 460×290 – therefore there are 12 runs in total.
  • The main loop is iterated 500 times for each run.
  • All of the steps described in figure 6 have their execution time measured.
  • The numbers presented in the results are the average values of the 500 iterations for each run.

The Results

The results presented are the total time required to execute the main loop – with and without image capture and display time, available in tables 1 and 2 – and the time each task takes to be executed, which is described in figures 7, 8, 9 and 10. If you want to have a look at the raw data or reproduce the tests, everything is in the aforelinked GitHub repository.

Table 1 – Main loop execution time, in milliseconds

Table 2 – Main loop execution time, discarding read and display image times, in milliseconds

Figure 7 – execution time by task – larger image (3483×2642 pixels)

Figure 8 – execution time by task – large image (2122×1415 pixels)

Figure 9 – execution time by task – small image (845×450 pixels)

Figure 10 – execution time by task – smaller image (460×290 pixels)

The Analysis

Regarding OpenCV for Tegra in comparison to the public OpenCV, the results point out that OpenCV for Tegra has been optimized, mostly in some CPU functions. Even when discarding the image read – which takes a long time to execute, and shows approximately a 2x gain – and display frame execution times, OpenCV for Tegra still bests the open-source version.

When considering only OpenCV for Tegra, from the tables it is possible to see that using GPU functions without care might even make the performance worse than using only the CPU. It is also possible to notice that, for these specific implementations, the GPU is better for large images, while the CPU is best for small images. Where there is a tie, it would be nice to have a power consumption comparison, which hasn't been done here, and to keep in mind that this GPU code is not optimized as well as it could be.

Looking at figures 7 to 10, it can be seen that the Gaussian blur and the scale conversion from 16 bits to 8 bits got a big boost when running on the GPU, while the conversion of the original image to grayscale and the Sobel derivatives had their performance degraded. Another point of interest is the fact that transferring data from/to the GPU has a high cost, and this is, in part, one of the reasons why the first GPU port was unsuccessful – it had more copies than needed.

Regarding image size, it can be noticed that the image read and display have an impact on overall performance that might be relevant, depending on the complexity of the algorithm being implemented, or how the image capture is being done.

There are probably many ways to make this code more optimized, be it by only using OpenCV; by combining custom CUDA functions with OpenCV; by writing the application fully in CUDA; or by using another framework or tool such as VisionWorks.

Two points that might be of interest regarding optimization still in OpenCV are the use of streams – asynchronous execution of code on the CPU/GPU – and zero-copy or shared memory, since the Tegra K1 has CPU and GPU shared memory supported by CUDA (see this NVIDIA presentation from GPU Technology Conference and this NVIDIA blog post for reference).

Conclusion

In this article, the installation of the NVIDIA JetPack SDK and deployment on the Toradex Apalis TK1 have been presented. Having this tool installed, you are able to use OpenCV for Tegra, thus benefiting from all of the optimizations provided by NVIDIA. The JetPack SDK also provides many other useful resources, such as CUDA, VisionWorks and GameWorks samples, and the NVIDIA Nsight IDE.

In order to assess how easy it is for a developer freshly introduced to the CV and GPU concepts to take advantage of CUDA, purely using OpenCV optimized functions, a CPU to GPU port of a Sobel filter application was written and tested. From this experience, some interesting results were found, such as the fact that the GPU indeed improves performance, and that the magnitude of this improvement depends on a series of factors: the size of the input image, the quality of the implementation (or developer experience), the algorithms being used, and the complexity of the application.

Having a myriad of sample source code available, it is easy to start developing your own applications, although care is required in order to make the Apalis TK1 System on Module yield its best performance. You can find more development information in the NVIDIA documentation, as well as the OpenCV documentation. Toradex also provides documentation about Linux usage on its developer website, and has a community forum. Hope this information was helpful, see you next time!

Using GPIOs on NanoPi NEO 2 Board with BakeBit Starter Kit

May 21st, 2017 10 comments

NanoPi NEO 2 is a tiny 64-bit ARM development board powered by an Allwinner H5 processor. FriendlyELEC sent me a couple of NEO 2 samples together with their BakeBit Starter Kit, with a NanoHat and various modules connecting via GPIOs, analog inputs, or I2C. I've already tested both Armbian with Linux 4.11 and Ubuntu Core Qt with Linux 3.10, and ran a few benchmarks on NanoPi NEO 2. You would normally prefer to use the Armbian image with mainline Linux, since it provides better performance, but at the time I was told GPIO support was not there yet.

Configuring NanoPi NEO 2 board with BakeBit library

So this weekend, when I wanted to test GPIO support and the BakeBit Starter Kit, I decided to follow this advice, especially since the nanopi-neo2-ubuntu-core-qte-sd4g-20170329.img.zip image is still the recommended one in the Wiki. So I went with that image.

I'll use the Python examples from the BakeBit library, but if you prefer something similar to WiringPi, you may consider using the WiringNP library directly instead of BakeBit. Since NanoHat Hub comes with headers for digital I/O (including 2 PWM), analog input, I2C, and UART interfaces, I'll try samples for all the interfaces I have hardware for. FriendlyELEC did not include a module with a UART interface, so I'll skip that one.

I followed the instructions in the BakeBit wiki from a terminal, which you can access from the serial console or SSH. First, we need to retrieve the source code:
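Assuming FriendlyELEC's GitHub repository:

    git clone https://github.com/friendlyarm/BakeBit.git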

Then we can start the installation:
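The script location below is my assumption based on the BakeBit sources:

    cd BakeBit/Script
    sudo ./install.sh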

The last line will install the following dependencies:

  • python2.7 – Python 2.7
  • python-pip – alternative Python package installer
  • git – fast, scalable, distributed revision control system
  • libi2c-dev – userspace I2C programming library development files
  • python-serial – pyserial, a module encapsulating access to the serial port
  • i2c-tools – a set of I2C tools for Linux
  • python-smbus – Python bindings for Linux SMBus access through i2c-dev
  • minicom – friendly menu driven serial communication program
  • psutil – a cross-platform process and system utilities module for Python
  • WiringNP – a GPIO access library for NanoPi NEO

This will take a while, and after it’s done, the board will automatically reboot.

We can check if everything is properly running by trying out one of the Python scripts, but the script failed due to a missing smbus Python module.

Hmm, python-smbus was supposed to be installed via the installation script. Let's try to install it manually:
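The obvious first attempt:

    sudo pip install python-smbus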

Running the command again with the verbose option showed that the download URL is not valid.

So I went to https://pypi.python.org/simple/ looking for another python-smbus library in case the name had changed, and I finally installed pysmbus instead:
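    sudo pip install pysmbus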

That let me go a bit further, but then the I2C bus was not detected:
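Checking with i2cdetect from the i2c-tools package (the bus number is a guess):

    sudo i2cdetect -y 0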

So maybe the driver needed to be loaded. But running sudo modprobe i2c_sunxi did nothing, and I noticed the .ko file was missing from the image…

So let's try to build the kernel source code for the board following the Wiki instructions.

We also need to install required build packages…

… download gcc-linaro-aarch64.tar.xz toolchain, and copy it to lichee/brandy/toolchain directory (do not extract it, it will be done by the build script).

Now we can try to build the kernel for NanoPi NEO 2 (and other Allwinner H5 boards).

It failed with more errors, possibly related to the CROSS_COMPILE flag. There must be a better solution… FriendlyELEC guys might not work on Saturday afternoons, so while I did contact them, I also decided to try one of their more recent images with Linux 4.11, available here.

Let's pick nanopi-neo2_ubuntu-core-xenial_4.11.0_20170518.img.zip, since it has a similar name, and is much newer (released 3 days ago). I repeated the installation procedure above, and…

Success! Albeit after 4 to 5 hours of work… Let's connect the hardware to find out whether it actually works, and not just runs.

Analog Input and Digital Output – Sound Sensor Demo

The simplest demo would be to use the LED module, but let's do something more fun with the Sound Sensor demo I found in the BakeBit Starter Kit printed user's manual, which will allow us to use both digital output, with the LED module connected to the D5 header, and analog input, with the sound sensor module connected to the A0 header. Just remember the long LED pin is the positive one.

You can run the code as follows:
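The script name below is hypothetical – use whichever sound sensor sample the manual points at in BakeBit/Software/Python:

    cd BakeBit/Software/Python
    sudo python bakebit_sound_sensor_led.py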

I changed the source a bit, including the detection threshold and timing, to make it more responsive:
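The gist of the loop looks like this – a hedged sketch rather than the exact modified sample, assuming BakeBit's GrovePi-style Python API and the wiring above; the raw threshold value is an assumption too:

    import time
    import bakebit

    sound_sensor = 0          # sound sensor on A0
    led = 5                   # LED on D5
    threshold = 380           # raw ADC value around the 1.46V point (assumption)

    bakebit.pinMode(sound_sensor, "INPUT")
    bakebit.pinMode(led, "OUTPUT")

    while True:
        sound_level = bakebit.analogRead(sound_sensor)
        bakebit.digitalWrite(led, 1 if sound_level > threshold else 0)
        time.sleep(0.05)      # a shorter delay makes it more responsive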

The LED will turn on each time the sound level (actually the analog voltage) is above 1.46V.

PWM and Analog Input – Servo and Rotary Angle Sensor Demo

We can test PWM output using the servo module connected to the D5 header, and control it using the rotary angle sensor module connected to the A0 analog input header.


The sample for the demo runs fine, and use of the potentiometer is detected.

However, the servo was not moving at all. Raspberry Pi boards rely on raspi-config to enable things like I2C and other I/Os, and I noticed npi-config in the Wiki for NEO 2. So I ran it, and sure enough PWM was disabled.

So I enabled it, and answered Yes when I was asked to reboot. The only problem is that it would not boot anymore, with the system blocked at:
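    Starting kernel ...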

So maybe something went wrong during the process; I re-flashed the Ubuntu image, reinstalled BakeBit, and re-enabled PWM0. But before rebooting, I checked the boot directory, and noticed boot.cmd, boot.scr, and the device tree file (sun50i-h5-nanopi-neo2.dtb) had been modified. The DTB looked fine, as I could decode it, and find the pwm section:
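Decoding it with dtc should look something like this (the file location in /boot is an assumption):

    dtc -I dtb -O dts /boot/sun50i-h5-nanopi-neo2.dtb | grep -A 10 pwm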

Let's reboot the board. Exact same problem, with the boot stuck at "Starting kernel…". So there's something wrong with the way npi-config modifies one or more of those files. With hindsight, I should have made a backup of those three files before enabling PWM the second time… I'll give up on PWM for now, and ask FriendlyELEC to look into it.

I2C and Analog Input – OLED UI controlled with Joystick

For the final test, I'll use the I2C OLED display module connected to one of the I2C headers, together with the analog joystick module connected to the A0 header.


Let's run the sample for the demo.

It works, but there’s a bit of a lag, and the sample may have to be improved to better detect various states. I’ll show what I mean in the video below.

The bad parts are that the documentation is not up-to-date, enabling PWM will prevent the image from booting, and while the Python samples do demonstrate the I/O capabilities, they should probably be improved to be more responsive. The good parts are that we're getting there, the hardware kit is really nice, and I think the documentation and software should become much better in June, as FriendlyELEC has shown to be responsive to community issues.

What? Python sucks? You can use C language with GPIOs too

If Python is not your favorite language, FriendlyELEC also provides some C language samples in the C directory.

As we've seen above, the BakeBit library appears to rely on WiringNP, and you'd normally be able to list the GPIOs as follows:
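    gpio readall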

The utility was not too happy about seeing an Allwinner H5 board. But maybe the library on the board was not up-to-date, so I built it from source:
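Assuming FriendlyELEC's WiringNP repository and its usual build script:

    git clone https://github.com/friendlyarm/WiringNP.git
    cd WiringNP
    ./build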

and ran the gpio utility again.

Excellent! It's not quite a work-out-of-the-box experience, but NanoPi NEO 2 can be used with (most) GPIOs.

My adventures with NanoPi NEO 2 board are not quite done, as I still have to play with NanoHat PCM5102A audio add-on board, which I may end up combining with a USB microphone to play with Google Assistant SDK, and I’m expecting NanoPi NAS Kit v1.2 shortly. I’ll also update this post once PWM is working.

Getting Started with ESP32-Bit Module and ESP32-T Development Board using Arduino core for ESP32

May 7th, 2017 16 comments

Espressif ESP32 may have launched last year, but prices have only dropped to attractive levels very recently, and Espressif has recently released the ESP-IDF 2.0 SDK with various improvements, so the platform has become much more interesting than it was just a few weeks ago. ICStation also sent me the ESP32-T development board with the ESP32-Bit module, so I'll first see what I got, before trying out Arduino core for ESP32 on the board.

ESP32-T Development Board with ESP32-Bit Module – Unboxing & Soldering

One thing I missed when I asked for the board is that it was not soldered, and it comes as a kit, with the ESP32-Bit module in one package, and the ESP32-T breakout board with headers in another.


The 21.5x15mm module is based on the ESP32-D0WDQ6 processor with 32 Mbit (4MB) of flash, a chip antenna, and a u.FL connector.


The module is apparently made by eBox, and is also used in the Widora board, with all information (allegedly) available on the eboxmaker.com website, but more on that later.


The ESP32-T breakout board comes with a micro USB port for power and programming/debugging via a Silabs CP2102 USB to TTL bridge, a power LED, a user LED (LED1), a reset button, and a user button named "KEY". It has two rows of 19-pin headers, and a footprint for the ESP32-Bit module.


The back of the board has a footprint for ESP-32S and ESP-WROOM-32 modules, which gives the board some more flexibility, as you could try it with various ESP32 modules.

Time to solder the kit. I placed ESP32-Bit on ESP32-T, and kept it in place with some black tape to solder three to four pins on each side first.


I then removed the tape, completed soldering the module, and added the headers.


The final step is to cut the excess pins on the headers, and now we can test the board, which I could insert into a breadboard after pushing it in with some tools…

I connected a micro USB to USB cable between the board and my computer, and quickly I could see the PWR LED lit solid green, and LED1 blinking.

I could also see a new ESSID on my network – ESP32_eBox – and I could just input the… wait, what is the password? No idea. So I went to the board's website, where everything is in Chinese, with very limited hardware and software information on the ESP32 page. So it was basically useless: I did not find the password, and neither did other people. I asked ICStation, who provided the sample, but they were unable to provide an answer before the review.

I could also check the serial output via /dev/ttyUSB0 (115200 8N1) in Ubuntu 16.04.

Arduino core for ESP32 on ESP32-T (and Other ESP32 Boards)

But nothing really useful showed up. Since the website mentions Arduino, I just decided to go with Arduino core for ESP32 released by Espressif, whose repository explains how to use it with the Arduino or PlatformIO IDEs. I opted for the Arduino IDE. The first thing to do is download and install the latest Arduino IDE.

I’m running Ubuntu on my computer, so I downloaded and installed the Linux 64-bit version:
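At the time, that meant something like this (the version number will of course change over time):

    wget https://downloads.arduino.cc/arduino-1.8.2-linux64.tar.xz
    tar xf arduino-1.8.2-linux64.tar.xz
    cd arduino-1.8.2
    sudo ./install.sh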

The next commands install the Arduino ESP32 support and dependencies:
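These are roughly the steps from the arduino-esp32 documentation of that period – paths and dependencies are given as assumptions, since they have changed over time:

    sudo usermod -a -G dialout $USER
    sudo apt-get install git python-serial
    mkdir -p ~/Arduino/hardware/espressif
    cd ~/Arduino/hardware/espressif
    git clone https://github.com/espressif/arduino-esp32.git esp32
    cd esp32/tools
    python get.py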

We can now launch the Arduino IDE.

There are several ESP32 boards to choose from, but nothing about ESP32-T, ESP32-Bit, or Widora. However, I noticed the board's pinout looks exactly the same as the ESP32Dev board shown below.


So I selected ESP32 Dev Module, set the port to /dev/ttyUSB0, and the upload speed to 115200.


The next step is to find an easy example to check if everything works, and there are a bunch of those in File->Examples, in the Examples for ESP32 Dev Module section.


I selected the GetChipID sample, as it just retrieves the Chip ID from the board, and as we'll see later, the Chip ID is actually the MAC address. I could upload the code, and it indeed returned the Chip ID.

The next sample I tried – WiFi->SimpleWiFiServer – will allow you to test both WiFi connectivity and GPIOs. I modified the sketch to use pin 2 instead of pin 5, in order to control LED1 on the board, which is connected to GPIO2. You'll also need to set the SSID and password of your WiFi network. Once you've compiled and uploaded the sketch to the board, you'll need to find the board's IP address. You can do so in your router's DHCP list, with the board named "espressif" by default; the MAC address will be the same as the Chip ID, 24-0A-C4-01-A4-24 in my case. Now you can open the web interface in a web browser to turn LED1, the green LED on the board, on and off.

You could also directly use http://IP_ADDRESS/H or http://IP_ADDRESS/L to pull the pin high or low. It worked beautifully, but so far, we have not done anything that could not be done on the much cheaper ESP8266 boards, and I can see one Bluetooth LE code sample for ESP32 called simpleBLEDevice in the Arduino IDE, so let's try it. It will just advertise the name of the device, and change it on button presses, which could be used to broadcast messages to a BLE gateway.

That’s the output from the serial terminal.

The initial name is ESP32 SimpleBLE, and as I press the KEY button on the board, the name will change to “BLE32 at: xxx”. I could detect a Bluetooth ESP32 device with the various names with my Android smartphone.

Since it's just advertising the name, there's no pairing, but that's a start. To get more insights into Bluetooth, you may also want to check out the WiFiBlueToothSwitch.ino sample, which shows how to use various modes such as Bluetooth only, Bluetooth + WiFi, WiFi STA, etc… For a more practical use of Bluetooth on ESP32, the Experiments with Bluetooth and IBM Watson article may be worth a read. But a faster dual core processor and Bluetooth support are not the only extra features of ESP32 compared to ESP8266, as you also get more GPIOs, hardware PWM, a better ADC, a touch interface, a CAN bus, Ethernet, etc…, so there's more to explore, although I'm not sure all features are fully supported in the ESP-IDF SDK and Arduino yet.

Final Words about ESP32-T and ESP32-Bit

After some initial difficulties and confusion, I managed to make the ESP32-T development kit work, but it's difficult to recommend it. First, documentation is really poor right now, and while I found out you can use the exact same instructions as for the ESP32Dev board, it does not reflect well on the company. Second, the board is sold as a kit that needs to be soldered, which may be a hassle for many, and possibly a fun learning experience for a few. Finally, ESP32-T + ESP32-Bit sells for $15 to $20 on various websites, which compares poorly to competing fully assembled development boards – such as Wemos LoLin32 – now going for less than $10 shipped, with basically the same feature set (ESP32 + 4MB flash), minus the user LED and button, and the u.FL connector for an external antenna.

I'd still like to thank ICStation for giving me the opportunity to test the board. They are now selling it for $14.99 shipped, with a 15% extra discount possible with the Jeanics coupon (for single orders). You'll also find the ESP32-T board on Aliexpress, but pay close attention if you are going to buy there, as it may be sold without the ESP32-Bit module. Usually, all prices well below $10 are without the module.

Karl’s Home Automation Project – Part 4: MQTT Bridge Updated to Use YS-IRTM IR Receiver & Transmitter with NodeMCU

April 20th, 2017 1 comment

In a previous article, I wrote about an MQTT bridge by 1technophile. I added a DHT temperature and humidity sensor, as well as a light sensor. Previously, the bridge included a software decoder to decode the IR signal. I never did test the IR transmitter on the gateway, as I didn't have the parts. But thanks to IC Station, who sent me a small YS-IRTM hardware-based decoder and a NodeMCU, I can write about it today. I have replaced the software-based version with the YS-IRTM module in the latest update.


I found this project challenging. I admit I am a little weak in my programming skills. It was difficult to find documentation, but I found a forum talking about this device and the basics of how it works. When an IR code is recognized, it sends 3 hex codes via the serial connection on the transmit pin. To transmit, it expects 5 hex codes: A1,F1,xx,xx,xx. A1,F1 tells it to send the following 3 codes. You can also set the baud rate, but I left the default 9600.

It is simple wiring-wise: it only takes 4 Dupont wires. It took a bit of coding to get it working, but I finally got it to communicate via software serial. I started with the code on an Arduino Uno, and then migrated it over to the ESP8266 board. I did have a little trouble when I first moved to the ESP board. I initially thought I might need a level shifter, but that didn't help. I am a little surprised I didn't need one, as the ESP needs only 3.3 volts. I was getting some weird responses, and finally figured out I had to put in a slight delay. Maybe the ESP's speed comes into play.

The way to use this is to fill out your SSID and password, and your MQTT server details with credentials, then flash the device. You will need to add the necessary libraries; 1technophile has good documentation in his wiki.

Once flashed and ready to find your IR codes, you will need to subscribe to the topic with the command below (shown for Windows, although the Mosquitto clients work the same on Linux). Give the gateway a moment to connect, then point your IR remote at the sensor and press a button to find out its code.
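A hedged example using the Mosquitto client tools, assuming a broker at 192.168.1.100 (substitute your MQTT server):

    mosquitto_sub -h 192.168.1.100 -t home/sensors/ir -v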

In your window, you will get something like "home/sensors/ir 4,fb,8,", which is the power button for my TV. To test the code:
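Publishing the same code back to the gateway should replay it through the IR transmitter. The command topic below is an assumption – check the gateway sketch for the exact name:

    mosquitto_pub -h 192.168.1.100 -t home/commands/ir -m "4,fb,8,"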

With this code, the TV will toggle on and off.


After this, you can use your favorite home automation software to control your IR devices with automations. You can omit any sensors that you don't need, but you will get some erroneous MQTT data if not all sensors are used. Below are the bits of Arduino code added for the IR module, and here's the link to the GitHub code:
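This is a hedged sketch of the send/receive logic only – not Karl's exact code (see the GitHub link above for that), and the SoftwareSerial pins are assumptions:

    #include <SoftwareSerial.h>

    // YS-IRTM wired to two spare NodeMCU pins (assumed here: D5=RX, D6=TX)
    SoftwareSerial irSerial(D5, D6);

    void setup() {
      Serial.begin(115200);
      irSerial.begin(9600);              // YS-IRTM default baud rate
    }

    // A1 F1 tells the module to transmit the following 3 bytes as an IR code
    void sendIR(byte c1, byte c2, byte c3) {
      byte frame[5] = {0xA1, 0xF1, c1, c2, c3};
      irSerial.write(frame, sizeof(frame));
      delay(100);                        // the ESP8266 seemed to need a slight delay
    }

    void loop() {
      // The module sends 3 bytes on its TX pin when it recognizes an IR code
      if (irSerial.available() >= 3) {
        byte code[3];
        irSerial.readBytes((char*)code, 3);
        Serial.printf("IR received: %x,%x,%x\n", code[0], code[1], code[2]);
      }
    }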

I plan on 3D printing an enclosure with the CR-10 I am reviewing, and I will remove the IR LED, and move it to a more suitable position, as having the receiver and transmitter both facing the same way isn't ideal for my setup.

I would like to thank IC Station for sending the NodeMCU ($5.81 shipped) and the IR transmitter and receiver ($3.39 shipped) for review. You can get a 15% discount with the coupon Karics. I finally have a complete gateway.

ESP8266, Mongoose OS & Grove Sensors – An Alternative Solution for Hackathons

April 12th, 2017 5 comments

CNXSoft: This is a guest post by Cesanta

If you walked into any hardware hackathon over the last year, you would see they are about innovation and bringing new ideas to this world, and most of them are centered around connected devices nowadays. However, just walk the floor and talk to the teams, and you can quickly see an elephant in the room: the hackathons are about connected devices, but with the 'recommended' and frequently sponsored hardware distributed to the teams, such as Intel Galileo, Raspberry Pi, etc…, developers may struggle for a long time to even connect them to the cloud!

Not to mention that innovation is usually hindered by a tedious environment setup which takes hours, things to learn about the specific hardware, and how it can be programmed using low level languages. Many teams spend most of their time fighting those issues, and oftentimes still do not have their prototype ready and connected by the end of the hackathon.

This situation can be improved by using ESP8266 boards with Mongoose OS and SeeedStudio Grove Sensors. The solution brings the following benefits:

  1. Low price:
    • An ESP8266 development board is $4-15, depending on the board;
    • Seeed Studio sensors are priced $3 to $15 each, but you can also save by purchasing them as part of the Grove Starter Kit for $39.
  2. The solution is solderless & plug and play – so anyone can actually use it fast.
  3. With Mongoose OS, the firmware logic can be coded within a few minutes using JavaScript code.
  4. The data can be pushed to any cloud or public MQTT server, such as Mosquitto, HiveMQ, AWS IoT, etc…

Let's jump into the action, and get the ESP8266 & Seeed light sensor up and running with Mongoose OS in a few minutes. The example below shows how to get the hardware (sensor) data and send it to the cloud.

  1. Get your ESP8266 (e.g. NodeMCU) and Seeedstudio light sensor and button ready.
  2. Download and install the mOS tool for Mongoose OS. It works on Linux, Mac OS X, or Windows.
  3. Connect the hardware:
    • Power the Grove base shield: connect its GND and VCC pins to the NodeMCU GND and VCC pins
    • Connect the light sensor to slot 7 on the Grove base shield
    • Connect slot 7 to the ADC pin on the NodeMCU board
    • Connect the NodeMCU board to your computer
  4. Program the board to retrieve the light sensor data and send it to the cloud (HiveMQ in this example) – a sketch of a possible init.js follows this list
    • Start the mos tool, switch to the prototyping mode, and edit the init.js file
    • Click "Save and reboot device"
  5. Go to http://www.hivemq.com/demos/websocket-client/, connect, and subscribe to the topic "my/topic"
  6. Press the button and see how the light sensor reading is sent to the MQTT server
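Here is a hedged sketch of what that init.js could look like, using mJS API names from Mongoose OS examples of that era; the button GPIO is an assumption:

    load('api_gpio.js');
    load('api_adc.js');
    load('api_mqtt.js');

    let adcPin = 0;       // NodeMCU ADC pin wired to the light sensor
    let buttonPin = 0;    // GPIO0, i.e. the flash/Grove button (assumption)

    ADC.enable(adcPin);

    // On each button press, read the light sensor and publish the value to MQTT
    GPIO.set_button_handler(buttonPin, GPIO.PULL_UP, GPIO.INT_EDGE_NEG, 200, function() {
      let reading = ADC.read(adcPin);
      let ok = MQTT.pub('my/topic', JSON.stringify(reading), 1);
      print('Published:', ok, 'reading:', reading);
    }, null);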

Light Sensor Data Shown on HiveMQ Dashboard

Now you can see how easy it was! Want to play with other Seeedstudio sensors from the Grove kits? Check out the video tutorials for the button, motion sensor, moisture sensor, UV sensor, relay, buzzer, etc…, including the one below with the light sensor.

Transform Your ESP8266 Board into a USB to Serial Board Easily with Arduino Serial Bypass Sketch

April 7th, 2017 6 comments

USB to serial boards are necessary to program and debug boards, and/or access the serial console, and while they are very cheap, you may be in a situation where you don't have any around, but you do have some Arduino compatible boards. It's been possible to transform an Arduino board into a USB to TTL debug board for several years using the ArduinoSerialBypass.ino sketch, but I've been informed this also works on ESP8266 boards such as Wemos D1 Mini.

The sketch could not be simpler:
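    // Set pins 0 (RX) and 1 (TX) as inputs so the MCU does not drive the
    // serial lines, letting the USB to serial chip talk directly to the
    // external device (as explained below)
    void setup() {
      pinMode(0, INPUT);
      pinMode(1, INPUT);
    }

    void loop() {
      // Nothing to do here
    }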

The code simply makes sure that Tx and Rx pins are set as inputs in order not to disturb the serial connection as explained below:

This code makes the Arduino not interfere with pins 0 and 1, which are connected to RX and TX on the FTDI chip. This allows the data coming from the FTDI USB to serial chip to flow directly to another device. Since RX and TX are labeled from the Arduino's point of view, don't cross the wires, but plug the device's RX wire into the RX pin 0, and the TX wire into the TX pin 1.

This should work with any Arduino compatible board with a USB to serial chip, and it's nice that it has been confirmed to work on Wemos D1 mini. If you'd rather have a WiFi to serial bridge, that's what the esp-link firmware is for.

Thanks to Zoobab for the tip.

Categories: Espressif, Hardware Tags: arduino, esp8266, how-to