NXP i.MX RT106F & RT106A/L Cortex-M7 Processors Target Offline Face Recognition & Smart Audio Applications

NXP i.MX RT crossover processors combine real-time capabilities of microcontrollers with the performance of application processors thanks to an Arm Cortex-M7 core clocked at 528 MHz and more.

The performance is indeed impressive as shown by Teensy 4.0 benchmarks, but so far NXP i.MX RT processors have targeted general-purpose applications. The company has now introduced three new crossover processors designed for AI applications. NXP i.MX RT106F is designed for offline face recognition and expression identification, while RT106L and RT106A are made for local and cloud-based embedded voice applications.

NXP i.MX RT106F Processor

Highlights of the processor:

CPU – Arm Cortex-M7 @ 600 MHz (3020 CoreMark/1284 DMIPS)
Memory – 1 MB On-Chip SRAM plus up to 512 KB configurable as Tightly Coupled Memory (TCM)
External memory interface options – NAND, eMMC, QuadSPI NOR Flash, and Parallel NOR Flash
Real-time, low-latency response as low as 20 ns
Industry’s lowest dynamic power with an integrated DC-DC converter
Low-power run modes at 24 MHz
Advanced multimedia for GUI and enhanced HMI
- 2D graphics acceleration engine
- Parallel camera sensor interface
- LCD display controller (up to WXGA 1366×768)
- 3x I2S for high-performance, multi-channel audio
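
The headline figures above are easier to compare with other Cortex-M parts once they are normalized to the clock speed. The short Python sketch below is only back-of-the-envelope arithmetic on the numbers quoted in the list; the constant and function names are my own, not NXP's. It also converts the 20 ns response time into core clock cycles at 600 MHz.

```python
# Figures copied from the highlights list above.
CLOCK_MHZ = 600
COREMARK = 3020
DMIPS = 1284
LATENCY_NS = 20


def per_mhz(score, clock_mhz):
    """Normalize a benchmark score to the core clock."""
    return score / clock_mhz


def ns_to_cycles(latency_ns, clock_mhz):
    """Convert a duration in ns to clock cycles (ns * MHz = 1e-3 cycles)."""
    return latency_ns * clock_mhz / 1000


coremark_per_mhz = per_mhz(COREMARK, CLOCK_MHZ)
dmips_per_mhz = per_mhz(DMIPS, CLOCK_MHZ)
latency_cycles = ns_to_cycles(LATENCY_NS, CLOCK_MHZ)

print(f"CoreMark/MHz: {coremark_per_mhz:.2f}")
print(f"DMIPS/MHz:    {dmips_per_mhz:.2f}")
print(f"20 ns at 600 MHz: {latency_cycles:.0f} cycles")
```

That works out to roughly 5.03 CoreMark/MHz and 2.14 DMIPS/MHz, and 20 ns corresponds to 12 clock cycles at 600 MHz. That cycle count matches the interrupt latency Arm quotes for the Cortex-M7, so I assume that is where the 20 ns figure comes from.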

NXP provides FreeRTOS for the microcontroller/processor, and software development can be performed with MCUXpresso SDK, IDE and Config Tools. NXP claims its OASIS face processing engine enables face detection, recognition, and anti-spoofing without cloud connectivity at a much lower price than competing Linux-based solutions. More details about the processor itself can be found on the product page.

Face Recognition Devkit

NXP i.MX RT106F Face Recognition Development Kit

The company is working with OEMs to develop i.MX RT106F development kits for face recognition applications, such as the one pictured above.

The devkit features an ultra-small form-factor, production-ready hardware design running FreeRTOS that allows for a quick out-of-the-box implementation. It can perform face detection, face tracking, face alignment, and face recognition without Wi-Fi and cloud connectivity in order to address potential privacy concerns.

Devkit specifications:

MCU – NXP i.MX RT106F Crossover Processor with Arm Cortex-M7 core @ 600 MHz, 32 kB I-cache, 32 kB D-cache, FPU, 1 MB on-chip SRAM
System Memory – 32 MB SDRAM
Storage – 32 MB Hyperflash
ADC/DAC Conversion – 2x ADC (20-ch), 2 x ACMP
System Control – Secure JTAG, PLL OSC, eDMA, 4x Watch Dog, 6x GP Timer, 4x Quadrature ENC, 4x QuadTimer, 4x FlexPWM, IOMUX
Security
- Hardware – HAB, TRNG, Encrypted XIP out of Flash
- Software – Ciphers & RNG, Secure RTC, Fuse, HAB
Misc – Optional display and keypad, supports RGB and IR, interface for temperature monitoring
Power Supply – 5V USB Type-C port; Low-dropout regulation via DC-to-DC & LDO
Dimensions – 50 x 40 mm
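
The 32 MB of SDRAM in the list above makes sense next to the WXGA maximum quoted for the LCD controller in the processor highlights. The Python sketch below is a rough estimate only: the two pixel formats (2 and 4 bytes per pixel) are my assumptions for illustration, and the helper name is hypothetical. It compares the size of a single full-resolution framebuffer with the 1 MB of on-chip SRAM and with the external SDRAM.

```python
# Figures from the article: 1 MB on-chip SRAM, 32 MB SDRAM, WXGA 1366x768.
ON_CHIP_SRAM_BYTES = 1 * 1024 * 1024
SDRAM_BYTES = 32 * 1024 * 1024
WIDTH, HEIGHT = 1366, 768

# Assumed pixel formats (not stated in the article): bytes per pixel.
PIXEL_FORMATS = {"RGB565": 2, "XRGB8888": 4}


def framebuffer_bytes(width, height, bytes_per_pixel):
    """Size of one uncompressed framebuffer in bytes."""
    return width * height * bytes_per_pixel


for name, bpp in PIXEL_FORMATS.items():
    size = framebuffer_bytes(WIDTH, HEIGHT, bpp)
    print(
        f"{name}: {size} bytes, "
        f"fits in on-chip SRAM: {size <= ON_CHIP_SRAM_BYTES}, "
        f"fits in SDRAM: {size <= SDRAM_BYTES}"
    )
```

Under those assumptions, a single WXGA framebuffer needs about 2 MB at 16 bits per pixel and about 4 MB at 32 bits per pixel, so it would have to live in external memory rather than in on-chip SRAM. Smaller displays would of course need less.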

NXP i.MX RT106F Face Recognition-Devkit Block Diagram — Hardware Block Diagram

The kit will come with a full source stack including the FreeRTOS operating system and drivers, the face recognition inference engine, GUI API, Bluetooth & WiFi connectivity manager, and more.

Some of the applications are somewhat worrying (does my washing machine really need a camera?) but here they are:

Smart appliances – Washing machines, dryers, ovens, refrigerators, stoves, and dishwashers
Home comfort devices – Thermostats, HVAC and lighting control
Countertop appliances – Microwaves, coffee machines, and rice cookers
Safety/Security/Alarm devices – Alarm panels and automated access
Smart industrial devices – Power tools, ergonomic stations, industrial workstations

The processor and development kits will become available in Q1 2020, but if your company has a project that could benefit from the solution, you can request early access on the devkit page.

NXP i.MX RT106A & RT106L

I challenge you to find any differences between the NXP i.MX RT106A block diagram above and the one for RT106F further up, because unless I’m really tired (Friday evening here), there aren’t any.

The highlights for the RT106A and RT106L voice processors are the same, and differ from those of the RT106F only by the addition of wireless connectivity interfaces:

CPU – Arm Cortex-M7 @ 600 MHz (3020 CoreMark/1284 DMIPS)
Memory – 1 MB On-Chip SRAM plus up to 512 KB configurable as Tightly Coupled Memory (TCM)
External memory interface options – NAND, eMMC, QuadSPI NOR Flash, and Parallel NOR Flash
Real-time, low-latency response as low as 20 ns
Industry’s lowest dynamic power with an integrated DC-DC converter
Low-power run modes at 24 MHz
Advanced multimedia for GUI and enhanced HMI
- 2D graphics acceleration engine
- Parallel camera sensor interface
- LCD display controller (up to WXGA 1366×768)
- 3x I2S for high-performance, multi-channel audio
Wireless connectivity interface for WiFi, Bluetooth, Bluetooth Low Energy, ZigBee and Thread

So as I understand it, RT106A, RT106L and RT106F are the same hardware, and only the software/licenses vary.

NXP i.MX RT106L is designed for local (offline) voice control solutions using Snips with the following software features:

Far-field audio front end
- Acoustic echo cancellation (barge-in)
- Ambient noise reduction
- Beamforming
- Playback processing
- Codecs
Automatic speech recognition engine
Media player/streamer
MQTT, lwIP, TLS
All drivers, including Wi-Fi and Bluetooth

NXP i.MX RT106A is licensed to run NXP’s turnkey voice assistant software solutions with the following features:

Far-field audio front end softDSP
- Acoustic echo cancellation
- Ambient noise reduction
- Beamforming
- Barge-in
- Playback processing
- Codecs
Wake-word inference engine
Media player/streamer
MQTT, lwIP, TLS
Discovery and onboarding
All drivers, including Wi-Fi and Bluetooth

You’ll find more details on their respective product pages here and there.

NXP i.MX RT MCU Alexa Voice Service Solution

i.MX RT Voice Solution Board

There’s also an i.MX RT106A-based system reference design for creating products with Alexa built-in called “NXP i.MX RT MCU Alexa Voice Service Solution”.

Specifications:

MCU – NXP i.MX RT106A Crossover Processor with Arm Cortex-M7 core @ 600 MHz, 32 kB I-cache, 32 kB D-cache, FPU, 1MB on-chip SRAM
Storage – 32 MB Hyperflash
Audio – TFA9894 embedded DSP class-D audio amplifier solution associated with speaker protection and boost algorithm; low power: 91% peak efficiency for 600mW sine wave, low battery consumption: <125 mA (Po = 380 mW, average music power)
Wireless Connectivity – WiFi 4 (802.11 b/g/n) and Bluetooth + BLE 4.1
System Control – Secure JTAG, PLL OSC, eDMA, 4x Watch Dog, 6x GP Timer, 4x Quadrature ENC, 4x QuadTimer, 4x FlexPWM, IOMUX
ADC/DAC Conversion – 2x ADC (20-ch), 2 x ACMP
Security
- Hardware – Optional secure element A71CH
- Software – Ciphers & RNG, Secure RTC, Fuse, HAB
Protocols – MQTT, mBedTLS, Alexa for MCU, LWIP
Power Supply – 5V USB Type-C port; Low-dropout regulation via DC-to-DC & LDO
Dimensions – 40 x 30 mm
Qualifications – Amazon AVS Qualified

NXP i.MX RT Voice Development Kit AVS Hardware Block Diagram

The kit also runs Amazon FreeRTOS and supports up to three integrated low-cost MEMS microphones, and two external digital microphone lines. The software architecture is similar to that of the face recognition kit, but obviously, it removes the computer vision part and replaces it with audio stacks such as Opus, MP3, G.711 and WMA audio codecs, the wake-word inference engine, audio DSP, and more. The software also supports “IoT” communication protocols like MQTT and includes the lwIP lightweight TCP/IP stack.

You’ll find more details on the product page, and it will soon be sold on Mouser for $49. It’s an official Amazon AVS development kit and as such is listed on Amazon Developer site together with MediaTek MT8516 devkit we covered earlier today, and other compatible platforms.

Jean-Luc Aufranc (CNXSoft)

Jean-Luc started CNX Software in 2010 as a part-time endeavor, before quitting his job as a software engineering manager, and starting to write daily news, and reviews full time later in 2011.

10 Comments

Jon Smirl

5 years ago

I think the hardware is identical between RT1062, RT106A, RT106L, RT106F. The A, L, F variations include software licenses. Those software licenses vary between models and have different prices.

Hector

The M7 will NOT do IR facial recognition since it cannot handle IR image processing in a timely (<1s) manner. You will need a minimum of 4 cores at 1.4 GHz. Near IR (500 nm) is the light range used by professionals since white light is easily fooled.

Jianfeng Qin

4 years ago

No, the M7 MCU-based solution really does support RGB/IR dual-camera image processing, with a total time under 500 ms including the detection, liveness, and recognition algorithms.

willy

Seeing real-time MCUs run at half a gigahertz looks impressive to me. I’m wondering how accurate the realtime processing remains though, when entering a domain where processing performance starts to heavily depend on cache hits and misses.

blu

I guess that for hard-RT scenarios one’d turn off the caches.

dgp

If this meets your real time constraints in the worst case it will still meet your constraints in the best case.

Determining the worst case with a fair degree of certainty happens to be the crux of the problem. Otherwise caching can be slower than non-caching — all it takes is a high-enough miss rate and an average fetch size smaller than a cacheline.
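
The point in the comment above about a cache being slower than no cache can be illustrated with a toy cost model. The Python sketch below is only that: the 10-cycle memory setup latency and the 1-cycle hit cost are hypothetical numbers chosen for illustration, the 32-byte line and 8-byte bus width are assumptions about a Cortex-M7 style memory system, and the model ignores refinements such as critical-word-first fills or outstanding transactions.

```python
import math

# Hypothetical parameters, chosen only to illustrate the argument.
SETUP_CYCLES = 10   # assumed fixed latency to start any external access
BUS_BYTES = 8       # assumed 64-bit bus: 8 bytes transferred per beat
LINE_BYTES = 32     # assumed cache line size
FETCH_BYTES = 4     # average uncached fetch: one 32-bit word
HIT_CYCLES = 1      # assumed cost of a cache hit


def beats(nbytes, bus_bytes):
    """Number of bus beats needed to transfer nbytes."""
    return math.ceil(nbytes / bus_bytes)


def uncached_cost(fetch_bytes, setup, bus_bytes):
    """Cycles per access with the cache disabled."""
    return setup + beats(fetch_bytes, bus_bytes)


def cached_cost(miss_rate, line_bytes, setup, bus_bytes, hit_cycles=HIT_CYCLES):
    """Average cycles per access when every miss triggers a full linefill."""
    linefill = setup + beats(line_bytes, bus_bytes)
    return (1 - miss_rate) * hit_cycles + miss_rate * linefill


def break_even_miss_rate(fetch_bytes, line_bytes, setup, bus_bytes,
                         hit_cycles=HIT_CYCLES):
    """Miss rate above which the cached average exceeds the uncached cost."""
    no_cache = uncached_cost(fetch_bytes, setup, bus_bytes)
    linefill = setup + beats(line_bytes, bus_bytes)
    return (no_cache - hit_cycles) / (linefill - hit_cycles)


no_cache = uncached_cost(FETCH_BYTES, SETUP_CYCLES, BUS_BYTES)
threshold = break_even_miss_rate(FETCH_BYTES, LINE_BYTES, SETUP_CYCLES, BUS_BYTES)
print(f"uncached: {no_cache} cycles per access")
for miss_rate in (0.1, 0.5, 0.9, 1.0):
    avg = cached_cost(miss_rate, LINE_BYTES, SETUP_CYCLES, BUS_BYTES)
    print(f"miss rate {miss_rate:.0%}: cached average {avg:.1f} cycles")
print(f"break-even miss rate: {threshold:.1%}")
```

With these made-up numbers the cache only loses once the miss rate climbs above roughly 77%, which is the "high-enough miss rate" condition. The break-even point moves lower as the linefill gets more expensive relative to a single-word fetch.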

Willy

Most CPUs designed to work with caches cannot fetch less than a cache line anyway. Some memory controllers may at least start to fetch from the requested word. I think that if you disable write-back (or at least write-allocate) leaving the cache enabled cannot degrade performance compared to it being completely disabled.

That’s a possibility. We need a volunteer with a Teensy 4 : )

Since that got my curiosity, I had to track down the relevant interfaces and their docs. M7’s TRM is quite clear on the AXIM capabilities: https://static.docs.arm.com/ddi0489/d/DDI0489D_cortex_m7_trm.pdf — 5.4.1 AXI attributes and transactions. The AXIM interface can do way more than linefills at fetches — it’s capable of various combinations of outstanding linefills and (multi-) word fetches.