NXP i.MX RT106F & RT106A/L Cortex-M7 Processors Target Offline Face Recognition & Smart Audio Applications

NXP i.MX RT crossover processors combine the real-time capabilities of microcontrollers with the performance of application processors thanks to an Arm Cortex-M7 core clocked at 528 MHz or higher.

The performance is indeed impressive, as shown by Teensy 4.0 benchmarks, but so far NXP i.MX RT processors have targeted general-purpose applications. The company has now introduced three new crossover processors designed for AI applications. NXP i.MX RT106F is designed for offline face recognition and expression identification, while RT106L and RT106A are made for local and cloud-based embedded voice applications.

NXP i.MX RT106F Processor

NXP i.MX RT106F Block Diagram

Highlights of the processor:

  • CPU – Arm Cortex-M7 @ 600 MHz (3020 CoreMark/1284 DMIPS)
  • Memory – 1 MB On-Chip SRAM plus up to 512 KB configurable as Tightly Coupled Memory (TCM)
  • External memory interface options – NAND, eMMC, QuadSPI NOR Flash, and Parallel NOR Flash
  • Real-time, low-latency response as low as 20 ns
  • Industry’s lowest dynamic power with an integrated DC-DC converter
  • Low-power run modes at 24 MHz
  • Advanced multimedia for GUI and enhanced HMI
    • 2D graphics acceleration engine
    • Parallel camera sensor interface
    • LCD display controller (up to WXGA 1366×768)
    • 3x I2S for high-performance, multi-channel audio
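
To put the headline numbers in perspective, here is a quick back-of-the-envelope calculation using only the figures from the list above (600 MHz, 3020 CoreMark, 1284 DMIPS, 20 ns). The helper names are made up for illustration, and the cycle count is simple arithmetic on the quoted latency, not a measured value.

```python
# Figures quoted in the processor highlights above
CLOCK_MHZ = 600
COREMARK = 3020
DMIPS = 1284
LATENCY_NS = 20


def per_mhz(score, clock_mhz):
    """Normalize a benchmark score by the core clock."""
    return score / clock_mhz


def ns_to_cycles(latency_ns, clock_mhz):
    """Convert a time in nanoseconds to core clock cycles.

    cycles = time [ns] * frequency [MHz] / 1000
    """
    return latency_ns * clock_mhz / 1000


coremark_per_mhz = per_mhz(COREMARK, CLOCK_MHZ)
dmips_per_mhz = per_mhz(DMIPS, CLOCK_MHZ)
latency_cycles = ns_to_cycles(LATENCY_NS, CLOCK_MHZ)

print(f"{coremark_per_mhz:.2f} CoreMark/MHz")
print(f"{dmips_per_mhz:.2f} DMIPS/MHz")
print(f"{latency_cycles:.0f} core cycles in {LATENCY_NS} ns at {CLOCK_MHZ} MHz")
```

That works out to roughly 5 CoreMark/MHz and 2.14 DMIPS/MHz, and the 20 ns figure corresponds to 12 core clock cycles at 600 MHz.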

NXP provides FreeRTOS for the microcontroller/processor, and software development can be performed with MCUXpresso SDK, IDE and Config Tools. NXP claims its OASIS face processing engine enables face detection, recognition, and anti-spoofing without cloud connectivity at a much lower price than competing Linux-based solutions. More details about the processor itself can be found on the product page.

Face Recognition Devkit

NXP i.MX RT106F Face Recognition Development Kit

The company is working with OEMs to develop i.MX RT106F development kits for face recognition applications, such as the one pictured above.

The devkit features an ultra-small form-factor, production-ready hardware design running FreeRTOS that allows for a quick out-of-the-box implementation. It can perform face detection, face tracking, face alignment, and face recognition without Wi-Fi and cloud connectivity in order to address potential privacy concerns.

Devkit specifications:

  • MCU – NXP i.MX RT106F Crossover Processor with Arm Cortex-M7 core @ 600 MHz, 32 kB I-cache, 32 kB D-cache, FPU, 1 MB on-chip SRAM
  • System Memory – 32 MB SDRAM
  • Storage – 32 MB Hyperflash
  • ADC/DAC Conversion – 2x ADC (20-ch), 2x ACMP
  • System Control – Secure JTAG, PLL OSC, eDMA, 4x Watch Dog, 6x GP Timer, 4x Quadrature ENC, 4x QuadTimer, 4x FlexPWM, IOMUX
  • Security
    • Hardware – HAB, TRNG, Encrypted XIP out of Flash
    • Software – Ciphers & RNG, Secure RTC, Fuse, HAB
  • Misc – Optional display and keypad, supports RGB and IR, interface for temperature monitoring
  • Power Supply – 5V USB Type-C port; Low-dropout regulation via DC-to-DC & LDO
  • Dimensions – 50 x 40 mm
NXP i.MX RT106F Face Recognition-Devkit Block Diagram
Hardware Block Diagram

The kit will come with a full source stack including the FreeRTOS operating system and drivers, the face recognition inference engine, GUI API, Bluetooth & WiFi connectivity manager, and more.

NXP Face Recognition Software Stack

Some of the applications are somewhat worrying (does my washing machine really need a camera?) but here they are:

  • Smart appliances – Washing machines, dryers, ovens, refrigerators, stoves, and dishwashers
  • Home comfort devices – Thermostats, HVAC and lighting control
  • Countertop appliances – Microwaves, coffee machines, and rice cookers
  • Safety/Security/Alarm devices – Alarm panels and automated access
  • Smart industrial devices – Power tools, ergonomic stations, industrial workstations

The processor and development kits will become available in Q1 2020, but if your company has a project that could benefit from the solution, you can request early access on the devkit page.

NXP i.MX RT106A & RT106L

NXP i.MX RT106A Block Diagram

I challenge you to find any differences between the NXP i.MX RT106A block diagram above and the one for the RT106F earlier in this post, because unless I’m really tired (Friday evening here), there aren’t any.

The highlights for the RT106A and RT106L voice processors are the same, and differ from those of the RT106F only by the addition of a wireless connectivity interface:

  • CPU – Arm Cortex-M7 @ 600 MHz (3020 CoreMark/1284 DMIPS)
  • Memory – 1 MB On-Chip SRAM plus up to 512 KB configurable as Tightly Coupled Memory (TCM)
  • External memory interface options – NAND, eMMC, QuadSPI NOR Flash, and Parallel NOR Flash
  • Real-time, low-latency response as low as 20 ns
  • Industry’s lowest dynamic power with an integrated DC-DC converter
  • Low-power run modes at 24 MHz
  • Advanced multimedia for GUI and enhanced HMI
    • 2D graphics acceleration engine
    • Parallel camera sensor interface
    • LCD display controller (up to WXGA 1366×768)
    • 3x I2S for high-performance, multi-channel audio
  • Wireless connectivity interface for WiFi, Bluetooth, Bluetooth Low Energy, ZigBee and Thread

So as I understand it, RT106A, RT106L and RT106F share the same hardware, and only the software/licenses vary.

NXP i.MX RT106L is designed for local (offline) voice control solutions using Snips with the following software features:

  • Far-field audio front end
    • Acoustic echo cancellation (barge-in)
    • Ambient noise reduction
    • Beamforming
    • Playback processing
    • Codecs
  • Automatic speech recognition engine
  • Media player/streamer
  • MQTT, lwIP, TLS
  • All drivers, including Wi-Fi and Bluetooth

NXP i.MX RT106A is licensed to run NXP’s turnkey voice assistant software solutions with the following features:

  • Far-field audio front end softDSP
    • Acoustic echo cancellation
    • Ambient noise reduction
    • Beamforming
    • Barge-in
    • Playback processing
    • Codecs
  • Wake-word inference engine
  • Media player/streamer
  • MQTT, lwIP, TLS
  • Discovery and onboarding
  • All drivers, including Wi-Fi and Bluetooth

You’ll find more details on their respective product pages.

NXP i.MX RT MCU Alexa Voice Service Solution

i.MX RT Voice Solution Board

There’s also an i.MX RT106A-based system reference design for creating products with Alexa built-in, called “NXP i.MX RT MCU Alexa Voice Service Solution”.

Specifications:

  • MCU – NXP i.MX RT106A Crossover Processor with Arm Cortex-M7 core @ 600 MHz, 32 kB I-cache, 32 kB D-cache, FPU, 1 MB on-chip SRAM
  • Storage – 32 MB Hyperflash
  • Audio – TFA9894 embedded DSP class-D audio amplifier solution with speaker protection and boost algorithm; low power: 91% peak efficiency for 600 mW sine wave; low battery consumption: <125 mA (Po = 380 mW, average music power)
  • Wireless Connectivity – WiFi 4 (802.11 b/g/n) and Bluetooth + BLE 4.1
  • System Control – Secure JTAG, PLL OSC, eDMA, 4x Watch Dog, 6x GP Timer, 4x Quadrature ENC, 4x QuadTimer, 4x FlexPWM, IOMUX
  • ADC/DAC Conversion – 2x ADC (20-ch), 2x ACMP
  • Security
    • Hardware – Optional secure element A71CH
    • Software – Ciphers & RNG, Secure RTC, Fuse, HAB
  • Protocols – MQTT, Mbed TLS, Alexa for MCU, lwIP
  • Power Supply – 5V USB Type-C port; Low-dropout regulation via DC-to-DC & LDO
  • Dimensions – 40 x 30 mm
  • Qualifications – Amazon AVS Qualified
NXP i.MX RT Voice Development Kit AVS Block Diagram
Hardware Block Diagram

The kit also runs Amazon FreeRTOS and supports up to three integrated low-cost MEMS microphones, and two external digital microphone lines. The software architecture is similar to the face detection kit, but obviously, it removes the computer vision part and replaces it with audio stacks such as Opus, MP3, G.711 and WMA audio codecs, the wake word inference engine, audio DSP, and more. The software also supports “IoT” communication protocols like MQTT and includes the lwIP lightweight TCP/IP stack.

NXP i.MX RT Voice Devkit Software Architecture

You’ll find more details on the product page, and it will soon be sold on Mouser for $49. It’s an official Amazon AVS development kit and as such is listed on the Amazon Developer site together with the MediaTek MT8516 devkit we covered earlier today, and other compatible platforms.

9 Comments

I think the hardware is identical between RT1062, RT106A, RT106L, RT106F. The A, L, F variations include software licenses. Those software licenses vary between models and have different prices.

Hector

The M7 will NOT do IR facial recognition since it cannot handle IR image processing in a timely (<1s) manner. You will need a minimum of 4 cores at 1.4 GHz. Near IR (500 nm) is the light range used by professionals since white light is easily fooled.

willy

Seeing real-time MCUs run at half a gigahertz looks impressive to me. I’m wondering how accurate the realtime processing remains though, when entering a domain where processing performance starts to heavily depend on cache hits and misses.

blu

I guess that for hard-RT scenarios one’d turn off the caches.

dgp

If this meets your real time constraints in the worst case it will still meet your constraints in the best case.

blu

Determining the worst case with a fair degree of certainty happens to be the crux of the problem. Otherwise caching can be slower than non-caching — all it takes is a high-enough miss rate and an average fetch size smaller than a cacheline.

Willy

Most CPUs designed to work with caches cannot fetch less than a cache line anyway. Some memory controllers may at least start to fetch from the requested word. I think that if you disable write-back (or at least write-allocate) leaving the cache enabled cannot degrade performance compared to it being completely disabled.

blu

That’s a possibility. We need a volunteer with a Teensy 4 : )

blu

Since that got my curiosity, I had to track down the relevant interfaces and their docs. M7’s TRM is quite clear on the AXIM capabilities: https://static.docs.arm.com/ddi0489/d/DDI0489D_cortex_m7_trm.pdf — 5.4.1 AXI attributes and transactions. The AXIM interface can do way more than linefills at fetches — it’s capable of various combinations of outstanding linefills and (multi-) word fetches.
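
The caching trade-off debated in the thread above can be sketched with a toy cost model: with the cache on, a hit is cheap but a miss pays for a whole line fill, so a high enough miss rate can make the average access slower than a plain uncached word fetch. All cycle counts below are hypothetical and chosen for illustration only. They are not Cortex-M7 or i.MX RT measurements, and the model ignores the effects the commenters mention, such as fetching the requested word first or overlapping outstanding linefills.

```python
def avg_cached_cycles(miss_rate, hit_cycles, linefill_cycles):
    """Average cost per access with the cache on.

    Hits cost hit_cycles; misses pay for a full cache line fill.
    """
    return (1.0 - miss_rate) * hit_cycles + miss_rate * linefill_cycles


def break_even_miss_rate(hit_cycles, linefill_cycles, uncached_cycles):
    """Miss rate above which the cached average exceeds an uncached fetch."""
    return (uncached_cycles - hit_cycles) / (linefill_cycles - hit_cycles)


# Hypothetical cycle counts, for illustration only
HIT = 1        # access served from the cache
UNCACHED = 10  # single word fetched directly from external memory
LINEFILL = 16  # miss that pulls in a full cache line

threshold = break_even_miss_rate(HIT, LINEFILL, UNCACHED)
for miss_rate in (0.1, 0.5, 0.8):
    cached = avg_cached_cycles(miss_rate, HIT, LINEFILL)
    verdict = "slower" if cached > UNCACHED else "faster"
    print(f"miss rate {miss_rate:.0%}: {cached:.1f} cycles on average, "
          f"{verdict} than uncached")
print(f"break-even miss rate: {threshold:.0%}")
```

With these made-up numbers the cache only loses once the miss rate passes 60%, which is why the worst-case access pattern, rather than the average one, is what matters for hard real-time work.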