A First Look at ESP32-LyraTD-MSC Audio Mic HDK with Baidu DuerOS Assistant

Earlier this year, Espressif Systems unveiled their ESP32-LyraTD-MSC Audio Mic HDK (Hardware Development Kit), which features an ESP32-WROVER module, a 4-mic array DSP, 3 microphones, an audio jack, and various I/Os.

I received the board a couple of weeks ago, and while no public information has been released yet, the company provided me with the ESP32-LyraTD-MSC User Guide in English. Eventually, I’d expect Google Assistant and Amazon Alexa to be supported, but in the meantime I had to leverage my (lowly) Chinese language skills to get started, since the kit is pre-loaded with firmware that connects to the Baidu DuerOS voice assistant.

ESP32-LyraTD-MSC Unboxing

The kit came in a bland Espressif Systems cardboard box.


Inside the package, I could only find one kit made up of two boards.

The bottom board reads ESP32_MicrosemiDSP_Mainboard-V1, and does not show much apart from markings for connectors, headers and the power switch.

The top board comes with eight buttons (Vol +, Vol -, Mode, Boot, RST, Rec, Play, and Set), three microphones, and some configuration switches, which you may not want to touch at first.

We can take the two boards apart to check out the mainboard and the ESP32_MicrosemiDSP_SubBoard_V1, which carries the microphones and buttons and includes a chip marked “N1309-3216”.

If we have a closer look at the main board, we’ll find an ESP32-WROVER module, a Microsemi ZL38063 audio processor that processes the audio from the microphones and assists the ESP32 with wake word recognition, as well as a CP2102N chip for debugging. We also have a micro SD card slot, two micro USB ports (one for power, one for UART), an audio jack to connect a speaker, an on/off switch, and various headers for I/O and debugging (e.g. JTAG).

Testing Espressif Systems ESP32 Audio Mic HDK with Baidu DuerOS

At this stage, there’s actually little you can do due to the lack of documentation, but I was still able to test the hardware with the Baidu DuerOS assistant. The first part of the user manual tells you to flash the firmware, but the required files are nowhere to be found. Luckily, the board came pre-loaded with some version of it.

So the first thing I had to do was connect a USB power supply to the POWER micro USB port, as well as a pair of speakers to the audio jack. If you plan to modify and flash the firmware (once it becomes available), you’ll also need to connect a micro USB to USB cable between your (Windows) computer and the UART micro USB port.
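
With the UART port connected, the CP2102N should show up as a serial port on the computer, and it ought to be possible to watch the boot log even without the firmware sources. Here is a minimal Python sketch of such a serial monitor. It rests on assumptions I have not verified on this kit: that the pre-loaded firmware logs at 115200 baud (the usual ESP32 default), and that the third-party pyserial package is installed. The port name and both function names are just placeholders for illustration.

```python
# Minimal UART log monitor sketch. Assumptions (not confirmed for this kit):
# the firmware logs at 115200 baud, and pyserial is installed.

ESP32_DEFAULT_BAUD = 115200  # usual ESP32 console default; assumed here


def decode_log_line(raw: bytes) -> str:
    """Decode one raw UART line, replacing undecodable bytes
    and stripping the trailing CR/LF."""
    return raw.decode("utf-8", errors="replace").rstrip("\r\n")


def monitor(port: str, baud: int = ESP32_DEFAULT_BAUD) -> None:
    """Print log lines from the given serial port until interrupted."""
    import serial  # pyserial, third-party: pip install pyserial

    with serial.Serial(port, baud, timeout=1) as uart:
        while True:
            raw = uart.readline()  # returns b"" when the 1 s timeout expires
            if raw:
                print(decode_log_line(raw))
```

Calling `monitor("COM3")` on Windows, or `monitor("/dev/ttyUSB0")` on Linux, would then print whatever the board sends; the actual port name depends on your system. Any serial terminal program would do the same job.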

Now change the power switch to ON, and for the first boot, you should see the blue LED blink. Press the SET button for a few seconds until the board utters something in Chinese (which I could not understand), and install & run IOT Espressif for Android (apk) or ESP-TOUCH for iOS on your smartphone. Skip all the initial steps, and tap on the top left icon, select Add devices, input your WiFi password, and click OK.

After a few seconds, you should see one item added to the “Connected to WiFi Device List”, meaning the kit is now a client on your WiFi network. The blue LED should now be on at all times (no blinking).

Now we can try the voice assistant with the “Alexa” wake word, which causes the board to reply “您好！有什么吩咐” (nin hao! you shenme fenfu), which translates to “Hello! How can I help you?”. We can then say “Alexa” again, followed by our request in Chinese. I asked for the time and the weather, and asked it to play music, in the video below.

The assistant combines female and kid voices for interaction. I actually copied one MP3 and one FLAC audio file to the micro SD card hoping it would start playing them, but instead it started some music from the net.

Microsemi ZL38063 Documentation & Tools

That’s all I could do for now, as we’ll need to get more documentation and some source code from Espressif Systems to further experiment with the platform. Although not compulsory, you may also be interested in ZL38063 audio processor resources, since it interfaces with the ESP32 over SPI for commands and I2S for audio. Those resources may be needed to change the wake word, for example, although Espressif Systems mentioned they could do that themselves, and they’d just need 5,000 audio samples of the wake/hot word. Most of the documentation and software tools are not public, so you’d need to request access to those with a company email address.

To my surprise, I managed to access the files using my website address, but sadly can’t share anything since none of the files are publicly available. The process is somewhat cumbersome, as you need to get approval for the account first, which takes a few days, then request access to documentation, which takes another day or two. There’s a separate login for software, and registration to “Microsemi Software Delivery System (SDS)” is automatic, but again you need to request access to each software/firmware package individually, which in my case was accepted within 24 hours. It would be good if Espressif Systems and/or Microsemi themselves could make it easier for developers to access those resources for a processor that was released in 2015. Some documentation for the ZL38063-based Microsemi AcuEdge Development Kit for Amazon AVS (ZLK38AVS) can be found on GitHub, but I’m not sure whether much of it is usable for the Espressif development kit.

Espressif Audio Mic HDK is not for sale just yet, but the company has sent the kit to several developers, so we should expect some progress in the weeks or months ahead. I’ll likely check it out again once an English voice assistant is made to work, and more resources are made public.

3 Comments
bob
6 years ago

Nice way to learn Chinese :p

rudi
6 years ago

hi you got all this Microsemi ZL38063 Documentation & Tools from microsemi? they did never response to me. i know they are at eMbeddedWorld2018 in nuremburg and i will visit they and ask, why they not response to me since 8 weeks. and sure i have leave my company address – but they did not response. not sure, why espressif take this “crazy” DSP without any document and support to developers. usually developers get docu and response from company but i think Microsemi does not need the developer’s so i will not use this DSP in my future design if…

rudi
6 years ago

Hi
I’m still guessing how you have come to such controversial docus, tools and datasheet 🙂
– after all, these voice processors from microsemi are also used militarily

you happy guy
you have to have a big influence.
for months, the weblogins are not maintained and new ones are not processed.
for a known reason:

https://globenewswire.com/news-release/2018/03/01/1409430/0/en/Microchip-Technology-to-Acquire-Microsemi.html

be happy
best wishes
rudi 😉
