Smart speakers have gained a lot of traction over the last few years, but many of the solutions are based on Google Assistant or Amazon Alexa voice services, with both companies likely tracking your voice searches the same way they track your online searches to provide a “personalized experience” and sell you products or serve ads that match your interests.
If you don’t like being tracked that way, a solution is to use an open source voice assistant such as Mycroft, and install it on a Linux computer, Raspberry Pi 3 board, or Android device. The company also introduced the Mark I reference hardware platform based on Raspberry Pi 2 in 2015. All those hardware options should be fine for the technically inclined, but they are not really suited to the typical end user, and AFAIK they all lack a microphone array for better hot word detection. So Mycroft has come up with the Mark II smart speaker, which should work out of the box with a 6-mic array, a speaker, and a 4″ touchscreen.
Mycroft Mark II specifications:
- SoC – Xilinx quad-core processor
- Far-field 6-microphone array
- Hardware AEC, beamforming and noise reduction
- Stereo sound with dual 2″ drivers (10W)
- 3.5mm audio out
- Display – 4″ IPS LCD touchscreen
- USB – 1x USB Type A port
- Storage – MicroSD card slot
- Connectivity – Bluetooth and WiFi
- Power Supply – 18W power supply with international adapters
- Dimensions – 196 mm (H) x 105 mm (Ø)
The company did not name the Xilinx processor, but considering Mycroft normally runs on Linux, this can only be one of the Xilinx Zynq UltraScale+ MPSoCs combining four Cortex-A53 cores, dual Cortex-R5 cores, and UltraScale+ FPGA fabric.
The device supports many of the same features as other smart speakers, with for example built-in support for 140 different skills working with Roku, Twitter, Pandora, Wikipedia, Facebook, Philips Hue, and more. The language is however limited to English for now, although the Mycroft community has been working on Spanish, Portuguese, Italian, French, and German. The principle is also the same as for other smart speakers: Mark II listens for a custom wake word (aka hot word), and once it is detected, the device sends the voice command to the cloud, which processes the audio and sends back the answer data. The company claims that it does not store any data, contrary to what Google, Amazon, Apple (Siri) or Microsoft (Cortana) do, and that it only uses open source software.
Source code can be found on GitHub, with 5 main components involved in the process:
- PocketSphinx (and soon Precise) for wake-word detection
- Mozilla DeepSpeech for speech to text (starting in March 2018)
- Adapt and Padatious for natural language understanding
- Mimic for text-to-speech
- Python API for the skills framework
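To make the flow between those components more concrete, here is a minimal, self-contained Python sketch of the same stages. It does not use any Mycroft, PocketSphinx, DeepSpeech, Adapt, or Mimic code: every name in it (`detect_wake_word`, `parse_intent`, `SKILLS`, `handle`, and the "hey mycroft" wake word) is a hypothetical stand-in chosen for illustration. Speech-to-text and text-to-speech are skipped entirely, so the input is assumed to be already transcribed text and the answer is returned as text.

```python
"""Toy walk-through of a voice assistant pipeline; no Mycroft code is used."""
from dataclasses import dataclass
from typing import Callable, Dict, Optional

WAKE_WORD = "hey mycroft"  # assumed wake word, for illustration only


@dataclass
class Intent:
    name: str       # which skill should handle the request
    argument: str   # the rest of the command, passed to the skill


def detect_wake_word(utterance: str) -> Optional[str]:
    """Return the command that follows the wake word, or None if it is absent."""
    text = utterance.lower().strip()
    if not text.startswith(WAKE_WORD):
        return None
    return text[len(WAKE_WORD):].strip(" ,")


def parse_intent(command: str) -> Optional[Intent]:
    """Tiny keyword matcher standing in for natural language understanding."""
    patterns = (
        ("search wikipedia for", "wikipedia"),
        ("turn on", "lights_on"),
    )
    for keyword, name in patterns:
        if command.startswith(keyword):
            return Intent(name, command[len(keyword):].strip())
    return None


# Hypothetical skills: each one maps an argument to a spoken-style answer.
SKILLS: Dict[str, Callable[[str], str]] = {
    "wikipedia": lambda topic: f"Looking up {topic} on Wikipedia.",
    "lights_on": lambda target: f"Turning on {target}.",
}


def handle(utterance: str) -> Optional[str]:
    """Run one utterance through wake word, intent parsing, and skill dispatch."""
    command = detect_wake_word(utterance)
    if command is None:
        return None  # no wake word: the device stays silent
    intent = parse_intent(command)
    if intent is None:
        return "Sorry, I did not understand that."
    return SKILLS[intent.name](intent.argument)


if __name__ == "__main__":
    print(handle("Hey Mycroft, search Wikipedia for Raspberry Pi"))
    print(handle("What time is it?"))
```

In the real system, each stage would be replaced by the corresponding component from the list, and the text answer would be turned back into audio by the text-to-speech engine.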
The Mycroft Mark II smart speaker has recently launched on Kickstarter, and it has already surpassed its funding goal ($50,000) by raising close to $100,000 with 28 days to go. Rewards start with a $99 Mark II dev kit that comes with all electronics but no housing, which you’ll be able to print using CAD files. If you want the complete assembled system with enclosure, a $129 pledge or greater is required. Shipping adds $15 to the US, and $35 to the rest of the world, with delivery for both the development kit and the smart speaker expected by December 2018. Visit Mycroft.ai for more details about the AI solution and speaker.