picoLLM is a cross-platform, on-device LLM inference engine

picoLLM Raspberry Pi 5

Large Language Models (LLMs) can run locally on mini PCs or single board computers like the Raspberry Pi 5, but with limited performance due to high memory usage and bandwidth requirements. That’s why Picovoice has developed the picoLLM Inference Engine, a cross-platform SDK optimized for running compressed large language models on Linux (x86_64), macOS (arm64, x86_64), Windows (x86_64), Raspberry Pi OS on Pi 5 and 4, the Android and iOS mobile operating systems, and web browsers such as Chrome, Safari, Edge, and Firefox. Alireza Kenarsari, Picovoice CEO, told CNX Software that “picoLLM is a joint effort of Picovoice deep learning researchers who developed the X-bit quantization algorithm and engineers who built the cross-platform LLM inference engine to bring any LLM to any device and control back to enterprises”. The company says picoLLM delivers better accuracy than GPTQ when using Llama-3-8B MMLU (Massive Multitask Language Understanding) as a […]

Rockchip RK2118G/RK2118M dual-core Star-SE Armv8-M microcontrollers target smart audio applications

Rockchip RK2118G microcontroller block diagram

Rockchip RK2118G and RK2118M are smart audio microcontrollers based on a dual-core Star-SE Armv8-M processor, an NPU for smart AI audio processing, three DSPs, 1024KB SRAM, optional DDR memory in package, and a range of peripherals. I first noticed the RK2118M in slides from the Rockchip Developer Conference 2024 last March, but I did not have enough information for an article at the time. Things have now changed since I’ve just received a bunch of datasheets, including the one for the RK2118G and RK2118M microcontrollers, which look identical except for the DDR interface and optional built-in 64MB RAM for the RK2118G. The datasheets have only one reference to Arm with the string “Arm-V8M” and nothing else, and Cortex is not mentioned at all. But the slide above reveals the STAR-SE core looks to be an Arm Cortex-M33 core. We also learn the top frequencies for the “STAR-M33”/“STAR-SE” core (300MHz) and the […]

ESP32-S3-BOX-3 devkit comes with 2.4-inch display, dual microphone, PCIe expansion connector

ESP32-S3-BOX-3

Espressif Systems has launched an update to its ESP32-S3-Box development kit for online and offline voice assistants: the ESP32-S3-BOX-3 devkit still features a 2.4-inch capacitive touchscreen display with 320×240 resolution, two microphones, a built-in speaker, and a USB-C port, but replaces the PMOD connector with a PCIe connector for various expansion modules. The open-source ESP32-S3 development kit is powered by the ESP32-S3 SoC with AI extensions and can be used to implement all sorts of applications using the company’s ESP-SR, ESP RainMaker, and Matter solutions, such as an offline voice assistant, a chatbot powered by ChatGPT, a handheld gaming console, a tiny robot, a Matter-compatible Smart Home hub, and more. ESP32-S3-BOX-3 specifications: WiSoC – ESP32-S3 dual-core Tensilica LX7 up to 240 MHz with Wi-Fi 4 & Bluetooth 5, AI instructions, 512KB SRAM; Memory and Storage – 16MB octal PSRAM and 16MB QSPI flash; Display – 2.4-inch capacitive touchscreen […]

Espressif ESP-SR enables on-device speech recognition framework on ESP32-S3 and ESP32 WiSoCs

ESP SR ESP32 on device speech recognition AFE

Espressif ESP-SR is a speech recognition framework enabling on-device speech recognition on ESP32 and ESP32-S3 wireless microcontrollers, with the latter being recommended due to its vector extension for AI acceleration and larger, high-speed octal SPI PSRAM. The ESP-SR framework was first released on December 17, 2021 with version 1.0, before the v1.20 update was introduced in March of this year, but I only found out about the ESP-SR offline speech recognition solution through a tweet by John Lee showing an ESP-SR demo video by @ThatProject. The tweet, posted by John Lee (@EspressifSystem) on July 15, 2023, reads: “Comrades of the world, liberate your hands from the chains of typing and touching germy switches! Embrace the revolutionary power of speech recognition with ESP32-S3 + ESP-SR. Let your words flow freely, for the proletariat shall not be silenced by keyboards or bourgeois input… pic.twitter.com/bm3udteB3o” I was initially confused since ESP32 boards have supported speech recognition for years using […]

Offline voice recognition module supports Arduino programming, custom voice commands

Offline Voice Recognition Arduino

We’ve already covered inexpensive offline voice recognition modules based on US516P6 or TW-ASR ONE microcontrollers that allow people to add smarts to their projects without a network connection for improved privacy and lower latency. Those are great in theory, but at the time (April 2022) documentation was lacking or only in Chinese, and they were fairly hard to use based on some of the comments in my earlier posts. But today, I’ve noticed DFRobot is now selling the “Gravity: Offline Voice Recognition Sensor – I2C & UART” module with support for Arduino programming, and it looks fairly easy to customize as we’ll see further below. Gravity Voice Recognition DF2301QG module specifications: Voice recognition module – WS-2520-TR module with MCU – TBD; 121 commonly used fixed voice commands, one fixed wake word; support for 1 learned wake word, 17 user-defined commands; Audio Output – Built-in speaker and external speaker interface; Input – Dual […]

Videostrong HC1 Home Care Hub for the elderly serves as Smart Speaker, Smart Home gateway, video phone

Videostrong HC1 Home Care Hub

Videostrong HC1 Home Care Hub is a Smart Home/IoT gateway designed for the elderly that also serves as a smart speaker with 10-meter far-field voice recognition, a video phone with a built-in camera and speaker, and a 4K Android TV box. The system is based on an Amlogic S905Y4 quad-core Cortex-A35 processor coupled with up to 4GB RAM and 64GB eMMC flash, supports Ethernet, WiFi, Bluetooth, Zigbee, and LoRa connectivity, and offers both HDMI 2.1 video output and HDMI 2.0 video input. HC1 specifications: SoC – Amlogic S905Y4 quad-core Arm Cortex-A35 @ 2.0GHz with Arm Mali-G31 MP2 GPU with OpenGL ES 3.2 support; System Memory – 2GB or 4GB RAM; Storage – 8GB, 16GB, 32GB, or 64GB eMMC flash; Video Output – HDMI 2.1 port up to 4Kp60; Video Input – Built-in 1920×1080 camera with 90° wide angle, manual cover, adjustable angle; HDMI 2.0 port up to 4Kp60; Audio […]

Banana Pi BPI-P2 Pro headless SBC features RK3308 CPU, PoE Ethernet, WiFi 5, audio jack

Banana Pi BPI-P2 Pro

Banana Pi BPI-P2 Pro is a Rockchip RK3308 quad-core Cortex-A35 SBC for headless applications with a PoE-capable Ethernet port, WiFi 5, a USB port, an audio jack, and two GPIO headers for expansion. You may think the Banana Pi guys have gone crazy by calling such an entry-level SBC “Pro”, but that’s because the company previously released the BPI-P2 Zero and BPI-P2 Maker single board computers based on the Allwinner H2+ quad-core Cortex-A7 processor, so the BPI-P2 Pro is indeed an improvement, albeit with some caveats. Banana Pi BPI-P2 Pro specifications: SoC – Rockchip RK3308 quad-core Arm Cortex-A35 processor @ up to 1.3 GHz with built-in VAD (Voice Activity Detector); System Memory – 2GB LPDDR2 SDRAM [Update: According to Rockchip RK3308 specifications, the maximum memory capacity is 512MB, so Banana Pi may have meant 2 Gbit instead, meaning 256MB of RAM]; Storage – 8GB eMMC flash, microSD card slot; Video Output – […]

Allwinner R128 wireless SoC features 64-bit RISC-V core, Arm Cortex-M33 core, and HiFi 5 audio DSP

Allwinner R128

Allwinner is mostly known for its low-cost Arm processors running Android or Linux, but the Allwinner R128 is a wireless audio SoC with a C906 64-bit RISC-V application core, an Arm Cortex-M33 real-time core, a HiFi 5 DSP, and built-in WiFi and Bluetooth connectivity. The SoC also comes with 1MB SRAM, up to 16MB flash, up to 32MB PSRAM, display and camera interfaces, support for microphone arrays, and plenty of I/Os that should make it suitable for smart speakers and other voice-controlled home appliances with or without display. Allwinner R128 specifications: Application core – Xuantie C906 64-bit RISC-V core clocked at 600 MHz; DSP – Cadence HiFi 5 audio DSP clocked at 400 MHz; Communication core – Arm M33 Star (Cortex-M33 from Arm China?) core clocked at 240 MHz with TrustZone support; Memory – 1MB SRAM; 8MB, 16MB, or 32MB PSRAM (SiP = System-in-Package); OPI PSRAM controller; Storage – QPI flash […]
