Picovoice on-device speech-to-text engines slash the requirements and cost of transcription

Speech-to-text benchmarks accuracy

Picovoice Leopard and Cheetah offline, on-device speech-to-text engines are said to achieve cloud-level accuracy, rely on tiny speech-to-text models, and cut the cost of automatic transcription by up to 10 times. Leopard is an on-device speech-to-text engine, while Cheetah is an on-device streaming speech-to-text engine, and both are cross-platform with support for Linux x86_64, macOS (x86_64, arm64), Windows x86_64, Android, iOS, Raspberry Pi 3/4, and NVIDIA Jetson Nano. Comparing costs is always tricky since companies have different pricing structures, and the table above basically shows the best-case scenario, where Picovoice is 6 to 20 times more cost-effective than solutions from Microsoft Azure or Google STT. Picovoice Leopard/Cheetah is free for the first 100 hours, and customers can pay a $999 monthly fee for up to 10,000 hours, hence the $0.1 per hour cost with Picovoice. If you were to use only 1000 hours out of your plan that […]
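The per-hour figure quoted above comes from dividing the flat monthly fee by the hours actually transcribed. The short Python sketch below works through that arithmetic for the full 10,000 hours and for 1,000 hours. The function name and the plan-limit check are illustrative assumptions, not part of any Picovoice tooling, and the free first 100 hours are not modeled.

```python
# Minimal sketch: effective per-hour cost of a flat-fee plan.
# Fee and hour figures are the ones quoted in the excerpt; everything else
# (names, error handling) is assumed for illustration.

MONTHLY_FEE_USD = 999.0   # flat monthly fee quoted in the excerpt
PLAN_HOURS = 10_000       # hours included in that plan


def effective_cost_per_hour(monthly_fee: float, hours_used: float,
                            plan_hours: float = PLAN_HOURS) -> float:
    """Return the monthly fee divided by the hours actually used."""
    if hours_used <= 0:
        raise ValueError("hours_used must be positive")
    if hours_used > plan_hours:
        raise ValueError("hours_used exceeds the hours included in the plan")
    return monthly_fee / hours_used


# Using the whole plan gives roughly the $0.1 per hour quoted above.
full_plan_cost = effective_cost_per_hour(MONTHLY_FEE_USD, PLAN_HOURS)

# Using only a tenth of the plan makes each hour ten times more expensive.
partial_plan_cost = effective_cost_per_hour(MONTHLY_FEE_USD, 1_000)

print(f"10,000 hours used: ${full_plan_cost:.4f} per hour")
print(f" 1,000 hours used: ${partial_plan_cost:.4f} per hour")
```

Under these assumptions, the advertised rate only holds when the plan is fully used, which is why such comparisons reflect a best-case scenario.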

SmartCow Apollo – A Jetson Xavier NX devkit for conversational AI, computer vision

SmartCow Apollo Devkit

SmartCow Apollo is an audio/video AI engineering kit based on the NVIDIA Jetson Xavier NX computer module and designed for applications with conversational AI capabilities, such as speaker recognition and sentiment analysis. Since a camera is included, computer vision applications should also be possible. The development kit comes with a 128GB NVMe SSD, four microphones, two speaker terminals, two 3.5mm phone jacks, an 8MP camera module, and a 2.08-inch OLED display, with everything housed in a frame that keeps the module and accessories like the camera upright.

SmartCow Apollo specifications:

- NVIDIA Jetson Xavier NX system-on-module
  - CPU – 6-core NVIDIA Carmel ARMv8.2 64-bit CPU with 6MB L2 and 4MB L3 cache
  - GPU – NVIDIA Volta architecture with 384 NVIDIA CUDA cores and 48 Tensor cores
  - Memory – 8GB or 16GB 128-bit LPDDR4x
  - Storage – 16GB eMMC 5.1 flash
- Display
  - 1x Mini DP port
  - 7-pin SPI header for OLED display (included) […]

Khadas Tea – A MagSafe Hi-Fi headphone amplifier to play lossless audio on smartphones (Crowdfunding)

Khadas Tea magnetic headphone amplifier

From this side of the Internet, Khadas is better known for its single board computers, but the company has also made Hi-Fi audio products, starting with the Khadas Tone in 2018 as an add-on board for Khadas VIM/VIM2 SBCs, followed by the Khadas Tone 2 Pro mini desktop Hi-Fi system in 2020. The latest audio product from Khadas is a smartphone accessory: Khadas Tea is a thin MagSafe-compatible magnetic Hi-Fi headphone amplifier that sticks to the back of your phone and is based on the aptX HD and LDAC capable Qualcomm QCC5125 Bluetooth SoC and the ESS ES9281AC Pro DAC.

Khadas Tea specifications:

- Bluetooth Audio SoC – Qualcomm QCC5125 Bluetooth 5.0 audio chipset
- USB DAC – ESS ES9281AC Pro
- Amplifier – RT6863D (Buffer Stage)
- Audio I/O
  - 3.5mm headphone jack
  - Built-in stereo microphone for making and receiving calls over Bluetooth
- Sampling Rate
  - USB: up to 32-bit 384 kHz @ PCM, or DSD 256 (Native)
  - Bluetooth: up […]

Solar-powered Bluetooth headset with Powerfoyle nano-material band remains charged at all times

Solar powered Bluetooth headset

Blue Tiger Solare is a solar-powered Bluetooth headset that you may never need to charge thanks to a Powerfoyle solar cell headband made of a “nano-material that transforms any outdoor and indoor light into clean, endless energy”. The Solare Bluetooth 5.1 headset is said to be military-grade (MIL-STD-810), offers 97% noise cancellation, and is mostly designed for “road warriors” who may require a Bluetooth headset that’s charging continuously. I initially thought it would work better for hikers, cyclists, and motorbike riders than for car drivers, unless we’re talking about convertibles, but Blue Tiger caters to professional truck drivers.

Solare highlights:

- Bluetooth 5.1 with up to ~90 meters range
- High-quality speaker
- Microphone with 97% noise cancellation
- Works with Siri and Google Assistant
- Endless Battery Life with Powerfoyle solar cell flexible headband
- Temperature Range – -40°C to +50°C
- Certifications
  - IPX4 ingress protection rating
  - MIL-STD-810 for extreme environments and ruggedness

Solare solar-powered Bluetooth headset […]

Mico – A USB microphone based on Raspberry Pi RP2040 MCU

Mico Raspberry Pi RP2040 USB Microphone

The Raspberry Pi RP2040 dual-core Cortex-M0+ microcontroller has found its way into Mico, a compact USB microphone with a PDM microphone providing better quality than cheap USB microphones going for one or two dollars, or even 5 cents shipped for new AliExpress users. The project started when Mahesh Venkitachalam (Electronut Labs) was doing audio experiments with machine learning on the Raspberry Pi and found out USB microphone dongles were extremely noisy with poor (distance) sensitivity, so he completed the project with a high-quality I2S microphone instead. He then had the idea of making his own USB microphone and found out Sandeep Mistry had already developed a Microphone Library for Pico, so he mostly had to work on the hardware, and that’s how the Mico Raspberry Pi RP2040 USB microphone came to be.

Mico specifications:

- MCU – Raspberry Pi RP2040 dual-core Cortex-M0+ microcontroller @ up to 133 MHz with 264KB SRAM
- Storage – 128Mbit […]

Picovoice offline Voice AI engine gets free tier for up to 3 users

Picovoice Console Custom Wake Word

The Picovoice offline Voice AI engine now has a free tier that allows people to easily create custom wake words and voice commands for up to three users on any hardware, including Raspberry Pi and Arduino boards. I first learned about Picovoice about a year ago when the offline voice AI engine was demonstrated on a Raspberry Pi fitted with a ReSpeaker 4-mic array to showcase the company’s Porcupine custom wake word engine and Rhino Speech-to-Intent engine. The demo would support 9 wake words: Alexa, Bumblebee, Computer, Hey Google, Hey Siri, Jarvis, Picovoice, Porcupine, and Terminator. More importantly, the solution allows you to easily create your own custom wake words in minutes from a web interface by simply typing the selected wake word, with no need for hundreds of voice samples or waiting weeks to get it done. So I tried “Hey You” first, but I was told it was too short, […]

Espressif introduces ESP32-S3-BOX AI development kit for online and offline voice applications

ESP32-S3-Box

Espressif Systems has very recently introduced the ESP32-S3-BOX AI voice devkit designed for the development of applications with offline and online voice assistants. I find its design similar to the M5Stack Core2 devkit, but the applications will be different. The ESP32-S3-BOX features the latest ESP32-S3 processor with WiFi and BLE connectivity and AI capabilities, as well as a 2.4-inch capacitive touchscreen display, a 2-mic microphone array, a speaker, and I/O connectors, with everything housed in a plastic enclosure with a stand.

ESP32-S3-BOX specifications:

- WiSoC – ESP32-S3 dual-core Tensilica LX7 up to 240 MHz with Wi-Fi & Bluetooth 5, AI instructions, 512KB SRAM
- Memory and Storage – 8MB octal PSRAM and 16MB QSPI flash
- Display – 2.4-inch capacitive touchscreen display with 320×240 resolution
- Audio – Dual microphone, speaker
- USB – 1x USB Type-C port for power and debugging (JTAG/serial)
- Expansion – 2x Pmod-compatible headers for up to 16x GPIOs
- Misc Power […]

Giveaway Week – Bluetrum AB32VG1 RISC-V board

Bluetrum AB32VG1 RISC-V audio board

The next prize for our once-a-year giveaway week is the Bluetrum AB32VG1 RISC-V development board based on the Bluetrum AB5301A MCU. The board is designed for Bluetooth audio applications as well as general-purpose applications and comes with Arduino headers for expansion. A USB Type-C cable is also included for power and programming. I tested the RISC-V board with the RT-Thread open-source operating system, and let’s say it was quite challenging because the development tools need to be improved, but RT-Thread developers are committed to providing a better experience to developers. What’s important is that in the end, after some effort, I managed to blink an LED and play some music through a pair of speakers connected to the 3.5mm audio jack, with the USB port used for power. At the time of review, there was no Bluetooth sample, and a quick look at the documentation shows it may not be available just yet. If you’d […]
