Picovoice Cobra Voice Activity Detection Engine shown to outperform Google WebRTC VAD

The Picovoice Cobra Voice Activity Detection (VAD) engine has just been publicly released with support for Raspberry Pi, BeagleBone, NVIDIA Jetson Nano, Linux 64-bit, macOS 64-bit, Windows 64-bit, Android, iOS, and web browsers that support WebAssembly. Support for other Cortex-M and Cortex-A based SoCs can also be made available, but only to enterprise customers.

Picovoice already offered custom wake word detection with quick and easy web-based training, as well as offline voice recognition for Raspberry Pi, and later even ported its voice engine to Arduino. Cobra VAD is a new release and, like other VADs, aims to detect the presence of a human voice within an audio stream.

Picovoice Cobra can be found on GitHub, but note that this is not an open-source solution. Instead, the libpv_cobra.so dynamic library is provided for various targets, together with header files and demos in C, Python, Rust, and WebAssembly, as well as demo apps for iOS and Android.

The easiest and fastest way to try it out is via the demo embedded in the announcement. Just click on the microphone, and then make some noise and/or talk to see how it performs.

Any sound that is not audible speech should be filtered out, even in noisy environments, within some limits of course.

The company also published a voice activity benchmark to compare it to other solutions such as Google WebRTC VAD, accessed through the py-webrtcvad Python module. The chart below, provided by Picovoice, shows the receiver operating characteristic (ROC) curves of the WebRTC and Cobra engines at a signal-to-noise ratio (SNR) of 0 dB. The chart is a little confusing, but the takeaway is that a larger area under the curve is better.
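To make the "area under the curve" idea more concrete, here is a minimal sketch of how an ROC curve and its area can be computed from per-frame voice scores. It is not Picovoice's benchmark code: the function names, the labels, and the scores are all made up for illustration.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    labels: 1 for a speech frame, 0 for a non-speech frame (hypothetical ground truth)
    scores: the detector's voice score for each frame, higher meaning "more likely speech"
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one speech and one non-speech frame")

    # Sweep the decision threshold from the highest score down to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every frame sharing this score so ties yield a single point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under (x, y) points that are sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up example: six frames, three of them speech.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
print(f"AUC = {area_under_curve(roc_points(labels, scores)):.3f}")  # AUC = 0.889
```

A detector that ranks every speech frame above every non-speech frame would reach an area of 1.0, while random guessing lands around 0.5, which is why the engine with the larger area is the better one.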

The Cobra VAD engine is also said to be efficient, with a real-time factor of 0.05 on a Raspberry Pi Zero, meaning processing takes about 5% of the audio's duration, and 0.0006 on a more powerful Intel Core i7-1185G7 Tiger Lake laptop.
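For readers unfamiliar with the metric, the real-time factor is the processing time divided by the duration of the audio processed. The sketch below shows the arithmetic and one way to measure it. The 16 kHz sample rate, the 512-sample frame, and the `dummy_detector` function are assumptions for illustration, and the measurement uses wall-clock time rather than CPU time; none of this is the Cobra API.

```python
import time

SAMPLE_RATE = 16000   # assumed sample rate in Hz
FRAME_LENGTH = 512    # assumed number of samples per frame


def compute_seconds(rtf, audio_seconds):
    """Processing time implied by a real-time factor for a given audio length."""
    return rtf * audio_seconds


def measure_rtf(process_frame, frames, frame_length=FRAME_LENGTH, sample_rate=SAMPLE_RATE):
    """Time a detector over the frames and divide by the audio duration."""
    if not frames:
        raise ValueError("need at least one frame")
    audio_seconds = len(frames) * frame_length / sample_rate
    start = time.perf_counter()
    for frame in frames:
        process_frame(frame)
    elapsed = time.perf_counter() - start
    return elapsed / audio_seconds


def dummy_detector(frame):
    """Hypothetical stand-in for a real VAD call: mean absolute amplitude."""
    return sum(abs(sample) for sample in frame) / len(frame)


# One minute of audio at the real-time factors quoted in the article.
print(f"Raspberry Pi Zero: {compute_seconds(0.05, 60):.3f} s")    # 3.000 s
print(f"Core i7-1185G7:    {compute_seconds(0.0006, 60):.3f} s")  # 0.036 s

# Measuring the stand-in detector on 100 frames of silence.
silence = [[0] * FRAME_LENGTH for _ in range(100)]
print(f"dummy detector RTF: {measure_rtf(dummy_detector, silence):.4f}")
```

In other words, at the quoted figures one minute of audio would take about 3 seconds to process on the Raspberry Pi Zero and about 36 ms on the Tiger Lake laptop.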

Jean-Luc Aufranc (CNXSoft)

Jean-Luc started CNX Software in 2010 as a part-time endeavor, before quitting his job as a software engineering manager and starting to write daily news and reviews full time later in 2011.

7 Replies to “Picovoice Cobra Voice Activity Detection Engine shown to outperform Google WebRTC VAD”

Jeroen says:

October 28, 2021 at 12:24

Is there any comparison with open source products?

1. Andrzej says:
  
  October 29, 2021 at 00:49
  
  Is there any working open source product like Picovoice?
  
  1. Jeroen says:
    
    October 29, 2021 at 12:24
    
    yes there are quite a few open source voice recognition engines / assistants out there, but I haven’t gotten around to testing them, that’s why I asked.
    
    1. zepan says:
      
      October 29, 2021 at 16:43
      
      maybe you can look at https://github.com/sipeed/Maix-Speech
      
      1. Jean-Luc Aufranc (CNXSoft) says:
        
        October 29, 2021 at 16:58
        
        Is there a VAD engine too? I can only see speech recognition and text-to-speech for now.
      2. zepan says:
        
        October 29, 2021 at 17:16
        
        it is a speech recognition engine for low-end devices, including continuous digit recognition, KWS, and LVCSR, at 1/10 the memory cost compared to normal open-source speech recognition engines.

Jack says:

October 30, 2021 at 09:15

Who will spy better ?
