GreenWaves GAP8 is a Low Power RISC-V IoT Processor Optimized for Artificial Intelligence Applications

GreenWaves Technologies, a fabless semiconductor startup based in Grenoble, France, has designed GAP8, an IoT application processor based on the RISC-V architecture and optimized for image and audio algorithms, including convolutional neural network (CNN) inference. It achieves high energy efficiency thanks to an 8-core computational cluster combined with a convolution hardware accelerator. The design builds on the open-source, RISC-V based Parallel Ultra Low Power (PULP) computing platform.

The new processor targets industrial and consumer products that integrate artificial intelligence and advanced classification, such as image recognition, counting people and objects, machine health monitoring, home security, speech recognition, consumer robotics, wearables, and smart toys.

Key GAP8 processor specifications:

  • 1x extended RISC-V fabric controller core with 16 kB data and 4 kB instruction cache for system control
  • 8x extended RISC-V compute cores with 64 kB shared data memory and 16 kB shared instruction cache
  • 1x Hardware optimized synchronization unit
  • 1x Hardware Convolution Engine (HWCE)
  • Multi channel 1D/2D DMA, specialized multi-channel micro DMA for autonomous peripheral support
  • Programmable Voltage Regulator
  • Real Time Clock
  • 2x programmable clocks
  • Secured execution support with Memory Protection Unit
  • 512 kB State Retentive L2 Memory
  • Optional external high speed low power SDRAM up to 16 MB, through HyperBus
  • 32 kHz external quartz, Up to 250 MHz internal clock
  • I/O interfaces
    • 128 Mb/s LVDS IEEE compliant
    • Serial I/Q
    • UART
    • Quad SPI Master + additional SPI Master, SPI Slave
    • 1x I2S
    • 1x I2C
    • 1x Camera parallel interface
    • HyperBus (External Flash and RAM)
    • Up to 32 GPIOs
    • 4x PWM
  • Supply Voltage
    • 1.2 V down to 1.0 V core VDD supply
    • 1.8 V to 3.3 V for I/Os
  • aQFN 84 package
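
To give an idea of the kind of workload the Hardware Convolution Engine listed above is meant to accelerate, here is a minimal software reference for a 5×5 convolution as used in CNN layers. It is only a sketch: the function names are made up for illustration, it is not the HWCE's implementation or any GAP8 SDK API, and the 320×240 QVGA frame size is used purely as an example input.

```python
def conv2d_5x5(image, kernel):
    """Valid (no padding) 5x5 convolution in the CNN sense (cross-correlation).

    image  : list of rows, each a list of numbers
    kernel : 5x5 list of lists
    Hypothetical reference helper, not a GAP8 SDK function.
    """
    k = 5
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    out = [[0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            acc = 0
            # 25 multiply-accumulate operations per output pixel
            for ky in range(k):
                for kx in range(k):
                    acc += image[y + ky][x + kx] * kernel[ky][kx]
            out[y][x] = acc
    return out


def mac_count(height, width, k=5):
    """Number of multiply-accumulates for one valid k x k convolution."""
    return (height - k + 1) * (width - k + 1) * k * k


# Small check: a kernel with a single 1 in the centre copies the centre pixel.
image = [[y * 6 + x for x in range(6)] for y in range(6)]
centre_kernel = [[1 if (ky, kx) == (2, 2) else 0 for kx in range(5)] for ky in range(5)]
result = conv2d_5x5(image, centre_kernel)
print(result)

# MACs for one 5x5 filter over a single-channel QVGA (320x240) frame.
qvga_macs = mac_count(240, 320)
print(qvga_macs)
```

Every output pixel costs 25 multiply-accumulates, so even a single filter over one QVGA frame runs into the millions of operations, which is why doing this step in dedicated hardware matters for the power budget.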

The processor is capable of delivering up to 8 GOPS at a few tens of mW, or up to 200 MOPS at 1 mW, thanks in part to its hardware 5×5 convolution. The company compared the (theoretical) performance of GAP8 against a Cortex-M7 based STM32 MCU for a CNN graph, and the new processor shows a massive advantage for that particular task.

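| Target   | Clock    | Time    | Cycles     | Active Power     |
|----------|----------|---------|------------|------------------|
| STM32 F7 | 216 MHz  | 99.1 ms | 21 405 600 | 60 mW (STM32 H7) |
| GAP8     | 15.4 MHz | 99.1 ms | 1 527 232  | 3.7 mW           |
| GAP8     | 175 MHz  | 8.7 ms  | 1 527 232  | 70 mW            |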
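
The figures in the table can be cross-checked with two simple relations: run time = cycles / clock, and energy per run = active power × run time. The short script below does just that with the numbers copied from the table. Nothing here is measured, and the STM32 power figure is the one the table attributes to an STM32 H7, so the energy comparison is only indicative.

```python
# Values copied from the comparison table; nothing here is measured.
rows = [
    # (target, clock in Hz, cycles, active power in W)
    ("STM32 F7 @ 216 MHz", 216e6, 21_405_600, 60e-3),  # power figure is for STM32 H7
    ("GAP8 @ 15.4 MHz", 15.4e6, 1_527_232, 3.7e-3),
    ("GAP8 @ 175 MHz", 175e6, 1_527_232, 70e-3),
]


def run_time_s(clock_hz, cycles):
    """Time to execute the given number of cycles at the given clock."""
    return cycles / clock_hz


def energy_j(power_w, time_s):
    """Energy spent while running at constant active power."""
    return power_w * time_s


results = {}
for name, clock_hz, cycles, power_w in rows:
    t = run_time_s(clock_hz, cycles)
    e = energy_j(power_w, t)
    results[name] = (t, e)
    print(f"{name}: {t * 1e3:.1f} ms, {e * 1e3:.2f} mJ per run")

# Ratio of cycle counts between the two architectures for the same CNN graph.
cycle_ratio = 21_405_600 / 1_527_232
print(f"cycle ratio: {cycle_ratio:.1f}x")
```

With these inputs the STM32 run works out to roughly 5.9 mJ, against about 0.37 mJ for GAP8 at 15.4 MHz and about 0.61 mJ at 175 MHz, and GAP8 needs around 14 times fewer cycles for the same graph.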

If GAP8 is configured to run at 15.4 MHz, it completes the task in the same time as the STM32 F7 while using only a fraction of the power (3.7 mW vs. 60 mW), or it can run the task over 10 times faster when clocked at 175 MHz with only slightly higher active power (70 mW). Another way to look at power consumption is the company's claim that the processor can classify a QVGA image every three minutes for 10 years on a small 3.6 Wh battery.
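
The battery claim can be turned into a rough energy budget. The sketch below assumes the full 3.6 Wh is usable and ignores battery self-discharge and regulator losses, which are my simplifications rather than anything stated by the company.

```python
# Inputs taken from the claim: 3.6 Wh battery, one classification
# every three minutes, for 10 years.
HOURS_PER_YEAR = 365.25 * 24  # assumption: average year length

battery_wh = 3.6
years = 10
interval_min = 3

battery_j = battery_wh * 3600                              # Wh -> joules
classifications = years * HOURS_PER_YEAR * 60 / interval_min
energy_per_classification_j = battery_j / classifications  # includes sleep time
average_power_w = battery_wh / (years * HOURS_PER_YEAR)

print(f"classifications over {years} years: {classifications:,.0f}")
print(f"budget per classification: {energy_per_classification_j * 1e3:.2f} mJ")
print(f"average power: {average_power_w * 1e6:.1f} uW")
```

Under these assumptions, that is about 1.75 million classifications, a budget of roughly 7.4 mJ for each three-minute window (image capture, inference, and sleep included), and an average draw of around 41 µW.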

Some typical use cases include:

  • Always-on face detection with a few mW of power
  • Indoor people counting / presence detection with years of autonomy
  • Sub-$15 machine vision and voice control solutions for consumer robotics
  • Single chip processing for 4 microphone voice capture and 10-word speaker independent keyword spotting

The company also offers a development kit consisting of the GAPDUINO board, a sensor board, and a QVGA camera module. Besides the GAP8 IoT processor, the Arduino-compatible board features the following:

  • Memory / Storage – 256 Mbit SPI flash, I2C EEPROM, HyperBus combo memory (512 Mbit flash + 64 Mbit DRAM)
  • Camera connector for external camera (e.g. Himax HM01B0)
  • USB port
  • Misc – Reset button, Configurable I/O voltage
  • Battery holder (SAF17500), DC connector
  • Arduino Uno compatible Master/Shield

GAP8 can be programmed like any MCU thanks to the GAP8 SDK, which includes:

  • The RISC-V GCC/GDB toolchain with extensions to the optimizer for the extra instructions that GreenWaves has added to GAP8
  • The MCU/Fabric Controller side tools, which include two OS choices (the company says this list will be extended in the future): PULP OS, or Arm Mbed OS (for RISC-V/GAP8)
  • Cluster side development tools – GAP8 AutoTiler to generate C code to automate the movement of data between L2 or external memory.
  • Code generators for the cluster – GAP8 Generator Library including different algorithms developed using the GAP8 AutoTiler. It includes CNN layers, FFT, Matrix Operations, FIR Filters, and more.
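
The data-movement problem the AutoTiler addresses can be illustrated with a toy example: a single-channel QVGA frame at one byte per pixel takes 76,800 bytes, which does not fit in the cluster's 64 kB shared data memory, so it has to be processed in pieces. The sketch below splits a frame into horizontal stripes that fit a given budget. It is not the AutoTiler's algorithm or its generated code; the function name, the one-byte-per-pixel format, the 32 kB budget (half the shared memory, leaving room for outputs), and the two-row halo for a 5×5 kernel are all assumptions made for illustration.

```python
def row_tiles(height, width, bytes_per_pixel, budget_bytes, halo_rows=0):
    """Split an image into horizontal stripes that each fit in budget_bytes.

    halo_rows extra rows are loaded above and below each stripe so that a
    filter can be applied at stripe borders (2 rows for a 5x5 kernel).
    Returns (first_row, last_row_exclusive) ranges of rows to load.
    Hypothetical helper, not part of the GAP8 SDK.
    """
    row_bytes = width * bytes_per_pixel
    max_rows = budget_bytes // row_bytes
    core_rows = max_rows - 2 * halo_rows
    if core_rows < 1:
        raise ValueError("budget too small for one row plus halo")
    tiles = []
    for start in range(0, height, core_rows):
        end = min(start + core_rows, height)
        # Clip the halo at the top and bottom edges of the image.
        tiles.append((max(start - halo_rows, 0), min(end + halo_rows, height)))
    return tiles


# QVGA frame, 1 byte per pixel, 32 kB budget, 2-row halo (all assumptions).
tiles = row_tiles(height=240, width=320, bytes_per_pixel=1,
                  budget_bytes=32 * 1024, halo_rows=2)
for first, last in tiles:
    print(f"load rows {first}..{last - 1} ({(last - first) * 320} bytes)")
```

With these parameters the frame is covered by three stripes, each within the 32 kB budget. A real tool also has to schedule the DMA transfers so they overlap with computation, which this sketch does not attempt.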

You can find more details about the GAP8 processor, and/or pre-order the development kit (199 Euros), scheduled to ship in April 2018, on the GreenWaves website. The company is also attending Embedded World in Germany at the RISC-V Foundation booth (Hall 3A, Booth 3A-419).

Thanks to TLS for the tip
