Principal investigators:

Benoit R. COTTEREAU (CNRS research director at IPAL and CerCo laboratories, see: https://ipal.cnrs.fr/benoit-cottereau-personal-page/) for France.

Gim Hee Lee (associate professor at the National University of Singapore, NUS) and Angela Wang (associate professor at the Singapore University of Technology and Design, SUTD) for Singapore.

Funding:

~1.8 million euros (~950 k€ for France)

Duration:

4 years (01/10/2023 – 30/09/2027)

Objectives:

The aim of this project is to take inspiration from biology to design fast, robust and energy-efficient artificial vision systems that handle motion, depth, image restoration, odometry, mapping and localization under degraded visual conditions. These systems will be based on event-based cameras and spiking neural networks (SNNs) (see Figure 1).


Figure 1: Global architecture of the proposed artificial vision systems. Visual information will be captured from event-based camera(s) placed on moving vehicles (drones, cars…) or agents. The spikes generated by these cameras will be processed by spiking neural networks (SNNs) to extract the spatio-temporal properties (optic flow, depth or semantic classes) of the surrounding environment.

Unlike classical cameras, which transmit the luminance (or RGB) values of all the pixels in the visual scene at a fixed frequency (e.g., 60 Hz), event-based cameras mimic the processing performed in biological retinas and only transmit a signal (or ‘spike’) when a change is detected in their inputs. They are thus more robust to visually degraded conditions (illumination changes, nighttime, rain, snow, …). A natural way to process their output spikes is to use spiking neural networks which, as in biological systems, only transmit information when their membrane potential reaches a firing threshold. This binary, event-driven processing makes SNNs extremely energy-efficient and easy to implement on neuromorphic chips. During the SPACE-SNN project, the proposed artificial vision systems (see Figure 1) will be developed within 4 work packages:
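To make this event-driven, threshold-based principle concrete, the short Python sketch below (a hypothetical illustration, not code from the project) integrates binned event-camera input into leaky integrate-and-fire (LIF) neurons that only emit a spike when their membrane potential crosses a threshold:

import numpy as np

# Minimal leaky integrate-and-fire (LIF) layer driven by event-camera input.
# For simplicity we assume the (x, y, polarity, timestamp) events have already
# been binned into sparse frames of shape (H, W), one frame per time step.

H, W = 64, 64          # sensor resolution (illustrative)
LEAK = 0.9             # membrane leak factor applied at every time step
THRESHOLD = 1.0        # membrane potential at which a neuron fires

def lif_step(event_frame, v):
    # Leak the potential, then integrate the new events.
    v = LEAK * v + event_frame
    spikes = v >= THRESHOLD           # binary output: fire or stay silent
    v[spikes] = 0.0                   # reset the neurons that just fired
    return spikes.astype(np.float32), v

# Example: a sparse random event frame (most pixels see no event, so most
# neurons do no work, which is where the energy savings come from).
rng = np.random.default_rng(0)
frame = (rng.random((H, W)) > 0.98).astype(np.float32)
v = np.zeros((H, W))                  # membrane potentials, one neuron per pixel
spikes, v = lif_step(frame, v)
print("output spikes:", int(spikes.sum()))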

Work Package 1 – Nighttime motion and depth estimation:

Here we will develop robust, fast and energy-efficient SNNs to estimate motion and depth under degraded visual conditions. We will first tackle the problem of flow parsing, i.e., estimating the 3D motion of elements in a scene (objects or people) from moving cameras (e.g., when the cameras are mounted on a vehicle or on a moving agent). We will subsequently develop systems for motion-in-depth extraction under visually degraded conditions. The last part of this work package will be dedicated to the detection and tracking of biological motion and to action recognition.
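As a rough, hypothetical illustration of flow parsing (not the project's actual method), the optic flow measured by a moving camera can be viewed as the sum of the flow induced by the camera's own motion and the flow caused by independently moving elements; subtracting a predicted ego-motion flow leaves a residual that highlights those elements:

import numpy as np

def parse_flow(observed_flow, ego_flow, motion_threshold=0.5):
    # observed_flow, ego_flow: arrays of shape (H, W, 2) holding (u, v) vectors.
    # The residual after removing the predicted ego-motion component is
    # attributed to independently moving objects or people.
    object_flow = observed_flow - ego_flow
    speed = np.linalg.norm(object_flow, axis=-1)
    moving_mask = speed > motion_threshold   # pixels belonging to moving elements
    return object_flow, moving_mask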

Work Package 2 – Vision restoration:

Here we will use event-based cameras and SNNs to improve the quality of visual information. We will first explore how event-based cameras and SNNs can be used for video super-resolution, i.e., to upscale and enhance low-resolution videos. Then, we will develop SNN-based approaches for nighttime light-effect suppression. Finally, we will address the reconstruction of occluded videos.

Work Package 3 – Hybrid approaches for visual SLAM:

We will develop hybrid systems based on event-based and conventional cameras for simultaneous localization and mapping (SLAM). We will first develop SNNs that process event-based data for visual scene recognition in daytime and nighttime scenes. We will subsequently extend these systems to perform SLAM. Finally, we will combine event-based and conventional cameras into hybrid SLAM systems.

Work Package 4 – Computing architecture for accelerating event-based vision:

We will develop an innovative neuromorphic architecture with the goal of accelerating event-based vision processing, maximizing computational efficiency while minimizing energy consumption. We first intend to adopt a memory-centric neuromorphic paradigm, placing the execution of the event-based vision algorithms close to or within the memory banks. We will subsequently design the architecture around a finely tuned dataflow customized for the specific demands of the vision application.

Importantly, our models will also be evaluated against biological data: their responses will be compared to those observed in neural recordings and/or behavioural experiments. This computational neuroscience approach has two main objectives. First, it will allow us to validate our SNNs as models of human visual perception (notably under degraded conditions). Second, the comparison with biology can provide leads for further improving the performance and robustness of our SNNs by refining their architectures.

This interdisciplinary project lies at the interface between AI, computational neuroscience and computer vision, and fits into IPAL theme 5 (‘Efficient AI’, see: https://ipal.cnrs.fr/research/).

Other investigators:

France: Timothée Masquelier (CNRS research director at CerCo laboratory) and Nicolas Cuperlier (associate professor at CYU Cergy Paris University).

Students:

Onur Ates (PhD student working on WP1: depth and motion estimation with event-based cameras and spiking neural networks).