On the Role of Smart Vision Sensors in Energy-Efficient Computer Vision at the Edge

Alberto Ancilotto;Francesco Paissan;Elisabetta Farella
2022-01-01

Abstract

The research community's growing focus on lightweight, small-footprint neural network models is narrowing the gap in inference performance between cluster-scale models and tiny devices. Recent work has shown that state-of-the-art performance can be achieved in several domains (e.g., sound event detection, object detection, image classification) with small-footprint, low-computational-cost architectures. However, these studies lack a comprehensive analysis of the input space used (e.g., for images) and present their results on standard RGB benchmarks. In this manuscript, we investigate the role of smart vision sensors (SVSs) in deep learning-based object detection pipelines. In particular, we combine motion bitmaps with standard color space representations (namely RGB, YUV, and grayscale) and show how SVSs can be used optimally for an IoT end-node. We report that, overall, the best-performing input space is grayscale augmented with the motion bitmap. These results are promising for real-world applications, since many SVSs provide both image formats at low power consumption.
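
To make the "grayscale augmented with the motion bitmap" input space concrete, the following is a minimal sketch of how such a two-channel input could be assembled. It is an illustration only, not the authors' pipeline: the function names, the BT.601 luma weights, and the threshold of 16 are assumptions, and the frame-differencing step is merely a software stand-in for the motion bitmap that an SVS would provide directly.

```python
import numpy as np


def rgb_to_gray(rgb):
    """Convert an HxWx3 uint8 RGB frame to an HxW uint8 grayscale frame.

    Uses BT.601 luma weights (an assumption; the abstract names no formula).
    """
    weights = np.array([0.299, 0.587, 0.114])
    gray = np.rint(rgb.astype(np.float64) @ weights)
    return np.clip(gray, 0, 255).astype(np.uint8)


def motion_bitmap(prev_gray, curr_gray, threshold=16):
    """Return an HxW binary map: 1 where the frames differ by more than threshold.

    Hypothetical software stand-in for the bitmap an SVS outputs in hardware.
    """
    diff = np.abs(curr_gray.astype(np.int16) - prev_gray.astype(np.int16))
    return (diff > threshold).astype(np.uint8)


def gray_plus_motion(prev_rgb, curr_rgb, threshold=16):
    """Stack normalized grayscale and the motion bitmap into an HxWx2 float32 array."""
    prev_gray = rgb_to_gray(prev_rgb)
    curr_gray = rgb_to_gray(curr_rgb)
    motion = motion_bitmap(prev_gray, curr_gray, threshold)
    return np.stack(
        [curr_gray.astype(np.float32) / 255.0, motion.astype(np.float32)],
        axis=-1,
    )


if __name__ == "__main__":
    prev = np.zeros((4, 4, 3), dtype=np.uint8)
    curr = prev.copy()
    curr[1, 2] = 255  # a single pixel changes between the two frames
    x = gray_plus_motion(prev, curr)
    print(x.shape, x.dtype)
```

Under these assumptions a detector would receive two channels per pixel instead of three, which is one way to read the low-cost appeal of this input space for an IoT end-node.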

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/331370