Contextual Convolutions for Scalable Forward-Only Learning on Tiny Devices
Mehdi Abbassi; Alberto Ancilotto; Elisabetta Farella
2025-01-01
Abstract
On-device training on resource-constrained hardware, such as microcontrollers with limited memory and fixed-function convolutional accelerators, remains an open challenge in embedded computer vision. Standard backpropagation is often impractical due to its high memory requirements and reliance on operations unsupported by typical inference-optimized accelerators. Recent forward-only learning methods, such as Forward-Forward and PEPITA, offer lightweight alternatives by eliminating the backward pass, enabling training on ultra-low-power devices. However, these methods tend to degrade in performance on more complex tasks involving deeper networks and larger output spaces. In this work, we introduce the Contextual Convolution Block, a novel architectural module that enhances the representational capacity of forward-only networks by injecting ground truth class information during training. This allows the network to specialize convolutional kernels for specific classes without relying on gradients or weight transport. We further present an optimized implementation of this block using an im2col-based formulation, enabling efficient training on severely constrained devices. Our method significantly improves the scalability of forward-only training approaches, achieving stronger performance on complex classification tasks while preserving compatibility with embedded hardware limitations.
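The abstract mentions an im2col-based formulation of the convolution block. As background, the im2col trick rewrites a convolution as a single matrix multiplication, which maps well onto the GEMM-style engines exposed by inference-optimized accelerators. The following is a minimal NumPy sketch of plain im2col convolution (stride 1, no padding) for illustration only; the paper's Contextual Convolution Block and its optimized on-device implementation are not reproduced here, and all function names are hypothetical.

```python
import numpy as np

def im2col(x, k):
    """Unroll every k x k receptive field of a (C, H, W) input into
    one column of a (C*k*k, out_h*out_w) matrix."""
    C, H, W = x.shape
    out_h, out_w = H - k + 1, W - k + 1
    cols = np.empty((C * k * k, out_h * out_w))
    idx = 0
    for i in range(out_h):
        for j in range(out_w):
            # Flatten the patch at (i, j) into a single column.
            cols[:, idx] = x[:, i:i + k, j:j + k].ravel()
            idx += 1
    return cols

def conv_as_gemm(x, weights):
    """Convolution via im2col: `weights` has shape (F, C, k, k).
    The whole layer reduces to one matrix multiply, the operation
    that fixed-function accelerators typically execute fastest."""
    F, C, k, _ = weights.shape
    _, H, W = x.shape
    cols = im2col(x, k)            # (C*k*k, out_h*out_w)
    w = weights.reshape(F, -1)     # (F, C*k*k)
    out = w @ cols                 # (F, out_h*out_w)
    return out.reshape(F, H - k + 1, W - k + 1)
```

The same layout also benefits forward-only training: because both the forward pass and any weight update operate on the unrolled patch matrix, no transposed-convolution or backward kernel is required.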
