Probabilistic dynamic quantization for memory constrained devices

Santini Gabriele; Paissan Francesco; Farella Elisabetta
2025-01-01

Abstract

We introduce a probabilistic dynamic quantization method for neural networks that combines the adaptability of dynamic quantization with the low memory footprint of static approaches. The method uses a surrogate probabilistic model of pre-activation statistics to estimate quantization parameters before a layer executes, enabling input-adaptive quantization without storing intermediate activations. This design reduces the working-memory overhead of conventional dynamic quantization while retaining robustness to distribution shifts. Evaluated across a diverse set of vision tasks and architectures, it retains accuracy comparable to dynamic quantization while operating at the memory cost of static quantization. Compared with conventional quantization strategies, it offers a favorable balance between accuracy and computational cost.
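
The abstract does not specify the surrogate model, so the following is only a minimal sketch of the general idea under strong assumptions: a single fully connected layer, inputs treated as independent with a shared mean and variance, and a range estimate of mean plus or minus k standard deviations. All function names and the parameter `k` are hypothetical and are not taken from the paper.

```python
import math


def input_stats(x):
    """Mean and population variance of the layer input (one cheap pass)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return mean, var


def estimate_preactivation_stats(weights, bias, x_mean, x_var):
    """Predict per-unit mean/variance of y = W x + b WITHOUT computing y.

    Assumption (illustrative only): input elements are independent and
    share the same mean and variance.
    """
    means = [x_mean * sum(row) + b for row, b in zip(weights, bias)]
    variances = [x_var * sum(w * w for w in row) for row in weights]
    return means, variances


def quant_params_from_stats(means, variances, k=3.0, num_bits=8):
    """Derive an asymmetric (scale, zero_point) from the predicted range."""
    lo = min(m - k * math.sqrt(v) for m, v in zip(means, variances))
    hi = max(m + k * math.sqrt(v) for m, v in zip(means, variances))
    lo, hi = min(lo, 0.0), max(hi, 0.0)  # keep 0.0 exactly representable
    qmax = 2 ** num_bits - 1
    scale = (hi - lo) / qmax
    if scale == 0.0:  # degenerate range: fall back to a unit scale
        scale = 1.0
    zero_point = max(0, min(qmax, round(-lo / scale)))
    return scale, zero_point


def quantize(value, scale, zero_point, num_bits=8):
    """Quantize one value; can be applied as each output is produced."""
    q = round(value / scale) + zero_point
    return max(0, min(2 ** num_bits - 1, q))
```

Because the scale and zero point are available before the layer runs, each output element can be quantized as soon as it is computed, so a full-precision copy of the layer output never needs to be buffered. That is the memory saving the abstract describes; the statistical model actually used by the authors may differ substantially from this sketch.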

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/364927