
Activation Functions with Learnable Amplitude

Trentin, Edmondo
1999

Abstract

Network training algorithms have concentrated heavily on learning connection weights. Little effort has been made to learn the amplitude of the activation functions, which defines the range of values the function can take. This paper introduces novel algorithms to learn the amplitudes of non-linear activations in layered networks, without any assumption on their analytical form. Three instances of the algorithms are developed: (i) a common amplitude is shared among all the non-linear units; (ii) each layer has its own amplitude; (iii) neuron-specific amplitudes are allowed. Experimental results validate the approach to a large extent, showing a dramatic improvement in performance over networks with fixed amplitudes.
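The abstract describes training activation amplitudes by gradient descent alongside the connection weights, with three granularities (shared, per-layer, per-neuron). Since no full text is deposited here, the following is only an illustrative sketch of the per-neuron variant (iii) using an amplitude-scaled tanh and a squared-error loss; the names (`lam`, `lr`), the loss, and the update rule are assumptions for illustration, not the paper's actual algorithm:

```python
import numpy as np

# Hypothetical sketch: a single layer computing y = lam * tanh(W x + b),
# where the per-neuron amplitudes `lam` are trained by gradient descent
# together with the weights. All hyperparameters are illustrative.
rng = np.random.default_rng(0)
n_in, n_out = 3, 2
W = rng.standard_normal((n_out, n_in)) * 0.1
b = np.zeros(n_out)
lam = np.ones(n_out)              # per-neuron amplitudes, initialised to 1

x = rng.standard_normal(n_in)
target = np.array([0.5, -0.5])
lr = 0.1

for _ in range(200):
    z = W @ x + b
    h = np.tanh(z)
    y = lam * h                   # amplitude-scaled activation
    err = y - target              # dL/dy for L = 0.5 * ||y - target||^2
    # Chain rule: dL/dlam = err * tanh(z);  dL/dz = err * lam * (1 - tanh(z)^2)
    dlam = err * h
    dz = err * lam * (1.0 - h ** 2)
    W -= lr * np.outer(dz, x)
    b -= lr * dz
    lam -= lr * dlam              # amplitudes learned like any other parameter

print(np.round(y, 3))             # output moves toward the target
```

Variants (i) and (ii) would differ only in how `lam` is shared: a single scalar for the whole network, or one scalar per layer, with the amplitude gradients summed over the units that share it.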
Files for this item:
There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/1883
Warning: the displayed data have not been validated by the institution.
