Activation Functions

Neural network nonlinearities, shown over the input range [−4, 4]
Function  Formula                  Range         Monotonic  Notes
ReLU      max(0, x)                [0, ∞)        Yes        Issue: dying neurons
GELU      x · Φ(x)                 ≈ (−0.17, ∞)  No         Used in Transformers
tanh      (eˣ − e⁻ˣ) / (eˣ + e⁻ˣ)  (−1, 1)       Yes        Zero-centered
Sigmoid   1 / (1 + e⁻ˣ)            (0, 1)        Yes        Issue: vanishing gradients
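
The four functions above can be sketched in a few lines of Python; this is a minimal scalar illustration (the function names are my own), using the exact erf-based form of GELU, x · Φ(x), rather than the common tanh approximation:

```python
import math

def relu(x):
    # max(0, x); the gradient is exactly zero for x < 0,
    # which is the source of the "dying neurons" issue.
    return max(0.0, x)

def gelu(x):
    # Exact GELU: x * Phi(x), where Phi is the standard normal CDF.
    # Non-monotonic: it dips to about -0.17 near x = -0.75.
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def tanh(x):
    # (e^x - e^-x) / (e^x + e^-x); zero-centered, range (-1, 1).
    return math.tanh(x)

def sigmoid(x):
    # 1 / (1 + e^-x); range (0, 1), saturates at both ends,
    # which causes vanishing gradients for large |x|.
    return 1.0 / (1.0 + math.exp(-x))
```

Evaluating these over the plotted input range [−4, 4] shows the table's properties directly: relu and sigmoid never go negative, tanh is symmetric about zero, and gelu briefly dips below zero before tracking x for large inputs.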