- categories: Data Science, Definition
Definition:
The sigmoid (logistic) function is defined as:
$$\sigma(x) = \frac{1}{1 + e^{-x}}$$
Key Properties:
- Range: $\sigma(x) \in (0, 1)$.
- Derivative: $\sigma'(x) = \sigma(x)\,(1 - \sigma(x))$.
  Peaks at $x = 0$, where $\sigma'(0) = \tfrac{1}{4}$, and decreases symmetrically as $|x|$ increases.
- Monotonicity: $\sigma(x)$ is monotonically increasing for all $x \in \mathbb{R}$.
- Asymptotes: $\sigma(x) \to 1$ as $x \to +\infty$ and $\sigma(x) \to 0$ as $x \to -\infty$.
- Symmetry: $\sigma(-x) = 1 - \sigma(x)$.
  This symmetry makes it useful in probabilistic models.
- Relationship to Log-Odds:
  If $p = \sigma(x)$, then:
  $$x = \log\left(\frac{p}{1 - p}\right)$$
  where $x$ represents the log-odds of $p$.
- Saturation: When $|x|$ is large, $\sigma'(x)$ is close to zero, so gradients vanish, which can slow training in deep networks.
Connection to the Softplus Function:
The softplus function is defined as:
$$\text{softplus}(x) = \log(1 + e^{x})$$
- Gradient Connection: The sigmoid function is the derivative of the softplus function:
$$\frac{d}{dx}\,\text{softplus}(x) = \sigma(x)$$
- Softplus Approximation to ReLU:
  The softplus function is a smooth approximation of the ReLU function $\max(0, x)$, while the sigmoid is more tightly linked to probabilities.
- Range vs. Output Behavior:
  - $\sigma(x)$ maps $\mathbb{R}$ to $(0, 1)$, suited for probabilities.
  - $\text{softplus}(x)$ maps $\mathbb{R}$ to $(0, \infty)$, suited for non-negative outputs (e.g., certain loss functions).
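Both the gradient connection and the ReLU approximation can be verified directly. A minimal sketch (function names and test points are illustrative):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softplus(x):
    """softplus(x) = log(1 + e^x), computed via log1p for accuracy."""
    return math.log1p(math.exp(x))

# Gradient connection: d/dx softplus(x) = sigmoid(x),
# checked by a central finite difference.
x, h = 0.5, 1e-6
numeric = (softplus(x + h) - softplus(x - h)) / (2 * h)
assert abs(numeric - sigmoid(x)) < 1e-8

# Smooth approximation of ReLU: far from zero,
# softplus(x) is very close to max(0, x).
for x in (-20.0, 20.0):
    assert abs(softplus(x) - max(0.0, x)) < 1e-6
```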
Applications Comparison:
- Sigmoid is used for probabilistic interpretations and outputs.
- Softplus is used in settings requiring smooth, non-negative activations, such as Poisson regression models.
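As a sketch of this comparison, the same raw model score can be passed through either activation depending on what the output must represent (the score `z = 0.7` and variable names are hypothetical, for illustration only):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softplus(x):
    return math.log1p(math.exp(x))

z = 0.7  # a raw linear-model score ("logit")

p = sigmoid(z)     # in (0, 1): a probability, e.g. for binary classification
lam = softplus(z)  # in (0, inf): a valid rate, e.g. for a Poisson regression model
```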