Shannon Entropy

Definition:
Shannon entropy quantifies the uncertainty in a discrete probability distribution. For a random variable X with possible outcomes x_1, …, x_n and corresponding probabilities p_1, …, p_n, the entropy is defined as:

    H(X) = −∑ p_i log₂ p_i    (sum over i = 1, …, n)

With the base-2 logarithm, entropy is measured in bits.
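
As a quick illustration of the formula, here is a minimal sketch that computes H(X) directly from a list of probabilities (pure Python; the helper name shannon_entropy is just a label chosen here, not from the source):

```python
import math

def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution given as a list of probabilities."""
    # Outcomes with p = 0 contribute nothing (the limit of p * log p as p -> 0 is 0).
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Example: a fair six-sided die, maximum uncertainty over 6 outcomes.
print(shannon_entropy([1/6] * 6))   # ≈ 2.585 bits, i.e. log2(6)
```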

Intuition:
Entropy measures the “average information content” or uncertainty of a random variable.

  • High entropy means more unpredictability (e.g., uniform distribution).
  • Low entropy means less unpredictability (e.g., highly skewed distribution); see the numeric comparison below.
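
For a concrete comparison, this small sketch reuses the shannon_entropy helper defined above on a uniform and a highly skewed four-outcome distribution (the specific numbers are arbitrary examples):

```python
uniform = [0.25, 0.25, 0.25, 0.25]    # every outcome equally likely
skewed  = [0.97, 0.01, 0.01, 0.01]    # one outcome dominates

print(shannon_entropy(uniform))   # 2.0 bits = log2(4): maximum for 4 outcomes
print(shannon_entropy(skewed))    # ≈ 0.24 bits: nearly predictable
```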

Key Properties:

  1. Range: 0 ≤ H(X) ≤ log₂ n

    • H(X) = 0 if one outcome has probability 1 (perfect certainty).
    • H(X) = log₂ n for a uniform distribution over n outcomes (maximum uncertainty).
  2. Additivity for Independent Variables:
    If X and Y are independent: H(X, Y) = H(X) + H(Y) (see the sketch after this list).
  3. Non-Negativity:
    H(X) ≥ 0 for every discrete distribution.
  4. Invariance Under Reordering:
    The entropy value is unaffected by the ordering of probabilities.
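
These properties are easy to check numerically. A minimal sketch, again reusing the shannon_entropy helper from the definition above (the distributions are arbitrary examples); under independence the joint distribution is the product of the marginals:

```python
import math

p_x = [0.5, 0.3, 0.2]    # distribution of X
p_y = [0.7, 0.3]         # distribution of Y, independent of X

# Joint distribution under independence: p(x, y) = p(x) * p(y) for every pair.
p_xy = [px * py for px in p_x for py in p_y]

h_x, h_y, h_xy = (shannon_entropy(p) for p in (p_x, p_y, p_xy))

print(0 <= h_x <= math.log2(len(p_x)))           # True: non-negativity and range
print(math.isclose(h_xy, h_x + h_y))             # True: additivity for independent variables
print(math.isclose(h_x, shannon_entropy(list(reversed(p_x)))))  # True: invariance under reordering
```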

Special Cases:

  • For a binary variable with outcome probabilities p and 1 − p:

    H(p) = −p log₂ p − (1 − p) log₂ (1 − p)
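
A small sketch of this binary entropy function (the name binary_entropy is illustrative), showing that it peaks at p = 0.5 and is symmetric in p and 1 − p:

```python
import math

def binary_entropy(p):
    """Entropy, in bits, of a binary variable with outcome probabilities p and 1 - p."""
    if p in (0.0, 1.0):
        return 0.0    # perfect certainty at the endpoints
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.1, 0.5, 0.9):
    print(p, round(binary_entropy(p), 3))   # 0.469, 1.0, 0.469
```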

Relation to Other Concepts:
  1. Cross-Entropy: For distributions P and Q, the cross-entropy is H(P, Q) = −∑ p_i log₂ q_i; Shannon entropy is the special case obtained by comparing a distribution to itself, H(P, P) = H(P).
  2. Kullback-Leibler Divergence: Measures the difference between two distributions P and Q and relates to entropy via D_KL(P ‖ Q) = H(P, Q) − H(P).
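
The identity D_KL(P ‖ Q) = H(P, Q) − H(P) can be verified directly. A short sketch with illustrative helper names (it assumes Q assigns nonzero probability wherever P does; outcomes with p_i = 0 are skipped by convention):

```python
import math

def entropy(p):
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    # Expected bits when encoding samples from p with a code built for q.
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

P = [0.6, 0.3, 0.1]
Q = [0.5, 0.25, 0.25]

kl = cross_entropy(P, Q) - entropy(P)                    # D_KL(P || Q) = H(P, Q) - H(P)
print(round(kl, 4))                                      # ≈ 0.1045 bits; 0 only when P == Q
print(math.isclose(cross_entropy(P, P), entropy(P)))     # True: H(P, P) = H(P)
```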