Definition:
Maximum Likelihood Estimation (MLE) is a statistical method for estimating the parameters of a probabilistic model. MLE seeks the parameter values that maximize the likelihood of the observed data under the model.

Let $x_1, x_2, \ldots, x_n$ be the observed data, assumed independent and identically distributed, and let the model's probability density function or mass function be $f(x \mid \theta)$, where $\theta$ denotes the parameters to be estimated. The likelihood function is:

$$L(\theta) = \prod_{i=1}^{n} f(x_i \mid \theta)$$

The MLE $\hat{\theta}$ maximizes $L(\theta)$ with respect to $\theta$:

$$\hat{\theta} = \arg\max_{\theta} L(\theta)$$

Log-Likelihood:
Since the likelihood is a product, it is often more convenient to maximize the log-likelihood:

$$\ell(\theta) = \log L(\theta) = \sum_{i=1}^{n} \log f(x_i \mid \theta)$$

Because $\log$ is strictly increasing, maximizing $\ell(\theta)$ is equivalent to maximizing $L(\theta)$:

$$\hat{\theta} = \arg\max_{\theta} \ell(\theta)$$

Steps to Compute MLE:

  1. Write down the likelihood function or log-likelihood for the model.
  2. Differentiate $\ell(\theta)$ with respect to $\theta$ and set the derivative to zero to find critical points: $\frac{\partial \ell(\theta)}{\partial \theta} = 0$.
  3. Solve for $\hat{\theta}$, and verify it is a maximum (e.g., using the second derivative test or inspecting behavior at the boundaries). A minimal numerical sketch of these steps is given below.
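
To make these steps concrete, here is a minimal numerical sketch in Python (assuming NumPy and SciPy are available); the exponential model with rate $\lambda$ and the simulated data are used purely for illustration:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Step 1: log-likelihood for an exponential model with rate lam:
#   l(lam) = n * log(lam) - lam * sum(x_i)
def log_likelihood(lam, x):
    return len(x) * np.log(lam) - lam * np.sum(x)

# Simulated data (in practice, x is the observed sample).
rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=1_000)  # true rate = 1 / scale = 0.5

# Steps 2-3: maximize l(lam) numerically (equivalently, minimize -l(lam)).
result = minimize_scalar(lambda lam: -log_likelihood(lam, x),
                         bounds=(1e-6, 100.0), method="bounded")
mle_numeric = result.x

# For this model the critical-point equation has the closed-form solution
# lambda_hat = n / sum(x_i) = 1 / mean(x), which the numeric result should match.
mle_closed_form = 1.0 / np.mean(x)

print(mle_numeric, mle_closed_form)  # both close to 0.5
```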

Examples:

  1. Bernoulli Distribution:
    Observations $x_1, \ldots, x_n$, where $x_i \in \{0, 1\}$, and $P(x_i = 1) = p$.

    • Likelihood: $L(p) = \prod_{i=1}^{n} p^{x_i} (1 - p)^{1 - x_i}$
    • Log-likelihood: $\ell(p) = \sum_{i=1}^{n} \left[ x_i \log p + (1 - x_i) \log(1 - p) \right]$
    • Derivative: $\frac{d\ell}{dp} = \frac{\sum_i x_i}{p} - \frac{n - \sum_i x_i}{1 - p} = 0$
    • Solving gives: $\hat{p} = \frac{1}{n} \sum_{i=1}^{n} x_i$

      (The sample mean is the MLE for $p$; a code sketch for this example follows the list.)
  2. Normal Distribution:
    Observations $x_1, \ldots, x_n$, with $x_i \sim \mathcal{N}(\mu, \sigma^2)$.

    • Likelihood: $L(\mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x_i - \mu)^2}{2\sigma^2} \right)$
    • Log-likelihood: $\ell(\mu, \sigma^2) = -\frac{n}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2$
    • Derivatives and solutions (see the code sketch after the list): $\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} x_i, \quad \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{\mu})^2$
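
For the Bernoulli example, here is a minimal Python sketch (assuming NumPy; the data are simulated with a true $p$ of 0.3 purely for illustration). It computes the closed-form MLE $\hat{p}$ (the sample mean) and checks that no value of $p$ on a grid attains a higher log-likelihood:

```python
import numpy as np

# Simulated Bernoulli observations (in practice, x is the observed sample).
rng = np.random.default_rng(1)
x = rng.binomial(n=1, p=0.3, size=500)

# Closed-form MLE: the sample mean.
p_hat = x.mean()

# Bernoulli log-likelihood for a candidate parameter p.
def log_likelihood(p, x):
    return np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

# Sanity check: p_hat should score at least as well as any p on a grid.
grid = np.linspace(0.01, 0.99, 99)
assert log_likelihood(p_hat, x) >= max(log_likelihood(p, x) for p in grid)

print(p_hat)  # close to the true value 0.3
```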

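For the normal example, a similarly minimal sketch (again assuming NumPy and simulated data) computes the closed-form MLEs directly; note that $\hat{\sigma}^2$ uses the $1/n$ (not $1/(n-1)$) form:

```python
import numpy as np

# Simulated normal observations (in practice, x is the observed sample).
rng = np.random.default_rng(2)
x = rng.normal(loc=5.0, scale=2.0, size=1_000)

# Closed-form MLEs: the sample mean and the 1/n sample variance.
mu_hat = x.mean()
sigma2_hat = np.mean((x - mu_hat) ** 2)  # same as np.var(x) with ddof=0

print(mu_hat, sigma2_hat)  # close to 5.0 and 4.0 (sigma = 2.0)
```
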
Properties of MLE:

  1. Consistency:
    $\hat{\theta}_n$ converges in probability to the true parameter $\theta_0$ as $n \to \infty$.

  2. Asymptotic Normality:
    For large $n$, the MLE $\hat{\theta}_n$ is approximately normally distributed (illustrated empirically in the sketch after this list):

    $$\sqrt{n}\,(\hat{\theta}_n - \theta_0) \xrightarrow{d} \mathcal{N}\left(0,\ I(\theta_0)^{-1}\right)$$

    where $I(\theta_0)$ is the Fisher information.

  3. Efficiency:
    MLE achieves the Cramér-Rao lower bound asymptotically, making it an efficient estimator under regularity conditions.
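
As a small empirical illustration of asymptotic normality, the following Python sketch (assuming NumPy) uses the Bernoulli MLE, for which the Fisher information has the closed form $I(p) = 1 / (p(1 - p))$, so $\sqrt{n}\,(\hat{p}_n - p_0)$ should be approximately $\mathcal{N}(0,\ p_0(1 - p_0))$; the true $p_0$ and sample sizes are chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
p0, n, reps = 0.3, 1_000, 2_000

# Each replication: draw n Bernoulli(p0) observations and compute the MLE (sample mean).
p_hats = rng.binomial(n=1, p=p0, size=(reps, n)).mean(axis=1)

# Scaled estimation errors should look approximately N(0, p0 * (1 - p0)).
scaled_errors = np.sqrt(n) * (p_hats - p0)

print(scaled_errors.std())       # close to sqrt(p0 * (1 - p0)) ≈ 0.458
print(np.sqrt(p0 * (1 - p0)))
```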

Applications:

  • Parameter estimation for probability distributions.
  • Training models in machine learning (e.g., Logistic Regression, Gaussian Mixture Models).
  • Hypothesis testing (e.g., likelihood ratio tests).