Definition

Matrix completion is the problem of recovering a full matrix from a subset of its observed entries. The task is to fill in the missing entries of a matrix while satisfying certain assumptions, such as low rank or sparsity.

Formally, let be the matrix to be completed, and denote the set of observed indices:

The goal is to reconstruct such that:

  1. matches the observed values for , and
  2. satisfies additional assumptions (e.g., low rank).

Applications

  1. [[Recommender Systems:
    Filling in missing entries in user-item rating matrices (e.g., Netflix or Spotify recommendations).

  2. Image Processing:
    Recovering corrupted images by completing the pixel intensity matrix.

  3. Sensor Networks:
    Estimating missing sensor measurements using spatial or temporal correlations.

  4. Collaborative Filtering:
    Predicting preferences in social networks.

Key Assumptions

  1. Low Rank:
    Many matrix completion problems assume that the underlying matrix has a low rank. This means:

  2. Sufficient Observations:
    For exact recovery, the number of observed entries must satisfy:

  3. Incoherence:
    The observed entries should be uniformly distributed across the matrix.


Mathematical Formulation

Optimization Problem

The matrix completion problem is often posed as:

where is the observed matrix with entries .

Relaxation Using Nuclear Norm

Since rank minimization is NP-hard, it is relaxed using the nuclear norm :

Here, is the sum of singular values of .


Algorithms

1. Singular Value Thresholding (SVT)

  • Iteratively approximates the matrix by applying singular value decomposition (SVD) and thresholding the singular values.
  • Steps:
    1. Initialize a guess for .
    2. Perform SVD: .
    3. Threshold the singular values in .
    4. Update to satisfy observed constraints.

2. Alternating Least Squares (ALS)

  • Assumes , where and .
  • Iteratively solves:

where projects onto the observed entries.

3. Gradient Descent

  • Optimizes a loss function directly:
  • Uses gradient-based methods to minimize the objective.

4. Probabilistic Matrix Factorization (PMF)

  • Models the matrix as a probabilistic model and uses Bayesian inference for completion.