Matrix Completion

categories: Linear algebra, Method, Problem

Definition

Matrix completion is the problem of recovering a full matrix from a subset of its observed entries. Without further constraints the missing entries could take any value, so the task is made well posed by assuming structure in the full matrix, such as low rank or sparsity.

Formally, let $M \in \mathbb{R}^{m \times n}$ be the matrix to be completed, and let $Ω$ denote the set of observed indices:

$Ω = \{ (i, j) ∣ M_{ij} \text{ is observed} \} .$

The goal is to reconstruct $M$ such that:

$M_{ij}$ matches the observed values for $(i, j) \in Ω$ , and
$M$ satisfies additional assumptions (e.g., low rank).
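
As a concrete illustration of this setup, the sketch below builds a small low-rank matrix and a random observation set with NumPy. The sizes, the 60% sampling rate, and the names `mask`, `omega` and `O` are assumptions chosen for the example, not part of the definition.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 6, 5, 2  # illustrative sizes (assumed)

# Ground-truth matrix M = A @ B.T, which has rank at most r.
A = rng.standard_normal((m, r))
B = rng.standard_normal((n, r))
M = A @ B.T

# Boolean mask: True where (i, j) belongs to Omega, i.e. the entry is observed.
mask = rng.random((m, n)) < 0.6

# Observed matrix: known entries are kept, unknown entries are marked as NaN.
O = np.where(mask, M, np.nan)

# Omega written out as an explicit list of index pairs.
omega = list(zip(*np.nonzero(mask)))
```

A completion method receives only `O` and `mask` and has to produce a full matrix that agrees with `O` on `omega`.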

Applications

Recommender Systems:
Filling in missing entries in user-item rating matrices (e.g., Netflix or Spotify recommendations).
Image Processing:
Recovering corrupted images by completing the pixel intensity matrix.
Sensor Networks:
Estimating missing sensor measurements using spatial or temporal correlations.
Collaborative Filtering:
Predicting preferences in social networks.

Key Assumptions

Low Rank:
Many matrix completion problems assume that the underlying matrix $M$ has a low rank. This means:
$\operatorname{rank} (M) \ll \min (m, n) .$
Sufficient Observations:
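For exact recovery of a rank-$r$ matrix, the number of observed entries must be at least on the order of its degrees of freedom, $r (m + n - r)$. Typical guarantees require, up to logarithmic factors:
$∣Ω∣ \gtrsim r (m + n) .$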
Incoherence:
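The singular vectors of $M$ should not be concentrated on a few rows or columns, so that every entry carries some information about the whole matrix. In addition, the observed entries are usually assumed to be sampled uniformly at random across the matrix.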
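
To give a feel for the sample-count assumption, the following sketch counts the degrees of freedom of a rank-$r$ matrix, $r (m + n - r)$, and compares it with the total number of entries. The function name `degrees_of_freedom` and the sizes are assumptions made for illustration.

```python
def degrees_of_freedom(m: int, n: int, r: int) -> int:
    """Number of free parameters of an m x n matrix of rank r."""
    return r * (m + n - r)

# Illustrative sizes (assumed): a 1000 x 1000 matrix of rank 10.
m, n, r = 1000, 1000, 10
dof = degrees_of_freedom(m, n, r)   # 10 * (1000 + 1000 - 10) = 19900
total = m * n                       # 1,000,000 entries in the full matrix
fraction = dof / total              # 0.0199, i.e. about 2% of the entries
```

The count is only a lower bound on how many entries are needed. Recovery guarantees usually ask for more observations than this and also depend on the incoherence and sampling assumptions.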

Mathematical Formulation

Optimization Problem

The matrix completion problem is often posed as:

$\text{minimize} \ \operatorname{rank} (M) \quad \text{subject to} \quad M_{ij} = O_{ij}, \ \forall (i, j) \in Ω,$

where $O$ is the observed matrix with entries $O_{ij}$ .

Relaxation Using Nuclear Norm

Since rank minimization is NP-hard in general, it is commonly replaced by a convex relaxation that minimizes the nuclear norm $∥ M ∥_{*}$ instead:

$\text{minimize} \ ∥ M ∥_{*} \quad \text{subject to} \quad M_{ij} = O_{ij}, \ \forall (i, j) \in Ω.$

Here, $∥ M ∥_{*}$ is the sum of singular values of $M$ .
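
The nuclear norm can be checked numerically on a small example. The sketch below uses a rank-1 matrix whose only nonzero singular value can be computed by hand; the matrix itself is an assumption chosen for the illustration.

```python
import numpy as np

# Rank-1 matrix X = u v^T with ||u|| = 3 and ||v|| = 5,
# so its only nonzero singular value is 3 * 5 = 15.
u = np.array([1.0, 2.0, 2.0])
v = np.array([3.0, 4.0])
X = np.outer(u, v)

singular_values = np.linalg.svd(X, compute_uv=False)
nuclear_from_svd = singular_values.sum()        # sum of singular values
nuclear_builtin = np.linalg.norm(X, ord="nuc")  # NumPy's nuclear norm
rank = np.linalg.matrix_rank(X)                 # number of nonzero singular values
```

The rank counts the nonzero singular values while the nuclear norm sums them, which is the sense in which the nuclear norm acts as a convex surrogate for the rank.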

Algorithms

1. Singular Value Thresholding (SVT)

Iteratively approximates the matrix by computing a singular value decomposition (SVD) and soft-thresholding the singular values.
Steps:
1. Initialize a guess for $M$ .
2. Perform SVD: $M_{k} = U_{k} Σ_{k} V_{k}^{T}$ .
3. Soft-threshold the singular values in $Σ_{k}$ : subtract a threshold $τ$ from each and set negative results to zero.
4. Update $M$ so that it agrees with the observed entries, then repeat from step 2 until convergence.
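
The four steps listed above can be sketched in NumPy as follows. This is a minimal illustrative variant that re-imposes the observed entries directly in each iteration; the SVT algorithm as published uses a somewhat different update with a step size. The function names, the threshold `tau`, the iteration count and the synthetic test data are all assumptions.

```python
import numpy as np

def soft_threshold_svd(Z, tau):
    """Shrink the singular values of Z by tau, clipping at zero."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    s = np.maximum(s - tau, 0.0)
    return (U * s) @ Vt

def complete_by_thresholding(O, mask, tau=0.1, n_iters=500):
    X = np.zeros_like(O)                    # step 1: initial guess
    for _ in range(n_iters):
        Z = np.where(mask, O, X)            # step 4: re-impose observed entries
        X = soft_threshold_svd(Z, tau)      # steps 2-3: SVD and thresholding
    return X

# Synthetic rank-2 test problem (assumed sizes and sampling rate).
rng = np.random.default_rng(0)
M_true = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
mask = rng.random(M_true.shape) < 0.6
O = np.where(mask, M_true, 0.0)             # unobserved entries are never read

X_hat = complete_by_thresholding(O, mask)
rel_err = np.linalg.norm(X_hat - M_true) / np.linalg.norm(M_true)
```

Each iteration costs one SVD, which is the dominant expense for large matrices; practical implementations typically compute only the leading singular values.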

2. Alternating Least Squares (ALS)

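Assumes $M = U V^{T}$ , where $U \in \mathbb{R}^{m \times r}$ and $V \in \mathbb{R}^{n \times r}$ .
Iteratively solves the following two least-squares problems, holding the other factor fixed in each:

$U \leftarrow \arg\min_{U} ∥ P_{Ω} (U V^{T} - O) ∥_{F}^{2}, \qquad V \leftarrow \arg\min_{V} ∥ P_{Ω} (U V^{T} - O) ∥_{F}^{2},$

where $P_{Ω}$ keeps the observed entries and sets all others to zero.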
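
A minimal sketch of these alternating updates is shown below. With one factor fixed, each row of the other factor solves its own small least-squares problem over the entries observed in that row or column. The small ridge term `reg` is an assumption added only to keep the linear systems well conditioned; the function name and test data are likewise illustrative.

```python
import numpy as np

def als_complete(O, mask, r, n_iters=50, reg=1e-3, seed=0):
    m, n = O.shape
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((m, r))
    V = rng.standard_normal((n, r))
    ridge = reg * np.eye(r)  # assumed stabilizer, not part of the plain objective
    for _ in range(n_iters):
        # Fix V: update each row of U from the entries observed in row i.
        for i in range(m):
            Vi = V[mask[i]]
            U[i] = np.linalg.solve(Vi.T @ Vi + ridge, Vi.T @ O[i, mask[i]])
        # Fix U: update each row of V from the entries observed in column j.
        for j in range(n):
            Uj = U[mask[:, j]]
            V[j] = np.linalg.solve(Uj.T @ Uj + ridge, Uj.T @ O[mask[:, j], j])
    return U, V

# Synthetic rank-2 test problem (assumed sizes and sampling rate).
rng = np.random.default_rng(1)
M_true = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
mask = rng.random(M_true.shape) < 0.6
O = np.where(mask, M_true, 0.0)

U, V = als_complete(O, mask, r=2)
rel_err = np.linalg.norm(U @ V.T - M_true) / np.linalg.norm(M_true)
```

Unlike the nuclear norm formulation, this approach needs the target rank `r` as an input, and the problem is not convex in `U` and `V` jointly, so the result can depend on the initialization.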

3. Gradient Descent

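Optimizes a regularized loss function directly: $\min_{M} \frac{1}{2} ∥ P_{Ω} (M - O) ∥_{F}^{2} + λ ∥ M ∥_{*} .$
The squared-error term has gradient $P_{Ω} (M - O)$ , but the nuclear norm is not differentiable, so in practice the objective is minimized with proximal gradient steps (a gradient step on the first term followed by singular value soft-thresholding) or with plain gradient descent on a factorized form $M = U V^{T}$ .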
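
One way to minimize this objective is proximal gradient descent, sketched below: a gradient step on the squared-error term followed by soft-thresholding of the singular values, which handles the non-smooth nuclear norm term. The function names, `lam`, `step` and the test data are assumptions; the step size of 1 is chosen because the gradient of the squared-error term is 1-Lipschitz.

```python
import numpy as np

def objective(X, O, mask, lam):
    resid = np.where(mask, X - O, 0.0)
    return 0.5 * np.sum(resid ** 2) + lam * np.linalg.norm(X, ord="nuc")

def prox_nuclear(Z, tau):
    """Proximal operator of tau * nuclear norm: soft-threshold singular values."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def proximal_gradient(O, mask, lam=0.5, step=1.0, n_iters=100):
    X = np.zeros_like(O)
    history = [objective(X, O, mask, lam)]
    for _ in range(n_iters):
        grad = np.where(mask, X - O, 0.0)              # gradient of the smooth term
        X = prox_nuclear(X - step * grad, step * lam)  # proximal step
        history.append(objective(X, O, mask, lam))
    return X, history

# Synthetic rank-2 test problem (assumed sizes and sampling rate).
rng = np.random.default_rng(2)
M_true = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
mask = rng.random(M_true.shape) < 0.6
O = np.where(mask, M_true, 0.0)

X_hat, history = proximal_gradient(O, mask)
```

Tracking `history` is a cheap sanity check: with a step size no larger than 1 the objective should not increase from one iteration to the next.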

4. Probabilistic Matrix Factorization (PMF)

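Models the observed entries as noisy observations of a low-rank product $U V^{T}$ (typically with Gaussian noise and Gaussian priors on the factors $U$ and $V$ ) and completes the matrix by Bayesian inference, for example by maximum a posteriori (MAP) estimation of the factors.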
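
Under the common assumption of Gaussian noise and zero-mean Gaussian priors on the factors, the MAP estimate reduces to a squared-error loss on the observed entries plus L2 penalties on $U$ and $V$. The sketch below minimizes that loss with full-batch gradient descent. It is an illustration under those assumptions rather than a full Bayesian treatment, and the function name and hyperparameters `lam`, `lr` and `n_iters` are hypothetical choices.

```python
import numpy as np

def pmf_map(O, mask, r, lam=0.01, lr=0.005, n_iters=2000, seed=0):
    rng = np.random.default_rng(seed)
    m, n = O.shape
    U = 0.1 * rng.standard_normal((m, r))
    V = 0.1 * rng.standard_normal((n, r))
    losses = []
    for _ in range(n_iters):
        R = np.where(mask, U @ V.T - O, 0.0)   # residual on observed entries only
        # Negative log-posterior up to constants: data fit plus Gaussian priors.
        losses.append(0.5 * np.sum(R ** 2)
                      + 0.5 * lam * (np.sum(U ** 2) + np.sum(V ** 2)))
        grad_U = R @ V + lam * U
        grad_V = R.T @ U + lam * V
        U -= lr * grad_U
        V -= lr * grad_V
    return U, V, losses

# Synthetic rank-2 test problem (assumed sizes and sampling rate).
rng = np.random.default_rng(3)
M_true = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
mask = rng.random(M_true.shape) < 0.6
O = np.where(mask, M_true, 0.0)

U, V, losses = pmf_map(O, mask, r=2)
```

Here `lam` plays the role of the ratio between the noise variance and the prior variance of the factors, so a larger value corresponds to a stronger prior.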

Evgeny's Notes
