Definition

Column-Row Factorization (CR) refers to the representation of a matrix as the product of two lower-rank matrices, emphasizing the column and row structures of . Specifically, can be expressed as:

where:

  • contains selected columns of .
  • contains coefficients encoding the linear combination of rows of to reconstruct .
    Here, is the rank of or an approximation to it.

Intuition

CR factorization separates a matrix into:

  1. A subset of its columns (), capturing essential structural information.
  2. A matrix () that linearly combines these columns to form the rows of .

This factorization is particularly useful when the matrix is rank-deficient or approximately low-rank, enabling efficient storage and computations.

Key Properties

  1. Rank Preservation: If has Rank , the product also has rank .
  2. Efficient Approximation: By selecting , CR factorization can approximate with reduced storage.
  3. Column-Row Interpretation:
    • provides representative columns.
    • encodes how the rows of can be reconstructed using these columns.
  4. Existence: Exists for any matrix ; can be chosen using pivoted column selection or other optimization methods like QR decomposition.

Applications

  1. Dimensionality Reduction: Provides a compact representation of data matrices for feature selection and compression.
  2. Low-Rank Matrix Approximation: Useful in scenarios like PCA, where approximate low-rank structures are desired.
  3. Computational Efficiency: Reduces the cost of operations on large matrices, particularly in numerical linear algebra and machine learning tasks.
  4. Data Reconstruction: In applications like collaborative filtering, CR factorization aids in reconstructing missing entries in matrices.
  5. Interpretability: The use of actual columns and rows from the data makes the factorization interpretable.