- categories: Data Science, Method
Definition:
Linear regression models the relationship between a dependent variable and one or more independent variables. For $n$ data points, the model predicts:
$$\hat{y} = X\beta$$
where:
- $X \in \mathbb{R}^{n \times (p+1)}$ is the design matrix of features (including a column of ones for the intercept),
- $\beta \in \mathbb{R}^{p+1}$ is the vector of model parameters,
- $\hat{y} \in \mathbb{R}^{n}$ is the vector of predictions.
The goal is to minimize the residual sum of squares (RSS):
$$\text{RSS}(\beta) = \|y - X\beta\|^2 = (y - X\beta)^T (y - X\beta)$$
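As a concrete reading of these definitions, here is a minimal NumPy sketch (the arrays and parameter values are illustrative assumptions, not data from this note) that forms the predictions $\hat{y} = X\beta$ and the RSS:

```python
import numpy as np

X = np.array([[1.0, 0.5],
              [1.0, 1.5],
              [1.0, 2.5]])      # design matrix: intercept column, then the feature
beta = np.array([0.2, 1.1])     # illustrative parameter vector
y = np.array([0.8, 1.9, 3.1])   # observed targets

y_hat = X @ beta                # predictions y_hat = X beta
rss = np.sum((y - y_hat) ** 2)  # residual sum of squares
```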
Normal Equations:
The solution to minimizing $\text{RSS}(\beta)$ is obtained by solving the normal equations:
$$X^T X \beta = X^T y$$
The closed-form solution for $\beta$ is:
$$\hat{\beta} = (X^T X)^{-1} X^T y$$
(if $X^T X$ is invertible).
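A minimal sketch of this closed form in NumPy (the function and variable names are illustrative). Solving the linear system directly is cheaper and numerically safer than forming $(X^T X)^{-1}$ explicitly:

```python
import numpy as np

def fit_ols(X, y):
    """Solve the normal equations X^T X beta = X^T y for beta.

    Uses np.linalg.solve on the system rather than inverting
    X^T X explicitly.
    """
    XtX = X.T @ X
    Xty = X.T @ y
    return np.linalg.solve(XtX, Xty)
```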
Derivation:
- Start with the loss function:
  $$L(\beta) = \|y - X\beta\|^2$$
- Expand $L(\beta)$:
  $$L(\beta) = y^T y - 2\beta^T X^T y + \beta^T X^T X \beta$$
- Take the gradient with respect to $\beta$:
  $$\nabla_\beta L = -2X^T y + 2X^T X \beta$$
- Set $\nabla_\beta L = 0$:
  $$X^T X \beta = X^T y$$
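The gradient step can be sanity-checked numerically. Below is a minimal sketch (NumPy, with randomly generated illustrative data) comparing the analytic gradient above against a central finite difference:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.normal(size=20)
beta = rng.normal(size=3)

# Analytic gradient from the derivation: -2 X^T y + 2 X^T X beta
grad = -2 * X.T @ y + 2 * X.T @ X @ beta

def loss(b):
    r = y - X @ b
    return r @ r   # ||y - Xb||^2

# Central finite difference along the first coordinate
eps = 1e-6
e0 = np.zeros(3)
e0[0] = 1.0
fd = (loss(beta + eps * e0) - loss(beta - eps * e0)) / (2 * eps)
print(np.isclose(grad[0], fd))  # True, up to floating-point error
```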
Key Properties:
- Existence of Solution:
  - If $X^T X$ is invertible, $X^T X \beta = X^T y$ has a unique solution.
  - If $X^T X$ is singular, regularization (e.g., Ridge Regression) may be used; see the sketch after this list.
- Geometric Interpretation:
  The normal equations project $y$ onto the Column Space of $X$, yielding the best linear approximation of $y$.
- Computational Complexity:
  Solving via the normal equations involves matrix inversion, with complexity $O(p^3)$ in the number of features $p$. Gradient-based methods or QR decomposition can be more efficient (and more numerically stable) for large problems; see the sketch after this list.
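Both points can be illustrated in a few lines. The following sketch (NumPy; the data and the regularization strength `lam` are illustrative assumptions) solves least squares via a reduced QR factorization without forming $X^T X$, and shows the ridge system, which is invertible for any $\lambda > 0$:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
y = rng.normal(size=100)

# QR-based least squares: X = QR, then solve R beta = Q^T y.
# Avoids forming X^T X, which squares the condition number.
Q, R = np.linalg.qr(X)                 # reduced QR: Q is 100x5, R is 5x5
beta_qr = np.linalg.solve(R, Q.T @ y)

# Ridge regression: X^T X + lam * I is invertible for any lam > 0,
# so a solution exists even when X^T X itself is singular.
lam = 1e-2                             # illustrative regularization strength
p = X.shape[1]
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
```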
Example:
Given $X = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}$ and $y = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$:
- Compute $X^T X$:
  $$X^T X = \begin{bmatrix} 3 & 6 \\ 6 & 14 \end{bmatrix}$$
- Compute $X^T y$:
  $$X^T y = \begin{bmatrix} 6 \\ 14 \end{bmatrix}$$
- Solve $X^T X \beta = X^T y$ for $\beta$:
  $$\begin{bmatrix} 3 & 6 \\ 6 & 14 \end{bmatrix} \beta = \begin{bmatrix} 6 \\ 14 \end{bmatrix}$$
Result: $\hat{\beta} = (0, 1)^T$, i.e., intercept $0$ and slope $1$.
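A quick NumPy check of this worked example (a sketch; the arrays simply mirror the matrices above):

```python
import numpy as np

X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])   # intercept column plus one feature
y = np.array([1.0, 2.0, 3.0])

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # approximately [0. 1.]: intercept 0, slope 1
```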