Model fitting is the process of identifying mathematical or computational models that best explain observed data. Regression, a common type of model fitting, focuses specifically on estimating relationships between variables. These techniques form the foundation of data-driven engineering and scientific discovery.

Fundamental Concepts

Model Components

A typical model fitting process involves:

  1. Observed Data: A set of measurements or observations (x_i, y_i) for i = 1, ..., n
  2. Model Function: A mathematical relationship ŷ = f(x; θ) parameterized by a vector θ
  3. Error Measure: A quantification of the discrepancy between observations and model predictions
  4. Optimization Problem: Finding the parameters θ that minimize the error measure (all four pieces are wired together in the sketch below)
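
As a concrete illustration, the sketch below wires these four components together for a small synthetic dataset. The straight-line model, squared-error measure, and the use of scipy.optimize.minimize are illustrative choices only; any model function, error measure, and optimizer could be substituted.

  import numpy as np
  from scipy.optimize import minimize

  # 1. Observed data (x_i, y_i), here a small synthetic sample
  x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
  y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

  # 2. Model function f(x; theta), parameterized by theta = (intercept, slope)
  def model(x, theta):
      return theta[0] + theta[1] * x

  # 3. Error measure: sum of squared discrepancies
  def error(theta):
      return np.sum((y - model(x, theta)) ** 2)

  # 4. Optimization problem: find theta minimizing the error measure
  result = minimize(error, x0=np.zeros(2))
  print("fitted parameters:", result.x)  # roughly [1.0, 2.0] for this data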

Types of Models

Models can be classified based on their structure:

  1. Linear Models:

    • Linear in the parameters θ, not necessarily linear in the input x
    • Examples: linear regression, polynomial regression, basis function models
  2. Nonlinear Models: f(x; θ) is nonlinear in the parameters θ

    • Examples: exponential models, logistic models, power laws, neural networks
  3. Parametric vs. Nonparametric Models:

    • Parametric: Fixed form with finite number of parameters
    • Nonparametric: Flexible form, potentially infinite parameters (e.g., kernel methods)

Mathematical Formulation

General Problem Statement

Given data points (x_i, y_i) for i = 1, ..., n, find the parameter vector θ that minimizes:

  E(θ) = Σ_{i=1}^{n} L(y_i, f(x_i; θ))

where L is an error function measuring the discrepancy between the observed y_i and the predicted ŷ_i = f(x_i; θ).

Common Error Functions

  1. Squared Error: L(y_i, ŷ_i) = (y_i - ŷ_i)²

    • Leads to least squares regression
    • Sensitive to outliers
    • Optimal for Gaussian noise
  2. Absolute Error: L(y_i, ŷ_i) = |y_i - ŷ_i|

    • Leads to least absolute deviations (LAD) regression
    • More robust to outliers
    • Optimal for Laplace distributed noise
  3. Huber Loss: A hybrid approach with threshold δ (compared numerically in the sketch below)

    • L(r) = r²/2 for |r| ≤ δ
    • L(r) = δ·(|r| - δ/2) for |r| > δ, where r = y_i - ŷ_i
    • Combines robustness of absolute error with smoothness of squared error
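
The three error functions can be compared directly on a set of residuals. The sketch below is a minimal implementation; the threshold δ = 1.0 in the Huber loss is an arbitrary choice for illustration.

  import numpy as np

  def squared_loss(r):
      return r ** 2

  def absolute_loss(r):
      return np.abs(r)

  def huber_loss(r, delta=1.0):
      # Quadratic for small residuals, linear for large ones
      small = np.abs(r) <= delta
      return np.where(small, 0.5 * r ** 2, delta * (np.abs(r) - 0.5 * delta))

  residuals = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
  print(squared_loss(residuals))   # grows quadratically for the outliers
  print(absolute_loss(residuals))  # grows linearly
  print(huber_loss(residuals))     # quadratic near zero, linear in the tails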

Linear Regression

Simple Linear Regression

The simplest form of regression models the relationship between two variables with a straight line:

  y = β_0 + β_1 x + ε

where ε represents random error, β_0 the intercept, and β_1 the slope.
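
For this model the least squares estimates have a simple closed form: the slope is the ratio of the sample covariance of x and y to the sample variance of x, and the intercept follows from the sample means. A minimal sketch with made-up data:

  import numpy as np

  x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
  y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

  # Closed-form least squares estimates for y = b0 + b1*x + error
  b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
  b0 = y.mean() - b1 * x.mean()
  print(b0, b1)  # intercept and slope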

Multiple Linear Regression

Extends simple linear regression to multiple independent variables:

  y = β_0 + β_1 x_1 + β_2 x_2 + ... + β_p x_p + ε

Matrix Formulation

The multiple linear regression model can be expressed in matrix notation:

  y = Xβ + ε

where:

  • y is the n × 1 vector of observed dependent variables
  • X is the n × (p + 1) design matrix containing the independent variables (with a leading column of ones for the intercept)
  • β is the (p + 1) × 1 parameter vector
  • ε is the n × 1 error vector

Least Squares Solution

Under the least squares criterion, the optimal parameter vector is:

  β̂ = (X^T X)^(-1) X^T y

This closed-form solution exists when X^T X is invertible.
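
A sketch of the matrix solution for a synthetic two-predictor problem. np.linalg.lstsq is the numerically preferred route; the explicit normal-equations form is shown only to mirror the formula above.

  import numpy as np

  rng = np.random.default_rng(0)
  n = 50
  x1, x2 = rng.normal(size=n), rng.normal(size=n)
  y = 2.0 + 1.5 * x1 - 0.5 * x2 + rng.normal(scale=0.1, size=n)

  # Design matrix with a leading column of ones for the intercept
  X = np.column_stack([np.ones(n), x1, x2])

  # Preferred: solve the least squares problem directly
  beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

  # Equivalent (when X^T X is invertible): the normal equations
  beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

  print(beta_lstsq)   # close to [2.0, 1.5, -0.5]
  print(beta_normal)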

Polynomial Regression

A specific case of linear regression where the basis functions are powers of x:

  y = β_0 + β_1 x + β_2 x² + ... + β_d x^d + ε

This is linear in the parameters β despite modeling a nonlinear relationship in x.
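
Because the model is linear in the parameters, a polynomial fit reduces to ordinary least squares on a design matrix whose columns are powers of x. A sketch for a quadratic fit (np.polyfit is an equivalent convenience call):

  import numpy as np

  x = np.linspace(0, 5, 30)
  y = 1.0 - 2.0 * x + 0.5 * x**2 + np.random.default_rng(1).normal(scale=0.2, size=x.size)

  # Build the design matrix [1, x, x^2] and solve by least squares
  X = np.vander(x, N=3, increasing=True)
  beta, *_ = np.linalg.lstsq(X, y, rcond=None)
  print(beta)                    # close to [1.0, -2.0, 0.5]

  # Equivalent convenience call (note: returns highest degree first)
  print(np.polyfit(x, y, deg=2))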

Nonlinear Regression

Formulation

Nonlinear regression involves models where the parameters appear nonlinearly:

  y = f(x; θ) + ε,   with f nonlinear in θ

Common examples include:

  • Exponential models: y = a·e^(b·x) (fit in the sketch below)
  • Growth models: y = c / (1 + e^(-k·(x - x_0))) (logistic growth)
  • Sinusoidal models: y = A·sin(ω·x + φ) + C
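
A sketch fitting the exponential model y = a·e^(b·x) with scipy.optimize.curve_fit, which performs the iterative optimization internally (see the next subsection). The starting guess p0 is an illustrative choice; nonlinear fits can be sensitive to it.

  import numpy as np
  from scipy.optimize import curve_fit

  def exponential(x, a, b):
      return a * np.exp(b * x)

  rng = np.random.default_rng(2)
  x = np.linspace(0, 2, 40)
  y = exponential(x, 2.0, 1.3) + rng.normal(scale=0.05, size=x.size)

  # Iterative least squares fit; p0 is the initial parameter guess
  params, covariance = curve_fit(exponential, x, y, p0=[1.0, 1.0])
  print(params)  # close to [2.0, 1.3]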

Optimization Approaches

Unlike linear regression, nonlinear regression typically requires iterative numerical optimization:

  1. Gauss-Newton Method: Specialized for least squares problems (a bare-bones sketch follows this list)
  2. Levenberg-Marquardt Algorithm: Robust modification of Gauss-Newton
  3. Gradient Descent: Simple but potentially slow convergence
  4. Trust Region Methods: Restrict optimization steps to regions where the model is trusted
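
To make the iterative idea concrete, here is a bare-bones Gauss-Newton iteration for the exponential model above. The Jacobian is written out by hand and no step-size safeguards are included; adding such safeguards is precisely what the Levenberg-Marquardt and trust-region variants do.

  import numpy as np

  def model(x, a, b):
      return a * np.exp(b * x)

  def jacobian(x, a, b):
      # Partial derivatives of the model with respect to a and b
      return np.column_stack([np.exp(b * x), a * x * np.exp(b * x)])

  def gauss_newton(x, y, theta, iterations=20):
      for _ in range(iterations):
          residuals = y - model(x, *theta)
          J = jacobian(x, *theta)
          # Solve the linearized least squares problem for the update step
          step, *_ = np.linalg.lstsq(J, residuals, rcond=None)
          theta = theta + step
      return theta

  x = np.linspace(0, 2, 40)
  y = model(x, 2.0, 1.3) + np.random.default_rng(3).normal(scale=0.05, size=x.size)

  # A reasonable starting guess; plain Gauss-Newton can diverge from poor ones
  print(gauss_newton(x, y, theta=np.array([1.5, 1.0])))  # close to [2.0, 1.3]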

Regularization Techniques

Regularization addresses overfitting by adding penalties to the objective function:

  minimize ||y - Xβ||² + λ·P(β)

where λ ≥ 0 controls the penalty strength and P(β) measures parameter magnitude.

Ridge Regression (L2 Regularization)

  • Penalty term: λ·||β||² (the sum of squared parameter values)
  • Shrinks parameters toward zero
  • Handles multicollinearity well
  • Closed-form solution: β̂ = (X^T X + λ·I)^(-1) X^T y (computed in the sketch below)
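
A sketch of the ridge closed form on synthetic data. The penalty weight λ = 1.0 is arbitrary, and for brevity the intercept column is penalized along with the other coefficients (implementations usually exclude it).

  import numpy as np

  rng = np.random.default_rng(4)
  n, p = 40, 3
  X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
  beta_true = np.array([1.0, 2.0, -1.0, 0.5])
  y = X @ beta_true + rng.normal(scale=0.1, size=n)

  lam = 1.0  # regularization strength (lambda)
  # Ridge closed-form solution: (X^T X + lambda*I)^(-1) X^T y
  beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
  print(beta_ridge)  # shrunk slightly toward zero relative to beta_true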

Lasso Regression (L1 Regularization)

  • Penalty term: λ·||β||_1 (the sum of absolute parameter values)
  • Produces sparse solutions (feature selection)
  • No closed-form solution; solved with iterative methods such as coordinate descent or least-angle regression (LARS)
  • Effective for high-dimensional data (see the sketch below)
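
A sketch using the Lasso estimator from scikit-learn, assuming that library is available; its alpha parameter plays the role of λ (with a slightly different scaling of the squared-error term).

  import numpy as np
  from sklearn.linear_model import Lasso

  rng = np.random.default_rng(5)
  n, p = 100, 10
  X = rng.normal(size=(n, p))
  beta_true = np.zeros(p)
  beta_true[:3] = [2.0, -1.5, 0.8]   # only the first three features matter
  y = X @ beta_true + rng.normal(scale=0.1, size=n)

  lasso = Lasso(alpha=0.1).fit(X, y)
  print(lasso.coef_)  # coefficients of irrelevant features are typically driven exactly to zero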

Elastic Net

Combines L1 and L2 regularization:

  minimize ||y - Xβ||² + λ_1·||β||_1 + λ_2·||β||²

Model Evaluation and Selection

Performance Metrics

  1. Mean Squared Error (MSE): MSE = (1/n)·Σ(y_i - ŷ_i)²
  2. Root Mean Squared Error (RMSE): RMSE = √MSE
  3. Mean Absolute Error (MAE): MAE = (1/n)·Σ|y_i - ŷ_i|
  4. Coefficient of Determination (R²): R² = 1 - SS_res/SS_tot, where SS_res = Σ(y_i - ŷ_i)² and SS_tot = Σ(y_i - ȳ)²
    • Represents the proportion of variance explained by the model
    • Ranges from 0 to 1 for linear models fitted with an intercept (can be negative for nonlinear models or very poor fits)
    • Higher values indicate better fit (all four metrics are computed in the sketch below)
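
A sketch computing all four metrics from a vector of observations and a vector of model predictions; the numbers are made up for illustration.

  import numpy as np

  y = np.array([3.1, 4.8, 7.2, 8.9, 11.1])      # observed
  y_hat = np.array([3.0, 5.0, 7.0, 9.0, 11.0])  # model predictions

  mse = np.mean((y - y_hat) ** 2)
  rmse = np.sqrt(mse)
  mae = np.mean(np.abs(y - y_hat))
  ss_res = np.sum((y - y_hat) ** 2)
  ss_tot = np.sum((y - y.mean()) ** 2)
  r2 = 1.0 - ss_res / ss_tot

  print(mse, rmse, mae, r2)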

Cross-Validation

Techniques to assess model performance on unseen data:

  1. k-fold Cross-Validation: Divides data into k subsets, trains on k-1 subsets and tests on the remaining one, rotating through all subsets (sketched after this list)
  2. Leave-One-Out Cross-Validation: Special case of k-fold where k equals the number of data points
  3. Train-Test Split: Divides data into training and testing sets (typically 70-30% or 80-20%)
  4. Hold-out Validation: Sets aside a portion of data for final evaluation
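
A minimal k-fold cross-validation sketch for a straight-line fit, using only NumPy; k = 5 and the synthetic data are arbitrary choices.

  import numpy as np

  rng = np.random.default_rng(6)
  x = np.linspace(0, 10, 100)
  y = 1.0 + 2.0 * x + rng.normal(scale=1.0, size=x.size)

  k = 5
  indices = rng.permutation(x.size)
  folds = np.array_split(indices, k)

  scores = []
  for i in range(k):
      test = folds[i]
      train = np.concatenate([folds[j] for j in range(k) if j != i])
      # Fit on the training folds, evaluate on the held-out fold
      b1, b0 = np.polyfit(x[train], y[train], 1)
      y_pred = b0 + b1 * x[test]
      scores.append(np.mean((y[test] - y_pred) ** 2))

  print(np.mean(scores))  # average held-out MSE across folds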

Model Selection Criteria

Metrics that balance goodness-of-fit with model complexity:

  1. Akaike Information Criterion (AIC): AIC = 2k - 2·ln(L̂)
    • k is the number of parameters
    • L̂ is the maximized value of the likelihood function
  2. Bayesian Information Criterion (BIC): BIC = k·ln(n) - 2·ln(L̂)
    • Penalizes complexity more severely than AIC when n ≥ 8 (i.e., when ln(n) > 2)
  3. Adjusted R²: R²_adj = 1 - (1 - R²)·(n - 1)/(n - p - 1)
    • Adjusts R² based on the number of predictors p (AIC and BIC are computed in the sketch below)
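
Under the common assumption of Gaussian errors, and dropping additive constants that do not affect model comparison, AIC and BIC can be computed from the residual sum of squares. A sketch comparing polynomial fits of different degrees:

  import numpy as np

  def aic_bic(y, y_hat, k):
      # AIC/BIC for a Gaussian-error model with k fitted parameters;
      # additive constants common to all models are dropped
      n = y.size
      ss_res = np.sum((y - y_hat) ** 2)
      log_lik = -0.5 * n * np.log(ss_res / n)  # up to an additive constant
      aic = 2 * k - 2 * log_lik
      bic = k * np.log(n) - 2 * log_lik
      return aic, bic

  rng = np.random.default_rng(7)
  x = np.linspace(0, 5, 60)
  y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=x.size)

  for degree in (1, 2, 5):
      coeffs = np.polyfit(x, y, degree)
      y_hat = np.polyval(coeffs, x)
      # k counts the polynomial coefficients; some conventions also count the noise variance
      print(degree, aic_bic(y, y_hat, k=degree + 1))  # the true model is linear, so degree 1 typically wins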

Parameter Uncertainty and Confidence

Parameter Covariance Matrix

For least squares estimation, the parameter covariance matrix is:

  Cov(β̂) = σ²·(X^T X)^(-1)

where σ² is the variance of the error term, typically estimated from the residuals as σ̂² = SS_res / (n - p).

Confidence Intervals

For parameter β_j, the 95% confidence interval is:

  β̂_j ± t_{0.975, n-p} · SE(β̂_j)

where SE(β̂_j) = √[Cov(β̂)]_jj and t_{0.975, n-p} is the t-statistic with n - p degrees of freedom.
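
A sketch computing the covariance matrix, standard errors, and 95% confidence intervals for a synthetic multiple-regression problem; scipy.stats.t supplies the t quantile.

  import numpy as np
  from scipy import stats

  rng = np.random.default_rng(8)
  n = 50
  x1, x2 = rng.normal(size=n), rng.normal(size=n)
  X = np.column_stack([np.ones(n), x1, x2])
  y = X @ np.array([2.0, 1.5, -0.5]) + rng.normal(scale=0.3, size=n)

  p = X.shape[1]                               # number of parameters
  beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
  residuals = y - X @ beta_hat
  sigma2 = residuals @ residuals / (n - p)     # unbiased estimate of error variance
  cov_beta = sigma2 * np.linalg.inv(X.T @ X)   # parameter covariance matrix

  se = np.sqrt(np.diag(cov_beta))              # standard errors
  t_crit = stats.t.ppf(0.975, df=n - p)        # two-sided 95% t quantile
  for b, s in zip(beta_hat, se):
      print(f"{b:.3f} +/- {t_crit * s:.3f}")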

Prediction Intervals

For a new observation at x_0, the prediction interval is:

  ŷ_0 ± t_{0.975, n-p} · σ̂ · √(1 + x_0^T (X^T X)^(-1) x_0)

where ŷ_0 = x_0^T β̂. The extra 1 under the square root accounts for the noise in the new observation itself, which makes prediction intervals wider than confidence intervals for the mean response.

Specialized Regression Techniques

Weighted Least Squares

Incorporates varying reliability of observations:

  minimize Σ w_i·(y_i - f(x_i; θ))²

where w_i are weights, often chosen inversely proportional to the variance of the observations.
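
A sketch of weighted least squares on synthetic data whose noise level grows with x; the weights are taken as the reciprocal of the known noise variance, and the fit solves the weighted normal equations (X^T W X)β = X^T W y.

  import numpy as np

  rng = np.random.default_rng(9)
  n = 60
  x = np.linspace(0, 10, n)
  noise_sd = 0.2 + 0.3 * x                  # measurement noise grows with x
  y = 1.0 + 2.0 * x + rng.normal(scale=noise_sd)

  X = np.column_stack([np.ones(n), x])
  W = np.diag(1.0 / noise_sd ** 2)          # weights inversely proportional to variance

  # Weighted least squares: solve (X^T W X) beta = X^T W y
  beta_wls = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
  print(beta_wls)  # close to [1.0, 2.0], with the noisiest points downweighted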

Robust Regression

Methods less sensitive to outliers:

  1. M-Estimation: Minimizes a robust loss function
  2. MM-Estimation: Combines high breakdown point with efficiency
  3. Least Trimmed Squares: Minimizes the sum of the smallest residuals

Bayesian Regression

Incorporates prior knowledge about parameters:

  • Prior: p(θ) represents knowledge before seeing data
  • Likelihood: p(D | θ) represents how well parameters explain the data D
  • Posterior: p(θ | D) ∝ p(D | θ)·p(θ) represents updated knowledge after seeing data

Implementation with Optimization Algorithms

Model fitting is fundamentally an optimization problem. Common approaches include:

  1. Direct Methods: For linear models with closed-form solutions
  2. Iterative Methods: Required for nonlinear models
    • Gauss-Newton: Efficient for moderate nonlinearity
    • Levenberg-Marquardt: More robust, combines Gauss-Newton with steepest descent
    • Trust Region Methods: Controls step size based on model reliability
    • Stochastic Gradient Descent: Effective for large datasets

Applications in Engineering

  1. System Identification: Determining mathematical models of dynamic systems
  2. Empirical Modeling: Creating models based on experimental data
  3. Parameter Estimation: Determining physical parameters from measurements
  4. Response Surface Methodology: Optimizing processes using experimental designs
  5. Calibration: Adjusting simulation models to match observed behavior