Multivariate optimisation involves finding the minimum or maximum of a function of several variables, $f(\mathbf{x})$ with $\mathbf{x} \in \mathbb{R}^n$. Most real-world engineering problems involve multiple design variables, making multivariate optimisation essential.

Mathematical Formulation

For a function $f(\mathbf{x})$ where $\mathbf{x} = (x_1, x_2, \ldots, x_n)^T \in \mathbb{R}^n$, we seek:

  • $\min_{\mathbf{x}} f(\mathbf{x})$ for minimization, or
  • $\max_{\mathbf{x}} f(\mathbf{x})$ for maximization

Key Concepts

Gradient Vector

The gradient vector $\nabla f(\mathbf{x})$ is the vector of first partial derivatives:

$$\nabla f(\mathbf{x}) = \begin{bmatrix} \dfrac{\partial f}{\partial x_1} & \dfrac{\partial f}{\partial x_2} & \cdots & \dfrac{\partial f}{\partial x_n} \end{bmatrix}^T$$

The gradient points in the direction of steepest increase of the function.

Example: for $f(x_1, x_2) = x_1^2 + 2x_1 x_2 + 3x_2^2$,

$$\nabla f(\mathbf{x}) = \begin{bmatrix} 2x_1 + 2x_2 \\ 2x_1 + 6x_2 \end{bmatrix}$$
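
As a quick numerical sanity check, the analytical gradient above can be compared against a central finite-difference approximation. The sketch below assumes the example function $f(x_1, x_2) = x_1^2 + 2x_1 x_2 + 3x_2^2$ introduced above and uses only NumPy; the helper name `finite_diff_grad` is illustrative.

```python
import numpy as np

def f(x):
    # Example function f(x1, x2) = x1^2 + 2*x1*x2 + 3*x2^2
    return x[0]**2 + 2*x[0]*x[1] + 3*x[1]**2

def grad_f(x):
    # Analytical gradient from the example above
    return np.array([2*x[0] + 2*x[1], 2*x[0] + 6*x[1]])

def finite_diff_grad(f, x, h=1e-6):
    # Central finite-difference approximation of the gradient
    g = np.zeros_like(x, dtype=float)
    for i in range(len(x)):
        e = np.zeros_like(x, dtype=float)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

x0 = np.array([1.0, -2.0])
print(grad_f(x0))               # analytical: [-2. -10.]
print(finite_diff_grad(f, x0))  # numerical approximation, should agree closely
```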

Hessian Matrix

The Hessian matrix $\mathbf{H}(\mathbf{x})$ is the $n \times n$ matrix of second partial derivatives:

$$\mathbf{H}(\mathbf{x}) = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{bmatrix}$$

The Hessian characterizes the local curvature of the function.

Example: for the same $f(x_1, x_2) = x_1^2 + 2x_1 x_2 + 3x_2^2$,

$$\mathbf{H} = \begin{bmatrix} 2 & 2 \\ 2 & 6 \end{bmatrix}$$

Note that this is a symmetric matrix.

Optimality Conditions

Necessary Conditions

For a smooth function, a necessary condition for a local minimum or maximum at a point $\mathbf{x}^*$ is:

$$\nabla f(\mathbf{x}^*) = \mathbf{0}$$

This means the gradient equals the zero vector at the optimal point.

Sufficient Conditions

To determine the nature of a stationary point $\mathbf{x}^*$ (where $\nabla f(\mathbf{x}^*) = \mathbf{0}$), examine the Hessian $\mathbf{H}(\mathbf{x}^*)$; a numerical check is sketched after this list:

  • If the Hessian is positive definite (all eigenvalues positive), then $\mathbf{x}^*$ is a local minimum
  • If the Hessian is negative definite (all eigenvalues negative), then $\mathbf{x}^*$ is a local maximum
  • If the Hessian has both positive and negative eigenvalues, then $\mathbf{x}^*$ is a saddle point
  • If the Hessian is singular (has zero eigenvalues), higher-order tests are needed
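
In practice this definiteness test can be carried out by inspecting the eigenvalues of the Hessian at the stationary point. Below is a minimal sketch with NumPy, reusing the Hessian of the earlier example; the function name `classify_stationary_point` and the tolerance are illustrative choices.

```python
import numpy as np

def classify_stationary_point(H, tol=1e-10):
    # Classify a stationary point from the eigenvalues of its (symmetric) Hessian
    eigvals = np.linalg.eigvalsh(H)
    if np.all(eigvals > tol):
        return "local minimum"   # positive definite
    if np.all(eigvals < -tol):
        return "local maximum"   # negative definite
    if np.any(eigvals > tol) and np.any(eigvals < -tol):
        return "saddle point"    # indefinite
    return "inconclusive (singular Hessian, higher-order test needed)"

H = np.array([[2.0, 2.0],
              [2.0, 6.0]])            # Hessian of the earlier example
print(classify_stationary_point(H))  # -> local minimum
```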

Special Cases

Linear Functions

For a linear function $f(\mathbf{x}) = \mathbf{b}^T \mathbf{x} + c$:

  • No extrema exist unless constrained
  • Gradient is constant
  • Hessian is the zero matrix

Quadratic Functions

For a quadratic function $f(\mathbf{x}) = \tfrac{1}{2}\mathbf{x}^T \mathbf{A} \mathbf{x} - \mathbf{b}^T \mathbf{x} + c$ where $\mathbf{A}$ is symmetric, the stationary point satisfies $\mathbf{A}\mathbf{x}^* = \mathbf{b}$ (a numerical sketch follows this list):

  • If $\mathbf{A}$ is positive definite, unique minimum at $\mathbf{x}^* = \mathbf{A}^{-1}\mathbf{b}$
  • If $\mathbf{A}$ is negative definite, unique maximum at $\mathbf{x}^* = \mathbf{A}^{-1}\mathbf{b}$
  • If $\mathbf{A}$ is indefinite, saddle point at $\mathbf{x}^* = \mathbf{A}^{-1}\mathbf{b}$
  • If $\mathbf{A}$ is singular, no extrema or infinitely many extrema
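
To make the quadratic case concrete, the sketch below (assuming the form $f(\mathbf{x}) = \tfrac{1}{2}\mathbf{x}^T \mathbf{A} \mathbf{x} - \mathbf{b}^T \mathbf{x}$ used above) solves $\mathbf{A}\mathbf{x}^* = \mathbf{b}$ and confirms definiteness; the numerical values of $\mathbf{A}$ and $\mathbf{b}$ are illustrative.

```python
import numpy as np

# Illustrative symmetric positive definite A and vector b
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

# Stationary point of f(x) = 0.5 x^T A x - b^T x satisfies A x* = b
x_star = np.linalg.solve(A, b)

eigvals = np.linalg.eigvalsh(A)
print("x* =", x_star)
print("eigenvalues of A:", eigvals)  # all positive -> unique minimum
```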

Numerical Methods

Gradient-Based Methods

  • Steepest Descent:

    • Update: $\mathbf{x}_{k+1} = \mathbf{x}_k - \alpha_k \nabla f(\mathbf{x}_k)$
    • Simple but can be slow, especially for ill-conditioned problems
    • Step size $\alpha_k$ determined by line search (a minimal implementation is sketched after this list)
  • Conjugate Gradient:

    • Uses conjugate directions to improve convergence
    • Particularly effective for quadratic functions
    • Avoids the “zig-zag” behavior of steepest descent
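
As referenced above, a minimal steepest-descent loop with a backtracking (Armijo) line search can be sketched as follows. The objective is the quadratic example from earlier, and the step-size parameters are illustrative rather than prescriptive.

```python
import numpy as np

def steepest_descent(f, grad, x0, tol=1e-6, max_iter=1000):
    """Steepest descent with a simple backtracking (Armijo) line search."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:   # gradient near zero -> stationary point
            break
        alpha = 1.0
        # Backtracking: shrink alpha until sufficient decrease is achieved
        while f(x - alpha * g) > f(x) - 1e-4 * alpha * (g @ g):
            alpha *= 0.5
        x = x - alpha * g             # update x_{k+1} = x_k - alpha_k * grad f(x_k)
    return x

# Example: the quadratic f(x1, x2) = x1^2 + 2*x1*x2 + 3*x2^2 from earlier
f = lambda x: x[0]**2 + 2*x[0]*x[1] + 3*x[1]**2
grad = lambda x: np.array([2*x[0] + 2*x[1], 2*x[0] + 6*x[1]])
print(steepest_descent(f, grad, x0=[1.0, -2.0]))  # converges toward [0, 0]
```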

Hessian-Based Methods

  • Newton’s Method:

    • Update: $\mathbf{x}_{k+1} = \mathbf{x}_k - \mathbf{H}(\mathbf{x}_k)^{-1} \nabla f(\mathbf{x}_k)$ (a minimal sketch follows this list)
    • Quadratic convergence near the optimum, but each iteration requires the Hessian
  • Quasi-Newton Methods:

    • Approximates the Hessian or its inverse
    • BFGS (Broyden-Fletcher-Goldfarb-Shanno) is the most popular
    • Avoids explicit computation of second derivatives
  • Trust Region Methods:

    • Restricts the optimization step within a “trusted” region
    • Combines advantages of gradient and Hessian approaches
    • Robust for non-convex functions
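
As referenced in the Newton's Method item above, a bare-bones Newton iteration can be sketched with NumPy. The example reuses the quadratic from earlier, for which a single Newton step reaches the minimum; in practice one would often reach for a quasi-Newton routine such as `scipy.optimize.minimize(..., method='BFGS')` rather than forming the Hessian explicitly.

```python
import numpy as np

def newton_method(grad, hess, x0, tol=1e-8, max_iter=50):
    """Newton's method: x_{k+1} = x_k - H(x_k)^{-1} grad f(x_k)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        # Solve H p = g rather than forming the inverse explicitly
        p = np.linalg.solve(hess(x), g)
        x = x - p
    return x

# Quadratic example from earlier: f(x1, x2) = x1^2 + 2*x1*x2 + 3*x2^2
grad = lambda x: np.array([2*x[0] + 2*x[1], 2*x[0] + 6*x[1]])
hess = lambda x: np.array([[2.0, 2.0],
                           [2.0, 6.0]])
print(newton_method(grad, hess, x0=[1.0, -2.0]))  # reaches [0, 0] in one step
```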

Challenges in Multivariate Optimisation

Scaling

When different variables have vastly different scales, optimization algorithms may perform poorly. Solutions include:

  • Variable transformation
  • Preconditioning
  • Normalized coordinates (a small rescaling sketch follows this list)
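
A common remedy is a simple diagonal rescaling so that all variables have comparable magnitude. The sketch below is illustrative only: the objective, the scale factors, and the variable meanings (a length-like and a pressure-like quantity) are assumptions, not taken from the original text.

```python
import numpy as np

# Illustrative scale factors for two variables of very different magnitude
scales = np.array([1.0, 1.0e6])

def f_original(x):
    # Badly scaled objective: curvature in x[0] and x[1] differs by ~12 orders of magnitude
    return x[0]**2 + (x[1] / 1.0e6 - 1.0)**2

def f_scaled(y):
    # Work in normalized coordinates y = x / scales; geometry is now well conditioned
    return f_original(y * scales)

y0 = np.array([1.0, 0.5])
print(f_scaled(y0), f_original(y0 * scales))  # identical objective values, better scaling
```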

Ill-Conditioning

When the Hessian has a high condition number (the ratio of its largest to smallest eigenvalue), convergence can be slow; a quick numerical check is sketched after the list below. This occurs in:

  • Problems with highly interdependent variables
  • Problems with vastly different sensitivities to different variables
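The condition number is easy to inspect numerically. The sketch below compares a well-conditioned and an ill-conditioned Hessian with NumPy; the matrix entries are illustrative.

```python
import numpy as np

H_good = np.array([[2.0, 0.0],
                   [0.0, 3.0]])
H_bad  = np.array([[1.0, 0.0],
                   [0.0, 1.0e-8]])

# Condition number = ratio of largest to smallest eigenvalue for SPD matrices
print(np.linalg.cond(H_good))  # ~1.5  -> well conditioned
print(np.linalg.cond(H_bad))   # ~1e8  -> expect very slow gradient-based convergence
```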

Multiple Local Optima

Unlike in the convex case, general nonlinear functions may have multiple local optima, making it difficult to find the global optimum. Strategies include:

  • Multiple random starts (a multi-start sketch follows this list)
  • Global optimization methods
  • Problem reformulation
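
As an example of the first strategy, a multi-start wrapper around a local optimiser can be sketched as follows. The two-well objective is a hypothetical example, and `scipy.optimize.minimize` with BFGS serves as the local solver.

```python
import numpy as np
from scipy.optimize import minimize

def two_well(x):
    # Hypothetical non-convex objective with two local minima at (+/-1, 0)
    return (x[0]**2 - 1.0)**2 + x[1]**2

rng = np.random.default_rng(0)
best = None
for _ in range(10):
    x0 = rng.uniform(-2.0, 2.0, size=2)        # random starting point
    res = minimize(two_well, x0, method='BFGS')
    if best is None or res.fun < best.fun:     # keep the best local result
        best = res

print(best.x, best.fun)  # one of the two minima at (+/-1, 0)
```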

Applications

Multivariate optimization appears in countless engineering applications:

  • Structural design (multiple dimensional parameters)
  • Control system tuning (multiple control parameters)
  • Machine learning (model parameters)
  • Financial portfolio optimization (multiple assets)
  • Chemical process optimization (multiple process variables)